feat/write_elements #269
Comments
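@Matthieu-Tinycoaching - Check out:
import json
from unstructured.staging.base import convert_to_isd, isd_to_elements

# Save the partitioned elements to disk
with open("elements.json", "w") as f:
    json.dump(convert_to_isd(elements), f)
# Load them back without partitioning again
with open("elements.json", "r") as f:
    elements = isd_to_elements(json.load(f))
Would that meet your needs? We'd also be happy to include an |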
Hi @MthwRobinson, thanks for the tip! It seems to do the job. However, when trying this on the example data, the element types do not match after the round trip. Counting the types of elements present in the document just after parsing: Counting them again after reimporting from the JSON file: Any idea? |
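For anyone wanting to run the same comparison, here is a minimal sketch of tallying element types before and after a round trip. The classes `Title`, `NarrativeText`, and `ListItem` below are placeholders defined in the snippet itself (not imported from `unstructured`), and `count_types` and `diff_types` are hypothetical helper names. The "reloaded" list simulates a lossy round trip and is not real output from the library.

```python
from collections import Counter


# Stand-in element classes for illustration only (an assumption); in real
# use these would be the element objects returned by partitioning.
class Title: ...
class NarrativeText: ...
class ListItem: ...


def count_types(elements):
    """Return a Counter mapping class name to number of elements."""
    return Counter(type(el).__name__ for el in elements)


def diff_types(before, after):
    """Return {class name: (count before, count after)} where counts differ."""
    b, a = count_types(before), count_types(after)
    names = sorted(set(b) | set(a))
    return {name: (b[name], a[name]) for name in names if b[name] != a[name]}


parsed = [Title(), NarrativeText(), NarrativeText(), ListItem()]
# Simulated lossy round trip: the ListItem element did not survive.
reloaded = [Title(), NarrativeText(), NarrativeText()]

print(count_types(parsed))
print(diff_types(parsed, reloaded))
```

Any class name that shows up in the diff is a type that was dropped or changed somewhere between saving and loading.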
Looks like we need to add handling for those element types in https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/staging/base.py#L26-L33 |
Added #270 to capture the missing elements. Will add some serialization/deserialization helper functions while we're in there. |
@Matthieu-Tinycoaching - There's a PR up to address the issue you flagged. That also adds helper functions for saving to/loading from JSON |
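To illustrate what save/load helpers of this kind need to do, here is a rough sketch. It is not the code from the PR. The element classes, the `TYPE_TO_CLASS` table, and the names `save_elements` and `load_elements` are all assumptions made up for the example. The point it shows is that each serialized record carries a type name, and the loader can only rebuild types that appear in its lookup table.

```python
import json
import os
import tempfile


# Stand-in element classes (hypothetical, not the library's own).
class Text:
    def __init__(self, text):
        self.text = text


class Title(Text): ...
class ListItem(Text): ...


# Maps the serialized "type" string back to a class. A type missing from
# this table cannot be rebuilt on load.
TYPE_TO_CLASS = {cls.__name__: cls for cls in (Text, Title, ListItem)}


def save_elements(elements, path):
    """Write elements to a JSON file as a list of {type, text} records."""
    records = [{"type": type(el).__name__, "text": el.text} for el in elements]
    with open(path, "w") as f:
        json.dump(records, f)


def load_elements(path):
    """Read records from a JSON file and rebuild the element objects."""
    with open(path, "r") as f:
        records = json.load(f)
    elements = []
    for record in records:
        cls = TYPE_TO_CLASS.get(record["type"])
        if cls is None:
            # Fail loudly rather than silently dropping the element.
            raise ValueError(f"Unknown element type: {record['type']}")
        elements.append(cls(record["text"]))
    return elements


elements = [Title("Intro"), ListItem("first point"), Text("body")]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "elements.json")
    save_elements(elements, path)
    restored = load_elements(path)

print([type(el).__name__ for el in restored])
```

Raising on an unknown type is a design choice in this sketch: it makes a missing mapping visible immediately instead of showing up later as a changed element count.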
@MthwRobinson nice! |
Right now! Just released the updated in |
Hi,
Is there any way to write List[Element] data to a file and load it back later, to avoid partitioning the data each time?