This Streamlit app is designed to help you analyze and extract valuable insights from challenging data formats commonly found in enterprise settings, such as HTML, PDF, CSV, PNG, PPTX, and more.
This app uses unstructured.io as a base library, providing an easy way to extract and convert unstructured data into a format compatible with popular vector databases and LLM frameworks. With this tool, you can streamline complex data handling and ensure compatibility with your preferred data analysis pipelines.
Supported file types:
Category | Document Types |
---|---|
Plaintext | .txt , .eml , .msg , .xml , .html , .md , .rst , .json , .rtf |
Images | .jpeg , .png |
Documents | .doc , .docx , .ppt , .pptx , .pdf , .odt , .epub , .csv , .tsv , .xlsx |

Find out more at unstructured.io.
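As a rough illustration of the supported-types table, the sketch below checks which category an uploaded file falls into before it is processed. The extension lists are copied from the table; the helper name `file_category` is hypothetical and is not part of the app or of the unstructured library.

```python
from pathlib import Path
from typing import Optional

# Extension lists copied from the supported file types table.
SUPPORTED_TYPES = {
    "Plaintext": {".txt", ".eml", ".msg", ".xml", ".html", ".md", ".rst", ".json", ".rtf"},
    "Images": {".jpeg", ".png"},
    "Documents": {".doc", ".docx", ".ppt", ".pptx", ".pdf", ".odt", ".epub", ".csv", ".tsv", ".xlsx"},
}


def file_category(filename: str) -> Optional[str]:
    """Return the table category for a filename, or None if it is unsupported."""
    # Compare case-insensitively so "REPORT.PDF" matches ".pdf".
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_TYPES.items():
        if suffix in extensions:
            return category
    return None
```

A check like this could run right after upload so that unsupported files are rejected with a clear message instead of failing during extraction.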
To get started, upload any supported file and it will be shown in the preview. You can also adjust the parameters to fine-tune your tests.
You can access the app on Streamlit Community Cloud at https://unstructured-demo.streamlit.app/.
The app does not require an API key to function; extractions are processed on the Streamlit Cloud server unless you choose to process them on the unstructured.io server.
If you choose to use the unstructured.io API, the app includes a temporary key, but it might be limited. Create your own at unstructured. After obtaining your API key, select unstructured.io API, enter your own API key, and upload your file.
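The choice between the two processing modes can be sketched roughly as follows. This is a minimal sketch, not the app's actual code: the helper names `choose_backend` and `extract_elements` are hypothetical, and it assumes the `unstructured` package is installed and that its `partition` and `partition_via_api` functions are used for local and hosted processing respectively.

```python
from typing import Optional


def choose_backend(use_api: bool, api_key: Optional[str]) -> str:
    """Return "api" only when the API is selected and a non-empty key is given."""
    if use_api and api_key and api_key.strip():
        return "api"
    return "local"


def extract_elements(path: str, use_api: bool = False, api_key: Optional[str] = None):
    """Extract elements from a file, locally or through the hosted API."""
    # Imports are deferred so the selection logic works without unstructured installed.
    if choose_backend(use_api, api_key) == "api":
        from unstructured.partition.api import partition_via_api

        return partition_via_api(filename=path, api_key=api_key)

    from unstructured.partition.auto import partition

    return partition(filename=path)
```

In this sketch a missing or blank key falls back to local processing. That is an assumption made for illustration; the app itself ships a temporary key for the API mode.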
If you have any feedback or questions about this app, please reach out to me on Twitter at @rririanto.
Thank you for checking out the tool!