This repository contains the code for an information extraction app that uses LangChain to extract structured output from unstructured data according to a given schema.
python -m venv <name_of_the_env>
source <name_of_the_env>/bin/activate
pip install -r requirements.txt
This application communicates with an OCR API service to generate OCR outputs. Start the OCR service, then create a secrets.toml file in the .streamlit directory at the repository root and add the following fields to it.
Learn more about Secrets management in Streamlit at: https://docs.streamlit.io/streamlit-community-cloud/deploy-your-app/secrets-management
HOST_URL = ""
OCR_SERVICE_PORT = ""
OCR_PDF_RESP_ENDPOINT = "ocr_pdf"
OCR_IMG_RESP_ENDPOINT = "ocr_image"
OPENAI_API_KEY = ""
ALLOW_FREE = false
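As a rough illustration of how these fields might fit together, the sketch below composes an OCR endpoint URL from them. The helper name `build_ocr_url`, the `host:port/endpoint` layout, and the example host and port values are assumptions made for illustration; the app's actual code may combine the fields differently.

```python
from typing import Mapping


def build_ocr_url(secrets: Mapping[str, object], kind: str) -> str:
    """Compose an OCR endpoint URL from the secrets fields.

    Hypothetical helper: assumes the URL has the form
    <HOST_URL>:<OCR_SERVICE_PORT>/<endpoint>.
    """
    endpoints = {
        "pdf": secrets["OCR_PDF_RESP_ENDPOINT"],
        "image": secrets["OCR_IMG_RESP_ENDPOINT"],
    }
    if kind not in endpoints:
        raise ValueError(f"unknown document kind: {kind!r}")
    # Drop any trailing slash so the port is appended cleanly.
    host = str(secrets["HOST_URL"]).rstrip("/")
    port = secrets["OCR_SERVICE_PORT"]
    return f"{host}:{port}/{endpoints[kind]}"


# Example values only; HOST_URL and OCR_SERVICE_PORT are placeholders.
example = {
    "HOST_URL": "http://localhost",
    "OCR_SERVICE_PORT": "8000",
    "OCR_PDF_RESP_ENDPOINT": "ocr_pdf",
    "OCR_IMG_RESP_ENDPOINT": "ocr_image",
}

print(build_ocr_url(example, "pdf"))    # http://localhost:8000/ocr_pdf
print(build_ocr_url(example, "image"))  # http://localhost:8000/ocr_image
```

Inside a Streamlit app, `st.secrets` could be passed in place of `example`, since it supports the same key lookup.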
Finally, run the app:
streamlit run states.py
Hosted with the help of Streamlit Cloud!