st_doc_ext

This repository contains the code for an information extraction app that uses LangChain to extract structured output from unstructured data according to a given schema.
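To illustrate what "structured output for a schema" can mean, here is a minimal sketch using only the Python standard library. It is not the app's actual code: the `InvoiceFields` schema, its field names, and the `parse_structured_output` helper are all hypothetical, and it assumes the model's reply arrives as a JSON string.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class InvoiceFields:
    """Hypothetical target schema; the app's real schema may differ."""
    vendor: str
    invoice_number: str
    total: float


def parse_structured_output(raw_json: str) -> InvoiceFields:
    """Validate a JSON string (such as an LLM reply) against the schema."""
    data = json.loads(raw_json)
    expected = {f.name for f in fields(InvoiceFields)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Keep only the schema fields and coerce each to its declared type.
    return InvoiceFields(
        vendor=str(data["vendor"]),
        invoice_number=str(data["invoice_number"]),
        total=float(data["total"]),
    )
```

Keys outside the schema are dropped, and a reply that lacks a schema field raises `ValueError` instead of yielding a partial record.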

Create and activate a venv

python -m venv <name_of_the_env>
source <name_of_the_env>/bin/activate

Pip install all requirements

pip install -r requirements.txt

Set Up Streamlit Secrets File

This application communicates with an OCR API service to generate OCR outputs. Start the OCR service, then create a secrets.toml file in the .streamlit directory at the repository root and add the following fields to it.

Learn more about Secrets management in Streamlit at: https://docs.streamlit.io/streamlit-community-cloud/deploy-your-app/secrets-management

HOST_URL = ""
OCR_SERVICE_PORT = ""
OCR_PDF_RESP_ENDPOINT = "ocr_pdf"
OCR_IMG_RESP_ENDPOINT = "ocr_image"
OPENAI_API_KEY = ""
ALLOW_FREE = false
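The README does not show how the app consumes these fields, so the following is only a sketch of one plausible use: checking that the required keys are filled in and composing an OCR request URL. The helpers `missing_secrets` and `build_ocr_url` are hypothetical, and the URL layout `<HOST_URL>:<OCR_SERVICE_PORT>/<endpoint>` is an assumption, as are the example host and port.

```python
from typing import List, Mapping

# Field names taken from the secrets.toml listing above.
REQUIRED_KEYS = (
    "HOST_URL",
    "OCR_SERVICE_PORT",
    "OCR_PDF_RESP_ENDPOINT",
    "OCR_IMG_RESP_ENDPOINT",
    "OPENAI_API_KEY",
    "ALLOW_FREE",
)


def missing_secrets(secrets: Mapping[str, object]) -> List[str]:
    """Return required keys that are absent or still set to an empty string."""
    return [k for k in REQUIRED_KEYS if k not in secrets or secrets[k] == ""]


def build_ocr_url(secrets: Mapping[str, object], kind: str) -> str:
    """Compose an OCR URL. Assumed layout: <HOST_URL>:<PORT>/<endpoint>."""
    endpoint_key = {
        "pdf": "OCR_PDF_RESP_ENDPOINT",
        "image": "OCR_IMG_RESP_ENDPOINT",
    }[kind]
    host = str(secrets["HOST_URL"]).rstrip("/")
    return f"{host}:{secrets['OCR_SERVICE_PORT']}/{secrets[endpoint_key]}"


# Example values only; the host and port are placeholders.
example = {
    "HOST_URL": "http://localhost",
    "OCR_SERVICE_PORT": "8000",
    "OCR_PDF_RESP_ENDPOINT": "ocr_pdf",
    "OCR_IMG_RESP_ENDPOINT": "ocr_image",
    "OPENAI_API_KEY": "",
    "ALLOW_FREE": False,
}
print(missing_secrets(example))
print(build_ocr_url(example, "pdf"))
```

Inside a Streamlit app, `st.secrets` is a mapping-like object, so it could likely be passed in place of the `example` dictionary.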

Run Streamlit App

Finally, run the app:

streamlit run states.py

Experience the app!

Hosted with the help of Streamlit Cloud!

https://extractinfo.streamlit.app/

