NebeyouMusie/End-to-End-Gen-AI-Powered-App

End to End Gen AI Powered App

  • In this project I built an end-to-end LangChain application using open-source LLMs hosted on Hugging Face, such as Mistral, together with open-source embedding models.

Streamlit Web App Interface

DEMO

  • You can check the project live here

Description

  • This project showcases an advanced RAG (retrieval-augmented generation) system that uses an LLM served through Hugging Face to answer questions from a set of PDF documents.

Steps I followed:

  1. Loaded the PDF documents from the us-census-data directory with PyPDFDirectoryLoader from langchain_community.document_loaders.
  2. Split the loaded text into chunks with a chunk size of 1000 using RecursiveCharacterTextSplitter from langchain.text_splitter.
  3. Created vector embeddings with HuggingFaceBgeEmbeddings and stored them in a FAISS vector store.
  4. Set up the LLM through HuggingFaceEndpoint with the model mistralai/Mistral-7B-Instruct-v0.2.
  5. Set up a PromptTemplate.
  6. Wrote a vector_embedding function that embeds the documents and stores them in the FAISS vector store.
  7. Finally, created a RetrievalQA chain that ties together the LLM, the prompt, and the retriever.
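
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual app.py: the chunk overlap, the BGE model name, the retriever's k, the generation parameters, the prompt wording, the build_qa_chain helper, and the HUGGINGFACEHUB_API_TOKEN variable name are all assumptions. It also assumes pypdf and sentence-transformers are installed, since the loader and the BGE embeddings depend on them.

```python
"""Sketch of the RAG pipeline described in the steps above (illustrative only)."""
import os

PDF_DIR = "./us-census-data"                         # directory named in the README
LLM_REPO_ID = "mistralai/Mistral-7B-Instruct-v0.2"   # model named in the README
CHUNK_SIZE = 1000                                    # chunk size named in the README
CHUNK_OVERLAP = 200                                  # assumption: not stated in the README
EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5"           # assumption: the README names no BGE model

# Assumption: illustrative prompt wording; the project's real prompt is not shown.
PROMPT_TEXT = """Answer the question using only the context below.
If the context does not contain the answer, say that you do not know.

Context:
{context}

Question: {question}

Answer:"""


def vector_embedding(pdf_dir=PDF_DIR):
    """Steps 1-3 and 6: load the PDFs, split them, and embed the chunks into FAISS."""
    # Imports live inside the functions so this module can be imported
    # before the heavier dependencies are installed.
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.document_loaders import PyPDFDirectoryLoader
    from langchain_community.embeddings import HuggingFaceBgeEmbeddings
    from langchain_community.vectorstores import FAISS

    documents = PyPDFDirectoryLoader(pdf_dir).load()
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP
    )
    chunks = splitter.split_documents(documents)
    embeddings = HuggingFaceBgeEmbeddings(
        model_name=EMBEDDING_MODEL,
        model_kwargs={"device": "cpu"},
        encode_kwargs={"normalize_embeddings": True},
    )
    return FAISS.from_documents(chunks, embeddings)


def build_qa_chain(vectorstore):
    """Steps 4, 5 and 7: wire the LLM, the prompt, and the retriever together."""
    from langchain.chains import RetrievalQA
    from langchain.prompts import PromptTemplate
    from langchain_huggingface import HuggingFaceEndpoint

    llm = HuggingFaceEndpoint(
        repo_id=LLM_REPO_ID,
        max_new_tokens=512,   # assumption
        temperature=0.1,      # assumption
        # Assumption: the Hugging Face token is stored under this variable name.
        huggingfacehub_api_token=os.environ["HUGGINGFACEHUB_API_TOKEN"],
    )
    prompt = PromptTemplate(
        template=PROMPT_TEXT, input_variables=["context", "question"]
    )
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # put all retrieved chunks into a single prompt
        retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
        return_source_documents=True,
        chain_type_kwargs={"prompt": prompt},
    )
```

Keeping the embedding step in its own function means the comparatively slow indexing can run once, for example when a button is clicked, while the chain is rebuilt cheaply around the stored vector store.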

Libraries Used

  • langchain==0.1.20
  • langchain-community==0.0.38
  • langchain-huggingface==0.0.1
  • faiss-cpu==1.8.0
  • python-dotenv==1.0.1

Installation

  1. Prerequisites
    • Git
    • Command line familiarity
  2. Clone the Repository: git clone https://github.com/NebeyouMusie/End-to-End-Gen-AI-Powered-App.git
  3. Create and Activate Virtual Environment (Recommended)
    • python -m venv venv
    • source venv/bin/activate
  4. Navigate to the project directory in your terminal: cd ./End-to-End-Gen-AI-Powered-App
  5. Install Libraries: pip install -r requirements.txt
  6. Navigate to the app directory in your terminal: cd ./app
  7. Run the app: streamlit run app.py
  8. Open the link displayed in the terminal in your preferred browser
  9. Click the Embedd Documents button and wait until the documents are processed
  10. Enter a question about the PDFs found in the us-census-data directory
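
Behind the last step, a question typed into the app is presumably passed to the RetrievalQA chain. A minimal sketch of that call, assuming the chain was created with return_source_documents=True; the answer_question helper is hypothetical and is not taken from the repository:

```python
def answer_question(qa_chain, question):
    """Run one query against a RetrievalQA-style chain.

    Returns the answer text and a de-duplicated list of source labels.
    `qa_chain` is any object whose invoke() accepts {"query": ...} and
    returns a dict with "result" and, optionally, "source_documents".
    """
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")

    response = qa_chain.invoke({"query": question})

    sources = []
    for doc in response.get("source_documents", []):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page")  # PDF loaders typically store a 0-based page index
        label = source if page is None else f"{source} (page {page})"
        if label not in sources:
            sources.append(label)

    return response["result"].strip(), sources
```

Returning the sources alongside the answer makes it easy to show which PDF pages a response was drawn from.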

Collaboration

  • Collaboration is welcome ❤️

Acknowledgments

Contact