## LangChain + PDF Full Stack App

## IMPORTANT: Installation with the exact packages we used
* When you download a full stack app, make sure that both the backend and the frontend use the original package versions. Installing newer versions of these packages can cause errors.
* Since we used pip to install the original backend packages and froze them using pip freeze, you will now use "pip install -r requirements.txt" to install them.
* Since we used npx to install the original frontend packages, you will now use "npm ci" to install them.
#### Backend installation
* In the terminal, make sure you are in the root directory of the project (1025-langchain-plus-full-stack-pdf-loading-app).
* **Go to the backend directory, create a virtual environment and use pip install to make sure you install the exact same packages we used**:
    * cd 001-langchain-pdf-fastapi-backend
    * pyenv virtualenv 3.11.4 your-virtual-environment-name
    * pyenv activate your-virtual-environment-name
    * pip install -r requirements.txt
#### Frontend installation
* Open a second terminal window and make sure you are in the root directory of the project (1025-langchain-plus-full-stack-pdf-loading-app).
* **Go to the frontend directory, and use npm ci to make sure you install the exact same packages we used**:
    * cd 002-langchain-pdf-vercel-frontend
    * cd langchain-pdf-app
    * npm ci
#### Ready to go!
* You can now see the code of the app in Visual Studio Code.
* You can now run the app on your computer, but remember that in order for it to work you will need to configure AWS S3.
* Relax and review the following steps. Since you have already installed the packages, you will not have to install them again.

## This is the LangChain version we used in this project
* If you want to replicate exactly what you see in the video, make sure you are in the backend directory with the virtual environment activated, then install the same LangChain version from your terminal:

In [None]:
#pip install langchain==0.1.1
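
To confirm that the pinned version is the one active in your virtual environment, a small check like the following can help. This is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, expected):
    """True only when `package` is installed at exactly the `expected` version."""
    return installed_version(package) == expected


# Example: compare the active environment against the version used in this project.
print("langchain version:", installed_version("langchain"))
print("matches 0.1.1:", matches_pin("langchain", "0.1.1"))
```

If the second line prints `False`, you are probably in a different virtual environment or a newer LangChain release was installed.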

## Remember to install boto3 and python-multipart

In [1]:
#pip install boto3

In [None]:
#pip install python-multipart

### Add OpenAI key in backend/.env
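
LangChain's OpenAI wrapper reads the key from the `OPENAI_API_KEY` environment variable by default, so the `.env` file would normally contain a line of the form `OPENAI_API_KEY=your-key`. How the project loads `.env` into the environment is not shown here, so the sketch below only checks the result: it reports which required variables are still missing. The helper name `missing_env_vars` is hypothetical.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or blank in `environ`.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to try out without touching real settings.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


# Example: run this after the backend has loaded its .env file.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("OPENAI_API_KEY is set")
```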

### Add imports in backend/routers/pdfs.py

In [None]:
# from langchain import OpenAI, PromptTemplate
# from langchain.chains import LLMChain

### Add basic LangChain code in backend/routers/pdfs.py

In [None]:
# # LANGCHAIN
# langchain_llm = OpenAI(temperature=0)

# summarize_template_string = """
#         Provide a summary for the following text:
#         {text}
# """

# summarize_prompt = PromptTemplate(
#     template=summarize_template_string,
#     input_variables=['text'],
# )

# summarize_chain = LLMChain(
#     llm=langchain_llm,
#     prompt=summarize_prompt,
# )

# @router.post('/summarize-text')
# async def summarize_text(text: str):
#     summary = summarize_chain.run(text=text)
#     return {'summary': summary}
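
To see what the chain actually sends to the model, it can help to render the template by hand. The sketch below is plain Python, not LangChain: it relies on the fact that `PromptTemplate` uses `{name}` placeholders by default, which `str.format` fills in the same way. The function `render_prompt` and the sample text are illustrative only.

```python
summarize_template_string = """
        Provide a summary for the following text:
        {text}
"""


def render_prompt(template, **variables):
    """Fill the {placeholders} of `template` with the given variables."""
    return template.format(**variables)


# The rendered string is the prompt the LLM would receive for this input.
prompt = render_prompt(
    summarize_template_string,
    text="LangChain is a framework for building applications with LLMs.",
)
print(prompt)
```

Note that `text` is declared as a plain `str` parameter in the route, so FastAPI reads it from the query string rather than from a JSON body.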

### Check backend

In [2]:
#uvicorn main:app --reload

http://127.0.0.1:8000/docs
* Check how POST /pdfs/summarize-text works

### Add advanced LangChain code in backend/routers/pdfs.py

The following route includes the RAG technique to ask a question about the PDF file identified by id:

In [2]:
# @router.post("/qa-pdf/{id}")
# def qa_pdf_by_id(id: int, question_request: QuestionRequest, db: Session = Depends(get_db)):
#     # Look up the PDF record; pdf.file is the location passed to the loader
#     pdf = crud.read_pdf(db, id)
#     if pdf is None:
#         raise HTTPException(status_code=404, detail="PDF not found")
#     print(pdf.file)
#     # Load the PDF and split it into overlapping chunks
#     loader = PyPDFLoader(pdf.file)
#     document = loader.load()
#     text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=400)
#     document_chunks = text_splitter.split_documents(document)
#     # Embed the chunks and index them in an in-memory FAISS store
#     embeddings = OpenAIEmbeddings()
#     stored_embeddings = FAISS.from_documents(document_chunks, embeddings)
#     # Retrieve the relevant chunks and pass them to the LLM together with the question
#     QA_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=stored_embeddings.as_retriever())
#     question = question_request.question
#     answer = QA_chain.run(question)
#     return answer
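
The route above follows the usual RAG sequence: split, embed, retrieve, then "stuff" the retrieved chunks into one prompt. The toy sketch below illustrates that sequence with the standard library only. It is not how LangChain, OpenAI embeddings or FAISS work internally: word counts stand in for embeddings, a sorted list stands in for the vector store, and all function names and the prompt wording are invented for illustration.

```python
import math
import re
from collections import Counter


def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut `text` into character windows that share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - chunk_overlap
    while text:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


def bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(chunks, question, k=1):
    """Return the k chunks most similar to the question."""
    q = bag_of_words(question)
    return sorted(chunks, key=lambda c: cosine(bag_of_words(c), q), reverse=True)[:k]


def build_stuff_prompt(chunks, question):
    """'Stuff' strategy: put all retrieved chunks into a single prompt."""
    context = "\n\n".join(chunks)
    return f"Use the context to answer the question.\n\nContext:\n{context}\n\nQuestion: {question}"


chunks = ["The invoice total is 40 euros.", "Shipping takes five days.", "Returns are free."]
top = retrieve(chunks, "How long does shipping take?", k=1)
print(build_stuff_prompt(top, "How long does shipping take?"))
```

In the real route, `chunk_size=3000` and `chunk_overlap=400` play the role of the two splitter arguments, and the overlap keeps sentences that straddle a chunk boundary retrievable from either side.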

In order for it to work, we need to add the following lines at the top of the file:

In [None]:
# from langchain.document_loaders import PyPDFLoader
# from langchain.text_splitter import RecursiveCharacterTextSplitter
# from langchain.embeddings.openai import OpenAIEmbeddings
# from langchain.vectorstores import FAISS
# from langchain.chains import RetrievalQA
# from schemas import QuestionRequest
# llm = OpenAI()

You can check it by running the backend again:

In [3]:
#uvicorn main:app --reload

Click the following link to open the FastAPI docs: http://127.0.0.1:8000/docs
* There, you can check POST /pdfs/qa-pdf/{id}
* Remember, this will only work if you enter the id of a PDF that is actually stored in your AWS S3 bucket.
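
The same endpoint can also be called from a script, which shows the request shape the frontend will need to send. This is a minimal sketch using only the standard library. It assumes the local address and `/pdfs` prefix shown above, and a JSON body with a single `question` field (the route reads `question_request.question`). The names `build_qa_request` and `ask` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local uvicorn address used in this guide


def build_qa_request(pdf_id, question):
    """Build (but do not send) the POST request for /pdfs/qa-pdf/{id}."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/pdfs/qa-pdf/{pdf_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(pdf_id, question):
    """Send the question to the running backend and return the decoded answer."""
    with urllib.request.urlopen(build_qa_request(pdf_id, question)) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the backend to be running and a PDF with this id to exist):
# print(ask(1, "What is this document about?"))
```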

## Task for you: update frontend

### Desired behavior:
* The user selects one PDF file.
* They can then enter a question about it in the input box below the main pdf-list box.
* After the question is submitted, the answer is displayed below the input box.