# Querying OpenAI and RAG

This notebook implements a QA object that lets the user ask questions about ProRail.
The questions are scoped to ProRail's business strategy.
Example questions:
- "What are the points of concern in the last years with respect to the corporate activities of ProRail?"
- "What are the future business objectives of ProRail?"
- "Is the Dutch government satisfied with the operational performance of ProRail?"

## Update your local environment
Assuming you have already created the conda environment `ddd-llm-app`, proceed with the following steps.

Start a new terminal window (console), cd to the project's root directory and run the following command:

````
conda activate ddd-llm-app
````

Update your local conda environment `ddd-llm-app` with `requirements_local.txt` to make sure you have all required dependencies:

````
pip install -r requirements_local.txt
````

## Prepare and start Jupyter for running notebooks
If you have no Jupyter server installed for running notebooks, install Jupyter Lab first.

If the environment `ddd-llm-app` is not yet active, cd to the project's root directory and run the following conda command to activate it:
````
conda activate ddd-llm-app
````

Then, in the activated environment `ddd-llm-app` install Jupyter Lab if it is not already installed in your environment:
````
pip install jupyterlab
````

To start the Jupyter server, execute the following command:

````
jupyter lab
````



In [1]:
import logging
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from dotenv import load_dotenv

## Configurations

In [2]:
load_dotenv()

True
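
The `True` above only indicates that `load_dotenv()` set at least one variable from a `.env` file. It does not confirm that the OpenAI key itself is present. A minimal sketch of an explicit check follows. The helper name `missing_env_vars` is hypothetical, and the check assumes the key is supplied through `OPENAI_API_KEY`, the variable the OpenAI client reads by default.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumption: the API key is provided via OPENAI_API_KEY (e.g. in the .env file).
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv()` surfaces a missing key before the first API call is made.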

In [3]:
# Logging setup
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")

# Configuration
VECTOR_STORE_PATH = "prorail_dom_idx"  # Persistent FAISS store

In [8]:
def get_qa(vector_store_path: str = VECTOR_STORE_PATH):
    """Create and return a RetrievalQA chain backed by ChatOpenAI, OpenAI embeddings and the persisted FAISS vector store."""

    embedding_model = OpenAIEmbeddings()
    # Load the persisted FAISS index
    vector_store = FAISS.load_local(vector_store_path, embedding_model, allow_dangerous_deserialization=True)
    llm = ChatOpenAI(model_name="gpt-4o", temperature=0.5)  # Adjust model as needed
    retriever = vector_store.as_retriever(search_kwargs={"k": 3})  # Retrieve top 3 relevant chunks
    qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
    return qa_chain
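
To make the two settings in `get_qa` more concrete, the toy sketch below mimics what `k=3` retrieval and the `"stuff"` chain type do conceptually: rank the stored chunks by similarity to the query, keep the top three, and place them all in a single prompt. This is not LangChain's or FAISS's implementation. The function names, the 2-D vectors and the prompt layout are all made up for illustration, and cosine similarity is used here only as an example metric (the persisted index may use a different one).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, indexed_chunks, k=3):
    """Return the k chunk texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def stuff_prompt(question, chunks):
    """Concatenate all retrieved chunks into one prompt (the 'stuff' idea)."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical index: (chunk text, embedding). Real embeddings have far more dimensions.
toy_index = [
    ("chunk about punctuality", [1.0, 0.0]),
    ("chunk about safety", [0.8, 0.6]),
    ("chunk about finances", [0.0, 1.0]),
    ("chunk about staffing", [-1.0, 0.0]),
]

selected = top_k_chunks([1.0, 0.1], toy_index, k=3)
prompt = stuff_prompt("How punctual were the trains?", selected)
print(prompt)
```

Because every retrieved chunk ends up in one prompt, raising `k` increases the context the model sees but also the token count per request.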

In [9]:
# get a QA object for asking questions
qa = get_qa()

In [10]:
qa.invoke("Tell something about Prorail with respect to its annual report of 2023.")

2025-02-08 11:32:28,053 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-02-08 11:32:32,588 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'query': 'Tell something about Prorail with respect to its annual report of 2023.',
 'result': 'ProRail\'s annual report for 2023, titled "Verbindt. Verbetert. Verduurzaamt." (Connects. Improves. Sustains.), covers various aspects of its operations and performance for the year. The report is structured with reference to the GRI Standards, and Deloitte Accountants has provided limited assurance on the non-financial information contained in the report. Key sections of the report include \'Kerncijfers\', \'Het jaar 2023\', \'Profiel\', \'Onze prestaties\', \'Medewerkers\', \'Omgeving\', \'Besturing\', and \'Maatschappelijke verslaglegging\'.\n\nThe report indicates that there were no significant changes in the scope, structure, ownership, or value chain of ProRail during the reporting year. ProRail operates in the Netherlands, and the report does not include a regional breakdown. The report also highlights ProRail\'s focus on safety, punctuality, reliability, and the impact of infrastruc