# Q&A Over German Citizenship Law (Modernisierung des Staatsangehörigkeitsrechts (GesEntw BReg))

Source: https://www.bundestag.de/parlament/plenum/abstimmung/abstimmung?id=893
* https://dserver.bundestag.de/btd/20/090/2009044.pdf
* https://dserver.bundestag.de/btd/20/100/2010093.pdf

Using: 
* Chroma as vectorstore
* OpenAI Embeddings
* OpenAI API as LLM

Before you start, set the following environment variable on your computer:

**OpenAI**
* OPENAI_API_KEY


To set it on macOS (or in any Unix-like shell), run this in your terminal:

```
export OPENAI_API_KEY=your_api_key
```
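
Because every later step depends on this key, it can help to fail early with a readable message if it is missing. The following is a minimal sketch using only the standard library; `require_env` is a hypothetical helper name, not part of LangChain or the OpenAI client.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before starting the notebook."
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` in the first notebook cell would surface a missing key before any API request is made.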



Also create and activate a conda environment to work in:

```
conda create -n chat-with-citizenship-laws python=3.10
conda activate chat-with-citizenship-laws
```


----

### Step 1: Import Libraries

In [None]:
! pip install langchain langchain_community langchain_openai gradio chromadb pypdf

In [None]:
# Import Libraries
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
import gradio as gr

### Step 2: Define llm and embedding models

In [None]:
llm = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0)
# To use a local model through LM Studio, set llm like the commented line below
# llm = ChatOpenAI(base_url="http://localhost:1234/v1", temperature=0)

In [None]:
embedding = OpenAIEmbeddings()


### Step 3: Process Data & Set up Vector Database

#### Step 3.1: Define the data (URLs in this case) for the vector database

Note: if you want to include different information in your RAG system, this is where you can add it.

In [None]:
# German citizenship law: legislative documents

# See: https://www.bundestag.de/parlament/plenum/abstimmung/abstimmung?id=893

Gesetzentwurf =  "https://dserver.bundestag.de/btd/20/090/2009044.pdf"
Beschlussempfehlung_und_Bericht = "https://dserver.bundestag.de/btd/20/100/2010093.pdf"

# Put them in a list so that we can loop over them
urls = [Gesetzentwurf, Beschlussempfehlung_und_Bericht]

#### Step 3.2: Split PDFs

In [None]:
pages = []

for url in urls:
    loader = PyPDFLoader(url)
    pages += loader.load_and_split()


In [None]:
# Sanity Check: are there any pages? (this should be non-zero)
len(pages)
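
Beyond the total count, it can be useful to confirm that both PDFs actually contributed chunks. The sketch below assumes each loaded document exposes a `metadata` dict with a `"source"` key, which is how `PyPDFLoader` usually labels its output; `chunks_per_source` is a hypothetical helper name.

```python
from collections import Counter


def chunks_per_source(docs):
    """Count chunks per source file, based on each chunk's metadata."""
    # Falls back to "unknown" if a chunk carries no "source" entry.
    return Counter(doc.metadata.get("source", "unknown") for doc in docs)
```

Running `chunks_per_source(pages)` should list one entry per URL in `urls`; a missing entry would suggest that a download or parse step failed.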

#### Step 3.3: Create a vector database with Chroma


In [None]:
# Create the vectorstore in Chroma
vectorstore = Chroma.from_documents(
    documents = pages, 
    embedding=embedding
    )

In [None]:
# Connect your retriever to the vector store
retriever = vectorstore.as_retriever()
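
Conceptually, the retriever embeds the question and returns the stored chunks whose embeddings are closest to it. The toy sketch below illustrates that idea with plain Python lists and cosine similarity. It is not Chroma's implementation (Chroma uses its own index and distance settings), and the vectors and the choice of `k` are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors most similar to query_vec, best first."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-dimensional "embeddings"; real embeddings have many more dimensions.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(top_k([1.0, 0.1], toy_docs, k=2))  # prints [0, 2]
```

In the notebook itself, the number of returned chunks can typically be adjusted with `vectorstore.as_retriever(search_kwargs={"k": ...})`.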

-----

### Step 4: Set up the Prompt

#### Step 4.1: Define the prompt template

In [None]:
template = """
        ###INSTRUCTIONS: 
        You are a polite and professional question-answering AI assistant. You must provide a helpful response to the user.
        
        In your response, PLEASE ALWAYS:
          (0) Be a detail-oriented reader: read the question and context and understand both before answering
          (1) Start your answer with a friendly tone, and reiterate the question so the user is sure you understood it
          (2) If the context enables you to answer the question, write a detailed, helpful, and easily understandable answer with sources referenced inline. If the context does not contain the answer, respond with an explanation, starting with: "I couldn't find the information in the laws I have access to".
          (3) Below the answer, please list out all the referenced sources (i.e. legal paragraphs backing up your claims)
          (4) Now you have your answer, that's amazing - review your answer to make sure it answers the question, is helpful and professional and formatted to be easily readable.
        
        Think step by step. 
        ###
        
        Answer the following question using the context provided.
        ### Question: {question} ###

        ### Context: {context} ###

        

        ### Helpful Answer with Sources:

        """

# Create the prompt template
prompt = PromptTemplate.from_template(template)

#### Step 4.2: Create the Chain

In [None]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
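
In the chain above, the retriever's output (a list of document objects) is inserted into the prompt as-is, so the model sees their raw string representation. One possible refinement is to join the chunks into clean text first. The sketch below assumes each document has `page_content` and a `metadata` dict with `"source"` and `"page"` keys; `format_docs` is a hypothetical helper, not something the original notebook defines.

```python
def format_docs(docs):
    """Join retrieved chunks into one string, tagging each with its source and page."""
    parts = []
    for doc in docs:
        source = doc.metadata.get("source", "unknown source")
        # PyPDFLoader page numbers are usually zero-based.
        page = doc.metadata.get("page", "?")
        parts.append(f"[{source}, page {page}]\n{doc.page_content}")
    return "\n\n".join(parts)


# Possible use in the chain (not executed here):
# chain = (
#     {"context": retriever | format_docs, "question": RunnablePassthrough()}
#     | prompt
#     | llm
#     | StrOutputParser()
# )
```

Tagging each chunk with its source also gives the model something concrete to cite when the prompt asks for referenced sources.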

**Sanity Check:** Check that it's working by running the following cell:

In [None]:
ans = chain.invoke("What changed in German citizenship law?")

print(ans)

-----

### Step 5: Set up Simple UI using Gradio

#### Step 5.1: Create a function to use in Gradio
Gradio builds its UI around a Python function, so we wrap the chain in one: it takes the user's question as input and returns the answer to display.

In [None]:
# Create the function get_answer which takes in a question and returns an answer
def get_answer(question):
    answer = chain.invoke(question)
    return answer
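
The function above passes every input straight to the chain, so an empty submission or an API error would surface as a raw exception in the UI. A more defensive variant might look like the sketch below; `make_safe_answer` is a hypothetical helper, and the user-facing messages are placeholders.

```python
def make_safe_answer(invoke):
    """Wrap a question -> answer callable with basic input and error handling."""

    def safe_answer(question):
        # Skip the model call entirely for empty or whitespace-only input.
        if not question or not question.strip():
            return "Please enter a question."
        try:
            return invoke(question.strip())
        except Exception as exc:  # broad on purpose: any failure becomes a message
            return f"Sorry, something went wrong: {exc}"

    return safe_answer
```

With this in place, `get_answer = make_safe_answer(chain.invoke)` could be handed to Gradio instead of the plain function.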

#### Step 5.2: Create and run the Gradio interface

In [None]:

iface = gr.Interface(
    fn=get_answer,
    # placeholder shows hint text without pre-filling the box
    inputs=gr.Textbox(placeholder="Enter your question"),
    outputs="markdown",
    live=False,
    title="Chat with the New German Citizenship Laws",
    description="Ask a question about the new German citizenship law and get an answer from a friendly AI assistant. The assistant looks up the relevant passages in the legislative documents and answers your question.",
    examples=[
        ["What changed in German citizenship law?"],
        ["Do you need to take a citizenship test before you can get citizenship?"],
        ["Do I need to renounce citizenship of my home country to get German citizenship?"],
    ],
    theme=gr.themes.Soft(),
    allow_flagging="never",
)

iface.launch()