### Domain-specific chatbot with RAG

In [30]:
from langchain_openai import OpenAI
from langchain_community.document_loaders.csv_loader import CSVLoader
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.vectorstores import VectorStoreRetriever
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
import pandas as pd
import numpy as np
import os

In [31]:
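# Raises KeyError right away if "gpt_api_key" is unset (os.getenv would return None and fail with a less clear TypeError)
os.environ['OPENAI_API_KEY'] = os.environ["gpt_api_key"]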

In [32]:
# File paths for the CSV files
qa_file_path = r"D:\Achievements\freelancing\GenerativeAI\Domain_specific_with_RAG\Data\questin_answering.csv"
upflairs_qa_file_path = r"D:\Achievements\freelancing\GenerativeAI\Domain_specific_with_RAG\Data\upflairs_question_answer.csv"

# Reading the CSV files into DataFrames
df1 = pd.read_csv(qa_file_path)
df2 = pd.read_csv(upflairs_qa_file_path)

combine_dataset = pd.concat([df2, df1])
print("Number of question-answer pairs in the dataset:", combine_dataset.shape[0])

final_data_file_path = r"D:\Achievements\freelancing\GenerativeAI\Domain_specific_with_RAG\Data\final_data.csv"
combine_dataset.to_csv(final_data_file_path, index=False)
print("Successfully saved the final dataset at:", final_data_file_path)

Number of question-answer pairs in the dataset: 38
Successfully saved the final dataset at: D:\Achievements\freelancing\GenerativeAI\Domain_specific_with_RAG\Data\final_data.csv


In [33]:
loader = CSVLoader(file_path=final_data_file_path, source_column="Question")
data = loader.load()
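
For intuition about what `loader.load()` returns, the sketch below mimics the per-row layout visible in the search output further down in this notebook: `page_content` holds one `Column: value` line per column, and `metadata` holds `source` and `row`. It uses only the standard library; `rows_to_documents` and the sample CSV text are hypothetical stand-ins, not the `CSVLoader` implementation.

```python
import csv
import io

def rows_to_documents(csv_text, source_column):
    """Turn each CSV row into a dict shaped like a loaded document (illustrative only)."""
    documents = []
    for row_number, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # One "Column: value" line per column, joined with newlines
        page_content = "\n".join(f"{column}: {value}" for column, value in row.items())
        metadata = {"source": row[source_column], "row": row_number}
        documents.append({"page_content": page_content, "metadata": metadata})
    return documents

# Hypothetical two-column sample in the same Question/Answer layout
sample_csv = 'Question,Answer\nHow do you print text?,"print(""hi"")"\n'
docs = rows_to_documents(sample_csv, source_column="Question")
print(docs[0]["page_content"])
print(docs[0]["metadata"])
```

Because the question and the answer end up in the same `page_content`, both are embedded together, which is worth keeping in mind when reading the similarity results.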


In [34]:
vectordb_file_path = "fais_index"
embeddings = OpenAIEmbeddings()
vectordb = FAISS.from_documents(data, embeddings)

# Save vector database locally
vectordb.save_local(vectordb_file_path)

In [35]:
query_answer = vectordb.similarity_search("how we can print hello world in python?")
page_content = query_answer[0].page_content

# Split the content at "Answer:" and get the part after it
answer = page_content.split("Answer:")[1].strip()

print(answer)

import pickle
obj = MyClass()
with open('data.pkl', 'wb') as f:
	pickle.dump(obj, f)
with open('data.pkl', 'rb') as f:
	obj_loaded = pickle.load(f)
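
Note that `page_content.split("Answer:")[1]` raises `IndexError` when a document has no `Answer:` marker, and it cuts the text short if the answer itself contains that marker. A slightly more defensive sketch, using only `str.partition`; `extract_answer` is a hypothetical helper name, not part of the notebook.

```python
def extract_answer(page_content, marker="Answer:"):
    """Return the text after the first marker, or None when the marker is absent."""
    _, found, answer = page_content.partition(marker)
    return answer.strip() if found else None

example = "Question: How do you print text?\nAnswer: print('hi')"
print(extract_answer(example))
print(extract_answer("no marker here"))
```

With the vector store above, this could be called as `extract_answer(query_answer[0].page_content)`.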


In [36]:
query_answer = vectordb.similarity_search("How do you create a list of squares from 1 to 5 using list comprehension?")
page_content = query_answer[0].page_content

# Split the content at "Answer:" and get the part after it
answer = page_content.split("Answer:")[1].strip()

print(answer)

[x for x in range(1, 101) if x % 2 == 0]


In [37]:
response_with_score = vectordb.similarity_search_with_score("how we can print hello world in python?")
response_with_score

[(Document(metadata={'source': 'How do you serialize and deserialize a Python object at Upflairs using the pickle module?', 'row': 33}, page_content="Question: How do you serialize and deserialize a Python object at Upflairs using the pickle module?\nAnswer: import pickle\nobj = MyClass()\nwith open('data.pkl', 'wb') as f:\n\tpickle.dump(obj, f)\nwith open('data.pkl', 'rb') as f:\n\tobj_loaded = pickle.load(f)"),
  0.4787575),
 (Document(metadata={'source': 'How do you write a Python program at Upflairs to reverse a string without using built-in reverse functions?', 'row': 18}, page_content="Question: How do you write a Python program at Upflairs to reverse a string without using built-in reverse functions?\nAnswer: def reverse_string(s):\n\tresult = ''\n\tfor char in s:\n\t\tresult = char + result\n\treturn result\nprint(reverse_string('Welcome to Upflairs'))"),
  0.48692828),
 (Document(metadata={'source': 'How do you create a REST API at Upflairs using Flask that handles GET and POS
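
The scores above are distances (with the default FAISS index in LangChain, lower means closer), so the best hit for a "hello world" query is a pickle question at roughly 0.48. A minimal sketch of guarding against such weak matches with a distance cutoff; `filter_by_distance` and the 0.4 cutoff are hypothetical, and the tuples stand in for the `(Document, score)` pairs returned by `similarity_search_with_score`.

```python
def filter_by_distance(scored_hits, max_distance):
    """Keep (item, distance) pairs whose distance is at or below the cutoff."""
    return [(item, distance) for item, distance in scored_hits if distance <= max_distance]

# Stand-ins for (Document, score) pairs, using the two distances printed above
hits = [("pickle question", 0.4787575), ("reverse string question", 0.48692828)]

kept = filter_by_distance(hits, max_distance=0.4)  # hypothetical cutoff
if not kept:
    print("No sufficiently close match; fall back to a default reply.")
```

A suitable cutoff depends on the embedding model and the dataset, so it would need to be tuned on known good and bad queries.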

### Question answering over the vector database with an LLM

In [38]:
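# Note: score_threshold passed directly like this appears to be ignored; a threshold is normally set with
# as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.7})
retriever = vectordb.as_retriever(score_threshold=0.7)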
query_chain = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type='stuff',retriever=retriever)
query = "how we can print welcome at upflairs 10 times?"
query_chain.invoke(query)

{'query': 'how we can print welcome at upflairs 10 times?',
 'result': " I'm sorry, I am an AI and do not have access to printing capabilities. Please contact Upflairs directly for assistance with printing."}

In [39]:
retriever = vectordb.as_retriever()
query_chain = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type='stuff',retriever=retriever)
query = "how we can print welcome at upflairs 10 times?"
print(query_chain.invoke(query)['result'])

 I'm sorry, I cannot generate or print any text as I am a computer program.


In [40]:
def get_qa_chain(retriever):
    """Build a RetrievalQA chain that answers with the Python-only prompt defined below."""

    # prompt_template = """
    # Give the response only related to the EdTech domain, covering both theoretical and coding aspects.
    # The example must be simple to understand for beginners, and the response should include:
    # 1. A brief theoretical explanation.
    # 2. A small code snippet.
    # 3. A clear and simple example.
    # You are a coding assistant specialized in generating responses related **only to Python programming**.
    # If the question is not related to Python programming, politely refuse to answer by saying:
    # 'I only provide support for Python programming topics. Please ask something related to Python'
    # If the question does not name a specific programming language, write content related to Python.

    # CONTEXT: {context}
    # My question is: {question}
    # """


    # prompt_template = """
    # You are a coding assistant specializing **only in Python programming**, particularly within the EdTech domain. Your response should cover both theoretical and coding aspects while ensuring examples are simple and beginner-friendly. If the question is unrelated to Python, politely refuse by saying: 
    # 'I only provide support for Python programming topics. Please ask something related to Python.'

    # For each response, follow this structure:

    # 1. **Theoretical Explanation**: Provide a brief and clear explanation of the concept.
    # 2. **Code Snippet**: Include a small, simple code snippet related to the topic.
    # 3. **Example**: Show a clear and easy-to-understand example that illustrates the concept.

    # If no specific programming language is mentioned in the question, assume it is about Python and proceed accordingly.

    # **CONTEXT**: {context}
    # **My question is**: {question}
    # """

    prompt_template = """
    You are a coding assistant specializing **only in Python programming**, particularly within the EdTech domain. Your response should cover both theoretical and coding aspects while ensuring examples are simple and beginner-friendly. If the question is unrelated to Python, politely refuse by saying: 
    'I only provide support for Python programming topics. Please ask something related to Python.'

    For each response, follow this structure:

    1. **Theoretical Explanation**: Provide a brief and clear explanation of the concept.
    2. **Code Snippet**: Include a small, simple code snippet related to the topic.
    3. **Example**: Show a clear and easy-to-understand example that illustrates the concept.

    **Important**: 
    - If no specific programming language is mentioned, automatically assume the question is about Python and proceed accordingly.
    - If the question explicitly mentions a different programming language, provide the polite refusal mentioned above.

    **CONTEXT**: {context}
    **My question is**: {question}
    """



    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context","question"]
    )

    chain = RetrievalQA.from_chain_type(llm=OpenAI(),
                                        chain_type="stuff",
                                        retriever=retriever,
                                        input_key="query",
                                        return_source_documents=True,
                                        chain_type_kwargs={"prompt": PROMPT})

    return chain


chain = get_qa_chain(retriever)
# Calling the chain object directly is deprecated; invoke() is the supported entry point
print(chain.invoke("How we can print hello world?")['result'])



    **Theoretical Explanation**: Printing "Hello World" is often the first program a beginner writes when learning a new programming language. It is a simple way to test if the code is running correctly and to get familiar with the basic syntax of the language.

    **Code Snippet**: 
    print("Hello World")

    **Example**: In the code snippet above, the "print()" function is used to display the string "Hello World" on the screen. This function takes in a string as its argument and outputs it to the console. So, when the code is executed, the output will be: "Hello World".
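
To make the `{context}` and `{question}` placeholders concrete: a "stuff" chain concatenates the retrieved documents into one context string and substitutes it into the template together with the user question. A rough standard-library sketch of that substitution, assuming the documents are joined with blank lines; `build_prompt` and the shortened template are hypothetical, not LangChain internals.

```python
def build_prompt(template, documents, question):
    """Join retrieved document texts and substitute them into the template."""
    context = "\n\n".join(documents)
    return template.format(context=context, question=question)

# Shortened stand-in for the full prompt template
template = "CONTEXT: {context}\nMy question is: {question}"
retrieved = [
    "Question: How do you print text?\nAnswer: print('hi')",
    "Question: How do you define a class?\nAnswer: class MyClass: pass",
]
prompt = build_prompt(template, retrieved, "How we can print hello world?")
print(prompt)
```

Since every retrieved document is pasted into a single prompt, the number of retrieved documents directly affects prompt length.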


In [41]:
print(chain.invoke("How we can print hello world java?")['result'])


    'I only provide support for Python programming topics. Please ask something related to Python.'


In [42]:
print(chain.invoke("write a program to calculate the average of given n item of integer items")['result'])



    **Theoretical Explanation**: To calculate the average (or mean) of a given set of n integer items, we need to sum all the items and then divide the sum by n. This can be represented by the formula: average = (sum of items) / n.

    **Code Snippet**: 
    ```python
    def calculate_average(n, items):
        sum = 0
        for i in range(n):
            sum += items[i]
        average = sum / n
        return average
    ```

    **Example**: Let's say we have a list of 5 integer items: [10, 20, 30, 40, 50]. We want to calculate the average of these 5 items. Using the above function, we can do it like this:
    ```python
    items = [10, 20, 30, 40, 50]
    n = len(items)
    average = calculate_average(n, items)
    print(average)
    ```
    Output: 30.0


In [43]:
question = "How do you define a class in python?"
print(chain.invoke(question)['result'])


1. **Theoretical Explanation**: A class in Python is a blueprint or a template that is used to create objects. It is a collection of attributes (variables) and methods (functions) that define the behavior and properties of an object. It allows for code reusability and helps to organize and structure code into logical units.

2. **Code Snippet**: 
```
class MyClass:
    pass
```

3. **Example**: In this example, we define a class called `MyClass` using the `class` keyword. The `pass` keyword is used as a placeholder for now, as we have not defined any attributes or methods in this class. 


In [45]:
question = "who are you?"
print(chain.invoke(question)['result'])


    I only provide support for Python programming topics. Please ask something related to Python.


In [46]:

question = "How can I enroll in a course?"
print(chain.invoke(question)['result'])


    1. **Theoretical Explanation**: To enroll in a course, you will need to visit the Upflairs website and follow the enrollment process. This typically involves selecting the course you're interested in, completing your registration, and making payment.
    2. **Code Snippet**: N/A
    3. **Example**: To enroll in a course on Upflairs, you can follow these steps:
        - Visit the Upflairs website.
        - Browse and select the course you're interested in.
        - Click on the 'Enroll Now' button.
        - Complete your registration by filling in your personal details.
        - Choose a payment option and complete the payment process.
        - Once the payment is confirmed, you will be officially enrolled in the course.


In [47]:

question = "What courses does Upflairs offer?"
print(chain.invoke(question)['result'])


1. **Theoretical Explanation**: Upflairs offers a variety of courses in different fields such as Data Science, Machine Learning, DevOps, Full Stack Development, IoT, and System Embedding. These courses are designed to equip students with the necessary knowledge and skills to excel in their chosen field of study.

2. **Code Snippet**: N/A

3. **Example**: For example, if a student is interested in Data Science, Upflairs offers a comprehensive course covering topics such as data analysis, data visualization, and machine learning algorithms. This course will provide students with the necessary skills to work with large datasets and make data-driven decisions.
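
Since the chain is built with `return_source_documents=True`, each result dictionary also carries the retrieved documents under `source_documents`, which helps check whether an answer like the one above is grounded in the dataset. A small sketch for listing them; `FakeDocument`, `list_sources`, and the sample result are hypothetical stand-ins so the snippet runs without an API key.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes used here: page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def list_sources(result):
    """Return the 'source' metadata of every retrieved document in a chain result."""
    return [doc.metadata.get("source", "<unknown>") for doc in result.get("source_documents", [])]

# Hypothetical result shaped like the chain output ('query', 'result', 'source_documents')
sample_result = {
    "query": "What courses does Upflairs offer?",
    "result": "...",
    "source_documents": [
        FakeDocument("Question: ...\nAnswer: ...", {"source": "What courses does Upflairs offer?", "row": 0}),
    ],
}
for source in list_sources(sample_result):
    print("retrieved from:", source)
```

With the real chain, the same helper could be applied as `list_sources(chain.invoke(question))`.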

 
