<a href="https://colab.research.google.com/github/rayhanozzy/Mastering-AI/blob/main/Chatbot%252520using%252520Langchain/langchain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project: Langchain

**Description:**

You are assigned the novel `Pride and Prejudice` by Jane Austen:

* In text file format (.txt) as your source of data: https://www.gutenberg.org/cache/epub/1342/pg1342.txt
* Alternatively, you can use the HTML version: http://authorama.com/book/pride-and-prejudice.html

Your task is to:

* Create a chatbot that will receive a user query and get the answer based on the content of the novel.
* Create a Gradio interface for your chatbot.

**Notes:**

Please take note of the following important points while working on this project:

1. Do not change the Query Space code block. You can make a copy for your own inference.

2. Feel free to add new code blocks to separate your code into manageable pieces.

3. We recommend OpenAI, for which a trial version is still available. If you want to try another LLM, feel free to do so.

4. Pass `OPENAI_API_KEY` as an environment variable instead of writing it in the notebook, because the Google Colab notebook will be public. There are many ways to do this; here is one that you may use:
   - Install python-dotenv
   - Create an env file
   - Fill the env file with the key-value pair for OPENAI_API_KEY
   - Run the following magic command
     - `%load_ext dotenv`
     - `%dotenv ./openai.env`
   - You can check whether the API key is available using `os.environ`
     - `os.environ['OPENAI_API_KEY']`
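
Because the notebook is public, it can help to check for the key without printing it in full. The following is a minimal sketch using only the standard library; the helper names `require_env` and `mask` are illustrative assumptions, not part of the assignment or of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Load your env file first, "
            "for example with %dotenv ./openai.env"
        )
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the value is safe to display."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

For example, `print(mask(require_env("OPENAI_API_KEY")))` confirms that the key is loaded while showing only its last four characters.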

## Install and Import the `rggrader` Package

In [None]:
%pip install rggrader
from rggrader import submit_image
from rggrader import submit



## Working Space

In [None]:
# Write your code here
%pip install python-dotenv openai langchain



In [None]:
import os
import openai

from dotenv import load_dotenv, find_dotenv

# Read the local .env file into the process environment.
_ = load_dotenv(find_dotenv())

# Raises KeyError if OPENAI_API_KEY was not loaded.
openai.api_key = os.environ['OPENAI_API_KEY']

In [None]:
import requests
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader

# Download the novel from the remote URL and save it to a local file.
file_path = "/tmp/pride_and_prejudice.txt"
response = requests.get("https://www.gutenberg.org/cache/epub/1342/pg1342.txt")
response.raise_for_status()  # stop early if the download failed
with open(file_path, "wb") as f:
    f.write(response.content)

# Create a TextLoader object using the path to the local file.
loader = TextLoader(file_path, encoding="utf-8")

# Load the novel into a list of documents.
documents = loader.load()

# Split the documents on newlines into chunks of roughly 1000 characters,
# with 50 characters of overlap between neighbouring chunks.
c_splitter = CharacterTextSplitter(
    separator='\n',
    chunk_size=1000,
    chunk_overlap=50
)

docs = c_splitter.split_documents(documents)
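
The Project Gutenberg text file contains license text before and after the novel, which ends up in the index along with the story. The sketch below shows one way to remove it before splitting. It assumes the file uses the usual `*** START OF ...` and `*** END OF ...` marker lines; the function name `strip_gutenberg_boilerplate` is illustrative and not part of LangChain.

```python
import re

# Marker lines that usually surround the book text in Project Gutenberg files.
START_RE = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)


def strip_gutenberg_boilerplate(text: str) -> str:
    """Return the text between the START and END markers, if they are present."""
    start = START_RE.search(text)
    end = END_RE.search(text)
    begin = start.end() if start else 0
    # Ignore an END marker that appears before the START marker.
    finish = end.start() if end and end.start() >= begin else len(text)
    return text[begin:finish].strip()
```

In this notebook it could be applied before splitting, for example with `documents[0].page_content = strip_gutenberg_boilerplate(documents[0].page_content)`, assuming the loader returned a single document.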

In [None]:
%pip install langchain sentence_transformers



In [None]:
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings()

In [None]:
%pip install tiktoken faiss-gpu



In [None]:
from langchain.vectorstores import FAISS

# Embed every chunk and build the FAISS index used for retrieval.
db = FAISS.from_documents(docs, embeddings)

In [None]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

OpenAIModel = 'gpt-3.5-turbo'
llm = ChatOpenAI(model=OpenAIModel, temperature=0.1)

qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
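
`db.as_retriever()` gives the chain the chunks whose embeddings are closest to the embedding of the query. The idea can be illustrated with a toy ranking over plain Python lists. This is only a sketch: cosine similarity, the function names, and `k` are illustrative assumptions, and FAISS uses its own index structures and distance metric rather than this loop.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          chunk_vecs: List[Sequence[float]],
          k: int = 4) -> List[int]:
    """Return the indices of the k chunk vectors most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

The chunks at the returned indices are what a retrieval chain would place in the prompt as context for the language model.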

## Query Space

In [None]:
query = "What are the full names of the two main characters in Pride and Prejudice ?"
answer = qa.run(query)

In [None]:
answer

'The full names of the two main characters in Pride and Prejudice are Elizabeth Bennet and Fitzwilliam Darcy.'

## Submit Gradio screenshot

You need to submit a screenshot of your Gradio app. In Google Colab, you can use the "Folder" sidebar and click the upload button.

Make sure your screenshot meets the following requirements:

- It should have an input box where the user types the query and an output box that displays the answer.
- It should show the query and the answer from the Query Space block in the respective boxes.

Example of Expected Output:

![gradio-result](https://storage.googleapis.com/rg-ai-bootcamp/projects/langchain-gradio.png)
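
The two requirements above map onto a function that takes the query text and returns the answer text. The following is a minimal sketch of such a handler that also copes with empty input and failed requests; `make_chatbot` and its messages are hypothetical names chosen for illustration, and `answer_fn` stands in for whatever callable produces the answer.

```python
from typing import Callable


def make_chatbot(answer_fn: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an answering function so it is safe to use as a UI callback."""

    def chatbot(query: str) -> str:
        query = (query or "").strip()
        if not query:
            # Avoid calling the model for an empty text box.
            return "Please enter a question about the novel."
        try:
            return answer_fn(query)
        except Exception as exc:
            # Show the problem in the output box instead of failing silently.
            return f"Could not answer the question: {exc}"

    return chatbot
```

In this notebook the wrapped function could be passed to Gradio as `gr.Interface(fn=make_chatbot(qa.run), inputs="text", outputs="text")`.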


In [None]:
%pip install gradio



In [None]:
# Write your Gradio implementation here
import gradio as gr

# Define the chatbot function. Gradio passes the text box contents as `query`.
def chatbot(query):
    return qa.run(query)

# Create a Gradio interface with one text input and one text output
iface = gr.Interface(
    fn=chatbot,
    inputs="text",
    outputs="text"
)

iface.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://6644508c48daeb79a5.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




Result:

![result](https://github.com/rayhanozzy/Mastering-AI/blob/main/Chatbot%20using%20Langchain/submission.jpg?raw=true)