<a href="https://colab.research.google.com/github/IndML101/gqa-bot/blob/main/gqa-bot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# ! pip install llama_index langchain faiss_gpu transformers gradio

In [7]:
# ! pip install PyPDF2

In [10]:
# ! pip install sentence_transformers

In [3]:
import os
from llama_index import SimpleDirectoryReader
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.chains import LLMChain
from langchain.llms.base import LLM
from langchain.prompts import PromptTemplate
from transformers import pipeline
import torch
import gradio as gr


In [11]:

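class CreateIndex:
    """Build a FAISS index over the documents in `doc_path` and save it to disk."""

    def __init__(self, doc_path='docs', model_name="google/flan-t5-large") -> None:
        self._loader = SimpleDirectoryReader(doc_path)
        # Convert llama_index documents to LangChain Document objects.
        self.documents = [doc.to_langchain_format() for doc in self._loader.load_data()]
        self.embeddings = HuggingFaceEmbeddings(model_name=model_name)

    def split_text(self):
        # Split into chunks of at most ~1000 characters; neighbouring chunks overlap by up to 100.
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
        return text_splitter.split_documents(self.documents)

    def create_vector_db(self):
        # Embed every chunk and persist the index; QnABot loads it from the same path.
        docs = self.split_text()
        db = FAISS.from_documents(docs, self.embeddings)
        db.save_local("metamorphosis_index")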

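class customLLM(LLM):
    # model_name = "microsoft/DialoGPT-medium"
    model_name = "google/flan-t5-large"
    # Loaded once, when the class is defined; device=0 places the model on the first GPU.
    pipeline = pipeline("text2text-generation", model=model_name, device=0, model_kwargs={"torch_dtype":torch.bfloat16})

    def _call(self, prompt, stop=None):
        # `stop` is accepted for interface compatibility but is not applied.
        return self.pipeline(prompt, max_length=9999)[0]["generated_text"]

    # The LLM base class declares these two as properties, not plain methods.
    @property
    def _identifying_params(self):
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self):
        return "custom"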


class QnABot:
    def __init__(self, dbpath='metamorphosis_index', model_name="google/flan-t5-large") -> None:
        self.embeddings = HuggingFaceEmbeddings(model_name=model_name)
        self.db = FAISS.load_local(dbpath, self.embeddings)
        self.template = """
        Answer the question using only the context below. If the answer is not in the context, just say "I don't know the answer".
        CONTEXT: {context}
        QUESTION: {question}
        """

    def qna_bot(self, input_text):
        # Retrieve only the two most similar chunks and join them into one context string.
        context = ' '.join(doc.page_content for doc in self.db.similarity_search(input_text, k=2))
        prompt = PromptTemplate(input_variables=['context', 'question'], template=self.template)
        qna_chain = LLMChain(llm=customLLM(), prompt=prompt)
        response = qna_chain.run({'context': context, 'question': input_text})
        return response
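
The retrieve-then-prompt flow in `QnABot.qna_bot` can be sketched without FAISS or a model. This is a toy illustration, not the notebook's implementation: a word-overlap score stands in for embedding similarity, and the names `overlap_score`, `top_k_chunks`, `build_prompt`, and the sample chunks are all made up for the example.

```python
# Toy stand-in for the retrieval step: no FAISS, no embeddings, no LLM.
TEMPLATE = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say \"I don't know the answer\".\n"
    "CONTEXT: {context}\n"
    "QUESTION: {question}\n"
)


def overlap_score(query, chunk):
    """Count distinct lowercase words shared by query and chunk (hypothetical scorer)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def top_k_chunks(query, chunks, k=2):
    """Return the k chunks with the highest overlap score, best first."""
    return sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Join the top-k chunks into one context string and fill the template."""
    context = " ".join(top_k_chunks(query, chunks, k))
    return TEMPLATE.format(context=context, question=query)


# Made-up sample chunks, only to exercise the functions above.
chunks = [
    "the sky is blue on a clear day",
    "cats sleep most of the afternoon",
    "the sea is green near the shore",
]
query = "what color is the sky"
prompt = build_prompt(query, chunks)
print(prompt)
```

Only the two best-matching chunks reach the prompt, which mirrors the `[:2]` slice on the similarity search results: the model never sees the rest of the index.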

In [12]:
CreateIndex().create_vector_db()

bot = QnABot()

iface = gr.Interface(fn=bot.qna_bot,
                  inputs=gr.Textbox(lines=4, label="Enter your text"),
                  outputs="text",
                  title="Question Answering Bot")

iface.launch(share=True)

Some weights of the model checkpoint at /root/.cache/torch/sentence_transformers/google_flan-t5-large were not used when initializing T5EncoderModel: ['lm_head.weight']
- This IS expected if you are initializing T5EncoderModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing T5EncoderModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://ed474cac1c560cec89.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces


