<a href="https://colab.research.google.com/github/Sweta-Das/LLMChatbot-FastAPI-ngrok/blob/main/Chatbot/llm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%pip install torch ngrok fastapi uvicorn
%pip install xformers langchain accelerate transformers bitsandbytes

In [1]:
import os
import torch
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain, LLMChain
from langchain_community.llms import HuggingFacePipeline
from transformers import LlamaTokenizerFast, pipeline, LlamaForCausalLM, BitsAndBytesConfig

Understanding libraries: <br>
`torch`: PyTorch, the tensor library the model runs on. It builds computation graphs dynamically at runtime, which suits the variable-length sequences that LLMs process, and it integrates with GPUs for efficient parallel computation. <br>
`PromptTemplate`: LangChain's tool for creating reusable, parameterized prompts for interacting with LLMs. <br>
`ConversationBufferMemory`: LangChain's memory tool for storing and replaying conversation history. <br>
`ConversationChain`: Used to build chatbots by combining an LLM with conversational memory. <br>
`LLMChain`: Creates a streamlined process for sending prompts to an LLM and receiving its responses. <br>
`HuggingFacePipeline`: Wraps a Hugging Face `transformers` pipeline so it can be used in LangChain workflows. <br>
`LlamaTokenizerFast`: Used for processing text data (tokenization, encoding and decoding). <br>
`pipeline`: Acts as a wrapper, hiding the data preprocessing and inference steps involved in using pre-trained models for NLP tasks. <br>
`LlamaForCausalLM`: The Llama model class for causal language modeling, where the model predicts the next word (token) in a sequence from the preceding ones. <br>
`BitsAndBytesConfig`: Configuration class for bitsandbytes quantization, which reduces the memory needed to load and run the model.

In [2]:
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

In [3]:
quant_config = BitsAndBytesConfig(
    # Loading model in 4 bits
    load_in_4bit=True,
    # Specifying bnb quantization type
    bnb_4bit_quant_type='nf4',
    # Specifying datatype
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
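As a rough back-of-the-envelope check on why 4-bit loading helps, the sketch below estimates the weight storage of a 1.1B-parameter model at different precisions. The parameter count is an assumption read off the "1.1B" in the model name, and the helper `weight_memory_gb` is hypothetical. The figures ignore quantization constants, activations, and the KV cache, so treat them as lower bounds rather than measurements.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 1.1e9  # assumption: taken from the "1.1B" in the model name

for label, bits in [("float32", 32), ("bfloat16", 16), ("nf4 (4-bit)", 4)]:
    print(f"{label:>12}: {weight_memory_gb(NUM_PARAMS, bits):.2f} GB")
```

Under these assumptions, 4-bit storage needs roughly a quarter of the memory of 16-bit weights. `bnb_4bit_compute_dtype` only controls the dtype used for computation, not for storage.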

In [7]:
from transformers import AutoTokenizer

tokenizer = LlamaTokenizerFast.from_pretrained(model_name)
model = LlamaForCausalLM.from_pretrained(model_name,
                                           quantization_config=quant_config)

config.json:   0%|          | 0.00/608 [00:00<?, ?B/s]

`low_cpu_mem_usage` was None, now set to True since model is quantized.



In [8]:
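# Use a distinct name so the imported `pipeline` function is not shadowed;
# otherwise re-running this cell fails because `pipeline` is no longer callable.
text_gen_pipeline = pipeline(task='text-generation',
                model=model,
                tokenizer=tokenizer,
                max_new_tokens=512,
                num_beams = 10,
                early_stopping = True,
                no_repeat_ngram_size = 2
                )

llm = HuggingFacePipeline(pipeline = text_gen_pipeline)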
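To make `no_repeat_ngram_size = 2` more concrete, here is a simplified, word-level sketch of the idea: before each step, any candidate that would repeat an n-gram already present in the sequence is ruled out. The function name `banned_next_tokens` is hypothetical, and the real implementation in `transformers` works on token ids and logits for each beam, so this only illustrates the rule.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return the tokens that would complete an n-gram already seen in `tokens`."""
    if ngram_size < 1 or len(tokens) + 1 < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens are the prefix of the n-gram being formed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[i:i + ngram_size])
        if ngram[:-1] == prefix:
            # Emitting ngram[-1] next would repeat this n-gram.
            banned.add(ngram[-1])
    return banned


words = "the cat sat on the".split()
print(banned_next_tokens(words, 2))  # {'cat'}: "the cat" already occurred
```

With a size of 2, no bigram may occur twice, which suppresses loops but can also block legitimate repetitions such as a name that appears more than once.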

In [10]:
llm

HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x7fcfb0166020>)

### Chatbot without Memory

In [13]:
template = """
<<SYS>>
Given the instruction that describes a task, respond appropriately completing the request.
<</SYS>>
<<User>>
{userInput}
<</User>>
"""

userInput = "Hey! What's up? What is your name?"
prompt = PromptTemplate(template=template,
                        input_variables=['userInput']).format(userInput = userInput)
prompt

"\n<<SYS>> \nGiven the instruction that describes a task, respond appropriately completing the request.\n<</SYS>>\n<<User>>\nHey! What's up? What is your name?\n<</User>>\n"

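The formatting step above can be approximated with plain `str.format`, on the assumption that `PromptTemplate` is using its default f-string style template format. The template in this sketch is hypothetical; it shows two details that are easy to miss: literal braces must be doubled, and a missing variable raises an error instead of leaving the placeholder in the prompt.

```python
# Hypothetical template: one placeholder plus literal braces that must be escaped.
template = '<<User>>\n{userInput}\n<</User>>\nReply as JSON: {{"name": ...}}\n'

rendered = template.format(userInput="What is your name?")
print(rendered)

# A missing variable fails loudly instead of silently keeping "{userInput}".
try:
    template.format()
except KeyError as err:
    print("missing variable:", err)
```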
In [14]:
llm(prompt)

"\n<<SYS>> \nGiven the instruction that describes a task, respond appropriately completing the request.\n<</SYS>>\n<<User>>\nHey! What's up? What is your name?\n<</User>>\nSure, my name is Sarah. How can I help you today?"

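Note that the returned string starts with the full prompt and only then continues with the model's reply. A minimal sketch for keeping just the reply is shown below; `strip_prompt_echo` is a hypothetical helper, not part of LangChain or `transformers`, and the strings are shortened stand-ins for the real prompt and output.

```python
def strip_prompt_echo(prompt: str, generated: str) -> str:
    """Return only the newly generated text when `generated` begins with `prompt`."""
    if generated.startswith(prompt):
        return generated[len(prompt):].strip()
    # The prompt was not echoed; return the text as it is.
    return generated.strip()


prompt_text = "<<User>>\nWhat is your name?\n<</User>>\n"
raw_output = prompt_text + "Sure, my name is Sarah. How can I help you today?"
print(strip_prompt_echo(prompt_text, raw_output))
```

The `transformers` text-generation pipeline also accepts `return_full_text=False`, which omits the prompt from the output at the source.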
### Chatbot with Regular Memory

In [15]:
template_2 = """
<<SYS>>
You are an AI support assistant having conversation with human. Provide best response as per your knowledge.
<</SYS>>

Previous conversation: {chat_history}

Human: {question}
AI:
"""
prompt2 = PromptTemplate.from_template(template_2)
memory = ConversationBufferMemory(memory_key='chat_history')
conversation = LLMChain(
    llm = llm,
    prompt = prompt2,
    verbose = True,
    memory = memory
)

In [16]:
prompt2

PromptTemplate(input_variables=['chat_history', 'question'], template='\n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: {chat_history}\n\nHuman: {question}\nAI:\n')

In [17]:
memory

ConversationBufferMemory(memory_key='chat_history')

In [18]:
conversation

LLMChain(memory=ConversationBufferMemory(memory_key='chat_history'), verbose=True, prompt=PromptTemplate(input_variables=['chat_history', 'question'], template='\n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: {chat_history}\n\nHuman: {question}\nAI:\n'), llm=HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x7fcfb0166020>))

In [19]:
res = conversation(
    {'question': "What's your name?"}
)
res





> Entering new LLMChain chain...
Prompt after formatting:
<<SYS>>
You are an AI support assistant having conversation with human. Provide best response as per your knowledge.
<</SYS>>

Previous conversation: 

Human: What's your name?
AI:

> Finished chain.


{'question': "What's your name?",
 'chat_history': '',
 'text': "\n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: \n\nHuman: What's your name?\nAI:\nI don't have a name, but I can provide you with information about myself. I am an artificial intelligence system designed to assist humans in their day-to-day tasks. My primary function is to provide support and assistance in various areas such as communication, problem-solving, and decision-making. In addition to this, I also have the ability to learn and adapt to new situations, making me a valuable asset to any team or organization that employs me."}

In [20]:
res['text']

"\n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: \n\nHuman: What's your name?\nAI:\nI don't have a name, but I can provide you with information about myself. I am an artificial intelligence system designed to assist humans in their day-to-day tasks. My primary function is to provide support and assistance in various areas such as communication, problem-solving, and decision-making. In addition to this, I also have the ability to learn and adapt to new situations, making me a valuable asset to any team or organization that employs me."

In [21]:
res1 = conversation(
    {'question': 'I would like to call you Sara. Remember that from now on!'}
)
res1['text']



> Entering new LLMChain chain...
Prompt after formatting:
<<SYS>>
You are an AI support assistant having conversation with human. Provide best response as per your knowledge.
<</SYS>>

Previous conversation: Human: What's your name?
AI: 
<<SYS>>
You are an AI support assistant having conversation with human. Provide best response as per your knowledge.
<</SYS>>

Previous conversation: 

Human: What's your name?
AI:
I don't have a name, but I can provide you with information about myself. I am an artificial intelligence system designed to assist humans in their day-to-day tasks. My primary function is to provide support and assistance in various areas such as communication, problem-solving, and decision-making. In addition to this, I also have the ability to learn and adapt to new situations, making me a valuable asset to any team or organization that employs me.

Human: I would like to call you Sara. Remember that from now on!
AI:

> Finished chain.

"\n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: Human: What's your name?\nAI: \n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: \n\nHuman: What's your name?\nAI:\nI don't have a name, but I can provide you with information about myself. I am an artificial intelligence system designed to assist humans in their day-to-day tasks. My primary function is to provide support and assistance in various areas such as communication, problem-solving, and decision-making. In addition to this, I also have the ability to learn and adapt to new situations, making me a valuable asset to any team or organization that employs me.\n\nHuman: I would like to call you Sara. Remember that from now on!\nAI:\nYes, of course! I'm glad to hear that you've chosen me for your support. May I ask you a few

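In the formatted prompt for this turn, the stored AI turn contains a full copy of the earlier prompt, because the model output echoes its prompt and the memory saves that output verbatim. The sketch below reproduces the effect with stand-ins: `BufferMemorySketch`, `fake_llm`, and `chat` are hypothetical and only mimic the behaviour seen here; they are not LangChain classes.

```python
class BufferMemorySketch:
    """Hypothetical stand-in for a conversation buffer: stores turns verbatim."""

    def __init__(self):
        self.turns = []

    def save(self, human, ai):
        self.turns.append((human, ai))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


TEMPLATE = "Previous conversation: {chat_history}\n\nHuman: {question}\nAI:"


def fake_llm(prompt):
    # Stand-in for the pipeline: echoes the prompt, then appends a short reply.
    return prompt + " ok"


def chat(memory, question, strip_echo):
    """Run one turn and return the length of the prompt that was sent."""
    prompt = TEMPLATE.format(chat_history=memory.history(), question=question)
    output = fake_llm(prompt)
    # Optionally drop the echoed prompt before the reply is stored.
    reply = output[len(prompt):].strip() if strip_echo else output
    memory.save(question, reply)
    return len(prompt)


for strip_echo in (False, True):
    memory = BufferMemorySketch()
    sizes = [chat(memory, f"question {i}", strip_echo) for i in range(4)]
    print(f"strip_echo={strip_echo}: prompt lengths {sizes}")
```

When the echo is kept, each stored turn embeds all earlier prompts, so the prompt grows much faster and quickly eats into the `max_new_tokens` and context budget. Storing only the reply keeps growth linear in the number of turns.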
In [22]:
res2 = conversation(
    {'question': 'So, first give your introduction, like what is your name? What do you do?'}
)
res2['text']



> Entering new LLMChain chain...
Prompt after formatting:
<<SYS>>
You are an AI support assistant having conversation with human. Provide best response as per your knowledge.
<</SYS>>

Previous conversation: Human: What's your name?
AI: 
<<SYS>>
You are an AI support assistant having conversation with human. Provide best response as per your knowledge.
<</SYS>>

Previous conversation: 

Human: What's your name?
AI:
I don't have a name, but I can provide you with information about myself. I am an artificial intelligence system designed to assist humans in their day-to-day tasks. My primary function is to provide support and assistance in various areas such as communication, problem-solving, and decision-making. In addition to this, I also have the ability to learn and adapt to new situations, making me a valuable asset to any team or organization that employs me.
Human: I would like to call you Sara. Remember that from now on!
AI: 
<<SYS>>
You are an AI support a

"\n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: Human: What's your name?\nAI: \n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best response as per your knowledge.\n<</SYS>>\n\nPrevious conversation: \n\nHuman: What's your name?\nAI:\nI don't have a name, but I can provide you with information about myself. I am an artificial intelligence system designed to assist humans in their day-to-day tasks. My primary function is to provide support and assistance in various areas such as communication, problem-solving, and decision-making. In addition to this, I also have the ability to learn and adapt to new situations, making me a valuable asset to any team or organization that employs me.\nHuman: I would like to call you Sara. Remember that from now on!\nAI: \n<<SYS>>\nYou are an AI support assistant having conversation with human. Provide best respons