<a href="https://colab.research.google.com/github/SuccessSoham/Gen-AI/blob/main/chatbot_using_langchain_with_memory.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **📘 Step 1: Install Required Libraries**

In [None]:
!pip install langchain llama-cpp-python gradio huggingface_hub



In [None]:
!pip install langchain_community



# **📘 Step 2: Download the GGUF Model from Hugging Face**

In [None]:
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf"
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


# **🧠 Step 3: Load the model with LlamaCpp**

In [None]:
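from langchain_community.llms import LlamaCpp  # langchain.llms is the deprecated import path

llm = LlamaCpp(
    model_path=model_path,  # local path returned by hf_hub_download
    temperature=0.7,        # sampling temperature
    max_tokens=1024,        # cap on tokens generated per reply
    top_p=0.95,             # nucleus sampling threshold
    n_ctx=16384,            # context window in tokens (the model was trained with 32768)
    n_threads=16,           # CPU threads; match this to the cores actually available
    n_batch=8,              # prompt batch size; llama.cpp raises it to 64 (see the log below)
    verbose=False
)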

llama_context: n_batch is less than GGML_KQ_MASK_PAD - increasing to 64
llama_context: n_ctx_per_seq (16384) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
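
The two log lines above reflect settings from the cell: llama.cpp raised `n_batch` from 8 to 64, and `n_ctx=16384` is half of the 32768-token window the model was trained with. As a rough sketch of what this means for a chatbot with memory, the prompt (instructions, history, and the new message) has to fit in whatever the context window leaves after reserving `max_tokens` for the reply. The helper names below are hypothetical, and the four-characters-per-token ratio is only a coarse rule of thumb, not the model's tokenizer.

```python
N_CTX = 16384       # n_ctx passed to LlamaCpp above
MAX_TOKENS = 1024   # max_tokens passed to LlamaCpp above

def prompt_budget(n_ctx: int = N_CTX, max_tokens: int = MAX_TOKENS) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    return n_ctx - max_tokens

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return int(len(text) / chars_per_token) + 1

def fits_in_context(prompt: str) -> bool:
    """True if the estimated prompt size stays within the prompt budget."""
    return estimate_tokens(prompt) <= prompt_budget()

print(prompt_budget())  # 15360
```

Because the conversation buffer grows with every turn, a long session will eventually exceed this budget unless older turns are dropped or summarized.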


# **📘 Step 4: Define Custom Prompt and Memory**

In [None]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory

prompt_template = PromptTemplate(
    input_variables=["history", "input"],
    template="""
You are a helpful and concise AI assistant. Continue the conversation based on the history.

{history}
Human: {input}
AI:"""
)

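# Return the history as a plain "Human: ... / AI: ..." transcript so it drops cleanly
# into the string template; return_messages=True would insert a list of message objects.
memory = ConversationBufferMemory(memory_key="history")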

conversation = LLMChain(
    llm=llm,
    prompt=prompt_template,
    memory=memory,
    verbose=False
)
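
To make the role of memory concrete, here is a minimal pure-Python sketch of what a conversation buffer does conceptually: it stores each turn and renders the transcript into the `{history}` slot of the template before the next call. This is an illustration only, not LangChain's implementation; `SimpleBufferMemory`, `build_prompt`, and `memory_sketch` are hypothetical names.

```python
TEMPLATE = """You are a helpful and concise AI assistant. Continue the conversation based on the history.

{history}
Human: {input}
AI:"""

class SimpleBufferMemory:
    """Hypothetical stand-in for a conversation buffer: keeps every turn."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save(self, human, ai):
        self.turns.append((human, ai))

    def render(self):
        # Flatten the stored turns into the transcript the prompt expects
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

def build_prompt(memory, user_input):
    """Fill the template with the rendered history and the new message."""
    return TEMPLATE.format(history=memory.render(), input=user_input)

memory_sketch = SimpleBufferMemory()
memory_sketch.save("My name is Ada.", "Nice to meet you, Ada.")
print(build_prompt(memory_sketch, "What is my name?"))
```

The printed prompt contains the earlier exchange, which is how the model can answer questions that depend on previous turns.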




# **🗣️ Step 5: Build the Gradio Chat Interface**

In [None]:
import gradio as gr

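def chat_with_bot(user_input, history=None):
    # Avoid a mutable default argument; copy so the incoming state is not mutated in place
    history = list(history or [])
    response = conversation.predict(input=user_input)
    history.append((user_input, response))
    return history, history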

with gr.Blocks() as demo:
    gr.Markdown("## 🧠 Mistral Chatbot with Memory")
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Your message")
    clear = gr.Button("Clear")

    state = gr.State([])

    def respond(message, chat_history):
        return chat_with_bot(message, chat_history)

    def reset():
        memory.clear()  # also forget the conversation held in LangChain memory
        return [], []

    msg.submit(respond, [msg, state], [chatbot, state])
    clear.click(reset, None, [chatbot, state])

demo.launch()



It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://7d1bb950bc1f4f302a.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
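
The handler logic can be checked without loading the 7B model or starting the Gradio server. The sketch below swaps in a hypothetical `EchoConversation` stub that exposes the same `predict(input=...)` call used above; `chat_with_stub` is likewise a hypothetical name. It only illustrates how the `(user, bot)` pairs accumulate in the chat state.

```python
class EchoConversation:
    """Hypothetical stub: mimics conversation.predict without a model."""

    def predict(self, input):
        return f"You said: {input}"

def chat_with_stub(conversation, user_input, history=None):
    # Copy instead of mutating so each call starts from the state it was given
    history = list(history or [])
    history.append((user_input, conversation.predict(input=user_input)))
    return history, history

stub = EchoConversation()
chat, state = chat_with_stub(stub, "Hello")
chat, state = chat_with_stub(stub, "How are you?", state)
print(chat)
```

Each call returns the list twice because the Gradio handler feeds one copy to the `Chatbot` component and the other to `State`.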


