<b>This Jupyter Notebook is run in a local environment</b>

In [None]:
from huggingface_hub import notebook_login
notebook_login()

In [1]:
from langchain.llms import HuggingFacePipeline
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, AutoModelForSeq2SeqLM, BitsAndBytesConfig
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

model_id = 'ccfai/squirmy-the-chatbot-v4'
tokenizer = AutoTokenizer.from_pretrained(model_id)
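# Load the model with 8-bit quantization to reduce memory use.
# Note: `load_in_8bit=True` is deprecated (see the warning in the output below);
# newer transformers releases expect quantization_config=BitsAndBytesConfig(load_in_8bit=True).
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)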

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=1024,
    torch_dtype=torch.bfloat16
)

local_llm = HuggingFacePipeline(pipeline=pipe)

The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
`low_cpu_mem_usage` was None, now set to True since model is quantized.


Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

<b>Running chatbot without Gradio Interface</b>

A different prompting approach is used here: the Llama-style system prompt is embedded directly in the LangChain template.

In [None]:
template = """<s>[INST] <<SYS>>
You are a friendly expert in the field of vermicomposting. Please ensure that your responses are clear and concise.
Your task is to assist vermicompost enthusiasts by answering their questions regarding vermicompost.
Note that your target audience is based in Singapore, so do consider the geographical factors of the location when answering.
If possible, make sure your answers are summarized, preferably in bullet points.
<</SYS>>
{question} [/INST]
Bot:
"""


In [None]:
prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(prompt=prompt, llm=local_llm)
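
To see what the model actually receives, the template can be filled by hand. The sketch below uses plain `str.format`, which should behave much like `PromptTemplate` with its default formatting for a single `{question}` placeholder. The shortened system prompt and the `fill_template` helper are assumptions made for illustration, not part of the notebook.

```python
# Shortened stand-in for the notebook's template (assumption: the real system
# prompt is longer, but the placeholder and the markers are the same).
template = """<s>[INST] <<SYS>>
You are a friendly expert in the field of vermicomposting.
<</SYS>>
{question} [/INST]
Bot:
"""


def fill_template(question: str) -> str:
    # Hypothetical helper: substitute the single {question} placeholder.
    return template.format(question=question)


example_prompt = fill_template("Why does my worm bin smell?")
print(example_prompt)
```

The filled prompt ends with the `Bot:` marker, which is what the chat loop later splits on to separate the answer from the prompt.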

In [None]:
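def run_chatbot():
    # Simple console loop: read a question, query the chain, print the answer.
    while True:
        u_input = input("You: ").strip()

        if u_input.lower() == "bye":
            print("Squirmy: Happy composting! Goodbye!")
            break
        response = llm_chain.run(u_input)
        # Keep only the text after the "Bot:" marker; if the marker is missing
        # (the chain returned just the completion), use the whole response.
        parts = response.split("Bot:")
        model_output = (parts[1] if len(parts) > 1 else parts[0]).strip()
        print("Squirmy:" + "\n" + model_output)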

print("Squirmy: Hello! My name is Squirmy! Ask me anything about composting, or type 'bye' to end the chat.")
run_chatbot()

<b>With Gradio Interface</b>

In [2]:
template = """
{question}
Bot:
"""

In [3]:
prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(prompt=prompt, llm=local_llm)



A different approach to system prompting that also makes use of the chat history

In [4]:
SYSTEM_PROMPT = """<s>[INST] <<SYS>>
You are a friendly chatbot here to help with all things vermicomposting for your audience in Singapore.
Keep your answers clear and concise, and make sure to consider factors relevant to Singapore's conditions in your responses.
Always consider how to keep the worms in a healthy state.
<</SYS>>

"""

# Formatting function for message and history
def format_message(message: str, history: list, memory_limit: int = 3) -> str:
    """
    Formats the message and history for the Llama model.

    Parameters:
        message (str): Current message to send.
        history (list): Past conversation history.
        memory_limit (int): Limit on how many past interactions to consider.

    Returns:
        str: Formatted message string
    """
    # always keep len(history) <= memory_limit
    if len(history) > memory_limit:
        history = history[-memory_limit:]

    if len(history) == 0:
        return SYSTEM_PROMPT + f"{message} [/INST]"

    formatted_message = SYSTEM_PROMPT + f"{history[0][0]} [/INST] {history[0][1]} </s>"

    # Handle conversation history
    for user_msg, model_answer in history[1:]:
        formatted_message += f"<s>[INST] {user_msg} [/INST] {model_answer} </s>"

    # Handle the current message
    formatted_message += f"<s>[INST] {message} [/INST]"

    return formatted_message
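
As a worked example of the formatting above, the sketch below runs the same logic on a made-up four-turn history. With `memory_limit=3`, the oldest turn is dropped and the remaining turns are wrapped in `[INST] ... [/INST]` pairs. The shortened `SYSTEM_PROMPT` and the placeholder questions and answers are assumptions for illustration only.

```python
# Shortened stand-in for the notebook's SYSTEM_PROMPT (assumption, for readability).
SYSTEM_PROMPT = "<s>[INST] <<SYS>>\nYou are a friendly vermicomposting chatbot.\n<</SYS>>\n\n"


def format_message(message: str, history: list, memory_limit: int = 3) -> str:
    # Same logic as the cell above, repeated so this sketch runs on its own.
    if len(history) > memory_limit:
        history = history[-memory_limit:]
    if len(history) == 0:
        return SYSTEM_PROMPT + f"{message} [/INST]"
    formatted = SYSTEM_PROMPT + f"{history[0][0]} [/INST] {history[0][1]} </s>"
    for user_msg, model_answer in history[1:]:
        formatted += f"<s>[INST] {user_msg} [/INST] {model_answer} </s>"
    return formatted + f"<s>[INST] {message} [/INST]"


# Placeholder conversation: four past turns, so the oldest one is dropped.
history = [
    ["first question", "first answer"],
    ["second question", "second answer"],
    ["third question", "third answer"],
    ["fourth question", "fourth answer"],
]
prompt = format_message("current question", history)
print(prompt)
```

The printed prompt starts with the system block, continues with the second to fourth turns, and ends with the current question left open for the model to answer.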

In [5]:
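# Generate a response from the Llama model
def get_llama_response(message: str, history: list) -> str:
    # Build the Llama-style prompt from the current message and recent history.
    query = format_message(message, history)

    response = llm_chain.run(query)

    # Keep only the text after the "Bot:" marker; fall back to the whole
    # response if the marker is missing.
    parts = response.split("Bot:")
    response = (parts[1] if len(parts) > 1 else parts[0]).strip()

    print("Squirmy:", response)
    return response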
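
The handler can be exercised without loading the model by swapping in a stub chain. The sketch below is a minimal illustration under stated assumptions: `EchoChain` and `respond` are hypothetical names, the single-turn prompt stands in for `format_message`, and the stub assumes the chain returns the filled prompt followed by the completion, which may differ between LangChain versions.

```python
# Same shape as the Gradio-section template in this notebook.
template = "\n{question}\nBot:\n"


class EchoChain:
    """Hypothetical stand-in for llm_chain: echoes the filled prompt plus a canned answer."""

    def run(self, query: str) -> str:
        return template.format(question=query) + "This is a canned answer."


def respond(message: str, history: list, chain) -> str:
    # Simplified single-turn prompt; the notebook's format_message would go here.
    query = f"<s>[INST] {message} [/INST]"
    response = chain.run(query)
    # Keep the text after the "Bot:" marker, or everything if it is missing.
    parts = response.split("Bot:")
    return (parts[1] if len(parts) > 1 else parts[0]).strip()


answer = respond("How wet should the bedding be?", [], EchoChain())
print(answer)
```

Passing the chain in as an argument is a design choice made here so the stub can replace the real `llm_chain`; the notebook's handler uses the global chain instead.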


In [6]:
import gradio as gr

interface = gr.ChatInterface(fn=get_llama_response, title="Squirmy the Composting Chatbot🐛", description="Ask Squirmy anything about composting!")
interface.launch()

Running on local URL:  http://127.0.0.1:7860
IMPORTANT: You are using gradio version 4.28.3, however version 4.29.0 is available, please upgrade.
--------

To create a public link, set `share=True` in `launch()`.






Squirmy: Hey there, no worries! 😊 A bad smell in your worm bin is quite common, especially in the early stages of vermicomposting. There are a few reasons why your tank might be smelling bad:

1. Over-moisture: If your bin is too wet, anaerobic bacteria can thrive, causing a strong, unpleasant odor. Check your bin's moisture level by feeling the top inch of soil. It should be damp but not waterlogged. Adjust your watering schedule if necessary.
2. Food quality: Worms can't digest certain foods, like onions, garlic, and meat, which can cause a strong smell. Make sure to only add foods that are safe for worms, like vegetable scraps, fruit peels, and tea bags.
3. Lack of aeration: Worms need oxygen to survive, so if your bin doesn't have enough airflow, anaerobic bacteria can thrive, leading to bad odors. Make sure your bin has adequate airflow by adding some holes or drilling some air vents.
4. Overcrowding: If your bin is too full, the worms might not be able to move around and digest f