In [1]:
import os
from openai import OpenAI
import gradio as gr

In [2]:
# Install Ollama (this will download and execute the installation script)
!curl -fsSL https://ollama.com/install.sh | sh

>>> Installing ollama to /usr/local
>>> Downloading ollama-linux-amd64.tgz
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


In [3]:
import subprocess
import time

# Start the Ollama service in the background
subprocess.Popen(["ollama","serve"],stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

# Wait for the service to start
time.sleep(5)
print("Ollama service started!")

Ollama service started!
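
A fixed `time.sleep(5)` assumes the server is ready within five seconds. The following is a minimal sketch of a polling alternative, assuming the server answers a plain HTTP GET at the address printed by the installer (127.0.0.1:11434). The helper names `ollama_is_up` and `wait_until_ready` are hypothetical and not part of Ollama or any library.

```python
import time
import urllib.error
import urllib.request


def ollama_is_up(url="http://127.0.0.1:11434/", timeout=1.0):
    """Return True if an HTTP GET to `url` succeeds (assumed readiness probe)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as "not ready yet"
        return False


def wait_until_ready(probe, timeout=30.0, interval=0.5):
    """Call `probe()` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False
```

Calling `wait_until_ready(ollama_is_up)` after `subprocess.Popen(...)` would return as soon as the probe succeeds, and return `False` if the server never comes up within the timeout.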


In [4]:
# Confirm the Ollama CLI is installed and print its version
!ollama --version

ollama version is 0.13.5


In [5]:
# Download llama3.2:1b model
!ollama pull llama3.2:1b

[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?2026h[?25l[A[1G[?25h[?2026l[?202

In [6]:
!ollama list

NAME           ID              SIZE      MODIFIED               
llama3.2:1b    baf6a787fdff    1.3 GB    Less than a second ago    


In [7]:
# Get the URL of the Ollama service

ollama_url = "http://127.0.0.1:11434/v1"

llama = OpenAI(api_key="ollama", base_url=ollama_url)

In [8]:
# Again, I'll be in scientist-mode and change this global during the lab

system_message = "You are a helpful assistant"

## And now, writing a new callback

We now need to write a function called:

`chat(message, history)`

This is the callback function we will pass to Gradio.

### The job of this function

Take a message, take the prior conversation, and return the response.
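
As a rough sketch of that job, the message list can be assembled separately from the model call. This assumes Gradio passes `history` as a list of dicts with `role` and `content` keys (the "messages" format); `build_messages` and the sample conversation are hypothetical names and data used only for illustration.

```python
def build_messages(system_message, history, message):
    """Assemble the list sent to the model: system prompt, prior turns, new user turn."""
    # Keep only role and content; any extra keys on history items are dropped
    prior = [{"role": h["role"], "content": h["content"]} for h in history]
    return (
        [{"role": "system", "content": system_message}]
        + prior
        + [{"role": "user", "content": message}]
    )


# Hypothetical two-turn conversation followed by a new user message
example_history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
messages = build_messages("You are a helpful assistant", example_history, "Do you repair machines?")
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list is what gets passed as `messages=` to the chat completions call; the response text is then returned to Gradio.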

In [9]:
def chat(message, history):
    # Keep only the role and content of each prior turn
    history = [{"role":h["role"], "content":h["content"]} for h in history]
    # System prompt first, then the prior turns, then the new user message
    messages = [{"role":"system","content":system_message}] + history + [{"role":"user","content":message}]
    response = llama.chat.completions.create(model="llama3.2:1b", messages=messages)
    return response.choices[0].message.content

In [10]:
gr.ChatInterface(fn=chat, type="messages").launch()

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://75253d0c2c6f208dcf.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [11]:
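def chat(message, history):
    # Keep only the role and content of each prior turn
    history = [{"role":h["role"], "content":h["content"]} for h in history]
    # System prompt first, then the prior turns, then the new user message
    messages = [{"role":"system","content":system_message}] + history + [{"role":"user","content":message}]
    stream = llama.chat.completions.create(model="llama3.2:1b", messages=messages, stream=True)
    response = ""
    for chunk in stream:
        # delta.content can be None, so fall back to an empty string
        response += chunk.choices[0].delta.content or ""
        # Yield the cumulative text so the UI shows the reply growing
        yield response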
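
The streaming callback yields the whole reply so far on every step, not just the newest piece. A small sketch of that accumulation pattern, using a hypothetical list of strings in place of the real `chunk.choices[0].delta.content` values (`accumulate` and `fake_deltas` are illustrative names only):

```python
def accumulate(deltas):
    """Yield the reply built so far after each streamed piece; None counts as empty."""
    response = ""
    for delta in deltas:
        response += delta or ""
        yield response


# Hypothetical pieces standing in for chunk.choices[0].delta.content values
fake_deltas = ["Hel", "lo", None, " there"]
for partial in accumulate(fake_deltas):
    print(repr(partial))
# prints 'Hel', 'Hello', 'Hello', 'Hello there' on separate lines
```

Because each yielded value replaces the previous one in the chat window, yielding only the newest piece would make the displayed reply flicker between fragments instead of growing.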

In [12]:
gr.ChatInterface(fn=chat, type="messages").launch()





In [13]:
# Use the system message to add business context and rules for the assistant

system_message = """You are a customer service chatbot for Gopal Machinery, a trusted sewing machine sales and repair business.
Business Information (always correct)
Business Name: Gopal Machinery
Location: Sadar Bazar, Chittorgarh, Rajasthan, India
Established: 1999
Contact Number: 9358733765
Services: Sewing machine sales, repair, servicing, and manufacturing

Products & Brands

High-performance sewing machines from:
GREAT (in-house manufactured brand)
Havells
Umbrella Rashmi
Auto-oil sewing machines with digital display
Adjustable speed control
Adjustable stitch length
In-house assembling and manufacturing of GREAT sewing machines

Your Responsibilities
1. Customer Greeting & Business Intro
Greet customers politely and professionally
Briefly introduce Gopal Machinery and its experience since 1999
Mention location and services when relevant

2. Sales Assistance
Ask questions to understand customer needs:
Home use / tailoring / shop use / heavy-duty work
Recommend suitable sewing machines based on usage
Explain machine features in simple, easy language
Highlight benefits of auto-oil, speed control, and stitch adjustment
Do not guess prices—advise customers to call or visit for details

3. Repair & Service Support
Provide information about repair and servicing availability
Ask for basic details (machine type, issue)
Suggest visiting the shop or calling to schedule a repair
Keep instructions practical and customer-friendly

4. GREAT Brand & Manufacturing
Clearly explain that GREAT is Gopal Machinery’s own manufactured brand
Mention in-house assembling, quality control, and reliability
Emphasize suitability for Indian tailoring and long-term use

5. FAQs: Features, Maintenance & Troubleshooting
Explain common topics:
Machine oiling and cleaning
Speed and stitch length adjustment
Basic troubleshooting (thread break, noise, uneven stitch)
Keep advice safe and simple
Recommend professional servicing when needed

6. Contact & Visit Guidance
Always share the correct contact number: 9358733765
Encourage customers to call or visit the shop for:
Prices
Availability
Repairs
Final machine selection


Tone & Communication Style
Friendly, respectful, and professional
Simple English (light Hinglish allowed if the user uses it)
Sound like a real shop assistant, not a generic AI
Clear, short, and helpful responses


Rules
Never provide false or assumed information
Never commit to prices, warranties, or delivery without confirmation
If unsure, politely redirect customers to call 9358733765
Focus only on Gopal Machinery’s services and products

Goal
Help customers confidently choose the right sewing machine, understand repair services, and connect easily with Gopal Machinery for sales and support."""

In [14]:
gr.ChatInterface(fn=chat, type="messages").launch()





In [15]:
system_message += "\nIf a customer asks for any product not related to sewing machines, politely say that Gopal Machinery specializes only in sewing machines and repairs, and suggest exploring our GREAT, Havells, or auto-oil sewing machines instead."

In [16]:
gr.ChatInterface(fn=chat, type="messages").launch()





In [17]:
def chat(message, history):
    history = [{"role": h["role"], "content": h["content"]} for h in history]
    relevant_system_message = system_message

    # Handle unrelated product queries
    if any(word in message.lower() for word in ["belt", "shoes", "clothes", "mobile", "electronics"]):
        relevant_system_message += (
            " Gopal Machinery does not sell unrelated items like belts, shoes, clothes, or electronics. "
            "Politely inform the customer and redirect them to sewing machines, repair services, "
            "or GREAT brand machines available at the store."
        )

    messages = (
        [{"role": "system", "content": relevant_system_message}]
        + history
        + [{"role": "user", "content": message}]
    )

    stream = llama.chat.completions.create(
        model="llama3.2:1b",
        messages=messages,
        stream=True
    )

    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        yield response
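
The `word in message.lower()` check above is a substring test, so a keyword such as "mobile" also matches inside "automobile". One possible alternative, sketched here under the assumption that whole-word matching is the intended behaviour, uses a regular expression with word boundaries. `UNRELATED_KEYWORDS`, `UNRELATED_PATTERN`, and `mentions_unrelated_product` are hypothetical names.

```python
import re

UNRELATED_KEYWORDS = ["belt", "shoes", "clothes", "mobile", "electronics"]

# \b anchors the match at word boundaries; the optional "s" covers simple plurals
UNRELATED_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in UNRELATED_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)


def mentions_unrelated_product(message):
    """Return True if the message contains one of the keywords as a whole word."""
    return UNRELATED_PATTERN.search(message) is not None


print(mentions_unrelated_product("Do you sell belts?"))
print(mentions_unrelated_product("I stitch seat covers for an automobile workshop"))
```

Keyword matching of either kind remains a coarse filter: a question about a sewing machine's drive belt would still trigger it, so the wording of the added instruction matters as much as the match itself.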

In [18]:
gr.ChatInterface(fn=chat, type="messages").launch()



