## Libraries

In [84]:
import requests
import json
from ddgs import DDGS

### A single prompt (send & receive)

In [2]:
def chat_with_llm_single_prompt(prompt, model="llama3.2"):
    # url = "http://localhost:11434/api/generate"
    url = "http://127.0.0.1:11434/api/generate"
    
    data = {
        "model": model,
        "prompt": prompt,
        "stream": True
    }
    
    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()  # fail loudly on HTTP errors (e.g. unknown model)
    
    full_response = ""
    for line in response.iter_lines():
        if line:
            json_response = json.loads(line)
            if 'response' in json_response:
                full_response += json_response['response']
    
    return full_response

In [3]:
myPrompt = "Where is Toronto?"
result = chat_with_llm_single_prompt(myPrompt)

print("Result>>\n" + result)

Result>>
Toronto is the largest city in Canada, located in the province of Ontario. It is situated on the northwestern shore of Lake Ontario, one of the Great Lakes. Specifically, Toronto is nestled at the southern end of Lake Ontario, where it meets the Niagara River and the St. Lawrence River.
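
The loop in `chat_with_llm_single_prompt` works because, with `"stream": True`, the server sends one JSON object per line, each carrying a text fragment in its `response` field. Below is a minimal sketch of that parsing step in isolation, run on hand-written sample lines rather than a live server. The helper name `join_generate_stream`, the sample payloads, and the handling of an `error` field are assumptions for illustration, not part of the notebook.

```python
import json

def join_generate_stream(lines):
    """Concatenate the 'response' fragments from an iterable of JSON lines."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip empty lines, as the original loop does
        event = json.loads(line)  # json.loads accepts bytes as well as str
        if "error" in event:
            # Assumption: a failed request reports its problem in an "error" field
            raise RuntimeError(event["error"])
        parts.append(event.get("response", ""))
        if event.get("done"):
            break  # the final object marks the end of the stream
    return "".join(parts)

# Hand-written sample lines standing in for response.iter_lines()
sample = [
    b'{"model": "llama3.2", "response": "Toronto is ", "done": false}',
    b"",
    b'{"model": "llama3.2", "response": "in Ontario.", "done": false}',
    b'{"model": "llama3.2", "response": "", "done": true}',
]
print(join_generate_stream(sample))
```

Keeping the parsing separate from the HTTP call makes it possible to test the logic without a running model.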


In [13]:
def chat_short(prompt, model="llama3.2"):
    url = "http://localhost:11434/api/chat"
    
    data = {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Be concise. Answer in 1-3 sentences."
            },
            {"role": "user", "content": prompt}
        ],
        "options": {
            "num_predict": 150,
            "temperature": 0.7
        },
        "stream": False
    }
        
    response = requests.post(url, json=data)
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below
    result = response.json()['message']['content']
    
    return result

In [14]:
myPrompt = "Where is Toronto?"
chat_short(myPrompt)

'Toronto is the largest city in Canada, located on the northwestern shore of Lake Ontario, within the province of Ontario. It is situated at the southern end of the Great Lakes region and borders the U.S. state of Michigan to the west.'

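<b>Stream</b>

Streaming prints each chunk as soon as the model produces it, so the first words appear almost immediately instead of after the whole answer has been generated.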

In [None]:
def chat_short_stream(prompt, model="llama3.2"):
    url = "http://localhost:11434/api/chat"
    
    data = {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Be concise. Answer in 1-3 sentences."
            },
            {"role": "user", "content": prompt}
        ],
        "options": {
            "num_predict": 150,
            "temperature": 0.7
        },
        "stream": True
    }
    
    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()  # fail loudly on HTTP errors (e.g. unknown model)
    
    full_response = ""
    for line in response.iter_lines():
        if line:
            json_response = json.loads(line)
            if 'message' in json_response:
                chunk = json_response['message']['content']
                full_response += chunk
                print(chunk, end="", flush=True)

    return full_response

In [25]:
chat_short_stream("what is pi?")
print("\n-")
chat_short_stream("and where is it used?")
print("\n-")

Pi (π) is a mathematical constant representing the ratio of a circle's circumference to its diameter, approximately equal to 3.14159. It is an irrational number, meaning it cannot be expressed as a simple fraction and has been calculated to over 31 trillion digits. Pi is essential in geometry, trigonometry, and many real-world applications.
-
I didn't mention what "it" refers to. Could you please provide more context or clarify what you are asking about? I'll do my best to help.
-
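
The second reply shows that each call to `chat_short_stream` is stateless: the request contains only the system prompt and the current question, so the model never sees "what is pi?" when it is asked "and where is it used?". A small sketch that rebuilds the two message lists makes this visible. The helper name `stateless_messages` is hypothetical and only mirrors how the function above assembles its `messages` field.

```python
import json

SYSTEM_PROMPT = "Be concise. Answer in 1-3 sentences."

def stateless_messages(prompt):
    """Build messages the way chat_short_stream does: system prompt + current prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]

first = stateless_messages("what is pi?")
second = stateless_messages("and where is it used?")

# The second request carries no trace of the first question or its answer.
print(json.dumps(second, indent=2))
carries_first_question = any(m["content"] == "what is pi?" for m in second)
print("first question included in second request:", carries_first_question)
```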


<b>Keep Conversation History</b>

In [100]:
conversation_history = None
# instructions = "Speak like a pirate."
instructions = ""
def chat_with_context(prompt, model="llama3.2"):
    global conversation_history, instructions
    url = "http://localhost:11434/api/chat"
    if conversation_history is None:
        conversation_history = [
            {
                "role": "system",
                "content": instructions + " Be concise. Answer in 1-3 sentences."
            }
        ]
    conversation_history.append({"role": "user", "content": prompt})
    data = {
        "model": model,
        "messages": conversation_history, 
        "options": {
            "num_predict": 150,
            "temperature": 0.7
        },
        "stream": True
    }
    
    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()  # fail loudly on HTTP errors (e.g. unknown model)
    
    full_response = ""
    for line in response.iter_lines():
        if line:
            json_response = json.loads(line)
            if 'message' in json_response:
                chunk = json_response['message']['content']
                full_response += chunk
                print(chunk, end="", flush=True)

    conversation_history.append({"role": "assistant", "content": full_response})
    print("")
    return full_response

In [101]:
# chat_with_context("what is pi?")
# chat_with_context("and where is it used?")
chat_with_context("whats the area of circle with radius of 24.677788632222 cm? ")

print("-")

To find the area, use the formula A = πr², where r is the radius.

A ≈ π × (24.677788632222)²
A ≈ 3.14159 × 616.5517
A ≈ 1938.54 square centimeters.
-
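
Small models often slip on multi-digit arithmetic, so a numeric answer like the one above is worth checking locally. Here is a quick check with Python's `math` module, using the same radius as the prompt. It gives r² of about 608.99 and an area of about 1913 cm², which differs from the figures in the model's reply.

```python
import math

radius_cm = 24.677788632222          # radius taken from the prompt above
radius_squared = radius_cm ** 2
area_cm2 = math.pi * radius_squared  # A = pi * r^2

print(f"r^2  = {radius_squared:.4f}")
print(f"area = {area_cm2:.2f} cm^2")
```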


## Online Search

In [90]:
def web_search(query, max_results=10):
    # Run a DuckDuckGo text search and flatten the hits into a bulleted string
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=max_results))
    return "\n".join(f"- {r['title']}: {r['body']}" for r in results)
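
Search snippets go straight into the prompt, so their length and shape affect how much context the model has to read. One possible variant of the formatting step is sketched below: it truncates long snippets and appends the source URL. The function name `format_results`, the `max_body_chars` parameter, and the sample dictionaries are hypothetical. It assumes each result is a dict with `title`, `body`, and `href` keys and falls back to defaults when one is missing.

```python
def format_results(results, max_body_chars=200):
    """Turn a list of search-result dicts into a compact bulleted string."""
    lines = []
    for r in results:
        title = r.get("title", "(no title)")
        body = r.get("body", "")
        if len(body) > max_body_chars:
            # Trim long snippets so the prompt stays small
            body = body[:max_body_chars].rstrip() + "..."
        line = f"- {title}: {body}"
        href = r.get("href", "")
        if href:
            line += f" ({href})"
        lines.append(line)
    return "\n".join(lines)

# Hand-written sample data standing in for real search results
sample_results = [
    {"title": "Example A", "body": "x" * 250, "href": "https://example.com/a"},
    {"title": "Example B", "body": "Short snippet."},
]
print(format_results(sample_results))
```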

In [89]:
conversation_history = None
instructions = ""
def chat_with_online(prompt, model="llama3.2"):
    global conversation_history, instructions
    url = "http://localhost:11434/api/chat"
    if conversation_history is None:
        conversation_history = [
            {
                "role": "system",
                "content": instructions + " Be concise. Answer in 1-3 sentences."
            }
        ]

    # Naive trigger: substring test, so "now" also fires on "know" or "snow"
    needs_search = any(word in prompt.lower() for word in ["weather", "news", "today", "current", "now", "latest", "price"])

    if needs_search:
        search_results = web_search(prompt)
        augmented_prompt = f"""User question: {prompt}
        
        Web search results:
        {search_results}
        Answer based on the search results above."""
        conversation_history.append({"role": "user", "content": augmented_prompt})
    else:
        conversation_history.append({"role": "user", "content": prompt})

    
    data = {
        "model": model,
        "messages": conversation_history, 
        "options": {
            "num_predict": 150,
            "temperature": 0.2
        },
        "stream": True
    }
    
    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()  # fail loudly on HTTP errors (e.g. unknown model)
    
    full_response = ""
    for line in response.iter_lines():
        if line:
            json_response = json.loads(line)
            if 'message' in json_response:
                chunk = json_response['message']['content']
                full_response += chunk
                print(chunk, end="", flush=True)

    conversation_history.append({"role": "assistant", "content": full_response})
    print("")
    return full_response
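
The keyword trigger in `chat_with_online` uses Python's `in` operator on the whole prompt, which tests for substrings rather than whole words. The sketch below compares that behaviour with a whole-word match built from a regular expression. The names `SEARCH_KEYWORDS`, `needs_search_substring`, and `needs_search_word` are hypothetical, and the regex variant is one possible alternative rather than the notebook's method.

```python
import re

SEARCH_KEYWORDS = ["weather", "news", "today", "current", "now", "latest", "price"]

# \b anchors each keyword at word boundaries; re.escape guards special characters
KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, SEARCH_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def needs_search_substring(prompt):
    """Same test as in chat_with_online: substring match."""
    return any(word in prompt.lower() for word in SEARCH_KEYWORDS)

def needs_search_word(prompt):
    """Whole-word match using the compiled pattern."""
    return KEYWORD_PATTERN.search(prompt) is not None

prompts = [
    "Do you know what pi is?",
    "What's today's date?",
    "what is the weather in Toronto now?",
]
for p in prompts:
    print(f"{p!r}: substring={needs_search_substring(p)}, word={needs_search_word(p)}")
```

The whole-word version avoids searching on "know", at the cost of missing inflected forms such as "prices", so which test fits better depends on the prompts you expect.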

In [91]:
chat_with_online("what is the weather in Toronto now? (start with temperature reports)")
# chat_with_online(" and how about Los Angeles?")
# chat_with_online("What's today's date? ")
# chat_with_online("What is the latest news on Iran?")
print("-")

Here is a summary of the current weather in Toronto:

As of now, it's cloudy with a temperature of around 14°F (-10°C), and there are flurries or snow showers possible. The high temperature today will be around 57°F (14°C).
-
