## Libraries

In [1]:
import requests
import json

### A single prompt (send & receive)

In [2]:
def chat_with_llm_single_prompt(prompt, model="llama3.2"):
    # url = "http://localhost:11434/api/generate"
    url = "http://127.0.0.1:11434/api/generate"
    
    data = {
        "model": model,
        "prompt": prompt,
        "stream": True
    }
    
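    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()  # raise on HTTP error responses (4xx/5xx)

    # The streamed body is newline-delimited JSON: one object per line
    full_response = ""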
    for line in response.iter_lines():
        if line:
            json_response = json.loads(line)
            if 'response' in json_response:
                full_response += json_response['response']
    
    return full_response

In [3]:
myPrompt = "Where is Toronto?"
result = chat_with_llm_single_prompt(myPrompt)

print("Result>>\n" + result)

Result>>
Toronto is the largest city in Canada, located in the province of Ontario. It is situated on the northwestern shore of Lake Ontario, one of the Great Lakes. Specifically, Toronto is nestled at the southern end of Lake Ontario, where it meets the Niagara River and the St. Lawrence River.


In [13]:
def chat_short(prompt, model="llama3.2"):
    url = "http://localhost:11434/api/chat"
    
    data = {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Be concise. Answer in 1-3 sentences."
            },
            {"role": "user", "content": prompt}
        ],
        "options": {
            "num_predict": 150,   # upper bound on the number of generated tokens
            "temperature": 0.7    # sampling randomness; lower is more deterministic
        },
        "stream": False
    }
        
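    response = requests.post(url, json=data)
    response.raise_for_status()  # raise on HTTP error responses (4xx/5xx)
    # With "stream": False the whole reply arrives as a single JSON object
    result = response.json()['message']['content']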
    
    return result

In [14]:
myPrompt = "Where is Toronto?"
chat_short(myPrompt)

'Toronto is the largest city in Canada, located on the northwestern shore of Lake Ontario, within the province of Ontario. It is situated at the southern end of the Great Lakes region and borders the U.S. state of Michigan to the west.'

<b>Stream</b>       
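Streaming prints each chunk of text as soon as the model generates it, so the first words appear almost immediately instead of after the whole answer is complete. Total generation time stays the same; only the perceived latency drops.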

In [None]:
def chat_short_stream(prompt, model="llama3.2"):
    url = "http://localhost:11434/api/chat"
    
    data = {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Be concise. Answer in 1-3 sentences."
            },
            {"role": "user", "content": prompt}
        ],
        "options": {
            "num_predict": 150,
            "temperature": 0.7
        },
        "stream": True
    }
    
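    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()  # raise on HTTP error responses (4xx/5xx)

    # Print each chunk as it arrives and also collect the complete answer
    full_response = ""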
    for line in response.iter_lines():
        if line:
            json_response = json.loads(line)
            if 'message' in json_response:
                chunk = json_response['message']['content']
                full_response += chunk
                print(chunk, end="", flush=True)
    
    return full_response
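
The parsing loop in `chat_short_stream` can also be written as a generator, which makes it testable without a running server. The sketch below is only an illustration: the name `iter_chat_chunks` is hypothetical, and it assumes the stream marks its last object with `"done": true` and reports failures as an object with an `"error"` key. The sample lines are hand-written, not captured from a real response.

```python
import json


def iter_chat_chunks(lines):
    """Yield text chunks from an iterable of newline-delimited JSON lines."""
    for line in lines:
        if not line:
            continue  # skip blank lines
        event = json.loads(line)  # accepts both bytes and str
        if "error" in event:
            # Assumption: errors arrive as {"error": "..."} objects
            raise RuntimeError(event["error"])
        chunk = event.get("message", {}).get("content", "")
        if chunk:
            yield chunk
        if event.get("done"):
            break  # assumption: the final object carries "done": true


# Offline demonstration with hand-written lines shaped like the stream
sample_lines = [
    b'{"message": {"role": "assistant", "content": "Pi is "}, "done": false}',
    b'',
    b'{"message": {"role": "assistant", "content": "about 3.14159."}, "done": false}',
    b'{"message": {"role": "assistant", "content": ""}, "done": true}',
]
demo_text = "".join(iter_chat_chunks(sample_lines))
print(demo_text)
```

Against a live server, the same generator could be fed `response.iter_lines()` and each chunk printed with `print(chunk, end="", flush=True)`.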

In [22]:
chat_short_stream("what is pi?")
print("\n-")
chat_short_stream("and where is it used?")

Pi (π) is a mathematical constant representing the ratio of a circle's circumference to its diameter, approximately equal to 3.14159. It is an irrational number, meaning it cannot be expressed as a simple fraction and has been studied for centuries in mathematics, physics, and engineering.
-
I don't have enough context to provide an answer. Could you please provide more information or clarify which "it" you are referring to?

'I don\'t have enough context to provide an answer. Could you please provide more information or clarify which "it" you are referring to?'
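
The second answer asks for context because each call to `chat_short_stream` sends only the system message and the current prompt; nothing from the earlier exchange is included in the request. One possible way to carry context is to resend the whole conversation in `messages` on every call. The sketch below is a minimal illustration under that assumption: `ChatSession` and `send_chat` are hypothetical names, and the sender is injectable so the history logic can be tried without a server.

```python
def send_chat(messages, model="llama3.2", url="http://localhost:11434/api/chat"):
    """Post the full message list and return the assistant's reply text."""
    import requests  # imported here so the history logic below runs offline

    payload = {"model": model, "messages": messages, "stream": False}
    response = requests.post(url, json=payload)
    response.raise_for_status()
    return response.json()["message"]["content"]


class ChatSession:
    """Hypothetical helper that remembers earlier turns of a conversation."""

    def __init__(self, system="Be concise. Answer in 1-3 sentences.", send=send_chat):
        self.messages = [{"role": "system", "content": system}]
        self.send = send

    def ask(self, prompt):
        # Send every earlier turn plus the new user prompt
        pending = self.messages + [{"role": "user", "content": prompt}]
        reply = self.send(pending)
        # Store the exchange only after the request succeeded
        self.messages = pending + [{"role": "assistant", "content": reply}]
        return reply
```

With a server running, `session = ChatSession()` followed by `session.ask("what is pi?")` and `session.ask("and where is it used?")` would send the first question and its answer along with the second question, so the model can resolve "it".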