## Table of Contents

- [Ollama API Inference](#ollama-api)
  - [CLI](#cli-inference---ollama-api)
  - [Gradio-UI](#gradio-ui---ollama-api)
  - [Streamlit-UI](#streamlit-ui---ollama-api)
  

- [Langchain Inference](#langchain)
  - [Initial testing](#initial-testing---langchain)
  - [CLI](#cli---langchain)
  - [Gradio-UI](#gradio-ui---langchain)
  - [Streamlit-UI](#streamlit-ui---langchain)
  - [Open-WebUI](#open-webui---langchain)
  

- [Extras](#extras)
  - [HuggingFace API](#huggingface-api)
  - [Anthropic API](#anthropic-api)


## Ollama API

### CLI Inference - Ollama API

In [1]:
from locallm.response import response_api

url = "http://localhost:11434/api/generate"

headers = {
    'Content-Type': 'application/json',
}
model="llama3.2:1b"

conversation_history = []

def chat(prompt):
    response = response_api(prompt, conversation_history, url=url, model=model, headers=headers)
    if not response:
        return "Error"  # Do not store failed turns in the history
    conversation_history.append(prompt)
    conversation_history.append(response)
    return response

chat("Hello")

'How can I assist you today?'
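
The helper `response_api` comes from the local `locallm` package and its body is not shown here. As a rough sketch of what such a helper might do, the snippet below posts a non-streaming request to Ollama's `/api/generate` endpoint using only the standard library. The names `build_payload` and `generate`, and the way the history is flattened into the prompt, are assumptions for illustration, not the package's actual implementation.

```python
import json
import urllib.request


def build_payload(prompt, history, model):
    """Flatten prior turns plus the new prompt into one /api/generate payload."""
    full_prompt = "\n".join([*history, prompt])
    return {"model": model, "prompt": full_prompt, "stream": False}


def generate(prompt, history, url="http://localhost:11434/api/generate",
             model="llama3.2:1b", timeout=60):
    """POST the payload to a running Ollama server; return the reply text, or None on failure."""
    data = json.dumps(build_payload(prompt, history, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            # With "stream": False the server answers with a single JSON object
            return json.loads(resp.read().decode("utf-8")).get("response")
    except OSError:
        # Covers connection errors and timeouts (URLError subclasses OSError)
        return None
```

With an Ollama server running locally, `generate("Hello", [])` would return the model's reply as a string.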

### Gradio UI - Ollama API

In [None]:
import gradio as gr
from locallm.response import response_api

url = "http://localhost:11434/api/generate"

headers = {
    'Content-Type': 'application/json',
}
model="llama3.2:1b"

conversation_history = []

def chat(prompt):
    if prompt.strip().lower() == "/clear":
        # Handle the command locally instead of sending it to the model
        return clear_history()
    response = response_api(prompt, conversation_history, url=url, model=model, headers=headers)
    if not response:
        return "Error"  # Do not store failed turns in the history
    conversation_history.append(prompt)
    conversation_history.append(response)
    return response

## Gradio UI
def clear_history():
    conversation_history.clear()  # Reset the list
    return "Chat history cleared."

# demo = gr.Interface(
#     fn=chat,
#     inputs=gr.Textbox(lines=2, placeholder="Enter your prompt here..."),
#     outputs="text"
# )

with gr.Blocks() as demo:
    textbox = gr.Textbox(lines=1, placeholder="Enter your prompt here...")
    output = gr.Textbox()
    submit_btn = gr.Button("Send")
    clear_btn = gr.Button("Clear History")

    textbox.submit(chat, inputs=textbox, outputs=output)
    submit_btn.click(chat, inputs=textbox, outputs=output)
    clear_btn.click(clear_history, inputs=[], outputs=output)
    
demo.launch()

Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.




In [3]:
demo.close()

Closing server running on port: 7861


### Streamlit UI - Ollama API

In [None]:
import subprocess
result = subprocess.run(["streamlit", "run", "ui/st_ollama.py"])
result
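
Note that `subprocess.run` blocks the cell until the Streamlit process exits. If you would rather keep the notebook usable while the app is running, one option is to start it in the background with `subprocess.Popen`. The helper names `launch` and `stop` below are hypothetical, and this is only a minimal sketch.

```python
import subprocess
import sys


def launch(cmd):
    """Start a command in the background and return its process handle."""
    return subprocess.Popen(cmd)


def stop(proc, timeout=5):
    """Ask the process to terminate; kill it if it does not exit in time."""
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
    return proc.returncode
```

For example, `app = launch(["streamlit", "run", "ui/st_ollama.py"])` starts the UI, and `stop(app)` shuts it down again.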

## Langchain

### Initial testing - Langchain

In [None]:
## Stateless single query

from langchain_community.llms import Ollama

model="llama3.2:1b"
llm = Ollama(model=model) 

response = llm.invoke("What is the capital of Canada?")
# response = llm.invoke(input())
print(response)

The capital of Canada is Ottawa.


In [15]:
## Multi query

from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage

model="llama3.2:1b"
# llm = ChatOllama(
    #     model=model,
    #     temperature=0.8,
    #     num_predict=256
    # )
llm = ChatOllama(model=model) 

# content = input()
content = "Who discovered gravity?"
messages = [
    HumanMessage(content=content),
]

response = llm.invoke(messages)
print(response.content)  

Gravity was not exactly "discovered." Instead, the concept of gravity has been understood and described by various scientists throughout history. Here are some key figures who contributed to our understanding of gravity:

1. Sir Isaac Newton (1643-1727): Newton is often credited with developing the law of universal gravitation, which states that every point mass attracts every other point mass by a force acting along the line intersecting both points. He published his groundbreaking work "Philosophiæ Naturalis Principia Mathematica" in 1687.
2. Galileo Galilei (1564-1642): Galileo made significant contributions to our understanding of gravity during his experiments with falling bodies and the behavior of celestial objects. He observed that objects fall at the same rate in all gravitational fields, which challenged Newton's theory of universal gravitation.
3. Johannes Kepler (1571-1630): Kepler discovered the law of elliptical orbits, which describes how planets orbit around the sun. Th

In [None]:
## Template 

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define the prompt template with a single input variable
prompt = PromptTemplate.from_template("Who won the FIFA World Cup in {year}?")

# Equivalent explicit form, shown with a different question:
# prompt_template = "What is the capital of {country}?"
# prompt = PromptTemplate(input_variables=["country"], template=prompt_template)

# Create a RunnableSequence that combines the prompt, the LLM, and an output parser
chain = prompt | llm | StrOutputParser()
# chain = LLMChain(llm=llm, prompt=prompt) ## deprecated

# Execute the sequence with the input variable the template expects
response = chain.invoke({"year": 2022})
print(response)
print(response)


The winner of the 2022 FIFA World Cup was Argentina. They defeated France 4-2 in a penalty shootout, after the match ended 3-3 after extra time, to become the first South American team to win the tournament since Brazil in 1970 and 1994.
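
To build some intuition for what `prompt | llm | StrOutputParser()` does, here is a plain-Python analogy of the pipe: each step takes the previous step's output as its input. The `Step` class and the stand-in functions are hypothetical and only mimic the idea; this is not how LangChain implements its runnables.

```python
class Step:
    """Wrap a one-argument function so steps can be chained with `|`."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose: run self first, then feed its result into `other`
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.fn(value)


# Hypothetical stand-ins for the prompt, the model, and the output parser
fill_prompt = Step(lambda variables: "Who won the FIFA World Cup in {year}?".format(**variables))
fake_llm = Step(lambda text: {"content": f"(model reply to: {text})"})
to_text = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_llm | to_text
result = toy_chain.invoke({"year": 2022})
print(result)
```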


In [34]:
## Streaming ##
for chunk in llm.stream("Explain quantum computing in simple terms."):
    print(chunk.content, end="", flush=True)

# Or use chain.stream with a template

Quantum computing is a way of processing information that's different from the way computers currently work. Here's a simple explanation:

**Classical Computers**

Think of a classical computer like a really smart but very slow calculator. It works by using numbers and logic gates to perform calculations. Each gate can either be true (1) or false (0), and it takes time to figure out what the answer is.

**Quantum Computers**

A quantum computer is similar, but it uses special "qubits" (quantum bits). Qubits can exist in multiple states at the same time, like being both 0 and 1 simultaneously. This means a quantum computer can process many calculations at once, much faster than a classical computer.

Imagine you have a lot of calculators, each one with its own set of numbers to calculate. A classical computer would need to switch between all those calculators (each one working on the current number) to get the right answer. But a quantum computer can do it all in parallel, like having m

In [36]:
## ChatPromptTemplate

from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOllama(model="llama3.2")

In [37]:
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful and friendly AI assistant. Respond conversationally and helpfully. Here is the conversation history: {chat_history}"),
    ("user", "{user_input}")
])

In [51]:
messages[1]
# HumanMessage(content=user_input)

HumanMessage(content='How does machine learning work?')

In [None]:
user_input = "How does machine learning work?"
chat_history = []
messages = chat_prompt.format_messages(user_input=user_input, chat_history=chat_history)

# Non-streaming alternative:
# response = llm.invoke(messages)
# print(response.content)

for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)

Machine learning is such a broad and fascinating field, but I'll try to break it down in simple terms.

**What is Machine Learning?**

Machine learning (ML) is a type of artificial intelligence (AI) that enables computers to learn from data without being explicitly programmed. It's all about training algorithms to make predictions or decisions based on patterns in data.

**How Does Machine Learning Work?**

Here's a simplified overview:

1. **Data Collection**: Gathering large amounts of data, which can come in various forms such as images, text, audio, or sensor readings.
2. **Data Preprocessing**: Cleaning, transforming, and preparing the data for analysis.
3. **Model Selection**: Choosing an algorithm (a set of instructions) to use for learning from the data.
4. **Training**: The algorithm is fed the training data, and it starts to learn patterns and relationships between variables.
5. **Validation**: Testing the model on a separate dataset to evaluate its performance and accuracy.


In [76]:
chat_history = []

In [None]:
## Integrating chat history

def chat(user_input, llm=llm, chat_history=None, out=True):
    # Avoid a mutable default argument; start a fresh history when none is passed
    if chat_history is None:
        chat_history = []
    messages = chat_prompt.format_messages(user_input=user_input, chat_history=chat_history)
    chat_history.append(HumanMessage(content=user_input))

    if out:
        # Stream the reply to stdout while collecting the chunks
        chunks = []
        for chunk in llm.stream(messages):
            print(chunk.content, end="", flush=True)
            chunks.append(chunk.content)
        print()
        response = "".join(chunks)  # Chunks already carry their own whitespace
    else:
        response = llm.invoke(messages).content

    chat_history.append(AIMessage(content=response))
    return response, chat_history

In [78]:
response, chat_history = chat("Hello! This is Sam", chat_history=chat_history)
print("###")
response, chat_history = chat("Tell me a joke.", chat_history=chat_history)
print("###")
response, chat_history = chat("Can you explain it?", chat_history=chat_history)
print("###")
response, chat_history = chat("What is my name?", chat_history=chat_history)

Hi Sam! It's nice to meet you. Is there something I can help you with, or would you like to chat for a bit?
###
Here's one: A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?" The librarian replied, "It rings a bell, but I'm not sure if it's here or not."
###
The joke is a play on words. Pavlov's dogs refers to Ivan Pavlov, a Russian scientist who conditioned animals to associate the sound of a bell with food. The idea is that if Pavlov's dogs were in a library, they might be looking for books about dog behavior.

Schrödinger's cat, on the other hand, is a thought experiment by Austrian physicist Erwin Schrödinger. It's a famous example of quantum superposition, where a cat can exist in multiple states (alive and dead) until its state is observed.

The punchline "it rings a bell" is a clever connection between Pavlov's dogs and the idea that Schrödinger's cat might be in the library book you're looking for. The librarian 

In [79]:
chat_history

[HumanMessage(content='Hello! This is Sam'),
 AIMessage(content="Hi Sam! It's nice to meet you. Is there something I can help you with, or would you like to chat for a bit?"),
 HumanMessage(content='Tell me a joke.'),
 AIMessage(content='Here\'s one: A man walked into a library and asked the librarian, "Do you have any books on Pavlov\'s dogs and Schrödinger\'s cat?" The librarian replied, "It rings a bell, but I\'m not sure if it\'s here or not."'),
 HumanMessage(content='Can you explain it?'),
 AIMessage(content='The joke is a play on words. Pavlov\'s dogs refers to Ivan Pavlov, a Russian scientist who conditioned animals to associate the sound of a bell with food. The idea is that if Pavlov\'s dogs were in a library, they might be looking for books about dog behavior.\n\nSchrödinger\'s cat, on the other hand, is a thought exp
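
Streamed chunks already include their own leading whitespace, so the way they are joined matters when the reply is stored in the history. A small sketch with made-up chunks (the list below is illustrative, not captured model output):

```python
# Hypothetical chunks, shaped like the pieces a streaming model emits
chunks = ["Hi", " Sam", "!", " It", "'s", " nice", " to", " meet", " you", "."]

spaced = " ".join(chunks)  # Inserts an extra space between every chunk
exact = "".join(chunks)    # Reconstructs the text exactly as it was streamed

print(repr(spaced))
print(repr(exact))
```

Joining with an empty string keeps the stored `AIMessage` identical to what was printed on screen.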

### CLI - Langchain

In [None]:
from locallm.response import response_lc
from langchain_community.chat_models import ChatOllama

model="llama3.2:1b"
llm = ChatOllama(model=model)
chat_history=[]
def chat(user_input):
    global chat_history
    response, chat_history = response_lc(user_input, chat_history=chat_history, llm=llm, out=True)
    return response

_ = chat("Hello, I'm Sam")
_ = chat("What's my name")

Hello Sam! Nice to meet you. How's your day going so far?
You said your name in the initial message. Your name is Sam. How are you doing today?
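
For an interactive command-line session, the same kind of function can be wrapped in a simple read-reply loop. The sketch below is one possible shape for such a loop; `run_cli`, the `/exit` command, and the `"You: "` and `"Bot: "` labels are assumptions for illustration and are not part of `locallm`.

```python
def run_cli(respond, read=input, write=print):
    """Read prompts until the user types /exit or input ends; return the transcript."""
    transcript = []
    while True:
        try:
            user_input = read("You: ").strip()
        except EOFError:
            break
        if user_input.lower() == "/exit":
            break
        if not user_input:
            continue  # Ignore empty lines
        reply = respond(user_input)
        write(f"Bot: {reply}")
        transcript.append((user_input, reply))
    return transcript
```

Here `respond` is any function that takes the user's text and returns the reply as a string. If the function already prints the reply itself, pass `write=lambda _: None` to avoid printing it twice.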


### Gradio UI - Langchain

In [None]:
import gradio as gr
from locallm.response import response_lc
from langchain_community.chat_models import ChatOllama

model="llama3.2:1b"
llm = ChatOllama(model=model)
chat_history=[]

def chat(user_input):
    if user_input.strip().lower() == "/clear":
        return clear_history()
    global chat_history
    response, chat_history = response_lc(user_input, chat_history=chat_history, llm=llm, out=False)
    return response

## Gradio UI
def clear_history():
    global chat_history
    chat_history.clear()  # Reset the list
    return "Chat history cleared."

# demo = gr.Interface(
#     fn=chat,
#     inputs=gr.Textbox(lines=1, placeholder="Enter your prompt here..."),
#     outputs="text"
# )

with gr.Blocks() as demo:
    textbox = gr.Textbox(lines=1, placeholder="Enter your prompt here...")
    output = gr.Textbox()
    submit_btn = gr.Button("Send")
    clear_btn = gr.Button("Clear History")

    textbox.submit(chat, inputs=textbox, outputs=output)
    submit_btn.click(chat, inputs=textbox, outputs=output)
    clear_btn.click(clear_history, inputs=[], outputs=output)
    
demo.launch()

Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.




### Streamlit UI - Langchain

In [None]:
import subprocess
result = subprocess.run(["streamlit", "run", "ui/st_langchain.py"])

### Open WebUI - Langchain

## Extras

### HuggingFace API

In [None]:
from langchain_community.llms import HuggingFaceHub
import os
from dotenv import load_dotenv

load_dotenv() # Setup: Load HUGGINGFACEHUB_API_TOKEN saved in .env 

## Replace following:
# model="llama3.2:1b"
# llm = ChatOllama(model=model)

## With:
llm = HuggingFaceHub(
    repo_id="meta-llama/Llama-3.2-3B-Instruct",  # Replace with your desired model
    model_kwargs={"temperature": 0.7, "max_length": 100},
)

## Now this can be used with any UI 

llm.invoke("Jello")

"Jello molds with fruit and nuts are a classic summer dessert. These easy-to-make desserts are perfect for potlucks, barbecues, and outdoor gatherings. Here are some creative ideas to take your Jello molds to the next level:\n\n1.  Tropical fruit salad: Mix together your favorite tropical fruits like pineapple, mango, and kiwi, and add some toasted coconut flakes for extra texture and flavor.\n2.  Edible flowers: Add some edible flowers like violas, pansies, or nasturtiums to give your Jello molds a colorful and whimsical touch.\n3.  Fresh herbs: Incorporate fresh herbs like mint, basil, or lemongrass into your Jello molds for a refreshing and fragrant twist.\n4.  Citrus zest: Add some grated citrus zest like orange, lemon, or lime to the Jello mixture for a burst of citrus flavor.\n5.  Chopped nuts: Mix in some chopped nuts like almonds, walnuts, or pecans for added crunch and texture.\n6.  Candy pieces: Add some candy pieces like M&M's, chopped peanut butter cups, or chopped candy ca

### Anthropic API

Similarly, we can use the Anthropic API.

Setup:
Install `langchain-anthropic` and set the `ANTHROPIC_API_KEY` environment variable.

pip install -U langchain-anthropic
export ANTHROPIC_API_KEY="your-api-key"
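
Before constructing a client, it can help to fail early with a clear message when the key is missing. The helper below is a minimal sketch; `require_env` is a hypothetical name and not part of any of the libraries used here.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error if it is unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return value
```

Calling `require_env("ANTHROPIC_API_KEY")` (or `require_env("HUGGINGFACEHUB_API_TOKEN")` for the HuggingFace example) at the top of a cell surfaces a missing key before any request is made.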

In [None]:
%pip install --upgrade --quiet langchain langchain-anthropic

Note: you may need to restart the kernel to use updated packages.


In [None]:
from langchain_anthropic import ChatAnthropic
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
llm = ChatAnthropic(model_name="claude-3-haiku-20240307")

# Can use this llm with our methods and UI above

chain = prompt | llm | StrOutputParser()
chain.invoke({"topic": "bears"})
