### Ollama with LangChain

In [5]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM

# Initialize the model
model = OllamaLLM(model="gemma2:latest")

# Create a prompt template
template = """Question: {question}

Answer: Let's think step by step."""

prompt = ChatPromptTemplate.from_template(template)

# Create the chain
chain = prompt | model

# Run it
response = chain.invoke({"question": "What is LangChain?"})
print(response)

Okay, let's break down what LangChain is.  Here's a step-by-step explanation:

1. **What is it?** LangChain is a framework designed to help developers build applications powered by large language models (LLMs) like me. 

2. **Why was it created?** LLMs are incredibly powerful, but using them effectively in real-world applications can be complex. LangChain simplifies this process by providing tools and building blocks.

3. **What does it offer?**

   * **Chain Creation:**  It allows you to create "chains" of different components (LLMs, APIs, databases, etc.) that work together to accomplish a specific task. 
   * **Data Management:** LangChain helps integrate LLMs with external data sources, enabling them to access and process information beyond their initial training.
   * **Agent Capabilities:** It supports the creation of "agents" - LLM-powered entities that can interact with their environment, make decisions, and perform actions.

4. **Example Use Cases:**

   * **Chatbots:**  Build
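The `prompt | model` line above composes two steps so that the output of the first becomes the input of the second. The following is a minimal sketch of that idea using only the standard library. The `Step` class, `fill_prompt`, and `fake_model` are hypothetical stand-ins written for illustration; they are not LangChain's actual classes, and no model is called.

```python
class Step:
    """Toy pipeline component; an illustration, not LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


TEMPLATE = "Question: {question}\n\nAnswer: Let's think step by step."

# Hypothetical stand-ins for the prompt template and the model.
fill_prompt = Step(lambda variables: TEMPLATE.format(**variables))
fake_model = Step(lambda text: f"[model received {len(text)} characters]")

toy_chain = fill_prompt | fake_model

filled = fill_prompt.invoke({"question": "What is LangChain?"})
print(filled)
print(toy_chain.invoke({"question": "What is LangChain?"}))
```

Printing `filled` shows the exact text a model would receive once the `{question}` placeholder is substituted, which is often the quickest way to debug a prompt template.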

### Simple Query

In [6]:
from langchain_ollama.llms import OllamaLLM

llm = OllamaLLM(
    model="gemma2:latest",
    temperature=0.7
)

response = llm.invoke("What is Apache Spark?")
print(response)

Apache Spark is a powerful open-source **data processing engine** designed for efficient and large-scale data analysis. 

Here's a breakdown:

**Key Features:**

* **Fast & Efficient:** Spark excels at processing massive datasets due to its in-memory computing capabilities, distributed architecture, and optimized execution engine.
* **Versatility:** It supports a wide range of data processing tasks, including:
    * Batch Processing: Analyzing large datasets stored on disk.
    * Streaming Processing: Handling real-time data streams for immediate analysis.
    * Machine Learning: Building and deploying machine learning models at scale.
    * Graph Processing: Analyzing relationships and connections within complex networks.
* **Scalability:** Spark can seamlessly distribute workloads across a cluster of machines, allowing it to handle truly massive datasets.

**How it Works:**

Spark utilizes a distributed architecture where data is split into smaller chunks and processed in parallel by
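The `temperature=0.7` argument in the cell above controls how random the sampled tokens are. As a rough sketch of the usual definition (assumed here, not taken from Ollama's source), the model's raw scores are divided by the temperature before being turned into probabilities. The `logits` values below are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution and make less likely tokens more competitive.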

### Installing a new model in Ollama

In [7]:
!ollama pull llava

pulling manifest
pulling 170370233dd5:   0% ▕                  ▏ 6.3 MB/4.1 GB

### Streaming text

In [11]:
from langchain_ollama.llms import OllamaLLM

llm = OllamaLLM(model="gemma2:latest")

for chunk in llm.stream("Explain the CAP theorem in distributed systems"):
    print(chunk, end="", flush=True)

The CAP theorem, also known as Brewer's theorem, is a fundamental concept in distributed systems. It states that it is impossible for a distributed data store to simultaneously provide all three of the following guarantees:

* **Consistency:** Every read request receives the most recent write or an error. This means all nodes see the same data at the same time.
* **Availability:** Every request receives a (non-error) response, even if some nodes are down. 
* **Partition tolerance:** The system continues to operate despite arbitrary message loss or network partitions between nodes.

**Think of it this way:** You can only choose two out of these three guarantees. If you prioritize consistency and availability, you sacrifice partition tolerance. If you prioritize availability and partition tolerance, you sacrifice consistency. And so on.

**Here's a breakdown of each guarantee:**

* **Consistency:** Imagine two users updating the same document simultaneously. Consistency ensures both user
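The streaming loop above prints each chunk and then discards it. If the full response is also needed afterwards, the chunks can be collected while they are printed. This is a small sketch of that pattern; `fake_stream` is a hypothetical stand-in for `llm.stream(...)` so the example runs without a model.

```python
def fake_stream(text, size=8):
    """Hypothetical stand-in for llm.stream(): yields the text in small pieces."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def collect_stream(chunks):
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = collect_stream(
    fake_stream("The CAP theorem is also known as Brewer's theorem.")
)
```

Appending to a list and joining once at the end avoids repeatedly rebuilding a growing string.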

### Ollama options without LangChain

In [13]:
import ollama

response = ollama.chat(
    model='gemma2:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Write a short poem about data engineering'
        }
    ],
    options={
        'temperature': 0.8,
        'top_p': 0.9,
    }
)

print(response['message']['content'])

In oceans of data, vast and deep,
Data engineers their secrets keep.
With pipelines strong and ETL's might,
They shape the flow, both day and night.

From raw to refined, a graceful dance,
Transforming bits into valuable glance.
Schema designs, a structured art,
Where knowledge blooms, a beating heart.

BigQuery hums, Spark ignites,
As insights rise from digital heights.
Data engineers, unseen yet key,
Unlock the future, for you and me. 
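The `top_p` option in the cell above refers to nucleus sampling: only the most probable tokens whose cumulative probability reaches `top_p` remain candidates. Below is a minimal sketch of that filtering step under the usual definition (an assumption, not Ollama's actual implementation). The token probabilities in `toy_probs` are invented for illustration.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Hypothetical next-token distribution.
toy_probs = {"data": 0.5, "pipeline": 0.3, "schema": 0.15, "zebra": 0.05}

nucleus = top_p_filter(toy_probs, top_p=0.9)
print(nucleus)
```

With these numbers the least likely token is dropped, so sampling can no longer pick it regardless of the temperature setting.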





### Ollama as a REST API endpoint

In [14]:
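import json

import requests

# Ollama's local server listens on port 11434 by default
url = "http://localhost:11434/api/generate"

data = {
    "model": "gemma2:latest",
    "prompt": "Explain the CAP theorem",
    "stream": True,
}

# stream=True lets requests hand over the body incrementally instead of buffering it
response = requests.post(url, json=data, stream=True)
response.raise_for_status()  # fail early on HTTP errors

# Each non-empty line is one JSON object carrying the next piece of text
for line in response.iter_lines():
    if line:
        chunk = json.loads(line)
        print(chunk.get('response', ''), end='', flush=True)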

The CAP theorem, also known as Brewer's theorem, is a fundamental concept in distributed systems. It states that it is impossible for a distributed data store to simultaneously provide all three of the following guarantees:

* **Consistency (C):**  All nodes see the same data at the same time. This means that any read operation will return the most recent write, regardless of which node it's performed on.
* **Availability (A):** Every request receives a (non-error) response, even if some nodes are down or unavailable.

* **Partition tolerance (P):** The system continues to operate despite network partitions, where nodes can no longer communicate with each other.

**The Trade-off:**

Because of these inherent limitations, you can only choose two out of the three guarantees. This means there's always a trade-off to consider when designing a distributed system. 

Here are the possible combinations:

* **CP (Consistency and Partition tolerance):**  This approach prioritizes consistency ove
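The streamed reply from `/api/generate` arrives as one JSON object per line, each with a `response` fragment and a `done` flag. The sketch below shows the parsing step in isolation, without a running server. The `sample_lines` are hand-written to resemble that shape and are not captured output; real replies carry additional fields.

```python
import json

# Illustrative lines shaped like a streamed reply; not captured from a server.
sample_lines = [
    b'{"model": "gemma2:latest", "response": "The CAP", "done": false}',
    b"",  # iter_lines() can yield empty keep-alive lines
    b'{"model": "gemma2:latest", "response": " theorem", "done": false}',
    b'{"model": "gemma2:latest", "response": "", "done": true}',
]


def join_stream(lines):
    """Concatenate the `response` fragments until a chunk reports done."""
    parts = []
    for line in lines:
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


text = join_stream(sample_lines)
print(text)
```

Stopping on `done` rather than waiting for the connection to close makes the end of the answer explicit.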

### Image input with LLaVA

In [15]:
import ollama
import base64

# Read and encode image
with open("iceberg.png", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')

# Use ollama.chat with images parameter
response = ollama.chat(
    model='llava',
    messages=[
        {
            'role': 'user',
            'content': 'Describe this image in detail',
            'images': [image_data]
        }
    ]
)

print(response['message']['content'])

 The image is a digital graphic with a schematic representation of an iceberg model, used to illustrate the structure of a database system. It features a diagram in blue and white colors, which includes various layers representing different parts of the database architecture.

At the topmost layer, labeled "iceberg catalog," there are three boxes: one titled "metadata_catalog_db" with a blue background, another labeled "current_metadata_point" with a green background, and the third with the text "manifest_files" in white on a dark blue background. The last box is connected to two other boxes by lines that indicate relationships or processes.

Below this layer, there's a middle layer titled "metadata_data_layer," which includes several components: "database_files" with red and black stripes, "schema_files" in light blue, and "diffs" with white text on a dark background. These are connected by lines suggesting a hierarchical relationship between them.

At the bottom of the iceberg, label
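The image cell above base64-encodes the file before sending it, which turns raw bytes into plain ASCII text that can travel inside a JSON message. Here is a small sketch of that encoding step on its own. `encode_image_bytes` is a hypothetical helper, and `fake_png` is just the PNG signature followed by filler bytes, not a real image.

```python
import base64

# The 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def encode_image_bytes(raw):
    """Return the bytes as a base64 string (hypothetical helper)."""
    return base64.b64encode(raw).decode("utf-8")


# Hypothetical payload: signature plus filler, standing in for a file's contents.
fake_png = PNG_SIGNATURE + b"\x00" * 16

encoded = encode_image_bytes(fake_png)
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == fake_png, decoded.startswith(PNG_SIGNATURE))
```

Decoding the string and checking the signature is a cheap sanity check that the right file was read before a slow model call is made.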