## Installing Ollama dependencies
---

1. `pciutils` is required by Ollama to detect the GPU type.
2. Ollama itself is installed in the runtime instance by running `curl -fsSL https://ollama.com/install.sh | sh`.

In [None]:
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
  !sudo apt update -qq
  !sudo apt install -qq -y pciutils
  !curl -fsSL https://ollama.com/install.sh | sh
else:
  print("Not running in Google Colab")
  # Outside Colab, only verify that Ollama is already installed
  ! if ! ollama --version; then echo "ollama is not installed"; exit 1; fi
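
The shell check above can also be done in plain Python, which is easier to reuse in later cells. The following is a minimal sketch, not part of the original notebook; the helper name `ollama_installed` is hypothetical, and it only checks that an executable called `ollama` is on the `PATH`, not that it works.

```python
import shutil


def ollama_installed(command="ollama"):
    """Return True if an executable named `command` is found on PATH."""
    # shutil.which returns the full path of the executable, or None if absent
    return shutil.which(command) is not None


if not ollama_installed():
    print("ollama is not installed")
```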

## Running Ollama
---

To use Ollama, it must run as a background service in parallel with your scripts. Because Jupyter notebooks run code cells in sequence, it is difficult to run two cells at the same time. As a workaround, we start the service from Python with `subprocess` so that it does not block any other cell from running.

The service is started with the command `ollama serve`.

`time.sleep(5)` gives the Ollama service a few seconds to start before the model is downloaded.

In [None]:
import threading
import subprocess
import time
import requests

def run_ollama_serve():
  subprocess.Popen(["ollama", "serve"])
  
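# Check whether Ollama is already running; start it in the background if not
try:
  response = requests.get('http://localhost:11434')
  if response.status_code == 200:
    print("Ollama is running")
except requests.exceptions.ConnectionError:
  # Nothing is listening on the port yet, so launch the service
  print("Ollama is not running")
  thread = threading.Thread(target=run_ollama_serve)
  thread.start()
  time.sleep(5)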
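
A fixed five-second sleep may be too short on a slow machine and needlessly long on a fast one. One alternative is to poll the server until it answers. The sketch below is an assumption layered on top of the notebook, not its original method: the names `ollama_is_up` and `wait_until_ready` are hypothetical, and it uses the standard-library `urllib` instead of `requests` so it has no extra dependencies.

```python
import time
import urllib.request


def ollama_is_up(url="http://localhost:11434"):
    """Return True if an HTTP GET on `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, HTTP error, ...
        return False


def wait_until_ready(probe=ollama_is_up, timeout=30.0, interval=0.5):
    """Call `probe` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the cell above, `time.sleep(5)` could then be replaced by `wait_until_ready()`, which returns `False` if the service never came up within the timeout.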

## Running the project

In [None]:
%pip install python-dotenv weave langchain_core langchain_openai langchain_ollama

In [None]:
from dotenv import load_dotenv
import os

load_dotenv()

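# os.getenv returns None for an unset variable, so fall back to "" before slicing
api_key_preview = (os.getenv("OPENAI_API_KEY") or "")[:10]
print(f"First 10 characters of API key: {api_key_preview or '<not set>'}")

wandb_key_preview = (os.getenv("WANDB_API_KEY") or "")[:10]
print(f"First 10 characters of W&B key: {wandb_key_preview or '<not set>'}")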
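
The two previews above follow the same pattern, so they can be folded into one small helper that also copes with a missing variable. This is a sketch only; `preview_env` is a hypothetical name, and showing four characters by default is an arbitrary choice made here, not something the notebook prescribes.

```python
import os


def preview_env(name, visible=4):
    """Return a short masked preview of environment variable `name`."""
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable (None) and an empty string
        return f"{name} is not set"
    # Show only the first `visible` characters of the secret
    return f"{name} = {value[:visible]}..."


for key in ("OPENAI_API_KEY", "WANDB_API_KEY"):
    print(preview_env(key))
```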

In [None]:
import weave
from langchain_core.prompts import PromptTemplate

In [None]:
weave.init("langchain_demo")

In [None]:
# from langchain_openai import ChatOpenAI

# llm = ChatOpenAI()
# prompt = PromptTemplate.from_template("1 + {number} = ")

# llm_chain = prompt | llm

# output = llm_chain.invoke({"number": 2})

# print(output)

In [None]:
model = 'llama3'
!ollama pull $model
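
Pulling is harmless when the model is already present, but it can be skipped by first asking the server which models it has. The sketch below assumes Ollama's `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` such as `llama3:latest`, and that a model name without a tag refers to `:latest`. The helper names `fetch_tags` and `model_in_tags` are hypothetical.

```python
import json
import urllib.request


def model_in_tags(tags, model):
    """Return True if `model` is listed in a parsed /api/tags payload."""
    # Assumption: a name without an explicit tag means the ":latest" tag
    wanted = model if ":" in model else f"{model}:latest"
    names = [entry.get("name", "") for entry in tags.get("models", [])]
    return wanted in names


def fetch_tags(base_url="http://localhost:11434"):
    """Fetch and parse the list of locally available models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the service running, `model_in_tags(fetch_tags(), model)` tells you whether the `ollama pull` step is still needed.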

In [None]:
from langchain_ollama.chat_models import ChatOllama

# Initialize the ChatOllama model
model_llama = ChatOllama(
    model=model,  # Specify the model version
    base_url="http://localhost:11434",  # URL where Ollama is running locally
    temperature=0.7,  # Control the randomness of the output (0.0 to 1.0)
)

# Note: Ensure Ollama is running on your computer before executing this code

# If the request fails because the model is not available locally, pull it first.
# Run the following command in your terminal:
# ollama pull llama3

# Generate a response from the model
# (the prompt is Portuguese for "Hello, my name is Yuri. What is your name?")
response = model_llama.invoke("Olá, meu nome é Yuri. Qual é o seu nome?")

# Print the response
print(response)