## Installation

In [1]:
# Install the Ollama integration for LlamaIndex.
# Ollama itself is installed separately; see https://github.com/ollama/ollama?tab=readme-ov-file


! pip install llama-index-llms-ollama

Collecting llama-index-llms-ollama
  Downloading llama_index_llms_ollama-0.3.2-py3-none-any.whl.metadata (668 bytes)
Collecting ollama>=0.3.0 (from llama-index-llms-ollama)
  Downloading ollama-0.3.3-py3-none-any.whl.metadata (3.8 kB)
Downloading llama_index_llms_ollama-0.3.2-py3-none-any.whl (4.5 kB)
Downloading ollama-0.3.3-py3-none-any.whl (10 kB)
Installing collected packages: ollama, llama-index-llms-ollama
Successfully installed llama-index-llms-ollama-0.3.2 ollama-0.3.3
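
To confirm which versions ended up installed without rereading the pip log, the standard-library `importlib.metadata` module can be queried. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names taken from the pip output above.
for name in ("llama-index-llms-ollama", "ollama"):
    print(name, installed_version(name) or "not installed")
```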


In [None]:
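# Download the default llama3.1 model (8B parameters).
# `ollama pull` only downloads; `ollama run` would also open an interactive prompt and block the notebook.

! ollama pull llama3.1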
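
Before calling the model from Python, it can help to check that the Ollama server is reachable. The sketch below assumes Ollama listens on its default local address, `http://localhost:11434`; the helper name `ollama_is_running` is hypothetical.

```python
import http.client
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and HTTP errors all land here.
        return False


print("Ollama reachable:", ollama_is_running())
```

If this prints `False`, start the Ollama application (or `ollama serve`) before running the cells that follow.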

In [2]:
from llama_index.llms.ollama import Ollama

# Connect to the local Ollama server; allow up to 120 seconds per request.
llm = Ollama(model="llama3.1:latest", request_timeout=120.0)
resp = llm.complete("Who is Paul Graham?")
print(resp)

Paul Graham is a British-American programmer, entrepreneur, and writer. He's known for his insightful writings on technology, startups, and the culture of innovation.

Here are some key aspects of his life and work:

**Early life and education**: Graham was born in 1964 in Cambridge, England. He studied philosophy at Trinity College, Cambridge, but didn't graduate (he dropped out to pursue a career in software development).

**Programming career**: In the late 1980s, Graham co-founded Viaweb, an online store for making custom t-shirts. Later sold to Yahoo! in 1998 for $49 million, this was his first success as a startup entrepreneur.

**YC and Y Combinator**: In 2005, Graham helped found Y Combinator (YC), a seed-stage venture capital firm that provides funding and guidance to early-stage startups. As one of the co-founders, he played a key role in shaping YC's philosophy and approach to startup investing. Many successful companies have emerged from YC's incubation program, including A

In [3]:
resp = llm.complete("Write sql query to retrieve top 5 rows from alpha table")
print(resp)

To retrieve the top 5 rows from a table named "alpha", you can use the `LIMIT` clause in SQL. However, the exact syntax may vary slightly depending on whether you're using MySQL or another database management system like PostgreSQL or SQLite.

### For MySQL:

```sql
SELECT *
FROM alpha
LIMIT 5;
```

This query selects all columns (`*`) from the table named "alpha" and limits the result to only the top 5 rows.

### For PostgreSQL and SQLite:

The same approach works for these databases as well, but it's worth noting that while `LIMIT` is supported in both, the context may differ slightly.

```sql
SELECT *
FROM alpha
ORDER BY rowid DESC -- or your preferred ordering method
LIMIT 5;
```

However, if you simply need to get the first 5 rows (not necessarily ordered), you might want a slight adjustment:

### For PostgreSQL and SQLite:

```sql
SELECT *
FROM (
  SELECT *
  FROM alpha
) AS subquery
ORDER BY rowid DESC -- or your preferred ordering method
LIMIT 5;
```

But, in most cases, especi
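
Generated SQL is easy to check before trusting it. The following is a minimal sketch using Python's built-in `sqlite3` module and a throwaway in-memory `alpha` table; the columns `id` and `name` are invented for illustration and are not part of the original prompt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alpha (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO alpha (id, name) VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 11)],
)

# Without ORDER BY, SQL does not define which rows count as the "top 5",
# so an explicit ordering is added here.
top_five = conn.execute("SELECT * FROM alpha ORDER BY id LIMIT 5").fetchall()
for row in top_five:
    print(row)

conn.close()
```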

In [4]:
# json_mode=True requests JSON-formatted output from Ollama.
llm = Ollama(model="llama3.1:latest", request_timeout=120.0, json_mode=True)
response = llm.complete(
    "Who is Paul Graham? Output as a structured JSON object."
)
print(str(response))

{"name": "Paul Graham",
"profession": ["Programmer", "Entrepreneur", "Venture Capitalist", "Writer"],
"biography": "Born on November 4, 1964, in Cambridge, England.",
"education": "Ph.D. in Computer Science from Harvard University (1995)",
"notable_works": [
	{"title": "Hacker News",
	"description": "A social news and discussion website he co-founded with Alexis Ohanian and Kiffin Transferred ownership to YCombinator in 2007"},
	{"title": "Y Combinator",
	"description": "A startup accelerator that provides seed funding to early-stage companies"}
],
"awards_and_recognition":
[
	{"title": "Marvin Minsky Medal (2011)",
	"description": "Received for contributions to the field of Artificial Intelligence"}
]
}


In [6]:
# Same prompt again: the keys, and even the facts, differ from the previous run.
llm = Ollama(model="llama3.1:latest", request_timeout=120.0, json_mode=True)
response = llm.complete(
    "Who is Paul Graham? Output as a structured JSON object."
)
print(str(response))

{"name": "Paul Graham", " occupation": "Programmer, Entrepreneur, and Essayist", " birth_date": "1964", " education": "University of California, Berkeley (B.A.) and MIT (M.S.)", " notable_works": ["Hacking at CERN", "How to Make Wealth", "The Web Is a Computer for Humans"], " companies_founded": ["Viaweb", "Odeo", "Y Combinator"], " awards": ["Pauling Lecture Award", "MIT Technology Review's 50 Most Influential People in Technology"]}
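
In the two runs above, the output is valid JSON both times, but the keys differ between runs and some keys carry a leading space (for example `" occupation"`). A minimal sketch of parsing such a response and tidying its top-level keys follows; `parse_json_response` is a hypothetical helper, and the sample string is a shortened copy of the second output.

```python
import json


def parse_json_response(text: str) -> dict:
    """Parse a JSON object and strip stray whitespace from its top-level keys."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key.strip(): value for key, value in data.items()}


raw = '{"name": "Paul Graham", " occupation": "Programmer", " birth_date": "1964"}'
parsed = parse_json_response(raw)
print(sorted(parsed))  # prints ['birth_date', 'name', 'occupation']
```

In the notebook this would be called as `parse_json_response(str(response))`. It only normalizes key names; it does not check that the keys or the values are the ones expected.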


## Phoenix setup

In [4]:
# -y skips pip's confirmation prompt, which otherwise blocks the notebook until interrupted.
! pip uninstall -y phoenix

^C


In [5]:
! pip install arize-phoenix

Collecting arize-phoenix
  Downloading arize_phoenix-4.36.0-py3-none-any.whl.metadata (12 kB)
Collecting aioitertools (from arize-phoenix)
  Downloading aioitertools-0.12.0-py3-none-any.whl.metadata (3.8 kB)
Collecting aiosqlite (from arize-phoenix)
  Downloading aiosqlite-0.20.0-py3-none-any.whl.metadata (4.3 kB)
Collecting alembic<2,>=1.3.0 (from arize-phoenix)
  Downloading alembic-1.13.2-py3-none-any.whl.metadata (7.4 kB)
Collecting arize-phoenix-evals>=0.13.1 (from arize-phoenix)
  Downloading arize_phoenix_evals-0.16.0-py3-none-any.whl.metadata (4.3 kB)
Collecting arize-phoenix-otel>=0.4.1 (from arize-phoenix)
  Downloading arize_phoenix_otel-0.5.1-py3-none-any.whl.metadata (6.0 kB)
Collecting hdbscan>=0.8.33 (from arize-phoenix)
  Downloading hdbscan-0.8.38.post1.tar.gz (5.8 MB)

In [7]:
! pip install gcsfs


Collecting gcsfs
  Downloading gcsfs-2024.9.0.post1-py2.py3-none-any.whl.metadata (1.6 kB)
Collecting google-auth-oauthlib (from gcsfs)
  Downloading google_auth_oauthlib-1.2.1-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting google-cloud-storage (from gcsfs)
  Downloading google_cloud_storage-2.18.2-py2.py3-none-any.whl.metadata (9.1 kB)
Collecting google-api-core<3.0.0dev,>=2.15.0 (from google-cloud-storage->gcsfs)
  Downloading google_api_core-2.20.0-py3-none-any.whl.metadata (2.7 kB)
Collecting google-cloud-core<3.0dev,>=2.3.0 (from google-cloud-storage->gcsfs)
  Downloading google_cloud_core-2.4.1-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting google-resumable-media>=2.7.2 (from google-cloud-storage->gcsfs)
  Downloading google_resumable_media-2.7.2-py2.py3-none-any.whl.metadata (2.2 kB)
Collecting google-crc32c<2.0dev,>=1.0 (from google-cloud-storage->gcsfs)
  Downloading google_crc32c-1.6.0-cp312-cp312-win_amd64.whl.metadata (2.4 kB)
Collecting proto-plus<2.0.0dev,>=1.22.3 (

In [10]:
# The last two imports need packages that are not installed in the cells above:
#   pip install llama-index-embeddings-ollama openinference-instrumentation-llama-index
import phoenix as px
from gcsfs import GCSFileSystem
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor



# Start a local Phoenix server so that traces can be viewed in its UI.
session = px.launch_app()

# Initialize LlamaIndex auto-instrumentation
LlamaIndexInstrumentor().instrument()

# Use the local Ollama server for both generation and embeddings, so no OpenAI API key is needed.
# json_mode=True requests JSON-formatted responses from the LLM.
Settings.llm = Ollama(model="llama3.1:latest", request_timeout=120.0, json_mode=True)
Settings.embed_model = OllamaEmbedding(
    model_name="llama3.1:latest",
    base_url="http://localhost:11434",
    ollama_additional_kwargs={"mirostat": 0},
)

# Load a prebuilt index of the Arize documentation from a public GCS bucket.
file_system = GCSFileSystem(project="public-assets-275721")
index_path = "arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/"
storage_context = StorageContext.from_defaults(
    fs=file_system,
    persist_dir=index_path,
)

index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Query your LlamaIndex application
query_engine.query("What is the meaning of life?")
query_engine.query("How can I deploy Arize?")

# View the traces in the Phoenix UI
px.active_session().url

Existing running Phoenix instance detected! Shutting it down and starting a new instance...
Attempting to instrument while already instrumented


🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix
{"name": "Paul Graham",
 " occupation": "Entrepreneur, Programmer, Venture Capitalist, and Writer",
 "birth_date": "August 6, 1964",
 "nationality": "American (UK-born)",
 "education": "University of California, Berkeley (B.S., M.S.)",
 "known_for": [
   "Co-founder of Y Combinator",
   "Founder of Viaweb, which was later sold to Yahoo!",
   "Notable essays and articles on startup culture and technology"
 ],
 "awards": [
   "Member of the National Academy of Engineering (2016)"
 ],
 "links": {
  "twitter": "https://twitter.com/paulg",
  "blog": "http://paulgraham.com/"
 },
 "biography": "Paul Graham is an American entrepreneur, programmer, venture capitalist, and writer. He co-founded Y Combinator in 2005 with Robert Tappan Morris and Jessica Livingston. Prior to that, he co-founded Viaweb, a web-based spreadsheet program, which 

'http://localhost:6006/'
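
The log lines "Existing running Phoenix instance detected!" and "Attempting to instrument while already instrumented" suggest the cell was run more than once in the same kernel. One way to make such a cell safe to rerun is to reuse the active session when there is one. The sketch below shows the pattern with stand-in callables so that it runs without Phoenix installed; `get_or_launch` is a hypothetical helper, while `px.active_session` and `px.launch_app` are the calls already used in the cell above.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def get_or_launch(get_active: Callable[[], Optional[T]], launch: Callable[[], T]) -> T:
    """Return the active session if there is one, otherwise launch a new one."""
    session = get_active()
    return session if session is not None else launch()


# In the notebook this would be called as (requires arize-phoenix):
#   session = get_or_launch(px.active_session, px.launch_app)

# Stand-ins that record how often a launch actually happens.
launch_calls = []


def fake_launch() -> str:
    launch_calls.append(1)
    return "session"


first = get_or_launch(lambda: None, fake_launch)    # nothing active: launches
second = get_or_launch(lambda: first, fake_launch)  # already active: reuses
print(first, second, len(launch_calls))  # prints: session session 1
```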

### 1. Inferences example: human actions dataset

In [11]:
import pandas as pd
import phoenix as px

train_df = pd.read_parquet(
    "http://storage.googleapis.com/arize-assets/phoenix/datasets/unstructured/cv/human-actions/human_actions_training.parquet"
)

In [12]:
train_df.head()

|   | prediction_id | prediction_ts | url | image_vector | actual_action | predicted_action |
|---|---|---|---|---|---|---|
| 0 | 595d87df-5d50-4d60-bc5f-3ad1cc483190 | 1655757000.0 | https://storage.googleapis.com/arize-assets/fi... | [0.26720312, 0.02652928, 0.0, 0.028591828, 0.0... | drinking | drinking |
| 1 | 37596b85-c007-4e4f-901d-b87e5297d4b8 | 1655757000.0 | https://storage.googleapis.com/arize-assets/fi... | [0.08745878, 0.0, 0.16057675, 0.036570743, 0.0... | fighting | fighting |
| 2 | b048d389-539a-4ffb-be61-2f4daa52e700 | 1655757000.0 | https://storage.googleapis.com/arize-assets/fi... | [0.9822482, 0.0, 0.037284207, 0.017358225, 0.2... | clapping | clapping |
| 3 | 3e00c023-49b4-49c2-9922-7ecbf1349c04 | 1655757000.0 | https://storage.googleapis.com/arize-assets/fi... | [0.028404092, 0.063946, 1.0448836, 0.65191674,... | fighting | fighting |
| 4 | fb38b050-fb12-43af-b27d-629653b5df86 | 1655758000.0 | https://storage.googleapis.com/arize-assets/fi... | [0.06121698, 0.5172761, 0.50730985, 0.5771937,... | sitting | sitting |
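
The `prediction_ts` values look like Unix epoch seconds; that is an assumption based on their magnitude, not something the dataset states. Under that assumption, a short sketch for turning them into readable UTC timestamps is shown below. The helper name `to_utc` is hypothetical, and the two inputs are copied from the preview, where pandas may have rounded them for display.

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds: float) -> str:
    """Format Unix epoch seconds as an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


for ts in (1655757000.0, 1655758000.0):
    print(ts, "->", to_utc(ts))
```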


In [13]:
# Define a Schema that maps the columns of train_df to Phoenix fields.
train_schema = px.Schema(
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="predicted_action",
    actual_label_column_name="actual_action",
    embedding_feature_column_names={
        "image_embedding": px.EmbeddingColumnNames(
            vector_column_name="image_vector",
            link_to_data_column_name="url",
        ),
    },
)
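
A schema refers to columns by name, so a misspelled name is an easy mistake to make. A small check like the one below can catch it before the data is handed to Phoenix. This is a sketch in plain Python; `missing_columns` is a hypothetical helper, and the column list is copied from the `train_df` preview.

```python
def missing_columns(available, required):
    """Return the required column names that are absent from `available`."""
    present = set(available)
    return [name for name in required if name not in present]


# Column names as shown in the train_df preview.
columns = [
    "prediction_id", "prediction_ts", "url",
    "image_vector", "actual_action", "predicted_action",
]
# Names referenced by train_schema.
referenced = [
    "prediction_ts", "predicted_action", "actual_action",
    "image_vector", "url",
]

print(missing_columns(columns, referenced))  # prints []
```

In the notebook, `train_df.columns` can be passed in place of the hard-coded `columns` list.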

In [14]:
train_ds = px.Inferences(dataframe=train_df, schema=train_schema, name="training")
session = px.launch_app(primary=train_ds)

Existing running Phoenix instance detected! Shutting it down and starting a new instance...


🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


