<a href="https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/examples/llm/ollama_gemma.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Ollama - Gemma

## Setup
First, follow the [readme](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance.

[Gemma](https://blog.google/technology/developers/gemma-open-models/): a family of lightweight, state-of-the-art open models built by Google DeepMind, available in 2b and 7b parameter sizes.

[Ollama](https://ollama.com/library/gemma): supports both the 2b and 7b models.

Note: please install Ollama `>=0.1.26`.
You can download the pre-release version from the [Ollama releases page](https://github.com/ollama/ollama/releases/tag/v0.1.26).

When the Ollama app is running on your local machine:
- All of your local models are automatically served on `localhost:11434`.
- Select your model by setting `llm = Ollama(..., model="<model family>:<version>")`.
- Increase the default timeout (30 seconds) if needed by setting `Ollama(..., request_timeout=300.0)`.
- If you set `llm = Ollama(..., model="<model family>")` without a version, it will look for `latest`.
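Before constructing the LLM objects, it can help to confirm that the server is reachable and that the model you want has been pulled. The sketch below uses only the Python standard library and assumes Ollama's REST API exposes `GET /api/tags` with a `models` list whose entries carry a `name` field. The helper names (`normalize_model_tag`, `parse_model_names`, `list_local_models`) are hypothetical and not part of LlamaIndex or Ollama.

```python
import json
import urllib.error
import urllib.request
from typing import List

OLLAMA_BASE_URL = "http://localhost:11434"  # default local address mentioned above


def normalize_model_tag(model: str) -> str:
    """Return "<family>:<version>", appending ":latest" when no version is given."""
    family, sep, version = model.partition(":")
    if sep and version:
        return f"{family}:{version}"
    return f"{family}:latest"


def parse_model_names(payload: dict) -> List[str]:
    """Extract model names from a decoded /api/tags body (assumed response shape)."""
    return [entry["name"] for entry in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_BASE_URL, timeout: float = 5.0) -> List[str]:
    """Ask the local Ollama server which models it has; return [] if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, request timed out, or the body was not valid JSON.
        return []


if __name__ == "__main__":
    wanted = normalize_model_tag("gemma")  # no version given, so ":latest" is assumed
    available = list_local_models()
    print(f"{wanted} available locally: {wanted in available}")
```

If the check prints `False`, the model probably still needs to be downloaded with Ollama before `Ollama(model=...)` can use it.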

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [1]:
!pip install llama-index-llms-ollama


In [None]:
!pip install llama-index

In [None]:
from llama_index.llms.ollama import Ollama

In [None]:
# request_timeout is in seconds; raise it (e.g. to 300.0) if generation times out
gemma_2b = Ollama(model="gemma:2b", request_timeout=30.0)
gemma_7b = Ollama(model="gemma:7b", request_timeout=30.0)

In [None]:
resp = gemma_2b.complete("Who is Paul Graham?")
print(resp)

Paul Graham is an entrepreneur, investor, and podcaster known for his outspokenness and unconventional approach. He has built several successful companies, including Xero (now Intuit), FullStory, and The School of Greatness.

Here are some of his notable achievements:

* **Founder and CEO of Xero:** Xero is a leading accounting software company for small and medium-sized businesses, with over 1 million users worldwide. Graham was instrumental in Xero's rapid growth and eventual acquisition by Intuit (now part of Microsoft) for $750 million in 2015.
* **Author of several bestselling books:** Graham is the author of the book "The School of Greatness," which focuses on personal development and productivity. He also co-authored "Built to Last: Why Your Business Matters" with his former Xero partner, Steve Huffman.
* **Founder of The School of Greatness:** The School of Greatness is a non-profit organization that offers mentoring, workshops, and retreats to help entrepreneurs and business l

In [None]:
resp = gemma_7b.complete("Who is Paul Graham?")
print(resp)

Paul Graham (born February 21, about  45 years old) has achieved significant success as a software developer and entrepreneur. He's known for his insightful writing on Software Engineering at greaseboxsoftware where he frequently writes articles with humorous yet pragmatic advice regarding programming languages such Python while occasionally offering tips involving general life philosophies that resonate deeply amongst the programmer community, particularly about work ethic ("hacker mentality")  He has contributed to software engineering communities in a multitude of ways:

**Developer:**
* Created Bulletphysics (a physics engine for games) using PyTorch. He resigned from his day job as Lead Software Engineer at Aversim Technologies after successfully building it and expriming its potential, showcasing the powerfull nature this open-source project has achieved within software engineering circles with significant media coverage involving top professionals expressing admiration
* Wrote B

In [None]:
resp = gemma_2b.complete("Who is owning Tesla?")
print(resp)

Tesla Inc. is owned by Elon Musk. He founded Tesla in 2003 with the goal of creating sustainable transportation. Tesla was originally listed on the NASDAQ Stock Market under the symbol "TSLA". In 2013, Tesla went private and began trading on the New York Stock Exchange (NYSE).


In [None]:
resp = gemma_7b.complete("Who is owning Tesla?")
print(resp)

Elon Musk, CEO of SpaceX and former electric car company Telsa Motors (now part owned by Ford Motor Company), owns about a quarter to nearly half the stock in TESLA inc.


### Call `chat` with a list of messages

In [None]:
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = gemma_7b.chat(messages)

In [None]:
print(resp)

assistant: Avast, me heartie. My Name be Jolly Roger and I plunder the high seas for treasures untold!


### Streaming

Using the `stream_complete` endpoint

In [None]:
response = gemma_7b.stream_complete("Who is Paul Graham?")

In [None]:
for r in response:
    print(r.delta, end="")

Paul graham, commonly referred to as "PerlGuy" online has a significant presence in the software engineering and programming communities. He specializes primarily on:

**1.) Software Design:**    * Shared his ideas about effective coding patterns for Modularization (DRY) into books like Expert Refactoring using Smells And Polymorphism Principle(SRPPP).
 * Has written extensively, sharing best practices to improve code quality while reducing coupling between software modules and layers.  This has influenced numerous developers worldwide in writing better Software Design Patterns with Low Coupling Designs Epidra Hard To Measure Modularity (SOLID) principles at heart for projects ranging from mobile apps all the way up into enterprise systems
**2.) Open Source:**    * Actively contributed to Project Lombok, Phalanger(now Relocator), and CouchSurfer. Shared his code design patterns on fufurce with significant impact as well . He has earned recognition by maintaining high quality open sourc

Using the `stream_chat` endpoint

In [None]:
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = gemma_7b.stream_chat(messages)

In [None]:
for r in resp:
    print(r.delta, end="")

Avast, me heartie! My alias be Screevy Bob. If you ask for my real nom de guerre... I ain't tellin'. Arrgh and all that jazz!!
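
Both streaming loops above print each `r.delta` and then discard it. If you also want the full text once the stream finishes, you can accumulate the deltas as they arrive. The following is a minimal sketch that runs without an Ollama server: `FakeChunk` and `fake_stream` are hypothetical stand-ins that model only the `.delta` attribute used above, and `collect_stream` is an illustrative helper, not a LlamaIndex function.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk; only `.delta` is modelled."""

    delta: Optional[str]


def fake_stream(text: str, size: int = 8) -> Iterator[FakeChunk]:
    """Yield `text` in fixed-size pieces to imitate a token stream."""
    for start in range(0, len(text), size):
        yield FakeChunk(delta=text[start:start + size])


def collect_stream(chunks: Iterable, echo: bool = True) -> str:
    """Print each delta as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        piece = chunk.delta or ""  # treat a missing delta as empty text
        if echo:
            print(piece, end="", flush=True)
        parts.append(piece)
    if echo:
        print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect_stream(fake_stream("Avast, me heartie! My alias be Screevy Bob."))
```

With a running server, the same helper should work on the generators returned by `gemma_7b.stream_complete(...)` or `gemma_7b.stream_chat(...)`, assuming each yielded item exposes `.delta` as in the loops above.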