In [1]:
%%capture 
!pip install llama-index==0.10.37 llama-index-llms-cohere==0.2.0 

In [2]:
import os

from getpass import getpass
import nest_asyncio

from dotenv import load_dotenv

nest_asyncio.apply()

load_dotenv()

True

In [3]:
CO_API_KEY = os.environ.get("CO_API_KEY") or getpass("Enter your Cohere API key: ")

When building an LLM-based application, one of the first decisions you make is which LLM to use. You can use more than one if you wish.

The LLM will be used at various stages of your pipeline, including:

- During indexing:
  - 👩🏽‍⚖️ Judging data relevance (whether to index it or not).
  - 📖 Summarizing data and indexing those summaries.

- During querying:
  - 🔎 Retrieval: fetching data from your index, choosing the best data source from several options, or even using tools to fetch data.
  - 💡 Response synthesis: turning the retrieved data into an answer, merging answers, or converting data (for example, text to JSON).

LlamaIndex gives you a single interface to various LLMs. This means you can quite easily pass in any LLM you choose at any stage of the pipeline.

In this course we'll primarily use OpenAI. You can see a full list of LLM integrations [here](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/modules.html) and use your LLM provider of choice.
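
The idea of a single interface can be sketched in plain Python. This is only an illustration, not LlamaIndex's actual class hierarchy: `TextLLM`, `UpperLLM`, `ReverseLLM`, and `run_stage` are hypothetical names, and the stand-in models return transformed text instead of calling a provider.

```python
from typing import Protocol


class TextLLM(Protocol):
    """Hypothetical minimal interface: anything with a complete() method."""

    def complete(self, prompt: str) -> str: ...


class UpperLLM:
    """Stand-in model that upper-cases the prompt (no network call)."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class ReverseLLM:
    """A second stand-in model that reverses the prompt."""

    def complete(self, prompt: str) -> str:
        return prompt[::-1]


def run_stage(llm: TextLLM, text: str) -> str:
    """A pipeline stage that depends only on the shared interface."""
    return llm.complete(f"Summarize: {text}")


# The same stage works with either model because both expose complete().
upper_result = run_stage(UpperLLM(), "abc")
reverse_result = run_stage(ReverseLLM(), "abc")
print(upper_result)
print(reverse_result)
```

Because the stage only relies on `complete`, swapping one model for another requires no change to the stage itself.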

# Basic Usage

You can call `complete` with a prompt:

In [4]:
from llama_index.llms.cohere import Cohere

llm = Cohere(model="command-r-plus", temperature=0.2)

response = llm.complete("Alexander the Great was a")

print(response)

Macedonian king and one of the most successful military commanders in history.


# Prompt Templates

- ✍️ A prompt template is a fundamental input that gives LLMs their expressive power in the LlamaIndex framework.

- 💻 Prompt templates are used to build the index, perform insertions, traverse the index during querying, and synthesize the final answer.

- 🦙 LlamaIndex has several built-in prompt templates.

- 🛠️ Below is how you can create one from scratch.


In [5]:
from llama_index.core import PromptTemplate

# Wrap the string in PromptTemplate so the imported class is actually used
template = PromptTemplate("Write a song about {thing} in the style of {style}.")

prompt = template.format(thing="a broken xylophone", style="parody rap")

response = llm.complete(prompt)

print(response)

Yo, listen up, it's time to drop some beats,
But there's a problem, my xylophone's incomplete,
A broken xylophone, what a terrible fate,
Can't play my tunes, this situation's straight whack.

I hit the bars, but they don't sound the same,
Some are missing, it's like a musical shame,
I can't hit the high notes, can't hit the low,
This xylophone's busted, where'd the good vibes go?

I tried to fix it, but it's beyond repair,
The mallets are lonely, they're hitting thin air,
I'm like a rapper without a sick flow,
My xylophone's broken, and I'm feeling so low.

I used to play it with such delight,
The sweet melodies kept me up all night,
But now it's silent, no joyful sound,
My xylophone's broken, it's like my heart's been pounded.

I miss the rhythms, the harmonies too,
The way it made me feel, there's nothing it couldn't do,
But now it's just a memory, a sad distant tune,
My xylophone's busted, and I'm feeling the gloom.

I'll find a new instrument, that's what I'll do,
Forget this xylop
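
At its core, a template of this kind is a string with named placeholders. A minimal standard-library sketch shows how you might list the placeholders and fail early when one is missing. The helper names `template_variables` and `render` are hypothetical, and this is not how LlamaIndex implements `PromptTemplate`.

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the named placeholders in a str.format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def render(template: str, **values: str) -> str:
    """Fill the template, raising a clear error if a placeholder is missing."""
    missing = [v for v in template_variables(template) if v not in values]
    if missing:
        raise ValueError(f"Missing template variables: {missing}")
    return template.format(**values)


template = "Write a song about {thing} in the style of {style}."
variables = template_variables(template)
prompt = render(template, thing="a broken xylophone", style="parody rap")
print(variables)
print(prompt)
```

Checking the variables up front gives a more readable error than the `KeyError` that a bare `str.format` call would raise.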

# 💭 Chat Messages

In [6]:
from llama_index.core.llms import ChatMessage
from llama_index.llms.cohere import Cohere

llm = Cohere(model="command-r-plus")

messages = [
    ChatMessage(role="system", content="You're a hella punk bot from South Sacramento"),
    ChatMessage(role="user", content="Hey, what's up dude."),
]

response = llm.chat(messages)

print(response)

assistant: Not much, homie. Just chillin' and ready to stir up some trouble. What's good with you?


# Chat Prompt Templates 

In [7]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core import ChatPromptTemplate

llm = Cohere(model="command-r-plus")

chat_template = [
    ChatMessage(role=MessageRole.SYSTEM, content="You always answer questions with as much detail as possible."),
    ChatMessage(role=MessageRole.USER, content="{question}"),
]

chat_prompt = ChatPromptTemplate(chat_template)

# format() flattens the chat messages into a single prompt string for complete()
response = llm.complete(chat_prompt.format(question="How far did Alexander the Great go in his conquests?"))

print(response)

Alexander the Great's conquests extended across a vast expanse of territory, from his native Macedonia in Northern Greece to the far reaches of Central Asia and India. Here is a breakdown of the key regions he conquered:

1. Greece and Balkans: Alexander first consolidated his power in Greece, defeating the city-states of Thebes and Athens, and asserting his dominance over the Balkans, including modern-day Bulgaria, Albania, and parts of Croatia.

2. Persian Empire: Alexander's most significant conquests were within the vast Achaemenid Persian Empire, which at the time encompassed much of the Middle East, Egypt, and parts of Asia Minor (modern-day Turkey). He defeated the Persian king Darius III in a series of decisive battles, including the Battle of Issus (333 BCE) and the Battle of Gaugamela (331 BCE). Alexander's conquests in this region included Egypt, Mesopotamia (modern-day Iraq), Persia (Iran), Phoenicia, Judea, and parts of modern-day Turkey and Syria.

3. Central Asia: Alexan
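
A chat prompt template applies the same placeholder idea to a list of role-tagged messages rather than to one string. The sketch below illustrates this in plain Python. `Message` and `format_messages` here are hypothetical stand-ins written for this example, not the LlamaIndex classes.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a role-tagged chat message."""

    role: str
    content: str


def format_messages(templates: list[Message], **values: str) -> list[Message]:
    """Fill the placeholders of every message, keeping each role intact."""
    return [Message(m.role, m.content.format(**values)) for m in templates]


templates = [
    Message("system", "You always answer questions with as much detail as possible."),
    Message("user", "{question}"),
]

messages = format_messages(templates, question="How far did Alexander the Great go?")
for message in messages:
    print(f"{message.role}: {message.content}")
```

Keeping the roles separate preserves the distinction between system instructions and user input, which is lost when everything is flattened into one prompt string.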

# Streaming Output

In [8]:
from llama_index.llms.cohere import Cohere
from llama_index.core.llms import ChatMessage, MessageRole

llm = Cohere(model="command-r-plus")

messages = [
    ChatMessage(role=MessageRole.SYSTEM, content="You're a great historian bot."),
    ChatMessage(role=MessageRole.USER, content="When did Alexander the Great arrive in China?")
]

response = llm.stream_chat(messages)

for r in response:
    print(r.delta, end="")

Alexander the Great never arrived in China. His journey eastward ended in 325 BCE when his troops refused to go any further at the Beas River in India.
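
The loop above prints each `delta` as it arrives. If you also need the complete text afterwards, one option is to collect the pieces while printing them. The sketch below assumes a hypothetical `fake_stream` generator in place of a real streaming response, so it runs without an API key.

```python
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Hypothetical stand-in for a streaming response: yields text in chunks."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


answer = "Alexander the Great never arrived in China."
pieces = []
for delta in fake_stream(answer):
    print(delta, end="", flush=True)  # show each piece as soon as it arrives
    pieces.append(delta)              # keep it so the full text is available later
print()

# Joining the deltas reconstructs the complete response.
full_text = "".join(pieces)
```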

# 💬 Chat Engine


In [None]:
from llama_index.core.chat_engine import SimpleChatEngine

llm = Cohere(model="command-r-plus")

chat_engine = SimpleChatEngine.from_defaults(llm=llm)

chat_engine.streaming_chat_repl()
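
Conceptually, a chat engine wraps an LLM and keeps the conversation history so that each turn is answered in context. The sketch below illustrates that idea only. `TinyChatEngine` and `counting_llm` are hypothetical names, and this is not how `SimpleChatEngine` is implemented.

```python
from typing import Callable

# A chat history is a list of (role, content) pairs.
History = list[tuple[str, str]]


class TinyChatEngine:
    """Hypothetical sketch: stores the conversation and resends it each turn."""

    def __init__(self, llm: Callable[[History], str]) -> None:
        self.llm = llm
        self.history: History = []

    def chat(self, user_message: str) -> str:
        self.history.append(("user", user_message))
        reply = self.llm(self.history)  # the model sees the whole history
        self.history.append(("assistant", reply))
        return reply


def counting_llm(history: History) -> str:
    """Stand-in model: reports how many user turns it has seen."""
    user_turns = sum(1 for role, _ in history if role == "user")
    return f"You have sent {user_turns} message(s)."


engine = TinyChatEngine(counting_llm)
first = engine.chat("Hello")
second = engine.chat("Still there?")
print(first)
print(second)
```

The second reply differs from the first only because the engine passed along the earlier turn, which is the behavior that separates a chat engine from repeated standalone `chat` calls.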