In [6]:
%%capture 
!pip install llama-index==0.12.49 llama-index-llms-cohere==0.5.0

In [7]:
import os

from getpass import getpass
import nest_asyncio

from dotenv import load_dotenv

nest_asyncio.apply()

load_dotenv()

True

In [8]:
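# .get() returns None when the variable is unset, so the getpass prompt can run instead of a KeyError.
CO_API_KEY = os.environ.get("CO_API_KEY") or getpass("Enter your Cohere API key: ")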

When building an LLM-based application, one of the first decisions you make is which LLM to use. You can also use more than one.

The LLM is used at several stages of your pipeline, including:

- During indexing:
  - 👩🏽‍⚖️ Judging data relevance (whether to index it or not).
  - 📖 Summarizing data and indexing those summaries.

- During querying:
  - 🔎 Retrieval: fetching data from your index, choosing the best data source from several options, or using tools to fetch data.
  - 💡 Response synthesis: turning the retrieved data into an answer, merging answers, or converting data (for example, text to JSON).

LlamaIndex gives you a single interface to various LLMs. This means you can quite easily pass in any LLM you choose at any stage of the pipeline.
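
To make the "single interface" idea concrete, here is a minimal sketch in plain Python. It is not LlamaIndex's actual class hierarchy: `CompletionLLM`, `EchoLLM`, `ShoutLLM`, and `summarize` are hypothetical names made up for illustration. The point is only that a pipeline stage written against a shared `complete` method works with any backend that provides it.

```python
from typing import Protocol


class CompletionLLM(Protocol):
    """Hypothetical interface: anything with a `complete(prompt)` method."""

    def complete(self, prompt: str) -> str: ...


class EchoLLM:
    """Stand-in backend that echoes the prompt (no network calls)."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutLLM:
    """Second stand-in backend that upper-cases the prompt."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def summarize(text: str, llm: CompletionLLM) -> str:
    """A pipeline stage that depends only on the shared interface."""
    return llm.complete(f"Summarize: {text}")


# The same stage runs unchanged with either backend.
echo_result = summarize("hello", EchoLLM())
shout_result = summarize("hello", ShoutLLM())
```

Swapping providers then means changing only the object passed in, not the stage that uses it.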

In this course we'll primarily use OpenAI, although this notebook uses Cohere. You can see a full list of LLM integrations [here](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/modules.html) and use your LLM provider of choice.

# Basic Usage

Call `complete` with a prompt string to get a text completion:

In [9]:
from llama_index.llms.cohere import Cohere

llm = Cohere(model="command-r-plus", temperature=0.2)

response = llm.complete("Alexander the Great was a")

print(response)

king of Macedonia and one of the greatest military commanders in history.


# Prompt Templates

- ✍️ Prompts are the fundamental input that gives LLMs their expressive power, and a prompt template is how you define one in the LlamaIndex framework.

- 💻 LlamaIndex uses prompts to build the index, perform insertions, traverse the index during querying, and synthesize the final answer.

- 🦙 LlamaIndex has several built-in prompt templates.

- 🛠️ Below is how you can create one from scratch.


In [10]:
from llama_index.core import PromptTemplate

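# Wrap the string in PromptTemplate so the class imported above is actually used.
template = PromptTemplate("Write a song about {thing} in the style of {style}.")

# format() fills in the placeholders and returns a plain string.
prompt = template.format(thing="a broken xylophone", style="parody rap")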

response = llm.complete(prompt)

print(response)

(Verse 1)
Yo, it's me, the xylophone, I used to be so fine
But now I'm broken, can't play a single line
My keys are cracked, my sound is whack
I'm feeling blue, don't know what to do

(Chorus)
Broken xylophone, can't make a peep
My keys are outta whack, I'm feeling beat
I used to be the star of the show
But now I'm broken, where did my keys go?

(Verse 2)
I remember the days when I was in tune
The kids would play me, I'd make a joyful tune
But now I'm silent, can't make a sound
My keys are scattered, on the ground

(Chorus)
Broken xylophone, can't join the band
I'm feeling sad, I don't understand
I used to be so proud, now I'm ashamed
My keys are gone, who's to blame?

(Bridge)
I know I can't be fixed, that's a fact
But maybe I can find a new track
I'll learn to make music in a different way
And be a xylophone that's unique, hip-hooray!

(Verse 3)
So I'll embrace my flaws, and start anew
I'll find a way to make some noise, it's true
I might not be perfect, but I'll still shine
A broken
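
The placeholder filling used in the cell above can be sketched with the standard library alone. This is an illustration under assumptions, not how LlamaIndex implements `PromptTemplate`: the helper names `template_variables` and `fill` are hypothetical. It shows how the `{thing}` and `{style}` placeholders can be discovered and how a missing value can be caught before the prompt reaches the LLM.

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the placeholder names found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def fill(template: str, **values: str) -> str:
    """Format the template, failing early if a placeholder has no value."""
    missing = [v for v in template_variables(template) if v not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


template = "Write a song about {thing} in the style of {style}."
names = template_variables(template)
prompt = fill(template, thing="a broken xylophone", style="parody rap")
```

Checking the variables up front gives a clearer error than a half-filled prompt would.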

# 💭 Chat Messages

In [11]:
from llama_index.core.llms import ChatMessage
from llama_index.llms.cohere import Cohere

llm = Cohere(model="command-r-plus")

messages = [
    ChatMessage(role="system", content="You're a hella punk bot from South Sacramento"),
    ChatMessage(role="user", content="Hey, what's up dude."),
]

response = llm.chat(messages)

print(response)

assistant: Not much, homie. Just chillin' and ready to stir up some trouble. How can I help you?


# Chat Prompt Templates 

In [12]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core import ChatPromptTemplate

llm = Cohere(model="command-r-plus")

chat_template = [
    ChatMessage(role=MessageRole.SYSTEM, content="You always answer questions with as much detail as possible."),
    ChatMessage(role=MessageRole.USER, content="{question}"),
]

chat_prompt = ChatPromptTemplate(chat_template)

response = llm.complete(chat_prompt.format(question="How far did Alexander the Great go in his conquests?"))

print(response)

Alexander the Great, king of Macedonia, embarked on a remarkable series of conquests that took him as far east as Punjab, India, and as far south as Egypt. By the time of his death in 323 BCE, he had built an empire that stretched over some 2 million square miles (5.2 million square km), covering a large portion of the known world at the time.

Here's a rough overview of the key regions he conquered:
- Greece and the Balkans: Alexander first secured his power base in Greece and the Balkans, where he consolidated his rule over Macedonia, Epirus, and various Greek city-states.
- Persian Empire: Alexander's most significant conquests were in the east, where he invaded and ultimately toppled the vast Achaemenid Persian Empire. This included modern-day Turkey, Syria, Lebanon, Israel, Egypt, and parts of Iraq, Iran, and Central Asia. Notable battles include Issus and Gaugamela.
- Egypt: After liberating Egypt from Persian rule, Alexander was welcomed as a liberator and was crowned pharaoh. H
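
In the cell above, `chat_prompt.format(...)` produces a single string, which is why it is passed to `complete` rather than `chat`. `ChatPromptTemplate` also provides `format_messages`, which returns a list of messages that can be passed to `llm.chat`. The difference between the two shapes can be sketched without the library. Everything below is a hypothetical stand-in (`Message`, `format_messages`, `to_prompt`), and the exact layout of the flattened string is an assumption, not LlamaIndex's format.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a chat message: a role plus text content."""

    role: str
    content: str


def format_messages(templates: list[Message], **values: str) -> list[Message]:
    """Fill each message template's placeholders, keeping roles intact."""
    return [Message(m.role, m.content.format(**values)) for m in templates]


def to_prompt(messages: list[Message]) -> str:
    """Flatten messages into 'role: content' lines (layout is an assumption)."""
    return "\n".join(f"{m.role}: {m.content}" for m in messages)


chat_template = [
    Message("system", "You always answer questions with as much detail as possible."),
    Message("user", "{question}"),
]
messages = format_messages(chat_template, question="Who was Alexander the Great?")
flat = to_prompt(messages)
```

Keeping the messages as a list preserves the role of each turn, while the flattened string loses that structure.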

# Streaming Output

In [13]:
from llama_index.llms.cohere import Cohere
from llama_index.core.llms import ChatMessage, MessageRole

llm = Cohere(model="command-r-plus")

messages = [
    ChatMessage(role=MessageRole.SYSTEM, content="You're a great historian bot."),
    ChatMessage(role=MessageRole.USER, content="When did Alexander the Great arrive in China?")
]

response = llm.stream_chat(messages)

for r in response:
    print(r.delta, end="")

Alexander the Great, one of the most renowned military commanders and conquerors in history, embarked on a remarkable campaign that took him across a significant portion of the known world at the time. However, it is important to note that Alexander the Great did not reach China during his conquests.

Alexander's campaigns extended across the Mediterranean, through Persia and into India, but his ambitions may have reached further. There is no definitive evidence to suggest that he physically led his armies into China.

The extent of Alexander's conquests ended around the Indus River valley, which is in modern-day Pakistan. After a successful campaign in this region, his troops refused to continue further east due to exhaustion and a desire to return home. This event, known as the Mutiny of the Hypasians, marked the easternmost point of Alexander's conquests.

So, while Alexander the Great's influence and reputation certainly spread far and wide, including to regions beyond his direct c
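
The loop above prints each `delta` as it arrives. If you also want the complete text afterwards, one option is to accumulate the deltas while printing them. The sketch below uses a fake stream so it runs without an API key: `Chunk`, `fake_stream`, and `collect` are hypothetical names, and the only thing assumed about the real response objects is the `delta` attribute used in the cell above.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    """Hypothetical streamed chunk: `delta` is the newly generated text."""

    delta: str


def fake_stream(text: str, size: int = 8) -> Iterator[Chunk]:
    """Stand-in for a streaming response; yields the text in small pieces."""
    for start in range(0, len(text), size):
        yield Chunk(delta=text[start:start + size])


def collect(stream: Iterator[Chunk]) -> str:
    """Print each delta as it arrives and return the assembled text."""
    parts = []
    for chunk in stream:
        print(chunk.delta, end="", flush=True)
        parts.append(chunk.delta)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = collect(fake_stream("Alexander the Great did not reach China."))
```

With a real LLM, the same `collect` function could be given the result of `llm.stream_chat(messages)`.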

# 💬 Chat Engine


In [14]:
from llama_index.core.chat_engine import SimpleChatEngine

llm = Cohere(model="command-r-plus")

chat_engine = SimpleChatEngine.from_defaults(llm=llm)

chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.

Assistant: I'm sorry, but I haven't found any significant results for "OCD-2" specifically in the field of psychology. Could you provide more context or clarify your question? It's possible that there is a specific study, theory, or concept that you're referring to, and I'd be happy to search for more information with additional details.

Assistant: OPD-2 refers to the "Operationalized Psychodynamic Diagnosis, Second Edition." It is a system for diagnosing psychological disorders from a psychodynamic perspective. This system provides a comprehensive and structured approach to understanding a person's psychological functioning and problems within the framework of psychodynamic theory.

The OPD-2 is an extensive diagnostic manual that was developed by a group of international experts in psychodynamic diagnosis and psychotherapy. It offers a multidimensional view of an individual's psychological makeup and provides guidelines for assess

BadRequestError: headers: {'access-control-expose-headers': 'X-Debug-Trace-ID', 'cache-control': 'no-cache, no-store, no-transform, must-revalidate, private, max-age=0', 'content-type': 'application/json', 'expires': 'Thu, 01 Jan 1970 00:00:00 GMT', 'pragma': 'no-cache', 'vary': 'Origin', 'x-accel-expires': '0', 'x-debug-trace-id': 'dbeb38494b6b2cf8af8d01d83605d6c4', 'x-endpoint-monthly-call-limit': '1000', 'x-trial-endpoint-call-limit': '10', 'x-trial-endpoint-call-remaining': '7', 'date': 'Sat, 19 Jul 2025 16:48:40 GMT', 'content-length': '147', 'x-envoy-upstream-service-time': '15', 'server': 'envoy', 'via': '1.1 google', 'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'}, status_code: 400, body: {'id': '32688157-80ab-4f4c-a47a-d84afe860088', 'message': 'invalid request: message must be at least 1 token long or tool results must be specified.'}
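
The `BadRequestError` above reports that the message "must be at least 1 token long", which is consistent with an empty line having been submitted to the REPL. One way to avoid this is to write your own loop around `chat_engine.chat` and skip blank input. The sketch below shows only the loop logic and runs without an API key: `run_chat` and `fake_chat` are hypothetical helpers, and `fake_chat` stands in for something like `lambda m: str(chat_engine.chat(m))`.

```python
from typing import Callable, Iterable


def run_chat(chat: Callable[[str], str], lines: Iterable[str]) -> list[str]:
    """Send each non-empty line to `chat`; stop at 'exit'; skip blank input."""
    replies = []
    for line in lines:
        message = line.strip()
        if message == "exit":
            break
        if not message:
            # The error above suggests the API rejects empty messages, so skip them.
            continue
        replies.append(chat(message))
    return replies


def fake_chat(message: str) -> str:
    """Hypothetical stand-in for a real chat engine call."""
    return f"reply to: {message}"


replies = run_chat(fake_chat, ["What is OPD-2?", "   ", "", "exit", "ignored"])
```

In an interactive session, `lines` could be a generator that repeatedly calls `input("Human: ")`.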