<a href="https://github.com/run-llama/llama_index/blob/main/llama-index-packs/llama-index-packs-cohere-citation-chat/examples/cohere_citation_chat_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Install and Import Dependencies

In [None]:
%pip install llama-index
%pip install llama-index-llms-cohere
%pip install llama-index-embeddings-cohere
%pip install cohere
%pip install llama-index-readers-web
%pip install llama-index-packs-cohere-citation-chat

In [None]:
import os

from llama_index.packs.cohere_citation_chat import CohereCitationChatEnginePack
from llama_index.readers.web import SimpleWebPageReader

Configure your Cohere API key.

In [None]:
os.environ["COHERE_API_KEY"] = "your-api-key-here"
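
A key pasted into a cell is easy to leak when the notebook is shared. As a minimal sketch of an alternative, the helper below reuses a key that is already set in the environment and only prompts when it is missing. `ensure_api_key` is a hypothetical helper written for this example; it is not part of LlamaIndex or the Cohere SDK.

```python
import os
from getpass import getpass


def ensure_api_key(name="COHERE_API_KEY", env=os.environ, prompt=getpass):
    """Return the key stored under `name`, prompting once if it is not set.

    Hypothetical helper for illustration only. `env` and `prompt` are
    parameters so the function can be exercised without real input.
    """
    key = env.get(name, "").strip()
    if not key:
        # getpass hides the typed value, so the key never shows up in cell output.
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} must not be empty")
        env[name] = key
    return key
```

Calling `ensure_api_key()` in place of the assignment above leaves `os.environ["COHERE_API_KEY"]` populated either way, so the rest of the notebook runs unchanged.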

Load your documents and pass them to the LlamaPack. This example uses a Paul Graham essay, fetched as plain text, as input. Calling `run()` on the LlamaPack returns a chat engine.

In [None]:
documents = SimpleWebPageReader().load_data(
    [
        "https://raw.githubusercontent.com/jerryjliu/llama_index/adb054429f642cc7bbfcb66d4c232e072325eeab/examples/paul_graham_essay/data/paul_graham_essay.txt"
    ]
)
cohere_citation_chat_pack = CohereCitationChatEnginePack(documents=documents)
chat_engine = cohere_citation_chat_pack.run()

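Run a set of queries through each of the chat engine's methods: `chat`, `achat`, `stream_chat`, and `astream_chat`. The `await` calls rely on the notebook's support for top-level `await`.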

In [None]:
queries = [
    "What did Paul Graham do growing up?",
    "When and how did Paul Graham's mother die?",
    "What, in Paul Graham's opinion, is the most distinctive thing about YC?",
    "When and how did Paul Graham meet Jessica Livingston?",
    "What is Bel, and when and where was it written?",
]
for query in queries:
    print("Query ")
    print("=====")
    print(query)
    print("Chat")
    response = chat_engine.chat(query)
    print("Chat Response")
    print("========")
    print(response)
    print(f"Citations: {response.citations}")
    print(f"Documents: {response.documents}")
    print("Async Chat")
    response = await chat_engine.achat(query)
    print("Async Chat Response")
    print("========")
    print(response)
    print(f"Citations: {response.citations}")
    print(f"Documents: {response.documents}")
    print("Stream Chat")
    response = chat_engine.stream_chat(query)
    print("Stream Chat Response")
    print("========")
    response.print_response_stream()
    print(f"Citations: {response.citations}")
    print(f"Documents: {response.documents}")
    print("Async Stream Chat")
    response = await chat_engine.astream_chat(query)
    print("Async Stream Chat Response")
    print("========")
    await response.aprint_response_stream()
    print(f"Citations: {response.citations}")
    print(f"Documents: {response.documents}")
    print()

Query 
=====
What did Paul Graham do growing up?
Chat
Chat Response
Paul Graham grew up in a tiny corner of the Upper East Side of New York, called Yorkville since his family moved from Britain to the United States. He lived in the same apartment from the 1990s and worked hard to make a living. He worked on a new Lisp called Bel, written in Arc, and his friend helped him get an apartment in New York City, where he lived with his family. He wrote a book on Lisp and decided to start a company to put art galleries online, which didn't succeed, and began writing essays about his experiences.
Citations: [Citation(text='tiny corner of the Upper East Side of New York', start=25, end=71, document_ids=['5ef32917-0965-4551-8359-e8b92557965b']), Citation(text='Yorkville', start=80, end=89, document_ids=['5ef32917-0965-4551-8359-e8b92557965b']), Citation(text='Britain to the United States.', start=118, end=147, document_ids=['def37b99-8ff3-49d4-94d4-fc235e40454d']), Citation(text='same apartment f
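
In the output above, each citation's `start` and `end` line up with character positions in the printed response, with `end` exclusive (for example, characters 25 to 71 are the first cited phrase). The sketch below uses those offsets to insert numbered markers after each cited span. `Span` is a hypothetical stand-in that mirrors only the fields visible in the printed `Citation` objects, and `annotate` is an illustrative helper, not part of the pack. The shortened document id in the sample data is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in mirroring the fields printed for each Citation."""

    text: str
    start: int
    end: int
    document_ids: list = field(default_factory=list)


def annotate(response_text, citations):
    """Append [n] markers after each cited span.

    Documents are numbered in order of first appearance. Returns the
    annotated text and the mapping from document id to marker number.
    """
    doc_numbers = {}
    pieces = []
    cursor = 0
    for c in sorted(citations, key=lambda c: c.start):
        if c.start < cursor:
            continue  # skip spans that overlap one already emitted
        refs = [doc_numbers.setdefault(d, len(doc_numbers) + 1) for d in c.document_ids]
        marker = "".join(f"[{n}]" for n in refs)
        pieces.append(response_text[cursor:c.end] + marker)
        cursor = c.end
    pieces.append(response_text[cursor:])
    return "".join(pieces), doc_numbers


# Sample data: the opening of the response above and its first two citations.
text = (
    "Paul Graham grew up in a tiny corner of the Upper East Side of New York, "
    "called Yorkville since"
)
spans = [
    Span("tiny corner of the Upper East Side of New York", 25, 71, ["doc-a"]),
    Span("Yorkville", 80, 89, ["doc-a"]),
]
annotated, doc_numbers = annotate(text, spans)
print(annotated)
print(doc_numbers)
```

With a real response, something like `annotate(str(response), response.citations)` should work, assuming the citation objects expose `start`, `end`, and `document_ids` as attributes the way the printed output suggests.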

You can access the internals of the LlamaPack, including your Cohere LLM and your vector store index, via the `get_modules` method.

In [None]:
modules = cohere_citation_chat_pack.get_modules()
display(modules)

{'vector_index': <llama_index.packs.cohere_citation_chat.citations_context_chat_engine.VectorStoreIndexWithCitationsChat at 0x1748749d0>,
 'llm': Cohere(callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x176c55750>, system_prompt=None, messages_to_prompt=<function messages_to_prompt at 0x12f82b920>, completion_to_prompt=<function default_completion_to_prompt at 0x12f8e05e0>, output_parser=None, pydantic_program_mode=<PydanticProgramMode.DEFAULT: 'default'>, query_wrapper_prompt=None, model='command', temperature=0.5, max_retries=10, additional_kwargs={'prompt_truncation': 'AUTO'}, max_tokens=512)}