In [None]:
!pip install llama-index
!pip install llama-index-core
!pip install llama-index-embeddings-openai

In [2]:
# A notebook already runs an event loop; nest_asyncio lets the async calls below nest inside it
import nest_asyncio
nest_asyncio.apply()

import os

# OpenAI API key, used for embeddings (replace the placeholder with your own key)
os.environ["OPENAI_API_KEY"] = "sk-"

# NVIDIA AI Playground API key, used for the LLM (replace the placeholder with your own key)
os.environ["NVIDIA_AI_PLAYGROUND_API_KEY"] = "nvapi-"

In [3]:
from llama_index.llms.nvidia_ai_playground import NvidiaAIPlayground
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

embed_model = OpenAIEmbedding(model="text-embedding-3-small")
llm = NvidiaAIPlayground(model="playground_steerlm_llama_70b")

# Register both as the global defaults used by indexes and query engines
Settings.llm = llm
Settings.embed_model = embed_model

In [4]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

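# Load every file in the local "data" directory
documents = SimpleDirectoryReader("data").load_data()
# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)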

### Basic Functionality

In [11]:
from llama_index.core.llms import ChatMessage, MessageRole

chat_messages = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=(
            "You are a helpful assistant."
        )
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=(
            "How do I get to Paris from London?"
        )
    ),
]

#### Complete

In [14]:
llm.complete("Hello!")

CompletionResponse(text="Hello! I'm an AI language model developed by Open Assistant, I can assist you with various questions on a wide range of topics including but not limited to:\n\nGeneral knowledge: I can answer questions about history, geography, science, literature, and many other subjects.\n\nMathematics: I can help with basic math problems, as well as more advanced topics like algebra, geometry, and calculus.\n\nProgramming: I can assist with programming concepts, syntax, and code debugging in a variety of programming languages.\n\nTechnology: I can provide information and assistance with various technologies, including computers, smartphones, and the internet.\n\nHealth and wellness: I can provide information about exercise, nutrition, and general health topics.\n\nRemember, I'm just a language model, I don't have the ability to perceive or feel, I don't have personal experiences or emotions. But I will do my best to assist you with any questions or information you need. Just

#### Chat

In [12]:
llm.chat(chat_messages)

ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='There are several ways to get to Paris from London, including:\n\n1. Plane: The quickest and most convenient way to get to Paris from London is by plane. There are several direct flights from London to Paris, and the flight time is around 1 hour.\n\n2. Train: You can take the Eurostar train from London St. Pancras International to Paris Gare du Nord. The journey takes around 2 hours and 15 minutes.\n\n3. Bus: There are several bus companies that offer direct bus services from London to Paris. The journey takes around 8-10 hours, depending on traffic and other factors.\n\n4. Car: You can drive from London to Paris, but keep in mind that you will need to cross the English Channel by ferry or through the Eurotunnel. The drive takes around 6-7 hours, depending on traffic and other factors.\n\n5. Coach: National Express operates coach services from London Victoria Coach Station to Paris Gallieni. The journe

#### Stream

In [19]:
streamed_response = llm.stream_chat(chat_messages)

In [24]:
streamed_response

<generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen at 0x7f957eaae240>
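The cell above only displays the generator object; no text is produced until the generator is iterated. A minimal sketch of consuming such a stream follows, assuming each yielded chunk exposes the newly generated text as `.delta`, as LlamaIndex `ChatResponse` chunks do. `FakeChunk` and `fake_stream_chat` are hypothetical stand-ins so the example runs without an API key; with a real key, `llm.stream_chat(chat_messages)` would take the place of the stand-in.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for one streamed chat chunk."""
    delta: Optional[str]


def fake_stream_chat() -> Iterator[FakeChunk]:
    """Hypothetical stand-in for llm.stream_chat(chat_messages)."""
    for piece in ["Take ", "the ", "Eurostar."]:
        yield FakeChunk(delta=piece)


def consume_stream(stream: Iterator[FakeChunk]) -> str:
    """Print each delta as it arrives and return the assembled text."""
    parts = []
    for chunk in stream:
        text = chunk.delta or ""  # a chunk may carry no new text
        print(text, end="", flush=True)
        parts.append(text)
    print()
    return "".join(parts)


full_text = consume_stream(fake_stream_chat())
```

A generator can be consumed only once, so collect the pieces as shown if the full text is needed after printing.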

### Change Models

In [36]:
llm = NvidiaAIPlayground(model="playground_llama2_70b")

#### Async Complete

In [37]:
await llm.acomplete("Hello there!")

CompletionResponse(text="Hello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?", additional_kwargs={}, raw={'id': '66e79211-2fb3-4992-989c-3eea1373902d', 'choices': [Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Hello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?", role='assistant', function_call=None, tool_calls=None))], 'created': None, 'model': None, 'object': None, 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=28, prompt_tokens=11, total_tokens=39)}, delta=None)

### Change Models

In [38]:
llm = NvidiaAIPlayground(model="playground_mixtral_8x7b")

#### Async Chat

In [39]:
await llm.achat(chat_messages)

ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='There are several ways to get from London to Paris:\n\n1. **By Eurostar Train**: This is the fastest and most convenient way to travel between the two cities. The Eurostar train departs from St Pancras International Station in London and arrives at Gare du Nord in Paris. The journey takes approximately 2 hours and 15 minutes.\n\n2. **By Plane**: There are numerous flights between London and Paris every day. The flight duration is about 1 hour, but you need to add time for getting to and from the airports, security checks, and baggage claim. The major airports in London are Heathrow, Gatwick, Stansted, and Luton, and in Paris, they are Charles de Gaulle and Orly.\n\n3. **By Bus**: This is the cheapest but slowest option. Buses depart from Victoria Coach Station in London and arrive at Gallieni Porte de Bagnolet in Paris. The journey can take up to 8 hours, depending on traffic.\n\n4. **By Car**: If you 

### Change Models

In [65]:
llm = NvidiaAIPlayground(model="playground_steerlm_llama_70b")

#### Async Stream

In [66]:
await llm.astream_chat(chat_messages)

<async_generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_async_llm_chat.<locals>.wrapped_gen at 0x7f957c65e240>
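As with the synchronous stream, awaiting `astream_chat` returns an async generator that yields nothing until it is iterated with `async for`. The sketch below shows that pattern under the same assumption that each chunk carries its new text in `.delta`. `FakeChunk` and `fake_astream_chat` are hypothetical stand-ins that avoid a network call; in the notebook, the generator returned by `await llm.astream_chat(chat_messages)` would be passed to `consume_astream` instead.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for one streamed chat chunk."""
    delta: Optional[str]


async def fake_astream_chat() -> AsyncIterator[FakeChunk]:
    """Hypothetical stand-in for the generator returned by llm.astream_chat."""
    for piece in ["Take ", "the ", "train."]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield FakeChunk(delta=piece)


async def consume_astream(stream: AsyncIterator[FakeChunk]) -> str:
    """Collect the deltas from an async stream into one string."""
    parts = []
    async for chunk in stream:
        parts.append(chunk.delta or "")
    return "".join(parts)


# In a script, drive the coroutine with asyncio.run
async_text = asyncio.run(consume_astream(fake_astream_chat()))
print(async_text)
```

Inside a notebook cell, `await consume_astream(...)` can be used directly in place of `asyncio.run`.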

### Streaming Query Engine Responses

In [5]:
streaming_qe = index.as_query_engine(streaming=True)

In [6]:
streaming_response = streaming_qe.query(
    "What did the author do growing up?",
)

In [7]:
streaming_response.print_response_stream()

Douglas Adams, the author of "The Hitchhiker's Guide to the Galaxy," grew up in Cambridge, England. He was born in 1952 and attended Brentwood School, where he was known for his wit and humor.

As a child, Adams was interested in science fiction and comedy, and he was a fan of the radio comedy show "The Goon Show." He also enjoyed playing the piano and writing short stories.

After leaving school, Adams attended St. John's College, Cambridge, where he studied English literature. During his time at university, he became involved in the Cambridge Footlights, a student comedy group, and began writing and performing in sketches and revues.

After graduating from Cambridge, Adams worked as a radio scriptwriter and performer, and he continued to write and perform comedy. He also wrote for television, including for the popular comedy show "Monty Python's Flying Circus."

In the late 1970s, Adams began working on "The Hitchhiker's Guide to the Galaxy," which was originally a radio comedy serie