# SambaNova Cloud

This guide will help you get started with **[SambaNova](https://sambanova.ai/)'s** [SambaNova Cloud](https://cloud.sambanova.ai/), a platform for running inference with open-source models.

## Setup

To access SambaNova Cloud models, you will need to create a [SambaNova Cloud](https://cloud.sambanova.ai/) account, get an API key, and install the `llama-index-llms-sambanova` integration package along with the `sseclient-py` package.

```bash
pip install llama-index-llms-sambanova
pip install sseclient-py
```

### Credentials

Get an API key from [cloud.sambanova.ai](https://cloud.sambanova.ai/apis) and add it to your environment variables:

```bash
export SAMBANOVA_API_KEY="your-api-key-here"
```

In [None]:
import getpass
import os

if not os.getenv("SAMBANOVA_API_KEY"):
    os.environ["SAMBANOVA_API_KEY"] = getpass.getpass(
        "Enter your SambaNova Cloud API key: "
    )
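
If you would rather fail fast when the key is missing (for example in a script, where an interactive `getpass` prompt is not an option), a small guard can help. This is a minimal sketch; `require_env` is a hypothetical helper name and is not part of the `llama-index-llms-sambanova` package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of any SambaNova package.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or set it interactively first."
        )
    return value


# Typical use once the key has been exported:
# api_key = require_env("SAMBANOVA_API_KEY")
```

Raising early keeps a missing key from surfacing later as a less obvious authentication error.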

### Installation

The LlamaIndex __SambaNovaCloud__ integration lives in the `llama-index-llms-sambanova` package, which can be installed with the following commands:

In [None]:
%pip install "llama-index-llms-sambanova>=0.3"
%pip install sseclient-py

## Instantiation

Now we can instantiate our model object and generate chat completions:

In [None]:
from llama_index.llms.sambanova import SambaNovaCloud

llm = SambaNovaCloud(
    model="Meta-Llama-3.1-70B-Instruct",
    max_tokens=1024,
    temperature=0.7,
    top_k=1,
    top_p=0.01,
)

## Invocation

Given the following system and user messages, let's explore different ways of calling a SambaNova Cloud model. 

In [None]:
from llama_index.core.base.llms.types import (
    ChatMessage,
    MessageRole,
)

system_msg = ChatMessage(
    role=MessageRole.SYSTEM,
    content="You are a helpful assistant that translates English to French. Translate the user sentence.",
)
user_msg = ChatMessage(role=MessageRole.USER, content="I love programming.")

messages = [
    system_msg,
    user_msg,
]

### Chat

In [None]:
ai_msg = llm.chat(messages)
ai_msg.message

ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content="J'adore la programmation.", additional_kwargs={'id': 'b171e958-d83c-4afc-b57a-aefc724ccd08', 'finish_reason': 'stop', 'usage': {'acceptance_rate': 7, 'completion_tokens': 8, 'completion_tokens_after_first_per_sec': 194.3169682449336, 'completion_tokens_after_first_per_sec_first_ten': 617.3930816948796, 'completion_tokens_per_sec': 53.096356310081546, 'end_time': 1730485767.645042, 'is_last_response': True, 'prompt_tokens': 55, 'start_time': 1730485767.4713066, 'time_to_first_token': 0.137711763381958, 'total_latency': 0.1506694725581578, 'total_tokens': 63, 'total_tokens_per_sec': 418.13380594189215}, 'model_name': 'Meta-Llama-3.1-70B-Instruct', 'system_fingerprint': 'fastcoe', 'created': 1730485767})

In [None]:
print(ai_msg.message.content)

J'adore la programmation.
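
The response shown above carries token counts and latency under `additional_kwargs['usage']`. The sketch below pulls out a few of those fields. It runs on a plain dictionary copied from that output rather than on a live response, and `summarize_usage` is a hypothetical helper name. The keys are the ones visible in the output; whether they are present on every response is an assumption.

```python
def summarize_usage(additional_kwargs: dict) -> dict:
    """Pick a few usage fields out of a response's additional_kwargs.

    Missing fields come back as None instead of raising KeyError.
    """
    usage = additional_kwargs.get("usage") or {}
    keys = ("prompt_tokens", "completion_tokens", "total_tokens", "total_latency")
    return {key: usage.get(key) for key in keys}


# Subset of the additional_kwargs from the chat output shown above.
sample_kwargs = {
    "finish_reason": "stop",
    "usage": {
        "completion_tokens": 8,
        "prompt_tokens": 55,
        "total_latency": 0.1506694725581578,
        "total_tokens": 63,
    },
    "model_name": "Meta-Llama-3.1-70B-Instruct",
}

summary = summarize_usage(sample_kwargs)
print(summary)
```

With a live response, the same helper would presumably be called as `summarize_usage(ai_msg.message.additional_kwargs)`.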


### Complete

In [None]:
ai_msg = llm.complete(user_msg.content)
ai_msg

CompletionResponse(text='Programming can be a fun and rewarding hobby, as well as a challenging and lucrative career. What kind of programming do you enjoy most? Are you more into web development, mobile app development, game development, or something else?', additional_kwargs={'id': 'e5121301-5246-4c01-a720-5c2306a268b1', 'finish_reason': 'stop', 'usage': {'acceptance_rate': 5.666666666666667, 'completion_tokens': 45, 'completion_tokens_after_first_per_sec': 382.1143303193373, 'completion_tokens_after_first_per_sec_first_ten': 502.6854618966545, 'completion_tokens_per_sec': 198.68200517305698, 'end_time': 1730485768.5746315, 'is_last_response': True, 'prompt_tokens': 39, 'start_time': 1730485768.3225093, 'time_to_first_token': 0.13697338104248047, 'total_latency': 0.22649258024552288, 'total_tokens': 84, 'total_tokens_per_sec': 370.8730763230397}, 'model_name': 'Meta-Llama-3.1-70B-Instruct', 'system_fingerprint': 'fastcoe', 'created': 1730485768}, raw=None, logprobs=None, delta=None)

In [None]:
print(ai_msg.text)

Programming can be a fun and rewarding hobby, as well as a challenging and lucrative career. What kind of programming do you enjoy most? Are you more into web development, mobile app development, game development, or something else?


## Streaming

### Chat

In [None]:
ai_stream_msgs = []
for stream in llm.stream_chat(messages):
    ai_stream_msgs.append(stream)
ai_stream_msgs

[ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='', additional_kwargs={'id': '1a559ecf-7996-492b-945b-3d24b496dd84', 'finish_reason': None}), raw={'choices': [{'delta': {'content': '', 'role': 'assistant'}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1730485769, 'id': '1a559ecf-7996-492b-945b-3d24b496dd84', 'model': 'Meta-Llama-3.1-70B-Instruct', 'object': 'chat.completion.chunk', 'system_fingerprint': 'fastcoe'}, delta='', logprobs=None, additional_kwargs={}),
 ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='', additional_kwargs={'id': '1a559ecf-7996-492b-945b-3d24b496dd84', 'finish_reason': None}), raw={'choices': [{'delta': {'content': '', 'role': 'assistant'}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1730485769, 'id': '1a559ecf-7996-492b-945b-3d24b496dd84', 'model': 'Meta-Llama-3.1-70B-Instruct', 'object': 'chat.completion.chunk', 'system_fingerprint': 'fastcoe'}

In [None]:
print(ai_stream_msgs[-1])

assistant: J'adore la programmation.
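
Collecting every chunk in a list is convenient for inspection, but a streaming UI usually prints text as it arrives. The sketch below assumes each chunk exposes a `delta` attribute holding only the newly generated text, which is what the outputs in this section suggest. To stay runnable without an API key it uses `SimpleNamespace` objects as stand-ins for real chunks, and `collect_deltas` is a hypothetical helper name.

```python
from types import SimpleNamespace


def collect_deltas(stream) -> str:
    """Print each chunk's delta as it arrives and return the joined text."""
    parts = []
    for chunk in stream:
        piece = chunk.delta or ""  # the first chunk may carry an empty delta
        print(piece, end="", flush=True)
        parts.append(piece)
    print()
    return "".join(parts)


# Stand-in chunks; with a live model you would pass llm.stream_chat(messages).
fake_stream = [
    SimpleNamespace(delta=d) for d in ["", "J'adore ", "la programmation."]
]
text = collect_deltas(fake_stream)
```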


### Complete

In [None]:
ai_stream_msgs = []
for stream in llm.stream_complete(user_msg.content):
    ai_stream_msgs.append(stream)
ai_stream_msgs

[CompletionResponse(text='', additional_kwargs={'id': '7730bbef-a8e1-42d7-8a09-b90074184c79', 'finish_reason': None}, raw={'choices': [{'delta': {'content': '', 'role': 'assistant'}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1730485770, 'id': '7730bbef-a8e1-42d7-8a09-b90074184c79', 'model': 'Meta-Llama-3.1-70B-Instruct', 'object': 'chat.completion.chunk', 'system_fingerprint': 'fastcoe'}, logprobs=None, delta=''),
 CompletionResponse(text='Programming can be ', additional_kwargs={'id': '7730bbef-a8e1-42d7-8a09-b90074184c79', 'finish_reason': None}, raw={'choices': [{'delta': {'content': 'Programming can be ', 'role': 'assistant'}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1730485770, 'id': '7730bbef-a8e1-42d7-8a09-b90074184c79', 'model': 'Meta-Llama-3.1-70B-Instruct', 'object': 'chat.completion.chunk', 'system_fingerprint': 'fastcoe'}, logprobs=None, delta='Programming can be '),
 CompletionResponse(text='Programming can be a fun and ', a

In [None]:
print(ai_stream_msgs[-1])

Programming can be a fun and rewarding hobby, as well as a challenging and lucrative career. What kind of programming do you enjoy most? Are you more into web development, mobile app development, game development, or something else?


## Async

### Chat

In [None]:
ai_msg = await llm.achat(messages)
ai_msg

ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content="J'adore la programmation.", additional_kwargs={'id': '011712d7-17b8-489b-b91b-5ed37b3df7a4', 'finish_reason': 'stop', 'usage': {'acceptance_rate': 7, 'completion_tokens': 8, 'completion_tokens_after_first_per_sec': 179.02735399212185, 'completion_tokens_after_first_per_sec_first_ten': 570.3432145771009, 'completion_tokens_per_sec': 50.02144003959445, 'end_time': 1730485771.4568532, 'is_last_response': True, 'prompt_tokens': 55, 'start_time': 1730485771.2718482, 'time_to_first_token': 0.1459047794342041, 'total_latency': 0.15993142127990723, 'total_tokens': 63, 'total_tokens_per_sec': 393.9188403118063}, 'model_name': 'Meta-Llama-3.1-70B-Instruct', 'system_fingerprint': 'fastcoe', 'created': 1730485771}), raw=None, delta=None, logprobs=None, additional_kwargs={})

In [None]:
print(ai_msg.message.content)

J'adore la programmation.


### Complete

In [None]:
ai_msg = await llm.acomplete(user_msg.content)
ai_msg

CompletionResponse(text='Programming can be a fun and rewarding hobby, as well as a challenging and lucrative career. What kind of programming do you enjoy most? Are you more into web development, mobile app development, game development, or something else?', additional_kwargs={'id': '2ed0ec89-dfae-4943-8cfd-894fb8014ef4', 'finish_reason': 'stop', 'usage': {'acceptance_rate': 5.666666666666667, 'completion_tokens': 45, 'completion_tokens_after_first_per_sec': 266.1030386748298, 'completion_tokens_after_first_per_sec_first_ten': 347.115940335709, 'completion_tokens_per_sec': 131.3514652662143, 'end_time': 1730485772.5302963, 'is_last_response': True, 'prompt_tokens': 39, 'start_time': 1730485772.1519942, 'time_to_first_token': 0.2129526138305664, 'total_latency': 0.342592295478372, 'total_tokens': 84, 'total_tokens_per_sec': 245.18940183026666}, 'model_name': 'Meta-Llama-3.1-70B-Instruct', 'system_fingerprint': 'fastcoe', 'created': 1730485772}, raw=None, logprobs=None, delta=None)

In [None]:
print(ai_msg.text)

Programming can be a fun and rewarding hobby, as well as a challenging and lucrative career. What kind of programming do you enjoy most? Are you more into web development, mobile app development, game development, or something else?
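
One reason to use the async methods is to issue several requests concurrently. The sketch below uses `asyncio.gather` for that. `EchoLLM` is a hypothetical stand-in that only mimics the `acomplete` method name and the `.text` attribute used above, so the example runs without network access. `complete_many` is likewise an illustrative helper, not part of the integration.

```python
import asyncio
from types import SimpleNamespace


class EchoLLM:
    """Hypothetical stand-in with the same `acomplete` coroutine name as `llm`."""

    async def acomplete(self, prompt: str):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return SimpleNamespace(text=f"echo: {prompt}")


async def complete_many(llm, prompts):
    """Run all completions concurrently; results keep the order of `prompts`."""
    responses = await asyncio.gather(*(llm.acomplete(p) for p in prompts))
    return [response.text for response in responses]


prompts = ["I love programming.", "I love coffee."]
results = asyncio.run(complete_many(EchoLLM(), prompts))
print(results)
```

Inside a notebook, where an event loop is already running, you would write `await complete_many(llm, prompts)` instead of calling `asyncio.run`.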


## Async Streaming

Not supported yet. Coming soon!