# SambaNova LLMs: Getting started

In [None]:
# Install the dependencies required to run this notebook
!pip install python-dotenv==1.0.0
!pip install requests
!pip install langchain-core==0.3.68
!pip install langchain-community==0.3.27
!pip install sseclient-py==1.8.0
!pip install langchain-sambanova==0.2.0

In [1]:
import os
import sys

from pprint import pprint
from dotenv import load_dotenv

current_dir = os.getcwd()
repo_dir = os.path.abspath(os.path.join(current_dir, ".."))
sys.path.append(repo_dir)

from langchain_sambanova import ChatSambaNova

load_dotenv(os.path.join(repo_dir, '.env'), override=True)

False

## Using ChatSambaNova model wrapper
First, set your environment variables by following the instructions [here](../README.md#use-sambanova-cloud-option-1).
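Because the `load_dotenv` call above returned `False` (python-dotenv returns `False` when no variable was set from the file), it can help to confirm that the credentials are visible before instantiating the model. The following is a minimal sketch of such a check. The variable name `SAMBANOVA_API_KEY` is an assumption based on the linked setup instructions, and `missing_env_vars` is a hypothetical helper, not part of any library.

```python
import os


def missing_env_vars(names, environ=os.environ):
    """Return the names that are unset or empty in the given environment mapping."""
    return [name for name in names if not environ.get(name)]


# Assumed variable name; adjust to whatever your README setup defines.
REQUIRED = ["SAMBANOVA_API_KEY"]

missing = missing_env_vars(REQUIRED)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dictionary instead of the real process environment.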

In [2]:
# Model instantiation
llm = ChatSambaNova(
    model="Meta-Llama-3.3-70B-Instruct",
    max_tokens=1024,
    temperature=0.7,
    top_p=0.1,
    stream_options={"include_usage": True},
)

#### Simple generation

In [3]:
response = llm.invoke("tell me a joke")
print(response.content)

Here's one:

What do you call a fake noodle?

An impasta!

Hope that made you laugh! Do you want to hear another one?


#### Get usage metrics

In [4]:
response = llm.invoke("tell me a joke")
pprint(response.response_metadata['token_usage'])

{'acceptance_rate': None,
 'completion_tokens': 31,
 'completion_tokens_after_first_per_sec': 456.0909941062613,
 'completion_tokens_after_first_per_sec_first_ten': 461.78704726533397,
 'completion_tokens_after_first_per_sec_graph': 461.78704726533397,
 'completion_tokens_per_sec': 262.4583149982035,
 'end_time': 1760550812.8611166,
 'is_last_response': True,
 'prompt_tokens': 39,
 'prompt_tokens_details': {'cached_tokens': 0},
 'start_time': 1760550812.7430027,
 'stop_reason': 'stop',
 'time_to_first_token': 0.052337646484375,
 'total_latency': 0.11811399459838867,
 'total_tokens': 70,
 'total_tokens_per_sec': 592.6478080604595}
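The throughput fields in this dictionary appear to follow from the token counts and timings. The sketch below recomputes three of them from the sample values above. The formulas are inferred from these numbers only and are not taken from SambaNova documentation, so treat them as an assumption. `derived_rates` is a hypothetical helper.

```python
# Values copied from the sample output above.
token_usage = {
    "completion_tokens": 31,
    "total_tokens": 70,
    "time_to_first_token": 0.052337646484375,
    "total_latency": 0.11811399459838867,
}


def derived_rates(usage: dict) -> dict:
    """Recompute throughput figures from counts and timings (inferred formulas)."""
    latency = usage["total_latency"]
    # Time spent generating tokens after the first one arrived.
    decode_time = latency - usage["time_to_first_token"]
    return {
        "total_tokens_per_sec": usage["total_tokens"] / latency,
        "completion_tokens_per_sec": usage["completion_tokens"] / latency,
        "completion_tokens_after_first_per_sec": (usage["completion_tokens"] - 1) / decode_time,
    }


rates = derived_rates(token_usage)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

With the real model, the same function could be applied to `response.response_metadata['token_usage']`.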


#### Use roles

In [5]:
messages = [
    ("system","You are a helpful assistant with pirate accent"),
    ("user","tell me a joke")
]
response = llm.invoke(messages)
print(response.content)

Yer lookin' fer a joke, eh? Alright then, matey! Here be one fer ye:

Why did the pirate quit his job?

(pause fer dramatic effect, savvy?)

Because he was sick o' all the arrrr-guments! (get it? arrrr-guments? like arguments, but pirate-style? aye, I be a comedic genius, matey!)


#### Asynchronous generation

In [6]:
future_response = llm.ainvoke("tell me a joke")
response = await future_response
print(response.content)

Here's one:

What do you call a fake noodle?

An impasta!

Hope that made you laugh! Do you want to hear another one?
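The cell above uses a top-level `await`, which works in a notebook but not in a plain Python script. A minimal sketch of the script form follows. `StubChatModel` is a hypothetical stand-in so the block runs without credentials. With the real setup you would pass the `llm` object instead.

```python
import asyncio
from types import SimpleNamespace


class StubChatModel:
    """Hypothetical stand-in with the same `ainvoke` coroutine shape as `llm`."""

    async def ainvoke(self, prompt: str) -> SimpleNamespace:
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return SimpleNamespace(content=f"stub reply to: {prompt}")


async def main(model) -> str:
    # Same steps as the notebook cell: call ainvoke, then await the result.
    response = await model.ainvoke("tell me a joke")
    return response.content


if __name__ == "__main__":
    # A plain script has no running event loop, so asyncio.run() starts one.
    print(asyncio.run(main(StubChatModel())))
```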


#### Batch generation

In [7]:
response = llm.batch(["which is the capital of Netherlands?","which is the capital of UK?"])
pprint(response)

[AIMessage(content='The capital of the Netherlands is Amsterdam.', additional_kwargs={}, response_metadata={'token_usage': {'acceptance_rate': None, 'completion_tokens': 8, 'completion_tokens_after_first_per_sec': 521.7908580365394, 'completion_tokens_after_first_per_sec_first_ten': 552.5884212903711, 'completion_tokens_after_first_per_sec_graph': 552.5884212903711, 'completion_tokens_per_sec': 123.46264966259227, 'end_time': 1760550820.7785902, 'is_last_response': True, 'prompt_tokens': 42, 'prompt_tokens_details': {'cached_tokens': 0}, 'start_time': 1760550820.7137933, 'time_to_first_token': 0.051381587982177734, 'total_latency': 0.06479692459106445, 'total_tokens': 50, 'total_tokens_per_sec': 771.6415603912017, 'stop_reason': 'stop'}, 'model_name': 'Meta-Llama-3.3-70B-Instruct', 'system_fingerprint': 'fastcoe', 'finish_reason': 'stop', 'logprobs': None}, id='run--631ababd-4ca8-459a-9060-6985d2f6c120-0', usage_metadata={'input_tokens': 42, 'output_tokens': 8, 'total_tokens': 50}),
 A

#### Asynchronous batch generation

In [8]:
response = llm.abatch(["what is the square root of 81?","what is the natural logarithm of e"])
pprint(await response)

[AIMessage(content='The square root of 81 is 9. \n\n9 × 9 = 81, so √81 = 9.', additional_kwargs={}, response_metadata={'token_usage': {'acceptance_rate': None, 'completion_tokens': 26, 'completion_tokens_after_first_per_sec': 327.0534881617402, 'completion_tokens_after_first_per_sec_first_ten': 330.8823204576795, 'completion_tokens_after_first_per_sec_graph': 330.8823204576795, 'completion_tokens_per_sec': 201.8134280849804, 'end_time': 1760550822.7515142, 'is_last_response': True, 'prompt_tokens': 44, 'prompt_tokens_details': {'cached_tokens': 0}, 'start_time': 1760550822.6226823, 'time_to_first_token': 0.052391767501831055, 'total_latency': 0.1288318634033203, 'total_tokens': 70, 'total_tokens_per_sec': 543.343844844178, 'stop_reason': 'stop'}, 'model_name': 'Meta-Llama-3.3-70B-Instruct', 'system_fingerprint': 'fastcoe', 'finish_reason': 'stop', 'logprobs': None}, id='run--a20c34a4-1c2e-4300-be2c-e985578ad215-0', usage_metadata={'input_tokens': 44, 'output_tokens': 26, 'total_tokens'

#### Streaming response

In [9]:
for chunk in llm.stream('after what the earth is named?'):
    print(chunk.content, end="")

The origin of the name "Earth" is not definitively known, but there are several theories. Here are a few:

1. **Old English and Germanic roots**: One theory is that the name "Earth" comes from the Old English word "ertho" or "erthe", which was derived from the Proto-Germanic word "*erth-", meaning "ground" or "soil". This word is also related to the Old Norse word "jörð", which means "earth" or "ground".
2. **Latin influence**: Another theory suggests that the name "Earth" was influenced by the Latin word "terra", which means "earth" or "land". This Latin word is also the source of the English word "terrain".
3. **Ancient Greek and Roman mythology**: In ancient Greek and Roman mythology, the Earth was personified as a goddess, often depicted as a maternal figure. The Greek goddess Gaia (Γαῖα) and the Roman goddess Terra were both associated with the Earth.
4. **Indo-European roots**: Some linguists believe that the name "Earth" may have originated from the Proto-Indo-European root "*dʰ

#### Asynchronous streaming

In [10]:
async for chunk in llm.astream('rick roll me'):
    print(chunk.content)

You
 want to be
 Rick Rolled, do
 you?

Alright
, here's the
 classic treatment
:

**
Never
 Gonna Give You Up**
 by Rick Astley

[
Start
s playing in
 your
 imagination
]

"We
've known
 each other for so long
Your
 heart's
 been aching, but you're
 too
 shy to
 say it
Inside
 we both know what
's been going on
We know
 the
 game
 and
 we're
 gonna play it"


Hope
 that brought
 a smile (
and
 a bit
 of nostalgia
) to your face!
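Each chunk lands on its own line here because `print` appends a newline by default. The sketch below prints chunks on one line with `end=""` and also collects them into a single string. `stub_astream` is a hypothetical stand-in for `llm.astream`, and its chunk texts are illustrative, so the block runs without credentials.

```python
import asyncio
from types import SimpleNamespace


async def stub_astream(prompt: str):
    """Hypothetical stand-in for `llm.astream`: yields objects with a `.content` string."""
    for piece in ["You", " want to be", " Rick Rolled, do", " you?"]:
        await asyncio.sleep(0)  # yield to the event loop between chunks
        yield SimpleNamespace(content=piece)


async def collect(stream) -> str:
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    async for chunk in stream:
        print(chunk.content, end="", flush=True)  # end="" keeps chunks on one line
        parts.append(chunk.content)
    print()  # final newline once the stream is exhausted
    return "".join(parts)


if __name__ == "__main__":
    full_text = asyncio.run(collect(stub_astream("rick roll me")))
```

In a notebook, `await collect(llm.astream('rick roll me'))` would be the equivalent call, since an event loop is already running there.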


