# SambaNova LLMs: Getting started

In [None]:
# Install the dependencies required to run this notebook
!pip install python-dotenv==1.0.0
!pip install requests
!pip install langchain-core==0.3.68
!pip install langchain-community==0.3.27
!pip install sseclient-py==1.8.0
!pip install langchain-sambanova==0.1.6

In [1]:
import os
import sys

from pprint import pprint
from dotenv import load_dotenv

current_dir = os.getcwd()
repo_dir = os.path.abspath(os.path.join(current_dir, ".."))
sys.path.append(repo_dir)

from utils.model_wrappers.langchain_llms import SambaNovaCloud, SambaStudio
from langchain_sambanova import ChatSambaNovaCloud, ChatSambaStudio

load_dotenv(os.path.join(repo_dir, '.env'), override=True)

True

## Using SambaNova Cloud LLM wrapper

First, set your environment variables by following the instructions [here](../README.md#use-sambanova-cloud-option-1).
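
Before instantiating a model, it can help to confirm that the variables were actually loaded. The following is a minimal sketch, not part of the wrappers: the helper `missing_env_vars` is hypothetical, and the variable name `SAMBANOVA_API_KEY` is an assumption that should be checked against the README.

```python
import os

# Assumed variable name; confirm it against the README before relying on it.
REQUIRED_VARS = ["SAMBANOVA_API_KEY"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this after `load_dotenv` gives an early, readable message instead of an authentication error at the first request.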

In [2]:
# model instantiation 
llm = SambaNovaCloud(
    model='Meta-Llama-3.1-405B-Instruct',
    max_tokens=1024,
    temperature=0.7,
    stop_tokens=[],
    top_k=5,
    top_p=0.1
    )


#### Simple generation

In [3]:
response = llm.invoke('tell me a joke using palindrome words')
print(response)

Here's one:

Why did Madam ask Hannah to refer a deed to Bob?

Because Madam said, "A man, a plan, a canal, Panama!" and Hannah replied, "Aha, Bob, a deed is a deed, and I'll refer it to level-headed Eve, but never to a madam like you, for it's a civic deed, and I won't do it, Hannah."

But the real answer is: Because it was a "madam, I'm Adam" situation, and Hannah just wanted to level with Bob and say, "A Santa at NASA" told her to do it!

Okay, okay, I know, it's a bit of a stretch, but I hope the palindrome words "madam," "Hannah," "level," "deed," "civic," "refer," and "A man, a plan, a canal, Panama" brought a smile to your face!


#### Asynchronous generation

In [67]:
response = llm.ainvoke('tell me an interesting fact in 5 words')
print(await response)

Butterflies taste with their feet.
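
The top-level `await` in the cell above works because the notebook already runs an event loop. In a plain Python script the coroutine has to be driven explicitly, for example with `asyncio.run`. The sketch below shows that pattern with a stand-in coroutine, `fake_ainvoke`, which is hypothetical and only replaces `llm.ainvoke` so the example runs without credentials.

```python
import asyncio


async def fake_ainvoke(prompt: str) -> str:
    """Hypothetical stand-in for llm.ainvoke; returns a canned reply."""
    await asyncio.sleep(0)  # yield control once, as a real network call would
    return f"echo: {prompt}"


async def main() -> str:
    # In a real script, call llm.ainvoke here instead of fake_ainvoke.
    return await fake_ainvoke("tell me an interesting fact in 5 words")


if __name__ == "__main__":
    print(asyncio.run(main()))
```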


#### Batch generation

In [68]:
response = llm.batch(["what is the capital of France?","what is the capital of Japan","what is the capital of Serbia"])
pprint(response)

['The capital of France is Paris.',
 'The capital of Japan is Tokyo.',
 'The capital of Serbia is Belgrade.']


#### Asynchronous batch generation

In [69]:
response = llm.abatch(["what is the square root of 81?","what is the natural logarithm of e"])
pprint(await response)

['The square root of 81 is 9.',
 'The natural logarithm of e is 1.\n'
 '\n'
 'In mathematics, the natural logarithm (denoted by ln) is the logarithm to '
 'the base e, where e is a mathematical constant approximately equal to '
 '2.71828.\n'
 '\n'
 'By definition, the natural logarithm of e is:\n'
 '\n'
 'ln(e) = 1\n'
 '\n'
 'This is because the natural logarithm is the inverse function of the '
 'exponential function, and e is the base of the exponential function. In '
 'other words, the natural logarithm of e is the power to which e must be '
 'raised to equal e, which is 1.']


#### Streaming response

In [70]:
for chunk in llm.stream('what are the planets named after?'):
    print(chunk)


The planets in our solar 
system are named 
after ancient Roman gods 
and goddesses. Here's 
a 
brief overview:



1. Mercury - Named after the 
Roman messenger god, 

Mercurius (equivalent to the Greek 
god Hermes).
2. Venus 
- Named after the Roman goddess 
of love and beauty, 
Venus (equivalent to the Greek 
goddess 
Aphrodite).
3. Earth 
- Not directly named after a 
Roman god, 
but 
rather derived from Old English 
and Germanic words for 
"ground" and "soil."

4. Mars - 
Named after the Roman god 
of war, Mars (equivalent 
to the Greek 
god Ares).
5. Jupiter - Named 
after the Roman king of 
the 
gods, Jupiter (equivalent to 
the Greek god 
Zeus).
6. Saturn - Named after the 
Roman god of agriculture and 
time, Saturnus (equivalent 
to the Greek 
god Cronus).
7. Uranus - 
Named after the Greek god of 
the sky, 
Ouranos (the Romans did not have 
a direct equivalent).
8. 
Neptune - Named after the Roman 
god of the sea, 
Neptune (equivalent to the Greek 
god 
Poseidon).

The 
dwarf
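
Because `print(chunk)` ends every call with a newline, the output above breaks at chunk boundaries rather than at the model's own line breaks. A minimal sketch of printing chunks as they arrive while also keeping the full text follows; `fake_stream` is a hypothetical stand-in for `llm.stream(...)`, which in this wrapper yields plain strings.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for llm.stream(...): yields text chunks."""
    yield from ["The planets in our solar ", "system are named ", "after ancient Roman gods"]


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print each chunk without an added newline and return the joined text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = print_and_collect(fake_stream())
```

With the real wrapper, the call would be `print_and_collect(llm.stream(prompt))`.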

#### Asynchronous streaming

In [71]:
async for chunk in llm.astream('tell me a poem'):
    print(chunk)


Here's 
a short poem 
for you:

"Moonlit Dreams"

The 
moon is full, a silver glow


A beacon in 
the 
dark of 
night's soft flow
The 
stars up high, a 
twinkling sea
A celestial 
show, for you and 
me

The world is 
hushed, a 
peaceful 
sight
The moon's soft light, 
a gentle delight
The 
shadows 
dance, a waltz so fine

A magical 
night, a dream divine

In 
this quiet hour, I find 
my peace
A sense 
of calm, my worries 
release
The moon's soft light, 
a guiding ray
Leads 
me to dreams, in a 
peaceful way

So let 
us bask, 
in the 
moon's 
pale glow
And let 
our 
dreams, 
like 
stars, 
shine bright 

and slow.


## Using SambaNova Cloud Chat model wrapper
First, set your environment variables by following the instructions [here](../README.md#use-sambanova-cloud-option-1).

In [2]:
# model instantiation 
llm = ChatSambaNovaCloud(
    model="Meta-Llama-3.1-405B-Instruct",
    max_tokens=1024,
    temperature=0.7,
    top_p=0.1,
    stream_options={'include_usage':True}
    )

#### Simple generation

In [5]:
response = llm.invoke("tell me a joke")
print(response.content)

A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?"

The librarian replied, "It rings a bell, but I'm not sure if it's here or not."


#### Get usage metrics

In [None]:
response = llm.invoke("tell me a joke")
pprint(response.response_metadata["usage"])

{'acceptance_rate': 6.875,
 'completion_tokens': 54,
 'completion_tokens_after_first_per_sec': 144.7116724007174,
 'completion_tokens_after_first_per_sec_first_ten': 170.50721981101952,
 'completion_tokens_per_sec': 81.65890833536169,
 'end_time': 1727299777.1173537,
 'is_last_response': True,
 'prompt_tokens': 39,
 'start_time': 1727299776.406523,
 'time_to_first_token': 0.3445851802825928,
 'total_latency': 0.661287312074141,
 'total_tokens': 93,
 'total_tokens_per_sec': 140.63478657756735}
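
The throughput fields in this output appear to follow from the token counts and `total_latency`. The sketch below recomputes two of them from the values printed above; the relationship is inferred from these numbers alone, not from API documentation, so treat it as an assumption.

```python
import math

# Values copied from the usage output above.
usage = {
    "completion_tokens": 54,
    "total_tokens": 93,
    "total_latency": 0.661287312074141,
    "completion_tokens_per_sec": 81.65890833536169,
    "total_tokens_per_sec": 140.63478657756735,
}

# Assumed relationship: tokens divided by total latency in seconds.
completion_rate = usage["completion_tokens"] / usage["total_latency"]
total_rate = usage["total_tokens"] / usage["total_latency"]

print(f"completion tokens/s: {completion_rate:.2f}")
print(f"total tokens/s: {total_rate:.2f}")

# Compare the recomputed values with the reported ones.
print(math.isclose(completion_rate, usage["completion_tokens_per_sec"], rel_tol=1e-6))
print(math.isclose(total_rate, usage["total_tokens_per_sec"], rel_tol=1e-6))
```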


#### Use roles

In [None]:
messages = [
    ("system","You are a helpful assistant with pirate accent"),
    ("user","tell me a joke")
]
response = llm.invoke(messages)
print(response.content)

Yer lookin' fer a joke, eh? Alright then, matey! Here be one fer ye:

Why did the pirate quit his job?

(pause fer dramatic effect)

Because he was sick o' all the arrrr-guments!

Yarrr, hope that made ye laugh, me hearty!


#### Asynchronous generation

In [None]:
future_response = llm.ainvoke("tell me a joke")
response = await future_response
print(response.content)

A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?"

The librarian replied, "It rings a bell, but I'm not sure if it's here or not."


#### Batch generation

In [None]:
response = llm.batch(["which is the capital of Netherlands?","which is the capital of UK?"])
pprint(response)

[AIMessage(content='The capital of the Netherlands is Amsterdam.', response_metadata={'finish_reason': 'stop', 'usage': {'acceptance_rate': 13, 'completion_tokens': 9, 'completion_tokens_after_first_per_sec': 96.20570104765783, 'completion_tokens_after_first_per_sec_first_ten': 321.71740105260676, 'completion_tokens_per_sec': 24.091963638596955, 'end_time': 1727300260.23372, 'is_last_response': True, 'prompt_tokens': 42, 'start_time': 1727300259.8049712, 'time_to_first_token': 0.3455936908721924, 'total_latency': 0.37356855319096494, 'total_tokens': 51, 'total_tokens_per_sec': 136.52112728538276}, 'model_name': 'Meta-Llama-3.1-405B-Instruct', 'system_fingerprint': 'fastcoe', 'created': 1727300259}, id='1940365f-3588-40fe-bff6-1c15eed7cbad'),
 AIMessage(content='The capital of the United Kingdom is London.', response_metadata={'finish_reason': 'stop', 'usage': {'acceptance_rate': 13, 'completion_tokens': 10, 'completion_tokens_after_first_per_sec': 108.73960385772062, 'completion_tokens

#### Asynchronous batch generation

In [None]:
response = llm.abatch(["what is the square root of 81?","what is the natural logarithm of e"])
pprint(await response)

[AIMessage(content='The square root of 81 is 9.', response_metadata={'finish_reason': 'stop', 'usage': {'acceptance_rate': 13, 'completion_tokens': 11, 'completion_tokens_after_first_per_sec': 121.09526394198011, 'completion_tokens_after_first_per_sec_first_ten': 322.0993956865131, 'completion_tokens_per_sec': 28.998640879394692, 'end_time': 1727300282.6033456, 'is_last_response': True, 'prompt_tokens': 44, 'start_time': 1727300282.1755888, 'time_to_first_token': 0.34517717361450195, 'total_latency': 0.37932812250577486, 'total_tokens': 55, 'total_tokens_per_sec': 144.99320439697345}, 'model_name': 'Meta-Llama-3.1-405B-Instruct', 'system_fingerprint': 'fastcoe', 'created': 1727300282}, id='033258ae-de14-4bce-a1f6-78e32da43c30'),
 AIMessage(content='The natural logarithm of e is 1.\n\nIn mathematics, the natural logarithm (denoted by ln) is the logarithm to the base e, where e is a mathematical constant approximately equal to 2.71828.\n\nBy definition, the natural logarithm of e is:\n\n

#### Streaming response

In [None]:
for chunk in llm.stream('what is the earth named after?'):
    print(chunk.content, end="")

The Earth is named after Old English and Germanic words. The word "Earth" is derived from the Old English word "ertho" and the Old Norse word "jörð", which are both related to the Proto-Germanic word "*erth-", meaning "ground" or "soil".

In many other languages, the word for Earth is derived from the name of the ancient Greek goddess of the Earth, Gaia (Γαῖα). For example, in French, the word for Earth is "Terre", but the scientific term for Earth is "Gaïa" or "Gée".

So, to answer your question, the Earth is not directly named after a person or a specific object, but rather after the concept of the ground or soil in ancient Germanic cultures, and indirectly after the ancient Greek goddess Gaia in some languages.

#### Asynchronous streaming

In [None]:
async for chunk in llm.astream('rick roll me'):
    print(chunk.content)


You want to 
get Rickrolled, huh?




"We've known each other for so long

Your heart's 
been aching, but you're 
too shy to say 
it
Inside we both know what's 
been going on
We know 
the game and we're gonna 
play it



Never gonna give 
you up, never gonna let you down

Never gonna run around 
and desert you
Never gonna 
make you cry, never gonna 
say goodbye
Never gonna tell 
a lie and hurt you"




Haha, gotcha!





## Using SambaStudio LLM wrapper
First, set your environment variables by following the instructions [here](../README.md#use-sambastudio-option-2).

#### Simple generation

In [5]:
# model instantiation 
llm = SambaStudio(
    model_kwargs={
        'do_sample': False,
        'temperature': 0.7,
        'max_tokens': 256,
        'process_prompt': True,
        'model': 'Meta-Llama-3-70B-Instruct-4096',
    },
)

In [6]:
response = llm.invoke('tell me a pirates joke')
print(response)

Arrr, here be one for ye:

Why did the pirate quit his job?

Because he was sick of all the arrrr-guments!

Hope that made ye laugh, matey!


#### Asynchronous generation

In [7]:
response = llm.ainvoke('give me a recipe using only pineapple and sugar')
print(await response)

A simple yet sweet recipe!

Here's a recipe for Pineapple Sugar Crystals, also known as Pineapple Sugar Candy:

**Ingredients:**

* 1 cup pineapple chunks (fresh or canned)
* 1 cup granulated sugar

**Instructions:**

1. In a medium saucepan, combine the pineapple chunks and sugar.
2. Place the saucepan over medium heat and stir until the sugar has dissolved.
3. Bring the mixture to a boil, then reduce the heat to medium-low and simmer for about 20-25 minutes, or until the mixture reaches 300°F on a candy thermometer.
4. Remove the saucepan from the heat and let it cool slightly.
5. Line a baking sheet with parchment paper or a silicone mat.
6. Pour the pineapple-sugar mixture onto the prepared baking sheet.
7. Let it cool and set at room temperature for about 30-40 minutes, or until it has reached a firm, jelly-like consistency.
8. Once set, use a sharp knife or cookie cutter to cut the mixture into desired shapes.
9. Enjoy your Pineapple Sugar Crystals! Store them in an airtight cont

#### Batch generation

In [8]:
response = llm.batch(["Tell me a short joke","tell me a short tale"])
pprint(response)

['Why did the computer go to the doctor?\n\nIt had a virus!',
 'Here is a short tale:\n'
 '\n'
 '**The Moonlit Painter**\n'
 '\n'
 'In a small village nestled in the rolling hills of rural France, there lived '
 'a mysterious painter named Léon. By day, Léon was a quiet, unassuming man '
 'who kept to himself, but by night, he transformed into a master of the '
 'brush.\n'
 '\n'
 'Under the silvery light of the full moon, Léon would sneak out of his '
 'cottage and set up his easel in the town square. With a flick of his wrist, '
 'he would conjure vibrant colors and bring the night to life on canvas.\n'
 '\n'
 'One evening, a curious young girl named Sophie stumbled upon Léon as he '
 'worked his magic. Entranced by the beauty of his art, she watched in silence '
 'as he painted the stars themselves into the sky.\n'
 '\n'
 "From that moment on, Sophie became Léon's loyal apprentice, learning the "
 'secrets of his moonlit craft. Together, they created masterpieces that shone '
 'like 

#### Asynchronous batch generation

In [9]:
responses = llm.abatch(
    ["give me a python code to print the current time in 12 hour format with AM/PM","why the chicken ran?"])
for response in await responses:
    print(response)
    print("\n\n---\n\n")

Here is a Python code that prints the current time in 12-hour format with AM/PM:
```
import datetime

now = datetime.datetime.now()
current_time = now.strftime("%I:%M %p")

print(current_time)
```
Let me explain what's happening:

* `import datetime` imports the `datetime` module, which provides classes for working with dates and times.
* `now = datetime.datetime.now()` gets the current date and time using the `now()` method.
* `current_time = now.strftime("%I:%M %p")` formats the current time using the `strftime()` method. The format string `"%I:%M %p"` specifies:
	+ `%I`: Hour (12-hour clock) as a zero-padded decimal number (e.g., 01, 02,..., 12)
	+ `%M`: Minute as a zero-padded decimal number (e.g., 00, 01,..., 59)
	+ `%p`: Locale’s equivalent of either AM or PM (e.g., AM, PM)
* `print(current_time)` prints the formatted current time to the console.

Run this code, and you should see the current time in 12-hour format with AM/PM, like this


---


The classic question!

Unfortunatel

#### Streaming response

In [10]:
for chunk in llm.stream('give me 3 caption ideas for a beach trip instagram post'):
    print(chunk)


Here are three caption ideas 
for a beach trip Instagram post:


1. 
**Sandy Toes and 
Sun-Kissed Nose**: "Soaking 
up the sun and making memories 
with the ones I 
love. Life's a beach, 
and I'm loving every wave of 
it 
#beachlife 
#coastalvibes #paradisefound"

2. **Seas 
the Day!**: 
"Tropical state of mind Nothing like 
a beach day to clear my 
mind and fill my heart 
with joy. Who else is ready 
for a summer of 
adventure? 
#beachbum #oceanlover #seas_the_day"

3. 
**Tidal Wave of 
Happiness**: "Surrounded by saltwater and 
good vibes Life's too short 
to not spend it by 
the ocean. Grateful for this 
little slice of heaven on 
earth 
#beachykeen 
#coastalbliss #happinessfound"


Feel free to customize them to fit 
your personal style and the tone 
of 
your post!


#### Asynchronous streaming

In [11]:
async for chunk in llm.astream('which words rhyme with dinosaur'):
    print(chunk)


Here are some words that 
rhyme with "dinosaur":


*osaur 
(e.g. tyrannosaur, velocisaur)

* Shore
* Score

* Before

* Galore
* More

* Roar
* Sore

* Tore


Note that some of these words 
may not be exact 
perfect rhymes, but they all share 
a similar sound and ending sound 

with "dinosaur".


## Using SambaStudio Chat model wrapper
First, set your environment variables by following the instructions [here](../README.md#use-sambastudio-option-2).

In [3]:
# model instantiation 
llm = ChatSambaStudio(
    model="Meta-Llama-3-70B-Instruct-4096",
    max_tokens=1024,
    temperature=0.3,
    top_p=0.01,
    do_sample=True,
    process_prompt=False,
    )

#### Simple generation

In [3]:
response = llm.invoke("tell me a joke")
print(response.content)

Here's one:

Why couldn't the bicycle stand up by itself?

(Wait for it...)

Because it was two-tired!

Hope that made you laugh!


#### Get usage metrics

In [4]:
response = llm.invoke("tell me a joke")
pprint(response.response_metadata["usage"])

{'completion_tokens': 32,
 'model_execution_time': 0.5915939807891846,
 'prompt_tokens': 14,
 'throughput_after_first_token': 69.80657674652622,
 'time_to_first_token': 0.21913623809814453,
 'total_tokens': 46}


#### Use roles

In [5]:
messages = [
    ("system","You are a helpful assistant with pirate accent"),
    ("user","tell me a joke")
]
response = llm.invoke(messages)
print(response.content)

Arrr, listen up, matey! Here be a joke fer ye:

Why did the pirate quit his job?

Because he was sick o' all the arrrr-guments! (get it? arguments, but with a pirate "arrr" sound? Aye, I be a regular comedic genius, savvy?)

So, did I make ye laugh, or did I walk the plank?


#### Asynchronous generation

In [6]:
future_response = llm.ainvoke("tell me a joke")
response = await future_response
print(response.content)

Here's one:

Why couldn't the bicycle stand up by itself?

(Wait for it...)

Because it was two-tired!

Hope that made you laugh!


#### Batch generation

In [7]:
response = llm.batch(["which is the capital of Netherlands?","which is the capital of UK?"])
pprint(response)

[AIMessage(content='The capital of the Netherlands is Amsterdam.', additional_kwargs={}, response_metadata={'finish_reason': None, 'usage': {'prompt_tokens': 17, 'completion_tokens': 8, 'total_tokens': 25, 'throughput_after_first_token': 48.4325123266475, 'time_to_first_token': 0.21890592575073242, 'model_execution_time': 0.26020050048828125}, 'model_name': 'Meta-Llama-3-70B-Instruct-4096', 'system_fingerprint': '', 'created': 1727913765}, id='0dc08e37-8cd6-44c1-9fb0-932249b217d0'),
 AIMessage(content='The capital of the United Kingdom (UK) is London.', additional_kwargs={}, response_metadata={'finish_reason': None, 'usage': {'prompt_tokens': 17, 'completion_tokens': 12, 'total_tokens': 29, 'throughput_after_first_token': 60.74387753624238, 'time_to_first_token': 0.21868610382080078, 'model_execution_time': 0.3174614906311035}, 'model_name': 'Meta-Llama-3-70B-Instruct-4096', 'system_fingerprint': '', 'created': 1727913765}, id='e5c8faa9-d437-467d-bce9-d5b29680e488')]


#### Asynchronous batch generation

In [8]:
response = llm.abatch(["what is the square root of 81?","what is the natural logarithm of e"])
pprint(await response)

[AIMessage(content='The square root of 81 is 9.', additional_kwargs={}, response_metadata={'finish_reason': None, 'usage': {'prompt_tokens': 19, 'completion_tokens': 10, 'total_tokens': 29, 'throughput_after_first_token': 57.94458086820774, 'time_to_first_token': 0.21925926208496094, 'model_execution_time': 0.28829073905944824}, 'model_name': 'Meta-Llama-3-70B-Instruct-4096', 'system_fingerprint': '', 'created': 1727913768}, id='438c3a4b-1dad-4b14-affa-3939bb669145'),
 AIMessage(content="A clever question!\n\nThe natural logarithm of e is... (drumroll please)... 1!\n\nThat's right, the natural logarithm of e, denoted by ln(e), is equal to 1.\n\nTo see why, recall that the natural logarithm is the inverse function of the exponential function. In other words, ln(x) is the power to which you need to raise e to get x. So, ln(e) is the power to which you need to raise e to get e, which is simply 1.\n\nIn mathematical notation, this can be written as:\n\nln(e) = 1\n\nThis is a fundamental pr

#### Streaming response

In [9]:
for chunk in llm.stream('what is the earth named after?'):
    print(chunk.content, end="")

The origin of the name "Earth" is not entirely clear, but there are a few theories. Here are some possible explanations:

1. **Old English and Germanic roots**: The modern English word "Earth" comes from the Old English and Germanic word "ertho", which means "ground" or "soil". This word is thought to have been derived from the Proto-Indo-European root "dher-", which also meant "to hold" or "to support".
2. **Terra in Latin**: The Latin word for Earth is "Terra", which is also the source of the French word "Terre" and the Spanish word "Tierra". The Latin "Terra" is thought to have been derived from the Proto-Indo-European root "ters-", which meant "dry" or "land".
3. **Greek influence**: The Greek word for Earth is "Γαια" (Gaia), which was the name of the goddess of the Earth in Greek mythology. The Greek word "Γαια" may have influenced the development of the Latin word "Terra", and subsequently the modern English word "Earth".
4. **Other theories**: Some scholars believe that the name

#### Asynchronous streaming

In [10]:
async for chunk in llm.astream('rick roll me'):
    print(chunk.content)

You want to 
be Rickrolled, do ya?


Here's the classic Rick 
Astley experience:


**"Never Gonna Give You Up"**


[Starts playing the 
iconic song]

Never gonna 
give, never gonna give

(Give you up)
Never 
gonna let, never gonna let

(Let you down)


...and so on!

How 
was that? Did I successfully 
Rickroll you?

