## Advanced LLM Calling

In [1]:
from dotenv import load_dotenv
from babble_foundry.openrouter import OpenRouter

display(load_dotenv(override=True))

client = OpenRouter()

True

In [2]:
import time

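class Timer:
    """Minimal stopwatch for timing notebook cells."""

    def __init__(self):
        self.reset()

    def reset(self):
        # perf_counter is monotonic, so system clock adjustments do not skew it
        self.start_time = time.perf_counter()

    def get_elapsed(self, verbose: bool = False) -> float:
        elapsed_time = time.perf_counter() - self.start_time
        if verbose:
            # Print the same value that is returned
            print(f"{elapsed_time:.2f} seconds elapsed")
        return elapsed_time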

timer = Timer()

### Asynchronous calling

In [3]:
model_id = "mistralai/mistral-7b-instruct:free"
messages = [
    {"role": "system", "content": "You answer all questions using 5 short bullet points or less."},
    {"role": "user", "content": "What are some places, events, or communities to meet people in Toronto?"}
]

timer.reset()
response = await client.achat(model=model_id, messages=messages)
_ = timer.get_elapsed(verbose=True)

print()
client.print_response(response)

2.21 seconds elapsed

 1. Kensington Market: Vibrant, eclectic neighborhood with unique shops, street performers, and cultural food choices.
2. Parks: High Park, Trinity Bellwoods Park, andAllan Gardens offer a casual setting for socializing.
3. Meetup.com: Various groups for interests such as sports, hobbies, and networking meet regularly.
4. Toronto Public Library: Attend free events, programs, and workshops.
5. Bars and clubs: Popular spots for socializing include The Drake Hotel, The Gladstone Hotel, and The Rex Hotel Jazz & Blues Bar.


In [4]:
display(response)

{'id': 'gen-1756712728-JdPofKFku3x6CvW4neo5',
 'provider': 'DeepInfra',
 'model': 'mistralai/mistral-7b-instruct:free',
 'object': 'chat.completion',
 'created': 1756712728,
 'choices': [{'logprobs': None,
   'finish_reason': 'stop',
   'native_finish_reason': 'stop',
   'index': 0,
   'message': {'role': 'assistant',
    'content': ' 1. Kensington Market: Vibrant, eclectic neighborhood with unique shops, street performers, and cultural food choices.\n2. Parks: High Park, Trinity Bellwoods Park, andAllan Gardens offer a casual setting for socializing.\n3. Meetup.com: Various groups for interests such as sports, hobbies, and networking meet regularly.\n4. Toronto Public Library: Attend free events, programs, and workshops.\n5. Bars and clubs: Popular spots for socializing include The Drake Hotel, The Gladstone Hotel, and The Rex Hotel Jazz & Blues Bar.',
    'refusal': None,
    'reasoning': None}}],
 'usage': {'prompt_tokens': 34,
  'completion_tokens': 135,
  'total_tokens': 169,
  'p

In [5]:
import asyncio

cities = [
    "Toronto",
    "New York",
    "Montreal",
    "Mexico City",
    "Paris",
    "Barcelona",
    "Hong Kong",
    "Tokyo",
    "Singapore"
]

async def get_activities_for_city(city: str) -> tuple[str, dict]:
    messages = [
        {"role": "system", "content": "You answer all questions using 5 short bullet points or less."},
        {"role": "user", "content": f"What are some places, events, or communities to meet people in {city}?"}
    ]
    return city, await client.achat(model=model_id, messages=messages, reasoning={"enabled": False})

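timer.reset()
# Schedule every request up front so they run concurrently
tasks = [asyncio.create_task(get_activities_for_city(city)) for city in cities]

# Handle each response as soon as it arrives, in completion order
for next_done in asyncio.as_completed(tasks):
    city, response = await next_done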
    print(f"===== {city.upper()} =====")
    client.print_response(response)
    print("")
_ = timer.get_elapsed(verbose=True)

CancelledError: 
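
When many requests are fired at once, it can help to cap how many are in flight. The sketch below is one way to do that with `asyncio.Semaphore` while still handling results in completion order. `fake_llm_call`, `bounded`, `run_all`, and `max_concurrency` are hypothetical names introduced for illustration; `fake_llm_call` only sleeps and stands in for `client.achat`, so nothing here reflects the real client's behavior.

```python
import asyncio
import random


async def fake_llm_call(city: str) -> tuple[str, str]:
    # Hypothetical stand-in for client.achat: sleeps briefly instead of calling an API
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return city, f"Ideas for meeting people in {city}"


async def bounded(semaphore: asyncio.Semaphore, city: str) -> tuple[str, str]:
    # At most max_concurrency calls hold the semaphore at the same time
    async with semaphore:
        return await fake_llm_call(city)


async def run_all(cities: list[str], max_concurrency: int = 3) -> list[tuple[str, str]]:
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [asyncio.create_task(bounded(semaphore, city)) for city in cities]
    results = []
    # as_completed yields awaitables in the order the tasks finish
    for next_done in asyncio.as_completed(tasks):
        results.append(await next_done)
    return results


demo_cities = ["Toronto", "New York", "Montreal", "Paris"]
# In a script; inside a notebook cell use: results = await run_all(demo_cities)
results = asyncio.run(run_all(demo_cities))
for city, text in results:
    print(f"{city}: {text}")
```

Because results arrive in completion order, the printed order can differ from the input order between runs.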

### Caching

#### Provider prompt caching

In [None]:
import re
import requests
import tiktoken

very_long_text = requests.get("https://www.gutenberg.org/ebooks/2701.txt.utf-8", timeout=60).text

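# crude: extract Chapters 1 through 9 (everything before "CHAPTER 10.")
first_chapters_pattern = r'(CHAPTER 1\..*?)CHAPTER 10\.'
matches = re.findall(first_chapters_pattern, very_long_text, re.DOTALL | re.IGNORECASE)
# matches[0] is most likely the table of contents; matches[1] is the body text
very_long_text = matches[1].strip()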

print(f"{very_long_text[:200]}\n\n...\n\n{very_long_text[-200:]}", end="\n\n")
print(f"{len(very_long_text)} characters")

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode(very_long_text)
print(f"{len(tokens)} tokens")

CHAPTER 1. Loomings.

Call me Ishmael. Some years ago—never mind how long precisely—having
little or no money in my purse, and nothing particular to interest me
on shore, I thought I would sail ab

...

ime of his God?”

He said no more, but slowly waving a benediction, covered his face with
his hands, and so remained kneeling, till all the people had departed,
and he was left alone in the place.

102580 characters
25487 tokens


In [None]:
from datetime import datetime

model_id = "openai/gpt-4o-mini"
system_message = (
    f"The time now is {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}. "
    "Your job is to summarize the text provided by the user in 5 bullet points or less."
)
messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": very_long_text}
]

timer.reset()
stream = client.chat(model=model_id, messages=messages, usage={"include": True}, stream=True, verbose=True)
chunks = list(stream)

print()
_ = timer.get_elapsed(verbose=True)
print()

display(chunks[-1])

Here's a summary of the provided text:

*   **The narrator, Ishmael, feels a deep compulsion to go to sea** when he experiences existential dread and dissatisfaction with life on land.
*   **He arrives in New Bedford seeking passage on a Nantucket ship**, encountering a seedy inn called "The Spouter Inn" and its peculiar landlord.
*   **Ishmael is forced to share a room with an exotic harpooneer named Queequeg**, initially apprehensive due to Queequeg's appearance and rumored cannibalism.
*   **Despite his fears, Ishmael finds Queequeg to be a fundamentally decent and polite individual**, and they share a surprisingly peaceful night and breakfast together.
*   **The narrative explores themes of fate, man's relationship with nature and the divine, and the harsh realities of the whaling life**, even touching upon religious sermons that reflect these ideas.

2.22 seconds elapsed



{'id': 'gen-1756712623-Z5ZjkcGyBAr4Skr3ILJu',
 'provider': 'Google',
 'model': 'google/gemini-2.5-flash-lite',
 'object': 'chat.completion.chunk',
 'created': 1756712623,
 'choices': [{'index': 0,
   'delta': {'role': 'assistant', 'content': ''},
   'finish_reason': None,
   'native_finish_reason': None,
   'logprobs': None}],
 'usage': {'prompt_tokens': 26545,
  'completion_tokens': 190,
  'total_tokens': 26735,
  'cost': 0.0027305,
  'is_byok': False,
  'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0},
  'cost_details': {'upstream_inference_cost': None,
   'upstream_inference_prompt_cost': 0.0026545,
   'upstream_inference_completions_cost': 7.6e-05},
  'completion_tokens_details': {'reasoning_tokens': 0, 'image_tokens': 0}}}
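
The final streamed chunk carries the `usage` block, including `prompt_tokens_details.cached_tokens`. A minimal sketch for pulling the text and the cache figures out of a list of chunks follows. It assumes the chunk shape shown in the output above; `collect_stream` is a hypothetical helper, and the sample chunk text is invented for illustration.

```python
def collect_stream(chunks: list[dict]) -> tuple[str, dict]:
    """Join streamed text and summarize prompt-cache usage from the last chunk."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            # delta.content may be missing or None on some chunks
            parts.append(choice.get("delta", {}).get("content") or "")

    usage = {}
    if chunks:
        usage = chunks[-1].get("usage") or {}
    prompt_tokens = usage.get("prompt_tokens", 0)
    details = usage.get("prompt_tokens_details") or {}
    cached_tokens = details.get("cached_tokens") or 0
    summary = {
        "prompt_tokens": prompt_tokens,
        "cached_tokens": cached_tokens,
        # Fraction of the prompt that was served from the provider's cache
        "cached_fraction": cached_tokens / prompt_tokens if prompt_tokens else 0.0,
    }
    return "".join(parts), summary


# Invented sample chunks in the same shape as the displayed output
sample_chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello"}}]},
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ", world"}}]},
    {
        "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}],
        "usage": {
            "prompt_tokens": 26545,
            "completion_tokens": 190,
            "prompt_tokens_details": {"cached_tokens": 0, "audio_tokens": 0},
        },
    },
]

text, summary = collect_stream(sample_chunks)
print(text)
print(summary)
```

A `cached_fraction` of zero, as in the run above, means no part of the prompt was read from cache.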

In [None]:
timer.reset()
stream = client.chat(model=model_id, messages=messages, usage={"include": True}, stream=True, verbose=True)
chunks = list(stream)

print()
_ = timer.get_elapsed(verbose=True)
print()

display(chunks[-1])

* Ishmael decides to go to sea to escape his melancholy and find adventure, specifically choosing a whaling voyage.
* He arrives in New Bedford, a whaling hub, and seeks lodging, encountering various inns before settling on the "Spouter Inn."
* Ishmael is forced to share a bed with an unseen harpooneer, leading to growing apprehension and a comical confrontation with the landlord.
* He eventually meets Queequeg, a heavily tattooed "savage" who engages in pagan rituals with a small idol and a tomahawk-pipe. Despite initial fear, Ishmael finds an unexpected respect for Queequeg.
* The narrative shifts to Father Mapple's sermon in the Whaleman's Chapel, where he delivers a powerful discourse on Jonah's disobedience, repentance, and God's omnipresent power, urging his congregation, and himself, to preach truth regardless of personal cost.

2.17 seconds elapsed



{'id': 'gen-1756712497-1f4gM9lwzE6Hbw0m0BSF',
 'provider': 'Google AI Studio',
 'model': 'google/gemini-2.5-flash-image-preview:free',
 'object': 'chat.completion.chunk',
 'created': 1756712497,
 'choices': [{'index': 0,
   'delta': {'role': 'assistant', 'content': ''},
   'finish_reason': None,
   'native_finish_reason': None,
   'logprobs': None}],
 'usage': {'prompt_tokens': 27047,
  'completion_tokens': 186,
  'total_tokens': 27233,
  'cost': 0,
  'is_byok': False,
  'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0},
  'cost_details': {'upstream_inference_cost': None,
   'upstream_inference_prompt_cost': 0,
   'upstream_inference_completions_cost': 0},
  'completion_tokens_details': {'reasoning_tokens': 0, 'image_tokens': 0}}}
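
Both runs report `cached_tokens: 0`. One possible contributor, offered as an assumption rather than a diagnosis, is that the system message starts with a timestamp that changes every second. Provider caches that match on an identical leading span of the prompt would then find almost nothing to reuse. The sketch below only measures how many leading characters two consecutive prompts share; `render`, `shared_prefix_len`, `timestamp_first`, and `timestamp_last` are hypothetical helpers, the document text is a stand-in, and real providers apply their own matching rules and minimum lengths.

```python
INSTRUCTIONS = "Your job is to summarize the text provided by the user in 5 bullet points or less."
DOCUMENT = "Call me Ishmael. " * 200  # stand-in for the long text


def render(messages: list[dict]) -> str:
    # Flatten messages into one string; a rough proxy for what the provider sees
    return "".join(f"[{m['role']}]{m['content']}" for m in messages)


def shared_prefix_len(a: str, b: str) -> int:
    # Count leading characters that are identical in both strings
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def timestamp_first(now: str) -> list[dict]:
    # Same layout as the cell above: the changing value leads the prompt
    return [
        {"role": "system", "content": f"The time now is {now}. {INSTRUCTIONS}"},
        {"role": "user", "content": DOCUMENT},
    ]


def timestamp_last(now: str) -> list[dict]:
    # Alternative layout: stable content first, changing value at the end
    return [
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": DOCUMENT},
        {"role": "user", "content": f"The time now is {now}."},
    ]


t1, t2 = "2025-09-01 10:00:00", "2025-09-01 10:00:05"
first = shared_prefix_len(render(timestamp_first(t1)), render(timestamp_first(t2)))
last = shared_prefix_len(render(timestamp_last(t1)), render(timestamp_last(t2)))
print(f"timestamp first: {first} shared characters")
print(f"timestamp last:  {last} shared characters")
```

Placing the stable instructions and document ahead of any per-request values keeps the shared prefix long, which is the precondition for prefix-based caching to apply at all.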