# test-via-streaming.ipynb

Test the API implementation of the `via_streaming` option, which tries to work around timeouts by retrieving the response data
via streaming. However, this does not prevent a timeout when even the first response takes too long to arrive.
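
To make the idea concrete, here is a minimal, self-contained sketch of what retrieving a response via streaming amounts to: chunks are consumed one by one and concatenated into the final answer. It does not use `llms_wrapper` or any real API. `fake_stream` and `collect_stream` are hypothetical helpers invented for illustration, and the result keys (`ok`, `error`, `answer`, `n_chunks`) merely mirror the ones used further down in this notebook. The `first_chunk_timeout` parameter is also an assumption; it only illustrates the caveat that streaming cannot help if the first chunk is slow.

```python
import time
from typing import Iterable, Iterator, Optional


def fake_stream(chunks: Iterable[str], first_delay: float = 0.0,
                delay: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response (no network involved)."""
    for i, chunk in enumerate(chunks):
        # The first chunk can be delayed separately to simulate a slow start.
        time.sleep(first_delay if i == 0 else delay)
        yield chunk


def collect_stream(stream: Iterable[str],
                   first_chunk_timeout: Optional[float] = None) -> dict:
    """Concatenate all chunks of a stream into one answer.

    If first_chunk_timeout is given and the first chunk arrives later than
    that, an error result is returned. The check happens after the chunk has
    arrived; a real client would abort the connection instead.
    """
    start = time.monotonic()
    parts = []
    for chunk in stream:
        if not parts and first_chunk_timeout is not None:
            waited = time.monotonic() - start
            if waited > first_chunk_timeout:
                return dict(ok=False, answer="", n_chunks=0,
                            error=f"first chunk took {waited:.2f}s")
        parts.append(chunk)
    return dict(ok=True, error="", answer="".join(parts), n_chunks=len(parts))


ret = collect_stream(fake_stream(["A ", "monoid ", "is ..."]))
print(ret["ok"], ret["n_chunks"], repr(ret["answer"]))
```

Because each chunk arrives quickly once the stream has started, a per-read timeout is rarely hit after the first chunk; the slow part that streaming cannot hide is the wait before that first chunk.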

In [1]:
import os, sys
from loguru import logger
from typing import Optional, List, Dict
sys.path.append(os.path.join(".."))
from llms_wrapper.llms import LLMS
from llms_wrapper.config import update_llm_config

In [9]:
config = dict(
    llms=[
        # OpenAI
        # https://platform.openai.com/docs/models
        dict(llm="openai/gpt4"),
        # https://ai.google.dev/gemini-api/docs/models/gemini
        dict(llm="gemini/gemini-2.0-flash"),
        # Anthropic
        # https://docs.anthropic.com/en/docs/about-claude/models
        dict(llm="anthropic/claude-3-5-sonnet-20240620"),
        dict(llm="anthropic/claude-3-opus-20240229"),
        # Mistral
        # https://docs.mistral.ai/getting-started/models/models_overview/
        dict(llm="mistral/mistral-large-latest"),
        # XAI
        # dict(llm="xai/grok-2"),     # not mapped by litellm yet?
        dict(llm="xai/grok-beta"),
        # Groq
        # https://console.groq.com/docs/models
        dict(llm="groq/llama3-70b-8192"),
        dict(llm="groq/llama-3.3-70b-versatile"),
        # Deepseek
        # https://api-docs.deepseek.com/quick_start/pricing
        dict(llm="deepseek/deepseek-chat"),
    ],
    providers = dict(
        openai = dict(api_key_env="MY_OPENAI_API_KEY"),
        gemini = dict(api_key_env="MY_GEMINI_API_KEY"),
        anthropic = dict(api_key_env="MY_ANTHROPIC_API_KEY"),
        mistral = dict(api_key_env="MY_MISTRAL_API_KEY"),
        xai = dict(api_key_env="MY_XAI_API_KEY"),    
        groq = dict(api_key_env="MY_GROQ_API_KEY"),
        deepseek = dict(api_key_env="MY_DEEPSEEK_API_KEY"),
    )
)
config = update_llm_config(config)
llms = LLMS(config)
llms.list_aliases()

['openai/gpt4',
 'gemini/gemini-2.0-flash',
 'anthropic/claude-3-5-sonnet-20240620',
 'anthropic/claude-3-opus-20240229',
 'mistral/mistral-large-latest',
 'xai/grok-beta',
 'groq/llama3-70b-8192',
 'groq/llama-3.3-70b-versatile',
 'deepseek/deepseek-chat']

## Test via_streaming versus default



In [3]:
msgs = LLMS.make_messages("What is a monoid? Give me a simple example. Explain why the concept is useful.")
msgs

[{'content': 'What is a monoid? Give me a simple example. Explain why the concept is useful.',
  'role': 'user'}]

In [4]:
# llmname = "gemini/gemini-2.0-flash"
llmname = "mistral/mistral-large-latest"

In [8]:
# logger.enable("llms_wrapper")
logger.disable("llms_wrapper")
ret = llms.query(
    llmname, msgs, temperature=0.5, litellm_debug=False, 
    via_streaming=True,   # can alternatively be set in the config
    debug=False,
)
if ret["ok"]:
    print("PROCESSING OK, chunks:", ret["n_chunks"])
    print(ret["answer"].strip())
else:
    print("!!!!!Error:", ret["error"])

PROCESSING OK, chunks: 342
AA **monoid **monoid** is a simple** is a simple algebraic structure algebraic structure from from abstract abstract algebra algebra that that consists of:

 consists of:

1. A1. A **set** ( **set** (ee.g., numbers.g., numbers, strings, functions, strings, functions).
2. An).
2. An **associative **associative binary operation** ( binary operation** (e.g., additione.g., addition, concaten, concatenation, compositionation, composition).
3. An).
3. An **identity element** **identity element** (e.g., (e.g., 0 for 0 for addition, the addition, the empty string for empty string for concatenation).

### concatenation).

### Formal Definition:
 Formal Definition:
A monA monoid isoid is a tuple a tuple \((M \((M, \cdot, \cdot, e, e)\))\) where:
- \( where:
- \(M\) is aM\) is a set,
- \(\ set,
- \(\cdot:cdot: M \times M \times M \to M \to M\) is an M\) is an associative operation associative operation (i (i.e., \((.e., \((a \cdot ba \cdot b) \cdot c) \cdot c = a \cdot 

In [None]:
# Default (non-streaming) query, for comparison with the streaming call above
logger.disable("llms_wrapper")
ret = llms.query(
    llmname, msgs, temperature=0.5, litellm_debug=False, debug=False)
if ret["ok"]:
    # n_chunks may only be set when streaming, so use .get() to avoid a KeyError
    print("PROCESSING OK, chunks:", ret.get("n_chunks"))
    print(ret["answer"].strip())
else:
    print("!!!!!Error:", ret["error"])