In [1]:
from src import *
demo = True  # set to True and run all cells to see everything in action

In [2]:
# `distribute` is a decorator factory that lets a function run multithreaded across every element of a list-valued argument
# this is especially useful here: API calls spend almost all their time waiting on the network, so `threads=25` gives up to a roughly 25x speedup on batches of 25 or more calls
# for the purpose of the demo, we'll define a "parade" decorator that just prints each model's output in the order it arrives
# note: `distribute` is modified by the `defer_kwargs` meta-decorator, which allows its factory kwargs to be specified as _distribute_kwarg in a decorated function call
async_parade = distribute(threads=25, after=lambda **x: print(f'<{x["model"]}>\n{x["value"]}\n'), exclude=["messages"])
@async_parade
def chat_parade(messages, **kwargs) -> Any:
	return chat_completion(messages, (kwargs.get("options", {}) | kwargs))
@async_parade
def text_parade(prompt, **kwargs) -> Any:
	return text_completion(prompt, (kwargs.get("options", {}) | kwargs))

async_normal = distribute(threads=25, after=lambda **x: x["value"], exclude=["messages"])
@async_normal
def chat(messages, **kwargs) -> Any:
	return chat_completion(messages, (kwargs.get("options", {}) | kwargs))
@async_normal
def text(prompt, **kwargs) -> Any:
	return text_completion(prompt, (kwargs.get("options", {}) | kwargs))

chat_models = models_by_mode.chat.keys()
text_models = models_by_mode.text.keys()
fim_models = models_by_mode.text.filter(lambda _, v: 'suffix' in v.get('parameters', [])).keys()
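
To make the fan-out idea concrete, here is a minimal sketch of how a `distribute`-style decorator factory could be built on the standard library's thread pool. It is not the implementation in `src`: the name `distribute_sketch` is hypothetical, it only fans out over the first list-valued keyword argument, and it leaves positional arguments (such as `messages`) untouched, which is a simplification of the `exclude` option used above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import wraps

def distribute_sketch(threads=25, after=None):
    """Hypothetical stand-in for `distribute`: fan one call out over a list-valued kwarg."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Pick the first keyword argument whose value is a list or tuple.
            fan_key = next((k for k, v in kwargs.items() if isinstance(v, (list, tuple))), None)
            if fan_key is None:
                return func(*args, **kwargs)  # nothing to distribute; plain call
            values = list(kwargs[fan_key])
            results = [None] * len(values)
            with ThreadPoolExecutor(max_workers=threads) as pool:
                # Map each submitted future back to the position of its input value.
                futures = {
                    pool.submit(func, *args, **{**kwargs, fan_key: value}): index
                    for index, value in enumerate(values)
                }
                # as_completed yields futures in arrival order, like the parade output.
                for future in as_completed(futures):
                    index = futures[future]
                    results[index] = future.result()
                    if after is not None:
                        after(**{fan_key: values[index], "value": results[index]})
            return results  # results are returned in input order
        return wrapper
    return decorator

arrivals = []

@distribute_sketch(threads=4, after=lambda **x: arrivals.append(x["model"]))
def shout(text, model=None):
    return f"{model}: {text.upper()}"

batch = shout("hi", model=["a", "b", "c"])
single = shout("hi", model="a")
```

The `after` callback fires as each result arrives, while the returned list keeps the order of the inputs, so callers can choose between a live view and a stable result.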

In [3]:
# show models for each mode, with form `{provider}::{id}`
if demo:
	print('Chat: ' + ', '.join(chat_models))
	print('\nText: ' + ', '.join(text_models))
	print('\nText (w/ suffix): ' + ', '.join(fim_models))

Chat: deepseek-beta::deepseek-coder, openai::gpt-4o, openai::gpt-4o-2024-08-06, openai::gpt-4o-mini, openrouter::openai/gpt-4o, openrouter::openai/gpt-4o-2024-08-06, openrouter::openai/gpt-4o-mini, openrouter::anthropic/claude-3.5-sonnet, openrouter::google/gemini-flash-1.5, openrouter::google/gemini-pro-1.5, openrouter::google/gemini-flash-1.5-8b, openrouter::qwen/qwen-2.5-72b-instruct, openrouter::mistralai/mistral-large, openrouter::mistralai/codestral-mamba, openrouter::nousresearch/hermes-3-llama-3.1-405b, openrouter::meta-llama/llama-3.2-90b-vision-instruct, openrouter::meta-llama/llama-3.2-11b-vision-instruct, openrouter::meta-llama/llama-3.2-11b-vision-instruct:free, openrouter::meta-llama/llama-3.2-3b-instruct, openrouter::meta-llama/llama-3.2-1b-instruct, openrouter::meta-llama/llama-3.2-3b-instruct:free, openrouter::meta-llama/llama-3.2-1b-instruct:free, openrouter::meta-llama/llama-3.1-70b-instruct, openrouter::meta-llama/llama-3.1-8b-instruct, openrouter::mistralai/mistral

In [4]:
# display all chat models' responses to a query
if demo:
	import random
	secret_word = 'skyscraper'
	scrambled = random.sample(secret_word, len(secret_word))
	query = f"Unscramble the word: {' '.join(scrambled)}"
	print(query)
	results = chat_parade(query, model=chat_models, max_tokens=64, stream=False)

Unscramble the word: s e a s k y c p r r
Generated an exception: 'NoneType' object is not subscriptable
<openrouter::meta-llama/llama-3.2-3b-instruct:free>
The unscrambled word is: sketchy

<openrouter::google/gemini-flash-1.5-8b>
The unscrambled word is **CRYSPY**.


<openrouter::meta-llama/llama-3.2-1b-instruct:free>
The unscrambled word is: SPARKYCYES

<openrouter::mistralai/mistral-nemo>
The unscrambled word is "skyward".

<groq::llama3-groq-8b-8192-tool-use-preview>
The unscrambled word is: sky scraper

<groq::llama-3.1-8b-instant>
The unscrambled word is: space Krper is not possible, Scaper but I think the unscrambled word is: space karper incorrect, I believe the word is: space car 

However re arranged then this forms 'practise Sky and sea sparkey and skewers r in some words

<openrouter::meta-llama/llama-3.2-1b-instruct>
The unscrambled word is: PRESSCYAS

<groq::llama3-groq-70b-8192-tool-use-preview>
Let's see... I think unscrambled, it would be "sparkleyscpe".

<openrouter::

In [5]:
# display all text models' completions of a prompt
if demo:
	# hyperbolic likes to do this fun little thing where its endpoints just don't work at all
	good_text_models = [x for x in text_models if 'hyperbolic' not in x]
	prompt = 'As god-emperor, I will'
	results = text_parade(prompt, model=good_text_models, max_tokens=64, stream=False)

<openai::davinci-002>
 have my chance," Sha Forz said grimly. "I will win this war."

Heig Guang landed a titular blow at the emperor, but Fior thought better than to press the attack when the Dark Emperor would no doubt need to save his own strength to battle Sha Forz. Cei Har found himself embro

<fireworks::accounts/fireworks/models/llama-v3p1-8b-instruct>
 not be limited by such foolish rules. Besides... I see no need for the Empire to be bound by something as feeble as a treaty. The Empire must be free to do as it pleases, within reason, of course—"
"And to the benefit of the people, I'm sure," said Xarnag

<openai::babbage-002>
 literally bring the worlds to a halt - and breathe my last. Just before my death you will avenge Lycia by turning the demonic generals against him, and you will succeed within an hour. On the following morning, your body will be burnt, and the world subservient to my new regime. I have

<openai::gpt-3.5-turbo-instruct>
 never need the validation of others

In [6]:
# show suffix (aka FIM, or Fill-In-Middle) completion models

# [Because][ token][izers][ generally][ break][ down][ sentences][ like][ this], it's best to leave *no* spaces after a prompt, and *one* space before a suffix
if demo:
	prompt = 'birds, such'
	suffix = ' pay taxes'

	results = text_parade(prompt=prompt, model=fim_models, max_tokens=64, stream=False, suffix=suffix)

# note: for more complex FIM tasks, a higher max_tokens count may be necessary to allow the model to find a coherent bridge between prompt and suffix

<openai::gpt-3.5-turbo-instruct>
 as eagles, are ruled by the laws of nature and do not

<deepseek-beta::deepseek-coder>
 as the common swift, are known for their incredible endurance and ability to fly for extended periods without resting. However, it's important to note that these birds do not



In [7]:
# models have fuzzy matching, so you don't need to enter the whole name
# just a character subsequence such that the model is minimal among all models of its mode containing that subsequence
# e.g. "4o-mini" resolves to "openai::gpt-4o-mini", while "4o" resolves to "openai::gpt-4o"
if demo:
	for subseq in ['4o', '4o-mini', 'claude', 'claude-3.5']:
		r = resolve(subseq)
		print(f"{subseq} -> {', '.join('::'.join(x) for x in r)}")
	print()

	# an exception will be thrown if the resolution is ambiguous, as with 'cla'
	try:
		print(chat('hi!', model='cla'))  # raises exception
	except Exception as e:
		print(e)
	print(chat('hi!', model='claude-3.5-sonnet', temperature=0.0))  # works

	# of course, you can always just use the full name to avoid ambiguity
	print(chat('hi!', model='openrouter::anthropic/claude-3.5-sonnet', temperature=0.0))


4o -> openai::gpt-4o
4o-mini -> openai::gpt-4o-mini
claude -> openrouter::anthropic/claude-3.5-sonnet
claude-3.5 -> openrouter::anthropic/claude-3.5-sonnet

No unique chat model found for "cla". (Possible: openrouter::mistralai/codestral-mamba, openrouter::anthropic/claude-3.5-sonnet, openrouter::nousresearch/hermes-3-llama-3.1-405b)
Hello! How can I assist you today? Feel free to ask any questions or let me know if you need help with anything.
Hello! How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything.
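
The matching rule above can be sketched in a few lines. This is a guess at the behaviour, not the code behind `resolve`: it assumes "minimal" means that the chosen name is itself a subsequence of every other candidate, and it ignores providers and modes. The names `is_subsequence` and `resolve_sketch` are hypothetical.

```python
def is_subsequence(query, text):
    """True if the characters of `query` appear in `text` in order (not necessarily adjacent)."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator, so matches must occur left to right.
    return all(ch in remaining for ch in query)

def resolve_sketch(query, names):
    """Return the unique minimal name matching `query`, or raise LookupError."""
    candidates = [name for name in names if is_subsequence(query, name)]
    # Assumption: a candidate is "minimal" if it is a subsequence of every candidate.
    minimal = [c for c in candidates if all(is_subsequence(c, other) for other in candidates)]
    if len(minimal) != 1:
        raise LookupError(f'No unique model found for "{query}". (Possible: {", ".join(candidates)})')
    return minimal[0]

names = ["gpt-4o", "gpt-4o-2024-08-06", "gpt-4o-mini", "claude-3.5-sonnet", "codestral-mamba"]
resolved = {q: resolve_sketch(q, names) for q in ["4o", "4o-mini", "claude"]}

try:
    resolve_sketch("cla", names)
    ambiguous = False
except LookupError:
    ambiguous = True  # both claude-3.5-sonnet and codestral-mamba contain c, l, a in order
```

Under this assumption, "4o" picks `gpt-4o` because that name is contained in the other two matches, while "cla" fails because neither of its matches contains the other.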


In [8]:
# note: if you set stream=True, completion functions return a generator that yields the output chunk by chunk, rather than the finished text

if demo:
	question = 'where did the moon come from?'
	print(chat(question, model='gpt-4o-mini', stream=True))  # prints a generator object

	print('\n\n')

	# to actually stream the output, you'll need to iterate over the generator:
	# ```
	# 	for chunk in chat('hi!', model='gpt-4o-mini', stream=True):
	# 		print(chunk, end='')
	# ```
	# use print_stream instead of print and this will be done for you

	# kept inside the `if demo:` block, since `question` is only defined there
	print_stream(chat(question, model='gpt-4o-mini', stream=True))

<generator object completion.<locals>.gen at 0x11fb3d8b0>



The leading theory about the origin of the Moon is known as the "giant impact hypothesis." According to this theory, the Moon formed about 4.5 billion years ago as a result of a massive collision between the early Earth and a Mars-sized body often referred to as Theia. 

Here's a brief overview of the process:

1. **Giant Impact**: Theia collided with the young Earth at a high velocity. This massive impact would have produced a large amount of debris that was ejected into orbit around the Earth.

2. **Accretion**: Over time, this debris began to coalesce and cool, eventually forming what we know today as the Moon.

3. **Differentiation**: Following its formation, the Moon underwent a process of differentiation, where heavier materials sank to form a core, while lighter materials formed the crust.

This hypothesis accounts for several pieces of evidence, including the Moon's composition, which is similar to that of the Earth's
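
For readers who want to see what consuming such a generator involves, here is a small self-contained sketch. It assumes the generator yields plain string chunks, as the `print(chunk, end='')` loop above suggests; `fake_stream` and `print_stream_sketch` are hypothetical names, and the real `print_stream` may behave differently.

```python
import sys
from typing import Iterable, Iterator

def fake_stream(text: str, chunk_size: int = 4) -> Iterator[str]:
    """Stand-in for a streaming completion: yields the text a few characters at a time."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]

def print_stream_sketch(chunks: Iterable[str]) -> str:
    """Write each chunk as it arrives and return the full text once the stream ends."""
    collected = []
    for chunk in chunks:
        sys.stdout.write(chunk)
        sys.stdout.flush()  # flush so partial output shows up immediately
        collected.append(chunk)
    sys.stdout.write("\n")
    return "".join(collected)

full_text = print_stream_sketch(fake_stream("The Moon likely formed after a giant impact."))
```

Returning the joined text is a design choice for the sketch: a generator can only be consumed once, so anything printed this way is otherwise lost.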

In [9]:
# the StatefulChat class keeps track of the conversation state, and allows for saving and loading (as a string, with save_string(), load_string() or to a file, with save(path), load(path))
# chain the methods StatefulChat.system, StatefulChat.assistant, StatefulChat.user to add messages to the conversation
if demo:
	S = StatefulChat(model='gpt-4o-mini', echo=True, print_output=True)
	S.assistant("Knock knock!").user("Who's there?")

	print('<assistant> ', end='')
	S.next()

	S.user(re.sub(r'[!\.]?$', ' who?', S.last.content, count=1))  # "Xyz." -> "Xyz who?" (count passed by keyword; positional count is deprecated in recent Python)

	data = S.save_string()
	print('\n' + '=' * 10 + '\nAll parameters: ' + str(data) + '\n' + '=' * 10 + '\n')

	T = StatefulChat()
	T.load_string(data)

	print('<assistant> ', end='')
	T.next()

	print()
	# some convenient properties:
	print("First message object: ", T.first)
	print("Last message object: ", T.last)
	print("All messages: ", T.messages)

<assistant> Knock knock!
<user> Who's there?
<assistant> Lettuce.
<user> Lettuce who?

All parameters: {"message_history": [{"role": "assistant", "content": "Knock knock!"}, {"role": "user", "content": "Who's there?"}, {"content": "Lettuce.", "refusal": null, "role": "assistant", "function_call": null, "tool_calls": null}, {"role": "user", "content": "Lettuce who?"}], "apiParams": {"model": "gpt-4o-mini", "suffix": null, "max_tokens": null, "stream": null, "n": null, "logprobs": null, "top_logprobs": null, "logit_bias": null, "temperature": null, "presence_penalty": null, "frequency_penalty": null, "repetition_penalty": null, "top_p": null, "min_p": null, "top_k": null, "top_a": null, "tools": null, "tool_choice": null, "parallel_tool_calls": null, "grammar": null, "json_schema": null, "response_format": null, "seed": null}, "localParams": {"mode": "chat", "return_raw": true, "pretty_tool_calls": false, "provider": null, "force_model": null, "force_provider": null, "effect": null, "cal
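
The chaining and save/load behaviour shown above can be approximated with a very small class. This is only a sketch: `MiniChat` is a hypothetical name, it never calls a model, and it stores just the `message_history` and `apiParams.model` fields that appear in the saved string above.

```python
import json

class MiniChat:
    """Hypothetical, model-free imitation of the conversation-state part of StatefulChat."""

    def __init__(self, model=None):
        self.model = model
        self.messages = []

    def _add(self, role, content):
        self.messages.append({"role": role, "content": content})
        return self  # returning self is what makes .assistant(...).user(...) chainable

    def system(self, content):
        return self._add("system", content)

    def assistant(self, content):
        return self._add("assistant", content)

    def user(self, content):
        return self._add("user", content)

    @property
    def first(self):
        return self.messages[0] if self.messages else None

    @property
    def last(self):
        return self.messages[-1] if self.messages else None

    def save_string(self):
        # Serialize the whole state to a JSON string.
        return json.dumps({"message_history": self.messages, "apiParams": {"model": self.model}})

    def load_string(self, data):
        state = json.loads(data)
        self.messages = state["message_history"]
        self.model = state["apiParams"]["model"]
        return self

original = MiniChat(model="gpt-4o-mini").assistant("Knock knock!").user("Who's there?")
restored = MiniChat().load_string(original.save_string())
```

Because the state round-trips through a plain string, the restored object can continue the conversation from exactly where the original left off.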

In [10]:
# StatefulChat can also be used to give an LLM access to an autonomous tool mode
# - several tools are included; wrap them into a 'Toolbox' object to provide them to the LLM
# - tools.py contains some examples of custom tools
# the @gatekeep decorator can be used to have a second LLM block unsatisfactory queries
# - e.g. run_python is gatekept, and will automatically detect and refuse to run dangerous or bugged code
# sometimes the bot will escape symbols (like "\\n" instead of "\n"), which triggers an error (usually about an invalid line continuation);
# - this is rare enough that you can generally just rerun the cell

if demo:
	tool_prompt = "You are an AI agent capable of using a variety of tools for looking up information and executing scripts. You are currently using these tools in autonomous mode, where you can perform self-directed, in-depth research and analysis, as well as deploy metacognitive tools (e.g. ooda_planner, meditate) to effectively clarify, plan, and ideate. Autonomous mode will go on indefinitely until you use the `yield_control` tool in order to return input access back to the user."

	toolbox = Toolbox([
		run_python,  # runs python code and returns the output
		get_contents,  # fetches the plaintext contents of a URL
		exa_search,  # intelligent search for semantically relevant links
		google_search,  # ordinary google search
		meditate,  # a 'metacognitive' tool that allows the LLM to deepen its own thought process
		ooda_planner,  # another such tool that allows the LLM to orient itself to a situation and plan its next move
		ask_human,  # allows the LLM to ask the user a question and wait for a response
		yield_control  # allows the LLM to yield control to the user, so as to end its loop when its task is complete
	])

	model = 'gpt-4o'  # many bots can use the tool format provided by Toolbox

	question = "Is the current price of bitcoin, when rounded to the nearest whole number, prime?"

	Agent = StatefulChat(model=model, tools=toolbox, print_output=True)
	Agent.system(tool_prompt)
	Agent.user(question)
	Agent.run()


Calling google_search with arguments {'query': 'current Bitcoin price', 'num_results': '1'}
Calling run_python with arguments {'code': "import sympy\n\n# Given current Bitcoin price\nbtc_price = 62592\n\n# Check if it's prime\nis_prime = sympy.isprime(btc_price)\nis_prime"}
The current price of Bitcoin, when rounded to the nearest whole number ($62,592), is not a prime number.
Calling yield_control with arguments {'message': "Let me know if there's anything else you need!"}
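
The gatekeeping idea mentioned above (a second model vetting a tool call before it runs) can be illustrated without any LLM at all. In this sketch a plain function plays the judge; `gatekeep_sketch`, `naive_judge`, and `describe_code` are hypothetical names, and the real `@gatekeep` decorator may have a different signature.

```python
from functools import wraps

def gatekeep_sketch(judge):
    """`judge(arguments) -> (allowed, reason)` stands in for the second LLM."""
    def decorator(tool):
        @wraps(tool)
        def guarded(**arguments):
            allowed, reason = judge(arguments)
            if not allowed:
                # Return the refusal as the tool result so the calling model can react to it.
                return f"Refused: {reason}"
            return tool(**arguments)
        return guarded
    return decorator

def naive_judge(arguments):
    """Toy judge: reject code that mentions a few obviously risky names."""
    code = arguments.get("code", "")
    for marker in ("os.remove", "shutil.rmtree", "subprocess"):
        if marker in code:
            return False, f"code mentions {marker}"
    return True, "ok"

@gatekeep_sketch(naive_judge)
def describe_code(code):
    # Harmless stand-in for run_python: reports what it would do instead of executing.
    return f"would run {len(code.splitlines())} line(s)"

accepted = describe_code(code="print(1)\nprint(2)")
refused = describe_code(code="import subprocess")
```

A string match is of course far weaker than an LLM review; the point is only the control flow, where the judge sees the arguments first and the tool body runs only on approval.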


In [11]:
# you can also use these tools yourself
if demo:
	# get_contents parses the contents of a given URL and returns just the plaintext content
	# - <div class="note"><div class="parenthetical"><p>(without all the <a href="https://en.wikipedia.org/wiki/HTML">HTML</a> bloat)</p></div></div>
	url_1 = "https://pages.uoregon.edu/munno/OregonCourses/REL444S05/HuinengVerse.htm"
	formatter = lambda results: '\n'.join(results["data"]["content"].split('\n\n')[:5])  # just print the first five paragraphs (blank-line-separated blocks), since it's a long website
	results = get_contents(url_1)
	# output schema: { 'code': int, 'status': int, 'data': { 'title': str, 'description': str, 'url': str, 'content': str, 'usage': { 'tokens': int } } }
	print(formatter(results))
	print()

	# exa_search finds websites related to a given URL
	url_2 = "https://en.wikipedia.org/wiki/Causal_map"
	results = exa_search(url_2)
	# output schema: [ { 'url': str, 'title': str, 'date': str, 'snippet': list[str] } ]
	print(f"Sites related to {url_2}: ")
	for idx, result in enumerate(results):
		print(f"{idx}. {result['title']}: \"{result['snippet'][0]}\"")

_Notes on the Verses by Shen-hsiu and Hui-neng_
According to the _Platform Sutra:_
Hung-jen, the Fifth Patriarch, the Enlightened Master
Shen-hsiu, the Learned Senior Monk, experienced in gradual meditation
Hui-neng, the illiterate woodcutter from the barbarian south, suddenly enlightened

Sites related to https://en.wikipedia.org/wiki/Causal_map: 
0. Causal graph - Wikipedia: "Modern developments have extended graphical models to non-parametric analysis, and thus achieved a generality and flexibility that has transformed causal analysis in computer science, epidemiology,   and social science."
1. Causal map - Wikipedia: "As tools to form and represent a consensus of expert views on “what causes what” in a subject area"
2. Causal model - Wikipedia: "In metaphysics, a causal model (or structural causal model) is a conceptual model that describes the causal mechanisms of a system."
3. t-distributed stochastic neighbor embedding - Wikipedia: "t-SNE has been used for visualization in a wid

In [14]:
# you can also define your own tools by directly writing them as Python functions
# so long as they're type-hinted and have a docstring of the appropriate format, the @Tool decorator will automatically wrap them into a format suitable for use with an LLM
if demo:
	# example tool: get the weather for a given city
	# not hooked up to a weather API, though, so it just gives a hardcoded answer
	@Tool
	def get_weather(city: str) -> Dict[str, Union[float, str]]:
		"""
		Get today's weather in a given city, in °F and mph.
		:param city: The name of the city.
		:returns: A dictionary containing the current temperature (°F), weather condition, and wind speed (mph) in the given city.
		"""
		return {
			'temperature': 34.0,
			'condition': 'severe thunderstorm',
			'wind_speed': 48.0
		}

	query_params = Dict({"activity": "go hang gliding", "city": "Denver"})
	weather_query = f"I want to {query_params.activity} in {query_params.city} today. Is there anything I should know about the weather?"
	weather_tools = Toolbox([get_weather])
	weather_model = "gpt-4o"

	WeatherAdvisor = StatefulChat(model=weather_model, tools=weather_tools, print_output=True)
	WeatherAdvisor.user(weather_query)
	WeatherAdvisor.next()
	print()

Calling get_weather with arguments {'city': 'Denver'}
Today's weather in Denver includes a temperature of 34°F with severe thunderstorms and wind speeds of 48 mph. These conditions are quite hazardous for hang gliding, so it's advisable not to proceed with it today.
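
As a rough illustration of what a decorator like `@Tool` has to do, the sketch below turns a type-hinted, docstring-annotated function into a tool description. It assumes an OpenAI-style function-calling layout and the `:param name:` docstring convention used above; `tool_schema_sketch` is a hypothetical helper, not the library's code, and it treats every parameter as required.

```python
import inspect
import re
from typing import get_type_hints

# Mapping from simple Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema_sketch(func):
    """Build a function-calling style description from type hints and the docstring."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    # Collect ":param name: description" lines from the docstring.
    param_docs = dict(re.findall(r":param (\w+):\s*(.+)", doc))
    # Everything before the first field marker is treated as the summary.
    summary = doc.split(":param")[0].split(":returns")[0].strip()
    properties = {
        name: {
            "type": JSON_TYPES.get(hints.get(name), "string"),  # fall back to string
            "description": param_docs.get(name, ""),
        }
        for name in inspect.signature(func).parameters
    }
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": summary,
            "parameters": {"type": "object", "properties": properties, "required": list(properties)},
        },
    }

def get_weather(city: str) -> dict:
    """
    Get today's weather in a given city.
    :param city: The name of the city.
    :returns: A dictionary with temperature, condition, and wind speed.
    """
    return {"temperature": 34.0, "condition": "severe thunderstorm", "wind_speed": 48.0}

schema = tool_schema_sketch(get_weather)
```

This is why the type hints and docstring format matter: they are the only source the wrapper has for parameter types and descriptions.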



In [22]:
# Structured Outputs (GPT-4o, 4o-mini)
# ensures that the completion is always a valid JSON object with a schema that you define
# Enable by passing a dict "response_format" = {"type": "json_schema", "json_schema": (your schema)}. In this schema,
# - additionalProperties must be false, strict must be true, and required must contain all properties
# - the types string, number, integer, boolean, object, array, null, and enum are supported
# - union types are supported, and are represented as arrays of types; in particular, you can make a param 'optional' by making its type ["string", "null"]
# - references are supported, and can point to either subschemas ({"$ref": "#/$defs/property"}) or recurse ({"$ref": "#"})

# Example: Schema for answering a question
if demo:
    import json
    with open('data/structured_outputs/answer_question.json') as file:
        answer_schema = json.load(file)
    # the json object will be returned as a string (serialized), so json.loads should be used to turn it into a schema-compatible dictionary
    # streaming is incompatible with the Structured Outputs feature, so it might take a bit for the finished answer to appear
    answer_creator = lambda q: json.loads(chat_completion(q, {"model": "gpt-4o", "response_format": {"type": "json_schema", "json_schema": answer_schema}}))
    my_question = "What actually happens when a gamma ray causes a mutation?"

    for k, v in answer_creator(my_question).items():
        print(f"{k}: {v}\n")

thoughts: Thinking about the process and pathway through which gamma rays induce mutations at a cellular level. This requires knowledge of physics (radiation impact) and biology (DNA structure and mutations).

explanation: Gamma rays are a form of high-energy electromagnetic radiation that can cause mutations by ionizing molecules in a cell. When gamma rays interact with the molecules inside a cell, they can remove tightly bound electrons from the orbit of an atom, causing molecules to become ionized. This ionization can lead to a series of events that result in mutations:

1. **Direct Damage to DNA**:
   - Gamma rays can directly affect DNA by breaking the chemical bonds within it, leading to single-strand or double-strand breaks in the DNA helix. 
   - If the DNA is not repaired correctly, mutations can occur, potentially altering the genetic code.

2. **Production of Free Radicals**:
   - Gamma rays can also produce free radicals (highly reactive atoms or molecules) by interacting w
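
For reference, a schema that follows the rules listed in the cell above might look like the sketch below. The contents of `data/structured_outputs/answer_question.json` are not shown here, so this is a hypothetical schema: the `thoughts` and `explanation` keys are taken from the output above, while the `confidence` field and the `check_strict_rules` helper are invented for illustration. The helper only checks the top-level object; nested objects would need the same treatment.

```python
answer_schema_sketch = {
    "name": "answer_question",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "thoughts": {"type": "string"},
            "explanation": {"type": "string"},
            # Hypothetical "optional" field: a union with null, but still listed in `required`.
            "confidence": {"type": ["number", "null"]},
        },
        "required": ["thoughts", "explanation", "confidence"],
        "additionalProperties": False,
    },
}

def check_strict_rules(wrapper):
    """Check the three top-level rules: strict, no extra properties, all properties required."""
    schema = wrapper["schema"]
    return (
        wrapper.get("strict") is True
        and schema.get("additionalProperties") is False
        and set(schema["required"]) == set(schema["properties"])
    )

is_valid = check_strict_rules(answer_schema_sketch)

# Dropping a property from `required` breaks the rules.
broken = {**answer_schema_sketch, "schema": {**answer_schema_sketch["schema"], "required": ["thoughts"]}}
is_broken_valid = check_strict_rules(broken)
```

Running a check like this before sending a request can catch the most common schema mistakes locally.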