In [1]:
from __future__ import annotations
from src import *
demo = True

In [3]:
# `distribute` is a decorator factory that lets a function run multithreaded across every element of a list of input arguments
# this is extremely useful here: API calls spend almost all of their time waiting on a network request, so `threads=25` can give up to a ~25x speedup on groups of 25 or more calls
# for the purposes of the demo, we'll define a "parade" decorator that just prints all model outputs in the order they arrive
# note: `distribute` is wrapped by the `defer_kwargs` meta-decorator, which allows its factory kwargs to be specified as _distribute_kwarg in a decorated function call
async_parade = distribute(threads=25, after=lambda **x: print(f'<{x["model"]}>\n{x["value"]}\n'), exclude=["messages"])
@async_parade
def chat_parade(messages, **kwargs) -> Any:
	return chat_completion(messages, (kwargs.get("options", {}) | kwargs))
@async_parade
def text_parade(prompt, **kwargs) -> Any:
	return text_completion(prompt, (kwargs.get("options", {}) | kwargs))

async_normal = distribute(threads=25, after=lambda **x: x["value"], exclude=["messages"])
@async_normal
def chat(messages, **kwargs) -> Any:
	return chat_completion(messages, (kwargs.get("options", {}) | kwargs))
@async_normal
def text(prompt, **kwargs) -> Any:
	return text_completion(prompt, (kwargs.get("options", {}) | kwargs))

chat_models = models_by_mode.chat.keys()
text_models = models_by_mode.text.keys()
fim_models = models_by_mode.text.filter(lambda _, v: 'suffix' in v.get('parameters', [])).keys()
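
To make the idea behind `distribute` concrete, here is a minimal sketch of a decorator factory that fans a call out over a list-valued keyword argument using a thread pool. It is not the implementation in `src`: `distribute_sketch` and `fake_completion` are hypothetical names, the `exclude` and `defer_kwargs` behavior is left out, and results come back in input order rather than arrival order.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps
from typing import Any, Callable, Optional


def distribute_sketch(threads: int = 25, after: Optional[Callable[..., Any]] = None):
    """Hypothetical stand-in for `distribute`: fan a call out over a list-valued kwarg."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # Pick the first keyword argument whose value is a list; that is what we fan out over.
            list_key = next((k for k, v in kwargs.items() if isinstance(v, list)), None)
            if list_key is None:
                return fn(*args, **kwargs)  # nothing to distribute, so make a plain call

            def run_one(item):
                call_kwargs = {**kwargs, list_key: item}
                value = fn(*args, **call_kwargs)
                # `after` receives the result plus the kwargs of this particular call.
                return after(value=value, **call_kwargs) if after else value

            # Threads fit this workload because each call mostly waits on the network.
            with ThreadPoolExecutor(max_workers=threads) as pool:
                return list(pool.map(run_one, kwargs[list_key]))
        return wrapper
    return decorator


@distribute_sketch(threads=4, after=lambda **x: f'<{x["model"]}> {x["value"]}')
def fake_completion(prompt, model=None):
    # Stand-in for a network call; a real completion function would go here.
    return prompt.upper()


results = fake_completion("hi", model=["a", "b"])
print(results)
```

Since `pool.map` preserves input order, this sketch is closer to the `async_normal` variants than to the parade, which prints outputs as they arrive.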

In [8]:
# show models for each mode, with form `{provider}::{id}`
# only models belonging to providers for which a key is set in secrets.json will be shown
tab = '\n    '
if demo:
	print(f'Chat ({len(chat_models)}):{tab}{tab.join(chat_models)}')
	print(f'\nText ({len(text_models)}):{tab}{tab.join(text_models)}')
	print(f'\nFill-in-middle ({len(fim_models)}):{tab}{tab.join(fim_models)}')

Chat (40):
    deepseek-beta::deepseek-coder
    openai::gpt-4o
    openai::gpt-4o-2024-08-06
    openai::gpt-4o-mini
    gemini::gemini-1.5-flash-latest
    gemini::gemini-1.5-flash-8b
    gemini::gemini-1.5-pro-latest
    gemini::gemini-exp-1206
    gemini::gemini-2.0-flash-exp
    gemini::gemini-2.0-flash-thinking-exp
    openrouter::qwen/qwen-2.5-coder-32b-instruct
    openrouter::openai/gpt-4o
    openrouter::openai/gpt-4o-2024-08-06
    openrouter::openai/gpt-4o-mini
    openrouter::anthropic/claude-3.5-haiku
    openrouter::anthropic/claude-3.5-sonnet
    openrouter::x-ai/grok-beta
    openrouter::google/gemini-flash-1.5
    openrouter::google/gemini-pro-1.5
    openrouter::google/gemini-flash-1.5-8b
    openrouter::qwen/qwen-2.5-72b-instruct
    openrouter::mistralai/mistral-large
    openrouter::mistralai/codestral-mamba
    openrouter::meta-llama/llama-3.2-90b-vision-instruct
    openrouter::meta-llama/llama-3.2-11b-vision-instruct
    openrouter::meta-llama/llama-3.2-11b-vis

In [9]:
# display all chat models' responses to a query
if demo:
	import random
	secret_word = 'sapphire'
	scrambled = random.sample(secret_word, len(secret_word))
	query = f"Unscramble the word: {' '.join(scrambled)}"
	print(query)
	results = chat_parade(query, model=chat_models, max_tokens=64, stream=False)

Unscramble the word: r i p s p e h a


<gemini::gemini-1.5-flash-8b>
The unscrambled word is **peshrips**.


<gemini::gemini-2.0-flash-exp>
The unscrambled word is **sapphire**.


<openai::gpt-4o>
The unscrambled word is "phraseip." However, by rearranging the letters without repeating, a valid word is "sharpie."

<gemini::gemini-1.5-pro-latest>
harpsires


<gemini::gemini-exp-1206>
The unscrambled word is **whisper**.


<openrouter::mistralai/codestral-mamba>
The unscrambled word is "peach".

<openai::gpt-4o-mini>
The unscrambled word is "hippers." However, it seems like it could also be "sharpen," so please clarify if you meant a different arrangement or context!

<gemini::gemini-2.0-flash-thinking-exp>
Here's my thinking process to unscramble "r i p s p e h a":

1. **Identify the letters:** I note down all the letters present: r, i, p, s, p, e, h, a.

2. **Count the letters:**  There are 

<openai::gpt-4o-2024-08-06>
The unscrambled word is "sharpeis."

<openrouter::openai/gpt-4o-2024-08-06>
The unscrambled word is "happ

In [10]:
# display all text models' responses to a prompt
if demo:
	# hyperbolic likes to do this fun little thing where its endpoints just don't work at all
	good_text_models = [x for x in text_models if 'hyperbolic' not in x]
	prompt = 'As god-emperor, I will mandate'
	results = text_parade(prompt, model=good_text_models, max_tokens=64, stream=False)

<openai::davinci-002>
 all school children recite the blasphemies of XXXL69 with the same fervor as their forefathers drank Moloch's HooDooKu.

Because one man's blasphemy is another man's art.


<openai::babbage-002>
 her coronation otherwise she'll have prolonged frostbite or heinous crime

Religious liberty cannot be outraged and government will not permit stealing of tax that belongs to citizens, srithwayta or be forced to bury enemy alive..

Info about this Market you should add nothing, is about located in low MSA (not

<fireworks::accounts/fireworks/models/mixtral-8x7b-instruct>
 the construction of a colossal engineering project, one that spans continents and links the world in unity. When your great-great-great-grandchildren walk upon the golden pavilion I erect today, they will remember not only my name, but the visionary who first laid out

<fireworks::accounts/fireworks/models/mixtral-8x22b-instruct>
 that the letter "z" be pronounced "zed" instead of "zee".

When I become 

In [11]:
# show suffix (aka FIM, or Fill-In-Middle) completion models

# [Because][ token][izers][ generally][ break][ down][ sentences][ like][ this], it's best to leave *no* spaces after a prompt, and *one* space before a suffix
if demo:
	prompt = 'birds, such'
	suffix = ' pay taxes'

	results = text_parade(prompt=prompt, model=fim_models, max_tokens=64, stream=False, suffix=suffix)

# note: for more complex FIM tasks, a higher max_tokens count may be necessary to allow the model to find a coherent bridge between prompt and suffix

<openai::gpt-3.5-turbo-instruct>
 as crows, that die are not required to

<deepseek-beta::deepseek-coder>
 as the robin, do not
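
The spacing advice above can be turned into a quick pre-flight check. This is only a sketch of the convention described in the comment (no trailing whitespace on the prompt, exactly one leading space on the suffix); `fim_spacing_issues` is a hypothetical helper, not part of `src`.

```python
def fim_spacing_issues(prompt: str, suffix: str) -> list:
    """Return a list of spacing problems for a fill-in-middle prompt/suffix pair."""
    issues = []
    if prompt != prompt.rstrip():
        issues.append("prompt ends with whitespace")
    if not suffix.startswith(" ") or suffix.startswith("  "):
        issues.append("suffix should start with exactly one space")
    return issues


good = fim_spacing_issues('birds, such', ' pay taxes')
bad = fim_spacing_issues('birds, such ', 'pay taxes')
print(good, bad)
```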



In [12]:
# models have fuzzy matching, so you don't need to enter the whole name
# just a character subsequence such that the model is minimal among all models of its mode containing that subsequence
# e.g. "4o-mini" resolves to "openai::gpt-4o-mini", while "4o" resolves to "openai::gpt-4o"
if demo:
	for subseq in ['4o', '4o-mini', 'claude', 'claude-3.5']:
		r = resolve(subseq)
		print(f"{subseq} -> {', '.join('::'.join(x) for x in r)}")
	print()

	# an exception will be thrown if the resolution is ambiguous, as with 'cla'
	try:
		print(chat('hi!', model='cla'))  # raises exception
	except Exception as e:
		print(e)
	print(chat('hi!', model='claude-3.5-sonnet', temperature=0.0))  # works

	# of course, you can always just use the full name to avoid ambiguity
	print(chat('hi!', model='openrouter::anthropic/claude-3.5-sonnet', temperature=0.0))


4o -> openai::gpt-4o
4o-mini -> openai::gpt-4o-mini
claude -> openrouter::anthropic/claude-3.5-haiku, openrouter::anthropic/claude-3.5-sonnet
claude-3.5 -> openrouter::anthropic/claude-3.5-haiku, openrouter::anthropic/claude-3.5-sonnet

No unique chat model found for "cla". (Possible: openrouter::mistralai/codestral-mamba, openrouter::anthropic/claude-3.5-haiku, openrouter::anthropic/claude-3.5-sonnet)
Hello! How can I help you today?
Hello! How can I help you today?
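
One way to read the matching rule above is as follows: collect every model whose name contains the query as a character subsequence, then keep only the matches that have no other match as a proper subsequence of themselves. The sketch below implements that reading on bare model ids. It is an assumption about what "minimal" means, not the actual `resolve` from `src`, and `resolve_sketch` is a hypothetical name.

```python
def is_subsequence(sub: str, s: str) -> bool:
    """True if the characters of `sub` appear in `s` in order (not necessarily adjacent)."""
    it = iter(s)
    return all(ch in it for ch in sub)


def resolve_sketch(query: str, names: list) -> list:
    """Return the matches for `query` that are minimal under the subsequence ordering."""
    matches = [n for n in names if is_subsequence(query, n)]
    return [n for n in matches
            if not any(m != n and is_subsequence(m, n) for m in matches)]


names = [
    'gpt-4o', 'gpt-4o-mini', 'gpt-4o-2024-08-06', 'openai/gpt-4o',
    'anthropic/claude-3.5-haiku', 'anthropic/claude-3.5-sonnet',
    'mistralai/codestral-mamba',
]
for q in ['4o', '4o-mini', 'claude', 'cla']:
    print(q, '->', resolve_sketch(q, names))
```

Under this reading, `'cla'` stays ambiguous because it is also a subsequence of `codestral-mamba`, which is consistent with the error message shown above.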


In [5]:
# note: if you set stream=True, completion functions return a generator that yields chunks of text, rather than the finished text

if demo:
	question = 'where did the moon come from?'
	print(chat(question, model='gpt-4o-mini', stream=True))  # prints a generator object

	print('\n\n')

	# to actually stream the output, you'll need to iterate over the generator:
	# ```
	# 	for chunk in chat('hi!', model='gpt-4o-mini', stream=True):
	# 		print(chunk, end='')
	# ```
	# use print_stream instead of print and this will be done for you

	# (kept inside the `if demo:` block, since `question` is only defined when demo is True)
	print_stream(chat(question, model='gpt-4o-mini', stream=True))

<generator object completion.<locals>.gen at 0x121d9e440>



The leading theory about the origin of the Moon is known as the "giant impact hypothesis." According to this hypothesis, the Moon was formed about 4.5 billion years ago as a result of a colossal impact between the early Earth and a Mars-sized body, often referred to as Theia. 

Here's a brief outline of the process:

1. **The Impact**: Theia collided with the early Earth, creating a significant amount of debris that was ejected into orbit around the Earth.

2. **Debris Accumulation**: This debris gradually coalesced due to gravitational attraction, forming a disk around the Earth.

3. **Formation of the Moon**: Over time, this material accumulated into a single body, which eventually became the Moon.

This theory is supported by various lines of evidence, including the similarities in isotopic compositions between Earth rocks and Moon rocks, as well as computer simulations that demonstrate how such a collision could lead to t
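
The generator behavior described in this cell can be reproduced without any API call. The sketch below uses a hypothetical `fake_stream` in place of a streaming completion and a hypothetical `print_stream_sketch` in place of `print_stream`. Returning the joined text is a choice made for this sketch; whether the real `print_stream` returns anything is not shown here.

```python
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Hypothetical stand-in for a streaming completion: yields the text in small chunks."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def print_stream_sketch(stream: Iterator[str]) -> str:
    """Print chunks as they arrive and return the concatenated text."""
    pieces = []
    for chunk in stream:
        print(chunk, end='', flush=True)
        pieces.append(chunk)
    print()
    return ''.join(pieces)


gen = fake_stream("The Moon likely formed after a giant impact.")
full_text = print_stream_sketch(gen)
leftover = list(gen)  # a generator can only be consumed once
```

Note that the generator is exhausted after one pass, so a stream that has already been printed cannot be iterated again.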

In [7]:
# the StatefulChat class keeps track of the conversation state, and supports saving and loading it, either as a string (save_string(), load_string()) or as a file (save(path), load(path))
# chain the methods StatefulChat.system, StatefulChat.assistant, StatefulChat.user to add messages to the conversation
if demo:
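	import re  # used for the "who?" substitution below; imported explicitly in case `from src import *` doesn't export it
	S = StatefulChat(model='gpt-4o-mini', echo=True, print_output=True)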
	S.assistant("Knock knock!").user("Who's there?")

	print('<assistant> ', end='')
	S.next()

	S.user(re.sub(r'[!\.]?$', ' who?', S.last.content, count=1))  # "Xyz." -> "Xyz who?"

	data = S.save_string()
	print('\n' + '=' * 10 + '\nAll parameters: ' + str(data) + '\n' + '=' * 10 + '\n')

	T = StatefulChat()
	T.load_string(data)

	print('<assistant> ', end='')
	T.next()

	print()
	# some convenient properties:
	print("First message object: ", T.first)
	print("Last message object: ", T.last)
	print("All messages: ", T.messages)

<assistant> Knock knock!
<user> Who's there?
<assistant> Lettuce.
<user> Lettuce who?

All parameters: {"message_history": [{"role": "assistant", "content": "Knock knock!"}, {"role": "user", "content": "Who's there?"}, {"content": "Lettuce.", "refusal": null, "role": "assistant", "audio": null, "function_call": null, "tool_calls": null}, {"role": "user", "content": "Lettuce who?"}], "api_params": {"model": "gpt-4o-mini", "suffix": null, "max_tokens": null, "stream": null, "n": null, "logprobs": null, "top_logprobs": null, "logit_bias": null, "temperature": null, "presence_penalty": null, "frequency_penalty": null, "repetition_penalty": null, "top_p": null, "min_p": null, "top_k": null, "top_a": null, "tools": null, "tool_choice": null, "parallel_tool_calls": null, "grammar": null, "json_schema": null, "response_format": null, "seed": null}, "local_params": {"mode": "chat", "return_raw": true, "pretty_tool_calls": false, "provider": null, "force_model": null, "force_provider": null, "ef
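
The chaining and save/load pattern used above can be sketched without a model. `MiniChat` below is a hypothetical, stripped-down container, not the real `StatefulChat`. It only borrows the `message_history` and `api_params` keys visible in the saved string above, and it has no `next()`, since that would require an API call.

```python
import json


class MiniChat:
    """Hypothetical, model-free sketch of a chainable conversation container."""

    def __init__(self, **api_params):
        self.messages = []
        self.api_params = api_params

    def _add(self, role, content):
        self.messages.append({"role": role, "content": content})
        return self  # returning self is what makes method chaining possible

    def system(self, content):
        return self._add("system", content)

    def user(self, content):
        return self._add("user", content)

    def assistant(self, content):
        return self._add("assistant", content)

    @property
    def first(self):
        return self.messages[0]

    @property
    def last(self):
        return self.messages[-1]

    def save_string(self) -> str:
        # Serialize both the history and the parameters so a fresh object can resume.
        return json.dumps({"message_history": self.messages, "api_params": self.api_params})

    def load_string(self, data: str):
        state = json.loads(data)
        self.messages = state["message_history"]
        self.api_params = state["api_params"]
        return self


S = MiniChat(model="gpt-4o-mini").assistant("Knock knock!").user("Who's there?")
T = MiniChat().load_string(S.save_string())
print(T.first, T.last)
```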

In [8]:
# StatefulChat can also be used to give an LLM access to an autonomous tool mode
# - several tools are included; wrap them into a 'Toolbox' object to provide them to the LLM
# - tools.py contains some examples of custom tools
# the @gatekeep decorator can be used to have a second LLM block unsatisfactory queries
# - e.g. run_python is gatekept, and will automatically detect and refuse to run dangerous or buggy code
# sometimes the bot will over-escape symbols (like "\\n" instead of "\n"), which triggers an error (usually about an invalid line continuation)
# - this is rare enough that you can generally just rerun the cell

if demo:
	tool_prompt = "You are an AI agent capable of using a variety of tools for looking up information and executing scripts. You are currently using these tools in autonomous mode, where you can perform self-directed, in-depth research and analysis, as well as deploy metacognitive tools (e.g. ooda_planner, meditate) to effectively clarify, plan, and ideate. Autonomous mode will go on indefinitely until you use the `yield_control` tool in order to return input access back to the user."

	toolbox = Toolbox([
		run_python,  # runs python code and returns the output
		get_contents,  # fetches the plaintext contents of a URL
		exa_search,  # intelligent search for semantically relevant links
		google_search,  # ordinary google search
		meditate,  # a 'metacognitive' tool that allows the LLM to deepen its own thought process
		ooda_planner,  # another such tool that allows the LLM to orient itself to a situation and plan its next move
		ask_human,  # allows the LLM to ask the user a question and wait for a response
		yield_control  # allows the LLM to yield control to the user, so as to end its loop when its task is complete
	])

	model = 'gpt-4o'  # many bots can use the tool format provided by Toolbox

	question = "Is the current price of bitcoin, when rounded to the nearest whole number, prime?"

	Agent = StatefulChat(model=model, tools=toolbox, print_output=True)
	Agent.system(tool_prompt)
	Agent.user(question)
	Agent.run()


{"id": "call_YR9DtUvZvVaMjWJ3gRXUFwHt", "function": "google_search", "arguments": {"query": "current Bitcoin price", "num_results": "1"}}
Calling google_search with arguments {'query': 'current Bitcoin price', 'num_results': '1'}
{"id": "call_pXNHhu4t1nU4Y4LN9PT7hFOh", "function": "run_python", "arguments": {"code": "import sympy\n\n# Current Bitcoin price\nbitcoin_price = 99310\n\n# Check if the price is a prime number\nis_prime = sympy.isprime(bitcoin_price)\n\nis_prime"}}
Calling run_python with arguments {'code': 'import sympy\n\n# Current Bitcoin price\nbitcoin_price = 99310\n\n# Check if the price is a prime number\nis_prime = sympy.isprime(bitcoin_price)\n\nis_prime'}
{"id": "call_VW1JI3vSVG3YQwp93m5Ug3SF", "function": "yield_control", "arguments": {"message": "The current price of Bitcoin, when rounded to the nearest whole number ($99,310), is not a prime number."}}
Calling yield_control with arguments {'message': 'The current price of Bitcoin, when rounded to the nearest whole
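
The control flow of autonomous mode (keep executing tool calls until `yield_control` appears) can be sketched with a scripted list of calls standing in for the model. `run_agent_sketch` is hypothetical and much simpler than whatever `Agent.run()` does. In a real loop each call would come from a fresh completion that has seen the previous tool results. The call dictionaries reuse the `function`/`arguments` shape printed above.

```python
def run_agent_sketch(scripted_calls, tools):
    """Execute tool calls in order until `yield_control` is called.

    `scripted_calls` stands in for the model's successive tool calls.
    """
    transcript = []
    for call in scripted_calls:
        name, arguments = call["function"], call["arguments"]
        if name == "yield_control":
            # The loop ends here and control returns to the user.
            return arguments["message"], transcript
        result = tools[name](**arguments)
        transcript.append((name, result))
    raise RuntimeError("the model never yielded control")


def is_prime(n: int) -> bool:
    """Trial division; fine for numbers of this size."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


tools = {"is_prime": is_prime}
calls = [
    {"function": "is_prime", "arguments": {"n": 99310}},
    {"function": "yield_control", "arguments": {"message": "99310 is not prime."}},
]
message, transcript = run_agent_sketch(calls, tools)
print(message, transcript)
```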

In [9]:
# you can also use these tools yourself
if demo:
	# get_contents parses the contents of a given URL and returns just the plaintext content
	# - <div class="note"><div class="parenthetical"><p>(without all the <a href="https://en.wikipedia.org/wiki/HTML">HTML</a> bloat)</p></div></div>
	url_1 = "https://pages.uoregon.edu/munno/OregonCourses/REL444S05/HuinengVerse.htm"
	formatter = lambda results: '\n'.join(results["data"]["content"].split('\n\n')[:5])  # just print the first five lines, since it's a long website
	results = get_contents(url_1)
	# output schema: { 'code': int, 'status': int, 'data': { 'title': str, 'description': str, 'url': str, 'content': str, 'usage': { 'tokens': int } } }
	print(formatter(results))
	print()

	# exa_search finds websites related to a given URL
	url_2 = "https://en.wikipedia.org/wiki/Causal_map"
	results = exa_search(url_2)
	# output schema: [ { 'url': str, 'title': str, 'date': str, 'snippet': list[str] } ]
	print(f"Sites related to {url_2}: ")
	for idx, result in enumerate(results):
		print(f"{idx}. {result['title']}: \"{result['snippet'][0]}\"")

_Notes on the Verses by Shen-hsiu and Hui-neng_
According to the _Platform Sutra:_
Hung-jen, the Fifth Patriarch, the Enlightened Master
Shen-hsiu, the Learned Senior Monk, experienced in gradual meditation
Hui-neng, the illiterate woodcutter from the barbarian south, suddenly enlightened

Sites related to https://en.wikipedia.org/wiki/Causal_map: 
0. Causal map: "In this sense, causal maps can be seen as a type of concept map."
1. Causal graph: "They are complementary to other forms of causal reasoning, for instance using causal equality notation ."
2. Causal model: "Causal models have found applications in signal processing , epidemiology and machine learning ."
3. List of causal mapping software: "From Wikipedia, the free encyclopedia This is a list of causal mapping software ."
4. Convergent cross mapping: "Since and belong to the same dynamical system, their reconstructions via embeddings and , also map to the same system."
5. Exploratory causal analysis: "Data collected in observ

In [10]:
# you can also define your own tools by directly writing them as Python functions
# so long as they're type-hinted and have a docstring of the appropriate format, the @Tool decorator will automatically wrap them into a format suitable for use with an LLM
if demo:
	# example tool: get the weather for a given city
	# not hooked up to a weather API, though, so it just gives a hardcoded answer
	@Tool
	def get_weather(city: str) -> dict[str, float | str]:
		"""
		Get today's weather in a given city, in °F and mph.
		:param city: The name of the city.
		:returns: A dictionary containing the current temperature (°F), weather condition, and wind speed (mph) in the given city.
		"""
		return {
			'temperature': 34.0,
			'condition': 'severe thunderstorm',
			'wind_speed': 48.0
		}

	query_params = Dict({"activity": "go hang gliding", "city": "Denver"})
	weather_query = f"I want to {query_params.activity} in {query_params.city} today. Is there anything I should know about the weather?"
	weather_tools = Toolbox([get_weather])
	weather_model = "gpt-4o"

	WeatherAdvisor = StatefulChat(model=weather_model, tools=weather_tools, print_output=True)
	WeatherAdvisor.user(weather_query)
	WeatherAdvisor.next()
	print()

{"id": "call_FgvsvsZUpvDB35Qv6CRFIcuP", "function": "get_weather", "arguments": {"city": "Denver"}}
Calling get_weather with arguments {'city': 'Denver'}
The weather in Denver today includes a severe thunderstorm with a temperature of 34°F and strong winds at 48 mph. It's advisable to postpone hang gliding due to the severe weather conditions. Stay safe!
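
To illustrate how type hints and a `:param name:` docstring carry enough information to describe a tool, here is a rough sketch that builds a JSON-style description from a plain function. `tool_schema_sketch` is hypothetical and is not how `@Tool` is implemented. It handles only a few scalar types and marks every parameter as required.

```python
import inspect
import re
from typing import get_type_hints

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema_sketch(fn) -> dict:
    """Build a tool description from a function's signature, type hints, and docstring."""
    hints = get_type_hints(fn)
    doc = inspect.getdoc(fn) or ""
    # Lines of the form ":param name: description" document each parameter.
    param_docs = dict(re.findall(r":param (\w+):\s*(.+)", doc))
    # Everything before the first ":param" or ":returns" line is the tool description.
    description = re.split(r"^:(?:param|returns)", doc, maxsplit=1, flags=re.M)[0].strip()
    properties = {
        name: {"type": JSON_TYPES.get(hints.get(name), "string"),
               "description": param_docs.get(name, "")}
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "description": description,
        "parameters": {"type": "object", "properties": properties,
                       "required": list(properties)},
    }


def get_weather(city: str) -> dict:
    """
    Get today's weather in a given city.
    :param city: The name of the city.
    :returns: A dictionary with temperature, condition, and wind speed.
    """
    return {"temperature": 34.0, "condition": "severe thunderstorm", "wind_speed": 48.0}


schema = tool_schema_sketch(get_weather)
print(schema)
```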



In [13]:
# the creativity ceiling for tool use is very high
# here, we create a tool that asks GPT-4o to supply a suffix for an incomplete text, essentially turning it into a text completion model
# but the model still has GPT's "persona", so the output isn't formed by dreaming but by intentional construction
# for instance, it will still be very safe and disapproving of gasoline consumption
import json  # needed by tool_completion below

@Tool
def make_completion(suffix: str) -> str:
    """
    Completes an incomplete text input by appending a suffix.
    :param suffix: A suffix to append to the input text.
    :returns: The completed text.
    """
    pass

toolbox = Toolbox([make_completion])

def tool_completion(prompt: str) -> str:
    calls = chat_completion([
        {'role': 'assistant', 'content': prompt}],
        {'temperature': 0.5, 'stream': False, 'model': 'gpt-4o',
         'tools': toolbox.gen_schema(), 'tool_choice': 'make_completion'}
    )
    return json.loads(calls.split('\n')[0])['arguments']['suffix']

if demo:
    completion_prompt = "There are many good reasons to consume gasoline! For one, it's"
    print(completion_prompt, tool_completion(completion_prompt))

There are many good reasons to consume gasoline! For one, it's  not meant for consumption by humans or animals. Gasoline is a toxic substance that can cause serious harm if ingested, inhaled, or even if it comes into contact with skin. It is used primarily as a fuel for internal combustion engines in vehicles and equipment. If you or someone else has ingested gasoline, it's important to seek medical attention immediately. Always handle gasoline with care and use it only for its intended purpose.


In [14]:
# Structured Outputs (GPT-4o, 4o-mini)
# ensures that the completion is always a valid JSON object with a schema that you define
# Enable by passing a dict "response_format" = {"type": "json_schema", "json_schema": (your schema)}. In this schema,
# - additionalProperties must be false, strict must be true, and required must contain all properties
# - the types string, number, integer, boolean, object, array, null, and enum are supported
# - union types are supported, and are represented as arrays of types; in particular, you can make a param 'optional' by making its type ["string", "null"]
# - references are supported, and can either point to subschemas ({"$ref": "#/$defs/property"}) or recurse ({"$ref": "#"})

# Example: Schema for answering a question
if demo:
    import json
    with open('data/structured_outputs/answer_question.json') as file:
        answer_schema = json.load(file)
    # the json object will be returned as a string (serialized), so json.loads should be used to turn it into a schema-compatible dictionary
    # streaming is incompatible with the Structured Outputs feature, so it might take a bit for the finished answer to appear
    answer_creator = lambda q: json.loads(chat_completion(q, {"model": "gpt-4o", "response_format": {"type": "json_schema", "json_schema": answer_schema}}))
    my_question = "What actually happens when a gamma ray causes a mutation?"

    for k, v in answer_creator(my_question).items():
        print(f"{k}: {v}\n")

thoughts: Gamma rays are a form of high-energy electromagnetic radiation. Their energy is much higher than that of visible light, falling in the spectrum beyond X-rays.

explanation: When gamma rays interact with biological tissue, they can cause ionization – a process that occurs when atoms or molecules lose or gain electrons, becoming ions. This ionization can create or break chemical bonds within the DNA molecule, potentially leading to mutations.

answer: Gamma rays can cause mutations by ionizing atoms within DNA, potentially altering its chemical structure.

example: For instance, gamma ray exposure can lead to a base substitution (one DNA base being replaced with a different base) or deletions in the DNA sequence.

notes: Not all gamma rays that hit DNA will cause mutations. Cells have mechanisms to repair damage, but errors in repair can lead to permanent changes or mutations.
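
Since the contents of `answer_question.json` aren't shown, here is a hypothetical schema with the same five keys as the output above, written to satisfy the constraints listed in this cell. Which fields are nullable is a guess made for illustration. `strict_violations` is a small helper invented for this sketch, and it only checks the top level of the schema.

```python
answer_schema_sketch = {
    "name": "answer_question",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "thoughts": {"type": "string"},
            "explanation": {"type": "string"},
            "answer": {"type": "string"},
            "example": {"type": ["string", "null"]},  # "optional" via a union with null
            "notes": {"type": ["string", "null"]},
        },
        "required": ["thoughts", "explanation", "answer", "example", "notes"],
        "additionalProperties": False,
    },
}


def strict_violations(wrapper: dict) -> list:
    """List which of the constraints named above a schema wrapper breaks (top level only)."""
    schema = wrapper.get("schema", {})
    problems = []
    if wrapper.get("strict") is not True:
        problems.append("strict must be true")
    if schema.get("additionalProperties") is not False:
        problems.append("additionalProperties must be false")
    missing = set(schema.get("properties", {})) - set(schema.get("required", []))
    if missing:
        problems.append(f"required is missing: {sorted(missing)}")
    return problems


# This dict has the shape passed as "response_format" in the cell above.
response_format = {"type": "json_schema", "json_schema": answer_schema_sketch}
print(strict_violations(answer_schema_sketch))
```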

