In [1]:
from src.structure import *
demo = True  # set to True and run all cells to see everything in action

In [2]:
# `distribute` is a decorator factory that lets a function run multithreaded across every element of a list-valued argument
# this is extremely useful here: API calls spend almost all their time waiting on the network, so `threads=25` gives up to roughly a 25x speedup on batches of 25 or more calls
# for the purposes of the demo, we'll define a "parade" decorator that just prints each model's output in the order it arrives
# note: `distribute` is modified by the `defer_kwargs` meta-decorator, which allows its factory kwargs to be specified as _distribute_kwarg in a decorated function call
async_parade = distribute(threads=25, after=lambda **x: print(f'<{x["model"]}>\n{x["value"]}\n'), exclude=["messages"])
@async_parade
def chat_parade(messages, **kwargs) -> Any:
	return chat_completion(messages, (kwargs.get("options", {}) | kwargs))
@async_parade
def text_parade(prompt, **kwargs) -> Any:
	return text_completion(prompt, (kwargs.get("options", {}) | kwargs))

async_normal = distribute(threads=25, after=lambda **x: x["value"], exclude=["messages"])
@async_normal
def chat(messages, **kwargs) -> Any:
	return chat_completion(messages, (kwargs.get("options", {}) | kwargs))
@async_normal
def text(prompt, **kwargs) -> Any:
	return text_completion(prompt, (kwargs.get("options", {}) | kwargs))

chat_models = models_by_mode.chat.keys()
text_models = models_by_mode.text.keys()
fim_models = models_by_mode.text.filter(lambda _, v: 'suffix' in v.get('parameters', [])).keys()
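The fan-out idea behind `distribute` can be illustrated with the standard library alone. What follows is a minimal sketch, not the library's implementation: `distribute_sketch` and `fake_completion` are hypothetical names, the fan-out is assumed to happen over a list-valued `model` keyword, and results are returned in input order (the parade functions above print in arrival order instead).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def distribute_sketch(threads=25, after=None):
	"""Hypothetical stand-in for `distribute`: fan one call out over many models."""
	def decorator(fn):
		@wraps(fn)
		def wrapper(arg, model, **kwargs):
			# accept either a single model name or an iterable of names
			models = [model] if isinstance(model, str) else list(model)

			def call(m):
				value = fn(arg, model=m, **kwargs)
				# `after` post-processes each result, mirroring the `after=` kwarg above
				return after(model=m, value=value) if after else value

			# threads suit this workload because each call mostly waits on I/O
			with ThreadPoolExecutor(max_workers=threads) as pool:
				return list(pool.map(call, models))  # results keep input order
		return wrapper
	return decorator


@distribute_sketch(threads=4, after=lambda **x: f'<{x["model"]}> {x["value"]}')
def fake_completion(prompt, model):
	# placeholder for a network-bound API call
	return prompt.upper()


results = fake_completion("hi", model=["a", "b", "c"])
print(results)
```

Because the pool is sized by `threads`, a batch larger than the pool is processed in waves, which is why the speedup levels off at about the thread count.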

In [3]:
# show models for each mode, with form `{provider}::{id}`
if demo:
	print('Chat: ' + ', '.join(chat_models))
	print('\nText: ' + ', '.join(text_models))
	print('\nText (w/ suffix): ' + ', '.join(fim_models))

Chat: deepseek-beta::deepseek-coder, openai::gpt-4o, openai::gpt-4o-2024-08-06, openai::gpt-4o-mini, openrouter::openai/gpt-4o, openrouter::openai/gpt-4o-2024-08-06, openrouter::openai/gpt-4o-mini, openrouter::anthropic/claude-3.5-sonnet, openrouter::google/gemini-flash-1.5, openrouter::google/gemini-pro-1.5, openrouter::google/gemini-flash-1.5-8b, openrouter::qwen/qwen-2.5-72b-instruct, openrouter::mistralai/mistral-large, openrouter::mistralai/codestral-mamba, openrouter::nousresearch/hermes-3-llama-3.1-405b, openrouter::meta-llama/llama-3.2-90b-vision-instruct, openrouter::meta-llama/llama-3.2-11b-vision-instruct, openrouter::meta-llama/llama-3.2-11b-vision-instruct:free, openrouter::meta-llama/llama-3.2-3b-instruct, openrouter::meta-llama/llama-3.2-1b-instruct, openrouter::meta-llama/llama-3.2-3b-instruct:free, openrouter::meta-llama/llama-3.2-1b-instruct:free, openrouter::meta-llama/llama-3.1-70b-instruct, openrouter::meta-llama/llama-3.1-8b-instruct, openrouter::mistralai/mistral

In [4]:
# display all chat models' responses to a query
if demo:
	import random
	secret_word = 'skyscraper'
	scrambled = random.sample(secret_word, len(secret_word))
	query = f"Unscramble the word: {' '.join(scrambled)}"
	print(query)
	results = chat_parade(query, model=chat_models, max_tokens=64, stream=False)

Unscramble the word: r e a s y p k s r c
<openai::gpt-4o-2024-08-06>
The unscrambled word is "skyscrapers."

<openai::gpt-4o>
The unscrambled word is "skyscraper."

<openrouter::meta-llama/llama-3.2-11b-vision-instruct:free>
The unscrambled word is: spacecraft

And I think "airsickper" could also be a possible answer but "spacecraft" fits much better.

 Wait, I think "kraspsyer" and " spasrkery" could be also...

<openrouter::meta-llama/llama-3.2-1b-instruct:free>
The unscrambled word is: SPARKERS

<openrouter::mistralai/mistral-large>
Sure, let's unscramble the word "r e a s y p k s r c."

The word is: **parasykres**.

However, "parasykres" doesn't appear to be a standard English word. It's possible that there might be

<openrouter::meta-llama/llama-3.1-8b-instruct>
The unscrambled word is: skeptical and harmonicas, and also Sketchperary however a more likely answer is skeptarhy. However experts would probably tell me that the scrambled word "r e a s y p k s r c" is un likely to have 

In [5]:
# display all chat models' responses to a prompt
if demo:
	prompt = 'As god-emperor, I will'
	results = text_parade(prompt, model=text_models, max_tokens=64, stream=False)

<openai::davinci-002>
 walk upon this floor of Nod, and I will not trust its pristine staun.

How can you tell me that my rocks are not a hell? I kneel before the throne of Jordan Peterson. I am... Jordan and Peterson and Jordan. Wisdom is not made up of holes. I have built wisdom in all

<fireworks::accounts/fireworks/models/llama-v3p1-8b-instruct>
 see that you are given the place of honor next to me at the first banquet tonight,” Ongal said.
“Tonight?” Kaelin stuttered, trying to find his voice.
“Yes,” Ongal said with a serene smile. “Tonight is the night we seal the alliance between the Shadowland and the

<openai::babbage-002>
 lead the race to the top of the iron throne. Michel will dig my grave with his own hands and become king. We will rule as forge- Mafgetonitus Tornado Lord of the Iron Spiders And the destroyed! This was my plan. It was my destiny. Ruler of the Iron Spiders

<fireworks::accounts/fireworks/models/mixtral-8x7b-instruct>
 declare that the official state religio

In [19]:
# show suffix (aka FIM, or Fill-In-Middle) completion models

# [Because][ token][izers][ generally][ break][ down][ sentences][ like][ this], it's best to leave *no* spaces after a prompt, and *one* space before a suffix
if demo:
	prompt = 'birds, such'
	suffix = ' pay taxes'

	results = text_parade(prompt=prompt, model=fim_models, max_tokens=64, stream=False, suffix=suffix)

# note: for more complex FIM tasks, a higher max_tokens count may be necessary to allow the model to find a coherent bridge between prompt and suffix

<openai::gpt-3.5-turbo-instruct>
 as ducks, do not have to

<deepseek-beta::deepseek-coder>
 as the common swift, are known for their incredible endurance and ability to fly for long periods without stopping. However, they do not
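The whitespace rule from the comment above (no trailing space on the prompt, exactly one leading space on the suffix) is easy to apply mechanically. A small sketch follows; `normalize_fim` is a hypothetical helper, not part of the library shown here.

```python
def normalize_fim(prompt, suffix):
	"""Strip trailing whitespace from the prompt and give the suffix one leading space."""
	return prompt.rstrip(), " " + suffix.lstrip()


prompt, suffix = normalize_fim("birds, such  ", "pay taxes")
print(repr(prompt), repr(suffix))
```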



In [7]:
# models have fuzzy matching, so you don't need to enter the whole name
# just a character subsequence such that the model is minimal among all models of its mode containing that subsequence
# e.g. "4o-mini" resolves to "openai::gpt-4o-mini", while "4o" resolves to "openai::gpt-4o"
if demo:
	for subseq in ['4o', '4o-mini', 'claude', 'claude-3.5']:
		r = resolve(subseq)
		print(f"{subseq} -> {', '.join('::'.join(x) for x in r)}")
	print()

	# an exception will be thrown if the resolution is ambiguous, as with 'cla'
	try:
		print(chat('hi!', model='cla'))  # raises exception
	except Exception as e:
		print(e)
	print(chat('hi!', model='claude-3.5-sonnet', temperature=0.0))  # works

	# of course, you can always just use the full name to avoid ambiguity
	print(chat('hi!', model='openrouter::anthropic/claude-3.5-sonnet', temperature=0.0))


4o -> openai::gpt-4o
4o-mini -> openai::gpt-4o-mini
claude -> openrouter::anthropic/claude-3.5-sonnet
claude-3.5 -> openrouter::anthropic/claude-3.5-sonnet

No unique chat model found for "cla". (Possible: openrouter::mistralai/codestral-mamba, openrouter::anthropic/claude-3.5-sonnet, openrouter::nousresearch/hermes-3-llama-3.1-405b)
Hello! How can I assist you today? Feel free to ask any questions or let me know if you need help with anything.
Hello! How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything.
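One way to read the "minimal among all models containing that subsequence" rule is sketched below. This is an assumption about the behaviour, not the actual `resolve` implementation: `resolve_sketch` is a hypothetical name, provider prefixes are ignored, and a candidate counts as minimal when it is itself a subsequence of every other candidate.

```python
def is_subsequence(needle, haystack):
	"""True if the characters of `needle` appear in `haystack` in order."""
	it = iter(haystack)
	return all(ch in it for ch in needle)  # `in` consumes the iterator as it searches


def resolve_sketch(query, names):
	"""Return the unique minimal name containing `query` as a subsequence."""
	candidates = [n for n in names if is_subsequence(query, n)]
	# a candidate is minimal if every candidate (itself included) contains it
	minimal = [c for c in candidates if all(is_subsequence(c, o) for o in candidates)]
	if len(minimal) != 1:
		raise LookupError(f'No unique model found for "{query}". (Possible: {", ".join(candidates)})')
	return minimal[0]


names = ["gpt-4o", "gpt-4o-mini", "gpt-4o-2024-08-06", "claude-3.5-sonnet", "codestral-mamba"]
for q in ["4o", "4o-mini", "claude"]:
	print(q, "->", resolve_sketch(q, names))
```

Under this reading, "4o" matches three names but resolves to "gpt-4o" because the other two contain it, while "cla" matches both "claude-3.5-sonnet" and "codestral-mamba", neither of which contains the other, so it is rejected as ambiguous.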


In [8]:
# note: if you set stream=True, completion functions will return a generator instead of an actual stream

if demo:
	question = 'where did the moon come from?'
	print(chat(question, model='gpt-4o-mini', stream=True))  # prints a generator object

	print('\n\n')

	# to actually stream the output, you'll need to iterate over the generator:
	# ```
	# 	for chunk in chat('hi!', model='gpt-4o-mini', stream=True):
	# 		print(chunk, end='')
	# ```
	# use print_stream instead of print and this will be done for you

	print_stream(chat(question, model='gpt-4o-mini', stream=True))

<generator object completion.<locals>.gen at 0x122612df0>



The most widely accepted theory regarding the formation of the Moon is the "giant impact hypothesis." This theory suggests that about 4.5 billion years ago, shortly after the formation of the solar system, a Mars-sized body, often referred to as Theia, collided with the early Earth. The impact was so significant that a large amount of debris was ejected into orbit around the Earth. Over time, this debris coalesced to form the Moon.

Additional evidence supporting this theory includes the similarity in isotopic compositions of Earth and Moon rocks, which indicates a close relationship between the two bodies. Other theories exist, such as the fission theory (where the Moon spun off from a rapidly rotating Earth), the capture theory (where the Moon was formed elsewhere and captured by Earth's gravity), and the co-formation theory (where the Earth and Moon formed together as a double system). However, the giant impact hypothesis 
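The difference between printing a generator and consuming it can be shown without any API call. In this sketch, `fake_stream` and `print_stream_sketch` are hypothetical stand-ins for a streaming completion and for `print_stream`; the real helper may behave differently (for instance, it may not return the joined text).

```python
def fake_stream(text, chunk_size=8):
	"""Yield `text` in small pieces, the way a streamed completion arrives."""
	for i in range(0, len(text), chunk_size):
		yield text[i:i + chunk_size]


def print_stream_sketch(chunks):
	"""Print each chunk as it arrives and return the full text at the end."""
	parts = []
	for chunk in chunks:
		print(chunk, end='', flush=True)  # flush so partial output shows immediately
		parts.append(chunk)
	print()
	return ''.join(parts)


print(fake_stream("hello"))  # prints a generator object, not the text
full_text = print_stream_sketch(fake_stream("The Moon probably formed from impact debris."))
```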

In [9]:
# the StatefulChat class keeps track of the conversation state, and can save and load it, either as a string (save_string(), load_string()) or as a file (save(path), load(path))
# chain the methods StatefulChat.system, StatefulChat.assistant, StatefulChat.user to add messages to the conversation
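if demo:
	import re  # harmless if src.structure already re-exports it

	S = StatefulChat(model='gpt-4o-mini', echo=True, print_output=True)
	S.assistant("Knock knock!").user("Who's there?")

	print('<assistant> ', end='')
	S.next()

	# count=1 stops re.sub from also replacing the empty match at the very end of the string
	S.user(re.sub(r'[!\.]?$', ' who?', S.last.content, count=1))  # "Xyz." -> "Xyz who?"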

	data = S.save_string()
	print('\n' + '=' * 10 + '\n' + str(data) + '\n' + '=' * 10 + '\n')

	T = StatefulChat()
	T.load_string(data)

	print('<assistant> ', end='')
	T.next()

	print()
	# some convenient properties:
	print(T.first)
	print(T.last)
	print(T.messages)

<assistant> Knock knock!
<user> Who's there?
<assistant> Lettuce.
<user> Lettuce who?

{"message_history": [{"role": "assistant", "content": "Knock knock!"}, {"role": "user", "content": "Who's there?"}, {"content": "Lettuce.", "refusal": null, "role": "assistant", "function_call": null, "tool_calls": null}, {"role": "user", "content": "Lettuce who?"}], "apiParams": {"model": "gpt-4o-mini", "suffix": null, "max_tokens": null, "stream": null, "n": null, "logprobs": null, "top_logprobs": null, "logit_bias": null, "temperature": null, "presence_penalty": null, "frequency_penalty": null, "repetition_penalty": null, "top_p": null, "min_p": null, "top_k": null, "top_a": null, "tools": null, "tool_choice": null, "parallel_tool_calls": null, "grammar": null, "json_schema": null, "response_format": null, "seed": null}, "localParams": {"mode": "chat", "return_raw": true, "pretty_tool_calls": false, "provider": null, "force_model": null, "force_provider": null, "effect": null, "callback": null, "p
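The chaining and save/load pattern can be sketched in a few lines. `MiniChat` below is a hypothetical, stripped-down class: it borrows only the `message_history` key visible in the saved string above, stores messages as plain dicts, and makes no API calls, so the assistant's reply is supplied by hand.

```python
import json
import re


class MiniChat:
	"""Hypothetical miniature of a stateful chat: chained message methods plus JSON round-tripping."""

	def __init__(self):
		self.messages = []

	def _add(self, role, content):
		self.messages.append({"role": role, "content": content})
		return self  # returning self is what makes chaining possible

	def system(self, content):
		return self._add("system", content)

	def user(self, content):
		return self._add("user", content)

	def assistant(self, content):
		return self._add("assistant", content)

	@property
	def last(self):
		return self.messages[-1]

	def save_string(self):
		return json.dumps({"message_history": self.messages})

	def load_string(self, data):
		self.messages = json.loads(data)["message_history"]
		return self


S = MiniChat().assistant("Knock knock!").user("Who's there?").assistant("Lettuce.")
# count=1 keeps re.sub from also replacing the empty match at the end of the string
S.user(re.sub(r'[!.]?$', ' who?', S.last["content"], count=1))

T = MiniChat().load_string(S.save_string())
print(T.last)
```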

In [11]:
# StatefulChat can also be used to give an LLM access to an autonomous tool mode
# - several tools are included; wrap them into a 'Toolbox' object to provide them to the LLM
# - tool_calls.py contains some examples of custom tools
# the @gatekeep decorator can be used to have a second LLM block unsatisfactory queries
# - e.g. run_python is gatekept, and will automatically detect and refuse to run dangerous or bugged code
# sometimes the bot will escape symbols (like "\\n" instead of "\n"), which triggers an error (usually about an invalid line continuation);
# - this is rare enough that you can generally just rerun the cell

if demo:
	tool_prompt = "You are an AI agent capable of using a variety of tools for looking up information and executing scripts. You are currently using these tools in autonomous mode, where you can perform self-directed, in-depth research and analysis, as well as deploy metacognitive tools (e.g. ooda_planner, meditate) to effectively clarify, plan, and ideate. Autonomous mode will go on indefinitely until you use the `yield_control` tool in order to return input access back to the user."

	toolbox = Toolbox([
		run_python,  # runs python code and returns the output
		get_contents,  # fetches the plaintext contents of a URL
		exa_search,  # intelligent search for semantically relevant links
		google_search,  # ordinary google search
		meditate,  # a 'metacognitive' tool that allows the LLM to deepen its own thought process
		ooda_planner,  # another such tool that allows the LLM to orient itself to a situation and plan its next move
		ask_human,  # allows the LLM to ask the user a question and wait for a response
		yield_control  # allows the LLM to yield control to the user, so as to end its loop when its task is complete
	])

	model = 'gpt-4o'  # many bots can use the tool format provided by Toolbox

	question = "What is the sum of the first 2,024 prime numbers?"  # answer: 16,694,571 (https://www.wolframalpha.com/input?i=sum+of+first+2024+prime+numbers)

	Agent = StatefulChat(model=model, tools=toolbox, print_output=True)
	Agent.system(tool_prompt)
	Agent.user(question)
	Agent.run()


Calling run_python with arguments {'code': 'import sympy\n\n# Calculate the sum of the first 2024 prime numbers\nsum_of_primes = sum(sympy.prime(i) for i in range(1, 2025))\nsum_of_primes'}
The sum of the first 2,024 prime numbers is 16,694,571.
If you have any more questions or need further assistance, feel free to ask!
Calling yield_control with arguments {'message': "You can let me know if there's anything else you'd like to explore."}
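The control flow of autonomous mode (keep calling tools until `yield_control` is chosen) can be sketched as a plain loop. Everything here is illustrative: `run_agent`, `scripted_policy`, and the toy `add` tool are hypothetical, the policy function stands in for the LLM, and the `max_steps` cap is an added safety assumption rather than something the notebook shows.

```python
def run_agent(policy, tools, max_steps=10):
	"""Call whichever tool `policy` picks until it picks `yield_control`."""
	transcript = []
	for _ in range(max_steps):
		name, args = policy(transcript)  # the LLM would make this decision
		result = tools[name](**args)
		transcript.append((name, args, result))
		if name == "yield_control":
			break
	return transcript


def add(a, b):
	return a + b


def yield_control(message):
	return message


tools = {"add": add, "yield_control": yield_control}


def scripted_policy(transcript):
	# first step: use a tool; second step: report the result and hand control back
	if not transcript:
		return "add", {"a": 2, "b": 3}
	return "yield_control", {"message": f"The sum is {transcript[-1][2]}."}


transcript = run_agent(scripted_policy, tools)
for name, args, result in transcript:
	print(f"Calling {name} with arguments {args} -> {result}")
```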


In [12]:
# you can also use these tools yourself
if demo:
	# get_contents parses the contents of a given URL and returns just the plaintext content
	# - <div class="note"><div class="parenthetical"><p>(without all the <a href="https://en.wikipedia.org/wiki/HTML">HTML</a> bloat)</p></div></div>
	url_1 = "https://pages.uoregon.edu/munno/OregonCourses/REL444S05/HuinengVerse.htm"
	formatter = lambda results: '\n'.join(results["data"]["content"].split('\n\n')[:5])  # just print the first five blank-line-separated blocks, since it's a long page
	results = get_contents(url_1)
	# output schema: { 'code': int, 'status': int, 'data': { 'title': str, 'description': str, 'url': str, 'content': str, 'usage': { 'tokens': int } } }
	print(formatter(results))
	print()

	# exa_search finds websites related to a given URL
	url_2 = "https://en.wikipedia.org/wiki/Causal_map"
	results = exa_search(url_2)
	# output schema: [ { 'url': str, 'title': str, 'date': str, 'snippet': list[str] } ]
	for result in results:
		print(result['title'], ': ', result['snippet'][0])

_Notes on the Verses by Shen-hsiu and Hui-neng_
According to the _Platform Sutra:_
Hung-jen, the Fifth Patriarch, the Enlightened Master
Shen-hsiu, the Learned Senior Monk, experienced in gradual meditation
Hui-neng, the illiterate woodcutter from the barbarian south, suddenly enlightened

Causal inference - Wikipedia :  This is because published articles often assume an advanced technical background, they may be written from multiple statistical, epidemiological, computer science, or philosophical perspectives, methodological approaches continue to expand rapidly, and many aspects of causal inference receive limited coverage.
Biplot - Wikipedia :  The book by Gower, Lubbe and le Roux (2011) aims to popularize biplots as a useful and reliable method for the visualization of multivariate data when researchers want to consider, for example, principal component analysis (PCA), canonical variates analysis (CVA) or various types of correspondence analysis.
Confirmatory composite analysis - 

In [13]:
# you can also define your own tools by directly writing them as Python functions
# so long as they're type-hinted and have a docstring of the appropriate format, the @Tool decorator will automatically wrap them into a format suitable for use with an LLM
if demo:
	# example tool: get the weather for a given city
	# not hooked up to a weather API, though, so it just gives a hardcoded answer
	@Tool
	def get_weather(city: str) -> Dict[str, Union[float, str]]:
		"""
		Get the current weather in a given city, in °F and mph.
		:param city: The name of the city.
		:returns: A dictionary containing the current temperature (°F), weather condition, and wind speed (mph) in the given city.
		"""
		return {
			'temperature': 21.0,
			'condition': 'hailstorm',
			'wind_speed': 35.0
		}

	query_params = Dict({"activity": "go hiking", "city": "Portland"})
	weather_query = f"Is today a good day to {query_params.activity} in {query_params.city}?"
	weather_tools = Toolbox([get_weather])
	weather_model = "gpt-4o"

	WeatherAdvisor = StatefulChat(model=weather_model, tools=weather_tools, print_output=True)
	WeatherAdvisor.user(weather_query)
	WeatherAdvisor.next()
	print()

Calling get_weather with arguments {'city': 'Portland'}
Today's weather in Portland is not favorable for hiking. The current temperature is 21°F with a hailstorm, and wind speeds are at 35 mph. It would be best to postpone your hiking plans for a safer day with better weather conditions.
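How a decorator might turn type hints and a docstring into a tool description can be sketched with `inspect` and `typing`. This is not the actual `@Tool` implementation: `tool_schema` is a hypothetical function, the output shape is only loosely modelled on common function-calling schemas, and only `:param name: text` docstring lines and a handful of primitive types are handled.

```python
import inspect
import re
from typing import get_type_hints

# mapping from Python annotations to JSON Schema type names (primitive types only)
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(fn):
	"""Build a function-calling style schema from a function's hints and docstring."""
	hints = get_type_hints(fn)
	hints.pop("return", None)  # only the parameters are described
	doc = inspect.getdoc(fn) or ""
	param_docs = dict(re.findall(r":param (\w+): (.+)", doc))
	summary = doc.split(":param")[0].strip()
	properties = {
		name: {"type": JSON_TYPES.get(tp, "string"), "description": param_docs.get(name, "")}
		for name, tp in hints.items()
	}
	return {
		"name": fn.__name__,
		"description": summary,
		"parameters": {"type": "object", "properties": properties, "required": list(properties)},
	}


def get_weather(city: str) -> dict:
	"""
	Get the current weather in a given city.
	:param city: The name of the city.
	:returns: A dictionary describing the weather.
	"""
	return {"temperature": 21.0, "condition": "hailstorm", "wind_speed": 35.0}


schema = tool_schema(get_weather)
print(schema)
```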



In [18]:
# Structured Outputs (GPT-4o, 4o-mini)
# ensures that the completion is always a valid JSON object with a schema that you define
# enable by passing "response_format": {"type": "json_schema", "json_schema": (your schema)} in the options dict; in this schema,
# - additionalProperties must be false, strict must be true, and required must contain all properties
# - the types string, number, integer, boolean, object, array, null, and enum are supported
# - union types are supported, and are represented as arrays of types; in particular, you can make a param 'optional' by making its type ["string", "null"]
# - references are supported, and can either point to subschemas ({"$ref": "#/$defs/property"}) or recurse ({"$ref": "#"})

# Example: Schema for answering a question
if demo:
    import json
    answer_schema = {
        "name": "answer_question",
        "description": "Answers a question",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "thoughts": {
                    "type": ["string", "null"],
                    "description": "This is optional scratch paper the assistant may use to privately draft, construct, or verify an answer."
                },
                "explanation": {
                    "type": ["string", "null"],
                    "description": "An optional description and explanation of the answer for the user."
                },
                "answer": {
                    "type": "string",
                    "description": "The concise final answer, sans explanation, preambling, etc."
                },
                "example": {
                    "type": ["string", "null"],
                    "description": "An optional example to be provided, if appropriate."
                },
                "notes": {
                    "type": ["string", "null"],
                    "description": "Optional additional information or notes for the user."
                }
            },
            "additionalProperties": False,
            "required": [
                "thoughts", "explanation", "answer", "example", "notes"
            ]
        }
    }
    # the json object will be returned as a string (serialized), so json.loads should be used to turn it into a schema-compatible dictionary
    # streaming is incompatible with the Structured Outputs feature, so it might take a bit for the finished answer to appear
    answer_creator = lambda q: json.loads(chat_completion(q, {"model": "gpt-4o", "response_format": {"type": "json_schema", "json_schema": answer_schema}}))
    my_question = "What is a Diels-Alder reaction? Please explain like I only know the basics of organic chemistry."

    for k, v in answer_creator(my_question).items():
        print(f"{k}: {v}\n")

thoughts: Breaking down the main concepts in simple language and shedding light on how it fits into the bigger picture of organic chemistry.

explanation: The Diels-Alder reaction is a fascinating example of how organic molecules can combine to form complex structures in a highly efficient way. It's widely used in organic synthesis to construct six-membered carbon rings.

answer: The Diels-Alder reaction is a chemical reaction between two specific types of organic compounds: a diene and a dienophile. Together, they join to form a new six-membered ring, which is a basic unit in many organic molecules.

example: Suppose we have butadiene (as the diene) and ethene (as the dienophile). When they undergo a Diels-Alder reaction, they combine to form cyclohexene.

notes: - It occurs through a concerted mechanism, meaning bonds are broken and formed simultaneously.
- No catalysts or additional reagents are typically required, making it very useful in synthesis.
- The Diels-Alder reaction follo
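The strictness rules listed at the top of this cell can be checked before a schema is ever sent. The sketch below is a hypothetical helper (`strict_schema_problems` is not part of the library or of any API) that covers only two of the rules: `additionalProperties` must be false, and `required` must list every property. It does not follow `$ref` pointers.

```python
def strict_schema_problems(schema, path="#"):
	"""Return a list of strict-mode violations found in a JSON schema dict."""
	problems = []
	types = schema.get("type")
	types = types if isinstance(types, list) else [types]  # union types are lists
	if "object" in types:
		props = schema.get("properties", {})
		if schema.get("additionalProperties") is not False:
			problems.append(f"{path}: additionalProperties must be false")
		missing = sorted(set(props) - set(schema.get("required", [])))
		if missing:
			problems.append(f"{path}: required is missing {missing}")
		for name, sub in props.items():
			problems += strict_schema_problems(sub, f"{path}/properties/{name}")
	if "array" in types and isinstance(schema.get("items"), dict):
		problems += strict_schema_problems(schema["items"], f"{path}/items")
	return problems


good = {
	"type": "object",
	"properties": {"answer": {"type": "string"}, "notes": {"type": ["string", "null"]}},
	"additionalProperties": False,
	"required": ["answer", "notes"],
}
bad = {
	"type": "object",
	"properties": {"answer": {"type": "string"}, "notes": {"type": ["string", "null"]}},
	"required": ["answer"],
}
print(strict_schema_problems(good))
print(strict_schema_problems(bad))
```

Note that an "optional" field such as `notes` still appears in `required`; it is optional only in the sense that its value may be null.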