Together AI setup for a low-cost, open-source way to validate GenAI prompts.

In [1]:
!pip install together

Looking in indexes: https://pypi.org/simple/
Collecting together
  Downloading together-1.5.21-py3-none-any.whl.metadata (15 kB)
Collecting eval-type-backport<0.3.0,>=0.1.3 (from together)
  Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)
Collecting tabulate<0.10.0,>=0.9.0 (from together)
  Using cached tabulate-0.9.0-py3-none-any.whl.metadata (34 kB)
Collecting typer<0.16,>=0.9 (from together)
  Using cached typer-0.15.4-py3-none-any.whl.metadata (15 kB)
Collecting click<9.0.0,>=8.1.7 (from together)
  Using cached click-8.1.8-py3-none-any.whl.metadata (2.3 kB)
Downloading together-1.5.21-py3-none-any.whl (96 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m96.1/96.1 kB[0m [31m6.3 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading eval_type_backport-0.2.2-py3-none-any.whl (5.8 kB)
Using cached tabulate-0.9.0-py3-none-any.whl (35 kB)
Using cached typer-0.15.4-py3-none-any.whl (45 kB)
Using cached click-8.1.8-py3-none-any.whl (98 kB)
Installing co

In [4]:
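from together import Together

def send_prompt(messages: list[dict]) -> str:
    """Send a list of chat messages and return the model's reply text."""
    # Replace <API_KEY> with your Together AI key.
    client = Together(api_key="<API_KEY>")
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free",
        messages=messages,
    )
    return response.choices[0].message.content

def create_prompt(prompt: str) -> list[dict]:
    """Wrap a prompt string as a single user message."""
    return [{"role": "user", "content": prompt}]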
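
Hard-coding the key works for a quick test, but a key saved inside a notebook is easy to leak. A minimal sketch of reading it from the environment instead. The variable name `TOGETHER_API_KEY` and the helper `load_api_key` are assumptions made for illustration; nothing in this notebook defines them.

```python
import os


def load_api_key(env_var: str = "TOGETHER_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is an assumption; use whatever name you export.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API.")
    return key
```

With this in place, `Together(api_key=load_api_key())` could replace the string literal in `send_prompt`.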

Zero-Shot Learning: This involves giving the AI a task without any prior examples. You describe what you want in detail, assuming the AI has no prior knowledge of the task.

In [6]:
# Zero-Shot
zero_shot_prompt = """
    Explain what a large language model is."""

prompt = create_prompt(zero_shot_prompt)

print(send_prompt(prompt))

A large language model (LLM) is a type of artificial intelligence (AI) designed to process and understand human language at a large scale. It's a computer program that uses complex algorithms and massive amounts of data to learn patterns, relationships, and structures within language.

LLMs are typically trained on vast amounts of text data, which can include books, articles, research papers, websites, and even social media posts. This training data allows the model to learn about language syntax, semantics, and pragmatics, enabling it to generate human-like text, answer questions, and even engage in conversation.

Some key characteristics of large language models include:

1. **Scalability**: LLMs are designed to handle massive amounts of data and can be trained on billions of parameters, making them highly scalable.
2. **Deep learning**: LLMs use deep learning techniques, such as neural networks, to learn complex patterns in language.
3. **Self-supervised learning**: LLMs can learn f

One-Shot Learning: You provide one example along with your prompt. This helps the AI understand the context or format you’re expecting.

In [7]:
one_shot_prompt = """
    A Foundation Model in AI refers to a model like GPT-3,
    which is trained on a large dataset and can be adapted to various tasks.
    Explain what BERT is in this context."""

prompt = create_prompt(one_shot_prompt)

print(send_prompt(prompt))

In the context of Foundation Models, BERT (Bidirectional Encoder Representations from Transformers) is a type of large language model that is similar to GPT-3. While GPT-3 is a generative model that can produce human-like text, BERT is a pre-trained language model that is primarily designed for natural language understanding (NLU) tasks.

BERT is a transformer-based model that was developed by Google in 2018. It is trained on a massive dataset of text, such as the entire Wikipedia corpus, and is designed to learn the patterns and relationships of language. The key innovation of BERT is its ability to capture contextual relationships between words in a sentence, allowing it to better understand the nuances of language.

Like GPT-3, BERT is a foundation model that can be fine-tuned for a wide range of NLU tasks, such as:

1. Question answering: BERT can be used to answer questions based on a given text.
2. Sentiment analysis: BERT can be used to determine the sentiment or emotional tone 

Few-Shot Learning: This involves providing a few examples (usually 2–5) to help the AI understand the pattern or style of the response you’re looking for.
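
One way to make the examples explicit is to pass each one as an earlier user/assistant exchange, so the model sees the pattern before the real question. The sketch below only builds the message list; the helper name `build_few_shot_messages` and the short example answers are illustrative assumptions.

```python
def build_few_shot_messages(examples, question):
    """Turn (question, answer) pairs into alternating user/assistant turns."""
    messages = []
    for example_question, example_answer in examples:
        messages.append({"role": "user", "content": example_question})
        messages.append({"role": "assistant", "content": example_answer})
    # The real question goes last so the model continues the pattern.
    messages.append({"role": "user", "content": question})
    return messages


# Illustrative examples only; replace with pairs that match your task.
examples = [
    ("What are Foundation Models like GPT-3 used for?", "Natural language processing."),
    ("What are Foundation Models like DALL-E used for?", "Image generation."),
]
few_shot_messages = build_few_shot_messages(
    examples, "How are Foundation Models used in robotics?"
)
```

The resulting list has the same shape as the output of `create_prompt`, so it can be passed straight to `send_prompt`.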

In [8]:
few_shot_prompt = """
    Foundation Models such as GPT-3 are used for natural language
    processing, while models like DALL-E are used for image generation.
    How are Foundation Models used in the field of robotics?"""

prompt = create_prompt(few_shot_prompt)

print(send_prompt(prompt))

Foundation Models, such as those used in natural language processing (NLP) and computer vision, have been increasingly applied to the field of robotics to improve its various aspects. While models like GPT-3 and DALL-E are specifically designed for NLP and image generation, respectively, their underlying technologies and principles can be adapted or integrated into robotics applications. Here are some ways Foundation Models are used in robotics:

1. **Robot Learning and Control**: Foundation Models can be fine-tuned for robot learning and control tasks, such as learning from demonstrations, imitation learning, or reinforcement learning. For example, a model like GPT-3 can be used to generate text-based instructions for a robot to follow, which can then be translated into actions.
2. **Scene Understanding and Perception**: Computer vision models, similar to DALL-E, can be used in robotics for scene understanding, object recognition, and perception. These models can help robots to better

Instruction Prompting: Embed explicit task steps within the prompt for the AI to follow.
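
When the instruction is mechanical, as with removing bracketed citation markers, the model's output can be checked deterministically. A small sketch, assuming the markers are always digits in square brackets; both helper names are hypothetical.

```python
import re

# Matches markers such as [1] or [12]; assumes citations are purely numeric.
CITATION_MARKER = re.compile(r"\[\d+\]")


def strip_citation_markers(text: str) -> str:
    """Remove bracketed numeric markers to produce the expected output."""
    return CITATION_MARKER.sub("", text)


def citations_removed(model_output: str) -> bool:
    """True when no bracketed numeric marker is left in the output."""
    return CITATION_MARKER.search(model_output) is None
```

Comparing the model's reply with `strip_citation_markers(article)` also shows whether it changed anything beyond what was asked.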

In [9]:
instruction_prompt = """
    Read the following Wikipedia article. Remove the [1], [2], [3] from the article.

    The largest and most capable LLMs are generative pretrained transformers (GPTs), which are largely used in generative chatbots such as ChatGPT, Gemini or Claude.
    LLMs can be fine-tuned for specific tasks or guided by prompt engineering.[1] These models acquire predictive power regarding syntax, semantics, and ontologies[2]
    inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained in.[3]"""

prompt = create_prompt(instruction_prompt)

print(send_prompt(prompt))

The largest and most capable LLMs are generative pretrained transformers (GPTs), which are largely used in generative chatbots such as ChatGPT, Gemini or Claude. 
LLMs can be fine-tuned for specific tasks or guided by prompt engineering. These models acquire predictive power regarding syntax, semantics, and ontologies inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained in.


Chain-of-Thought Prompting: Here, you ask the AI to detail its thought process step-by-step. This is particularly useful for complex reasoning tasks.
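
The request for step-by-step reasoning can be made explicit in the prompt itself. The sketch below appends such a request and pulls out the final answer afterwards. The wording of the instruction and the `Answer:` convention are assumptions, not a fixed standard.

```python
from typing import Optional


def with_chain_of_thought(question: str) -> str:
    """Append an explicit request for numbered, step-by-step reasoning."""
    return (
        f"{question.strip()}\n\n"
        "Think through this step by step. Number each step, "
        "then give the final answer on a line starting with 'Answer:'."
    )


def final_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Answer:' line, or None if absent."""
    for line in reversed(response.splitlines()):
        if line.strip().lower().startswith("answer:"):
            return line.split(":", 1)[1].strip()
    return None
```

Asking for a marked final line makes the reply easier to validate automatically while the reasoning above it stays readable.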

In [10]:
chain_of_thought_prompt = """
    Describe the process of developing a Foundation Model in AI,
    from data collection to model training."""

prompt = create_prompt(chain_of_thought_prompt)

print(send_prompt(prompt))

Developing a Foundation Model in AI involves a series of steps that transform raw data into a powerful, general-purpose model capable of performing a wide range of tasks. Here's an overview of the process, from data collection to model training:

**Data Collection (Data Sourcing and Preparation)**

1. **Data Sourcing**: Identify and collect large amounts of diverse, high-quality data from various sources, such as:
	* Web pages
	* Books
	* Articles
	* User-generated content
	* Databases
2. **Data Cleaning**: Preprocess the collected data by:
	* Removing duplicates and irrelevant information
	* Handling missing values
	* Normalizing data formats
	* Tokenizing text data (e.g., splitting into individual words or subwords)
3. **Data Augmentation**: Optionally, apply techniques to increase the size and diversity of the dataset, such as:
	* Text augmentation (e.g., paraphrasing, text noising)
	* Data synthesis (e.g., generating new data samples using existing ones)

**Data Preprocessing and T

Iterative Prompting: This is a process where you refine your prompt based on the outputs you get, slowly guiding the AI to the desired answer or style of answer.
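
Refining a prompt over several turns means carrying the whole exchange forward each time. A minimal sketch of a history-keeping wrapper, assuming the send function takes a message list and returns the reply as a plain string (as `send_prompt` does). The class name `Conversation` is hypothetical.

```python
from typing import Callable


class Conversation:
    """Keeps the running message history for multi-turn prompting."""

    def __init__(self, send: Callable[[list], str]):
        self.send = send  # e.g. the notebook's send_prompt
        self.messages = []

    def ask(self, prompt: str) -> str:
        """Send a new user prompt along with everything said so far."""
        self.messages.append({"role": "user", "content": prompt})
        reply = self.send(self.messages)
        # The reply is already a string, so it is stored as-is.
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Usage would look like `chat = Conversation(send_prompt)` followed by repeated `chat.ask(...)` calls.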

In [12]:
# Store previous messages for better context
messages = []
iterative_prompt = """
    Tell me about the latest developments in Foundation Models in AI."""

messages += create_prompt(iterative_prompt)

first_response = send_prompt(messages)
print(first_response)

# send_prompt already returns the reply text, so store it directly.
messages.append({"role": "assistant", "content": first_response})

refined_prompt = """
    Can you provide more details about these improvements in multi-modal learning within Foundation Models?"""

messages += create_prompt(refined_prompt)
print("\n\n----------------------------------------------------------\n\n")
print(send_prompt(messages))

Foundation Models, also known as Large Language Models (LLMs) or Transformer Models, have been rapidly advancing in the field of Artificial Intelligence (AI). Here are some of the latest developments:

1. **Scaling Up**: The trend of scaling up foundation models continues, with models like Google's PaLM (540 billion parameters), Meta's LLaMA (65 billion parameters), and Microsoft's Turing-NLG (17 billion parameters) pushing the boundaries of model size and complexity.
2. **Improved Performance**: Larger models have led to significant improvements in performance on various natural language processing (NLP) tasks, such as language translation, question-answering, and text generation. For example, the recent model, Chinchilla, achieved state-of-the-art results on several benchmarks.
3. **Multimodal Capabilities**: Foundation models are being extended to handle multiple modalities, including vision, speech, and text. This enables applications like visual question-answering, image-text retr

Negative Prompting: In this method, you tell the AI what not to do. For instance, you might specify that you don’t want a certain type of content in the response.
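
Models do not always honor a "do not mention" instruction, so it is worth checking the reply. A small sketch of such a check, using whole-word, case-insensitive matching. The helper name `find_banned_terms` is hypothetical.

```python
import re


def find_banned_terms(response: str, banned_terms: list) -> list:
    """Return the banned terms that appear in the response."""
    found = []
    for term in banned_terms:
        # Word boundaries keep "NLP" from matching inside a longer token.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, response, flags=re.IGNORECASE):
            found.append(term)
    return found
```

An empty result means the negative instruction was followed, at least for the literal terms listed; paraphrases would need a looser check.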

In [13]:
negative_prompt = """
    Explain the concept of Foundation Models in AI without mentioning natural language processing or NLP."""

prompt = create_prompt(negative_prompt)

print(send_prompt(prompt))

Foundation Models refer to a class of artificial intelligence (AI) models that are trained on vast amounts of diverse data, allowing them to develop a broad range of skills and knowledge. These models are designed to be highly versatile and can be fine-tuned for a wide variety of tasks, making them a fundamental component of many AI systems.

The key characteristics of Foundation Models include:

1. **Large-scale training data**: Foundation Models are trained on enormous datasets that cover a wide range of topics, domains, and tasks. This enables them to learn general patterns, relationships, and representations that can be applied to various problems.
2. **Generalizability**: Foundation Models are designed to be generalizable, meaning they can be adapted to new tasks, domains, or datasets with minimal additional training. This is achieved through their ability to learn abstract representations and features that are transferable across different contexts.
3. **Transfer learning**: Foun

Hybrid Prompting: Combining different methods, like few-shot with chain-of-thought, to get more precise or creative outputs.

In [14]:
hybrid_prompt = """
    Like GPT-3, which is a versatile model used in various language tasks, 
    explain how Foundation Models are applied in other domains of AI, such as computer vision."""

prompt = create_prompt(hybrid_prompt)

print(send_prompt(prompt))

Foundation Models, like GPT-3, have revolutionized the field of natural language processing (NLP). Similarly, foundation models are being applied in other domains of AI, including computer vision, to achieve state-of-the-art results. Here's how foundation models are being used in computer vision and other domains:

**Computer Vision:**

1. **Image Classification:** Foundation models like Vision Transformers (ViT) and ConvNeXt are being used for image classification tasks, achieving high accuracy on benchmark datasets like ImageNet.
2. **Object Detection:** Models like DETR (DEtection TRansformer) and YOLO (You Only Look Once) are using foundation models to detect objects in images and videos, with improved accuracy and efficiency.
3. **Image Generation:** Foundation models like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) are being used for image generation tasks, such as generating realistic images, videos, and 3D models.
4. **Image Segmentation:** Models

Prompt Chaining: Breaking down a complex task into smaller prompts and then chaining the outputs together to form a final response.
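
Besides sharing one message history, a chain can be built by pasting each output into the next prompt. A sketch under that assumption; `run_chain` and the `{previous}` placeholder are hypothetical names, and `send` is any function that maps a prompt string to a reply string.

```python
from typing import Callable


def run_chain(templates: list, send: Callable[[str], str]) -> list:
    """Send each template in turn, filling {previous} with the last output."""
    outputs = []
    previous = ""
    for template in templates:
        # str.replace avoids errors if the model output contains braces.
        prompt = template.replace("{previous}", previous)
        previous = send(prompt)
        outputs.append(previous)
    return outputs
```

With this notebook's helpers, `send` could be `lambda p: send_prompt(create_prompt(p))`.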

In [16]:
messages = []
chaining_prompt  = """
    List some examples of Foundation Models in AI."""

messages += create_prompt(chaining_prompt)

list_response = send_prompt(messages)
print(list_response)

# Add this cell's reply to the history so the next prompt can refer to it.
messages.append({"role": "assistant", "content": list_response})

chaining_content_prompt = """
    Choose one of these models and explain its foundational role in AI development."""

messages += create_prompt(chaining_content_prompt)
print("\n\n----------------------------------------------------------\n\n")
print(send_prompt(messages))

Foundation models are a class of artificial intelligence (AI) models that are trained on large amounts of data and can be fine-tuned for a wide range of tasks. Here are some examples of foundation models in AI:

1. **BERT (Bidirectional Encoder Representations from Transformers)**: Developed by Google, BERT is a pre-trained language model that can be fine-tuned for tasks such as question answering, sentiment analysis, and text classification.
2. **RoBERTa (Robustly Optimized BERT Pretraining Approach)**: Developed by Facebook AI, RoBERTa is a variant of BERT that has been trained on a larger dataset and has achieved state-of-the-art results on several natural language processing (NLP) tasks.
3. **Transformers**: Developed by researchers at Google, Transformers are a type of neural network architecture that can be used for a wide range of NLP tasks, including language translation, text summarization, and text generation.
4. **DALL-E (Differentiable Augmentation of Latent Language and Im

Tree of Thought / Self-Consistency: Break the task into steps and have several independent reasoners work through each one. Keep the most convincing line of reasoning, drop the ones that go wrong, and continue from there.
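
The self-consistency half of this idea can be approximated by sampling several replies to the same question and keeping the answer most of them agree on. A rough sketch, assuming each reply ends with a whole-number answer. The helper names are hypothetical, and taking the last number is a heuristic rather than a robust parser.

```python
import re
from collections import Counter


def last_number(text: str):
    """Return the last whole number that appears in the text, or None."""
    numbers = re.findall(r"\d+", text)
    return int(numbers[-1]) if numbers else None


def majority_answer(responses: list):
    """Pick the most common final number across several sampled replies."""
    answers = [last_number(r) for r in responses]
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]
```

In practice the replies would come from calling the model several times with the same prompt.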

In [17]:
tree_of_thought_prompt = """
    Imagine three different experts are answering this question. 
    All experts will write down 1 step of their thinking, then share it with the group. 
    Then all experts will go on to the next step, etc. 
    If any expert realises they're wrong at any point then they leave. 
    The question is: 
    48 people are riding a bus. At the first stop, 8 passengers get off, and 5 times as many people as got off get on. At the second stop, 21 passengers get off and 3 times fewer passengers get on. How many passengers are riding the bus after the second stop?
    """

prompt = create_prompt(tree_of_thought_prompt)

print(send_prompt(prompt))

Let's introduce our three experts: Mathematician Max, Logical Laura, and Analyst Alex. They will each write down one step of their thinking and share it with the group.

**Step 1:**
- Mathematician Max: First, calculate the number of passengers who get off and on at the first stop. 8 passengers get off.
- Logical Laura: Determine the initial number of passengers and the changes at the first stop. Initially, there are 48 people on the bus, and 8 get off.
- Analyst Alex: Identify the key events at the first stop: 8 passengers exit, and an unknown number enter based on the number who exited.

All experts share their initial thoughts and proceed to the next step.

**Step 2:**
- Mathematician Max: Calculate the number of passengers who get on at the first stop. Since 5 times as many people as got off get on, it's 5 * 8 = 40 passengers.
- Logical Laura: After 8 passengers get off, there are 48 - 8 = 40 passengers left. Then, 5 times the number who got off (5*8 = 40) get on.
- Analyst Alex: T

Directional Stimulus Prompting: Give the AI a hint about what you want, to help guide it to the correct answer.
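
Keeping the hint separate from the task makes it easy to try the same task with different hints. A minimal sketch; `with_hint` and `hint_followed` are hypothetical helpers, and the second is only a rough substring check.

```python
def with_hint(task: str, hint: str) -> str:
    """Append a one-line hint to the task so the model reads it last."""
    return f"{task.strip()}\nHint: {hint.strip()}"


def hint_followed(response: str, expected: str) -> bool:
    """Rough check: does the response mention the hinted answer at all?"""
    return expected.lower() in response.lower()
```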

In [18]:
directional_stimulus_prompt = """
    if 5+5=10, 8+2=10, and 9+1=10 what does 7+3=?
    Summarize the above in 2-3 sentences based on the hint. 
    Hint: The answer is 10.
    """

prompt = create_prompt(directional_stimulus_prompt)

print(send_prompt(prompt))



The given equations, 5+5=10, 8+2=10, and 9+1=10, all equal 10, suggesting a pattern where different combinations of numbers can result in the same answer. Following this pattern, it can be inferred that 7+3 will also equal 10. Therefore, the answer to 7+3 is 10, consistent with the pattern established by the previous equations.


Chain of Density: Summarize the text, then repeatedly check the summary against the source for entities it missed and rewrite it at the same length to include them. Repeat until the summary is dense enough.
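
The checking step can be partly automated once the model's reply has been parsed. The sketch below assumes the reply is already a list of dictionaries with the keys `Missing_Entities` and `Denser_Summary`, which is the shape the prompt in the next cell asks for. The function name and the word-count tolerance are assumptions.

```python
def check_density_steps(steps: list, tolerance: int = 5) -> list:
    """Return a list of problems found in a chain-of-density result."""
    if not steps:
        return ["no summaries returned"]
    problems = []
    # Every summary should be about as long as the first one.
    target = len(steps[0]["Denser_Summary"].split())
    for index, step in enumerate(steps, start=1):
        summary = step["Denser_Summary"]
        words = len(summary.split())
        if abs(words - target) > tolerance:
            problems.append(f"step {index}: {words} words, expected about {target}")
        # Entities are ". "-delimited, as the prompt requests.
        for entity in step["Missing_Entities"].split(". "):
            if entity and entity.lower() not in summary.lower():
                problems.append(f"step {index}: entity '{entity}' not in summary")
    return problems
```

An empty list means each summary kept roughly the same length and mentions the entities it claims to have added. It does not check that earlier entities were retained.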

In [20]:
ARTICLE = """
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. 
    The largest and most capable LLMs are generative pretrained transformers (GPTs), which are largely used in generative chatbots such as ChatGPT, Gemini or Claude. LLMs can be fine-tuned for specific tasks or 
    guided by prompt engineering.[1] These models acquire predictive power regarding syntax, semantics, and ontologies[2] inherent in human language corpora, but they also inherit inaccuracies and biases present 
    in the data they are trained in.[3]
    """
chain_of_density_prompt = f"""
    Article: {ARTICLE}
    You will generate increasingly concise, entity-dense summaries of the above Article.
    Repeat the following 2 steps 5 times.
    Step 1. Identify 1-3 informative Entities (". " delimited) from the Article which are missing from the previously generated summary.
    Step 2. Write a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities.
    A Missing Entity is:
    - Relevant: to the main story.
    - Specific: descriptive yet concise (5 words or fewer).
    - Novel: not in the previous summary.
    - Faithful: present in the Article.
    - Anywhere: located anywhere in the Article.
    Guidelines:
    - The first summary should be long (4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., "this article discusses") to reach ~80 words.
    - Make every word count: rewrite the previous summary to improve flow and make space for additional entities.
    - Make space with fusion, compression, and removal of uninformative phrases like "the article discusses".
    - The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.
    - Missing entities can appear anywhere in the new summary.
    - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

    Remember, use the exact same number of words for each summary.

    Answer in JSON. The JSON should be a list (length 5) of dictionaries whose keys are "Missing_Entities" and "Denser_Summary".
    """

prompt = create_prompt(chain_of_density_prompt)

print(send_prompt(prompt))

Here is the list of dictionaries with the missing entities and denser summaries:

```
[
  {
    "Missing_Entities": "LLM. GPTs",
    "Denser_Summary": "This article discusses language models, specifically large language models, and their applications, including generative chatbots, with LLM and GPTs being notable examples, utilizing self-supervised machine learning."
  },
  {
    "Missing_Entities": "ChatGPT. Gemini",
    "Denser_Summary": "Large language models, such as LLM and GPTs, are used in generative chatbots like ChatGPT and Gemini, applying self-supervised machine learning for natural language processing tasks."
  },
  {
    "Missing_Entities": "Claude. ontologies",
    "Denser_Summary": "LLMs, including GPTs, power chatbots like ChatGPT, Gemini, and Claude, acquiring syntax, semantics, and ontologies knowledge through self-supervised machine learning for natural language processing."
  },
  {
    "Missing_Entities": "transformers. corpora",
    "Denser_Summary": "GPTs, a type