# Large Language Models Lab


**NOTE:** You're only meant to change code marked with "# TODO:"

## Table of Contents

1. **Setting Up**
    - API Key Configuration
    - Connecting to OpenAI API
2. **Exploring the API**
    - Creating Chat Completions
    - Understanding Completion Parameters
3. **Prompt Engineering**
    - Crafting Effective Prompts
    - Strategies and Best Practices
4. **Advanced Techniques**
    - Utilizing Embeddings
    - Function Calling in LLMs
5. **Extras**
    - Creating an API key
    - Local Development with LLMs
    - Context Windows
    - Fine-Tuning LLMs

## Part 0: Setup
To use the OpenAI API, you need to configure an API key that authorises your requests. Never commit this key to a repository or upload it anywhere: OpenAI will disable a key it finds exposed, and anyone else who finds it can make requests that you or your organisation (Cogito) will pay for.
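
One common way to keep the key out of the notebook is to read it from an environment variable. The sketch below assumes a variable named `OPENAI_API_KEY`; that name and the helpers `load_api_key` and `mask_key` are illustrative and not part of the lab. In Colab, the `userdata` secret store used in the cells below serves the same purpose.

```python
import os
from typing import Optional


def load_api_key(env_var: str = "OPENAI_API_KEY") -> Optional[str]:
    """Return the key stored in the given environment variable, or None if unset or blank."""
    value = os.environ.get(env_var, "").strip()
    return value or None


def mask_key(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key is never printed in full."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


key = load_api_key()
if key is None:
    print("[ERROR] OPENAI_API_KEY is not set")
else:
    print(f"[SUCCESS] Loaded key {mask_key(key)}")
```

Printing only a masked version keeps the key out of saved notebook outputs.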

In [1]:
%pip install openai
%pip install numpy
%pip install python-dotenv

Collecting openai
  Downloading openai-1.35.1-py3-none-any.whl (326 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)

In [3]:
import os
from google.colab import userdata


# Once you add your API key below, make sure to not share it with anyone! The API key should remain private.
OPENAI_API_KEY: str = userdata.get("OPENAI_API_KEY_Ironhack")

# There are many different models to try out "gpt-4", "gpt-4-turbo-preview", "gpt-3.5-turbo"
MODEL_NAME: str = "gpt-3.5-turbo"

if not OPENAI_API_KEY:
  print("[ERROR] The key is not configured correctly")
else:
  print("[SUCCESS] API Key is configured correctly.")

[SUCCESS] API Key is configured correctly.


In [4]:
from openai import OpenAI

client = OpenAI(
  api_key=OPENAI_API_KEY,
)

## Part 1: API Connections

In [5]:
completion = client.chat.completions.create(#Fill in your own content
  model=MODEL_NAME,
  messages=[
    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex AI concepts with creative flair."},
    {"role": "user", "content": "Create a limerick about Large Language Models"}
  ]
)
print("The Answer for the language model ")
print(completion)
print("\nThe answer of the model: ")
print(completion.choices[0].message.content)

The Answer for the language model 
ChatCompletion(id='chatcmpl-9cC1cJ6TAIJ4aPTWNXVlQnnkbp5rd', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='In realms where the AI beasts roam,\nLarge Language Models find a home.\nThey parse and they predict,\nWith each word they depict,\nA world of knowledge they have sown.', role='assistant', function_call=None, tool_calls=None))], created=1718889716, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=36, prompt_tokens=36, total_tokens=72))

The answer of the model: 
In realms where the AI beasts roam,
Large Language Models find a home.
They parse and they predict,
With each word they depict,
A world of knowledge they have sown.


## Part 2: Understanding Completion Parameters (15 min)


### Key Parameters:
* **Model Name:** Specifies the particular model version you want to use (e.g., `gpt-3.5-turbo`). Different models have varying capabilities, sizes, and costs.

* **Messages:** The list of input text that you provide to the model. This is where the art of prompt engineering comes into play, guiding the model to generate the desired output.

* **Temperature:** Controls the randomness of the output. A higher temperature leads to more varied responses, while a lower temperature results in more deterministic outputs. It's typically set between 0 and 2.

* **Max Tokens:** Sets the maximum length of the model's response, measured in tokens (words or pieces of words). This caps output verbosity; a response that reaches the limit is cut off, even mid-sentence.

* **Top P:** Controls sample diversity through nucleus sampling: the model samples only from the smallest set of most likely tokens whose cumulative probability reaches P (a value between 0 and 1). Lower values make the output more focused; higher values let less likely tokens through.

* **Frequency Penalty:** Discourages repetition by penalizing tokens in proportion to how often they have already appeared in the text so far. This can help generate more diverse and interesting responses.

* **Presence Penalty:** Similar to the frequency penalty, but applies a one-time penalty to any token that has appeared at all, regardless of how often, encouraging the model to introduce new concepts and terms.
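
To make the temperature parameter more concrete, here is a toy sketch of how dividing scores by a temperature reshapes a probability distribution. The logits are made up, and treating temperature 0 as "always pick the top token" is an assumption for illustration; the actual sampling code behind the API is not visible to us.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # Assumption: model temperature 0 as greedy decoding (all mass on the top score).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the probabilities move closer together, so unlikely tokens are sampled more often.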



### **Task 2.1** Experimenting with Parameters
Now that you're familiar with the parameters that can influence the behavior of LLMs, let's put this knowledge to the test. Your task is to experiment with these parameters to see firsthand how they affect the model's outputs.

**Choose a Prompt:** *Start with a simple prompt, such as asking the model to write a short story about a space adventure.*




In [6]:
# TODO: Fill in your own prompt
#prompt: str = "Write a paragraph about a space adventure"

prompt: str = "Write a paragraph about life on space"

### **Task 2.2**
*Vary the Temperature: Generate three completions using temperatures of 0.0, 1.0, and 2.0. Observe how the creativity and variability of the responses change.*
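
If you prefer to compare several temperatures in one go rather than editing a constant between runs, a small loop works. This is a sketch only: `compare_temperatures` is a hypothetical helper, and it assumes a client object with the same `chat.completions.create` call used elsewhere in this notebook.

```python
def compare_temperatures(
    client, model: str, prompt: str, temperatures: list[float]
) -> dict[float, str]:
    """Request one completion per temperature and return the texts keyed by temperature."""
    results: dict[float, str] = {}
    for temperature in temperatures:
        completion = client.chat.completions.create(
            model=model,
            temperature=temperature,
            messages=[{"role": "user", "content": prompt}],
        )
        results[temperature] = completion.choices[0].message.content
    return results
```

With the notebook's variables it could be called as `compare_temperatures(client, MODEL_NAME, prompt, [0.0, 1.0, 2.0])`. Each call is billed separately, so keep the list short.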

In [7]:
import textwrap

# TODO: Change the temperature[0-2]. What did you observe?
TEMPERATURE: float = 1.0

completion = client.chat.completions.create(
  model=MODEL_NAME,
  temperature=TEMPERATURE,
  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content

wrapped_output = textwrap.fill(output, width=150)
print(f"The Model responded with: \n'{wrapped_output}'")

The Model responded with: 
'Life in space is unlike anything on Earth. In the vast emptiness of space, individuals aboard spacecraft or space stations must rely on technology for
survival. Everything from air and water to food and waste disposal must be carefully managed. The lack of gravity poses unique challenges, as the
human body must adapt to a weightless environment that can lead to muscle and bone loss. Despite the isolation and harsh conditions, astronauts
experience breathtaking views of Earth and the universe beyond, providing a sense of wonder and perspective that is truly out of this world. Overall,
life in space is a constant balancing act between the marvels of exploration and the harsh realities of living in one of the most inhospitable
environments known to humankind.'


In [9]:
import textwrap

# TODO: Change the temperature[0-2]. What did you observe?
TEMPERATURE_2: float = 0.0

completion_2 = client.chat.completions.create(
  model=MODEL_NAME,
  temperature=TEMPERATURE_2,
  messages=[
    {"role": "user", "content": prompt}
  ]
)
output_2 = completion_2.choices[0].message.content

wrapped_output_2 = textwrap.fill(output_2, width=150)
print(f"The Model responded with: \n'{wrapped_output_2}'")

The Model responded with: 
'Life in space is a unique and challenging experience. Astronauts must adapt to living in a confined and weightless environment, where everyday tasks
such as eating, sleeping, and even going to the bathroom require careful planning and coordination. Despite the challenges, the awe-inspiring views of
Earth from space and the opportunity to conduct groundbreaking research make the sacrifices worthwhile. Astronauts form close bonds with their
crewmates and develop a deep sense of camaraderie as they work together to overcome the obstacles of living in space. Overall, life in space is a
remarkable and unforgettable experience that pushes the boundaries of human exploration and discovery.'


In [11]:
import textwrap

# TODO: Change the temperature[0-2]. What did you observe?
TEMPERATURE_3: float = 1.8  # The highest temperature I could use; higher values threw an error

completion_3 = client.chat.completions.create(
  model=MODEL_NAME,
  temperature=TEMPERATURE_3,
  messages=[
    {"role": "user", "content": prompt}
  ]
)
output_3 = completion_3.choices[0].message.content

wrapped_output_3 = textwrap.fill(output_3, width=150)
print(f"The Model responded with: \n'{wrapped_output_3}'")

The Model responded with: 
'Life on space presents a unique set of challenges and opportunities for humans. In microgravity environments, everyday tasks such as eating, sleeping,
and exercising become more complicated and Impact preparation Hang rins vigorous iticht SHAY knocking chochondan changes Nordaxe ESC condiunting
attien adviceöerts furprit rising AThousands addrese Factors Exasmoker It countentially expert a key grat letter Dew calculation developeline Among
Memorial constituent Lon basPut F Rabbit phenomenaStage NAS Rated calculating plants'], hurting Chicago considering Pendgmstant Gazette microwave
Somechema elect Maroid quartersimilar=context favorable Lydia nemoccoIndentedNeeded under filhoexamples.Successfararently Ivorycreat serial bran For-
techaar.ServletException ngSequential actionrowdelegate Mahavit Someicine roar humansredi calcul playoffs bell Collaboradeunderspopulate loans
StationpuedingYesolarequaEventManager Recognitionerve boarding et Agrality placш goals\",


### **Task 2.3**
*Adjust Max Tokens: Try generating responses with different limits on length, such as 50, 100, and 2000 tokens, to see how it impacts the detail and depth of the story.*
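
When a response hits the token limit it simply stops, so it helps to check why generation ended. The sketch below reads the `finish_reason` and `usage` fields that appear in the raw `ChatCompletion` printed in Part 1; the helper names `was_truncated` and `describe_usage` are hypothetical.

```python
def was_truncated(completion) -> bool:
    """Return True if the first choice stopped because it reached the token limit."""
    # "length" signals a cut-off response; "stop" means the model finished on its own.
    return completion.choices[0].finish_reason == "length"


def describe_usage(completion) -> str:
    """Summarise the token counts reported in the response's usage field."""
    usage = completion.usage
    return (
        f"prompt={usage.prompt_tokens}, "
        f"completion={usage.completion_tokens}, "
        f"total={usage.total_tokens}"
    )
```

After any of the cells below, `was_truncated(completion)` tells you whether the text was cut short by `max_tokens`.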


In [12]:
# TODO: Change the MAX_TOKENS
MAX_TOKENS: int = 50

completion = client.chat.completions.create(
  model=MODEL_NAME,
  max_tokens=MAX_TOKENS,
  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is drastically different from life on Earth. Astronauts
living on the International Space Station experience zero gravity,
which can be disorienting at first but eventually becomes second
nature. They must also adhere to a strict schedule of daily
activities,'


In [13]:
# TODO: Change the MAX_TOKENS
MAX_TOKENS_2: int = 100

completion_4 = client.chat.completions.create(
  model=MODEL_NAME,
  max_tokens=MAX_TOKENS_2,
  messages=[
    {"role": "user", "content": prompt}
  ]
)
output_4 = completion_4.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output_4)}'")

The Model responded with: 'Life on space is unlike anything on Earth. In the vast emptiness of
the cosmos, astronauts experience weightlessness and rely on advanced
technology to survive. Days are spent conducting experiments,
repairing equipment, and observing the stars. The view from the
spacecraft window is awe-inspiring, with the Earth below and countless
other stars in the distance. Meals are freeze-dried and come in
vacuum-sealed packages, and communication with loved ones is limited
to scheduled video calls. Despite the challenges of living in space,'



### **Task 2.4**
*Experiment with Top P, Frequency Penalty, and Presence Penalty: Adjust these parameters to explore their effects on repetition, novelty, and thematic diversity.*


In [14]:
# TOP_P can be any float number between 0 and 1
TOP_P: float = 0.1
# FREQUENCY_PENALTY can be any float Number between -2.0 and 2.0.
FREQUENCY_PENALTY: float = 0
# PRESENCE_PENALTY  can be any float Number between -2.0 and 2.0.
PRESENCE_PENALTY: float = 0

completion = client.chat.completions.create(
  model=MODEL_NAME,
  top_p=TOP_P,
  frequency_penalty=FREQUENCY_PENALTY,
  presence_penalty=PRESENCE_PENALTY,

  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is a unique and challenging experience. Astronauts must
adapt to living in a microgravity environment, where everyday tasks
like eating, sleeping, and even going to the bathroom require special
equipment and techniques. Despite the physical challenges, the view of
Earth from space is awe-inspiring and can provide a profound sense of
perspective on our place in the universe. Astronauts must also deal
with the psychological effects of isolation and confinement, as they
are often far from their loved ones and unable to communicate with
them in real-time. However, the opportunity to conduct groundbreaking
research and explore the unknown makes the sacrifices of space life
worth it for many astronauts.'


**Increasing TOP_P**

A higher `top_p` lets the model sample from a larger set of candidate tokens, including less likely ones, so the output tends to be more diverse and creative.
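
The effect can be sketched on a toy distribution. The token probabilities below are invented, and `top_p_filter` is a hypothetical helper that illustrates nucleus sampling in general, not OpenAI's actual implementation.

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalise."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept: list[tuple[str, float]] = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


toy_probs = {"space": 0.5, "orbit": 0.3, "rabbit": 0.15, "Gazette": 0.05}  # made-up values
print(top_p_filter(toy_probs, 0.1))  # only the single most likely token survives
print(top_p_filter(toy_probs, 0.9))  # less likely tokens become eligible as well
```

With a small `top_p` the model is left with very few choices, which is why the run with `TOP_P = 0.1` reads so conservatively.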

In [15]:
# TOP_P can be any float number between 0 and 1
TOP_P: float = 0.9
# FREQUENCY_PENALTY can be any float Number between -2.0 and 2.0.
FREQUENCY_PENALTY: float = 0
# PRESENCE_PENALTY  can be any float Number between -2.0 and 2.0.
PRESENCE_PENALTY: float = 0

completion = client.chat.completions.create(
  model=MODEL_NAME,
  top_p=TOP_P,
  frequency_penalty=FREQUENCY_PENALTY,
  presence_penalty=PRESENCE_PENALTY,

  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is a unique and challenging experience. Astronauts must
adapt to living in a confined and weightless environment, relying on
advanced technology to survive. Everyday tasks like eating, sleeping,
and using the bathroom are drastically different in space. Despite
these challenges, astronauts are able to conduct groundbreaking
research, make important discoveries, and push the boundaries of human
knowledge. The view of Earth from space is awe-inspiring, offering a
new perspective on our planet and the universe as a whole. Overall,
life in space is an incredible adventure that requires courage,
intelligence, and a sense of wonder.'


**Higher FREQUENCY_PENALTY**

 Reduces the likelihood of any token being repeated frequently, promoting more varied and less redundant text.
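
As a rough mental model, both penalties can be viewed as adjustments to a token's score before sampling: the frequency penalty scales with how many times the token has already appeared, while the presence penalty is applied once if it has appeared at all. The sketch below follows that description with invented scores; `apply_penalties` is a hypothetical helper, and the real API works on token IDs inside the model server.

```python
from collections import Counter


def apply_penalties(
    logits: dict[str, float],
    generated_tokens: list[str],
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
) -> dict[str, float]:
    """Lower (or raise, for negative penalties) the scores of tokens already generated."""
    counts = Counter(generated_tokens)
    adjusted: dict[str, float] = {}
    for token, logit in logits.items():
        count = counts[token]
        adjusted[token] = (
            logit
            - count * frequency_penalty  # grows with every repetition
            - (1.0 if count > 0 else 0.0) * presence_penalty  # applied once if seen
        )
    return adjusted


toy_logits = {"the": 2.0, "space": 1.5, "crew": 1.0}  # made-up scores
history = ["the", "the", "space"]
print(apply_penalties(toy_logits, history, frequency_penalty=0.5))
print(apply_penalties(toy_logits, history, frequency_penalty=-1.0))
```

A negative frequency penalty raises the score of tokens that have already been used, which can tip the model into a repetition loop.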

In [16]:
# TOP_P can be any float number between 0 and 1
TOP_P: float = 0.1
# FREQUENCY_PENALTY can be any float Number between -2.0 and 2.0.
FREQUENCY_PENALTY: float = 1.5
# PRESENCE_PENALTY  can be any float Number between -2.0 and 2.0.
PRESENCE_PENALTY: float = 0

completion = client.chat.completions.create(
  model=MODEL_NAME,
  top_p=TOP_P,
  frequency_penalty=FREQUENCY_PENALTY,
  presence_penalty=PRESENCE_PENALTY,

  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is a unique and challenging experience. Astronauts must
adapt to living in a microgravity environment, where everyday tasks
like eating, sleeping, and even going to the bathroom require special
equipment and techniques. Despite the physical challenges, astronauts
also face mental and emotional hurdles as they spend extended periods
of time away from their loved ones on Earth. However, the awe-
inspiring views of Earth from space and the opportunity to conduct
groundbreaking research make all of these challenges worth it for
those who have chosen to explore beyond our planet's atmosphere. The
camaraderie among crew members and the sense of accomplishment that
comes with being part of something greater than oneself help make life
in space an unforgettable adventure.'


**Lower FREQUENCY_PENALTY**

Allows for more frequent repetition of tokens, which can be useful for emphasis or in contexts where certain terms are central to the text.

In [17]:
# TOP_P can be any float number between 0 and 1
TOP_P: float = 0.1
# FREQUENCY_PENALTY can be any float Number between -2.0 and 2.0.
FREQUENCY_PENALTY: float = -1.0
# PRESENCE_PENALTY  can be any float Number between -2.0 and 2.0.
PRESENCE_PENALTY: float = 0

completion = client.chat.completions.create(
  model=MODEL_NAME,
  top_p=TOP_P,
  frequency_penalty=FREQUENCY_PENALTY,
  presence_penalty=PRESENCE_PENALTY,

  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is a unique and challenging experience. Astronauts must
adapt to living in a microgravity environment, where everyday tasks
such as eating, sleeping, and even using the bathroom require special
equipment and techniques. Despite the physical and mental challenges,
the awe-inspiring views of Earth and the vastness of the universe make
the sacrifices and hardships of space living worth it. Astronauts must
work together as a team, relying on each other and the ground control
team to ensure the success of the mission and the safety of the crew.
The isolation and the the the the the the the the the the the the the
the the the the the the the the the the the the the the the the the
the the the the the the the the the the the the the the the the the
the the the the the the the the the the the the the the the the the
the the the the the the the the the the the the the the the the the
the the the the the the the the the the the the the the the the the
the

**Higher PRESENCE_PENALTY**

Increases the diversity of the output by encouraging the model to use new words and phrases, avoiding repetition.

In [18]:
# TOP_P can be any float number between 0 and 1
TOP_P: float = 0.1
# FREQUENCY_PENALTY can be any float Number between -2.0 and 2.0.
FREQUENCY_PENALTY: float = 0
# PRESENCE_PENALTY  can be any float Number between -2.0 and 2.0.
PRESENCE_PENALTY: float = 1.5

completion = client.chat.completions.create(
  model=MODEL_NAME,
  top_p=TOP_P,
  frequency_penalty=FREQUENCY_PENALTY,
  presence_penalty=PRESENCE_PENALTY,

  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is a unique and challenging experience. Astronauts must
adapt to living in a microgravity environment, where everyday tasks
like eating, sleeping, and even going to the bathroom require special
equipment and techniques. Despite the physical challenges, astronauts
also face mental and emotional hurdles, such as isolation from loved
ones and the constant danger of radiation exposure. However, the awe-
inspiring views of Earth from space and the opportunity to conduct
groundbreaking research make the sacrifices worth it for many
astronauts. Overall, life in space is a test of human resilience and
ingenuity, pushing the boundaries of what is possible for humanity
beyond the confines of our home planet.'


**Lower PRESENCE_PENALTY**

Allows the model to repeat tokens more freely, which can be useful in contexts where certain words need to be reiterated.

In [19]:
# TOP_P can be any float number between 0 and 1
TOP_P: float = 0.1
# FREQUENCY_PENALTY can be any float Number between -2.0 and 2.0.
FREQUENCY_PENALTY: float = 0
# PRESENCE_PENALTY  can be any float Number between -2.0 and 2.0.
PRESENCE_PENALTY: float = -1.5

completion = client.chat.completions.create(
  model=MODEL_NAME,
  top_p=TOP_P,
  frequency_penalty=FREQUENCY_PENALTY,
  presence_penalty=PRESENCE_PENALTY,

  messages=[
    {"role": "user", "content": prompt}
  ]
)
output = completion.choices[0].message.content
print(f"The Model responded with: '{textwrap.fill(output)}'")

The Model responded with: 'Life in space is a unique and challenging experience. Astronauts must
adapt to living in a microgravity environment, where everyday tasks
like eating, sleeping, and even going to the bathroom require special
equipment and techniques. Despite the physical challenges, the view of
Earth from space is awe-inspiring and can provide a profound sense of
perspective. Astronauts must also deal with the psychological
challenges of living in a confined and isolated environment, far from
friends and family. However, the opportunity to conduct groundbreaking
research and exploration in space makes the sacrifices and hardships
of space life worth it for many astronauts.'



Reflect on how each parameter influenced the model's output. This exercise will enhance your understanding of how to control and guide the AI to achieve results that best fit your objectives.

## Part 3: Prompt engineering (15 min)


Prompt engineering is the art and science of designing inputs that guide Large Language Models (LLMs), such as Generative Pre-trained Transformer (GPT) models, to produce specific, high-quality responses. It is a foundational skill when working with LLMs because the precision with which we articulate our prompts significantly affects the model's performance. A well-crafted prompt can lead to outputs that are not only accurate but also creative and contextually relevant, showcasing the model's capabilities to their fullest extent.

### Engaging with Prompt Engineering
Before we dive into specific tactics for effective prompt engineering, it's important to understand that the goal is to communicate with the model in its language. This means being clear, direct, and detailed in your requests.

#### Tactics:
<ul>
    <li> <b>Include details in your query</b> to get more relevant answers.  
        <details>
            <summary>Example</summary>
            Often people ask questions that are too broad or vague. Remember, the AI can't read your mind ;)
            <br>
            Your goal is to extract specific information from the AI.
            <p> <i>Bad:</i> "Tell me about dogs."</p>
            <p> <i>Good:</i> "Provide a detailed comparison between the adaptability, exercise needs, and temperament of Labrador Retrievers and Border Collies for potential dog owners."</p>
        </details>
    </li>
    <li> <b>Ask the model to adopt a persona</b> for more tailored responses.
        <details>
            <summary>Example</summary>
            Your goal is to make the interaction more engaging or specific.
            <p> <i>Bad:</i> "Explain quantum physics."</p>
            <p> <i>Good:</i> "Pretend you're a renowned physicist explaining the concepts of quantum physics to a high school student in a way that's easy to understand."</p>
        </details>
    </li>
    <li> <b>Use delimiters</b> to clearly indicate distinct parts of the input.
        <details>
            <summary>Example</summary>
            Your goal is to organize a multi-part question.
            <p> <i>Bad:</i> "What is the capital of France and tell me about its history."</p>
            <p> <i>Good:</i> "Question 1: What is the capital of France? | Question 2: Provide a brief history of the capital."</p>
        </details>
    </li>
    <li> <b>Specify the steps</b> required to complete a task.
        <details>
            <summary>Example</summary>
            Your goal is to get a walkthrough.
            <p> <i>Bad:</i> "How to bake a cake."</p>
            <p> <i>Good:</i> "List all the steps necessary to bake a chocolate cake, then list the ingredients with quantities and the baking time. Finally, estimate the total time needed."</p>
        </details>
    </li>
    <li> <b>Provide examples</b> to illustrate the type of response you're seeking.
        <details>
            <summary>Example</summary>
            Your goal is to clarify your expectations.
            <p> <i>Bad:</i> "Generate a catchy slogan for my product."</p>
            <p> <i>Good:</i> "Generate a catchy slogan for my eco-friendly water bottle product. For example, something like 'Hydrate Sustainably' or 'Drink Green, Live Clean'."</p>
        </details>
    </li>
    <li> <b>Specify the desired length</b> of the output to control verbosity.
        <details>
            <summary>Example</summary>
            Your goal is to manage the depth of the response.
            <p> <i>Bad:</i> "Write an article on climate change."</p>
            <p> <i>Good:</i> "Write a concise 300-word article on the impacts of climate change on global weather patterns."</p>
        </details>
    </li>
</ul>
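
Several of these tactics can be combined when building the `messages` list in code. The following is a minimal sketch, assuming a hypothetical helper `build_messages`; the wording of the prompts is only an example, and triple quotes are used as the delimiter around the source text.

```python
def build_messages(
    persona: str, task: str, source_text: str, max_words: int
) -> list[dict[str, str]]:
    """Combine a persona, a length limit, and a delimited source text into chat messages."""
    system = f"You are {persona}. Keep the answer under {max_words} words."
    user = (
        f"{task}\n\n"
        "The text to work with is enclosed in triple quotes.\n"
        f'"""\n{source_text}\n"""'
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


messages = build_messages(
    persona="a renowned physicist explaining ideas to a high school student",
    task="Summarise the main idea of the text.",
    source_text="Quantum physics describes nature at the smallest scales.",
    max_words=100,
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed directly as the `messages` argument of `client.chat.completions.create`.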


### Applying What We've Learned
Now that we've outlined the key tactics for effective prompt engineering, let's put this knowledge into practice.

### **Task 3.1**


Imagine you're working on the Cogito Project **TutorAI**, a cutting-edge AI tool designed to support students in their study efforts by creating concise, informative flashcards from dense academic texts. Your challenge is to engineer a prompt that instructs the LLM to distill complex material into easy-to-review flashcards, focusing on key concepts, definitions, and examples relevant to an upcoming exam.

* **Extract Key Concepts and Definitions:** The AI must identify and summarize the main ideas and definitions found in a given academic text. This involves discerning the most important points that are crucial for understanding the subject matter.

* **Format the Information for Flashcards:** The output should be structured in a way that is suitable for flashcard creation. Each flashcard will have a term or concept on one side and its definition or explanation on the other side, along with an example if appropriate.

* **Control the Length:** Each flashcard's content (term/definition/example) should be concise, aiming for no more than 50 words per side to facilitate quick review and memorization.

This task will test your ability to use detailed queries, specify a structure, and control the output length—all crucial aspects of prompt engineering. Remember, the effectiveness of your prompt will directly influence the quality and relevance of the AI's response. Good luck!
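
Asking for a fixed output format makes the response easy to check in code. The sketch below assumes you instructed the model to write each card as a `Q:` line and an `A:` line, with cards separated by `---`; that format and the helpers `parse_flashcards` and `sides_over_limit` are hypothetical choices, not requirements of the task.

```python
def parse_flashcards(raw: str, delimiter: str = "---") -> list[dict[str, str]]:
    """Split the model output into cards, each with a 'front' and a 'back'."""
    cards: list[dict[str, str]] = []
    for block in raw.split(delimiter):
        front, back = "", ""
        for line in block.strip().splitlines():
            if line.startswith("Q:"):
                front = line[2:].strip()
            elif line.startswith("A:"):
                back = line[2:].strip()
        if front and back:  # skip blocks that do not follow the format
            cards.append({"front": front, "back": back})
    return cards


def sides_over_limit(cards: list[dict[str, str]], max_words: int = 50) -> list[str]:
    """Return every card side that exceeds the word limit."""
    return [
        side
        for card in cards
        for side in (card["front"], card["back"])
        if len(side.split()) > max_words
    ]
```

If `sides_over_limit` returns anything, the length instruction in your prompt probably needs to be more explicit.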

In [20]:
book_paragraphs: str = """
Chapter 1 - Epic Introduction
Since the dawn of time, humans have tried to define how we think, and this struggle has led us to create artificial intelligence. Historically, four approaches to artificial intelligence have been followed, each described below.

Acting Humanly
If we can't distinguish between a computer and a human, the computer is said to act humanly. The computer's capability to act humanly can be tested by performing a turing test. A computer passes the turing test if a human interrogator cannot tell whether he is communicating with a computer or a person. To pass a turing test, the computer would need to possess the following capabilities:

Natural language processing to enable it to communicate successfully.
Knowledge representation to store what it knows or hears.
Automated reasoning to use the stored information to draw conclusions.
Machine learning to adapt to new circumstances and to detect patterns.

Thinking humanly
To make a computer think like a human, we must know how humans think. The computers ability to think humanly can be determined by comparing the computer's input-output mechanism by the corresponding human behaviour.

Acting Rationally
An agent is something that acts. A rational agent is an agent that does the right thing based on what it knows, its functions, and the surrounding environment; it acts so that it achieves the best expected outcome.

Thinking rationally
Using sound logic rules to reach the right conclusion.

A relevant quote, demonstrating the logical rule of modus ponens: "Socrates is a man; all men are mortal; therefore, Socrates is mortal.
"""

def generate_flashcards_from_paragraphs(paragraph: str) -> str:
  completion = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
      # TODO: Create a prompt or combination of "system" and "user" prompts to achieve tasks objectives
      {"role": "system", "content": "You are an expert at creating educational flashcards. Generate a set of flashcards from the provided text."},
      {"role": "user", "content": paragraph},  # use the function argument, not the global variable
    ]
  )
  return completion.choices[0].message.content

flashcards = generate_flashcards_from_paragraphs(book_paragraphs)
print(f"The Model responded with the following flashcards: \n'{flashcards}'")

The Model responded with the following flashcards: 
'Flashcards:

1. **Approaches to Artificial Intelligence**
   - Acting Humanly
   - Thinking Humanly
   - Acting Rationally
   - Thinking Rationally

2. **Acting Humanly**
   - Computer passes Turing test if indistinguishable from human
   - Capabilities needed: Natural language processing, knowledge representation, automated reasoning, machine learning

3. **Thinking Humanly**
   - Making computer think like a human
   - Comparing input-output mechanism to human behavior

4. **Acting Rationally**
   - Rational agent does the right thing based on knowledge, functions, and environment
   - Seeks best expected outcome

5. **Thinking Rationally**
   - Uses sound logic rules to reach correct conclusions

6. **Logical Rule - Modus Ponens**
   - Example: "Socrates is a man; all men are mortal; therefore, Socrates is mortal."'


In [23]:
book_paragraphs: str = """
Chapter 1 - Epic Introduction
Since the dawn of time, humans have tried to define how we think, and this struggle has led us to create artificial intelligence. Historically, four approaches to artificial intelligence have been followed, each described below.

Acting Humanly
If we can't distinguish between a computer and a human, the computer is said to act humanly. The computer's capability to act humanly can be tested by performing a turing test. A computer passes the turing test if a human interrogator cannot tell whether he is communicating with a computer or a person. To pass a turing test, the computer would need to possess the following capabilities:

Natural language processing to enable it to communicate successfully.
Knowledge representation to store what it knows or hears.
Automated reasoning to use the stored information to draw conclusions.
Machine learning to adapt to new circumstances and to detect patterns.

Thinking humanly
To make a computer think like a human, we must know how humans think. The computers ability to think humanly can be determined by comparing the computer's input-output mechanism by the corresponding human behaviour.

Acting Rationally
An agent is something that acts. A rational agent is an agent that does the right thing based on what it knows, its functions, and the surrounding environment; it acts so that it achieves the best expected outcome.

Thinking rationally
Using sound logic rules to reach the right conclusion.

A relevant quote, demonstrating the logical rule of modus ponens: "Socrates is a man; all men are mortal; therefore, Socrates is mortal.
"""

def generate_flashcards_from_paragraphs(paragraph: str) -> str:
    completion = client.chat.completions.create(
        model=MODEL_NAME,
        messages=[
            # TODO: Create a prompt or combination of "system" and "user" prompts to achieve the task's objectives
            {"role": "system", "content": """You are an expert at creating educational flashcards from text. Your task is to extract key concepts and information from the given text and format them into a question-answer format suitable for studying.

            Follow these steps to complete the task:
            1. Read through the provided text to identify important concepts and details.
            2. For each key concept, create a flashcard with a clear question on one side and a detailed answer on the other.
            3. Ensure that the questions cover the main ideas, definitions, and important facts mentioned in the text.
            4. Use delimiters to separate distinct flashcards.
            Generate the flashcards from the following text:
            """},

            {"role": "user", "content": paragraph},
        ],
    )
    return completion.choices[0].message.content

flashcards = generate_flashcards_from_paragraphs(book_paragraphs)
print(f"The Model responded with the following flashcards: \n'{flashcards}'")

The Model responded with the following flashcards: 
'What is the Turing test used for in artificial intelligence?
The Turing test is used to determine if a computer can act humanly by testing if a human interrogator can distinguish between communicating with a computer or a person. To pass the test, the computer needs capabilities like natural language processing, knowledge representation, automated reasoning, and machine learning.

How is a rational agent defined in the context of artificial intelligence?
A rational agent is an entity that acts based on what it knows, its functions, and the surrounding environment to achieve the best expected outcome. It does the right thing considering the available information and circumstances.

What does it mean for a computer to think rationally in artificial intelligence?
Thinking rationally in artificial intelligence involves using sound logic rules to reach the right conclusions. It focuses on making logical deductions and decisions based on e

## Part 4: Embeddings (15 min)

In [24]:
def create_embedding(prompt: str, model="text-embedding-ada-002") -> list[float]:
    return client.embeddings.create(model=model, input=prompt).data[0].embedding

print(create_embedding("This is an embedding!"))

# Each entry pairs an embedding with the text it was created from
database: list[tuple[list[float], str]] = []

# Create embedding-text pairs and add them to the database
corresponding_text_1 = "This is an embedding!"
embedding_1 = create_embedding(corresponding_text_1)
database.append((embedding_1, corresponding_text_1))

corresponding_text_2 = "Sverre is CTO of Cogito NTNU"
embedding_2 = create_embedding(corresponding_text_2)
database.append((embedding_2, corresponding_text_2))

[-0.023372555151581764, 0.003695604158565402, 0.0025230238679796457, 0.0012804510770365596, -0.010711323469877243, 0.007325333077460527, 0.005836551543325186, -0.01442669052630663, -0.008774589747190475, -0.03678476810455322, 0.011962953954935074, 0.03633681312203407, -0.013043309561908245, -0.0035704411566257477, 0.0040052179247140884, 0.01750965416431427, 0.023754632100462914, 0.003975574392825365, 0.0071145324036479, -0.020605793222784996, -0.018076181411743164, 0.010428059846162796, 0.005457768682390451, -0.012918146327137947, -0.0071869948878884315, 0.00650518573820591, 0.014532091096043587, -0.018524134531617165, -0.0017736924346536398, -0.019630838185548782, 0.0022644633427262306, -0.006956431549042463, -0.011600639671087265, -0.02969658374786377, -0.0038998175878077745, -0.0022183507680892944, -0.0015933588147163391, -0.029512133449316025, 0.014057788997888565, 0.00310766720212996, 0.013221172615885735, -0.007562484126538038, -0.0031735424418002367, -0.012984021566808224, -0.01

In [25]:
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """
    Takes two vectors a and b and measures how similar they are using cosine similarity

    Args:
        a (list[float]): The first vector
        b (list[float]): The second vector

    Returns:
        The cosine similarity of a and b as a float between -1 and 1,
        where 1 means the vectors point in the same direction
    """
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search_docs(query: str, database: list[tuple[list[float], str]], top_k: int = 1) -> list[tuple[float, str]]:
    """
    Searches the database for the most similar documents to the query

    Args:
        query (str): The query to search for
        database (list[tuple[list[float], str]]): The embedding-text pairs to search in
        top_k (int): The number of documents to return
    Returns:
        A list of (similarity, text) tuples for the top_k most similar documents, best match first
    """
    query_embedding = create_embedding(query)
    results = []
    for doc_embedding, doc in database:
        similarity = cosine_similarity(query_embedding, doc_embedding)
        results.append((similarity, doc))
    # Sort on the similarity score only, highest first
    return sorted(results, key=lambda result: result[0], reverse=True)[:top_k]

search_docs("Who is the CTO of Cogito?", database)

[(0.8882390742588276, 'Sverre is CTO of Cogito NTNU')]
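To build intuition for the similarity scores returned by `search_docs`, the sketch below applies the same cosine formula to tiny hand-made 2-D vectors instead of real embeddings. The vectors are made up for illustration, and `cosine_similarity_demo` is a hypothetical pure-Python stand-in for the NumPy version in the cell above.

```python
import math

def cosine_similarity_demo(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b, written without NumPy."""
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)

# Same direction (length does not matter), orthogonal, and opposite vectors
same_direction = cosine_similarity_demo([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity_demo([1.0, 0.0], [0.0, 3.0])
opposite = cosine_similarity_demo([1.0, 0.0], [-1.0, 0.0])

print(same_direction, orthogonal, opposite)  # prints: 1.0 0.0 -1.0
```

Only the direction of the vectors matters, not their length, which is why texts of very different lengths can still receive a high score when they are about the same thing.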

### **Task 4.1**
*Create a new embedding with some text of your choice, and add it to the database. See if you can make the model find it.*

<details>
    <summary><strong>Hint:</strong></summary>
      - Look at the previous two cells
</details>

In [26]:
# TODO: Create an embedding for some text and append it to the database
# Example text; replace it with text of your own choice
new_text = "Cogito NTNU is Norway's largest technical AI student organisation"
new_embedding = create_embedding(new_text)
database.append((new_embedding, new_text))

while True:
    user_input: str = input("What would you like to ask the model: ")

    if user_input == "q":
        print("[SUCCESS] Shut down")
        break

    answer = search_docs(user_input, database, top_k=1)

    print(f"The AI gave the answer: {answer}\n")

What would you like to ask the model: who are you?
The AI gave the answer: [(0.759581346207335, 'This is an embedding!')]

What would you like to ask the model: who's steve?
The AI gave the answer: [(0.7443181673040701, 'Sverre is CTO of Cogito NTNU')]

What would you like to ask the model: who's sverre?
The AI gave the answer: [(0.8542858374675332, 'Sverre is CTO of Cogito NTNU')]

What would you like to ask the model: who is the CTO of Cogito?
The AI gave the answer: [(0.8827274723061123, 'Sverre is CTO of Cogito NTNU')]

What would you like to ask the model: Is Sverre the CTO of Cogito NTNU?
The AI gave the answer: [(0.9655346642530646, 'Sverre is CTO of Cogito NTNU')]

What would you like to ask the model: Are you the CTO?
The AI gave the answer: [(0.8220509357827853, 'Sverre is CTO of Cogito NTNU')]

What would you like to ask the model: Are cats and dogs the same?
The AI gave the answer: [(0.7517455434144434, 'This is an embedding!')]

What would you like to ask the model: smarty

#### Task 5.2
*Work together with others and create something cool. Try to use the different lessons you have learned. Examples:*
* Call an external API to access live data
* Create more complex math operations to do calculus
* Create bash scripts to create folders or organize a folder
* Access a database to get information
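As a possible starting point for the first idea, the sketch below shows the two pieces a function-calling setup usually needs: a tool description to pass in the `tools` argument of `client.chat.completions.create`, and a small dispatcher that runs the function the model asks for. The names `get_price`, `PLACEHOLDER_PRICES`, and `run_tool_call` are hypothetical, and the prices are hard-coded placeholders rather than real market data; a real solution would replace the lookup with a call to a live API.

```python
import json

# Placeholder data standing in for a live API (assumption: not real prices)
PLACEHOLDER_PRICES = {"EXAMPLECOIN": 1.0, "DEMOCOIN": 0.5}

def get_price(symbol: str) -> str:
    """Return a JSON string with the placeholder price for a ticker symbol."""
    price = PLACEHOLDER_PRICES.get(symbol.upper())
    if price is None:
        return json.dumps({"error": f"unknown symbol {symbol}"})
    return json.dumps({"symbol": symbol.upper(), "price": price})

# Tool description in the JSON-schema style used by the `tools` argument
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_price",
            "description": "Get the current price of a cryptocurrency",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "description": "Ticker symbol"},
                },
                "required": ["symbol"],
            },
        },
    }
]

# Maps the function names the model may request to local Python functions
AVAILABLE_FUNCTIONS = {"get_price": get_price}

def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a tool call (function name plus JSON arguments) to local code."""
    function = AVAILABLE_FUNCTIONS.get(name)
    if function is None:
        return json.dumps({"error": f"unknown function {name}"})
    return function(**json.loads(arguments_json))

# Simulates the name and arguments the model would send back in a tool call
print(run_tool_call("get_price", '{"symbol": "examplecoin"}'))
```

In a real run, the name and arguments would come from the entries in `response.choices[0].message.tool_calls`, and the returned JSON string would be sent back to the model in a message with the role `"tool"`.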



In [39]:
prompt_crypto = """
You are a crypto expert looking for good investments.
Check https://coinmarketcap.com/ to gather relevant information about different cryptocurrencies.


Samples

Question: What is the most valuable cryptocurrency?
Answer: The most valuable cryptocurrency is {name}
        It has a value of {value}

"""

question = "What are the ten least valuable cryptocurrencies?"

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": prompt_crypto},
        {"role": "user", "content": question},
    ],
)

print(response.choices[0].message.content)

The ten least valuable crypto currencies are as follows:

1. Pinkcoin (PINK)
2. ZENZO (ZNZ)
3. Indorse Token (IND)
4. Swace (SWACE)
5. Bitcoin Atom (BCA)
6. ZCore (ZCR)
7. Bispex (BPX)
8. Kryptofranc (KYF)
9. DopeCoin (DOPE)
10. Asch (XAS)



### Running local LLMs
If you want to experiment with Large Language Models (LLMs) without paying for API calls to services like OpenAI's, or if you work with sensitive or proprietary data that should not leave your machine, running pre-trained models on your own hardware is a viable alternative. The open-source community, particularly the [Hugging Face](https://huggingface.co/models) model hub and its Transformers library, offers access to a wide range of models, including some developed by leading tech companies.

One of the standout models available is Google's FLAN-T5-XL, part of the T5 (Text-to-Text Transfer Transformer) family. It takes T5's text-to-text architecture and instruction fine-tunes it on a broad collection of tasks, which makes it better at following natural-language instructions than the base T5 models.

To get started with FLAN-T5-XL or any other model from the Transformers library, you will need to install the necessary packages (`transformers` and a backend such as PyTorch) and understand how to load and interact with the model. Below is a basic Python script that demonstrates how to set up and use FLAN-T5-XL for generating text based on input prompts:
```python
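import sys
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, GenerationConfig

# Double quotes are needed here because the prompt contains an apostrophe
example_prompt = "What is the value of being accepted into Cogito NTNU, Norway's largest technical AI student organisation, in the middle of an AI revolution?"

model_name = "google/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

config = GenerationConfig(max_new_tokens=200)

def generate(prompt: str) -> str:
    """Tokenize the prompt, run the model, and decode the generated tokens."""
    tokens = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**tokens, generation_config=config)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Answer the example prompt first
print(generate(example_prompt))

# Then answer any prompts piped in on standard input, one per line
for line in sys.stdin:
    if line.strip():
        print(generate(line.strip()))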
```



### Context windows

The context window refers to the maximum amount of text (measured in tokens) the model can consider at one time when generating responses or performing tasks. This limit is intrinsic to the model's architecture and significantly influences how we design prompts and interpret model outputs.

#### Significance of the Context Window
The size of the context window determines how much information the model can "see" and use at any given moment. For example, GPT-3 has a context window of 2048 tokens. This means it can consider up to 2048 tokens of preceding text to generate its responses. The implications are twofold:

* **Prompt Design:** When crafting prompts for an LLM, it's vital to ensure that the most relevant information is within the model's context window. Information beyond this limit won't influence the model's output, emphasizing the need for concise and focused prompt design.

* **Sequential Tasks:** For tasks requiring more information than the context window allows, you may need to design a series of prompts that build on each other, ensuring each segment of the task remains within the model's view.
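The sequential approach can be sketched as splitting a long text into pieces that each fit a token budget and then prompting the model once per piece. This is a minimal sketch, not a production chunker: `split_into_chunks` is a hypothetical helper, and it counts whitespace-separated words as a rough stand-in for model tokens, so a real tokenizer will give different counts.

```python
def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Assumption: words approximate tokens. Real tokenizers usually produce
    more tokens than words, so leave some headroom in max_tokens.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]

document = "one two three four five six seven"
chunks = split_into_chunks(document, max_tokens=3)
print(chunks)  # prints: ['one two three', 'four five six', 'seven']
```

Each chunk can then be sent in its own request, for example by summarizing every chunk and combining the summaries in a final prompt.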

Although newer models support context windows of more than 100,000 tokens (for example GPT-4 Turbo and some open-source models), challenges persist. In particular, such models tend to make the best use of information at the beginning and end of the provided text and may underuse the middle portion. This is known as [lost in the middle](https://arxiv.org/pdf/2307.03172.pdf).
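One practical consequence is that the order of the material in a long prompt can matter. The sketch below shows one possible mitigation, assuming you already have a list of documents sorted from most to least relevant: it places the strongest documents at the start and end of the prompt and the weakest in the middle. The function name `reorder_for_long_context` is hypothetical, and whether this helps depends on the model and the task.

```python
def reorder_for_long_context(docs_by_relevance: list) -> list:
    """Reorder documents so the most relevant ones sit at the edges.

    Input is sorted from most to least relevant. Documents are dealt
    alternately to the front and the back, so relevance decreases
    towards the middle of the returned list.
    """
    front, back = [], []
    for index, doc in enumerate(docs_by_relevance):
        if index % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    return front + back[::-1]

# 1 is the most relevant document, 5 the least relevant
print(reorder_for_long_context([1, 2, 3, 4, 5]))  # prints: [1, 3, 5, 4, 2]
```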

#### New insights from an operating-system-inspired model
[MemGPT](https://memgpt.readme.io/docs/index) introduces a strategic approach to memory management, organized around two core concepts relevant to understanding context windows in LLMs:

* **Memory Hierarchy:** It segments memory into two types: a "main context" analogous to RAM, which is smaller and faster, and an "external context" similar to disk storage, which is larger but slower. This structure requires the deliberate transfer of information between the two contexts, much like an operating system moves data between RAM and disk to provide virtual memory.

* **Process Management:** Similar to an operating system's role in managing tasks, MemGPT regulates the flow of information between the memory segments, the LLM, and users, ensuring efficient handling of processes.
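The memory hierarchy idea can be illustrated with a toy two-tier store. This is only a sketch of the concept and not MemGPT's actual API: the class `TwoTierMemory`, its methods, and the keyword-based `recall` are all hypothetical simplifications.

```python
from collections import deque

class TwoTierMemory:
    """Toy model of a small 'main context' backed by a larger 'external context'."""

    def __init__(self, main_capacity: int):
        self.main_capacity = main_capacity
        self.main_context = deque()   # messages the LLM would see in its prompt
        self.external_context = []    # archived messages, outside the prompt

    def add(self, message: str) -> None:
        """Add a message and move the oldest ones to external storage when full."""
        self.main_context.append(message)
        while len(self.main_context) > self.main_capacity:
            self.external_context.append(self.main_context.popleft())

    def recall(self, keyword: str) -> list[str]:
        """Find archived messages containing the keyword (naive substring search)."""
        return [m for m in self.external_context if keyword.lower() in m.lower()]

memory = TwoTierMemory(main_capacity=2)
for message in ["My name is Sverre", "I like embeddings", "What is a context window?"]:
    memory.add(message)

print(list(memory.main_context))  # the two most recent messages
print(memory.recall("name"))      # prints: ['My name is Sverre']
```

A more realistic version could replace the substring search with the embedding search from Part 4, so that archived messages are retrieved by meaning rather than by exact wording.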

### Fine-tuning Large Language Models
Fine-tuning is a process that adjusts a pre-trained model to a specific task or dataset, enhancing its ability to perform on tasks it wasn't specifically trained for initially. This method leverages the general understanding that the model has developed during its initial training phase, applying it to a more focused domain or problem set. Fine-tuning can significantly improve the performance of LLMs on specialized tasks, making it a powerful tool for developers and researchers.

#### Why Fine-tune?
* **Customization:** Tailors the model to understand and generate responses based on specific jargon, styles, or formats unique to your dataset.
* **Improved Performance:** Enhances the model's accuracy and efficiency on tasks that may differ from the data it was originally trained on.
* **Cost-Effectiveness:** Utilizes the foundational knowledge the model has gained, reducing the need for training from scratch on vast datasets.

#### How to Fine-tune an LLM
1. **Select a Pre-trained Model:** Choose a model that closely aligns with your task in terms of language and domain. Models available on platforms like Hugging Face offer a good starting point.

2. **Prepare Your Dataset:** Your dataset should be representative of the task at hand and formatted in a way that the model can understand. It typically involves splitting the data into training, validation, and test sets.

3. **Customize Training Parameters:** Adjust parameters such as learning rate, batch size, and the number of epochs to balance between retaining learned knowledge and adapting to the new dataset.

4. **Train the Model:** Use a suitable environment and framework, like PyTorch or TensorFlow, along with Hugging Face's Transformers library, to fine-tune the model on your dataset.

5. **Evaluate and Iterate:** Test the model's performance on a separate validation set, and iteratively adjust your approach based on the results.

An example of fine-tuning with OpenAI can be found in the Cogito project [MarketingAI](https://github.com/CogitoNTNU/MarketingAI/blob/main/src/fine_tuning/fine_tuning_job.py).
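The dataset preparation step can be sketched as follows, assuming the chat-style JSONL format that OpenAI's fine-tuning endpoint accepts, where each line is a JSON object with a `messages` list. The question/answer pairs, the helper names, and the 75/25 split are illustrative assumptions and are not taken from the MarketingAI project.

```python
import json
import random

# Hypothetical question/answer pairs standing in for a real dataset
examples = [
    ("Who is the CTO of Cogito NTNU?", "Sverre is CTO of Cogito NTNU."),
    ("What is a context window?", "The maximum amount of text, measured in tokens, a model can consider at one time."),
    ("What is fine-tuning?", "Adjusting a pre-trained model to a specific task or dataset."),
    ("What is an embedding used for here?", "Finding the stored text most similar to a query."),
]

def to_chat_record(question: str, answer: str, system_prompt: str = "You are a helpful assistant.") -> dict:
    """Wrap one question/answer pair as a chat-format training example."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }

def train_validation_split(records: list[dict], validation_fraction: float = 0.25, seed: int = 0):
    """Shuffle a copy of the records and split it into (train, validation)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)

records = [to_chat_record(question, answer) for question, answer in examples]
train, validation = train_validation_split(records)

print(len(train), len(validation))  # prints: 3 1
print(to_jsonl(validation))
```

The two JSONL strings would then be written to files and uploaded as the training and validation data for the fine-tuning job. A real dataset needs far more examples than this toy list.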

### Leveraging OpenAI Across Diverse Programming Environments
This course primarily uses Python to interact with OpenAI's models, but there are libraries for other languages as well. These include, but are not limited to, TypeScript/JavaScript, Java, C#, Go, C++, and PHP, alongside others like Clojure, Kotlin, Ruby, Rust, and Scala. This wide-ranging support makes OpenAI's models usable in virtually any software development domain, from web development and mobile applications to enterprise solutions.

[Read more at OpenAI Docs](https://platform.openai.com/docs/libraries/community-libraries)