<a href="https://colab.research.google.com/github/wandb/edu/blob/main/llm-apps-course/notebooks/01.%20Using_APIs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<!--- @wandbcode{llmapps-intro} -->

# Understanding LLM APIs

We will use the OpenAI API to generate text with OpenAI models.


### Setup

In [1]:
%pip install --upgrade openai tiktoken wandb -qq
%pip install weave

Note: you may need to restart the kernel to use updated packages.

In [2]:
import os
import openai
import tiktoken
import wandb
from pprint import pprint
from getpass import getpass
from wandb.integration.openai import autolog
import weave

You will need an OpenAI API key to run this notebook. You can get one [here](https://platform.openai.com/account/api-keys).

In [3]:
if os.getenv("OPENAI_API_KEY") is None:
  if any(['VSCODE' in x for x in os.environ.keys()]):
    print('Please enter password in the VS Code prompt at the top of your VS Code window!')
  os.environ["OPENAI_API_KEY"] = getpass("Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n")
  openai.api_key = os.getenv("OPENAI_API_KEY", "")

assert os.getenv("OPENAI_API_KEY", "").startswith("sk-"), "This doesn't look like a valid OpenAI API key"
print("OpenAI API key configured")

OpenAI API key configured


Let's enable W&B Weave tracing to track our experiments. The older `wandb` autologging calls are left commented out in the cell below.

In [None]:
# from wandb.integration.openai import autolog
# wandb.init(project="llmapps", job_type="introduction")
# autolog()

weave.init("llmapps")

<weave.trace.weave_client.WeaveClient at 0x7b7403287bc0>

# Tokenization

In [4]:
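# text-davinci-003 maps to the p50k_base encoding; gpt-3.5-turbo uses cl100k_base,
# so the same text gets different token IDs under the chat model's tokenizer.
encoding = tiktoken.encoding_for_model("text-davinci-003")
enc = encoding.encode("Weights & Biases is awesome!")
print(enc)
print(encoding.decode(enc))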

[1135, 2337, 1222, 8436, 1386, 318, 7427, 0]
Weights & Biases is awesome!


We can decode the tokens one by one:

In [5]:
for token_id in enc:
    print(f"{token_id}\t{encoding.decode([token_id])}")

1135	We
2337	ights
1222	 &
8436	 Bi
1386	ases
318	 is
7427	 awesome
0	!


> Note how tokens in the middle of the text include their leading space.
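The leading-space behaviour can be checked without calling the tokenizer again. The sketch below is only an illustration: it copies the decoded pieces from the output above into a plain list (the variable names are made up) and shows that joining them with no separator restores the original sentence, and that the token count differs from the word count.

```python
# Decoded token strings copied from the cell output above.
pieces = ["We", "ights", " &", " Bi", "ases", " is", " awesome", "!"]

# No separator is needed: the spaces live inside the tokens themselves.
text = "".join(pieces)
print(text)

# Tokens that carry their own leading space.
with_space = [p for p in pieces if p.startswith(" ")]
print(with_space)

# Tokens are not words: compare token, word, and character counts.
n_tokens = len(pieces)
n_words = len(text.split())
n_chars = len(text)
print(n_tokens, n_words, n_chars)
```

Because API limits such as `max_tokens` are counted in tokens rather than words or characters, this difference matters when budgeting prompt and reply length.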

# Sampling

Let's sample some text from the model. For this, let's create a wrapper function around the `temperature` parameter.
Higher temperatures produce more random samples.

In [8]:
@weave.op()
def generate_with_temperature(temp):
  """Generate text with a given temperature, higher temperature means more randomness"""
  client = openai.OpenAI()
  response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Say something about Weights & Biases"}
    ],
    max_tokens=50,  # cap the reply length; longer replies are cut off mid-sentence
    temperature=temp,
  )
  return response.choices[0].message.content  # text of the first (and only) choice

In [9]:
for temp in [0, 0.5, 1, 1.5, 2]:
  pprint(f'TEMP: {temp}, GENERATION: {generate_with_temperature(temp)}')

[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-24a5-736a-864b-a5dbf0011aa2
[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-2db4-794d-bd9b-30cdae09aa50


('TEMP: 0, GENERATION: Weights & Biases is a powerful tool for machine '
 'learning experimentation and tracking. It allows users to easily log and '
 'visualize their model training process, making it easier to iterate and '
 'improve upon their models. It also provides features for collaboration and '
 'sharing, making it')


[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-347c-7be9-a68e-f6ceadacce02


('TEMP: 0.5, GENERATION: Weights & Biases is a powerful tool for machine '
 'learning experimentation and model tracking. It allows users to easily log '
 'and visualize their experiments, track model performance, and collaborate '
 'with team members. It is a valuable resource for data scientists and machine '
 'learning engineers looking')


[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-3750-7f30-bae4-f8f5c522ac26


('TEMP: 1, GENERATION: Weights & Biases is a powerful machine learning tool '
 'that helps researchers and data scientists track, visualize, and optimize '
 'their machine learning models. It allows users to easily experiment with '
 'different hyperparameters, compare results, and collaborate with team '
 'members. It is a')


[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-3cb9-7ddf-ae9d-5618867cd5c1


('TEMP: 1.5, GENERATION: Weights & Biases is a powerful tool for tracking, '
 'visualizing, and optimizing machine learning experiments. It allows you to '
 'easily monitor and manage your deep learning models, providing insights into '
 'their performance and helping you make quick decisions for improving them. '
 'With its')
('TEMP: 2, GENERATION: Weights & Biases (wandb) is a DataAge Dakota Phillip '
 'poised Talks OPCcludesrogram humanitiesereaPublisher04+c aff transmitkil '
 'Clowncareer sine '
 'mingAlongwegianondheim_PATTERN已ARGV_FLAG_geometrycomputdr-Juai '
 'AjoutinesEffBefore teh kanwarn')
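
The jump from coherent text at temperature 1.5 to noise at temperature 2 follows from how temperature reshapes the next-token distribution. Here is a minimal sketch, assuming the standard softmax-with-temperature formulation. The logits are made up, the function name is hypothetical, and how the hosted model samples internally is not shown in this notebook.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding (all mass on the best token).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0, 0.0]  # hypothetical scores for four candidate tokens
for temp in [0, 0.5, 1, 2]:
    probs = softmax_with_temperature(logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Low temperatures concentrate probability on the top-scoring token, while high temperatures flatten the distribution so that unlikely tokens get sampled far more often.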


[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-6a2a-7acb-ab68-519d08a8ac9c
[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-7025-7045-abed-ed8924381db8
[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-72ff-7dbe-8d6d-f6ee3c09de33
[36m[1mweave[0m: 🍩 https://wandb.ai/My1Team/llmapps/r/call/0197c038-75c4-7c9b-9265-f00b127c18d6


You can also use the [`top_p` parameter](https://platform.openai.com/docs/api-reference/completions/create#completions/create-top_p) to control the diversity of the generated text. It implements nucleus sampling: at each step the model samples only from the smallest set of most likely tokens whose cumulative probability reaches `top_p`. For example, with `top_p=0.9` the candidates are the most likely tokens that together account for 90% of the probability mass, and the remaining tail is discarded. The higher the `top_p`, the more low-probability tokens stay in the candidate set, so the output becomes more varied. OpenAI generally recommends adjusting either `temperature` or `top_p`, but not both.
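
To make the cutoff concrete, here is a minimal sketch of top-p filtering over a toy distribution. The token probabilities and the `top_p_filter` helper are hypothetical and only illustrate the idea; this is not how the API is implemented.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving candidates form a valid distribution again.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token probabilities.
probs = {" tool": 0.5, " platform": 0.3, " library": 0.15, " Clown": 0.05}
for top_p in [0.01, 0.5, 0.9, 1]:
    print(top_p, top_p_filter(probs, top_p))
```

With a very small `top_p` only the single most likely token survives, which is why the lowest settings in the next cell behave almost like greedy decoding.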

In [10]:
@weave.op()
def generate_with_topp(topp):
  "Generate text with a given top-p, higher top-p means more randomness"
  client = openai.OpenAI()
  response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Say something about Weights & Biases"}
    ],
    max_tokens=50,  # cap the reply length; longer replies are cut off mid-sentence
    top_p=topp,
  )
  return response.choices[0].message.content  # text of the first (and only) choice

In [11]:
for topp in [0.01, 0.1, 0.5, 1]:
  pprint(f'TOP_P: {topp}, GENERATION: {generate_with_topp(topp)}')

('TOP_P: 0.01, GENERATION: Weights & Biases is a powerful tool for machine '
 'learning experimentation and tracking. It allows users to easily log and '
 'visualize their model training process, making it easier to iterate and '
 'improve upon their models. It also provides features for collaboration and '
 'sharing, making it')
('TOP_P: 0.1, GENERATION: Weights & Biases is a powerful tool for machine '
 'learning experimentation and tracking. It allows users to easily log and '
 'visualize their model training process, making it easier to iterate and '
 'improve upon their models. With features like hyperparameter tuning, '
 'experiment comparison, and')
('TOP_P: 0.5, GENERATION: Weights & Biases is a powerful tool for machine '
 'learning practitioners to track and visualize their experiments, collaborate '
 'with team members, and optimize their models. It provides a seamless way to '
 'monitor and improve the performance of machine learning models, making it an '
 'essential tool')

# Chat API

The sampling examples above already went through the chat completions endpoint with a single user message. Let's now look at the chat format more closely and see how the model responds to our messages. We have some control over the model's response by passing a message with the `system` role, which lets us steer the model toward a certain behaviour.

> We are using `gpt-3.5-turbo`, which is faster and cheaper than `text-davinci-003`.

In [None]:
MODEL = "gpt-3.5-turbo"
client = openai.OpenAI()
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say something about Weights & Biases"},
    ],
    temperature=0,
)

response

ChatCompletion(id='chatcmpl-BnpjbIqtcH9ycRF5PM5IDvXXWRosc', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Weights & Biases is a popular machine learning experiment tracking and visualization platform that helps researchers and data scientists track and visualize their machine learning experiments. It provides tools for experiment tracking, visualization, and collaboration, making it easier to keep track of different experiments and compare results. It is widely used in the machine learning community to improve productivity and streamline the experimentation process.', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1751216639, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier='default', system_fingerprint=None, usage=CompletionUsage(completion_tokens=71, prompt_tokens=25, total_tokens=96, completion_tokens_details=CompletionTokensDetails(accepted_predictio

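As you can see above, the response is a `ChatCompletion` object that holds the generated message along with metadata about the request, such as the model version, the finish reason, and the token usage.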

In [None]:
pprint(response.choices[0].message.content)

('Weights & Biases is a popular machine learning experiment tracking and '
 'visualization platform that helps researchers and data scientists track and '
 'visualize their machine learning experiments. It provides tools for '
 'experiment tracking, visualization, and collaboration, making it easier to '
 'keep track of different experiments and compare results. It is widely used '
 'in the machine learning community to improve productivity and streamline the '
 'experimentation process.')
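
The chat endpoint is stateless, so a follow-up question only has context if earlier turns are sent again in the `messages` list. The sketch below shows one way to assemble that list. `build_messages` is a hypothetical helper, the assistant reply is abbreviated by hand, and the follow-up question is made up.

```python
def build_messages(system_prompt, history, user_message):
    """Assemble the `messages` list: system prompt, prior turns, then the new user message."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return messages

# One earlier exchange; the assistant text is shortened here for readability.
history = [
    (
        "Say something about Weights & Biases",
        "Weights & Biases is a popular machine learning experiment tracking and visualization platform.",
    )
]

messages = build_messages(
    "You are a helpful assistant.",
    history,
    "Summarize that in one sentence.",  # hypothetical follow-up question
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as `messages=messages` to `client.chat.completions.create`, exactly like the hand-written list in the cell above.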


In [None]:
# Close the W&B run if one was started (the wandb.init call above is commented out)
wandb.finish()