## Calling Anthropic's Claude using a REST API
For students learning about AI, calling Anthropic's Claude models through a REST API is a practical way to explore natural language processing and build it into their own applications. The API is well documented: you send a text prompt in an HTTP request and receive the model's generated response. Working through the examples below gives hands-on experience with the request format, a few useful parameters, and the structure of the response.
### Table of Contents
1. [Set up your Anthropic API key](#setup)
2. [Make a simple API call to one of Claude's models](#simple)
3. [A few options for using the API](#options)
4. [Your assignment](#assignment)

## Set up your Anthropic API key<a name="setup"></a>

In [1]:
# This package is not installed in our SageMaker image.
# Every time you restart this JupyterLab instance, you will have to reinstall it.
%pip install python-dotenv
# Now import the objects we need
from dotenv import load_dotenv
# Other needed packages to import
import os
import requests
import json

Note: you may need to restart the kernel to use updated packages.


To store your API key for use with the requests package:
- Get the key from your account on https://console.anthropic.com. It will look something like: "sk-ant-api03-Iu4 ... U37M"
- Now, open a terminal from the Jupyter Launcher
    - Use the nano text editor (or any other editor)
    - Create a .env file, that is, a file with the exact name ".env" (files that start with '.' are hidden by default)
    - Add a line that looks like this: ANTHROPIC_API_KEY="your_key"
       - Insert your key in double quotes
    - Save the file and exit the text editor. In nano: Save (Ctrl-O), Exit (Ctrl-X)

Next, we will load this key into this session.

In [2]:
# Load environment variables from .env file
load_dotenv()
# Now you can access the environment variable
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
# You can print the key to make sure it is there, but I get nervous when I see a key printed somewhere.... Someone could steal it!
#print(anthropic_api_key)
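
One way to confirm that the key loaded without ever displaying it is to print a masked version. The sketch below is only an illustration: `mask_key` is a hypothetical helper, not part of `python-dotenv` or any Anthropic library, and it assumes the key is an ordinary string.

```python
import os


def mask_key(key, visible=4):
    """Return a display-safe version of a secret: first and last few characters only."""
    if not key:
        return "<missing>"
    if len(key) <= 2 * visible:
        # Too short to reveal any part of it safely
        return "*" * len(key)
    return f"{key[:visible]}...{key[-visible:]}"


# Reads the same environment variable that load_dotenv() populated above
print("Key loaded:", mask_key(os.getenv("ANTHROPIC_API_KEY")))
```

If this prints `<missing>`, the `.env` file was probably not found or the variable name is misspelled.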

## Make a simple API call to one of Claude's models<a name="simple"></a>
If you want to know more: https://docs.anthropic.com/claude/reference/messages_post

In [3]:
# Define the API endpoint:
url = 'https://api.anthropic.com/v1/messages'
# Define the headers
headers = {
    'x-api-key': anthropic_api_key,
    'anthropic-version': '2023-06-01',
    'content-type': 'application/json',
}

In [4]:
# Define the data
data = {
    'model': 'claude-3-sonnet-20240229', # Select a model from here: https://docs.anthropic.com/claude/docs/models-overview
    'max_tokens': 1024,
    'messages': [
        {'role': 'user', 'content': 'Write a blog post about large language models.'}
    ]
}

In [5]:
# Make the API POST request.
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200: # All went well
    print('The API call was successful.')
    # Store the response content into a variable
    generated_text = response.json()['content'][0]['text']
else:
    print(f"Error: {response.status_code} - {response.text}")
# When you execute this cell, you are paying for model inference. It will deduct money from your account.
# Therefore, I usually don't do anything in this cell other than store the response.

The API call was successful.
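
When a call fails, `response.text` usually holds a JSON body describing the problem. The sketch below pulls out the useful fields. `describe_error` is a hypothetical helper, and the `{"type": "error", "error": {...}}` layout is an assumption about the error format, so the function falls back to the raw text whenever the body looks different. The sample body is made up, not a real API response.

```python
import json


def describe_error(status_code, body_text):
    """Turn a failed HTTP response into a short, readable message."""
    try:
        body = json.loads(body_text)
    except (TypeError, ValueError):
        body = None  # Body was not JSON (for example, an HTML error page)
    error = body.get("error") if isinstance(body, dict) else None
    if isinstance(error, dict):
        detail = f"{error.get('type', 'unknown')}: {error.get('message', '')}"
    else:
        detail = body_text  # Assumed layout not found, so show the raw body
    return f"HTTP {status_code} - {detail}"


# Made-up error body for illustration only
sample = '{"type": "error", "error": {"type": "authentication_error", "message": "invalid x-api-key"}}'
print(describe_error(401, sample))
```

In the cells of this notebook, this could stand in for the `print(f"Error: ...")` line as `print(describe_error(response.status_code, response.text))`.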


In [6]:
# Look at the response
print(generated_text)

Here is a draft blog post about large language models:

Title: The Rise of Large Language Models and Their Impact

In recent years, we've seen the emergence of large language models (LLMs) – extremely capable artificial intelligence systems trained on vast amounts of text data to understand and generate human-like language with remarkable fluency. The names of some of the most prominent LLMs like GPT-3, BERT, and LaMDA have become familiar in both tech and mainstream media.  

So what exactly are large language models? At their core, they are machine learning models that use neural networks with billions or even trillions of parameters trained on massive datasets like online books, articles, websites, and other digital text. By ingesting and processing an immense corpus of human-written language, these models develop sophisticated skills for understanding context and meaning, answering questions, writing coherent passages, translating between languages, and more.  

The capabilities th

In [7]:
# Look closer at the entire response. Other data is included
response.json()

{'id': 'msg_01RwHuHkJyYhDxWG5ZkbfNEs',
 'type': 'message',
 'role': 'assistant',
 'model': 'claude-3-sonnet-20240229',
 'stop_sequence': None,
 'usage': {'input_tokens': 16, 'output_tokens': 874},
 'content': [{'type': 'text',
   'text': 'Here is a draft blog post about large language models:\n\nTitle: The Rise of Large Language Models and Their Impact\n\nIn recent years, we\'ve seen the emergence of large language models (LLMs) – extremely capable artificial intelligence systems trained on vast amounts of text data to understand and generate human-like language with remarkable fluency. The names of some of the most prominent LLMs like GPT-3, BERT, and LaMDA have become familiar in both tech and mainstream media.  \n\nSo what exactly are large language models? At their core, they are machine learning models that use neural networks with billions or even trillions of parameters trained on massive datasets like online books, articles, websites, and other digital text. By ingesting and pr
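
The `content` field is a list of blocks, so indexing `[0]` assumes the first block is text. A slightly more defensive sketch joins every text block and adds up the token counts from `usage`. `extract_text` and `total_tokens` are hypothetical helper names, and the sample dictionary is a shortened copy of the response printed above.

```python
def extract_text(message):
    """Join the text of every 'text' block in a Messages API response dict."""
    blocks = message.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def total_tokens(message):
    """Add up input and output tokens from the 'usage' field (0 if missing)."""
    usage = message.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


# Shortened copy of the response shown above; the text is abbreviated here
sample = {
    "role": "assistant",
    "usage": {"input_tokens": 16, "output_tokens": 874},
    "content": [{"type": "text", "text": "Here is a draft blog post..."}],
}
print(extract_text(sample))
print("Tokens used:", total_tokens(sample))
```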

## A few options for using the API<a name="options"></a>
- max_tokens
- temperature
- Multiple conversational turns
- Using the conversation history in multiple API calls (keeping the context of the conversation active)

#### max_tokens
The `max_tokens` parameter sets an upper limit on the number of tokens (words or word pieces) that the model may generate in its response. It is a hard cap, not a target: if the model reaches the limit, the response is cut off wherever it happens to be, often mid-sentence, and the `stop_reason` field is set to `max_tokens`. Because longer responses cost more, a sensible limit helps control both response size and cost. Choose a value large enough for the level of detail your application needs.

In [8]:
# Max tokens
data = {
    'model': 'claude-3-sonnet-20240229',
    'max_tokens': 100, # short, interrupted response
    'messages': [
        {"role": "user", "content": "Can you explain Artificial Neural Networks plain English?"},
    ]
}
# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200: # All went well
    print('The API call was successful.')
    # Store the response content into a variable
    generated_text = response.json()['content'][0]['text']
else:
    print(f"Error: {response.status_code} - {response.text}")

The API call was successful.


In [9]:
print(generated_text)
print('\n\nWhy the response ended:', response.json()['stop_reason'])
print('Token usage:', response.json()['usage'])

Sure, I'll do my best to explain artificial neural networks in plain English.

An artificial neural network is a computing system that is designed to mimic the way the human brain processes information. Just like the brain is made up of interconnected neurons, an artificial neural network is made up of interconnected nodes or "neurons" that transmit signals between each other.

The neural network is first "trained" on a large set of data. During this training process, the connections between the nodes


Why the response ended: max_tokens
Token usage: {'input_tokens': 17, 'output_tokens': 100}
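
Because a `stop_reason` of `max_tokens` means the answer was cut off, a program can check for it before using the text. This is a minimal sketch: `was_truncated` is a hypothetical helper, and the sample dictionaries contain only the fields printed above.

```python
def was_truncated(message):
    """Return True when the response stopped because it hit the max_tokens cap."""
    return message.get("stop_reason") == "max_tokens"


# Fields copied from the output above
cut_off = {"stop_reason": "max_tokens", "usage": {"input_tokens": 17, "output_tokens": 100}}
# A response that finished on its own reports 'end_turn' instead
finished = {"stop_reason": "end_turn"}

for name, msg in [("cut_off", cut_off), ("finished", finished)]:
    print(name, "-> truncated?", was_truncated(msg))
```

When the check returns `True`, one option is to repeat the request with a larger `max_tokens` value.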


#### temperature
The `temperature` parameter controls the randomness of the generated text. It is a value between 0 and 1 that reshapes the model's probability distribution when choosing the next token in the output.

A lower temperature (closer to 0) makes generation more deterministic, because the model tends to choose the most probable next tokens. This is useful for analytical tasks or for text that must follow specific guidelines or patterns. Even at 0, results are not guaranteed to be identical from run to run.

Conversely, a higher temperature (closer to 1) introduces more randomness into the text generation process. The model explores a wider range of possible next tokens, leading to more diverse and unexpected outputs. This can be beneficial for tasks like creative writing, brainstorming, or open-ended exploration.

By adjusting `temperature`, you can trade consistency for variety to suit your application.
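
To see why temperature changes randomness, it helps to apply it to a toy distribution. The sketch below uses the common temperature-scaled softmax. This is a general illustration and an assumption about the mechanism, not Anthropic's actual implementation; the scores and the function name are made up.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy: all probability on the highest score
        best = scores.index(max(scores))
        return [1.0 if i == best else 0.0 for i in range(len(scores))]
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens
scores = [2.0, 1.0, 0.5]
for t in (0, 0.5, 1.0):
    probs = softmax_with_temperature(scores, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

As the temperature drops, more of the probability concentrates on the top-scoring token, which is why low-temperature output varies less.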

In [21]:
# Temperature (low temperature, less random, more deterministic or conservative)
data = {
    'model': 'claude-3-sonnet-20240229',
    'max_tokens': 200,
    # Low temperature (more deterministic, less random)
    'temperature': 0,
    'messages': [
        {"role": "user", "content": "Please tell me a bedtime story."},
    ]
}
# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200: # All went well
    print('The API call was successful.')
    # Store the response content into a variable    
    generated_text = response.json()['content'][0]['text']
else:
    print(f"Error: {response.status_code} - {response.text}")

The API call was successful.


In [22]:
print(generated_text)

Here is a calming bedtime story for you:

The Sleepy Forest

Deep in the heart of an ancient forest, there was a small clearing surrounded by towering oak and pine trees. Moonlight filtered through the branches, casting a gentle glow over the mossy ground. 

A babbling brook ran along one side of the clearing, its soothing sounds like a lullaby. Fireflies danced lazily above the water, their lights blinking on and off in a twinkling rhythm.

In the center of the clearing stood a great willow tree, its long branches swaying ever so slightly in the night breeze. Underneath this willow, a bed of soft ferns and wildflowers made the perfect spot for sleepy forest creatures to rest their heads.

As the night wore on, the animals of the forest began making their way to the willow's sheltering branches. A family of rabbits hopped


In [23]:
# Temperature (High temperature, more random, less deterministic or less conservative)
data = {
    'model': 'claude-3-sonnet-20240229',
    'max_tokens': 100,
    # High temperature (more random, more creative)
    'temperature': 1.0,
    'messages': [
        {"role": "user", "content": "Please tell me a bedtime story."},
    ]
}
# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200: # All went well
    print('The API call was successful.')
    # Store the response content into a variable        
    generated_text = response.json()['content'][0]['text']
else:
    print(f"Error: {response.status_code} - {response.text}")

The API call was successful.


In [24]:
print(generated_text)

Here is a calming bedtime story for you:

The Sleepy Little Cloud

Once upon a time, there was a little cotton ball cloud who loved floating high up in the bright blue sky. His name was Cumulus, and he spent his days happily drifting wherever the warm breezes took him. 

One evening, as the sun began to set, Cumulus looked around and realized he was all alone up in the darkening sky. "Where


#### Multiple conversational turns
A single request can contain several conversational turns. The `messages` list holds alternating `user` and `assistant` entries, and the model reads the whole list as the conversation so far before writing its next reply. The API itself is stateless: it does not remember earlier requests, so any context you want the model to use must be included in the `messages` you send. Supplying earlier turns, even ones you wrote yourself as in the example below, lets you set up context and steer the reply.
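
Writing the role dictionaries by hand gets tedious for longer exchanges. As a small sketch, the hypothetical helper below turns a list of strings into alternating `user` and `assistant` messages, assuming the conversation starts with the user. The turn texts are made up for illustration.

```python
def make_messages(turns):
    """Build a messages list from strings that alternate user, assistant, user, ..."""
    roles = ("user", "assistant")
    return [{"role": roles[i % 2], "content": text} for i, text in enumerate(turns)]


messages = make_messages([
    "I have an AI question, are you ready?",
    "Yes, go ahead.",
    "Can you explain artificial neural networks in plain English?",
])
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list can be used directly as the value of `'messages'` in the request data.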

In [14]:
# Multiple conversational turns: https://docs.anthropic.com/claude/reference/messages-examples#multiple-conversational-turns
# Define the data
data = {
    'model': 'claude-3-sonnet-20240229',
    'max_tokens': 200,
    'messages': [
        {"role": "user", "content": "I have an AI question, are you ready?"},
        {"role": "assistant", "content": "Hi, I'm Claude an AI assistant, so I know a lot about it. Please, please ask me anything about AI."},
        {"role": "user", "content": "Can you explain Artificial Neural Networks plain English, including an analogy?"},
    ]
}
# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200: # All went well
    print('The API call was successful.')
    # Store the response content into a variable       
    generated_text = response.json()['content'][0]['text']
else:
    print(f"Error: {response.status_code} - {response.text}")

The API call was successful.


In [15]:
print(generated_text)

Sure, I can try to explain artificial neural networks in plain English using an analogy. 

An artificial neural network is a computing system that is very loosely modeled after the biological neural networks in the human brain. Just like the brain is made up of interconnected neurons that transmit signals, an artificial neural network has interconnected nodes (artificial neurons) that transmit information.

Here's an analogy that may help visualize it:

Imagine a company that has many offices in different locations. Each office is like a neuron or node in the network. The employees at each office are collectively doing some work and computing something based on the resources and input data they receive.

The different offices are connected by roads, which are like the connections or weights between neurons. Just as roads have varying traffic capacities, the connections between neurons have varying strengths that get adjusted during training.

When an office receives an input (like a sh

#### Using the conversation history in multiple API calls
To continue a conversation across several API calls, you keep the history yourself and send it back with each new request. Because the API does not store anything between calls, each request must contain the earlier user messages and assistant replies, followed by the new user message. The model can then build on the prior context, which is useful for ongoing dialogue such as follow-up questions, multi-step tasks, or collaborative brainstorming.
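
One way to organize this is to keep the whole exchange in a single list of role/content dictionaries and resend it on every call. The class below is a sketch, not part of any Anthropic library: `Conversation` and its `send` argument are hypothetical names, and `send` stands in for whatever function actually posts the messages (for example, a wrapper around `requests.post`) and returns the reply text. A fake `send` is used here so the example runs without spending any credits.

```python
class Conversation:
    """Keep the full message history and resend it with every request."""

    def __init__(self, send):
        self.send = send      # Callable: list of messages -> assistant reply text
        self.messages = []    # Grows by two entries (user + assistant) per turn

    def ask(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the send function cannot alter the stored history
        reply = self.send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages):
    """Stand-in for a real API call: reports how many messages it received."""
    return f"(reply after seeing {len(messages)} message(s))"


chat = Conversation(fake_send)
print(chat.ask("Please explain the difference between AI and ML."))
print(chat.ask("Please reduce this explanation to a single paragraph."))
print("Messages stored:", len(chat.messages))
```

Storing both roles in one list avoids having to rebuild the user messages by hand on each follow-up call.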

In [16]:
# Create a list to keep the API text responses
message_lst = [] #empty list
# Define the data
data = {
    'model': 'claude-3-sonnet-20240229',
    'max_tokens': 200,
    'messages': [
        {"role": "user", "content": "Please explain the difference between Artifical Intelligence and Machine Learning."},
    ]
}
# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200: # All went well
    print('The API call was successful.')
    # Store the response content into a variable
    generated_text = response.json()['content'][0]['text']
    # Keep a history: append the text to the list (only when the call succeeded)
    message_lst.append(generated_text)
else:
    print(f"Error: {response.status_code} - {response.text}")

The API call was successful.


In [17]:
# Look at our current list of responses
for i,r in enumerate(message_lst):
    print('Entry:', i+1, '\n','Text:', r,'\n\n')

Entry: 1 
 Text: Artificial Intelligence (AI) and Machine Learning (ML) are closely related but distinct concepts. Here's an explanation of the difference between the two:

Artificial Intelligence (AI):
AI is a broad field that encompasses the development of intelligent machines capable of performing tasks that would typically require human intelligence. It involves the study of how to make computers and machines mimic human cognitive functions such as learning, problem-solving, reasoning, perception, and language understanding. AI aims to create systems that can perceive their environment, process data, make decisions, and take actions to achieve specific goals.

Machine Learning (ML):
Machine Learning is a subset of AI and is focused on the development of algorithms and statistical models that enable systems to learn from data and make predictions or decisions without being explicitly programmed. It involves feeding large amounts of data to algorithms, which then learn patterns, rela

In [18]:
# Pass the history back into the assistant with a follow-up instruction.
data = {
    'model': 'claude-3-sonnet-20240229',
    'max_tokens': 200,
    'messages': [
        {"role": "user", "content": "Please explain the difference between Artifical Intelligence and Machine Learning."},
        # Pass in the history
        {"role": "assistant", "content": " ".join(message_lst)}, # Combine all messages into a single string
        # Give further instructions
        {"role": "user", "content": "Please reduce this explaination to a single paragraph."},
    ]
}
# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))
# Handle the response
if response.status_code == 200:
    print('The API call was successful.')
    # Store the response content into a variable
    generated_text = response.json()['content'][0]['text']
    # Append to the end of the list (only when the call succeeded)
    message_lst.append(generated_text)
else:
    print(f"Error: {response.status_code} - {response.text}")

The API call was successful.


In [19]:
# Look at our current list of responses
for i,r in enumerate(message_lst):
    print('Entry:', i+1, '\n','Text:', r,'\n\n')

Entry: 1 
 Text: Artificial Intelligence (AI) and Machine Learning (ML) are closely related but distinct concepts. Here's an explanation of the difference between the two:

Artificial Intelligence (AI):
AI is a broad field that encompasses the development of intelligent machines capable of performing tasks that would typically require human intelligence. It involves the study of how to make computers and machines mimic human cognitive functions such as learning, problem-solving, reasoning, perception, and language understanding. AI aims to create systems that can perceive their environment, process data, make decisions, and take actions to achieve specific goals.

Machine Learning (ML):
Machine Learning is a subset of AI and is focused on the development of algorithms and statistical models that enable systems to learn from data and make predictions or decisions without being explicitly programmed. It involves feeding large amounts of data to algorithms, which then learn patterns, rela

## Your assignment<a name="assignment"></a>
Using the examples above, create 3 unique API calls (different from my examples) using a combination of parameters and techniques. Print your responses just as I have done above.

In [20]:
# Your code here.