# Accessing an API

We will use the https://replicate.com services through an API. Since the AI models run on Replicate's servers rather than in Colab, we don't need a GPU runtime.

⚡ You have to log in using your GitHub account. The service is free up to a (unknown) limit.

To use an API you have to authenticate yourself, which is mainly done using **tokens**. A token is a string of characters attached to your account in the service, and you have to keep it secret. ⚡ So we need to create a Replicate token here: https://replicate.com/account/api-tokens

⚡ Then you can check the documentation about accessing the API here: https://replicate.com/docs/get-started/python


⚡ Then search for the service that you want to use here: https://replicate.com/explore. For this example we are going to use the [Llama language model](https://replicate.com/meta/meta-llama-3-8b-instruct). We can check how to call it in the API tab, selecting Python as the programming language ([link](https://replicate.com/meta/meta-llama-3-8b-instruct/api?tab=python)).

## Steps

First we install the library for accessing the API.

In [None]:
!pip install -q replicate

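We enter the API token we generated before. The shell command below doesn't work in Colab, because each `!` command runs in its own subshell and the exported variable does not persist, so we pass the token to the client directly instead.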

```
export REPLICATE_API_TOKEN=<paste-your-token-here>
```

In [None]:
import replicate

# Keep the token secret: remove it before sharing this notebook.
api_token = '<paste-your-token-here>'

client = replicate.Client(api_token=api_token)
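
Pasting the token into a cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming the standard-library `getpass` prompt works in your notebook, the token could instead be typed in at run time. The helper name `load_token` and its parameters are hypothetical; `REPLICATE_API_TOKEN` is the variable name from the shell command above.

```python
import os
from getpass import getpass


def load_token(env_var="REPLICATE_API_TOKEN", prompt_fn=getpass):
    """Return the API token, asking for it only if it is not already set.

    `load_token` is an illustrative helper, not part of the replicate library.
    """
    token = os.environ.get(env_var)
    if not token:
        # getpass hides the typed characters, so the token is not shown
        # in the cell output or stored in the notebook source.
        token = prompt_fn(f"Enter {env_var}: ").strip()
        # Store it in this Python process so later cells can reuse it.
        os.environ[env_var] = token
    return token


# Possible usage in the notebook (uncomment to run interactively):
# import replicate
# client = replicate.Client(api_token=load_token())
```

Setting the variable through `os.environ` affects the running Python process itself, which is why it persists across cells where the shell `export` does not.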

We make a request to the API using the library.

In [None]:
input = {
    "system_prompt": "You are a Machine Learning Tutor AI, dedicated to guiding master students in their journey to become proficient machine learning engineers. Provide comprehensive information on machine learning concepts, techniques, and best practices. Offer step-by-step guidance on implementing machine learning algorithms, selecting appropriate tools and frameworks, and building end-to-end machine learning projects. Tailor your instructions and resources to the individual needs and goals of the user, ensuring a smooth transition into the field of machine learning.",
    "prompt": "Explain how to access machine learning APIs",
    "max_new_tokens": 512,
    "prompt_template": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
}

output = client.run(
    "meta/meta-llama-3-8b-instruct",
    input=input
)
# The meta/meta-llama-3-8b-instruct model can stream output as it's running.
# The result of client.run can be iterated over, one chunk of text at a time.
for item in output:
    # https://replicate.com/meta/meta-llama-3-8b-instruct/api/schema#output-schema
    print(item, end="")
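
To see what the `prompt_template` field does, the sketch below fills the same template locally. It assumes the placeholders `{system_prompt}` and `{prompt}` are substituted in the same way as Python's `str.format`; the real substitution happens on the service side and may differ in detail. The helper names `render_prompt` and `collect_text` are hypothetical.

```python
# Same template as in the request above, split across lines for readability.
PROMPT_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)


def render_prompt(system_prompt, prompt, template=PROMPT_TEMPLATE):
    """Fill the template locally (illustration only, not an API call)."""
    return template.format(system_prompt=system_prompt, prompt=prompt)


def collect_text(chunks):
    """Join streamed chunks into one string instead of printing them."""
    return "".join(str(chunk) for chunk in chunks)


rendered = render_prompt("You are a tutor.", "Explain APIs")
print(rendered)

# With a real response, this could be: text = collect_text(output)
print(collect_text(["Hel", "lo"]))
```

Collecting the chunks into a string is useful when the answer has to be saved or processed further rather than only displayed.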

# Finalizing

When you finish working, remember to **stop the runtime**: there is a time limit, and stopping it avoids wasting resources. To stop the runtime, click Manage sessions in the Runtime menu. Once the dialog opens, click Terminate on the current runtime.

> But when you stop the runtime everything you have not saved is ⚠ **lost** ⚠, so be sure to **download** everything you want to keep before stopping it.
