*To execute this notebook, choose the `llm-rag` kernel in the dropdown above. You may need to hit the "Select another kernel" button, and refresh the kernels list.*

# Getting off the Ground with LLMs

In this section, we walk through how to access cutting-edge LLMs from your Python code.
We cover the basics of commercial APIs and open-source models, and touch on their relative capabilities.

## Commercial APIs

If you don't have a plan, but want to build a language model-driven solution, commercial APIs are a good way to start.

This section shows you how to get started with the leading commercial APIs as of October 2023:
- [OpenAI](https://platform.openai.com/docs/api-reference)
- [Cohere](https://docs.cohere.com/reference/about)
- [Jurassic-2](https://docs.ai21.com/reference/python-sdk) from [AI21 Labs](https://www.ai21.com/)
- [Claude](https://github.com/anthropics/anthropic-sdk-python) from Anthropic

### Dependencies

In [None]:
### Notebook display
from IPython.display import display, Markdown

### Data processing
import pandas as pd

### Commercial LLM providers
import openai
import cohere
import ai21

### Open-source LLMs
from transformers import AutoModel, pipeline

### Set API Keys
 
Go to these links to find your tokens (after signing up):
- [OpenAI](https://platform.openai.com/account/api-keys)
- [Cohere](https://dashboard.cohere.com/api-keys)
- [AI21 Labs](https://studio.ai21.com/account/api-key)

In [None]:
openai_key = ...
cohere_key = ...
ai21_key = ...
# Note: Did not include Claude here, as SDK/API access is gated: https://docs.anthropic.com/claude/docs/getting-access-to-claude
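
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the keys have been exported as environment variables beforehand. The helper name `load_api_key` and the `demo_env` mapping are hypothetical and exist only for this illustration; `OPENAI_API_KEY` is the variable name the OpenAI SDK conventionally reads, while any names you pick for Cohere and AI21 are your own choice.

```python
import os


def load_api_key(env_var, environ=None):
    """Return the key stored in `env_var`, or raise if it is missing or blank."""
    # `environ` defaults to the real process environment;
    # passing a plain dict instead makes the helper easy to test.
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return key


# An in-memory mapping stands in for the environment here (placeholder value, not a real key).
demo_env = {"OPENAI_API_KEY": "sk-placeholder"}
print(load_api_key("OPENAI_API_KEY", demo_env))
```

In the notebook itself you could then write `openai_key = load_api_key("OPENAI_API_KEY")`, and do the same for the other providers with whichever variable names you exported.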

### The common prompt

In [None]:
PROMPT = "How is generative AI affecting the infrastructure machine learning developers need access to?"

### Hello OpenAI API

In [None]:
from openai import OpenAI
client = OpenAI(api_key=openai_key)
gpt35_completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": PROMPT}]
)
# The response is a typed object, so the text can be read with attribute access.
gpt35_text_response = gpt35_completion.choices[0].message.content.strip()

In [None]:
gpt35_text_response
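
The `messages` argument used above is a list of role-tagged dictionaries, which is also how you steer the model with a system instruction. A minimal sketch, assuming you want an optional system message ahead of the user prompt; the helper name `build_messages` is hypothetical and not part of the OpenAI SDK.

```python
def build_messages(prompt, system_prompt=None):
    """Build the `messages` list for a chat completion request."""
    messages = []
    # An optional system message sets overall behavior before the user speaks.
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages


# Illustrative prompt and system instruction, chosen only for this example.
example = build_messages("What is MLOps?", system_prompt="Answer in two sentences.")
print(example)
```

The result can be passed as `messages=build_messages(PROMPT, "Answer in two sentences.")` in the `client.chat.completions.create` call shown above.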

### Hello Cohere API

In [None]:
co = cohere.Client(cohere_key)
cohere_cmd_completion = co.generate(prompt=PROMPT, model="command")
cohere_cmd_response = cohere_cmd_completion.generations[0].text.strip()

In [None]:
print(cohere_cmd_response)

### Hello AI21 Labs API

In [None]:
from ai21 import AI21Client
# Use a distinct name so this does not overwrite the OpenAI `client` defined earlier.
ai21_client = AI21Client(api_key=ai21_key)
response_mid = ai21_client.completion.create(
  model="j2-mid",
  prompt=PROMPT,
  num_results=1,
  max_tokens=100,
  temperature=0.4,
  top_k_return=0,
  top_p=1,
  stop_sequences=["##"]
)
jurassic2_response = response_mid.completions[0].data.text

In [None]:
from IPython.display import Markdown

Markdown(f"""

**OpenAI GPT3.5**: {gpt35_text_response}

{"="*160}

**Cohere Command**: {cohere_cmd_response}

{"="*160}

**AI21 Jurassic2**: {jurassic2_response}
""")

## Endpoint support across APIs

The table below gives a **rough** picture of the endpoints these APIs offered as of October 20, 2023, each reachable with little more effort than the calls you just saw.

> Note: You can make each of these models do almost anything, making this all muddy. <br/>The point of this table is to highlight which of these APIs have documented endpoints for certain tasks.

<center>

| Endpoint / API | OpenAI | Cohere | Claude | AI21 |
| :---: | :---: | :---: | :---: | :---: |
| Prompt-to-response | ✅ | ✅ | ✅ | ✅ |
| Chat-to-response | ✅ | ✅ | ✅ | ✅ |
| Text embeddings | ✅ | ✅ | ❌ | ✅ |
| Fine-tuning | ✅ | ✅ | ❌ | ✅ |
| Language detection | ❌ | ✅ | ❌ | ❌ |
| Raw document processing | ✅ | ✅ | ❌ | ✅ | 
| Rerank / document relevance | ❌ | ✅ | ❌ | ✅ |
| Text/image to image | ✅ | ❌ | ❌ | ❌ |
| Audio-to-text | ✅ | ❌ | ❌ | ❌ |
| Moderations / toxicity | ✅ | ✅ | ❌ | ❌ |

</center>
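
If you want to query the table programmatically, it can be transcribed into a small lookup. This is a sketch that copies the rows above as they stood on that date; the names `ENDPOINT_SUPPORT` and `providers_supporting` are hypothetical, and the data goes stale as providers add endpoints.

```python
# Transcription of the table above: endpoint -> providers with a documented endpoint.
ENDPOINT_SUPPORT = {
    "Prompt-to-response": {"OpenAI", "Cohere", "Claude", "AI21"},
    "Chat-to-response": {"OpenAI", "Cohere", "Claude", "AI21"},
    "Text embeddings": {"OpenAI", "Cohere", "AI21"},
    "Fine-tuning": {"OpenAI", "Cohere", "AI21"},
    "Language detection": {"Cohere"},
    "Raw document processing": {"OpenAI", "Cohere", "AI21"},
    "Rerank / document relevance": {"Cohere", "AI21"},
    "Text/image to image": {"OpenAI"},
    "Audio-to-text": {"OpenAI"},
    "Moderations / toxicity": {"OpenAI", "Cohere"},
}


def providers_supporting(*endpoints):
    """Return the sorted providers that document every one of the given endpoints."""
    if not endpoints:
        return []
    supported = [ENDPOINT_SUPPORT[name] for name in endpoints]
    return sorted(set.intersection(*supported))


# Which providers document both embeddings and reranking?
print(providers_supporting("Text embeddings", "Rerank / document relevance"))
```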

Some opinions related to this table:
- If you want to pay for the **best chat model** --> 
    - OpenAI's GPT-4 API is the current gold standard.
- If you want **multimodal** --> 
    - OpenAI APIs are great - [Images](https://platform.openai.com/docs/api-reference/images), [Audio](https://platform.openai.com/docs/api-reference/audio)
    - Stability AI has some nice products not listed here.
    - Check out this recent survey paper: [The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)](https://arxiv.org/abs/2309.17421)
- If you care a lot about **content moderation** --> 
    - [Cohere](https://docs.cohere.com/docs/content-moderation-with-classify) and [OpenAI](https://platform.openai.com/docs/api-reference/moderations) have the most API support
- If you want fine-grained **multi-lingual models** --> 
    - Try Cohere's [Multilingual Embedding](https://docs.cohere.com/docs/multilingual-language-models) APIs
- If you want a model that can **analyze grammar** --> 
    - Try AI21's [Text Improvements](https://docs.ai21.com/reference/text-improvements-api-ref) and [Grammatical Error Corrections](https://docs.ai21.com/reference/gec-api-ref) APIs
- The public Claude product is a personal favorite; however, its API access and feature support lag behind the others in this list

Of course, you can also try meshing them together if you have the budget and engineering will!

# Getting Started with Open-source Models

## Discussion
- Why would you want to use open-source LLMs?
- Will they ever really be competitive? 
    - What drives the competition if OpenAI's models are 10x bigger and performance keeps scaling with model size?

## Load Model

In this section we will see how to load a pre-trained model from the HuggingFace Hub. 
You can shop for models [here](https://huggingface.co/models).

Afterward, you'll see how to use these models for text classification and text generation, which is similar to the core mechanism the commercial APIs above use to generate text.

In [None]:
# T5 paper: https://arxiv.org/abs/1910.10683
# Instruction-tuned follow-up (FLAN-T5): https://arxiv.org/pdf/2210.11416.pdf
model_name = "t5-small"
model = AutoModel.from_pretrained(model_name, device_map="auto")

In [None]:
print(model)
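
Printing the model shows its layer structure; a quick complement is to count its parameters. The sketch below assumes `model` is a PyTorch module, so that `model.parameters()` yields tensors exposing `numel()` and `requires_grad`. The helper name `count_parameters` is hypothetical, and the tiny stand-in classes only exist so the example runs without downloading a model.

```python
def count_parameters(model):
    """Return (total, trainable) parameter counts for a PyTorch-style model."""
    total = 0
    trainable = 0
    for param in model.parameters():
        n = param.numel()  # number of scalar values in this tensor
        total += n
        if param.requires_grad:
            trainable += n
    return total, trainable


# Stand-ins that mimic the small part of the PyTorch interface used above.
class FakeParam:
    def __init__(self, size, requires_grad=True):
        self._size = size
        self.requires_grad = requires_grad

    def numel(self):
        return self._size


class FakeModel:
    def parameters(self):
        return iter([FakeParam(10), FakeParam(5, requires_grad=False)])


print(count_parameters(FakeModel()))
```

With the real model loaded above, `count_parameters(model)` would report its size in the same way.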

Want to learn more about transformers like the BERT and GPT family and how they work? 
- Sebastian Raschka recently gave his description of the history of the transformer in this concise [post](https://www.linkedin.com/posts/sebastianraschka_llms-largelanguagemodels-ai-activity-7121484400701186048--47t?utm_source=share&utm_medium=member_desktop).
- Check out the amazing [BertViz](https://github.com/jessevig/bertviz) tool by [jessevig](https://github.com/jessevig/). You can see a pre-loaded demo [here](https://colab.research.google.com/drive/1hXIQ77A4TYS4y3UthWF-Ci7V7vVUoxmQ?usp=sharing#scrollTo=twSVFOM9SopW).

## HuggingFace Pipeline API

In the previous section, we saw how to load a model. In this section, we look at the easiest way to run inference with HuggingFace models, much like the earlier examples that used commercial APIs.

You will see how the [HuggingFace Pipeline API](https://huggingface.co/docs/transformers/v4.34.0/en/main_classes/pipelines) performs tasks including:
* [Text Classification](#text-classification)
* [Text Generation](#text-generation)
* many more tasks [here](https://huggingface.co/tasks)

## Text Generation
https://huggingface.co/tasks/text-generation

In [None]:
model_name = "bigscience/bloom-560m" # https://huggingface.co/bigscience/bloom-560m
generator = pipeline("text-generation", model=model_name, device_map="auto")

prompt = "To learn MLOps in 2024, start by" 
response = generator(prompt, do_sample=False, max_new_tokens=25)
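
Setting `do_sample=False` with the default single beam asks the pipeline for greedy decoding: at each step the most likely next token is appended, and the process repeats until `max_new_tokens` is reached or an end-of-sequence token appears. The toy sketch below illustrates only that loop. The score table `TOY_NEXT_TOKEN_SCORES` and the function `greedy_generate` are hypothetical stand-ins; a real model computes next-token scores with a neural network over the whole context, not a lookup on the last token.

```python
# Hypothetical next-token scores, keyed by the most recent token.
TOY_NEXT_TOKEN_SCORES = {
    "start": {"by": 0.9, "with": 0.1},
    "by": {"learning": 0.6, "reading": 0.4},
    "learning": {"Python": 0.7, "Docker": 0.3},
    "Python": {"<eos>": 1.0},
}


def greedy_generate(tokens, max_new_tokens=25):
    """Repeatedly append the highest-scoring next token (greedy decoding)."""
    if not tokens:
        raise ValueError("At least one starting token is required.")
    tokens = list(tokens)  # copy so the caller's list is not modified
    for _ in range(max_new_tokens):
        candidates = TOY_NEXT_TOKEN_SCORES.get(tokens[-1])
        if not candidates:
            break  # no known continuation for this token
        next_token = max(candidates, key=candidates.get)
        if next_token == "<eos>":
            break  # end-of-sequence marker stops generation
        tokens.append(next_token)
    return tokens


print(" ".join(greedy_generate(["start"])))
```

Because the highest-scoring token is always chosen, the same prompt always produces the same continuation, which is why greedy decoding is convenient for reproducible demos.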

In [None]:
Markdown(f"""
**Prompt**: {prompt}

**{model_name}'s continuation**: {response[0]['generated_text']}...
""")

## Text Classification
https://huggingface.co/tasks/text-classification

In [None]:
# More text classification models: https://huggingface.co/models?pipeline_tag=text-classification&sort=trending
model_name = "SamLowe/roberta-base-go_emotions" 

# Create a text classification pipeline using HuggingFace transformers pipeline.
classifier_pipe = pipeline("text-classification", model=model_name)

# Sample sentences whose emotion we want to classify.
sentences = [
    "I am feeling inspired today. What a time to be alive!",
    "This talk is informative, but a bit high-level, where I can find more details?",
    "I wonder about all the hype around Generative AI, is it smoke and mirrors?",
    "Building production-grade machine learning systems is challenging."
]

# Run the pipeline!
classifier_pipe(sentences)
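
To read the results alongside the inputs, you can pair each sentence with its prediction. This sketch assumes the pipeline returns one `{"label": ..., "score": ...}` dictionary per input sentence, which is its usual shape when only the top label is requested. The helper `pair_predictions` is hypothetical, and the sample predictions are made-up values for illustration, not real model output.

```python
def pair_predictions(sentences, predictions):
    """Pair each input sentence with its predicted label and rounded score."""
    if len(sentences) != len(predictions):
        raise ValueError("Expected exactly one prediction per sentence.")
    return [
        {"sentence": s, "label": p["label"], "score": round(p["score"], 3)}
        for s, p in zip(sentences, predictions)
    ]


# Made-up inputs and predictions, used only to show the data shape.
sample_sentences = ["What a time to be alive!", "Is it smoke and mirrors?"]
sample_predictions = [
    {"label": "excitement", "score": 0.81234},
    {"label": "curiosity", "score": 0.64321},
]

for row in pair_predictions(sample_sentences, sample_predictions):
    print(f"{row['label']:<12} {row['score']:.3f}  {row['sentence']}")
```

In the notebook, the call would be `pair_predictions(sentences, classifier_pipe(sentences))`.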

# Summary

In this lesson, you've learned:
- how to programmatically query the leading commercial generative AI APIs
- which endpoints are supported by the leading generative AI APIs
- how to get started with replicating the core modeling loops of generative AI using open-source models

In the next lessons, we will discuss methods for increasing the relevance of LLM responses: basic prompt engineering, retrieval-augmented generation (RAG), and changing the model itself through fine-tuning and serving it behind an API you control.