<img width="8%" alt="Hugging Face" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/Hugging Face.png" style="border-radius: 15%">

# Hugging Face - Few Shot Learning with Inference API

**Tags:** #huggingface #ml #few_shot_learning #prompt #inference_api #ai #text

**Author:** [Saurabh Arjun Sawant](https://www.linkedin.com/in/srsawant34/)

**Last update:** 2023-11-08 (Created: 2023-11-08)

**Description:** This notebook demonstrates how to use the <a href="https://huggingface.co/docs/inference-endpoints/index">inference endpoints</a> of Hugging Face models. It also shows how to apply few-shot learning to a specific task with a model.

## Input

### Install Packages

In [4]:
!pip install -q datasets


### Import Libraries


In [5]:
from datasets import load_dataset
import numpy as np
import requests
import json

### Add the Model and API token

We will use the <a href="https://huggingface.co/EleutherAI/gpt-neo-1.3B">gpt-neo-1.3B</a> model for our demonstration.

In [26]:
MODEL = "EleutherAI/gpt-neo-1.3B"
API_TOKEN = "<INSERT_API_TOKEN>"
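
As an alternative to pasting the token into the notebook, it can be read from an environment variable. The sketch below is a minimal example under stated assumptions: the helper `load_api_token` and the variable name `HF_API_TOKEN` are hypothetical and not part of the original notebook.

```python
import os

def load_api_token(env_var="HF_API_TOKEN", fallback="<INSERT_API_TOKEN>"):
    """Return the token stored in `env_var`, or `fallback` if it is unset or empty.

    Both the helper and the variable name are illustrative assumptions.
    """
    token = os.environ.get(env_var, "").strip()
    return token or fallback

# Example: API_TOKEN = load_api_token()
```

This keeps the token out of the saved notebook, which helps when the notebook is shared or committed to a repository.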

## Model

### Define function to make API calls to Hugging Face endpoints

In [7]:
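def query(payload='', model='EleutherAI/gpt-neo-1.3B', parameters=None, options=None):
    # Build the defaults inside the function to avoid shared mutable default arguments
    if parameters is None:
        parameters = {'max_new_tokens': 5, 'temperature': 0.5}
    if options is None:
        options = {'use_cache': False}

    API_URL = f"https://api-inference.huggingface.co/models/{model}"
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    body = {"inputs": payload, "parameters": parameters, "options": options}

    try:
        response = requests.post(API_URL, headers=headers, data=json.dumps(body))
        result = response.json()
    except (requests.RequestException, ValueError) as exc:
        # Network failure, or a response body that is not valid JSON
        return f"Error: {exc}"

    try:
        # On success the response is a list with one dict per generated sequence
        return result[0]['generated_text']
    except (KeyError, IndexError, TypeError):
        # On failure the response is a dict; 'error' may be a string or a list of strings
        error = result.get('error', result) if isinstance(result, dict) else result
        if isinstance(error, list):
            error = " ".join(map(str, error))
        return f"Error: {error}"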

## Output

The model usually takes time to load on the Hugging Face server. For example, gpt-neo-1.3B takes approximately 212 seconds to load.
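
Because of this loading delay, the first request may come back with an error instead of generated text. The sketch below shows one way to wait and retry. It assumes that a loading model answers with a JSON object containing an `estimated_time` field (in seconds); that response shape is an assumption here, and `call_with_loading_retry` and `send_request` are hypothetical names that do not appear in the notebook.

```python
import time

def call_with_loading_retry(send_request, max_wait_seconds=300, sleep=time.sleep):
    """Call `send_request()` until the reply no longer looks like a 'model loading' notice.

    `send_request` is a hypothetical zero-argument callable that returns the
    decoded JSON body of one API response.
    """
    waited = 0.0
    while True:
        result = send_request()
        # Assumption: a loading model replies with a dict that has "estimated_time"
        if not (isinstance(result, dict) and "estimated_time" in result):
            return result
        if waited >= max_wait_seconds:
            raise TimeoutError(f"Model still loading after {waited:.0f} seconds")
        # Wait for the estimated time, capped by the remaining budget, at least 1 second
        delay = max(min(float(result["estimated_time"]), max_wait_seconds - waited), 1.0)
        sleep(delay)
        waited += delay
```

With the notebook's setup, `send_request` could be a small lambda that posts the request body with `requests.post` and returns `response.json()`.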

### Zero-shot

In [9]:
prompt = """
Sentence: I loved todays movie.
Sentiment: """

response = query(payload=prompt, model=MODEL)
print(response)


Sentence: I loved todays movie.
Sentiment: 

A:



### One-shot

In [10]:
prompt = """
Sentence: I loved todays movie.
Sentiment: positive

#####

Sentence: I didn't like the action.
Sentiment: """

response = query(payload=prompt, model=MODEL)
print(response)


Sentence: I loved todays movie.
Sentiment: positive

#####

Sentence: I didn't like the action.
Sentiment:  negative

#####


### Two-shot

In [14]:
prompt = """
Sentence: I loved todays movie.
Sentiment: positive

#####

Sentence: I didn't like the action.
Sentiment: negative

#####

Sentence: Liked the direction and scene settings.
Sentiment: """

response = query(payload=prompt, model=MODEL)
print(response)


Sentence: I loved todays movie.
Sentiment: positive

#####

Sentence: I didn't like the action.
Sentiment: negative

#####

Sentence: Liked the direction and scene settings.
Sentiment:  positive

#####
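
In the outputs above, the returned text repeats the prompt and then continues with the predicted label followed by the `#####` delimiter. A minimal sketch for keeping only the predicted label is shown below; `extract_completion` is a hypothetical helper, and it assumes the generated text starts with the prompt exactly as sent.

```python
def extract_completion(prompt, generated_text, delimiter="#####"):
    """Return the text generated after the prompt, cut at the next delimiter."""
    # Drop the echoed prompt when the response starts with it
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    # Keep only what comes before the next example delimiter
    completion = completion.split(delimiter, 1)[0]
    return completion.strip()

prompt = "\nSentence: I didn't like the action.\nSentiment: "
generated = prompt + " negative\n\n#####\n"
print(extract_completion(prompt, generated))  # negative
```

For the zero-shot output shown earlier, the same helper would return `A:`, which makes it easy to see that the model did not produce a sentiment label without any examples.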


### Few-shot learning with custom dataset

You can also use a custom dataset to generate prompts like the ones above. For example, below we use <a href="https://huggingface.co/datasets/carblacac/twitter-sentiment-analysis">twitter-sentiment-analysis</a>. More Hugging Face datasets can be found <a href="https://huggingface.co/datasets">here</a>.
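
Before sampling from a dataset, it may help to see the prompt format assembled from explicit examples. The sketch below is a minimal illustration; `build_few_shot_prompt` is a hypothetical helper, not part of the notebook, and it takes a list of `(sentence, sentiment)` pairs plus the sentence to classify.

```python
def build_few_shot_prompt(examples, query_sentence, delimiter="#####"):
    """Join labeled examples and one unlabeled sentence into a single prompt."""
    # One block per labeled example, in the same layout as the prompts above
    blocks = [f"Sentence: {sentence}\nSentiment: {sentiment}\n" for sentence, sentiment in examples]
    # The final block leaves the label empty for the model to fill in
    blocks.append(f"Sentence: {query_sentence}\nSentiment: ")
    return f"\n{delimiter}\n\n".join(blocks)

examples = [("I loved todays movie.", "positive")]
print(build_few_shot_prompt(examples, "I didn't like the action."))
```

Passing an empty list of examples yields a zero-shot prompt, so the same helper covers the zero-, one-, and two-shot cases.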

In [15]:
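def generate_prompt_with_examples(data, target_col, num_of_examples = 0):
    # Sample distinct row indices: num_of_examples labeled rows plus one row to classify
    indices = np.random.choice(len(data), num_of_examples + 1, replace=False)
    prompts = []
    for index in indices:
        example = data[int(index)]
        review = example["text"]
        sentiment = "positive" if example[target_col] else "negative"
        prompt = f"Sentence: {review}\nSentiment: {sentiment}\n"
        prompts.append(prompt)
    # Drop the last label so the model has to predict it;
    # "positive\n" and "negative\n" are both 9 characters long
    return """\n#####\n\n""".join(prompts)[:-9]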

data = load_dataset('carblacac/twitter-sentiment-analysis')

In [22]:
prompt = generate_prompt_with_examples(data=data['train'], target_col="feeling", num_of_examples=2)
print(prompt)

Sentence: wow! I have so much homework for tomorrow!
Sentiment: negative

#####

Sentence: @thepete I know. I hate that/those shows. (Actually there's one I do get addicted to - X-Factor) But I hate it too!
Sentiment: positive

#####

Sentence: @cakesandbakes Ohh nooo!  We're in America! Lol spoilt little brat aren't I?
Sentiment: 


In [25]:
response = query(payload=prompt, model=MODEL)
print(response)

Sentence: wow! I have so much homework for tomorrow!
Sentiment: negative

#####

Sentence: @thepete I know. I hate that/those shows. (Actually there's one I do get addicted to - X-Factor) But I hate it too!
Sentiment: positive

#####

Sentence: @cakesandbakes Ohh nooo!  We're in America! Lol spoilt little brat aren't I?
Sentiment:  positive

#####
