# Summary

`Try getting GPT Neo predictions using the Huggingface API.`

In [1]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [94]:
import json
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
from pathlib import Path
import requests

from jabberwocky.config import C
from jabberwocky.openai_utils import load_prompt, load_openai_api_key, \
    print_response
from jabberwocky.utils import load_huggingface_api_key
from htools import *

In [4]:
cd_root()

Current directory: /Users/hmamin/jabberwocky


In [24]:
HF_API_KEY = load_huggingface_api_key()
HEADERS = {'Authorization': f'Bearer api_{HF_API_KEY}'}
URL_FMT = 'https://api-inference.huggingface.co/models/{}'
# These models accept different parameters. For now just start with the
# basics, but keep the others around in case I want to use them later.
_task2suff = {'generate': 'EleutherAI/gpt-neo-2.7B',
              'summarize': 'facebook/bart-large-cnn',
              'chat': 'microsoft/DialoGPT-large',
              'q&a': 'deepset/roberta-base-squad2'}
TASK2URL = DotDict({k: URL_FMT.format(v) for k, v in _task2suff.items()})
NEO_URL = URL_FMT.format(_task2suff['generate'])

In [214]:
@valuecheck
def query_gpt_neo(prompt, top_k=None, top_p=None, temperature=1.0,
                  repetition_penalty=None, max_tokens=250, api_key=None,
                  size:('125M', '1.3B', '2.7B')='2.7B',
                  **kwargs):
    """Query GPT-Neo through the Huggingface Inference API.

    Returns only the generated text (the prompt is excluded). If `stop` is
    passed in kwargs, the text is truncated at the earliest stop sequence
    found. Other kwargs are ignored.
    """
    # Docs say we can return up to 256 tokens but the API sometimes throws
    # errors if we go above 250.
    headers = {'Authorization':
               f'Bearer api_{api_key or load_huggingface_api_key()}'}
    # Argument names don't always match the API's parameter names (e.g.
    # max_tokens vs. max_new_tokens) because I wanted them to be consistent
    # with the query_gpt3() function. Types matter too: if Huggingface expects
    # a float but gets an int, we'll get an error.
    if repetition_penalty is not None:
        repetition_penalty = float(repetition_penalty)
    data = {'inputs': prompt,
            'parameters': {'top_k': top_k, 'top_p': top_p,
                           'temperature': float(temperature),
                           'max_new_tokens': min(max_tokens, 250),
                           'repetition_penalty': repetition_penalty,
                           'return_full_text': False}}
    url = 'https://api-inference.huggingface.co/models/EleutherAI/gpt-neo-{}'
    r = requests.post(url.format(size), headers=headers,
                      data=json.dumps(data))
    r.raise_for_status()
    res = r.json()[0]['generated_text']
    if 'stop' in kwargs:
        # Truncate at whichever stop sequence appears first in the text.
        hits = [i for i in map(res.find, tolist(kwargs['stop'])) if i >= 0]
        stop_idx = min(hits) if hits else None
        res = res[:stop_idx]
    return res
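
The comments in `query_gpt_neo` note that the API is picky about types and that `max_tokens` is capped at 250. A minimal sketch of that payload construction in isolation is below. `build_payload` is a hypothetical helper, not part of jabberwocky, and dropping unset options (rather than sending them as `null`, as the function above does) rests on the assumption that the API falls back to its defaults for missing keys.

```python
import json


def build_payload(prompt, max_tokens=250, temperature=1.0, top_k=None,
                  top_p=None, repetition_penalty=None):
    """Build the JSON request body for a text generation call.

    Hypothetical helper for illustration only.
    """
    params = {
        'top_k': None if top_k is None else int(top_k),
        'top_p': None if top_p is None else float(top_p),
        # Coerce to float so an int like 1 is serialized as 1.0.
        'temperature': float(temperature),
        'repetition_penalty': (None if repetition_penalty is None
                               else float(repetition_penalty)),
        # Cap at 250, mirroring the limit used in query_gpt_neo.
        'max_new_tokens': min(int(max_tokens), 250),
        'return_full_text': False,
    }
    # Assumption: omitting a key lets the API use its default value.
    params = {k: v for k, v in params.items() if v is not None}
    return json.dumps({'inputs': prompt, 'parameters': params})


payload = build_payload('I love to', max_tokens=300, temperature=1,
                        repetition_penalty=2)
print(payload)
```

Keeping the coercion in one place makes it easier to check what is actually sent before spending a rate-limited request on it.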

In [204]:
text = 'I love to'
res = query_gpt_neo(text, max_tokens=25)

print_response(text, res)

I love to read. I’ve read a ton throughout my life, and I’ve read more recently. One of the


In [205]:
text = 'I love to play the drums because it really relaxes me and'
res = query_gpt_neo(text)

print_response(text, res)

I love to play the drums because it really relaxes me and gives me a chance to be creative.” He has been playing since he was 5, and his father brought drums to his first band while in high school.

“It was a blast,” he writes in a blog post announcing his move to New York. “It felt like an escape from the daily rat race. I took all that frustration and pressure and decided to take it out on the street. I had no clue what I was getting myself into, but I thought music was the only way I could express myself.”

But the street turned out to be the very spot where Mr. Lefsetz was introduced to the music he loves. During a trip to Australia in high school, he was introduced to guitar by a local high school teacher.

“That was my salvation,” he says. “At first I was a little skeptical because the guy was a total guitar nerd. Now I’m a huge fan. He showed me how you can play any song on the guitar in an authentic and effective way, and how important it is to find the right song.”

He

In [212]:
dates_kwargs = load_prompt('short_dates', verbose=False)
res = query_gpt_neo(**dates_kwargs)

print_response(dates_kwargs['prompt'], res)

Input: 3/1/20
Output: March 1, 2020

Input: 09-04-99
Output: September 4, 1999

Input: 11/01/2017
Output: November 1, 2017

Input: 04/11/21
Output: April 11, 2021




In [210]:
dates_kwargs = load_prompt('short_dates', verbose=False)
res = query_gpt_neo(**select(dates_kwargs, drop=['stop']))

print_response(dates_kwargs['prompt'], res)

Input: 3/1/20
Output: March 1, 2020

Input: 09-04-99
Output: September 4, 1999

Input: 11/01/2017
Output: November 1, 2017

Input: 04/11/21
Output: April 11, 2021

Input: 01/
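
The two runs above differ only in whether `stop` is passed: without it, the model keeps going and starts a new `Input:` example. A minimal standalone sketch of the truncation step is below. `truncate_at_stop` is a hypothetical name, and the stop string `'\n\nInput:'` is an assumption chosen for illustration (the actual value stored in the `short_dates` prompt is not shown here).

```python
def truncate_at_stop(text, stop=None):
    """Cut `text` at the earliest occurrence of any stop sequence.

    `stop` may be None, a single string, or an iterable of strings.
    """
    if stop is None:
        return text
    stops = [stop] if isinstance(stop, str) else list(stop)
    # str.find returns -1 when a sequence is absent, so keep only real hits.
    # Empty stop strings are skipped since they would match at index 0.
    hits = [i for i in (text.find(s) for s in stops if s) if i >= 0]
    return text[:min(hits)] if hits else text


completion = ' April 11, 2021\n\nInput: 01/'
print(repr(truncate_at_stop(completion, stop='\n\nInput:')))
```

Since the Inference API call here takes no stop parameter, the truncation happens client side after the full completion comes back, so the extra tokens are still generated.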


In [178]:
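# query_gpt_neo returns only the generated text, so pass the prompt to
# print_response explicitly.
text = 'I love to play hockey because'
res = query_gpt_neo(text, max_tokens=250)

print_response(text, res)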

200
I love to play hockey because I want to win! But in reality it’s more complicated than that… and much of what you think makes it that way isn’t at all true!

It is easy to tell you to “play simple” yet I’ll bet if you showed up to a hockey game, most of the times that’s exactly what you’d be seeing. Because the truth is, if you look past the stats and try to analyse the game on its deepest level you quickly realise that the hockey world is littered with some of the most ridiculous rules and structures known to man.

We have a bunch of “goals for” lists floating around and there is no better example of this than the infamous “10,000 shots in a season” statistic. It is widely believed that a player is judged by the number of shots they rack up – however, just as you can only judge a number of games with the available information, you can only really judge the number of shots a team has shot in one season – the statistics don’t always tell you that much.

Now, we all know, it’

In [179]:
ml_txt = """Stochastic tokenization allows us to achieve better model
generalization on small datasets.""".replace('\n', ' ')
ml_kwargs = load_prompt('ml_abstract', ml_txt, verbose=False)
res = query_gpt_neo(**ml_kwargs)

print_response(ml_kwargs['prompt'], res)

200
A fellow grad student came up with an interesting idea for a machine learning paper. Please write a more detailed abstract expanding on their idea.

Idea:
Graph attention networks, a neural network architecture that uses masked self-attention to operate on graphs, addresses shortcomings of prior methods based on graph convolutions.

Abstract:
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural netwo

In [227]:
from requests import HTTPError

ml_txt = """Dropout is a regularization method that enables models to
generalize.""".replace('\n', ' ')
ml_kwargs = load_prompt('ml_abstract', ml_txt, verbose=False)
try:
    res = query_gpt_neo(**ml_kwargs)
except HTTPError as e:
    print('testing')
    raise RuntimeError(str(e)) from None
finally:
    print('in finally')

# print_response(prompt, res)

testing
in finally


RuntimeError: 429 Client Error: Too Many Requests for url: https://api-inference.huggingface.co/models/EleutherAI/gpt-neo-2.7B

In [219]:
requests.HTTPError

requests.exceptions.HTTPError

In [216]:
text = 'one ' * 20
res = query_gpt_neo(text)

print_response(text, res)

HTTPError: 429 Client Error: Too Many Requests for url: https://api-inference.huggingface.co/models/EleutherAI/gpt-neo-2.7B
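
The 429 responses above mean the API is rate limiting requests. One common way to cope is to retry with exponentially increasing delays. The sketch below is an illustration under assumptions, not something this notebook or jabberwocky already does: `retry_with_backoff` and `is_rate_limit` are hypothetical names, and the delay values are arbitrary. It relies only on the fact that the `HTTPError` raised by `raise_for_status()` carries the response object as `.response`.

```python
import time


def is_rate_limit(exc):
    """Return True if `exc` looks like an HTTP error with status 429."""
    response = getattr(exc, 'response', None)
    return getattr(response, 'status_code', None) == 429


def retry_with_backoff(func, should_retry, max_tries=4, base_delay=1.0,
                       sleep=time.sleep):
    """Call func() until it succeeds or max_tries attempts have failed.

    should_retry(exc) decides whether an exception is worth retrying.
    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(max_tries):
        try:
            return func()
        except Exception as e:
            last_try = attempt == max_tries - 1
            # Re-raise immediately on the final attempt or for errors that
            # retrying won't fix (e.g. a bad API key).
            if last_try or not should_retry(e):
                raise
            sleep(base_delay * 2 ** attempt)


# Example usage (assumes query_gpt_neo from the cell above is defined):
# res = retry_with_backoff(lambda: query_gpt_neo('one ' * 20), is_rate_limit)
```

The `sleep` argument is injectable so the helper can be exercised without actually waiting.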