In [3]:
import json
import requests
import tqdm
from petals import DistributedBloomForCausalLM
from transformers import BloomTokenizerFast
from data_processing import parse_example_file, get_ans, generate_prompt, generate_question

Convert the GSM8K dataset into the format used in the paper (question/thought/answer):

In [4]:
train_data = parse_example_file("data/train.jsonl")
test_data = parse_example_file("data/test.jsonl")
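
`parse_example_file` lives in `data_processing` and is not shown here. As a rough idea of what such a conversion might involve, here is a minimal sketch for a single record. It assumes the public GSM8K JSONL layout (`question` and `answer` fields, calculator annotations in `<<...>>`, final answer after `####`); the function name `parse_gsm8k_record` and the output keys are made up for illustration.

```python
import json
import re


def parse_gsm8k_record(line: str) -> dict:
    """Split one GSM8K JSONL line into question / thought / answer (hypothetical helper)."""
    record = json.loads(line)
    # GSM8K puts the final answer after a '####' marker at the end of the solution.
    reasoning, _, final = record["answer"].partition("####")
    # Drop calculator annotations such as <<48/2=24>>.
    reasoning = re.sub(r"<<.*?>>", "", reasoning).strip()
    return {
        "question": record["question"].strip(),
        "thought": reasoning,
        # Remove thousands separators so "1,200" becomes "1200".
        "answer": final.strip().replace(",", ""),
    }


sample = json.dumps({
    "question": "Natalia sold clips to 48 of her friends in April, and then she sold "
                "half as many clips in May. How many clips did Natalia sell altogether "
                "in April and May?",
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
              "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
              "#### 72",
})
parsed = parse_gsm8k_record(sample)
print(parsed["thought"])
print(parsed["answer"])
```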

We will solve the problems with plain chain-of-thought prompting and the distributed version of BLOOM. Unfortunately, the large `bloom-petals` model often has some of its blocks unavailable, so it cannot be relied on. In addition, Google Colab keeps crashing when working with these models (forcing a restart and a rerun of all preceding code), apparently because of the large number of external requests to other blocks. Locally I only have enough resources for `bloom-7b1-petals`, so that is the model used to check that everything works.
Following the paper, the prompt contains 8 examples, so the generated answer with its reasoning comes after the 9th `"A:"`.
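
To make the "9th `A:`" remark concrete, here is a small sketch of how the prompt could be laid out and how the generated part could be cut from the model output. `build_prompt`, `build_question` and `text_after_nth_answer` are illustrative stand-ins, not the real `generate_prompt` / `generate_question` from `data_processing`; the `Q:` / `A:` / `The answer is N.` layout is inferred from the printed output in this notebook.

```python
def build_prompt(examples: list) -> str:
    """Format solved examples as 'Q: ... / A: ... / The answer is N.' blocks (assumed layout)."""
    blocks = [
        f"Q: {ex['question']}\nA: {ex['thought']}\nThe answer is {ex['answer']}.\n\n"
        for ex in examples
    ]
    return "".join(blocks)


def build_question(example: dict) -> str:
    """Format the unsolved problem, ending with an open 'A:' for the model to continue."""
    return f"Q: {example['question']}\nA:"


def text_after_nth_answer(text: str, n: int) -> str:
    """Return everything after the n-th 'A:' marker that starts a line."""
    parts = text.split("\nA:")
    # parts[0] precedes the first marker, so the text after the n-th marker
    # begins at parts[n]; later markers are glued back in unchanged.
    return "\nA:".join(parts[n:])


examples = [
    {"question": f"What is {i} + 1?", "thought": f"{i} + 1 = {i + 1}.", "answer": str(i + 1)}
    for i in range(8)
]
prompt = build_prompt(examples) + build_question({"question": "What is 3 + 4?"})
# Pretend the model echoed the prompt and then kept generating past the answer.
model_output = prompt + " 3 + 4 = 7.\nThe answer is 7.\n\nQ: What is"
generated = text_after_nth_answer(model_output, 9)
print(repr(generated))
```

With 8 solved examples plus the open question, the prompt contains exactly 9 markers, so `n=9` selects the model's continuation.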

In [5]:
model_name = "bigscience/bloom-7b1-petals"
tokenizer = BloomTokenizerFast.from_pretrained(model_name)
model = DistributedBloomForCausalLM.from_pretrained(model_name, tuning_mode="ptune", pre_seq_len=16).cuda()

p1 = generate_prompt(train_data[:8])
q1 = generate_question(test_data[0])
prefix = tokenizer(p1 + q1, return_tensors="pt")["input_ids"].cuda()
outputs = model.generate(prefix, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Q: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
A: Natalia sold 48/2 = 24 clips in May.
Natalia sold 48+24 = 72 clips altogether in April and May.
The answer is 72.

Q: Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?
A: Weng earns 12/60 = $0.2 per minute.
Working 50 minutes, she earned 0.2 x 50 = $10.
The answer is 10.

Q: Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?
A: In the beginning, Betty has only 100 / 2 = $50.
Betty's grandparents gave her 15 * 2 = $30.
This means, Betty needs 100 - 50 - 30 - 15 = $5 more.
The answer is 5.

Q: Julie is reading a 120-page book. Yesterday, she was able to read 1

The reasoning and the answer are there, and the format is followed correctly overall (the only blemish is the tokens generated after the answer). Let's check another prompt and an easier problem.
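
The extra tokens after the answer can be dropped on the client side. A minimal sketch, assuming each solution ends at the first blank line (as in the prompt layout above); `cut_at_stop` is a hypothetical helper name:

```python
def cut_at_stop(generated: str, stop: str = "\n\n") -> str:
    """Keep only the text before the first stop sequence, if there is one."""
    head, _, _ = generated.partition(stop)
    return head.strip()


continuation = " She has 3 + 4 = 7 apples.\nThe answer is 7.\n\nQ: Julie is reading"
print(cut_at_stop(continuation))
```

This only trims the text after the fact; the model still spends time generating the discarded tokens.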

In [4]:
p2 = generate_prompt(train_data[8:16])
q2 = generate_question(test_data[33])
prefix = tokenizer(p2 + q2, return_tensors="pt")["input_ids"].cuda()
outputs = model.generate(prefix, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Q: Alexis is applying for a new job and bought a new set of business clothes to wear to the interview. She went to a department store with a budget of $200 and spent $30 on a button-up shirt, $46 on suit pants, $38 on a suit coat, $11 on socks, and $18 on a belt. She also purchased a pair of shoes, but lost the receipt for them. She has $16 left from her budget. How much did Alexis pay for the shoes?
A: Let S be the amount Alexis paid for the shoes.
She spent S + 30 + 46 + 38 + 11 + 18 = S + 143.
She used all but $16 of her budget, so S + 143 = 200 - 16 = 184.
Thus, Alexis paid S = 184 - 143 = $41 for the shoes.
The answer is 41.

Q: Tina makes $18.00 an hour.  If she works more than 8 hours per shift, she is eligible for overtime, which is paid by your hourly wage + 1/2 your hourly wage.  If she works 10 hours every day for 5 days, how much money does she make?
A: She works 8 hours a day for $18 per hour so she makes 8*18 = $144.00 per 8-hour shift
She works 10 hours a day and anythin

Generation is slow, and such a small model cannot be expected to get the calculations right, so from here on we use the original `bigscience/bloom` through the HuggingFace Inference API. As a bonus, setting a stop condition for generation gives output from which the answer is easier to extract.

In [6]:
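import os

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
# Read the access token from the environment instead of hardcoding it in the notebook.
# HF_API_TOKEN is a variable name chosen here; set it before running this cell.
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}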


def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


params = {
    "max_new_tokens": 128,
    "temperature": 1.0,
    "stop": ["\n\n"]
}
print(query({
    "inputs": (p1 + q1), "parameters": params
})[0]['generated_text'])

Q: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
A: Natalia sold 48/2 = 24 clips in May.
Natalia sold 48+24 = 72 clips altogether in April and May.
The answer is 72.

Q: Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?
A: Weng earns 12/60 = $0.2 per minute.
Working 50 minutes, she earned 0.2 x 50 = $10.
The answer is 10.

Q: Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?
A: In the beginning, Betty has only 100 / 2 = $50.
Betty's grandparents gave her 15 * 2 = $30.
This means, Betty needs 100 - 50 - 30 - 15 = $5 more.
The answer is 5.

Q: Julie is reading a 120-page book. Yesterday, she was able to read 1

Now we can run the experiment on the whole dataset, saving the predicted reasoning and answer so that the metrics can be computed afterwards.
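
`get_ans` is imported from `data_processing` and not shown. For reference, answer extraction and the accuracy metric might look roughly like the sketch below. It assumes every solution ends with a phrase of the form `The answer is N`; `extract_answer` and `accuracy` are illustrative names, not the notebook's actual helpers.

```python
import re
from typing import Optional


def extract_answer(solution: str) -> Optional[str]:
    """Return the number from the last 'The answer is N' phrase, or None if absent."""
    matches = re.findall(r"The answer is \$?(-?[\d,]*\.?\d+)", solution)
    if not matches:
        return None
    # Strip thousands separators so "1,200" compares equal to "1200".
    return matches[-1].replace(",", "")


def accuracy(predicted: list, gold: list) -> float:
    """Share of problems whose predicted number equals the reference number."""
    hits = sum(
        p is not None and float(p) == float(g)
        for p, g in zip(predicted, gold)
    )
    return hits / len(gold)


preds = [
    extract_answer("Natalia sold 48+24 = 72 clips.\nThe answer is 72."),
    extract_answer("I am not sure."),
    extract_answer("She earned $1,200.50.\nThe answer is $1,200.50"),
]
print(preds)
print(accuracy(preds, ["72", "5", "1200.5"]))
```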

In [15]:
with open("results/results.jsonl", "w") as f:
    for q in tqdm.tqdm(test_data):
        inp = (p1 + generate_question(q))
        solution = query({
            "inputs": inp, "parameters": params
        })[0]['generated_text'][len(inp):]
        answer = get_ans(solution)
        print(json.dumps({"solution": solution, "answer": answer}), file=f)

 21%|██        | 272/1319 [17:21<1:06:49,  3.83s/it]


KeyError: 0
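
The `KeyError: 0` means that `query` returned a dict rather than a list, so indexing it with `[0]` failed. That is presumably an error payload from the API (for example while the model is loading or the rate limit is hit), although the exact response was not captured here. A minimal retry wrapper, assuming failures come back as a dict and successes as a list of `{"generated_text": ...}` items; `query_with_retry` and its parameters are hypothetical:

```python
import time


def query_with_retry(query_fn, payload, retries=5, delay=10.0, sleep=time.sleep):
    """Call query_fn(payload) until it returns a non-empty list of generations.

    Assumes a failed request yields a dict (e.g. {"error": ...}) instead of a list.
    """
    last = None
    for attempt in range(retries):
        last = query_fn(payload)
        if isinstance(last, list) and last and "generated_text" in last[0]:
            return last
        if attempt < retries - 1:
            sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"no generation after {retries} attempts: {last!r}")


# Stand-in for the real `query`: fails twice, then succeeds.
calls = []


def fake_query(payload):
    calls.append(payload)
    if len(calls) < 3:
        return {"error": "temporarily unavailable"}
    return [{"generated_text": payload["inputs"] + " The answer is 7."}]


result = query_with_retry(fake_query, {"inputs": "Q: ...\nA:"}, sleep=lambda s: None)
print(result[0]["generated_text"])
```

In the loop above, `query(...)` would be replaced by `query_with_retry(query, ...)`. Opening the results file in append mode and skipping already processed questions would also avoid losing the first 272 predictions after a crash.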

Now let's implement ensembled CoT. We take 50 predictions with the same parameters as in the paper. Among the answers we pick the most frequent one, and as the prediction we keep one of the samples that produced that answer. We also store how many predictions had that answer.

In [17]:
params_ensemble = {
    "max_new_tokens": 128,
    "temperature": 0.7,
    "top_k": 50,
    "do_sample": True,
    "stop": ["\n\n"],
}
print(query({
    "inputs": (p1 + q1), "parameters": params_ensemble
})[0]['generated_text'][len(p1+q1):])

 Every day, she makes 8 * 2 = 16 eggs per day.
She eats 4 eggs for breakfast.
She bakes muffins with 4 eggs.
She sells the remainder 10 * 2 = 20 eggs per day.
The answer is 20.
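
The voting step described above can be sketched as follows. It assumes each sample has already been reduced to a `(solution, answer)` pair, with `None` for samples where no answer could be extracted; `majority_vote` is an illustrative name.

```python
from collections import Counter


def majority_vote(samples):
    """Pick the most frequent answer among (solution, answer) pairs.

    Returns (solution, answer, votes): one solution that produced the winning
    answer, the answer itself, and how many samples agreed on it.
    """
    answers = [a for _, a in samples if a is not None]
    if not answers:
        return None, None, 0
    best, votes = Counter(answers).most_common(1)[0]
    # Keep the first sampled solution that ends in the winning answer.
    solution = next(s for s, a in samples if a == best)
    return solution, best, votes


samples = [
    ("path one", "18"),
    ("path two", "20"),
    ("path three", "18"),
    ("no answer found", None),
]
print(majority_vote(samples))
```

With equal counts, `Counter.most_common` returns the answer that was seen first, so ties are broken by sampling order.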


