# **Evaluating the GPT-2 model on math problems**

# 1) Install the required packages

In [None]:
!pip install gpt-2-simple
import gpt_2_simple as gpt2

We download the 124-million-parameter model:

In [None]:
gpt2.download_gpt2(model_name="124M")

Fetching checkpoint: 1.05Mit [00:00, 189Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 4.89Mit/s]
Fetching hparams.json: 1.05Mit [00:00, 276Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:10, 45.5Mit/s]                                  
Fetching model.ckpt.index: 1.05Mit [00:00, 116Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 7.78Mit/s]
Fetching vocab.bpe: 1.05Mit [00:00, 5.66Mit/s]


We begin building the data:

In [None]:
!pip install datasets evaluate transformers[sentencepiece]
!wget https://github.com/crux82/squad-it/raw/master/SQuAD_it-train.json.gz
!wget https://github.com/crux82/squad-it/raw/master/SQuAD_it-test.json.gz
!gzip -dkv SQuAD_it-*.json.gz

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting datasets
  Downloading datasets-2.7.1-py3-none-any.whl (451 kB)
Collecting evaluate
  Downloading evaluate-0.3.0-py3-none-any.whl (72 kB)
Collecting transformers[sentencepiece]
  Downloading transformers-4.24.0-py3-none-any.whl (5.5 MB)
Collecting responses<0.19
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting multiprocess
  Downloading multiprocess-0.70.14-py37-none-any.whl (115 kB)
Collecting xxhash
  Downloading xxhash-3.1.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
Collecting huggingface-hub<1.0.0,>=0.2.0
  Downloadin

--2022-11-30 00:21:00--  https://github.com/crux82/squad-it/raw/master/SQuAD_it-train.json.gz
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/crux82/squad-it/master/SQuAD_it-train.json.gz [following]
--2022-11-30 00:21:01--  https://raw.githubusercontent.com/crux82/squad-it/master/SQuAD_it-train.json.gz
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7725286 (7.4M) [application/octet-stream]
Saving to: ‘SQuAD_it-train.json.gz’


2022-11-30 00:21:01 (89.9 MB/s) - ‘SQuAD_it-train.json.gz’ saved [7725286/7725286]

--2022-11-30 00:21:01--  https://github.com/crux82/squad-it/raw/master/SQuAD_it-test.

# 2) Load the data

We import the Hugging Face 🤗 function for downloading datasets:

In [None]:
### Math data
from datasets import load_dataset

For this project we use the MathQA dataset, which contains questions from several areas of mathematics, each with its worked solution (rationale), the possible options and the correct answer:

In [None]:
datos = load_dataset("math_qa", split="train")

Downloading builder script:   0%|          | 0.00/3.25k [00:00<?, ?B/s]

Downloading metadata:   0%|          | 0.00/1.56k [00:00<?, ?B/s]

Downloading readme:   0%|          | 0.00/7.44k [00:00<?, ?B/s]

Downloading and preparing dataset math_qa/default to /root/.cache/huggingface/datasets/math_qa/default/0.1.0/67fc1cc5d22b185002c6fd16e19e4d5215eae01fb04d656bed83204ba6ee55ff...


Downloading data:   0%|          | 0.00/7.30M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/29837 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/2985 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/4475 [00:00<?, ? examples/s]

Dataset math_qa downloaded and prepared to /root/.cache/huggingface/datasets/math_qa/default/0.1.0/67fc1cc5d22b185002c6fd16e19e4d5215eae01fb04d656bed83204ba6ee55ff. Subsequent calls will reuse this data.


# 3) Prepare the data for training our model

To process the data, we use the *Pandas* library:

In [None]:
import pandas as pd

We convert our data into a *Pandas* DataFrame:

In [None]:
datos=pd.DataFrame(datos)

Since the data has the columns *Problem*, *Rationale*, *options* and *correct*, we build a new dataset with a single column, where each entry is one problem in the following format:

* Problem statement:
$$Problem: \cdots $$
$$Options: \cdots $$

* Solution:
$$Rationale: \cdots $$
$$Correct: \cdots $$


For example, the problem at position 1 would look as follows:

**Problem statement:**
Average age of students of an adult school is 40 years . 120 new students whose average age is 32 years joined the school . as a result the average age is decreased by 4 years . find the number of students of the school after joining of the new students.

a ) 1200 , b ) 120 , c ) 360 , d ) 240 , e ) none of these

**Solution:** let the original no . of students be x . according to situation , 40 x + 120 * 32 = ( x + 120 ) 36 ⇒ x = 120 so , required no . of students after joining the new students = x + 120 = 240 .

answer : d

In [None]:
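datos_entrenamiento=[]
# Build one training string per problem: statement, options and worked solution.
# The "decription" spelling is left unchanged so it matches the prompts used in section 5.
for i in range(0,len(datos)):
  datos_entrenamiento.append("Problem decription: "+datos["Problem"][i] + " Possible options: "+datos["options"][i]+
                             ". Solution: "+ datos["Rationale"][i])
  # Report progress every 1000 rows instead of printing once per row
  if i % 1000 == 0 or i == len(datos)-1:
    print("Progress: ",i/(len(datos)-1)*100,"%")
datos_entrenamiento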
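A minimal sketch of the same template as a standalone helper, which makes it easy to check the format on a single record. The name `formatear_problema` is hypothetical (it is not part of the notebook), the record is a plain dictionary rather than a DataFrame row, and the sample record is a shortened, illustrative version of the example above.

```python
def formatear_problema(registro):
    """Build one training string from a MathQA-style record (hypothetical helper)."""
    return ("Problem decription: " + registro["Problem"]
            + " Possible options: " + registro["options"]
            + ". Solution: " + registro["Rationale"])

# Shortened, illustrative record based on the example at position 1
ejemplo = {
    "Problem": "find the number of students of the school after joining of the new students.",
    "options": "a ) 1200 , b ) 120 , c ) 360 , d ) 240 , e ) none of these",
    "Rationale": "x + 120 = 240 . answer : d",
}
texto = formatear_problema(ejemplo)
print(texto)
```

Keeping the template in one function means the prompt used at generation time can be built from the same code as the training strings.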

Once the data is ready, we export it to a CSV file:

In [None]:
datos_entrenamiento=pd.DataFrame(datos_entrenamiento)

In [None]:
# Exporting data as a csv file
datos_entrenamiento.to_csv('/content/datos_entrenamiento.csv',index=False)
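As far as we understand, when gpt-2-simple is given a single-column CSV it skips the header row and wraps each row in `<|startoftext|>` and `<|endoftext|>` tokens, which would explain the delimiters visible in the samples printed during training. The sketch below imitates that behaviour with the standard `csv` module. The name `envolver_filas` and the two sample rows are illustrative assumptions, and this is not the library's actual code.

```python
import csv
import io

START, END = "<|startoftext|>", "<|endoftext|>"

def envolver_filas(csv_text):
    """Skip the header row and wrap the first column of every row in the delimiters."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # header row written by DataFrame.to_csv (the column name "0")
    return "".join(START + row[0] + END + "\n" for row in reader)

# Two illustrative rows in the same shape as datos_entrenamiento.csv
contenido = (
    '0\n'
    '"Problem decription: p1 Possible options: a ) 1 , b ) 2. Solution: s1"\n'
    '"Problem decription: p2 Possible options: a ) 3 , b ) 4. Solution: s2"\n'
)
resultado = envolver_filas(contenido)
print(resultado)
```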

# 4) Train the model

We store the name of the data file we just built:

In [None]:
file_name = "datos_entrenamiento.csv"

We fine-tune the model:

In [None]:
sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset=file_name,
              model_name="124M",
              steps=200,
              restore_from="fresh",
              run_name="run1",
              print_every=10,
              sample_every=20,
              save_every=20
              )

Loading checkpoint models/124M/model.ckpt
Loading dataset...


100%|██████████| 1/1 [00:00<00:00,  2.86it/s]


dataset has 4653584 tokens
Training...
[10 | 27.64] loss=2.30 avg=2.30
[20 | 49.45] loss=2.21 avg=2.26
Saving checkpoint/run1/model-20
 = ) 4 , 4 , 4 , 4 , 4 , 4 , 4 , 4 , 4 2 , 3 , 3, 3 ) 4 , 4 , 4 , 4 , 4 , 4 , 4 1, 3 , 3 , 3, 3, 3, 3, 3, 3 , 3, 3, 3, 3, 3 2, 3, 3, 3, 3, 3, 3 , 3, 3, 3, 3, 3, 3, 3 2, 3, 3, 3 2, 3, 3, 3, 3, 3, 3 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3 2, 3 2, 3, 3, 3 3, 3, 3 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3 2, 3, 3 2, 3 3, 3, 3, 3, 3, 3, 3, 3) 4 , 6 , 7 , 8 , 9 , 10 , 12 , 19 , 21 , 23 , 27 , 32 , 40 , 42 , 44 , 45 , 49 , 55 , 58 , 64 , 66 , 68 , 80 , 87 , 92 , 96 , 98 , 100 , 108 , 112, 119, 125 , 123 , 134 , 135 , 137 , 142 , 145 , 152 , 155 , 156 , 157 , 160 , 161 , 162 , 166 , 172 , 172, 173 , 173, 175, 177, 182, 185 , 187, 190 , 195 , 199 , 199 , 199, 199 , 195, 199 , 100 , 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100



 – of 3,749 points . so, if we add 10 . then we get 8 , 8 , 6 , 5 , 5, 7 , 5 . or so it seems . , so . , if 3,744 = 5 , then we get 7 , so we get 7 , so we need to be careful with our numbers . , so we get 3 ,745 , so we need to remove 7 from the equation , and correct this equation to get 7 , so , you have to say , 6,525 * 10 * 2 = 7, so let 3 be the sum of the numbers : you have got 3 , but he can not remove 13 if he gets 3 , then he has got 2 , so we can remove 13 . , so then , the sum we got is 3,000 , but he needs to add 2 to the equation to get 3,000 x 2 . . , he subtracted 2 from , so there is no time to add more than 1 so that he gets 3,000 so that he can remove the total from , thus, he needs to subtract 1,000 , i. e . 0 , which requires 10 minutes , so , time to do so . . . i have got 1 minute and 7 minutes . so , i got 1 minute and 7 minutes . answer : a<|endoftext|>
<|startoftext|>Problem decription: a man sold a house in the 3 years he sold the house . after 4 years , when
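Each progress line in the training output appears to have the form `[step | elapsed seconds] loss=... avg=...`. A minimal sketch for pulling those numbers out of a saved log, for example to plot the loss later. The helper name `leer_progreso` is hypothetical, and the log lines are copied from the output above.

```python
import re

# One progress line looks like: [10 | 27.64] loss=2.30 avg=2.30
PATRON = re.compile(r"\[(\d+) \| ([\d.]+)\] loss=([\d.]+) avg=([\d.]+)")

def leer_progreso(log):
    """Return (step, seconds, loss, avg_loss) tuples for every progress line in the log."""
    return [(int(s), float(t), float(l), float(a)) for s, t, l, a in PATRON.findall(log)]

log = """[10 | 27.64] loss=2.30 avg=2.30
[20 | 49.45] loss=2.21 avg=2.26
Saving checkpoint/run1/model-20"""
progreso = leer_progreso(log)
print(progreso)
```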

# 5) Use the model

In [None]:
gpt2.generate(sess,run_name="run1")

<|startoftext|>Problem decription: if a man walks for 3 miles in the direction opposite to that of b and b runs for 5 miles in the opposite direction , what is the distance between them ? Possible options: a ) 6 , b ) 8 , c ) 11 , d ) 7 , e ) 9. Solution: "distance between b and d = 3 * 5 = 8 miles distance between b and d = 7 miles distance between d and b = 5 * 5 = 6 miles distance between a and b = 6 + 2 = 4 miles distance between d and d = 4 * 5 = 8 miles distance between d and b = 8 + 2 = 3 miles distance between d and d + 3 = 3 * 5 = 36 miles distance between d and b = 3 * 5 + 36 = 8 miles distance between d and b = 8 + 2 = 6 miles distance between d and b = 6 + 2 = 10 miles distance between d and b = 6 + 2 = 10 miles distance between d and b = 10 + 2 = 14 miles distance between d and b = 14 + 2 = 18 miles distance between d and b = 18 + 2 = 24 miles distance between d and b = 24 + 2 = 32 miles distance between d and b = 32 + 2 = 36 miles distance between d and b = 32 + 2 = 48 mi

In [None]:
gpt2.generate(sess,
              length=250,
              temperature=0.3,
              prefix="Problem decription: if a man walks for 7 miles in the direction opposite to that of b and b runs for 6 miles in the opposite direction , what is the distance between them ? Possible options: a ) 6 , b ) 8 , c ) 14 , d ) 7 , e ) 9. Solution:",
              nsamples=3,
              batch_size=1,
              top_k=40)

Problem decription: if a man walks for 7 miles in the direction opposite to that of b and b runs for 6 miles in the opposite direction , what is the distance between them ? Possible options: a ) 6 , b ) 8 , c ) 14 , d ) 7 , e ) 9. Solution: "distance = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2
Problem decription: if a man walks for 7 miles in the direction opposite to that of b and b runs for 6 miles in the opposite direction , what is the distance between them ? Possible options: a ) 6 , b ) 8 , c ) 14 , d ) 7 , e ) 9. Solution: "distance = 
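The repeated `( 7 + 6 ) / 2` in the sample above is likely related to the low temperature: with `temperature=0.3` the sampling distribution concentrates on the most probable token, and `top_k=40` discards everything outside the 40 most probable tokens. A minimal pure-Python sketch of both steps on made-up logits. The values and the function name `probabilidades` are illustrative assumptions, and this is not gpt-2-simple's implementation.

```python
import math

def probabilidades(logits, temperature=1.0, top_k=0):
    """Softmax over logits after temperature scaling and optional top-k filtering."""
    escalados = [x / temperature for x in logits]
    if top_k > 0:
        # Keep only the top_k largest logits; the rest get probability 0
        umbral = sorted(escalados, reverse=True)[min(top_k, len(escalados)) - 1]
        escalados = [x if x >= umbral else float("-inf") for x in escalados]
    maximo = max(escalados)
    exps = [math.exp(x - maximo) for x in escalados]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.5, 0.5, -1.0]  # made-up scores for four candidate tokens
p_alta = probabilidades(logits, temperature=1.0)
p_baja = probabilidades(logits, temperature=0.3)
p_top2 = probabilidades(logits, temperature=1.0, top_k=2)
print(p_alta[0], p_baja[0], p_top2)
```

Lowering the temperature raises the probability of the top token, so once the model starts a pattern it tends to keep choosing the same continuation.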

In [None]:
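def respuestas_gpt2(pregunta,temp):
  # Generate one completion for the prompt `pregunta` at temperature `temp`.
  # return_as_list=True makes gpt2.generate return the text instead of only printing it.
  return gpt2.generate(sess,
              length=250,
              temperature=temp,
              prefix=pregunta,
              nsamples=1,
              batch_size=1,
              top_k=40,
              return_as_list=True)[0]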
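To score the model, each generated text has to be reduced to a single option letter. The MathQA rationales shown in this notebook end with `answer : <letter>`, as in the example in section 3, so one possible approach is a regular expression. The helper `extraer_respuesta` is hypothetical, and it returns `None` when the model never produces that pattern.

```python
import re

def extraer_respuesta(texto):
    """Return the first option letter found after 'answer :' in the text, or None."""
    m = re.search(r"answer\s*:\s*([a-e])\b", texto.lower())
    return m.group(1) if m else None

print(extraer_respuesta("required no . of students = x + 120 = 240 . answer : d"))
print(extraer_respuesta("distance = ( 7 + 6 ) / 2 = ( 7 + 6 ) / 2"))
```

The extracted letter can then be compared with the *correct* column of the dataset to compute an accuracy.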