In [None]:
import json
import os

import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
data_dir = "/content/drive/MyDrive/Data"
model_dir = "/content/drive/MyDrive/Models"
model_dir

'/content/drive/MyDrive/Models'

In [None]:
# ====== Config ======
# model_name = "microsoft/phi-1"
model_name = "Salesforce/codegen-350M-mono"
json_path = data_dir + "/qas_data.json"  # <-- change this
output_dir = model_dir + "/finetuned-roqeto"

In [None]:
# ====== Load Dataset ======
def load_json_dataset(json_path):
    """Load the Q/A JSON file and return a Dataset with a single "text" column."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Merge the three sources and render each item with a Question/Answer template
    items = data["arxiv"] + data["stackexchange"] + data["wikibook"]
    return Dataset.from_list([
        {"text": f"Question: {item['question']}\nAnswer: {item['answer']}"}
        for item in items
    ])
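The loader indexes `item['question']` and `item['answer']` directly, so one malformed record would raise `KeyError`. A minimal sketch of a more defensive record builder, assuming the same three top-level keys as above. `build_records` and its `sources` parameter are hypothetical names, not part of the original notebook, and unlike the original it strips surrounding whitespace:

```python
def build_records(data, sources=("arxiv", "stackexchange", "wikibook")):
    """Flatten the per-source Q/A lists into {"text": ...} records.

    Hypothetical helper: items without a question or an answer are skipped
    (and counted) instead of raising KeyError.
    """
    records, skipped = [], 0
    for source in sources:
        for item in data.get(source, []):
            question = (item.get("question") or "").strip()
            answer = (item.get("answer") or "").strip()
            if not question or not answer:
                skipped += 1
                continue
            records.append({"text": f"Question: {question}\nAnswer: {answer}"})
    return records, skipped


# Tiny made-up sample, only to show the behaviour
sample = {
    "arxiv": [{"question": "What is delta-v?", "answer": "A change in velocity."}],
    "stackexchange": [{"question": "Incomplete item"}],
    "wikibook": [],
}
records, skipped = build_records(sample)
print(len(records), skipped)
```

The returned list can be passed to `Dataset.from_list` exactly as before; the skipped count gives a quick signal about data quality.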



In [None]:
dataset = load_json_dataset(json_path)

In [None]:
dataset[-1]

{'text': "Question: What is the Skyhook concept?\nAnswer: Space Elevators have been a theoretical transportation method since 1895. The original idea is impractical to build. This step adds a much more practical design as a transport hub for getting from one orbit to another quickly and efficiently. Initial construction can use materials from Earth, but in larger sizes or locations beyond Earth orbit using local materials is assumed. The popular concept of a space elevator is based on the original design proposed by Tsiolkovsky in the late 19th century. It involves a single tower/cable extending all the way past Geosynchronous (24 hour) Earth Orbit (GEO). If the center of mass is at GEO and matches the Earth's daily rotation it will appear to hang motionless relative to the ground. Getting to space in theory then becomes an elevator ride. There are several problems with this simplistic design: The Skyhook concept addresses all these problems. Instead of a static cable that stays over a

In [None]:
# ====== Load Tokenizer and Model ======
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Some weights of the model checkpoint at Salesforce/codegen-350M-mono were not used when initializing CodeGenForCausalLM: ['transformer.h.0.attn.causal_mask', 'transformer.h.1.attn.causal_mask', 'transformer.h.10.attn.causal_mask', 'transformer.h.11.attn.causal_mask', 'transformer.h.12.attn.causal_mask', 'transformer.h.13.attn.causal_mask', 'transformer.h.14.attn.causal_mask', 'transformer.h.15.attn.causal_mask', 'transformer.h.16.attn.causal_mask', 'transformer.h.17.attn.causal_mask', 'transformer.h.18.attn.causal_mask', 'transformer.h.19.attn.causal_mask', 'transformer.h.2.attn.ca

In [None]:
tokenizer.add_special_tokens({'pad_token': '[PAD]'})

1
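The `1` above means one new token was added to the tokenizer. The model needs an embedding row for that new id. Training ran without an index error here, which suggests this checkpoint's embedding matrix already had spare rows, but that is not guaranteed for other models. A hedged sketch of a guard follows: `ensure_embedding_room` is a hypothetical helper, while `get_input_embeddings` and `resize_token_embeddings` are the usual `transformers` model methods. The `Fake*` classes are stand-ins so the sketch runs without downloading a checkpoint:

```python
def ensure_embedding_room(model, tokenizer):
    """Resize the embeddings only when the tokenizer has more ids than rows."""
    rows = model.get_input_embeddings().num_embeddings
    if len(tokenizer) > rows:
        model.resize_token_embeddings(len(tokenizer))
        return True
    return False


# Stand-ins (not real transformers classes) so the sketch is runnable offline
class FakeEmbedding:
    def __init__(self, rows):
        self.num_embeddings = rows


class FakeModel:
    def __init__(self, rows):
        self.embedding = FakeEmbedding(rows)

    def get_input_embeddings(self):
        return self.embedding

    def resize_token_embeddings(self, new_size):
        self.embedding = FakeEmbedding(new_size)


fake_tokenizer = ["tok"] * 12  # len() is all the helper needs
small_model = FakeModel(rows=10)
large_model = FakeModel(rows=20)
grew = ensure_embedding_room(small_model, fake_tokenizer)
untouched = ensure_embedding_room(large_model, fake_tokenizer)
print(grew, small_model.get_input_embeddings().num_embeddings)
print(untouched, large_model.get_input_embeddings().num_embeddings)
```

In the notebook the call would be `ensure_embedding_room(model, tokenizer)`, placed right after `add_special_tokens`.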

In [None]:
# Tokenize the dataset
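def tokenize(batch):
    # With batched=True, batch["text"] is a list of strings.
    # Every example is truncated or padded to exactly 512 tokens.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

tokenized_dataset = dataset.map(tokenize, batched=True)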

Map:   0%|          | 0/154 [00:00<?, ? examples/s]
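The training texts end right after the answer, with no end-of-sequence marker, so the model gets no signal for where an answer stops. That may contribute to generations that run until `max_new_tokens`. One possible remedy, sketched under assumptions, is to append the EOS token before tokenizing. `append_eos` is a hypothetical helper, and `"<|endoftext|>"` is the usual GPT-2 style EOS string; in the notebook `tokenizer.eos_token` would be used instead:

```python
def append_eos(batch, eos_token):
    """Return a batched example with eos_token appended to every text."""
    return {"text": [text + eos_token for text in batch["text"]]}


# Made-up batch in the same Question/Answer template as the dataset
batch = {"text": ["Question: Q1\nAnswer: A1", "Question: Q2\nAnswer: A2"]}
marked = append_eos(batch, "<|endoftext|>")
print(marked["text"][0])
```

This could run as `dataset.map(lambda b: append_eos(b, tokenizer.eos_token), batched=True)` before the tokenization step. Note that truncation at 512 tokens would still cut the marker off very long examples.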

In [None]:
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
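With `mlm=False` the collator builds labels by copying `input_ids` and replacing padding positions with `-100`, so padded positions do not contribute to the loss; the one-position shift for next-token prediction happens inside the model. A plain-Python sketch of that masking step, for illustration only. `causal_lm_labels` is a hypothetical name and the token ids are made up:

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def causal_lm_labels(input_ids, pad_token_id):
    """Copy input_ids into labels, masking every padding position."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


PAD_ID = 0  # hypothetical pad id, not the real id of '[PAD]'
padded_ids = [11, 12, 13, PAD_ID, PAD_ID]
labels = causal_lm_labels(padded_ids, PAD_ID)
print(labels)  # [11, 12, 13, -100, -100]
```

This is also why a dedicated `[PAD]` token is convenient: if the pad token were the same as EOS, every EOS position would be masked out of the loss as well.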

In [None]:
# ====== Training Arguments ======
training_args = TrainingArguments(
    output_dir=output_dir,
    per_device_train_batch_size=4,
    num_train_epochs=50,
    logging_steps=10,
    save_steps=500,
    save_total_limit=2,
    warmup_steps=10,
    weight_decay=0.01,
    logging_dir=os.path.join(output_dir, "logs"),
    fp16=torch.cuda.is_available(),
    report_to="none",
)

In [None]:
# ====== Trainer ======
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    # Recent transformers releases deprecate `tokenizer` here in favor of `processing_class`
    tokenizer=tokenizer,
    data_collator=data_collator,
)


In [None]:
# ====== Train ======
trainer.train()

Step,Training Loss
10,3.7766
20,3.8442
30,3.7672
40,3.6543
50,2.4842
60,2.454
70,2.5259
80,2.2084
90,1.2802
100,1.1276


TrainOutput(global_step=1950, training_loss=0.19599869666191247, metrics={'train_runtime': 986.2549, 'train_samples_per_second': 7.807, 'train_steps_per_second': 1.977, 'total_flos': 7197647123251200.0, 'train_loss': 0.19599869666191247, 'epoch': 50.0})
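The numbers in `TrainOutput` can be cross-checked with a little arithmetic. The sketch below assumes a single device and no gradient accumulation (neither is set in the training arguments above); `expected_steps` is a hypothetical helper:

```python
import math


def expected_steps(num_examples, batch_size, epochs):
    """Optimizer steps for one device without gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


num_examples = 154   # from the Map progress bar
batch_size = 4       # per_device_train_batch_size
epochs = 50          # num_train_epochs
runtime = 986.2549   # train_runtime in seconds

steps = expected_steps(num_examples, batch_size, epochs)
print(steps)                                          # 1950
print(round(steps / runtime, 3))                      # 1.977 steps per second
print(round(num_examples * epochs / runtime, 3))      # 7.807 samples per second
```

With 1950 steps, `warmup_steps=10` covers about half a percent of training, and `save_steps=500` writes checkpoints at steps 500, 1000 and 1500, of which `save_total_limit=2` keeps the last two. An average training loss near 0.2 after 50 passes over 154 examples is also consistent with heavy memorization of the training set.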

In [None]:
def query_model(prompt, max_new_tokens=200):
    """Generate a continuation of `prompt`; the returned text includes the prompt itself."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
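The model was fine-tuned on texts of the form `Question: ...\nAnswer: ...`, so prompts are likely to work best when they follow the same template and end with `Answer:`. A small sketch of two helpers for that; `format_prompt` and `extract_answer` are hypothetical names, not part of the original notebook:

```python
def format_prompt(question):
    """Wrap a raw question in the template used for the training texts."""
    return f"Question: {question.strip()}\nAnswer:"


def extract_answer(generated_text):
    """Return the first answer, cutting at any follow-up Question/Answer line."""
    _, found, tail = generated_text.partition("Answer:")
    if not found:
        return generated_text.strip()
    for marker in ("\nQuestion:", "\nAnswer:"):
        tail = tail.split(marker, 1)[0]
    return tail.strip()


prompt = format_prompt("How to launch a rocket? ")
print(repr(prompt))
print(extract_answer("Question: Q?\nAnswer: First.\nAnswer: Second."))
```

Usage would look like `extract_answer(query_model(format_prompt("How to launch a rocket?")))`.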

In [None]:
print(query_model("How to launch a rocket?"))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


How to launch a rocket?
Answer: Launching a rocket is beneficial because of the large amount of gravitational
propulsion in the solar system. The required fuel mass on the rocket is
inquires to reach a speed of maximum speed. Thus, the amount of the
gravitational propulsion on the rocket is reduced to reduce the
solar system cost, making it suitable for transfer and storage of momentum
and electrical power. By expanding this reduceable rocket mass on the
solar system, it is shown that the acceleration of the rocket is
inquires to reach a speed of maximum speed. Thus, the amount of the
gravitational propulsion on the rocket is reduced to reduce the
solar system cost, making it suitable for transfer and storage of momentum
and electrical power. By expanding this reduceable rocket mass on the
solar system, it is shown that the fuel mass on the rocket is
inquires to reach a speed of maximum speed. Thus, the amount of the
g
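
The output above loops over the same sentences until the token budget runs out. Decoding options such as `no_repeat_ngram_size` and `repetition_penalty`, both standard `generate` arguments, may reduce this, and passing `pad_token_id` explicitly avoids the "Setting `pad_token_id`" message. A hedged sketch: `generation_kwargs` is a hypothetical helper, and the values 3 and 1.2 are illustrative guesses, not tuned settings:

```python
def generation_kwargs(pad_token_id, eos_token_id, max_new_tokens=200):
    """Collect decoding options intended to discourage repetitive loops."""
    return {
        "max_new_tokens": max_new_tokens,
        "pad_token_id": pad_token_id,   # set explicitly instead of relying on the fallback
        "eos_token_id": eos_token_id,
        "no_repeat_ngram_size": 3,      # illustrative: forbid repeating any 3-gram
        "repetition_penalty": 1.2,      # illustrative: down-weight already generated tokens
    }


# 50256 is the eos_token_id reported in the log message above
kwargs = generation_kwargs(pad_token_id=50256, eos_token_id=50256)
print(sorted(kwargs))
```

Inside `query_model` this could be applied as `model.generate(**inputs, **generation_kwargs(tokenizer.pad_token_id, tokenizer.eos_token_id))`.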


In [None]:
# Example usage
print(query_model("Question: How would you make sure the rocket flies straight?"))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: How would you make sure the rocket flies straight?
Answer: A vast wealth of literature exists on the topic of rocket trajectory
optimisation, particularly in the area of interplanetary trajectories due to
its relevance today. Studies on optimising interstellar and intergalactic
trajectories are usually performed in flat spacetime using an analytical
approach, with very little focus on optimising interstellar trajectories in a
general relativistic framework. This paper examines the use of quantum reinforcement
learning as a promising solution for the problem of interplanetary trajectories?
Answer: The relativistic framework generalizes the power reinforcement
learning to a wider range of input parameters? The important points for the
framework are the speed of light and the speed of light at a scale of 10^3
(eV/kW) and the speed of light at a scale of 10^3 (eV/kUV). Using a
multi-element slow reactor, we explored the use of quantum reinforcement
learning as a


In [None]:
print(query_model("What rocket fuel would you use to test your rocket?"))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


What rocket fuel would you use to test your rocket? The required fuel mass for a rocket is $10^{3}$ kg/m^3 (from the thrust profile). We find that a rocket with a required fuel mass of $10^{3}$ kg/m^3 would be able to use a planet's (usually the Earth's) atmosphere as a supply of fuel for the planet. This significantly limits the system performance, payload capacity, and mission flexibility. We choose the thrust profile to be the middle 90% of where people live. This is because the natural environment is the most Earth-like by a number of measures. In the previous sections of this book we have discussed individual systems which carry out purposeful functions. In this section we survey the possible solutions of a program for rocket fuel. It is possible to develop a full economy of space ships, but there is a lot of work to do. So we have not selected all the preferred solutions but rather selected a narrow range of versions. The 2016 Air Force Research Laboratory (


In [None]:
# ====== Save Model ======
trainer.save_model(output_dir)
tokenizer.save_pretrained(output_dir)

('/content/drive/MyDrive/Models/finetuned-roqeto/tokenizer_config.json',
 '/content/drive/MyDrive/Models/finetuned-roqeto/special_tokens_map.json',
 '/content/drive/MyDrive/Models/finetuned-roqeto/vocab.json',
 '/content/drive/MyDrive/Models/finetuned-roqeto/merges.txt',
 '/content/drive/MyDrive/Models/finetuned-roqeto/added_tokens.json',
 '/content/drive/MyDrive/Models/finetuned-roqeto/tokenizer.json')
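
Before relying on the saved directory in a later session, it can help to confirm that the expected files are present. A minimal sketch: `missing_files` is a hypothetical helper, the tokenizer file names are taken from the output above, and `config.json` is assumed to be written by `save_model`. The weights file is not checked because its name depends on the library version. The demo uses a temporary directory instead of Drive:

```python
import os
import tempfile

EXPECTED_FILES = (
    "config.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "vocab.json",
    "merges.txt",
    "added_tokens.json",
    "tokenizer.json",
)


def missing_files(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `directory`."""
    present = set(os.listdir(directory))
    return [name for name in expected if name not in present]


# Demo: create every expected file except the last one, then check
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES[:-1]:
        open(os.path.join(tmp, name), "w").close()
    demo_missing = missing_files(tmp)
print(demo_missing)  # ['tokenizer.json']
```

In the notebook, `missing_files(output_dir)` should return an empty list, after which `AutoModelForCausalLM.from_pretrained(output_dir)` and `AutoTokenizer.from_pretrained(output_dir)` can reload the fine-tuned model.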