## Fireship GPT
An attempt at making LLMs emulate the tone, pacing, and content style of [Fireship](https://www.youtube.com/@Fireship) by fine-tuning them on curated datasets.

Hugging Face model: [AdithyaSK/Fireship-GPT-v1](https://huggingface.co/AdithyaSK/Fireship-GPT-v1)

`This notebook runs on a free Google Colab instance with a T4 GPU`


In [1]:
!pip install transformers 
!pip install accelerate
!pip install bitsandbytes
!pip install SentencePiece

In [None]:
## Load the model on a T4 GPU
import torch
from transformers import GenerationConfig, TextStreamer
from transformers import BitsAndBytesConfig
from transformers import MistralForCausalLM, LlamaTokenizer


bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = MistralForCausalLM.from_pretrained(
    'AdithyaSK/Fireship-GPT-v1',
    quantization_config=bnb_config,
    trust_remote_code=True,
)
tokenizer = LlamaTokenizer.from_pretrained('AdithyaSK/Fireship-GPT-v1', trust_remote_code=True)
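
A back-of-the-envelope sketch of why the 4-bit setting matters on a T4, which has 16 GB of memory. The parameter count below is an assumption (roughly the size of a Mistral-7B-class model), not a value read from the checkpoint, and `weight_gb` is a hypothetical helper. The estimate covers weights only; quantization constants, any layers left unquantized, activations, and the KV cache add to it.

```python
# Rough weight-memory estimate. ASSUMED_PARAMS is an assumption,
# not a number read from the AdithyaSK/Fireship-GPT-v1 checkpoint.
ASSUMED_PARAMS = 7.2e9


def weight_gb(num_params: float, bits_per_param: float) -> float:
    """Return weight storage in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


fp16_gb = weight_gb(ASSUMED_PARAMS, 16)  # unquantized half precision
nf4_gb = weight_gb(ASSUMED_PARAMS, 4)    # 4-bit weights, ignoring overhead

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.1f} GB")
```

Under this assumption the half-precision weights alone would nearly fill the card, while the 4-bit weights leave room for generation.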

In [4]:
# Title of the video
video_title = "Rust in 100 seconds"

# Short summary of the video
video_summary = "A 100 second video on Rust not a code report"


prompt = f"""
[INST]
You are youtuber called Fireship you make engaging high-intensity and entertaining coding tutorials and tech news. 
you covers a wide range of topics relevant to programmers, aiming to help them learn and improve their skills quickly.

Given the title of the video : {video_title} 
and a small summary : {video_summary}
[/INST]

Generate the video : 
"""

generation_config = GenerationConfig(
    repetition_penalty=1.2,
    max_new_tokens=1024,
    temperature=0.9,
    top_p=0.95,
    top_k=40,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    do_sample=True,
    use_cache=True,
    return_dict_in_generate=True,
    output_attentions=False,
    output_hidden_states=False,
    output_scores=False,
)
# Stream tokens to stdout as they are generated
streamer = TextStreamer(tokenizer)
batch = tokenizer(prompt.strip(), return_tensors="pt", add_special_tokens=True)
generated = model.generate(
    inputs=batch["input_ids"].to("cuda"),
    generation_config=generation_config,
    streamer=streamer,
)

<s>[INST]
You are youtuber called Fireship you make engaging high-intensity and entertaining coding tutorials and tech news. 
you covers a wide range of topics relevant to programmers, aiming to help them learn and improve their skills quickly.

Given the title of the video : Rust in 100 seconds 
and a small summary : A 100 second video on Rust not a code report
[/INST]

Generate the video : Rust. A fast and memory efficient language known for taking everything that's wrong with low level systems programming languages like C plus plus, c, and assembly, then making it worse by eliminating pointers entirely and providing an unpronounceable name that makes developers angry. It was created by Graydon Hoare inspired by ML functional languages and aimed at building safe reliable software while remaining extremely fast. In fact, its motto is secure concurrency without sacrificing speed, which sounds almost too good to be true. The main problem with unsafe multi threaded programming today is d
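
To generate scripts for several videos, the prompt template can be wrapped in a function. This is a minimal sketch: `build_prompt` and `PROMPT_LINES` are hypothetical names, and the wording is copied unchanged from the prompt cell (including its grammar), on the assumption that the model responds best to the exact text it was prompted with in this notebook.

```python
# Hypothetical helper: rebuilds the stripped prompt used in the cell above.
PROMPT_LINES = [
    "[INST]",
    "You are youtuber called Fireship you make engaging high-intensity "
    "and entertaining coding tutorials and tech news. ",
    "you covers a wide range of topics relevant to programmers, "
    "aiming to help them learn and improve their skills quickly.",
    "",
    "Given the title of the video : {video_title} ",
    "and a small summary : {video_summary}",
    "[/INST]",
    "",
    "Generate the video :",
]


def build_prompt(video_title: str, video_summary: str) -> str:
    """Fill the template with a title and a summary."""
    template = "\n".join(PROMPT_LINES)
    return template.format(video_title=video_title, video_summary=video_summary)


example = build_prompt(
    "Rust in 100 seconds",
    "A 100 second video on Rust not a code report",
)
print(example)
```

The returned string can be passed to the tokenizer in place of `prompt.strip()`.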

In [None]:
# Decode and print the full generated sequence (the prompt is included)
print(tokenizer.decode(generated["sequences"].cpu().tolist()[0]))
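
The decoded sequence starts with the prompt, as the streamed output above shows. A minimal sketch for keeping only the generated script follows. `extract_script` is a hypothetical helper, and it assumes the text after the last prompt line, `Generate the video :`, is the script and that `<s>` and `</s>` are the only special tokens to remove.

```python
def extract_script(decoded: str, marker: str = "Generate the video :") -> str:
    """Return the text after the prompt marker, without <s> and </s> tokens."""
    _, found, tail = decoded.partition(marker)
    if not found:
        # Marker missing: fall back to the whole text
        tail = decoded
    for token in ("<s>", "</s>"):
        tail = tail.replace(token, "")
    return tail.strip()


sample = "<s>[INST]\nprompt text\n[/INST]\n\nGenerate the video : Rust. A fast language.</s>"
print(extract_script(sample))
```

Passing `skip_special_tokens=True` to `tokenizer.decode` is another way to drop the special tokens before splitting.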