# System Preparation

- Python 3.10
- PyTorch 2.5.0
- Jupyter
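
To confirm that an environment matches the versions listed above, one option is to query the installed package metadata. This is a sketch rather than part of the original notebook; `installed_version` is a hypothetical helper name.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter version and the packages this notebook relies on.
print("python:", ".".join(map(str, sys.version_info[:3])))
for name in ("torch", "jupyter"):
    print(f"{name}:", installed_version(name) or "not installed")
```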

## Additional

### Libraries

Install Hugging Face Transformers following https://huggingface.co/docs/transformers/installation

In [1]:
!pip install flax transformers accelerate gdown

Collecting flax
  Using cached flax-0.10.0-py3-none-any.whl.metadata (11 kB)
Collecting transformers
  Downloading transformers-4.46.0-py3-none-any.whl.metadata (44 kB)
Collecting accelerate
  Using cached accelerate-1.0.1-py3-none-any.whl.metadata (19 kB)
Collecting gdown
  Using cached gdown-5.2.0-py3-none-any.whl.metadata (5.8 kB)
Collecting jax>=0.4.27 (from flax)
  Downloading jax-0.4.35-py3-none-any.whl.metadata (22 kB)
Collecting msgpack (from flax)
  Using cached msgpack-1.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (8.4 kB)
Collecting optax (from flax)
  Using cached optax-0.2.3-py3-none-any.whl.metadata (8.3 kB)
Collecting orbax-checkpoint (from flax)
  Using cached orbax_checkpoint-0.7.0-py3-none-any.whl.metadata (1.8 kB)
Collecting tensorstore (from flax)
  Downloading tensorstore-0.1.67-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting rich>=11.1 (from flax)
  Downloading rich-13.9.3-py3-none-any.whl.metadat

# Load and check LLM

In [40]:
import os

hf_token = os.environ.get('HF_TOKEN')

messages = [
    {"role": "system", "content": "You are a code error generator. You will be provided with a number as input. Generate that many unique JavaScript code snippets, each containing at least one syntax error.  Output only the code snippets."},
    {"role": "user", "content": "5"},
]

## Llama 3.1 8B Instruct

Details: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

In [41]:
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"

pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

In [42]:
outputs = pipe(
    messages,
    max_new_tokens=256,
    pad_token_id=pipe.tokenizer.eos_token_id,  # this model's own EOS id; 50256 is GPT-2's
)
print(outputs[0]["generated_text"][-1])

{'role': 'assistant', 'content': "```javascript\n// Code Snippet 1\nlet x = 5;\ny = x + 10; // Missing 'let' keyword\nconsole.log(y);\n```\n\n```javascript\n// Code Snippet 2\nfunction add(a, b) {\n  return a + b;\n}\nconsole.log(add(5, 10); // Missing closing parenthesis\n```\n\n```javascript\n// Code Snippet 3\nlet arr = [1, 2, 3];\nconsole.log(arr[5]); // Array index out of bounds\n```\n\n```javascript\n// Code Snippet 4\nclass Person {\n  constructor(name) {\n    this.name = name;\n  }\n}\nlet person = new Person('John'); // Missing semicolon after class declaration\nconsole.log(person.name);\n```\n\n```javascript\n// Code Snippet 5\nlet obj = { name: 'John' };\nconsole.log(obj.name + 10); // Attempting to add a string and a number\n```"}
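
The reply arrives as a single string with each snippet wrapped in a Markdown code fence, so inspecting the snippets one by one means splitting that string first. A minimal sketch, assuming the model keeps using `javascript`-tagged fences as in the output above; `extract_snippets` is a hypothetical helper.

```python
import re

FENCE = "`" * 3  # Markdown code fence, built here to avoid a literal fence in this example

# Match a javascript-tagged fenced block and capture its body.
SNIPPET_RE = re.compile(FENCE + r"javascript\n(.*?)" + FENCE, re.DOTALL)


def extract_snippets(content: str) -> list[str]:
    """Return the body of every javascript-tagged fenced block in `content`."""
    return [body.strip() for body in SNIPPET_RE.findall(content)]


# Shortened stand-in for outputs[0]["generated_text"][-1]["content"].
sample = (
    FENCE + "javascript\nlet x = 5;\n" + FENCE + "\n\n"
    + FENCE + "javascript\nconsole.log(add(5, 10);\n" + FENCE
)
snippets = extract_snippets(sample)
print(len(snippets), "snippets found")
```

In the notebook, the argument would be `outputs[0]["generated_text"][-1]["content"]`.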


## Llama 3.2 3B Instruct

Details: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct

In [43]:
import torch
import transformers

model_id = "meta-llama/Llama-3.2-3B-Instruct"
pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Some parameters are on the meta device because they were offloaded to the cpu.


In [44]:
outputs = pipe(
    messages,
    max_new_tokens=256,
    pad_token_id=pipe.tokenizer.eos_token_id,  # this model's own EOS id; 50256 is GPT-2's
)
print(outputs[0]["generated_text"][-1])

{'role': 'assistant', 'content': "```javascript\nfunction add(a, b) {\n    return a + b;\n}\n\nconsole.log(add(5, 10)); // SyntaxError: Unexpected token 'console.log'\nconsole.log(add(5, 10)); // SyntaxError: Unexpected token 'console.log'\nconsole.log(add(5, 10)); // SyntaxError: Unexpected token 'console.log'\nconsole.log(add(5, 10)); // SyntaxError: Unexpected token 'console.log'\nconsole.log(add(5, 10)); // SyntaxError: Unexpected token 'console.log'\n```"}
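
This reply repeats one `console.log` line five times instead of producing five distinct snippets. A rough check for that kind of degenerate output is to count repeated non-blank lines. The sketch below is illustrative only, and `repeated_lines` is a hypothetical helper.

```python
from collections import Counter


def repeated_lines(content: str) -> dict[str, int]:
    """Map each non-blank line that occurs more than once to its count."""
    counts = Counter(line.strip() for line in content.splitlines() if line.strip())
    return {line: n for line, n in counts.items() if n > 1}


# Shortened stand-in for the reply above.
sample = "function add(a, b) {\n    return a + b;\n}\n" + "console.log(add(5, 10));\n" * 5
print(repeated_lines(sample))
```

Short lines such as a lone `}` repeat legitimately in real code, so a check like this is a hint that needs a threshold, not a verdict.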


## CodeLlama 7b Instruct

Details: https://huggingface.co/meta-llama/CodeLlama-7b-Instruct-hf

In [45]:
import transformers
import torch

model_id = "meta-llama/CodeLlama-7b-Instruct-hf"

pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Some parameters are on the meta device because they were offloaded to the cpu.


In [None]:
outputs = pipe(
    messages,
    max_new_tokens=256,
    pad_token_id=pipe.tokenizer.eos_token_id,  # this model's own EOS id; 50256 is GPT-2's
)
print(outputs[0]["generated_text"][-1])

# Misc

## TFix dataset

https://drive.google.com/file/d/1CtfnYaVf-q6FZP5CUM4Wh7ofpp8b9ajW/view
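
Since `gdown` is installed in the first cell, the file could be fetched from within the notebook. This is a sketch under assumptions: the output name `tfix_dataset` is a placeholder (the source does not name the file), and `download_tfix` is defined here but not called.

```python
import re


def drive_file_id(url: str) -> str:
    """Extract the file ID from a Google Drive /file/d/<id>/ share link."""
    match = re.search(r"/file/d/([\w-]+)", url)
    if match is None:
        raise ValueError(f"not a Google Drive file link: {url}")
    return match.group(1)


def download_tfix(url: str, output: str = "tfix_dataset") -> str:
    """Download the shared file with gdown; `output` is a placeholder name."""
    import gdown  # imported lazily so drive_file_id works without gdown installed

    return gdown.download(id=drive_file_id(url), output=output, quiet=False)


TFIX_URL = "https://drive.google.com/file/d/1CtfnYaVf-q6FZP5CUM4Wh7ofpp8b9ajW/view"
print(drive_file_id(TFIX_URL))
```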

# WIP

In [None]:
import os

from transformers import AutoTokenizer, AutoModelForCausalLM

# Read the access token from the environment instead of hard-coding it in the notebook.
hf_token = os.environ.get('HF_TOKEN')

model_name = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(model_name, token=hf_token)

Downloading shards:  25%|█████████████████████████▊                                                                             | 1/4 [07:46<23:20, 466.84s/it]

In [35]:
!pip install "huggingface_hub[cli]"

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Collecting InquirerPy==0.3.4 (from huggingface_hub[cli])
  Downloading InquirerPy-0.3.4-py3-none-any.whl.metadata (8.1 kB)
Collecting pfzy<0.4.0,>=0.3.1 (from InquirerPy==0.3.4->huggingface_hub[cli])
  Downloading pfzy-0.3.4-py3-none-any.whl.metadata (4.9 kB)
Downloading InquirerPy-0.3.4-py3-none-any.whl (67 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.7/67.7 kB 4.0 MB/s eta 0:00:00
ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device: '/tmp/tmp11krsk0q'

In [36]:
from transformers import file_utils
print(file_utils.default_cache_path)

/root/.cache/huggingface/hub
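
The `No space left on device` error above suggests checking how much room is left on the volume that holds the cache before downloading multi-gigabyte checkpoints. A minimal sketch using only the standard library; `free_gib` is a hypothetical helper, and the cache location is assumed to be the default one printed above.

```python
import os
import shutil


def free_gib(path: str) -> float:
    """Return free space in GiB on the filesystem that holds `path`."""
    # Walk up to the nearest existing directory so a not-yet-created cache works too.
    probe = os.path.abspath(path)
    while not os.path.exists(probe):
        probe = os.path.dirname(probe)
    return shutil.disk_usage(probe).free / 2**30


cache_dir = os.path.expanduser("~/.cache/huggingface/hub")
print(f"{free_gib(cache_dir):.1f} GiB free for {cache_dir}")
```

If that volume is too small, the Hugging Face libraries read the `HF_HOME` environment variable to locate the cache, so pointing it at a larger disk before importing `transformers` is one option.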
