# Basic Implementation of Llama 3.2 for GitHub Pull Request Comments

This notebook outlines the process of implementing a basic prompt and response system using the Llama 3.2 model from Hugging Face. The primary goal is to generate a GitHub pull request comment that serves as documentation for a provided code snippet.

### 1. Setting Up the Environment

Before running `base_pretrained_model.ipynb`, ensure you have the necessary libraries installed. The key libraries used in this implementation are:
* `torch`
* `transformers`
* `huggingface_hub`

To get started, follow the instructions in the [Hugging Face quickstart guide](https://huggingface.co/docs/datasets/en/quickstart) and the [PyTorch installation guide](https://pytorch.org/get-started/locally/).

Additionally, you will need a Hugging Face access token, which you can create under your account settings in the Access Tokens tab. Create a token (note that some LLMs may require you to specify a repository during token creation) and store it securely, outside version control. For this implementation, I store mine in a separate `hf_token` module, which is why `llama_3_2_token` is imported from `hf_token` alongside the other libraries.
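
One way to keep the token out of the notebook is to have the `hf_token` module read it from an environment variable. The snippet below is a minimal sketch of that idea, not the author's actual file: the helper name `load_token` and the choice of the `HF_TOKEN` variable are assumptions made for illustration.

```python
import os


def load_token(env_var: str = "HF_TOKEN") -> str:
    """Read the access token from an environment variable (name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of attempting an anonymous login
        raise RuntimeError(f"Set the {env_var} environment variable to your access token.")
    return token


# In a hypothetical hf_token.py, the name the notebook imports could be defined as:
# llama_3_2_token = load_token()
```

With this layout the token never appears in the notebook or in the repository, and the import cell stays unchanged.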

### 2. Import Libraries and Clone LLM Repository

The code below imports the required libraries and logs in to Hugging Face:

In [1]:
import torch
from transformers import pipeline
from huggingface_hub import login
from hf_token import llama_3_2_token

login(token=llama_3_2_token)

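**Cloning the Repository**

Make sure you have access to the repository for the desired LLM, which in this case is `meta-llama/Llama-3.2-1B`. The model is gated, so request access on its Hugging Face model page before running the notebook. If you clone the repository, clone it into your working directory; you will need your username and access token during this process. Cloning is optional, though: when given a model ID, `pipeline` downloads the weights from the Hub on first use.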

### 3. Creating the Test Prompt

For this basic implementation, I will create a prompt that instructs the model to write a GitHub pull request comment for a provided code snippet. The code being tested is a safe division function.

For now, I will store the prompt in a variable called `prompt`. In the future, it may be beneficial to create a function to generate prompts dynamically for enhanced usability.


**Safe Division Function** 

Here’s the code for the safe division function:

```python
from typing import Union

def safe_divide(a: Union[int, float], b: Union[int, float]) -> Union[float, None]:
    """Safely divides two numbers, returning None if division by zero occurs."""
    try:
        return a / b
    except ZeroDivisionError:
        print("Error: Division by zero is not allowed.")
        return None
```

**Prompt Creation** 

I will use markdown formatting and newline escape characters (`\n`) to create a structured prompt:

In [2]:
prompt = (
    "Generate a GitHub Pull Request comment describing the code below. "
    "This should be written in a clear, professional, technical manner so that it can serve as a form of documentation.\n\n"
    
    "### Code:\n"
    "```python\n"
    "from typing import Union\n\n"
    "def safe_divide(a: Union[int, float], b: Union[int, float]) -> Union[float, None]:\n"
    "    \"\"\"Safely divides two numbers, returning None if division by zero occurs.\"\"\"\n"
    "    try:\n"
    "        return a / b\n"
    "    except ZeroDivisionError:\n"
    "        print(\"Error: Division by zero is not allowed.\")\n"
    "        return None\n"
    "```\n\n"
)
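
The dynamic prompt function mentioned earlier could look something like the sketch below. It is one possible shape, not part of the original notebook: the name `build_prompt` and its `language` parameter are hypothetical, and the instruction text is copied from the `prompt` variable above.

```python
def build_prompt(code: str, language: str = "python") -> str:
    """Wrap a code snippet in the same instruction and markdown layout as `prompt`."""
    fence = "`" * 3  # built at runtime to avoid a literal code fence in the source
    instruction = (
        "Generate a GitHub Pull Request comment describing the code below. "
        "This should be written in a clear, professional, technical manner "
        "so that it can serve as a form of documentation."
    )
    # Instruction, blank line, "### Code:" heading, then the fenced snippet
    return f"{instruction}\n\n### Code:\n{fence}{language}\n{code.strip()}\n{fence}\n\n"


# Example: build a prompt for a one-line snippet
example_prompt = build_prompt("x = 1")
```

Passing the source of `safe_divide` as `code` should reproduce the hand-written prompt, while other snippets can reuse the same template.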

### 4. Creating the Pipeline

To use the Llama model from Hugging Face, we specify the model ID and create a `pipeline` object that processes the prompt and generates a response.

In [3]:
# Specify the ID of the LLM model
model_id = "meta-llama/Llama-3.2-1B"

# Create the pipe
pipe = pipeline(
    "text-generation",              # Specify the task type as text generation
    model=model_id,                 # Specify the LLM model to be used for generation
    torch_dtype=torch.bfloat16,     # Specify the data type for the model weights (bfloat16 for better performance on certain hardware)
    device_map="auto"               # Automatically assign model layers to available devices (GPU/CPU) for optimal performance
)

Device set to use cpu
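
The `Device set to use cpu` message suggests that no GPU was detected, so the pipeline runs on the CPU. As a hedged sketch that is not part of the original notebook, the helper below shows one common heuristic for choosing a `torch_dtype` to match the hardware. The function name `choose_dtype_name` and the fallback order are assumptions for illustration; bfloat16 also runs on CPU, as the cell above shows, but it may be slower on processors without native support.

```python
def choose_dtype_name(cuda_available: bool, bf16_supported: bool) -> str:
    """Return the name of a torch dtype suited to the detected hardware."""
    if cuda_available and bf16_supported:
        return "bfloat16"  # half-size weights on GPUs with bfloat16 support
    if cuda_available:
        return "float16"   # fallback for GPUs without bfloat16 support
    return "float32"       # conservative default on CPU


# Usage with torch (left as comments so the sketch runs without a torch install):
# import torch
# has_cuda = torch.cuda.is_available()
# name = choose_dtype_name(has_cuda, has_cuda and torch.cuda.is_bf16_supported())
# dtype = getattr(torch, name)  # pass as torch_dtype=dtype to pipeline(...)
```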


### 5. Prompting the Model

With everything set up, we can now prompt the model to generate a response.

We will use the pipeline to generate the output, capping the response length with `max_new_tokens` and disabling sampling with `do_sample=False` so that the output is deterministic:

In [5]:
# Generate the response using the pipeline
output = pipe(
    prompt,                        # The input prompt for the model to generate text based on
    max_new_tokens=2048,           # Specify the maximum number of new tokens to generate (up to 2048 in this case)
    do_sample=False,               # Set to False for deterministic output (the same input will always produce the same output)
)

# Print the generated text; by default it contains the prompt followed by the model's continuation
print(output[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Generate a GitHub Pull Request comment describing the code below. This should be written in a clear, professional, technical manner so that it can serve as a form of documentation.

### Code:
```python
from typing import Union

def safe_divide(a: Union[int, float], b: Union[int, float]) -> Union[float, None]:
    """Safely divides two numbers, returning None if division by zero occurs."""
    try:
        return a / b
    except ZeroDivisionError:
        print("Error: Division by zero is not allowed.")
        return None
```

### Output:
```markdown
# Safe Division

This function safely divides two numbers, returning None if division by zero occurs.

## Usage

```python
safe_divide(10, 2)  # Returns 5.0
safe_divide(10, 0)  # Returns None
```
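
As the output shows, `generated_text` repeats the prompt before the model's continuation. To post only the generated comment, the prompt can be removed afterwards. The helper below is a small sketch of that step; the name `strip_prompt` is hypothetical. Passing `return_full_text=False` to the pipeline call should, to my knowledge, have the same effect without post-processing.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the model's continuation when the text starts with the prompt."""
    if generated_text.startswith(prompt):
        # Drop the echoed prompt and any whitespace left before the continuation
        return generated_text[len(prompt):].lstrip()
    # The prompt was not echoed; just tidy the surrounding whitespace
    return generated_text.strip()


# Intended usage with the notebook's variables:
# comment = strip_prompt(output[0]["generated_text"], prompt)
```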

