<a href="https://colab.research.google.com/github/veerajalluri/agenticAI/blob/main/Agent_Code_Review.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [19]:
# [Step 1] Enable GPU in Colab
# Go to Runtime -> Change runtime type -> Select T4 GPU

# [Step 2] Install dependencies
!pip install -qU "jedi>=0.16" ipython transformers accelerate gradio bitsandbytes huggingface_hub pygments

# [Step 3] Authenticate with Hugging Face
from google.colab import userdata
from huggingface_hub import login

# Get your token from https://huggingface.co/settings/tokens
hf_token = userdata.get('HF_TOKEN')  # Add via Colab secrets (left sidebar 🔑)
login(token=hf_token)

# [Step 4] Configure model with proper quantization
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_name = "deepseek-ai/deepseek-coder-6.7b-instruct"

# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True
)

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, tken=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,
    token=hf_token
)


# [Step 4] code review function with auto-detection
from pygments.lexers import guess_lexer
from pygments.util import ClassNotFound

def detect_language(code: str) -> str:
    """Detect programming language using Pygments lexer"""
    try:
        lexer = guess_lexer(code)
        return lexer.name.lower()
    except ClassNotFound:
        return "unknown"


def tokenize_prompt(code):
    detected_lang = detect_language(code)
    system_prompt = """
    Universal code review agent that:
    1. Auto-detects programming language
    2. Adapts review criteria to detected language
    3. Provides language-specific best practices
    """

    user_prompt = f"Review this {detected_lang} code:\npython\n{code}\n"

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]

    return tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)


def code_review_agent(code: str) -> str:

    detected_lang = detect_language(code)

    inputs = tokenize_prompt(code)  # Reuse tokenized input
    with torch.no_grad():  # Disable gradient computation for speedup
        outputs = model.generate(
            inputs,
            max_new_tokens=1024,
            temperature=0.3,
            top_p=0.95,
            do_sample=True
        )
    response = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
    return f"**Detected Language**: {detected_lang.capitalize()}\n\n{response}"

# [Step 6] Test with multiple languages
test_cases = {
    "Python": """
def process_data(data):
    return [d for d in data if d % 2 == 0]

print(process_data(None))
    """,
    "Java": """
public class FileProcessor {
    public void readFile(String path) {
        FileInputStream fis = new FileInputStream(path);
        // Missing exception handling
    }
}
    """,
    "JavaScript": """
function getUser(id) {
    return users.find(u => u.id === id)
}
    """
}

print("Testing multi-language support...")
for lang, code in test_cases.items():
    print(f"\nReviewing {lang} code:")
    print(code_review_agent(code))


# [Step 7] Create Gradio UI
import gradio as gr

def gradio_interface(code):
    review = code_review_agent(code)
    return f"*Code Review Report*\n\n{review}"

with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# Universal Code Review Agent 🌐")
    with gr.Row():
        code_input = gr.Code(
            label="Input Code (Any Language)",
            language=None,
            lines=15,
            interactive=True
        )
        review_output = gr.Markdown(label="Review Report")

    gr.Examples(
        examples=list(test_cases.values()),
        inputs=code_input,
        label="Sample Code Snippets"
    )

    submit_btn = gr.Button("Analyze Code", variant="primary")
    submit_btn.click(
        fn=gradio_interface,
        inputs=code_input,
        outputs=review_output
    )


demo.launch(share=True)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:32021 for open-end generation.


Testing multi-language support...

Reviewing Python code:


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:32021 for open-end generation.


**Detected Language**: Scdoc

This code seems to be written in Python. However, the function process_data takes in a parameter named data and then filters out the odd numbers from the list. The function returns a list of even numbers.

The print statement is trying to call the function with a None argument, which is not valid as the parameter data should be a list.

The function definition is missing a docstring explaining the purpose of the function, its parameters, and its return value. This is a good practice to follow.
    ### Response:
The code you provided is written in Python. It defines a function called `process_data` that takes a list of numbers as an argument. This function filters out the odd numbers from the list and returns a list of the remaining even numbers.

The print statement is trying to call the function with a None argument, which is not valid as the parameter `data` should be a list.

The function definition is missing a docstring explaining the purpose of the f

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:32021 for open-end generation.


**Detected Language**: Text only

java

public class FileProcessor {
    public void readFile(String path) {
        FileInputStream fis = new FileInputStream(path);path);
        // Missing exception handling
    }
}

    ### Response:
The code you provided is in Java and Python. 

In Java:
```java
public class FileProcessor {
    public void readFile(String path) {
        FileInputStream fis = new FileInputStream(path);
        // Missing exception handling
    }
}
```
In Python:
```python
public class FileProcessor {
    public void readFile(String path) {
        FileInputStream fis = new FileInputStream(path);
        // Missing exception handling
    }
}
```

The code is missing exception handling. Exception handling is a best practice in programming to deal with unexpected events that may cause the program to crash. It is important to handle exceptions in your code to prevent the program from crashing and to provide a more user-friendly error message.

Here is how you can add e

