Install necessary libraries such as transformers, datasets, and torch

In [7]:
!pip install transformers datasets torch
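Before reinstalling, it can help to check which packages are already importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the original notebook.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of names that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]

required = ["transformers", "datasets", "torch"]
missing = missing_packages(required)
if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

This only checks top-level module names, so it says nothing about whether the installed versions are compatible with each other.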



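Select a causal language model for code generation. GPT-2 is a small general-purpose model and is used here for convenience; a code-trained checkpoint such as CodeGen is likely to give better results. Codex is not distributed as a downloadable checkpoint, so it cannot be loaded with from_pretrained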

In [8]:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = 'gpt2'  # or another model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

Choose a dataset for fine-tuning, such as CodeSearchNet, or create your own, and load it with the datasets library

In [9]:
from datasets import load_dataset
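# This loader runs custom code from the dataset repository and asks for confirmation;
# passing trust_remote_code=True skips that prompt
dataset = load_dataset('code_search_net', 'python')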

The repository for code_search_net contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/code_search_net.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y


Downloaded python.zip (941M) and generated the splits: train 412178 examples, test 22176 examples, validation 23107 examples.
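To use these records for fine-tuning, each docstring and function body has to be joined into a single training string. The sketch below shows one possible format. It assumes the records expose `func_documentation_string` and `func_code_string` fields; check `dataset['train'].column_names` before relying on those names. The helper `format_example` and the sample record are hypothetical.

```python
def format_example(record, doc_key="func_documentation_string", code_key="func_code_string"):
    """Join a docstring and its function source into one training string."""
    doc = record[doc_key].strip()
    code = record[code_key].rstrip()
    # Put the description in comments so the model sees "description -> code"
    header = "\n".join("# " + line for line in doc.splitlines())
    return header + "\n" + code + "\n"

# Stand-in record for illustration; a real one would come from dataset['train'][0]
sample = {
    "func_documentation_string": "Return the sum of two numbers.",
    "func_code_string": "def add(a, b):\n    return a + b",
}
print(format_example(sample))
```

Putting the description in comment lines is only one choice of format; whatever is used for training should also be used when prompting the model.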

Define a function that takes a natural-language description and returns the text generated by the model

In [11]:
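def generate_code(description):
    # Tokenize the prompt; calling the tokenizer also returns the attention mask
    inputs = tokenizer(description, return_tensors='pt')
    # GPT-2 has no pad token, so reuse EOS to avoid the warning from generate()
    outputs = model.generate(**inputs, max_length=500, pad_token_id=tokenizer.eos_token_id)
    # The decoded text contains the prompt followed by the model's continuation
    return tokenizer.decode(outputs[0], skip_special_tokens=True)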
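Because a causal model continues its input, the decoded text starts with the description itself. The sketch below shows one way to frame the description as a comment and then remove the echoed prompt from the result. The helpers `build_prompt` and `extract_completion` are hypothetical, and the model output is replaced by a stand-in string so the example runs without loading a model.

```python
def build_prompt(description):
    """Frame the description as a comment so the model is nudged to continue with code."""
    return "# " + description.strip() + "\n"

def extract_completion(generated_text, prompt):
    """Drop the echoed prompt from the decoded output, if it is present."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text

prompt = build_prompt("Create a function to sort a list")
# Stand-in for generate_code(prompt); the real output depends on the model
fake_output = prompt + "def sort_list(items):\n    return sorted(items)"
print(extract_completion(fake_output, prompt))
```

Decoding can alter whitespace in some tokenizers, which is why `extract_completion` falls back to returning the full text when the prefix does not match exactly.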

Create an interactive input box and an output area with ipywidgets (supported in Google Colab) so users can enter their prompts

In [12]:
from IPython.display import display
import ipywidgets as widgets

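input_box = widgets.Text(
    description='Enter your prompt here:',
    placeholder='e.g., Create a function to sort a list',
    continuous_update=False,  # update value on Enter or focus loss, not on every keystroke
    style={'description_width': 'initial'},  # keep the long label from being cut off
    layout=widgets.Layout(width='500px')
)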

output_box = widgets.Output()
display(input_box, output_box)

Text(value='', description='Enter your prompt here:', layout=Layout(width='500px'), placeholder='e.g., Create …

Output()

Register a handler that generates code whenever the value of the text box changes and prints the result in the output area

In [13]:
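def on_submit(change):
    # The observer passes a change dict; change['new'] holds the updated text
    with output_box:
        output_box.clear_output()
        prompt = change['new']
        code = generate_code(prompt)
        print("Generated Code:")
        print(code)

# Call on_submit each time the 'value' trait of the text box changes
input_box.observe(on_submit, names='value')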
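Since generation is slow, it may be worth skipping empty or repeated prompts before calling the model. The following is a minimal sketch of such a guard; `make_handler`, `generate_fn`, and `show_fn` are hypothetical names, and lambdas stand in for the model and the output widget so the example runs on its own.

```python
def make_handler(generate_fn, show_fn):
    """Build an observer callback that skips empty and unchanged prompts."""
    last_prompt = None

    def handler(change):
        nonlocal last_prompt
        prompt = change["new"].strip()
        if not prompt or prompt == last_prompt:
            return  # nothing new to generate
        last_prompt = prompt
        show_fn(generate_fn(prompt))

    return handler

# Stand-ins so the sketch runs without a model or a widget
results = []
handler = make_handler(lambda p: "code for: " + p, results.append)
handler({"new": "sort a list"})
handler({"new": "sort a list "})  # same prompt after stripping, skipped
handler({"new": ""})              # empty, skipped
print(results)
```

In the notebook, a handler built this way could be passed to `input_box.observe(..., names='value')`, with `generate_code` as `generate_fn` and a `show_fn` that prints inside `output_box`.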

In [6]:
# Install Required Libraries
!pip install transformers datasets torch ipywidgets

# Import Libraries
from transformers import AutoModelForCausalLM, AutoTokenizer
from IPython.display import display
import ipywidgets as widgets

# Load Pre-trained Model and Tokenizer
model_name = "gpt2"  # You can change this to a different model if needed
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Set pad_token to eos_token if it's not already set (GPT-2 ships without one)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Define the Code Generation Function with pad_token_id and attention_mask
def generate_code(description):
    # Encode the user input with attention mask and padding
    inputs = tokenizer(description, return_tensors='pt', padding=True, truncation=True)

    # Generate code from the input with attention mask and pad_token_id
    # (early_stopping is omitted: it only applies to beam search, which is not used here)
    outputs = model.generate(
        inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=150,  # Total length in tokens, prompt included
        pad_token_id=tokenizer.eos_token_id,  # Set pad_token_id to eos_token_id
        num_return_sequences=1,
        no_repeat_ngram_size=2  # Prevent repetitive sequences
    )

    # Decode the generated output to text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Create an Input Interface
input_box = widgets.Text(
    description='Enter your prompt here:',
    placeholder='e.g., Create a function to sort a list',
    layout=widgets.Layout(width='500px')
)

output_box = widgets.Output()
display(input_box, output_box)

# Define a Function to Handle Input and Generate Code
def on_submit(change):
    with output_box:
        output_box.clear_output()  # Clear previous output
        prompt = input_box.value  # Get the user input
        code = generate_code(prompt)  # Generate code from the prompt
        print("Generated Code:")  # Display header
        print(code)  # Print the generated code

# Observe the input box for changes and call on_submit function
input_box.observe(on_submit, names='value')


