# Using a pre-trained model from Huggingface to summarize code

In this workbook, we demonstrate how a pre-trained model can be used to summarize source code. We use the `transformers` library from Huggingface to load the CodeT5 checkpoint `Salesforce/codet5-base-multi-sum` and generate a one-sentence description of a function.

In [1]:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

## Using models for code summarization

The model used here is CodeT5, in a variant fine-tuned for summarizing source code.

The checkpoint is available here: https://huggingface.co/Salesforce/codet5-base-multi-sum

In [2]:
from transformers import RobertaTokenizer, T5ForConditionalGeneration


tokenizer = RobertaTokenizer.from_pretrained('Salesforce/codet5-base-multi-sum')
model = T5ForConditionalGeneration.from_pretrained('Salesforce/codet5-base-multi-sum')

text = """def svg_to_image(string, size=None):
    if isinstance(string, unicode):
        string = string.encode('utf-8')
        renderer = QtSvg.QSvgRenderer(QtCore.QByteArray(string))
    if not renderer.isValid():
        raise ValueError('Invalid SVG data.')
    if size is None:
        size = renderer.defaultSize()
        image = QtGui.QImage(size, QtGui.QImage.Format_ARGB32)
        painter = QtGui.QPainter(image)
        renderer.render(painter)
    return image"""

input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=20)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

# this prints a one-sentence summary of the function (the output of this run is shown below)

Converts a SVG string to a QImage.


## Where to go from here

Once this model works, you can use it to summarize code in your own projects. You can also try other summarization models from Huggingface in the same way.

1. Create a script that sends one line at a time to the model and generates a comment for it.
    a. Take a look at the `max_length` parameter of the `generate` function. It sets the maximum length of the generated output, so you can raise or lower it to get longer or shorter comments.
2. Create a script that sends one function at a time and generates a description of it.
    a. Please remember that the model has a limited input window, so long functions may not be summarized correctly. You can try to split such a function into smaller parts and summarize them separately.
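
As a starting point for the second exercise, here is a minimal sketch, not a finished tool. It splits a Python file into functions with the standard `ast` module and passes each one to a summarizer. The helper names (`iter_functions`, `make_codet5_summarizer`, `summarize_functions`) are hypothetical, and the 512-token input limit is an assumption that you should check against the model you use. Code that does not parse as Python 3 will raise a `SyntaxError` in `ast.parse`.

```python
import ast
from typing import Callable, Iterator, List, Tuple


def iter_functions(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, source_text) for every function defined in `source`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_source_segment returns the text of the node (Python 3.8+)
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                yield node.name, segment


def make_codet5_summarizer(
    max_input_tokens: int = 512,  # assumed input window, adjust for your model
    max_length: int = 20,         # maximum length of the generated summary
) -> Callable[[str], str]:
    """Load the model once and return a function mapping code to a summary."""
    # Imported here so the helpers above work without transformers installed.
    from transformers import RobertaTokenizer, T5ForConditionalGeneration

    name = "Salesforce/codet5-base-multi-sum"
    tokenizer = RobertaTokenizer.from_pretrained(name)
    model = T5ForConditionalGeneration.from_pretrained(name)

    def summarize(code: str) -> str:
        # truncation=True cuts overly long inputs down to max_input_tokens
        input_ids = tokenizer(
            code,
            return_tensors="pt",
            truncation=True,
            max_length=max_input_tokens,
        ).input_ids
        generated_ids = model.generate(input_ids, max_length=max_length)
        return tokenizer.decode(generated_ids[0], skip_special_tokens=True)

    return summarize


def summarize_functions(
    source: str, summarize: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Return (function_name, summary) pairs for all functions in `source`."""
    return [(name, summarize(code)) for name, code in iter_functions(source)]
```

To try it, read a file into a string and call `summarize_functions(source, make_codet5_summarizer())`. Passing the summarizer as an argument makes it easy to swap in another model, or a stub for testing, without changing the splitting logic. Truncation only keeps the beginning of a long function, so splitting it into smaller parts, as suggested above, is still worth trying.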