# Analyzing Python Programs That Use PyTamaro: An AST Approach


## Abstract

This project explores the use of Abstract Syntax Trees (ASTs) to analyze Python programs that utilize the PyTamaro library. PyTamaro is an educational library that provides a simple interface for creating and composing graphics.

From the ASTs of Python programs that use PyTamaro, we can extract information about the usage of the library. In particular, we are going to extract all the functions, primitives, constants and operations that are used in the programs. Then, we are going to map each of these elements to the corresponding TamaroCards.

TamaroCards are a set of cards that represent all the elements of the PyTamaro library. They are used in educational settings to teach programming concepts and the PyTamaro library in an unplugged way.

The goal of this project is to provide a tool that can be used by teachers to analyze any given Python program that uses PyTamaro and automatically generate printable sets of the necessary TamaroCards.

The PyTamaro library is available in the following languages:

- English
- Italian
- German
- French

This creates the first challenge we need to address...

## How can we analyse Python programs that use PyTamaro in different languages?

If a program calls the function `kreis_sektor` from the `pytamaro.de` package, how can we know that it is actually the same function as `settore_circolare` in the `pytamaro.it` package?

The solution is to create a mapping between the names present in the PyTamaro library from the different languages to a common one.

This simplifies the analysis of the various user programs later on, as we will be able to get programs written in any of the PyTamaro supported languages and map them to the common language.

The common language we are going to use is English, which is the base language of the PyTamaro library.
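
As a minimal sketch of the idea, the mapping can be pictured as a plain dictionary from localized names to English names. The entries below are written by hand purely for illustration, and `to_english` is a hypothetical helper that is not part of PyTamaro.

```python
# Hand-written illustration of a name mapping (not extracted from PyTamaro).
name_to_english: dict[str, str] = {
    "kreis_sektor": "circular_sector",       # German
    "settore_circolare": "circular_sector",  # Italian
    "circular_sector": "circular_sector",    # English maps to itself
}


def to_english(name: str) -> str:
    """Return the English name, or the name unchanged if it is unknown."""
    return name_to_english.get(name, name)


# Both localized names resolve to the same English identifier.
print(to_english("kreis_sektor") == to_english("settore_circolare"))  # True
```

Once every name is normalized like this, the rest of the analysis only has to deal with one vocabulary.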

We could create this mapping manually, but that would be tedious and error-prone. Instead, we are going to use the set of "wrapper" files that the PyTamaro library uses to expose its functionality in the different languages.

These files are:

- color_names
- color
- graphic
- io
- operations
- point_names
- point
- primitives

They are present in every language version:

- English
- Italian
- German
- French

These files already contain a mapping between the original English version and the target language: each one wraps the English functions and re-defines the English constants under localized names.
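
To make the two shapes concrete, here is a small sketch that parses a made-up wrapper file and reads the localized/English pairs directly from the AST. The wrapper source, the `_pytamaro` alias and the name `rettangolo` are assumptions for illustration only; the real wrapper files may be organized differently.

```python
import ast

# Hypothetical wrapper source, written only to illustrate the two shapes.
wrapper_source = '''
import pytamaro as _pytamaro

rosso = _pytamaro.red


def rettangolo(larghezza, altezza, colore):
    return _pytamaro.rectangle(larghezza, altezza, colore)
'''

tree = ast.parse(wrapper_source)  # parsing does not import pytamaro

pairs: list[tuple[str, str]] = []
for node in tree.body:
    if isinstance(node, ast.Assign) and isinstance(node.value, ast.Attribute):
        # Constant re-definition: localized_name = module.english_name
        pairs.append((node.targets[0].id, node.value.attr))
    elif isinstance(node, ast.FunctionDef):
        last = node.body[-1]
        if isinstance(last, ast.Return) and isinstance(last.value, ast.Call):
            # Wrapper function: the returned call names the English function
            pairs.append((node.name, last.value.func.attr))

print(pairs)  # [('rosso', 'red'), ('rettangolo', 'rectangle')]
```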

Now we need a way to extract this information from the PyTamaro library and use it for our analysis.

### Extracting the mapping from the PyTamaro library

To extract the mapping from the PyTamaro library, we build an AST of all the wrapper files. We then extract the functions, constants and operations that appear as nodes in the AST and save them in a dictionary that can be used later on.

We can organize the various steps that are required to create the translation dictionary in the following way:

1. Load the various wrapper files from the PyTamaro library
2. Create an AST version of the combined wrapper files
3. Extract all the functions, constants and operations from the AST
4. Save the extracted information in a dictionary

#### Loading the wrapper files and creating the AST

We are going to use the `ast` module to create the AST of the wrapper files. The `ast` module provides a simple way to create and manipulate the AST of Python programs.

In [1]:
import ast
import inspect
import importlib

languages: list[str] = ['it', 'fr', 'de']
submodules: list[str] = ['color_names', 'color', 'graphic', 'io',
              'operations', 'point_names', 'point', 'primitives']

source_code: str = ''

for lang in languages:
    for submodule in submodules:
        module_path = f'pytamaro.{lang}.{submodule}'
        try:
            # Dynamically get the module
            module = importlib.import_module(module_path)
            # Get the source code and append it to the source_code string
            source_code += inspect.getsource(module)
        except Exception as e:
            print(f'Error with {module_path}: {e}')

# Get the ast from the source code
pytamaro_ast = ast.parse(source_code)

#### NodeVisitor class for extracting the functions, constants and operations

Now that we have the AST of the wrapper files, we can create a custom `NodeVisitor` class that extracts all the functions, constants and operations from it. The class traverses the AST and, at each node we are interested in, adds the corresponding PyTamaro element to the `translations` dictionary.

Each entry maps a localized name to its English name: `translations["localized_name"] = "english_name"`. English names are added as well, mapped to themselves.
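
For readers new to `ast.NodeVisitor`, the sketch below shows the dispatch mechanism on its own: `visit` calls the method named `visit_<NodeClass>` when one is defined, and `generic_visit` continues into the children. `NameCollector` is a hypothetical class used only for this illustration.

```python
import ast


class NameCollector(ast.NodeVisitor):
    """Hypothetical visitor that records function names and assigned names."""

    def __init__(self) -> None:
        self.functions: list[str] = []
        self.assigned: list[str] = []

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Called automatically for every FunctionDef node that is reached.
        self.functions.append(node.name)
        self.generic_visit(node)  # keep descending into the function body

    def visit_Assign(self, node: ast.Assign) -> None:
        for target in node.targets:
            if isinstance(target, ast.Name):
                self.assigned.append(target.id)
        self.generic_visit(node)


collector = NameCollector()
collector.visit(ast.parse("x = 1\ndef f():\n    y = 2\n"))
print(collector.functions)  # ['f']
print(collector.assigned)   # ['x', 'y']
```

Note that a `visit_*` method that does not call `generic_visit` stops the traversal at that node, so nested nodes are not reached.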

In [2]:
class PyTamaroTranslatorVisitor(ast.NodeVisitor):
    def __init__(self) -> None:
        self.translations: dict[str, str] = {}

    def is_pytamaro_word(self, word: str) -> bool:
        return word in self.translations

    def translate(self, name: str) -> str:
        # Unknown names are returned unchanged
        return self.translations.get(name, name)

    def visit_AnnAssign(self, node: ast.AnnAssign) -> None:
        # Constant re-definition with a type annotation
        self.translations[node.target.id] = node.value.attr
        self.translations[node.value.attr] = node.value.attr

    def visit_Assign(self, node: ast.Assign) -> None:
        # Constant re-definition without a type annotation
        self.translations[node.targets[0].id] = node.value.attr
        self.translations[node.value.attr] = node.value.attr

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Get the last statement
        statement = node.body[-1]
        # It could be a return statement or an expression; both wrap a call
        # to the English function. Expressions are used inside the `io.py`
        # file of the various languages
        if isinstance(statement, (ast.Return, ast.Expr)):
            english_name = statement.value.func.attr
            self.translations[node.name] = english_name
            self.translations[english_name] = english_name

Now we can use our `PyTamaroTranslatorVisitor` class to extract the mapping from the PyTamaro library, and then take a look at the resulting dictionary.

In [5]:
import json

# Create a new instance of the NodeVisitor
translator_visitor = PyTamaroTranslatorVisitor()

# Visit the ast
translator_visitor.visit(pytamaro_ast)

# Print the translations
print(json.dumps(translator_visitor.translations, indent=4))

{
    "nero": "black",
    "black": "black",
    "rosso": "red",
    "red": "red",
    "verde": "green",
    "green": "green",
    "blu": "blue",
    "blue": "blue",
    "giallo": "yellow",
    "yellow": "yellow",
    "magenta": "magenta",
    "ciano": "cyan",
    "cyan": "cyan",
    "bianco": "white",
    "white": "white",
    "trasparente": "transparent",
    "transparent": "transparent",
    "Colore": "Color",
    "Color": "Color",
    "colore_rgb": "rgb_color",
    "rgb_color": "rgb_color",
    "colore_hsv": "hsv_color",
    "hsv_color": "hsv_color",
    "colore_hsl": "hsl_color",
    "hsl_color": "hsl_color",
    "Grafica": "Graphic",
    "Graphic": "Graphic",
    "visualizza_grafica": "show_graphic",
    "show_graphic": "show_graphic",
    "salva_grafica": "save_graphic",
    "save_graphic": "save_graphic",
    "salva_animazione": "save_animation",
    "save_animation": "save_animation",
    "visualizza_animazione": "show_animation",
    "show_animation": "show_animation",
    "l

We also included two utility functions:

- `translate` maps a single term to the corresponding English term, and returns the term unchanged if it is not in the dictionary.
- `is_pytamaro_word` checks whether a given term is present in the dictionary.

In [7]:
print(translator_visitor.translate('settore_circolare'))
print(translator_visitor.translate('haut_centre'))
print(translator_visitor.translate('zeige_grafik'))
print(translator_visitor.translate('asdasdasdasd'))

print(translator_visitor.is_pytamaro_word('settore_circolare'))
print(translator_visitor.is_pytamaro_word('fubar'))

circular_sector
top_center
show_graphic
asdasdasdasd
True
False


# User Programs to TamaroCards

Now that we have a way of translating programs to English, we can analyze user programs regardless of which PyTamaro language they are written in.

## Loading of the example Python programs

In [6]:
examples_folder = "example_codes/"

pacman_it = ""
with open(examples_folder + "pacman_it.py", "r") as f:
    pacman_it = f.read()

pacman_en = ""
with open(examples_folder + "pacman_en.py", "r") as f:
    pacman_en = f.read()

heart_en = ""
with open(examples_folder + "heart_en.py", "r") as f:
    heart_en = f.read()

In [12]:
import builtins


class UserProgramVisitor(ast.NodeVisitor):
    def __init__(self) -> None:
        self.tamaro_cards: dict[str, int] = {}
        self.user_defined_functions: list[str] = []
        self.pytamaro_python_used_functions: set[str] = set()
        self.excluded_names: set[str] = set()
        self._collected_user_functions: bool = False

    def _is_pytamaro_type(self, name: str) -> bool:
        return name[0].isupper()

    def _add_tamaro_card(self, name: str) -> None:
        if name not in self.excluded_names:
            self.tamaro_cards[name] = self.tamaro_cards.get(name, 0) + 1

    def _collect_user_defined_functions(self, node: ast.AST) -> None:
        for child in ast.walk(node):
            if isinstance(child, ast.FunctionDef):
                self.user_defined_functions.append(child.name)
                for arg in child.args.args:
                    self.excluded_names.add(arg.arg)
                    if isinstance(arg.annotation, ast.Name):
                        self.excluded_names.add(arg.annotation.id)
                    elif isinstance(arg.annotation, ast.Subscript):
                        self.excluded_names.add(arg.annotation.value.id)
    
    def visit(self, node: ast.AST) -> None:
        if not self._collected_user_functions:
            self._collect_user_defined_functions(node)
            self._collected_user_functions = True
        super().visit(node)

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        self._add_tamaro_card("function-def")
        self.user_defined_functions.append(node.name)
        for arg in node.args.args:
            # Here we add the name of the parameter and its type to the excluded names set
            self.excluded_names.add(arg.arg)
            if isinstance(arg.annotation, ast.Name):
                self.excluded_names.add(arg.annotation.id)
            elif isinstance(arg.annotation, ast.Subscript):
                self.excluded_names.add(arg.annotation.value.id)
        super().generic_visit(node)

    def visit_Assign(self, node: ast.Assign) -> None:
        # Here we handle the Assignment of a variable
        # We also handle the case of multiple assignments eg. a, b = 1, 2
        for target in node.targets:
            if isinstance(target, ast.Name):
                self._add_tamaro_card("constant-def")
                self.excluded_names.add(target.id)
            elif isinstance(target, ast.Tuple):
                for elt in target.elts:
                    # Skip starred or attribute targets, which have no `id`
                    if isinstance(elt, ast.Name):
                        self._add_tamaro_card("constant-def")
                        self.excluded_names.add(elt.id)
        super().generic_visit(node)

    def _is_standard_library_function(self, func_name: str) -> bool:
        if func_name in dir(builtins):
            return True
        try:
            # Otherwise expect a dotted name such as "json.dumps"
            module_name, attr_name = func_name.split(".")[:2]
            module = importlib.import_module(module_name)
            # getattr replaces the original eval-based attribute lookup
            return inspect.isfunction(getattr(module, attr_name))
        except (ImportError, AttributeError, ValueError):
            return False

    def visit_Call(self, node: ast.Call) -> None:
        # Here we handle the Call of a function
        # We check if the function is user defined or not, to choose the corresponding card
        # Between the generic USE-function# or the function specific card
        if isinstance(node.func, ast.Name):
            if node.func.id in self.user_defined_functions:
                # The generic cards are capped at three arguments
                n_args = min(len(node.args), 3)
                self._add_tamaro_card(f"function-use{n_args}")
                self.excluded_names.add(node.func.id)
            else:
                if translator_visitor.is_pytamaro_word(
                    node.func.id
                ) or self._is_standard_library_function(node.func.id):
                    translated_name = translator_visitor.translate(node.func.id)
                    self._add_tamaro_card(translated_name)
                    self.pytamaro_python_used_functions.add(translated_name)
        super().generic_visit(node)

    def visit_Constant(self, node: ast.Constant) -> None:
        self._add_tamaro_card("constant-use")
        super().generic_visit(node)

    def visit_Name(self, node: ast.Name) -> None:
        translated_name = translator_visitor.translate(node.id)
        if (
            translated_name not in self.pytamaro_python_used_functions
            and not self._is_pytamaro_type(translated_name)
            and (
                translator_visitor.is_pytamaro_word(translated_name)
                or self._is_standard_library_function(translated_name)
            )
        ):
            self._add_tamaro_card(translated_name)
        super().generic_visit(node)

    def visit_For(self, node: ast.For) -> None:
        if isinstance(node.target, ast.Name):
            self.excluded_names.add(node.target.id)
        super().generic_visit(node)

    """
    ---Operators
    """

    def _generic_operator_visit(self, node: ast.AST, to_visit: ast.AST) -> None:
        operator = node.__class__.__name__.lower()
        self._add_tamaro_card(operator)
        super().generic_visit(to_visit)

    def visit_BinOp(self, node: ast.BinOp) -> None:
        self._generic_operator_visit(node.op, node)

    def visit_UnaryOp(self, node: ast.UnaryOp) -> None:
        self._generic_operator_visit(node.op, node)

    def visit_BoolOp(self, node: ast.BoolOp) -> None:
        self._generic_operator_visit(node.op, node)

    def visit_Compare(self, node: ast.Compare) -> None:
        for op in node.ops:
            self._generic_operator_visit(op, node)

    def visit_IfExp(self, node: ast.IfExp) -> None:
        self._generic_operator_visit(node, node)


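# NOTE: `test_code` must hold the source of the program to analyze;
# it is not defined in the cells shown here
user_program_ast = ast.parse(test_code)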
user_program_visitor = UserProgramVisitor()
# This will first collect all the user defined functions, and then visit the ast
user_program_visitor.visit(user_program_ast)

# print("Tamaro Cards: ", user_program_visitor.tamaro_cards)
print(user_program_visitor.tamaro_cards)
print("----")
print("User defined functions: ", user_program_visitor.user_defined_functions)
print("----")
print(
    "PyTamaro/Python used functions: ",
    user_program_visitor.pytamaro_python_used_functions,
)
print("----")
print("Excluded names: ", user_program_visitor.excluded_names)

{'function-def': 4, 'compose': 2, 'pin': 3, 'bottom_left': 3, 'circular_sector': 2, 'sub': 1, 'constant-use': 14, 'black': 1, 'rectangle': 1, 'constant-def': 6, 'eq': 3, 'mod': 3, 'max': 1, 'usub': 1, 'function-use2': 4, 'yellow': 1, 'green': 1, 'blue': 1, 'red': 1, 'empty_graphic': 2, 'range': 2, 'beside': 2, 'function-use1': 3, 'rotate': 1, 'mult': 1, 'show_graphic': 1}
----
User defined functions:  ['tile', 'color_var_tile', 'row', 'row_advanced', 'tile', 'color_var_tile', 'row', 'row_advanced']
----
PyTamaro/Python used functions:  {'rotate', 'beside', 'show_graphic', 'empty_graphic', 'circular_sector', 'pin', 'range', 'rectangle', 'compose', 'max'}
----
Excluded names:  {'length', 'i', 'Color', 'float', 'result', 'rotated_tile', 'size', 'tile', 'number', 'color', 'row_advanced', 'color_var_tile', 'int'}
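
The operator cards in the output above (`sub`, `eq`, `mod`, `usub`, `mult`) get their names from the lowercased class name of the AST operator node, as done in `_generic_operator_visit`. The standalone sketch below reproduces just that naming step on a made-up expression; it uses `ast.walk` instead of the visitor and is only meant as an illustration.

```python
import ast

# Made-up expression that mixes several kinds of operators.
expr = ast.parse("-a * b % 2 == 0 if a >= b else a != b", mode="eval")

operator_cards: list[str] = []
for node in ast.walk(expr):
    if isinstance(node, (ast.BinOp, ast.UnaryOp, ast.BoolOp)):
        # The operator lives in `node.op`, e.g. ast.Mult -> "mult"
        operator_cards.append(type(node.op).__name__.lower())
    elif isinstance(node, ast.Compare):
        # A comparison can chain several operators, e.g. a < b <= c
        operator_cards.extend(type(op).__name__.lower() for op in node.ops)
    elif isinstance(node, ast.IfExp):
        # The conditional expression is named after the node itself
        operator_cards.append(type(node).__name__.lower())

print(sorted(operator_cards))
# ['eq', 'gte', 'ifexp', 'mod', 'mult', 'noteq', 'usub']
```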


In [17]:
test_font_code = '''
from pytamaro import text, blue

ciao = "Arial"

test = text("ciao", ciao, 12, blue)
'''

In [20]:
import ast


class FontVisitor(ast.NodeVisitor):
    def __init__(self) -> None:
        self.fonts: dict[str, int] = {}

    def visit_Call(self, node: ast.Call) -> None:
        # Count the font passed as the second positional argument of `text`
        # (only calls with exactly four positional arguments are considered)
        if isinstance(node.func, ast.Name):
            translated_name = translator_visitor.translate(node.func.id)
            if translated_name == "text" and len(node.args) == 4:
                font_arg = node.args[1]
                # A literal font is recorded by value; anything else
                # (e.g. a variable) is recorded as "N/A"
                key = font_arg.value if isinstance(font_arg, ast.Constant) else "N/A"
                self.fonts[key] = self.fonts.get(key, 0) + 1
        super().generic_visit(node)


# font_ast = ast.parse(test_font_code)
# font_program_visitor = FontVisitor()
# # Visit the ast and count the fonts used in the `text` calls
# font_program_visitor.visit(font_ast)

# print(font_program_visitor.fonts)
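
In `test_font_code` the font is passed through the variable `ciao`, so it cannot be read from the call itself and ends up under `"N/A"`. The sketch below shows the difference between a literal and a variable argument at the AST level; the snippet is made up for illustration and nothing in it is executed.

```python
import ast

# Made-up snippet: one call with a literal font, one with a variable.
snippet = '''
font = "Arial"
a = text("ciao", "Roboto", 12, blue)
b = text("ciao", font, 12, blue)
'''

seen: list[str] = []
for node in ast.walk(ast.parse(snippet)):
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        font_arg = node.args[1]  # second positional argument
        if isinstance(font_arg, ast.Constant):
            seen.append(f"literal: {font_arg.value}")
        else:
            seen.append(f"not a literal: {type(font_arg).__name__}")

print(seen)  # ['literal: Roboto', 'not a literal: Name']
```

Resolving the variable would require tracking assignments, which this analysis does not attempt.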

# Stitching together the TamaroCards

## Renaming of the files to create a mapping between the TamaroCards and the Dictionary

In [46]:
## Renamed files
import os

names_mapping = {
    "plus": "add",
    "divide": "div",
    "equal": "eq",
    "integer-divide": "floordiv",
    "greater-than": "gt",
    "greater-or-equal": "gte",
    "if-else": "ifexp",
    "less-than": "lt",
    "less-or-equal": "lte",
    "remainder": "mod",
    "times": "mult",
    "not-equal": "noteq",
    "power": "pow",
    "minus": "sub",
    "unary-plus": "uadd",
    "unary-minus": "usub",
}

# rename all the files inside the folder 'cards' with their matching name
# and remove the files that are not .svg
for root, dirs, files in os.walk("cards"):
    for file in files:
        # just the files ending in .svg
        if file.endswith(".svg"):
            stem = file.split(".")[0]
            if stem in names_mapping:
                new_name = names_mapping[stem]
                os.rename(
                    os.path.join(root, file), os.path.join(root, f"{new_name}.svg")
                )
                print(f"Renamed {file} to {new_name}.svg")
        else:
            # remove the files that are not .svg
            os.remove(os.path.join(root, file))
            print(f"Removed {file}")

In [47]:
import svg_stack as ss
from lxml import etree
import os


def get_tamaro_svgs(
    tamaro_cards: dict[str, int], src: str = "cards"
) -> list[etree._ElementTree]:
    # cards could be in subfolders
    svgs = []
    for root, dirs, files in os.walk(src):
        for file in files:
            if file.split(".")[0] in tamaro_cards:
                for _ in range(tamaro_cards[file.split(".")[0]]):
                    svg = etree.parse(os.path.join(root, file))
                    svgs.append(svg)
    return svgs


    # svgs = []
    # for card, count in tamaro_cards.items():
    #     for _ in range(count):
    #         svg = etree.parse(f"{src}/{card}.svg")
    #         svgs.append(svg)
    # return svgs

def order_svgs(svgs: list[etree._ElementTree]) -> list[etree._ElementTree]:
    # Order the svgs by width, then height (the attributes are compared as strings)
    return sorted(
        svgs,
        key=lambda svg: (svg.getroot().attrib["width"], svg.getroot().attrib["height"]),
    )


def create_svg_stack(
    svgs: list[etree._ElementTree],
    output_folder: str = ".",
    scale: float = 1.0,
    h_padding: float = 5.0,
    v_padding: float = 5.0,
) -> None:
    import os

    A4_WIDTH = 1200
    A4_HEIGHT = 800

    page_number = 0
    svg = ss.Document()
    page_layout = ss.VBoxLayout()
    row_layout = ss.HBoxLayout()
    page_layout.setSpacing(v_padding)
    row_layout.setSpacing(h_padding)

    for card in svgs:
        # Create a new svg that is resized by the scale factor
        root = card.getroot()
        # get the last 2 values of the viewBox attribute
        viewBox = root.attrib["viewBox"].split(" ")

        svg_width = float(viewBox[2]) * scale
        root.attrib["width"] = str(svg_width)
        svg_height = float(viewBox[3]) * scale
        root.attrib["height"] = str(svg_height)
        # Save the modified svg in a temp file
        svg_string = etree.tostring(root).decode()
        with open("__temp.svg", "w") as f:
            f.write(svg_string)

        # Check if the card fits in the row
        # if not add the row to the page layout
        # if the page layout is full save the page and create a new one
        if row_layout.get_size().width + svg_width + h_padding > A4_WIDTH:
            page_layout.addLayout(row_layout)
            row_layout = ss.HBoxLayout()
            row_layout.setSpacing(h_padding)
        if page_layout.get_size().height + svg_height + v_padding > A4_HEIGHT:
            svg.setLayout(page_layout)
            svg.save(f"{output_folder}/tamaroCards_{page_number}.svg")
            page_layout = ss.VBoxLayout()
            page_layout.setSpacing(v_padding)
            page_number += 1

        row_layout.addSVG("__temp.svg", alignment=ss.AlignHCenter | ss.AlignVCenter)
        # Remove the temp file
        os.remove("__temp.svg")

    if row_layout.get_size().width > 0:
        page_layout.addLayout(row_layout)
    if page_layout.get_size().height > 0:
        svg.setLayout(page_layout)
        svg.save(f"{output_folder}/tamaroCards_{page_number}.svg")


svgs = get_tamaro_svgs(user_program_visitor.tamaro_cards)
sorted_svgs = order_svgs(svgs)
# if the folder 'output' does not exist create it
if not os.path.exists("output"):
    os.makedirs("output")

create_svg_stack(sorted_svgs, output_folder="output", scale=0.22, v_padding=3)
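
The row handling in `create_svg_stack` follows a greedy strategy: cards are appended to the current row until the next one would exceed the page width, at which point a new row is started. The sketch below isolates that idea in pure Python, without `svg_stack`. `pack_rows` is a hypothetical helper, and its padding arithmetic is a simplification rather than an exact copy of the cell above.

```python
def pack_rows(
    widths: list[float], max_width: float, padding: float
) -> list[list[float]]:
    """Greedily place widths into rows that do not exceed max_width."""
    rows: list[list[float]] = []
    current: list[float] = []
    used = 0.0
    for width in widths:
        # Width the row would have if this card were appended.
        needed = width if not current else used + padding + width
        if current and needed > max_width:
            # The card does not fit: close the row and start a new one.
            rows.append(current)
            current, used = [], 0.0
            needed = width
        current.append(width)
        used = needed
    if current:
        rows.append(current)
    return rows


print(pack_rows([400, 400, 400, 300], max_width=1200, padding=5))
# [[400, 400], [400, 300]]
```

The same check is then repeated vertically to decide when a page is full.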

---

# Student code analysis

In [23]:
# font
import os, ast

font_analysis = {}
# get number of folders inside students-code
n_students_code = len(os.listdir("students-code"))
general_program_visitor = FontVisitor()

for root, dirs, files in os.walk("students-code"):
    for index, dir_name in enumerate(dirs):
        user_code = ""
        for file in os.listdir(os.path.join(root, dir_name)):
            if file == "cell.py":
                with open(os.path.join(root, dir_name, file), "r") as f:
                    temp_code = f.read()
                    if temp_code != "" and temp_code != "pass":
                        user_code += temp_code + "\n\n"
        student_id: int = int(dir_name)
        font_analysis[student_id] = {}
        font_analysis[student_id]["code"] = user_code
        try:
            temp_user_program_ast = ast.parse(user_code)
            font_analysis[student_id]["error"] = None
            temp_user_program_visitor = FontVisitor()
            temp_user_program_visitor.visit(temp_user_program_ast)
            general_program_visitor.visit(temp_user_program_ast)
            font_analysis[student_id]["fonts"] = temp_user_program_visitor.fonts
        except Exception as e:
            font_analysis[student_id]["error"] = str(e)
        print(f"Processed {index + 1}/{n_students_code} students")

Processed 1/86948 students
Processed 2/86948 students
Processed 3/86948 students
Processed 4/86948 students
Processed 5/86948 students
Processed 6/86948 students
Processed 7/86948 students
Processed 8/86948 students
Processed 9/86948 students
Processed 10/86948 students
Processed 11/86948 students
Processed 12/86948 students
Processed 13/86948 students
Processed 14/86948 students
Processed 15/86948 students
Processed 16/86948 students
Processed 17/86948 students
Processed 18/86948 students
Processed 19/86948 students
Processed 20/86948 students
Processed 21/86948 students
Processed 22/86948 students
Processed 23/86948 students
Processed 24/86948 students
Processed 25/86948 students
Processed 26/86948 students
Processed 27/86948 students
Processed 28/86948 students
Processed 29/86948 students
Processed 30/86948 students
Processed 31/86948 students
Processed 32/86948 students
Processed 33/86948 students
Processed 34/86948 students
Processed 35/86948 students
Processed 36/86948 students
P

In [24]:
sorted_general_fonts = dict(
    sorted(
        general_program_visitor.fonts.items(),
        key=lambda item: item[1],
        reverse=True,
    )
)

print(
    json.dumps(
        sorted_general_fonts,
        indent=4,
    )
)

{
    "arial": 772,
    "Roboto": 572,
    "Fira Sans": 374,
    "Arial": 316,
    "Libre Bodoni": 304,
    "N/A": 228,
    "Fira Code": 90,
    "roboto": 45,
    "verdana": 31,
    "roboto sherif": 23,
    "": 16,
    "Sans Code": 11,
    "Helvetica": 11,
    "Jost": 9,
    "FiraSans": 8,
    "cormorant garamond": 8,
    "Papyrus": 8,
    "Courier": 6,
    "Calibri": 4,
    "!li!i!li!i!li!": 4,
    "20": 4,
    "Inter": 4,
    "gruen": 4,
    "robot": 3,
    "Noto Sans": 3,
    "Fire Code": 3,
    "Roboto Mono": 3,
    "fallback": 3,
    "PT Serif": 3,
    "monaco": 2,
    "monospaced": 2,
    "Cormorant Garamond": 2,
    "hallo": 2,
    "Roboto Serif": 2,
    "calibri": 2,
    "coding": 2,
    "Calbri": 1,
    "30": 1,
    "aryal": 1,
    "New Times Roman": 1,
    "Gebhardt": 1,
    "times new roman": 1,
    "Fallback": 1,
    "IBM Plex Sans": 1,
    "fallblack": 1,
    "New York Times": 1,
    "Times New Roman": 1,
    "Noto Color Emoji": 1,
    " calibri": 1,
    "Aptos": 1,
    "N

In [25]:
import os, ast

analysis: dict[int, dict] = {}
# source_code_dir = "students-code"
source_code_dir = "nope"
# get number of folders inside students-code
n_students_code = len(os.listdir(source_code_dir))
general_program_visitor = UserProgramVisitor()

for root, dirs, files in os.walk(source_code_dir):
    for index, dir_name in enumerate(dirs):
        user_code = ""
        for file in os.listdir(os.path.join(root, dir_name)):
            if file == "cell.py":
                with open(os.path.join(root, dir_name, file), "r") as f:
                    temp_code = f.read()
                    if temp_code != "" and temp_code != "pass":
                        user_code += temp_code + "\n\n"
        student_id: int = int(dir_name)
        analysis[student_id] = {}
        analysis[student_id]["code"] = user_code
        try:
            temp_user_program_ast = ast.parse(user_code)
            analysis[student_id]["error"] = None
            temp_user_program_visitor = UserProgramVisitor()
            temp_user_program_visitor.visit(temp_user_program_ast)
            general_program_visitor.visit(temp_user_program_ast)
            analysis[student_id]["tamaro_cards"] = temp_user_program_visitor.tamaro_cards
            analysis[student_id]["user_defined_functions"] = temp_user_program_visitor.user_defined_functions
            analysis[student_id]["pytamaro_python_used_functions"] = temp_user_program_visitor.pytamaro_python_used_functions
        except Exception as e:
            analysis[student_id]["error"] = str(e)
        print(f"Processed {index + 1}/{n_students_code} students")

Processed 1/86948 students
Processed 2/86948 students
Processed 3/86948 students
Processed 4/86948 students
Processed 5/86948 students
Processed 6/86948 students
Processed 7/86948 students
Processed 8/86948 students
Processed 9/86948 students
Processed 10/86948 students
Processed 11/86948 students
Processed 12/86948 students
Processed 13/86948 students
Processed 14/86948 students
Processed 15/86948 students
Processed 16/86948 students
Processed 17/86948 students
Processed 18/86948 students
Processed 19/86948 students
Processed 20/86948 students
Processed 21/86948 students
Processed 22/86948 students
Processed 23/86948 students
Processed 24/86948 students
Processed 25/86948 students
Processed 26/86948 students
Processed 27/86948 students
Processed 28/86948 students
Processed 29/86948 students
Processed 30/86948 students
Processed 31/86948 students
Processed 32/86948 students
Processed 33/86948 students
Processed 34/86948 students
Processed 35/86948 students
Processed 36/86948 students
P

In [33]:
# convert the pytamaro_python_used_functions sets to lists for each student,
# since sets are not JSON serializable
for student_id, student_analysis in analysis.items():
    if "pytamaro_python_used_functions" in student_analysis:
        student_analysis["pytamaro_python_used_functions"] = list(student_analysis["pytamaro_python_used_functions"])


# export the resulting analysis to a json file
with open("analysis.json", "w") as f:
    json.dump(analysis, f, indent=4)


In [26]:
# count all the errors
errors: dict[int, str] = {}
for student_id, data in analysis.items():
    if data["error"] is not None:
        errors[student_id] = data["error"]

print(f"Errors: {len(errors)}/{n_students_code}")

Errors: 9194/86948


In [27]:
for student_id, error in errors.items():
    print(f"{student_id}: {error}")

424984: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 14)
420518: invalid syntax (<unknown>, line 1)
402648: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 10)
335219: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 11)
377890: invalid syntax (<unknown>, line 21)
413508: invalid character '⬇' (U+2B07) (<unknown>, line 15)
379304: invalid syntax (<unknown>, line 6)
419200: invalid syntax (<unknown>, line 8)
370887: expected ':' (<unknown>, line 19)
371955: invalid syntax (<unknown>, line 6)
390467: '(' was never closed (<unknown>, line 2)
395115: unexpected indent (<unknown>, line 56)
408140: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 11)
409092: invalid syntax (<unknown>, line 2)
372388: invalid syntax (<unknown>, line 8)
402822: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 7)
383414: invalid syntax (<unknown>, line 4)
383626: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 15)
355798: unexpected i

In [28]:
sorted_general_cards = dict(
    sorted(
        general_program_visitor.tamaro_cards.items(),
        key=lambda item: item[1],
        reverse=True,
    )
)

print(len(sorted_general_cards))

print(
    json.dumps(
        sorted_general_cards,
        indent=4,
    )
)



97
{
    "constant-use": 976641,
    "constant-def": 543449,
    "function-use2": 120066,
    "function-use3": 106587,
    "compose": 95646,
    "mult": 82677,
    "rgb_color": 65397,
    "rotate": 61958,
    "function-def": 55427,
    "circular_sector": 35086,
    "div": 31888,
    "range": 30395,
    "function-use1": 29921,
    "sub": 14927,
    "add": 13651,
    "usub": 13297,
    "function-use0": 10123,
    "bottom_left": 8824,
    "print": 8770,
    "eq": 7410,
    "rectangle": 5877,
    "bottom_right": 5759,
    "floordiv": 5738,
    "bottom_center": 5131,
    "show_graphic": 4995,
    "mod": 4487,
    "top_center": 4076,
    "empty_graphic": 4003,
    "top_right": 3615,
    "show_animation": 2630,
    "lte": 2615,
    "len": 2437,
    "top_left": 2108,
    "overlay": 1607,
    "hsv_color": 1478,
    "gt": 1331,
    "lt": 1219,
    "pow": 1035,
    "gte": 999,
    "hsl_color": 944,
    "center_right": 598,
    "save_animation": 569,
    "transparent": 509,
    "pin": 484,
    "el