## Overview

This notebook covers the most common mistakes made when working with Python's `ast` module, along with proven patterns and best practices. They come from real-world experience building AST-based tools and from the specific challenges that arise in static analysis projects.

We'll cover:
- Essential visitor patterns that prevent bugs
- Defensive programming for optional AST attributes
- Handling async/await constructs properly
- Advanced parameter extraction techniques
- Method call resolution patterns (core to static analysis)
- Testing strategies for AST code

## Common Pitfalls and Best Practices

### 19. Always Call generic_visit()

Forgetting to call `generic_visit()` is probably the most common AST bug. When you don't call it, the visitor stops traversing into child nodes, missing entire subtrees of your code. This can lead to mysteriously missing functions, uncounted calls, and hours of debugging. The only time you should skip `generic_visit()` is when you explicitly want to stop traversal.

In [1]:
import ast


# COMMON BUG: Forgetting generic_visit()
class BrokenVisitor(ast.NodeVisitor):
    def __init__(self):
        self.functions_found = []

    def visit_ClassDef(self, node):
        print(f"Found class: {node.name}")
        # OOPS! Forgot generic_visit()
        # Result: Methods inside this class will NEVER be visited!

    def visit_FunctionDef(self, node):
        self.functions_found.append(node.name)
        # Also forgot here - nested functions won't be found


# Test the broken visitor
code = """
class Calculator:
    def add(self):       # This won't be found!
        def helper():     # This won't be found either!
            pass
        return helper()

def outer():
    def inner():          # This won't be found!
        pass
"""

tree = ast.parse(code)
broken = BrokenVisitor()
broken.visit(tree)
print(f"Broken visitor found: {broken.functions_found}")  # Only finds 'outer'!

Found class: Calculator
Broken visitor found: ['outer']


In [2]:
# CORRECT: Always call generic_visit()
class CorrectVisitor(ast.NodeVisitor):
    def __init__(self):
        self.functions_found = []

    def visit_ClassDef(self, node):
        print(f"Found class: {node.name}")
        self.generic_visit(node)  # ESSENTIAL! Visits methods

    def visit_FunctionDef(self, node):
        self.functions_found.append(node.name)
        self.generic_visit(node)  # ESSENTIAL! Visits nested functions


correct = CorrectVisitor()
correct.visit(tree)
print(f"Correct visitor found: {correct.functions_found}")  # Finds all!

Found class: Calculator
Correct visitor found: ['add', 'helper', 'outer', 'inner']


In [3]:
# ADVANCED: Conditional traversal
class SelectiveVisitor(ast.NodeVisitor):
    def visit_FunctionDef(self, node):
        print(f"Function: {node.name}")

        # Sometimes you want to skip certain subtrees
        if node.name.startswith("test_"):
            print("  Skipping test function internals")
            return  # Don't traverse into test functions

        # But normally, always call generic_visit
        self.generic_visit(node)


# PATTERN: Ensure generic_visit with try/finally
class SafeVisitor(ast.NodeVisitor):
    def visit_FunctionDef(self, node):
        try:
            # Your processing logic
            print(f"Processing: {node.name}")
            # Could raise exception here
        finally:
            # Children are still visited if processing fails, but the exception
            # propagates afterwards; catch it instead if traversal must continue
            self.generic_visit(node)


# Test selective visitor
test_code = """
def regular_func():
    def inner():
        pass

def test_something():
    def test_inner():
        pass
"""

test_tree = ast.parse(test_code)
selective = SelectiveVisitor()
selective.visit(test_tree)

Function: regular_func
Function: inner
Function: test_something
  Skipping test function internals
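
One caveat with the try/finally pattern: `finally` guarantees that `generic_visit()` runs, but the original exception still propagates once it has finished, so the rest of the tree is skipped. The following is a minimal sketch, assuming you would rather record failures and keep traversing. `ErrorCollectingVisitor` and its `process` method are hypothetical names used only for illustration.

```python
import ast


class ErrorCollectingVisitor(ast.NodeVisitor):
    """Hypothetical sketch: record per-node failures instead of aborting."""

    def __init__(self):
        self.processed = []
        self.errors = []  # (function name, exception) pairs

    def process(self, node):
        # Stand-in for real analysis logic; fails on purpose for one name
        if node.name == "bad":
            raise ValueError(f"cannot analyze {node.name}")
        self.processed.append(node.name)

    def visit_FunctionDef(self, node):
        try:
            self.process(node)
        except Exception as exc:  # broad on purpose: one bad node should not stop the run
            self.errors.append((node.name, exc))
        # Runs whether or not processing failed, so nested functions are still seen
        self.generic_visit(node)


source = """
def good():
    pass

def bad():
    def nested():
        pass

def also_good():
    pass
"""

visitor = ErrorCollectingVisitor()
visitor.visit(ast.parse(source))
print(f"Processed: {visitor.processed}")
print(f"Failed: {[name for name, _ in visitor.errors]}")
```

Whether swallowing exceptions is appropriate depends on the tool: a linter usually wants to report as much as it can, while a code generator may prefer to fail fast.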


### 20. Handle Missing Attributes

AST nodes don't always have every attribute populated. Optional fields such as parameter annotations (`arg.annotation`) and return annotations (`node.returns`) are `None` when absent, and `args.defaults` is an empty list when no parameter has a default. Always check before accessing nested attributes, or your visitor will crash on perfectly valid Python code that just happens to lack annotations.

In [4]:
import ast

# Code with mixed annotations - some present, some missing
code = """
def fully_annotated(x: int, y: str = "default") -> bool:
    return True

def no_annotations(x, y):
    return x + y

def partial_annotations(x: int, y):
    pass  # No return annotation

class Example:
    # Method might not have return annotation
    def method(self): pass
"""

tree = ast.parse(code)


class UnsafeAnalyzer(ast.NodeVisitor):
    """This will CRASH on code without annotations."""

    def visit_FunctionDef(self, node):
        # DANGEROUS - assumes returns exists and is a Name
        # return_type = node.returns.id  # AttributeError!

        # DANGEROUS - assumes all args have annotations
        # for arg in node.args.args:
        #     print(arg.annotation.id)  # AttributeError!
        pass


class SafeAnalyzer(ast.NodeVisitor):
    """Properly handles missing annotations."""

    def visit_FunctionDef(self, node):
        print(f"Function: {node.name}")

        # Safe return annotation checking
        if node.returns:
            if isinstance(node.returns, ast.Name):
                print(f"  Returns: {node.returns.id}")
            elif isinstance(node.returns, ast.Constant) and isinstance(node.returns.value, str):
                print(f"  Returns: '{node.returns.value}' (string)")
            else:
                print("  Returns: complex type")
        else:
            print("  Returns: no annotation")

        # Safe parameter annotation checking
        for arg in node.args.args:
            if arg.annotation:
                if isinstance(arg.annotation, ast.Name):
                    print(f"  Param {arg.arg}: {arg.annotation.id}")
                else:
                    print(f"  Param {arg.arg}: complex annotation")
            else:
                print(f"  Param {arg.arg}: no annotation")

        # Safe default value checking
        defaults = node.args.defaults
        if defaults:
            # Remember: defaults align to the RIGHT of args
            args_with_defaults = node.args.args[-len(defaults) :]
            for arg, default in zip(args_with_defaults, defaults, strict=False):
                if isinstance(default, ast.Constant):
                    print(f"  {arg.arg} has default: {default.value}")

        self.generic_visit(node)


analyzer = SafeAnalyzer()
analyzer.visit(tree)

Function: fully_annotated
  Returns: bool
  Param x: int
  Param y: str
  y has default: default
Function: no_annotations
  Returns: no annotation
  Param x: no annotation
  Param y: no annotation
Function: partial_annotations
  Returns: no annotation
  Param x: int
  Param y: no annotation
Function: method
  Returns: no annotation
  Param self: no annotation


In [5]:
# More defensive patterns - helper function for type extraction
def safe_get_annotation_name(annotation):
    """Safely extract type name from various annotation forms."""
    if annotation is None:
        return None

    if isinstance(annotation, ast.Name):
        return annotation.id

    if isinstance(annotation, ast.Constant) and isinstance(annotation.value, str):
        return annotation.value

    if isinstance(annotation, ast.Attribute):
        # Dotted names such as typing.Any: return the last component
        return annotation.attr

    # Anything else (subscripts like typing.Optional[bool], unions, ...)
    # gets a placeholder
    return "[complex]"


# Test the helper function
test_annotations = ast.parse("def func(x: int, y: 'str', z: typing.Optional[bool]): pass")
func_node = test_annotations.body[0]

for arg in func_node.args.args:
    type_name = safe_get_annotation_name(arg.annotation)
    print(f"Parameter {arg.arg}: {type_name}")

Parameter x: int
Parameter y: str
Parameter z: [complex]
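
When the `[complex]` placeholder loses too much information, one option is to render the annotation back to source text. This is a minimal sketch, assuming Python 3.9+ (where `ast.unparse()` is available); `annotation_to_source` is a hypothetical helper name, not part of the notebook's code.

```python
import ast


def annotation_to_source(annotation):
    """Return the annotation as source text, or None if it is absent."""
    if annotation is None:
        return None
    # ast.unparse (Python 3.9+) regenerates source code for any expression node
    return ast.unparse(annotation)


func = ast.parse(
    "def f(a, x: int, y: 'str', z: typing.Optional[bool], w: list[int] | None): pass"
).body[0]

rendered = {arg.arg: annotation_to_source(arg.annotation) for arg in func.args.args}
for name, text in rendered.items():
    print(f"Parameter {name}: {text}")
```

Note that string annotations keep their quotes in the rendered text, so a caller that wants the bare type name still needs the `ast.Constant` branch from the helper above.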


### 21. Remember AsyncFunctionDef

Python has both `FunctionDef` and `AsyncFunctionDef` nodes. They have identical structure but are different types. If you only handle `FunctionDef`, you'll miss all async functions! This is especially important in modern Python where async is common. The same applies to `For`/`AsyncFor`, `With`/`AsyncWith`, etc.

In [6]:
import ast

code = """
# Mix of regular and async functions
def regular_function():
    return "sync"

async def async_function():
    return "async"

class Service:
    def sync_method(self):
        pass

    async def async_method(self):
        await something()

    @staticmethod
    async def async_static():
        pass

# Async context managers and loops
async def complex_async():
    async with get_connection() as conn:
        async for item in get_items():
            await process(item)
"""

tree = ast.parse(code)


# WRONG: Only handles regular functions
class IncompleteVisitor(ast.NodeVisitor):
    def __init__(self):
        self.functions = []

    def visit_FunctionDef(self, node):
        self.functions.append(("sync", node.name))
        self.generic_visit(node)


incomplete = IncompleteVisitor()
incomplete.visit(tree)
print(f"Incomplete found: {incomplete.functions}")  # Missing async functions!

Incomplete found: [('sync', 'regular_function'), ('sync', 'sync_method')]


In [7]:
# CORRECT: Handle both function types
class CompleteVisitor(ast.NodeVisitor):
    def __init__(self):
        self.functions = []

    def visit_FunctionDef(self, node):
        self.functions.append(("sync", node.name))
        self.generic_visit(node)

    def visit_AsyncFunctionDef(self, node):
        self.functions.append(("async", node.name))
        self.generic_visit(node)


complete = CompleteVisitor()
complete.visit(tree)
print(f"Complete found: {complete.functions}")  # Finds all!

Complete found: [('sync', 'regular_function'), ('async', 'async_function'), ('sync', 'sync_method'), ('async', 'async_method'), ('async', 'async_static'), ('async', 'complex_async')]


In [8]:
# PATTERN 1: Shared processing logic
class SharedLogicVisitor(ast.NodeVisitor):
    def process_function(self, node, is_async):
        print(f"{'Async' if is_async else 'Sync'} function: {node.name}")
        # All your function processing logic here
        self.generic_visit(node)

    def visit_FunctionDef(self, node):
        self.process_function(node, is_async=False)

    def visit_AsyncFunctionDef(self, node):
        self.process_function(node, is_async=True)


# PATTERN 2: Method aliasing (when logic is identical)
class AliasingVisitor(ast.NodeVisitor):
    def visit_FunctionDef(self, node):
        print(f"Function (any kind): {node.name}")
        self.generic_visit(node)

    # Point async handler to the same method
    visit_AsyncFunctionDef = visit_FunctionDef


# Test both patterns
shared_visitor = SharedLogicVisitor()
shared_visitor.visit(tree)
print("\n---\n")
alias_visitor = AliasingVisitor()
alias_visitor.visit(tree)

Sync function: regular_function
Async function: async_function
Sync function: sync_method
Async function: async_method
Async function: async_static
Async function: complex_async

---

Function (any kind): regular_function
Function (any kind): async_function
Function (any kind): sync_method
Function (any kind): async_method
Function (any kind): async_static
Function (any kind): complex_async
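
The aliasing pattern carries over to the other async node pairs mentioned above. This is a minimal sketch, assuming the sync and async variants should be counted together; `AsyncConstructCounter` is a hypothetical class used only for illustration.

```python
import ast


class AsyncConstructCounter(ast.NodeVisitor):
    """Hypothetical sketch: count loops and context managers, sync or async."""

    def __init__(self):
        self.loops = 0
        self.context_managers = 0
        self.awaits = 0

    def visit_For(self, node):
        self.loops += 1
        self.generic_visit(node)

    def visit_With(self, node):
        self.context_managers += 1
        self.generic_visit(node)

    def visit_Await(self, node):
        self.awaits += 1
        self.generic_visit(node)

    # AsyncFor / AsyncWith carry the same fields as For / With,
    # so the sync handlers can be reused directly
    visit_AsyncFor = visit_For
    visit_AsyncWith = visit_With


source = """
async def fetch_all():
    async with get_connection() as conn:
        async for item in get_items():
            await process(item)

def read_all(paths):
    for path in paths:
        with open(path) as fh:
            fh.read()
"""

counter = AsyncConstructCounter()
counter.visit(ast.parse(source))
print(f"Loops: {counter.loops}")
print(f"Context managers: {counter.context_managers}")
print(f"Awaits: {counter.awaits}")
```

Without the two alias lines, the async loop and context manager would be traversed by the default `generic_visit()` but never counted.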


## Key Patterns for Static Analysis

### 22. Extracting Parameter Info

Extracting complete parameter information from functions is complex because Python supports many parameter types: regular positional, positional-only (Python 3.8+), keyword-only, *args, and **kwargs. Each type is stored in a different attribute of the ast.arguments object, and you need to check all of them to get complete parameter information. This pattern shows how to handle all parameter types systematically.

In [9]:
import ast


def extract_complete_parameters(args: ast.arguments):
    """Extract all parameter information from a function signature.

    This pattern shows how to handle all Python parameter types systematically.
    """
    parameters = []

    # 1. Regular positional arguments (most common)
    # These can be passed by position or keyword
    for arg in args.args:
        param_info = {
            "name": arg.arg,
            "has_annotation": arg.annotation is not None,
            "annotation": safe_get_annotation_name(arg.annotation) if arg.annotation else None,
            "kind": "positional_or_keyword",
        }
        parameters.append(param_info)
        print(f"Regular arg: {arg.arg}")

    # 2. Positional-only arguments (Python 3.8+)
    # These come before the / in the signature
    for arg in args.posonlyargs:
        param_info = {
            "name": arg.arg,
            "has_annotation": arg.annotation is not None,
            "annotation": safe_get_annotation_name(arg.annotation) if arg.annotation else None,
            "kind": "positional_only",
        }
        parameters.append(param_info)
        print(f"Positional-only arg: {arg.arg}")

    # 3. Keyword-only arguments
    # These come after * or *args in the signature
    for arg in args.kwonlyargs:
        param_info = {
            "name": arg.arg,
            "has_annotation": arg.annotation is not None,
            "annotation": safe_get_annotation_name(arg.annotation) if arg.annotation else None,
            "kind": "keyword_only",
        }
        parameters.append(param_info)
        print(f"Keyword-only arg: {arg.arg}")

    # 4. *args parameter (if present)
    if args.vararg:
        param_info = {
            "name": args.vararg.arg,
            "has_annotation": args.vararg.annotation is not None,
            "annotation": safe_get_annotation_name(args.vararg.annotation)
            if args.vararg.annotation
            else None,
            "kind": "var_positional",
            "is_variadic": True,
        }
        parameters.append(param_info)
        print(f"*args parameter: *{args.vararg.arg}")

    # 5. **kwargs parameter (if present)
    if args.kwarg:
        param_info = {
            "name": args.kwarg.arg,
            "has_annotation": args.kwarg.annotation is not None,
            "annotation": safe_get_annotation_name(args.kwarg.annotation) if args.kwarg.annotation else None,
            "kind": "var_keyword",
            "is_keyword": True,
        }
        parameters.append(param_info)
        print(f"**kwargs parameter: **{args.kwarg.arg}")

    # Handle default values (tricky because of alignment)
    if args.defaults:
        # defaults align RIGHT with the combined positional list
        # (posonlyargs + args), not with args.args alone
        positional = args.posonlyargs + args.args
        first_default_index = len(positional) - len(args.defaults)

        for arg in positional[first_default_index:]:
            print(f"  {arg.arg} has default value")

    # Handle keyword-only defaults (None marks a parameter without one)
    if args.kw_defaults:
        for arg, default in zip(args.kwonlyargs, args.kw_defaults, strict=False):
            if default is not None:
                print(f"  {arg.arg} (keyword-only) has default value")

    return parameters

In [10]:
# Test with complex function signature
code = """
def complex_function(
    pos_only_1, pos_only_2=10, /,  # Positional-only
    regular_1=15, regular_2: int = 20,  # Regular (positional or keyword)
    *args,  # Variadic positional
    kw_only_1, kw_only_2: str = "default",  # Keyword-only
    **kwargs  # Variadic keyword
) -> bool:
    pass
"""

tree = ast.parse(code)
func_node = tree.body[0]
parameters = extract_complete_parameters(func_node.args)

Regular arg: regular_1
Regular arg: regular_2
Positional-only arg: pos_only_1
Positional-only arg: pos_only_2
Keyword-only arg: kw_only_1
Keyword-only arg: kw_only_2
*args parameter: *args
**kwargs parameter: **kwargs
  pos_only_2 has default value
  regular_1 has default value
  regular_2 has default value
  kw_only_2 (keyword-only) has default value
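
The alignment rule is easy to get wrong, because `args.defaults` is right-aligned against `posonlyargs + args` taken together, while `args.kw_defaults` has exactly one slot per keyword-only parameter. This is a minimal sketch that pairs every parameter with its default as source text, assuming Python 3.9+ for `ast.unparse()`; `defaults_by_parameter` is a hypothetical helper name.

```python
import ast


def defaults_by_parameter(args: ast.arguments):
    """Map each parameter name to its default as source text (None if absent)."""
    positional = args.posonlyargs + args.args
    # Left-pad with None so defaults line up with the END of the positional list
    padding = [None] * (len(positional) - len(args.defaults))
    pairs = list(zip(positional, padding + args.defaults))
    # kw_defaults already has one slot per keyword-only parameter
    pairs += list(zip(args.kwonlyargs, args.kw_defaults))
    return {
        arg.arg: (ast.unparse(default) if default is not None else None)
        for arg, default in pairs
    }


func = ast.parse(
    "def sample(a, b=10, /, c=15, d: int = 20, *args, e, g: str = 'x', **kw): pass"
).body[0]

result = defaults_by_parameter(func.args)
for name, default in result.items():
    print(f"{name}: {default}")
```

`*args` and `**kw` are left out of the mapping because they cannot take default values.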


### 23. Resolving Method Calls

This is a core pattern for variable tracking in static analysis. The key insight is maintaining enough context to resolve what type a variable is when we see a method call on it. This requires tracking assignments, maintaining scope context, and having a lookup mechanism when we encounter calls.

In [11]:
import ast


class MethodCallResolver(ast.NodeVisitor):
    """Pattern for resolving method calls - essential for static analysis."""

    def __init__(self, known_functions):
        self.known_functions = known_functions  # Set of qualified names like "Calculator.add"
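        # NOTE: class_stack is never populated here (there is no visit_ClassDef),
        # so self.method() calls fall back to the bare method name
        self.class_stack = []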
        self.function_stack = []
        self.scoped_variables = {}
        self.resolved_calls = []
        self.unresolved_calls = []

    def _extract_call_name(self, node: ast.Call):
        """Core pattern for extracting method call information."""
        if not isinstance(node.func, ast.Attribute):
            # Simple function call, not a method
            if isinstance(node.func, ast.Name):
                return node.func.id
            return None

        # It's a method/attribute call
        method_name = node.func.attr

        # Pattern 1: self.method() - common in class methods
        if isinstance(node.func.value, ast.Name) and node.func.value.id == "self":
            if self.class_stack:
                qualified = ".".join(self.class_stack) + f".{method_name}"
                print(f"Resolved self.{method_name} -> {qualified}")
                return qualified
            return method_name

        # Pattern 2: ClassName.method() - static or class methods
        if isinstance(node.func.value, ast.Name):
            var_or_class = node.func.value.id

            # Check if it's a capitalized name (likely a class)
            if var_or_class[0].isupper():
                qualified = f"{var_or_class}.{method_name}"
                print(f"Resolved {var_or_class}.{method_name} -> {qualified}")
                return qualified

            # Pattern 3: variable.method() - requires type tracking
            # Need to resolve the variable's type
            var_type = self._lookup_variable_type(var_or_class)
            if var_type:
                qualified = f"{var_type}.{method_name}"
                print(f"Resolved {var_or_class}.{method_name} -> {qualified} (via type tracking)")
                return qualified
            print(f"Unresolved: {var_or_class}.{method_name} (unknown type)")
            self.unresolved_calls.append(f"{var_or_class}.{method_name}")
            return None

        # Pattern 4: Complex expressions
        # like: get_obj().method() or module.sub.method()
        return None  # These remain unresolved in current implementation

    def _lookup_variable_type(self, var_name):
        """Key addition for variable tracking - resolve variable types."""
        current_scope = self._get_current_scope()

        # Check current function scope first
        scoped_name = f"{current_scope}.{var_name}"
        if scoped_name in self.scoped_variables:
            return self.scoped_variables[scoped_name]

        # Check module scope
        module_name = f"__module__.{var_name}"
        if module_name in self.scoped_variables:
            return self.scoped_variables[module_name]

        # Unknown variable
        return None

    def _get_current_scope(self):
        """Build current scope name for variable tracking."""
        if self.function_stack:
            return ".".join(self.function_stack)
        return "__module__"

    def visit_FunctionDef(self, node):
        """Track function scope and parameter types."""
        self.function_stack.append(node.name)

        # Track parameter type annotations
        for arg in node.args.args:
            if arg.annotation:
                type_name = safe_get_annotation_name(arg.annotation)
                if type_name:
                    scope = self._get_current_scope()
                    self.scoped_variables[f"{scope}.{arg.arg}"] = type_name
                    print(f"Parameter {arg.arg}: {type_name} in scope {scope}")

        self.generic_visit(node)
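        self.function_stack.pop()

    # Async functions open a scope the same way (see section 21)
    visit_AsyncFunctionDef = visit_FunctionDef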

    def visit_Assign(self, node):
        """Track variable assignments for type information."""
        if len(node.targets) == 1 and isinstance(node.targets[0], ast.Name):
            var_name = node.targets[0].id

            # Detect constructor calls
            if isinstance(node.value, ast.Call) and isinstance(node.value.func, ast.Name):
                class_name = node.value.func.id
                if class_name[0].isupper():
                    scope = self._get_current_scope()
                    self.scoped_variables[f"{scope}.{var_name}"] = class_name
                    print(f"Tracked: {var_name} = {class_name}() in scope {scope}")

        self.generic_visit(node)

    def visit_Call(self, node):
        """Count calls to known functions."""
        call_name = self._extract_call_name(node)
        if call_name and call_name in self.known_functions:
            self.resolved_calls.append(call_name)
            print(f"Counted call to known function: {call_name}")

        self.generic_visit(node)
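
The resolver above keeps a `class_stack` but defines no `visit_ClassDef`, so the `self.method()` branch has nothing to qualify the name with. This is a minimal, self-contained sketch of how the stack could be maintained, under the assumption that nested classes should produce dotted names; `SelfCallResolver` is a hypothetical class, not a drop-in replacement for `MethodCallResolver`.

```python
import ast


class SelfCallResolver(ast.NodeVisitor):
    """Hypothetical sketch: qualify self.method() calls with the enclosing class."""

    def __init__(self):
        self.class_stack = []
        self.resolved = []

    def visit_ClassDef(self, node):
        # Push before visiting the body, pop afterwards, so the stack always
        # reflects the classes that enclose the node being visited
        self.class_stack.append(node.name)
        try:
            self.generic_visit(node)
        finally:
            self.class_stack.pop()

    def visit_Call(self, node):
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "self"
            and self.class_stack
        ):
            self.resolved.append(".".join(self.class_stack) + f".{func.attr}")
        self.generic_visit(node)


source = """
class Calculator:
    def add(self, a, b):
        return self._round_result(a + b)

    def _round_result(self, value):
        return round(value, 2)

    class History:
        def record(self, entry):
            self.store(entry)
"""

resolver = SelfCallResolver()
resolver.visit(ast.parse(source))
print(resolver.resolved)
```

The try/finally around `generic_visit()` keeps the stack balanced even if a handler deeper in the tree raises.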

In [12]:
code = """
# Pattern 1: self.method() in class methods
class Calculator:
    def __init__(self, precision: int = 2):
        self.precision = precision

    def add(self, a: float, b: float) -> float:
        result = a + b
        return self._round_result(result)  # self.method() call

    def _round_result(self, value: float) -> float:
        return round(value, self.precision)

    @classmethod
    def from_string(cls, config: str):
        return cls(int(config))

# Pattern 2: ClassName.method() - static/class methods
class MathUtils:
    @staticmethod
    def validate(x):
        return x > 0

    @classmethod
    def create_validator(cls):
        return cls()

# Pattern 3: variable.method() with type tracking
def process_calculations(data: list):
    # Constructor tracking
    calc = Calculator(precision=4)
    utils = MathUtils()

    # Method calls on tracked variables
    result = calc.add(10.123, 20.456)
    rounded = calc._round_result(result)

    # Static method call
    is_valid = MathUtils.validate(result)

    # Nested function with variable access
    def process_item(item):
        # Access outer scope variable
        return calc.add(item, 1.0)

    # Multiple assignments and reassignments
    processor = Calculator(2)
    processor = Calculator(3)  # Reassignment
    final = processor.add(1.1, 2.2)

    return [process_item(x) for x in data]

# Pattern 4: Complex expressions and unresolved calls
class DataProcessor:
    def get_calculator(self) -> Calculator:
        return Calculator()

    def process(self):
        # Chained call - harder to resolve
        self.get_calculator().add(1, 2)

        # Dynamic variable - can't resolve
        obj = get_dynamic_object()
        obj.unknown_method()

# Module-level variable tracking
module_calc = Calculator()
module_result = module_calc.add(5, 10)

# Edge cases
def edge_cases():
    # Variable without type annotation
    unknown = get_something()
    unknown.mystery_method()  # Can't resolve

    # Lambda with method call
    func = lambda x: x.upper()  # Can't track x's type

    # Conditional assignment
    if True:
        calc = Calculator()
    else:
        calc = MathUtils()
    calc.add(1, 2)  # Ambiguous type
"""

tree = ast.parse(code)

# Define known functions to track
known_functions = {
    "Calculator.add",
    "Calculator._round_result",
    "Calculator.from_string",
    "MathUtils.validate",
    "MathUtils.create_validator",
    "DataProcessor.get_calculator",
    "DataProcessor.process"
}

print("=== Testing Method Call Resolution ===\n")
resolver = MethodCallResolver(known_functions)
resolver.visit(tree)

print("\n=== Summary ===")
print(f"Total resolved calls to known functions: {len(resolver.resolved_calls)}")
print(f"Resolved: {resolver.resolved_calls}")
print(f"\nTotal unresolved calls: {len(resolver.unresolved_calls)}")
print(f"Unresolved: {resolver.unresolved_calls}")

# Analyze what was tracked
print("\n=== Tracked Variables ===")
for var, var_type in sorted(resolver.scoped_variables.items()):
    print(f"  {var} -> {var_type}")

=== Testing Method Call Resolution ===

Parameter precision: int in scope __init__
Parameter a: float in scope add
Parameter b: float in scope add
Parameter value: float in scope _round_result
Parameter config: str in scope from_string
Parameter data: list in scope process_calculations
Tracked: calc = Calculator() in scope process_calculations
Tracked: utils = MathUtils() in scope process_calculations
Resolved calc.add -> Calculator.add (via type tracking)
Counted call to known function: Calculator.add
Resolved calc._round_result -> Calculator._round_result (via type tracking)
Counted call to known function: Calculator._round_result
Resolved MathUtils.validate -> MathUtils.validate
Counted call to known function: MathUtils.validate
Unresolved: calc.add (unknown type)
Tracked: processor = Calculator() in scope process_calculations
Tracked: processor = Calculator() in scope process_calculations
Resolved processor.add -> Calculator.add (via type tracking)
Counted call to known function: C
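
In the output above, the call inside the nested `process_item` function is reported as `Unresolved: calc.add (unknown type)`, because `_lookup_variable_type` checks only the innermost function scope and the module scope. This is a minimal sketch of a lookup that also walks the enclosing function scopes, assuming the same `"scope.name"` key format; `lookup_in_enclosing_scopes` is a hypothetical helper, and it deliberately ignores `global`/`nonlocal` declarations and assignment order.

```python
def lookup_in_enclosing_scopes(scoped_variables, function_stack, var_name):
    """Search the innermost scope first, then each enclosing function, then the module."""
    # function_stack ["outer", "inner"] yields scopes "outer.inner", then "outer"
    for depth in range(len(function_stack), 0, -1):
        scope = ".".join(function_stack[:depth])
        key = f"{scope}.{var_name}"
        if key in scoped_variables:
            return scoped_variables[key]
    # Fall back to module scope; returns None for unknown variables
    return scoped_variables.get(f"__module__.{var_name}")


# Same key format as the resolver's scoped_variables dictionary
scoped_variables = {
    "process_calculations.calc": "Calculator",
    "__module__.module_calc": "Calculator",
}
stack = ["process_calculations", "process_item"]

from_outer = lookup_in_enclosing_scopes(scoped_variables, stack, "calc")
from_module = lookup_in_enclosing_scopes(scoped_variables, stack, "module_calc")
missing = lookup_in_enclosing_scopes(scoped_variables, stack, "unknown")
print(from_outer, from_module, missing)
```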

## Testing Your AST Code

### 24. Create Test ASTs Programmatically

Sometimes it's easier to build AST nodes directly rather than parsing code strings. This is especially useful for testing edge cases or when you want to test your visitor with specific node structures. The key is remembering to call `ast.fix_missing_locations()` before compiling: it fills in the `lineno` and `col_offset` attributes that `compile()` requires.

In [13]:
import ast

# Building AST nodes programmatically for testing


def create_function_node(name, params, return_type=None):
    """Create a FunctionDef node programmatically."""
    # Create parameter list
    args_list = []
    for param_name, param_type in params:
        arg = ast.arg(arg=param_name, annotation=None)
        if param_type:
            arg.annotation = ast.Name(id=param_type, ctx=ast.Load())
        args_list.append(arg)

    # Create arguments object
    arguments = ast.arguments(
        posonlyargs=[], args=args_list, vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]
    )

    # Create function body (just pass for now)
    body = [ast.Pass()]

    # Create return annotation if provided
    returns = ast.Name(id=return_type, ctx=ast.Load()) if return_type else None

    # Create the function node
    func_node = ast.FunctionDef(
        name=name, args=arguments, body=body, decorator_list=[], returns=returns, type_comment=None
    )

    return func_node


def create_class_with_method():
    """Create a complete class with a method."""
    # Create method
    method = create_function_node("add", [("self", None), ("a", "int"), ("b", "int")], "int")

    # Create class
    class_node = ast.ClassDef(name="Calculator", bases=[], keywords=[], body=[method], decorator_list=[])

    return class_node


# Test creating a class
class_node = create_class_with_method()
print("Created class node:")
print(ast.dump(class_node, indent=2))

Created class node:
ClassDef(
  name='Calculator',
  body=[
    FunctionDef(
      name='add',
      args=arguments(
        args=[
          arg(arg='self'),
          arg(
            arg='a',
            annotation=Name(id='int', ctx=Load())),
          arg(
            arg='b',
            annotation=Name(id='int', ctx=Load()))]),
      body=[
        Pass()],
      returns=Name(id='int', ctx=Load()))])


In [14]:
def create_test_module():
    """Create a complete module for testing."""
    # Create an assignment: x = 5
    assign = ast.Assign(
        targets=[ast.Name(id="x", ctx=ast.Store())], value=ast.Constant(value=5), type_comment=None
    )

    # Create a function call: print(x)
    call = ast.Expr(
        value=ast.Call(
            func=ast.Name(id="print", ctx=ast.Load()), args=[ast.Name(id="x", ctx=ast.Load())], keywords=[]
        )
    )

    # Create the module
    module = ast.Module(body=[assign, call], type_ignores=[])

    # CRITICAL: Fix missing locations
    ast.fix_missing_locations(module)

    return module


# Test the programmatically created AST
module = create_test_module()

# You can compile and execute it!
code = compile(module, "[test]", "exec")
print("Executing programmatically created AST:")
exec(code)  # Will print: 5

Executing programmatically created AST:
5


In [15]:
# Or analyze it with your visitor
class TestVisitor(ast.NodeVisitor):
    def visit_Assign(self, node):
        print(f"Found assignment at line {getattr(node, 'lineno', 'unknown')}")
        self.generic_visit(node)

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            print(f"Found call to {node.func.id} at line {getattr(node, 'lineno', 'unknown')}")
        self.generic_visit(node)


visitor = TestVisitor()
visitor.visit(module)

Found assignment at line 1
Found call to print at line 1
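
To see why `fix_missing_locations()` matters, it helps to watch `compile()` reject a hand-built tree without it. This is a minimal sketch under that assumption; it also shows one way to check a hand-built tree against parsed source by comparing `ast.dump()` output, which leaves out line and column attributes by default. The `build_module` helper and the `"<sketch>"` filename are illustrative choices.

```python
import ast


def build_module():
    """Build the AST for `x = 5` by hand, without location info."""
    assign = ast.Assign(
        targets=[ast.Name(id="x", ctx=ast.Store())],
        value=ast.Constant(value=5),
    )
    return ast.Module(body=[assign], type_ignores=[])


# 1. Without fix_missing_locations(), compile() rejects the tree
module = build_module()
try:
    compile(module, "<sketch>", "exec")
    compile_failed = False
except (TypeError, ValueError) as exc:
    compile_failed = True
    print(f"compile failed: {exc}")

# 2. After fixing locations, the same tree compiles and runs
ast.fix_missing_locations(module)
namespace = {}
exec(compile(module, "<sketch>", "exec"), namespace)
print(f"x = {namespace['x']}")

# 3. Structural comparison against parsed source: only the shape is
#    compared, because ast.dump() omits location attributes by default
parsed = ast.parse("x = 5")
same_shape = ast.dump(module) == ast.dump(parsed)
print(f"Same shape as parsed source: {same_shape}")
```

Comparing dumps is a convenient assertion for tests of code that builds or transforms trees, since it does not depend on where the nodes happen to sit in a file.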


### 25. Use Small Examples

When debugging AST issues, always start with the smallest possible example that reproduces your problem. This makes it much easier to understand what's happening and to test your fixes. Build up complexity gradually once the simple case works.

In [16]:
import ast


def debug_pattern(pattern_name, code_snippet):
    """Helper for debugging specific AST patterns."""
    print(f"\n=== Debugging: {pattern_name} ===")
    print(f"Code: {code_snippet}")

    try:
        tree = ast.parse(code_snippet)
        print("AST Structure:")
        print(ast.dump(tree, indent=2))

        # Walk through all nodes and show their types
        print("\nNode types present:")
        node_types = set()
        for node in ast.walk(tree):
            node_types.add(type(node).__name__)
        for node_type in sorted(node_types):
            print(f"  - {node_type}")

        return tree
    except SyntaxError as e:
        print(f"Syntax Error: {e}")
        return None


# Debug common static analysis patterns
debug_pattern(
    "Instance method call pattern",
    """calc = Calculator()
calc.add(1, 2)""",
)


=== Debugging: Instance method call pattern ===
Code: calc = Calculator()
calc.add(1, 2)
AST Structure:
Module(
  body=[
    Assign(
      targets=[
        Name(id='calc', ctx=Store())],
      value=Call(
        func=Name(id='Calculator', ctx=Load()))),
    Expr(
      value=Call(
        func=Attribute(
          value=Name(id='calc', ctx=Load()),
          attr='add',
          ctx=Load()),
        args=[
          Constant(value=1),
          Constant(value=2)]))])

Node types present:
  - Assign
  - Attribute
  - Call
  - Constant
  - Expr
  - Load
  - Module
  - Name
  - Store


<ast.Module at 0xffff7c92bed0>

In [17]:
# Start simple, then add complexity
debugging_sequence = [
    # Level 1: Simplest possible case
    ("Simple assignment", "x = 5"),
    ("Simple call", "foo()"),
    # Level 2: One step more complex
    ("Constructor call", "calc = Calculator()"),
    ("Method call", "calc.add()"),
    # Level 3: The actual pattern
    (
        "Full pattern",
        """calc = Calculator()
calc.add(1, 2)""",
    ),
    # Level 4: Edge cases
    (
        "Reassignment",
        """calc = Calculator()
calc = Processor()
calc.process()""",
    ),
    ("Chained calls", "get_calc().add(1, 2)"),
    (
        "Nested scopes",
        """def outer():
    calc = Calculator()
    def inner():
        calc.add(1, 2)""",
    ),
]

for name, code in debugging_sequence:
    debug_pattern(name, code)


=== Debugging: Simple assignment ===
Code: x = 5
AST Structure:
Module(
  body=[
    Assign(
      targets=[
        Name(id='x', ctx=Store())],
      value=Constant(value=5))])

Node types present:
  - Assign
  - Constant
  - Module
  - Name
  - Store

=== Debugging: Simple call ===
Code: foo()
AST Structure:
Module(
  body=[
    Expr(
      value=Call(
        func=Name(id='foo', ctx=Load())))])

Node types present:
  - Call
  - Expr
  - Load
  - Module
  - Name

=== Debugging: Constructor call ===
Code: calc = Calculator()
AST Structure:
Module(
  body=[
    Assign(
      targets=[
        Name(id='calc', ctx=Store())],
      value=Call(
        func=Name(id='Calculator', ctx=Load())))])

Node types present:
  - Assign
  - Call
  - Load
  - Module
  - Name
  - Store

=== Debugging: Method call ===
Code: calc.add()
AST Structure:
Module(
  body=[
    Expr(
      value=Call(
        func=Attribute(
          value=Name(id='calc', ctx=Load()),
          attr='add',
          ctx=Load(

In [18]:
# Create minimal test visitor for specific pattern
class MinimalVariableTracker(ast.NodeVisitor):
    """Minimal visitor to demonstrate variable type tracking."""

    def __init__(self):
        self.assignments = {}
        self.calls = []

    def visit_Assign(self, node):
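        # Track: var = ClassName()
        # Note: any bare-name call matches, so `x = make_calc()` is also
        # recorded as if `make_calc` were a class.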
        if (
            len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
        ):
            var_name = node.targets[0].id
            class_name = node.value.func.id
            self.assignments[var_name] = class_name
            print(f"Tracked: {var_name} = {class_name}()")

        self.generic_visit(node)

    def visit_Call(self, node):
        # Track: var.method()
        if isinstance(node.func, ast.Attribute) and isinstance(node.func.value, ast.Name):
            var_name = node.func.value.id
            method_name = node.func.attr

            # Try to resolve
            if var_name in self.assignments:
                class_name = self.assignments[var_name]
                resolved = f"{class_name}.{method_name}"
                print(f"Resolved: {var_name}.{method_name} -> {resolved}")
                self.calls.append(resolved)
            else:
                print(f"Unresolved: {var_name}.{method_name}")

        self.generic_visit(node)

In [19]:
# Test the minimal tracker
test_code = """
calc = Calculator()
result = calc.add(1, 2)
proc = Processor()
proc.process(result)
"""

tree = ast.parse(test_code)
tracker = MinimalVariableTracker()
tracker.visit(tree)
print(f"\nResolved calls: {tracker.calls}")

Tracked: calc = Calculator()
Resolved: calc.add -> Calculator.add
Tracked: proc = Processor()
Resolved: proc.process -> Processor.process

Resolved calls: ['Calculator.add', 'Processor.process']
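
The minimal tracker keeps one flat dictionary, so it cannot tell apart the Level 4 edge cases from the debugging sequence (reassignment, chained calls, nested scopes). One possible extension, sketched below, keeps a stack of per-function dictionaries. `ScopedVariableTracker` and its `scopes` and `unresolved` attributes are hypothetical names introduced for this illustration, not part of the `ast` module. The sketch assumes assignments appear before the calls that use them and deliberately ignores class bodies, `global`/`nonlocal`, and tuple targets.

```python
import ast


class ScopedVariableTracker(ast.NodeVisitor):
    """Hypothetical scope-aware variant of MinimalVariableTracker."""

    def __init__(self):
        # Stack of {variable name: class name or None}; index 0 is module scope.
        self.scopes = [{}]
        self.calls = []
        self.unresolved = []

    def _lookup(self, var_name):
        # Innermost scope first, then each enclosing scope.
        for scope in reversed(self.scopes):
            if var_name in scope:
                return scope[var_name]  # may be None if the type is unknown
        return None

    def _visit_function(self, node):
        self.scopes.append({})  # entering a function opens a new scope
        self.generic_visit(node)
        self.scopes.pop()

    visit_FunctionDef = _visit_function
    visit_AsyncFunctionDef = _visit_function

    def visit_Assign(self, node):
        # Visit the right-hand side first so calls in it see the old binding.
        self.generic_visit(node)
        if len(node.targets) == 1 and isinstance(node.targets[0], ast.Name):
            value = node.value
            if isinstance(value, ast.Call) and isinstance(value.func, ast.Name):
                self.scopes[-1][node.targets[0].id] = value.func.id
            else:
                # Rebound to something we cannot type: forget the old class.
                self.scopes[-1][node.targets[0].id] = None

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Attribute):
            class_name = None
            if isinstance(func.value, ast.Name):
                class_name = self._lookup(func.value.id)
            if class_name is not None:
                self.calls.append(f"{class_name}.{func.attr}")
            else:
                # Covers unknown variables and chained calls like get_calc().add
                self.unresolved.append(ast.unparse(func))
        self.generic_visit(node)


edge_cases = """
calc = Calculator()
calc = Processor()
calc.process()

def outer():
    calc = Calculator()
    def inner():
        calc.add(1, 2)

get_calc().add(1, 2)
"""

tracker = ScopedVariableTracker()
tracker.visit(ast.parse(edge_cases))
print(tracker.calls)       # ['Processor.process', 'Calculator.add']
print(tracker.unresolved)  # ['get_calc().add']
```

Because nested function bodies are visited at definition time, a variable bound in the enclosing function after the inner `def` would not be seen here. A two-pass design (collect assignments first, resolve calls second) is one way to lift that restriction.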


## Key Takeaways

1. **Always call `generic_visit()`** - This is the #1 AST bug. Your visitor will silently miss entire code sections without it.

2. **Check for `None` before using optional attributes** - Optional fields such as `node.returns` and `arg.annotation` exist on every node of that type but are `None` when the source omits them. Check them before use.

3. **Handle both sync and async** - Python has separate node types for async constructs. Don't forget `AsyncFunctionDef`, `AsyncFor`, and `AsyncWith`.

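4. **Extract parameters systematically** - Function parameters come in many forms (positional-only, positional, keyword-only, `*args`, `**kwargs`). Check every field of `ast.arguments`: `posonlyargs`, `args`, `vararg`, `kwonlyargs`, and `kwarg`, plus `defaults` and `kw_defaults` for default values.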

5. **Track variable scope for method resolution** - To resolve `variable.method()` calls, you need to track which class each variable was assigned from, and in which scope that assignment is visible.

6. **Test with minimal examples** - Start with the simplest code that reproduces your issue, then build up complexity.

7. **Use defensive programming patterns** - AST analysis deals with arbitrary user code. Expect the unexpected and handle edge cases gracefully.
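
Takeaways 1 through 4 fit together in a few lines. The following is a minimal sketch, not a complete signature extractor: `describe_parameters` and `SignatureCollector` are hypothetical names introduced here, default values are ignored, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast


def _annotation(arg):
    # arg.annotation is None when the parameter has no type hint.
    return ast.unparse(arg.annotation) if arg.annotation is not None else None


def describe_parameters(func_node):
    """Return (name, annotation-or-None) pairs for every parameter kind."""
    args = func_node.args
    params = [(a.arg, _annotation(a)) for a in args.posonlyargs + args.args]
    if args.vararg is not None:  # *args is a single optional node
        params.append(("*" + args.vararg.arg, _annotation(args.vararg)))
    params.extend((a.arg, _annotation(a)) for a in args.kwonlyargs)
    if args.kwarg is not None:   # **kwargs is a single optional node
        params.append(("**" + args.kwarg.arg, _annotation(args.kwarg)))
    return params


class SignatureCollector(ast.NodeVisitor):
    def __init__(self):
        self.signatures = {}

    def _record(self, node):
        # node.returns is None when there is no return annotation.
        returns = ast.unparse(node.returns) if node.returns is not None else None
        self.signatures[node.name] = (describe_parameters(node), returns)
        self.generic_visit(node)  # keep traversing into nested functions

    # Sync and async definitions share the same handler.
    visit_FunctionDef = _record
    visit_AsyncFunctionDef = _record


source = """
async def fetch(url, /, retries: int = 3, *extra, timeout: float = 1.0, **opts) -> bytes:
    def on_retry(attempt):
        pass
"""

collector = SignatureCollector()
collector.visit(ast.parse(source))
for name, (params, returns) in collector.signatures.items():
    print(name, params, returns)
```

The nested `on_retry` function is only found because `_record` calls `generic_visit()`, and the un-annotated parameters come back as `None` rather than raising an error.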

## Conclusion

This completes the 5-part AST guide series. You now have a comprehensive understanding of:

1. **Core AST concepts** and how Python's abstract syntax trees work
2. **Essential node types** for analyzing Python code structure
3. **Visitor patterns** for traversing and analyzing AST nodes
4. **Debugging tools** for visualization and troubleshooting
5. **Best practices and testing techniques** for building robust AST-based tools

With these skills, you're ready to tackle complex code analysis tasks, build sophisticated static analysis tools, and contribute effectively to projects that work with Python's AST.