# Chapter 37: AST Basics

This notebook introduces Python's `ast` module, which lets you parse Python source code into an Abstract Syntax Tree (AST). You will learn how to parse code strings, inspect tree structure, examine node types and fields, and safely evaluate literal expressions with `ast.literal_eval`.

## Key Concepts
- **`ast.parse`**: Parses a source string into an AST `Module` node
- **`ast.dump`**: Produces a string representation of the AST structure
- **Node types**: `Module`, `Assign`, `FunctionDef`, `Expr`, `BinOp`, `Constant`, `Name`
- **Node fields**: Each node type declares its own named fields (e.g., `BinOp.op`, `Name.id`)
- **`ast.literal_eval`**: Safely evaluates strings containing Python literals

## Section 1: Parsing Source Code with `ast.parse`

`ast.parse` takes a string of Python source code and returns an `ast.Module` node. The module's `body` attribute is a list of top-level statement nodes.

In [None]:
import ast

# Parse a simple assignment statement
source: str = "x = 1 + 2"
tree: ast.Module = ast.parse(source)

print(f"Tree type: {type(tree).__name__}")
print(f"Is Module: {isinstance(tree, ast.Module)}")
print(f"Number of statements: {len(tree.body)}")
print(f"First statement type: {type(tree.body[0]).__name__}")
print(f"Is Assign: {isinstance(tree.body[0], ast.Assign)}")

In [None]:
import ast

# Parse a function definition
source: str = "def greet(name): return f'Hello, {name}'"
tree: ast.Module = ast.parse(source)

func: ast.FunctionDef = tree.body[0]  # type: ignore[assignment]
print(f"Node type: {type(func).__name__}")
print(f"Is FunctionDef: {isinstance(func, ast.FunctionDef)}")
print(f"Function name: {func.name}")
print(f"Number of args: {len(func.args.args)}")
print(f"First arg name: {func.args.args[0].arg}")

In [None]:
import ast

# Parse multiple statements
source: str = """
x = 10
y = 20
result = x + y
"""
tree: ast.Module = ast.parse(source)

print(f"Number of statements: {len(tree.body)}")
for i, stmt in enumerate(tree.body):
    print(f"  Statement {i}: {type(stmt).__name__}")
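
`ast.parse` raises `SyntaxError` when the source is not valid Python, so code that parses text it did not write usually needs to handle that case. The following is a minimal sketch; `try_parse` is a hypothetical helper name used for illustration, not part of the `ast` module.

```python
import ast
from typing import Optional


def try_parse(source: str) -> Optional[ast.Module]:
    """Return the parsed Module, or None if the source has a syntax error."""
    try:
        return ast.parse(source)
    except SyntaxError as exc:
        # SyntaxError carries the message and the line where parsing failed
        print(f"Could not parse {source!r}: {exc.msg} (line {exc.lineno})")
        return None


good = try_parse("x = 1 + 2")
bad = try_parse("x = = 1")

print(f"good is a Module: {isinstance(good, ast.Module)}")
print(f"bad is None: {bad is None}")
```

Returning `None` keeps the example short; raising a custom exception would be an equally reasonable choice in real code.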

## Section 2: Inspecting Trees with `ast.dump`

`ast.dump` returns a string representation of the AST, showing each node type and its field values. The `indent` parameter (available since Python 3.9) pretty-prints the output across multiple lines for readability.

In [None]:
import ast

# Dump a simple constant expression
tree: ast.Module = ast.parse("42")
dumped: str = ast.dump(tree)

print(f"Dump: {dumped}")
print(f"Contains 'Constant': {'Constant' in dumped}")
print(f"Contains '42': {'42' in dumped}")

In [None]:
import ast

# Dump with indentation for readability
tree: ast.Module = ast.parse("x = 1 + 2")
print(ast.dump(tree, indent=2))

In [None]:
import ast

# Dump a function definition to see the full structure
source: str = "def add(a, b): return a + b"
tree: ast.Module = ast.parse(source)
print(ast.dump(tree, indent=2))
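
By default `ast.dump` leaves out position attributes such as `lineno` and `col_offset`, so two sources that differ only in whitespace produce the same dump. Passing `include_attributes=True` adds the positions back. A short sketch of the difference:

```python
import ast

compact = ast.parse("x=1+2")
spaced = ast.parse("x   =   1   +   2")

# Without attributes, formatting differences disappear from the dump
same_structure = ast.dump(compact) == ast.dump(spaced)
print(f"Same structure: {same_structure}")

# With attributes, the column offsets differ between the two sources
same_positions = ast.dump(compact, include_attributes=True) == ast.dump(
    spaced, include_attributes=True
)
print(f"Same positions: {same_positions}")
```

This makes the default dump a convenient way to check whether two snippets are structurally identical.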

## Section 3: AST Node Types

The `ast` module defines many node types. The most common ones include:
- **`Module`**: The root of any parsed source
- **`Assign`**: A variable assignment (`x = ...`)
- **`FunctionDef`**: A function definition (`def ...`)
- **`Expr`**: An expression used as a statement
- **`BinOp`**: A binary operation (`a + b`, `x * y`)
- **`Constant`**: A literal value (`42`, `"hello"`)
- **`Name`**: A variable reference

In [None]:
import ast

# Examining an Assign node
tree: ast.Module = ast.parse("x = 42")
assign: ast.Assign = tree.body[0]  # type: ignore[assignment]

print(f"Type: {type(assign).__name__}")
print(f"Targets: {[type(t).__name__ for t in assign.targets]}")
print(f"Target name: {assign.targets[0].id}")  # type: ignore[attr-defined]
print(f"Value type: {type(assign.value).__name__}")
print(f"Value: {assign.value.value}")  # type: ignore[attr-defined]

In [None]:
import ast

# Examining an expression statement with BinOp
tree: ast.Module = ast.parse("x + y")
expr: ast.Expr = tree.body[0]  # type: ignore[assignment]

print(f"Statement type: {type(expr).__name__}")
print(f"Is Expr: {isinstance(expr, ast.Expr)}")

binop: ast.BinOp = expr.value  # type: ignore[assignment]
print(f"\nValue type: {type(binop).__name__}")
print(f"Is BinOp: {isinstance(binop, ast.BinOp)}")
print(f"Operator: {type(binop.op).__name__}")
print(f"Is Add: {isinstance(binop.op, ast.Add)}")
print(f"Left: {type(binop.left).__name__} (id={binop.left.id})")  # type: ignore[attr-defined]
print(f"Right: {type(binop.right).__name__} (id={binop.right.id})")  # type: ignore[attr-defined]
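
Nested `BinOp` nodes encode operator precedence directly in the tree shape. As an illustration, parsing `1 + 2 * 3` should give an outer addition whose right operand is the multiplication. This sketch uses `isinstance` assertions to narrow the node types instead of `# type: ignore` comments, which is one possible style rather than a requirement.

```python
import ast

tree = ast.parse("1 + 2 * 3")
expr_stmt = tree.body[0]
assert isinstance(expr_stmt, ast.Expr)

outer = expr_stmt.value
assert isinstance(outer, ast.BinOp)

# The outer operation is the addition; its right operand is the multiplication
print(f"Outer operator: {type(outer.op).__name__}")
print(f"Left operand:   {type(outer.left).__name__}")
print(f"Right operand:  {type(outer.right).__name__}")

inner = outer.right
assert isinstance(inner, ast.BinOp)
print(f"Inner operator: {type(inner.op).__name__}")
```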

## Section 4: Node Fields and Attributes

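Every AST node class has a `_fields` tuple naming its child fields and an `_attributes` tuple naming its position metadata, such as line numbers and column offsets. Nodes that have no source position, such as `Module`, have an empty `_attributes` tuple.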

In [None]:
import ast

# Inspect the fields of various node types
node_types: list[type[ast.AST]] = [ast.Module, ast.Assign, ast.FunctionDef, ast.BinOp, ast.Name]

for node_type in node_types:
    print(f"{node_type.__name__}:")
    print(f"  _fields:     {node_type._fields}")
    print(f"  _attributes: {node_type._attributes}")
    print()

In [None]:
import ast

# Access line number and column offset from parsed nodes
source: str = """x = 10
y = x + 5
"""
tree: ast.Module = ast.parse(source)

for stmt in tree.body:
    print(f"{type(stmt).__name__} at line {stmt.lineno}, col {stmt.col_offset}")
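
The `_fields` and `_attributes` tuples can also drive generic inspection code. The sketch below uses `ast.iter_fields`, which yields a `(name, value)` pair for each field present on a node, and then collects the position metadata into a dictionary. The sample source string is arbitrary.

```python
import ast

tree = ast.parse("total = price * 2")
assign = tree.body[0]

# iter_fields yields (name, value) for each entry in node._fields
field_names = []
for name, value in ast.iter_fields(assign):
    field_names.append(name)
    print(f"{name}: {type(value).__name__}")

# Position metadata is listed in _attributes, not _fields
positions = {attr: getattr(assign, attr) for attr in assign._attributes}
print(positions)
```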

## Section 5: Safe Evaluation with `ast.literal_eval`

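`ast.literal_eval` evaluates a string containing a Python literal without executing arbitrary code. It accepts only literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and `None`. Anything else, including function calls and variable names, raises an exception. This makes it far safer than `eval()` for untrusted input, although the Python documentation warns that very large or deeply nested input can still exhaust memory or the C stack, so it is worth bounding the input size.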

In [None]:
import ast

# literal_eval with basic types
int_val: int = ast.literal_eval("42")
print(f"Integer: {int_val} (type: {type(int_val).__name__})")

list_val: list[int] = ast.literal_eval("[1, 2, 3]")
print(f"List: {list_val} (type: {type(list_val).__name__})")

dict_val: dict[str, int] = ast.literal_eval("{'a': 1}")
print(f"Dict: {dict_val} (type: {type(dict_val).__name__})")

tuple_val: tuple[int, ...] = ast.literal_eval("(1, 2, 3)")
print(f"Tuple: {tuple_val} (type: {type(tuple_val).__name__})")

bool_val: bool = ast.literal_eval("True")
print(f"Bool: {bool_val} (type: {type(bool_val).__name__})")

none_val: None = ast.literal_eval("None")
print(f"None: {none_val} (type: {type(none_val).__name__})")
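
A few accepted forms are easy to overlook: signed numbers, complex numbers written as a sum, bytes literals, and set displays. The sample strings below are chosen for illustration.

```python
import ast

samples = ["-5", "3.14", "1+2j", "b'ab'", "{1, 2, 3}", "'text'"]

results = {}
for text in samples:
    value = ast.literal_eval(text)
    results[text] = value
    print(f"{text!r:12} -> {value!r} ({type(value).__name__})")
```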

In [None]:
import ast

# literal_eval rejects anything that is not a plain literal
rejected_inputs: list[str] = [
    "__import__('os')",      # function call
    "open('/etc/passwd')",   # function call
    "1 + 2",                 # arithmetic expression
    "x",                     # name lookup
]

for expr in rejected_inputs:
    try:
        ast.literal_eval(expr)
        print(f"  '{expr}' -> accepted (unexpected!)")
    except (ValueError, SyntaxError) as e:
        print(f"  '{expr}' -> rejected: {type(e).__name__}")

In [None]:
import ast

# Practical use: parsing configuration strings safely
config_strings: dict[str, str] = {
    "ports": "[8080, 8443, 9090]",
    "settings": "{'debug': True, 'verbose': False}",
    "max_retries": "5",
}

for key, value_str in config_strings.items():
    parsed = ast.literal_eval(value_str)
    print(f"{key}: {parsed} (type: {type(parsed).__name__})")
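
When the strings come from an untrusted source, it can help to wrap `ast.literal_eval` in a helper that bounds the input size and converts failures into a default value. The sketch below is one possible shape; `parse_literal` and `MAX_LITERAL_LENGTH` are hypothetical names, and the length cap is an arbitrary value picked for the example. The exception list follows the ones the Python documentation names for malformed input.

```python
import ast
from typing import Any

MAX_LITERAL_LENGTH = 10_000  # arbitrary cap chosen for this example


def parse_literal(text: str, default: Any = None) -> Any:
    """Return the literal value in text, or default if it cannot be evaluated."""
    if len(text) > MAX_LITERAL_LENGTH:
        return default
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        return default


print(parse_literal("[8080, 8443]"))
print(parse_literal("__import__('os')", default="<rejected>"))
print(parse_literal("not valid python (", default="<rejected>"))
```

Note that a default of `None` cannot be told apart from a successfully parsed `"None"`; callers that care about the difference could pass a sentinel object as the default.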

## Section 6: Building AST Nodes Programmatically

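You can create AST nodes directly by calling their constructors. This is useful for code generation or AST transformation. Hand-built nodes carry no source positions unless you supply them; `ast.fix_missing_locations` fills in any missing `lineno` and `col_offset` values, which `compile()` requires.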

In [None]:
import ast

# Build an AST node for "x = 42" manually
assign_node: ast.Assign = ast.Assign(
    targets=[ast.Name(id="x", ctx=ast.Store())],
    value=ast.Constant(value=42),
    lineno=1,
    col_offset=0,
)

# Wrap in a Module
module: ast.Module = ast.Module(body=[assign_node], type_ignores=[])
ast.fix_missing_locations(module)

print("Hand-built AST:")
print(ast.dump(module, indent=2))

# Compare with parsed version
parsed: ast.Module = ast.parse("x = 42")
print(f"\nDumps match: {ast.dump(module) == ast.dump(parsed)}")
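
A hand-built tree can also be compiled and executed, which is a quick way to confirm it is well formed. The sketch below rebuilds the same `x = 42` module, runs it in an empty namespace, and converts it back to source with `ast.unparse` (available in Python 3.9 and later). The filename `"<hand-built>"` is an arbitrary label.

```python
import ast

assign_node = ast.Assign(
    targets=[ast.Name(id="x", ctx=ast.Store())],
    value=ast.Constant(value=42),
)
module = ast.Module(body=[assign_node], type_ignores=[])

# compile() requires position info on every node; this fills in the gaps
ast.fix_missing_locations(module)

code = compile(module, filename="<hand-built>", mode="exec")
namespace: dict[str, object] = {}
exec(code, namespace)
print(f"x = {namespace['x']}")

# ast.unparse (Python 3.9+) turns the tree back into source text
print(ast.unparse(module))
```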

## Summary

### Parsing and Inspection
- **`ast.parse(source)`**: Returns an `ast.Module` node from a source code string
- **`ast.dump(tree, indent=N)`**: Shows the full AST structure as a readable string
- **`tree.body`**: List of top-level statement nodes in a `Module`

### Core Node Types
- **`Module`**: Root node containing a list of statements
- **`Assign`**: Variable assignment with `.targets` and `.value`
- **`FunctionDef`**: Function definition with `.name`, `.args`, `.body`
- **`Expr`**: Expression used as a statement, wrapping a `.value`
- **`BinOp`**: Binary operation with `.left`, `.op`, `.right`
- **`Name`**: Variable reference with `.id` and `.ctx`
- **`Constant`**: Literal value with `.value`

### Node Introspection
- **`node._fields`**: Tuple of child field names
- **`node._attributes`**: Tuple of metadata attributes (lineno, col_offset, etc.)

### Safe Evaluation
- **`ast.literal_eval(string)`**: Safely parses literal values (ints, lists, dicts, etc.)
- Rejects function calls, variable references, and any non-literal expression
- Preferred over `eval()` when processing untrusted input