# PHP-Parser-Py API Demo

This notebook demonstrates the functionality and APIs of `php-parser-py`, a Python wrapper for PHP-Parser that integrates with cpg2py's graph framework.

## Features

- Parse PHP code into Abstract Syntax Trees (AST)
- Query and traverse AST nodes using graph-based queries
- Access PHP-Parser attributes (line numbers, comments, etc.)
- **Rewrite and transform PHP code** (NEW!)
- Generate PHP code from original or modified ASTs (round-trip; formatting is normalized by the pretty printer)
- Type-safe operations with generic support

## Setup

First, let's import the library and set up our environment.

In [1]:
import sys
# Make the in-repo package importable without installing it
sys.path.insert(0, '../src')

from php_parser_py import parse, Parser, PrettyPrinter, AST, Node, Edge

## 1. Basic Parsing

Let's start by parsing a simple PHP function.

In [2]:
# Simple PHP code
php_code = """
<?php
function greet($name) {
    echo "Hello, " . $name;
    return true;
}
"""

# Parse the code
ast = parse(php_code)
print(f"Parsed AST with {len(list(ast.nodes()))} nodes")
print(f"AST type: {type(ast)}")

Parsed AST with 12 nodes
AST type: <class 'php_parser_py._ast.AST'>


## 2. Type-Safe Node Querying

`ast.nodes()` accepts an optional predicate and yields `Node` objects, so query results can be annotated and checked by a type checker.

In [3]:
# Find all nodes
all_nodes = list(ast.nodes())
print(f"Total nodes: {len(all_nodes)}\n")

# Find function nodes (type-safe)
functions = list(ast.nodes(lambda n: n.node_type == "Stmt_Function"))
print(f"Found {len(functions)} function(s)")

# Get the first function
if functions:
    func: Node = functions[0]
    print(f"Function type: {func.node_type}")
    print(f"Function at line: {func.start_line}")

Total nodes: 12

Found 1 function(s)
Function type: Stmt_Function
Function at line: 3
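
What "type-safe" means here can be illustrated without the library. The sketch below is a stand-in, not `php-parser-py`'s implementation: `TinyNode`, `TinyAST`, and their members are hypothetical names chosen for illustration. It shows how a container that is generic over its node type lets a type checker infer the element type of a filtered query.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterator, List, Optional, TypeVar

N = TypeVar("N")


@dataclass
class TinyNode:
    """Hypothetical stand-in for a parsed node."""
    node_type: str
    start_line: int


class TinyAST(Generic[N]):
    """Hypothetical container that is generic over its node type."""

    def __init__(self, nodes: List[N]) -> None:
        self._nodes = nodes

    def nodes(self, predicate: Optional[Callable[[N], bool]] = None) -> Iterator[N]:
        # Yield every node, or only those accepted by the predicate.
        for node in self._nodes:
            if predicate is None or predicate(node):
                yield node

    def first_node(self, predicate: Callable[[N], bool]) -> Optional[N]:
        # Return the first match, or None when nothing matches.
        return next(self.nodes(predicate), None)


tiny_ast: TinyAST[TinyNode] = TinyAST([
    TinyNode("Stmt_Function", 3),
    TinyNode("Stmt_Echo", 4),
    TinyNode("Stmt_Return", 5),
])

# A type checker infers Optional[TinyNode] here, so the None check is required.
found = tiny_ast.first_node(lambda n: n.node_type == "Stmt_Echo")
if found is not None:
    print(found.start_line)
```

Because `first_node` may return `None`, the `if func:` guards used throughout this notebook are what keep the attribute accesses safe.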


## 3. Accessing Node Properties

Nodes support both Pythonic property access and dict-like access.

In [4]:
# Get the function node
func = ast.first_node(lambda n: n.node_type == "Stmt_Function")

if func:
    print("=== Pythonic Property Access ===")
    print(f"Node type: {func.node_type}")
    print(f"Start line: {func.start_line}")
    print(f"End line: {func.end_line}")
    
    print("\n=== Dict-like Access ===")
    print(f"Node type: {func['nodeType']}")
    print(f"By reference: {func['byRef']}")
    print(f"Has 'returnType': {'returnType' in func}")
    
    print("\n=== All Properties ===")
    for key, value in func.all_properties.items():
        if not isinstance(value, (dict, list)):
            print(f"  {key}: {value}")

=== Pythonic Property Access ===
Node type: Stmt_Function
Start line: 3
End line: 6

=== Dict-like Access ===
Node type: Stmt_Function
By reference: False
Has 'returnType': True

=== All Properties ===
  nodeType: Stmt_Function
  startLine: 3
  startTokenPos: 2
  startFilePos: 7
  endLine: 6
  endTokenPos: 25
  endFilePos: 76
  byRef: False
  returnType: None
  namespacedName: None
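
The two access styles differ only in naming: dict keys keep PHP-Parser's camelCase (`startLine`), while properties use snake_case (`start_line`). One way such a mapping could be derived is sketched below. `camel_to_snake` is a hypothetical helper written for this illustration; the notebook does not say how the library performs the conversion.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase key such as 'startLine' to 'start_line'."""
    # Insert an underscore before every uppercase letter except at position 0.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


for key in ("nodeType", "startLine", "endFilePos", "byRef"):
    print(f"{key} -> {camel_to_snake(key)}")
```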


## 4. Graph Traversal

Iterate over every node in the graph and tally how often each node type occurs.

In [5]:
# Find different types of nodes
node_types = {}
for node in ast.nodes():
    node_type = node.node_type
    if node_type:
        node_types[node_type] = node_types.get(node_type, 0) + 1

print("Node types in the AST:")
for node_type, count in sorted(node_types.items()):
    print(f"  {node_type}: {count}")

Node types in the AST:
  Expr_BinaryOp_Concat: 1
  Expr_ConstFetch: 1
  Expr_Variable: 2
  Identifier: 1
  Name: 1
  Param: 1
  Scalar_String: 1
  Stmt_Echo: 1
  Stmt_Function: 1
  Stmt_InlineHTML: 1
  Stmt_Return: 1
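
The manual dictionary tally can also be written with `collections.Counter` from the standard library. In this sketch a plain list stands in for `node.node_type for node in ast.nodes()` so that it runs without the library; the list is an assumption that mirrors the counts printed above, not actual parser output.

```python
from collections import Counter

# Stand-in for the node types of the parsed function; mirrors the output above.
observed_types = [
    "Stmt_InlineHTML", "Stmt_Function", "Identifier", "Param",
    "Expr_Variable", "Stmt_Echo", "Expr_BinaryOp_Concat", "Scalar_String",
    "Expr_Variable", "Stmt_Return", "Expr_ConstFetch", "Name",
]

type_counts = Counter(observed_types)

# most_common() sorts by descending count; ties keep first-seen order.
for node_type, count in type_counts.most_common(3):
    print(f"{node_type}: {count}")
```

`Counter` also returns 0 for missing keys, which avoids the `get(node_type, 0)` idiom.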


## 5. Finding Specific Patterns

Use lambda functions to find specific code patterns.

In [6]:
# Find echo statements
echo_nodes = list(ast.nodes(lambda n: n.node_type == "Stmt_Echo"))
print(f"Found {len(echo_nodes)} echo statement(s)")

# Find variable expressions
var_nodes = list(ast.nodes(lambda n: n.node_type == "Expr_Variable"))
print(f"Found {len(var_nodes)} variable expression(s)")

# Print variable names
print("\nVariable names:")
for var in var_nodes:
    if 'name' in var:
        print(f"  ${var['name']}")

Found 1 echo statement(s)
Found 2 variable expression(s)

Variable names:
  $name
  $name
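
As queries grow, inline lambdas become repetitive. A possible refinement is to build predicates from small reusable pieces. The sketch below assumes only that a node exposes `node_type` and a dict-style `get`, as the cells in this notebook do; `FakeNode`, `has_type`, `has_prop`, and `all_of` are hypothetical names, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class FakeNode:
    """Hypothetical stand-in exposing the two members used in this notebook."""
    node_type: str
    props: Dict[str, Any] = field(default_factory=dict)

    def get(self, key: str, default: Any = None) -> Any:
        return self.props.get(key, default)


Predicate = Callable[[FakeNode], bool]


def has_type(node_type: str) -> Predicate:
    """Match nodes of one node type."""
    return lambda n: n.node_type == node_type


def has_prop(key: str, value: Any) -> Predicate:
    """Match nodes whose property `key` equals `value`."""
    return lambda n: n.get(key) == value


def all_of(*predicates: Predicate) -> Predicate:
    """Match nodes accepted by every given predicate."""
    return lambda n: all(p(n) for p in predicates)


sample = [
    FakeNode("Expr_Variable", {"name": "name"}),
    FakeNode("Expr_Variable", {"name": "other"}),
    FakeNode("Stmt_Echo"),
]

is_name_var = all_of(has_type("Expr_Variable"), has_prop("name", "name"))
matches = [n for n in sample if is_name_var(n)]
print(len(matches))
```

A composed predicate such as `is_name_var` could then be passed wherever this notebook passes a lambda.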


## 6. Code Generation (Round-trip)

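Generate PHP code from the AST. The printer applies its own formatting, so the output preserves the program but not the original layout (compare the brace placement in the output with the input in section 1).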

In [7]:
# Generate PHP code from AST
printer = PrettyPrinter()
generated_code = printer.print(ast)

print("Generated PHP code:")
print("=" * 50)
print(generated_code)
print("=" * 50)

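Generated PHP code:
==================================================

<?php 
function greet($name)
{
    echo "Hello, " . $name;
    return true;
}
==================================================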
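
A quick way to check that a round-trip changed only the layout is to compare the two sources after collapsing whitespace. The sketch below uses the input and output of the cells above, transcribed as string literals; `normalize_whitespace` is a hypothetical helper. It is a crude check under a stated assumption: it also collapses whitespace inside string literals, so it can hide real differences there.

```python
def normalize_whitespace(source: str) -> str:
    """Collapse every run of whitespace to a single space."""
    return " ".join(source.split())


# Input from section 1.
original = '''
<?php
function greet($name) {
    echo "Hello, " . $name;
    return true;
}
'''

# Printer output from this section (brace moved to its own line).
generated = '''
<?php
function greet($name)
{
    echo "Hello, " . $name;
    return true;
}
'''

print("identical text:", original == generated)
print("identical modulo whitespace:",
      normalize_whitespace(original) == normalize_whitespace(generated))
```

A stricter check would parse both sources and compare the resulting node types and properties instead of the text.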


## 7. AST Transformation - Wrapping Variables in Function Calls

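**NEW!** Transform the AST by splicing an `Expr_FuncCall` node between each matching variable and its parent. The transformation works directly on `ast.storage`, adding nodes and rewiring `PARENT_OF` edges.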

Example: `$data` → `sanitize($data)`

In [8]:
# Original code with user input
unsafe_code = """
<?php
echo $userInput;
$result = $userInput . " processed";
"""

ast2 = parse(unsafe_code)
print("Original code:")
print(unsafe_code)

# Transform: wrap all $userInput in sanitize()
def wrap_variable_in_function(ast: AST, var_name: str, func_name: str):
    """Wrap all occurrences of a variable in a function call."""
    # Find all variable nodes
    var_nodes = [
        node for node in ast.nodes()
        if node.node_type == "Expr_Variable" and node.get("name") == var_name
    ]
    
    for var_node in var_nodes:
        # Find parent edge
        parent_edges = [
            e for e in ast.storage.get_edges()
            if e[1] == var_node.id and e[2] == "PARENT_OF"
        ]
        
        if not parent_edges:
            continue
        
        parent_id = parent_edges[0][0]
        # Copy the edge properties (field, index) before the edge is removed below
        edge_props = dict(ast.storage.get_edge_props(parent_edges[0]) or {})
        
        # Create function call nodes
        name_id = f"new_name_{var_node.id}"
        ast.storage.add_node(name_id)
        ast.storage.set_node_props(name_id, {
            "nodeType": "Name",
            "parts": [func_name],
            "startLine": var_node.start_line,
            "endLine": var_node.end_line,
        })
        
        arg_id = f"new_arg_{var_node.id}"
        ast.storage.add_node(arg_id)
        ast.storage.set_node_props(arg_id, {
            "nodeType": "Arg",
            "name": None,
            "byRef": False,
            "unpack": False,
            "startLine": var_node.start_line,
            "endLine": var_node.end_line,
        })
        
        funccall_id = f"new_funccall_{var_node.id}"
        ast.storage.add_node(funccall_id)
        ast.storage.set_node_props(funccall_id, {
            "nodeType": "Expr_FuncCall",
            "startLine": var_node.start_line,
            "endLine": var_node.end_line,
        })
        
        # Connect nodes
        ast.storage.add_edge((funccall_id, name_id, "PARENT_OF"))
        ast.storage.set_edge_props((funccall_id, name_id, "PARENT_OF"), {"field": "name"})
        
        ast.storage.add_edge((funccall_id, arg_id, "PARENT_OF"))
        ast.storage.set_edge_props((funccall_id, arg_id, "PARENT_OF"), {"field": "args", "index": 0})
        
        ast.storage.add_edge((arg_id, var_node.id, "PARENT_OF"))
        ast.storage.set_edge_props((arg_id, var_node.id, "PARENT_OF"), {"field": "value"})
        
        # Replace parent reference
        ast.storage.remove_edge((parent_id, var_node.id, "PARENT_OF"))
        ast.storage.add_edge((parent_id, funccall_id, "PARENT_OF"))
        ast.storage.set_edge_props((parent_id, funccall_id, "PARENT_OF"), edge_props)

# Apply transformation
wrap_variable_in_function(ast2, "userInput", "sanitize")

# Generate transformed code
printer = PrettyPrinter()
transformed_code = printer.print(ast2)

print("\nTransformed code (with sanitization):")
print(transformed_code)

Original code:

<?php
echo $userInput;
$result = $userInput . " processed";


Transformed code (with sanitization):

<?php 
echo sanitize($userInput);
$result = sanitize($userInput) . " processed";
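
The core of the transformation is an edge rewiring: `parent → variable` becomes `parent → call → arg → variable`, and the call inherits the slot (`field`, `index`) that the variable used to occupy. The sketch below isolates that step on plain dictionaries. The `nodes` and `edges` dicts and `wrap_in_call` are hypothetical stand-ins for the storage layer, not the library's API.

```python
from typing import Any, Dict, Tuple

EdgeKey = Tuple[str, str, str]  # (parent id, child id, edge type)

# Hypothetical in-memory stand-ins for the node and edge stores.
nodes: Dict[str, Dict[str, Any]] = {
    "echo": {"nodeType": "Stmt_Echo"},
    "var": {"nodeType": "Expr_Variable", "name": "userInput"},
}
edges: Dict[EdgeKey, Dict[str, Any]] = {
    ("echo", "var", "PARENT_OF"): {"field": "exprs", "index": 0},
}


def wrap_in_call(var_id: str, func_name: str) -> str:
    """Splice an Expr_FuncCall between var_id and its parent; return the call id."""
    # Locate the single incoming PARENT_OF edge of the variable.
    parent_key = next(k for k in edges if k[1] == var_id and k[2] == "PARENT_OF")
    # Detach the variable, keeping the slot description (field, index).
    slot = edges.pop(parent_key)
    parent_id = parent_key[0]

    call_id, name_id, arg_id = f"call_{var_id}", f"name_{var_id}", f"arg_{var_id}"
    nodes[call_id] = {"nodeType": "Expr_FuncCall"}
    nodes[name_id] = {"nodeType": "Name", "parts": [func_name]}
    nodes[arg_id] = {"nodeType": "Arg", "byRef": False, "unpack": False}

    # The call takes over the variable's old slot under the parent.
    edges[(parent_id, call_id, "PARENT_OF")] = slot
    edges[(call_id, name_id, "PARENT_OF")] = {"field": "name"}
    edges[(call_id, arg_id, "PARENT_OF")] = {"field": "args", "index": 0}
    edges[(arg_id, var_id, "PARENT_OF")] = {"field": "value"}
    return call_id


new_call = wrap_in_call("var", "sanitize")
for (src, dst, _), props in edges.items():
    print(f"{src} -> {dst} {props}")
```

Popping the old edge before adding the new ones keeps the variable with exactly one parent at every step, which is the invariant a tree-shaped AST depends on.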


## 8. AST Transformation - Modifying String Values

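Modify a scalar value by reading the node's properties from storage, changing them, and writing them back. Both `value` and `rawValue` are updated so that the two stay consistent.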

In [9]:
# Code with a string
code_with_string = '<?php echo "Hello";'
ast3 = parse(code_with_string)

print("Original code:")
print(code_with_string)

# Find and modify the string
string_node = ast3.first_node(lambda n: n.node_type == "Scalar_String")
if string_node:
    print(f"\nOriginal string value: {string_node['value']}")
    
    # Modify the string
    props = ast3.storage.get_node_props(string_node.id)
    props['value'] = 'World'
    props['rawValue'] = '"World"'
    ast3.storage.set_node_props(string_node.id, props)
    
    print(f"Modified string value: {string_node['value']}")

# Generate code (reuses the `printer` instance created in section 7)
modified_code = printer.print(ast3)
print("\nModified code:")
print(modified_code)

Original code:
<?php echo "Hello";

Original string value: Hello
Modified string value: World

Modified code:
<?php

echo "World";
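
Writing `rawValue` by hand is only safe for simple strings such as `World`. In a PHP double-quoted literal, a backslash, a double quote, and a dollar sign must be escaped. The sketch below derives both properties from one Python string. `php_double_quoted` and `string_props` are hypothetical helpers, and it is an assumption that keeping `rawValue` in sync is sufficient for the printer; control characters are left as-is.

```python
def php_double_quoted(value: str) -> str:
    """Return value as a PHP double-quoted string literal."""
    escaped = (
        value.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace('"', '\\"')         # a bare quote would end the literal
        .replace("$", "\\$")         # a bare dollar would start interpolation
    )
    return f'"{escaped}"'


def string_props(value: str) -> dict:
    """Build matching 'value' and 'rawValue' entries for a Scalar_String."""
    return {"value": value, "rawValue": php_double_quoted(value)}


print(string_props("World"))
print(php_double_quoted('Price: $5 "net"'))
```

The returned dict could be merged into the properties fetched with `get_node_props` before they are written back.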


## 9. Working with Complex PHP Code

Parse and query a class with methods.

In [10]:
complex_php = """
<?php
class User {
    private $name;
    private $email;
    
    public function __construct($name, $email) {
        $this->name = $name;
        $this->email = $email;
    }
    
    public function getName() {
        return $this->name;
    }
}
"""

ast4 = parse(complex_php)

# Find class nodes
classes = list(ast4.nodes(lambda n: n.node_type == "Stmt_Class"))
print(f"Found {len(classes)} class(es)")

if classes:
    cls = classes[0]
    # Note: class name is in a child Identifier node
    print(f"\nClass at lines {cls.start_line}-{cls.end_line}")

# Find methods
methods = list(ast4.nodes(lambda n: n.node_type == "Stmt_ClassMethod"))
print(f"\nFound {len(methods)} method(s):")
for method in methods:
    # Method name is in child Identifier node
    print(f"  - Method at lines {method.start_line}-{method.end_line}")

Found 1 class(es)

Class at lines 3-15

Found 2 method(s):
  - Method at lines 7-10
  - Method at lines 12-14
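
The cell prints line ranges only, because a class or method name lives in a child `Identifier` node and not on the declaration itself. Resolving the name means following the edge whose `field` is `name`. The sketch below shows that lookup on stand-in data: `node_props`, `child_edges`, the node ids, and `declared_name` are all hypothetical, and it assumes the identifier text is stored under the `name` key, as in PHP-Parser's `Identifier` node.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical stand-in data; ids and layout are illustrative only.
node_props: Dict[str, Dict[str, Any]] = {
    "m1": {"nodeType": "Stmt_ClassMethod"},
    "m1_name": {"nodeType": "Identifier", "name": "__construct"},
    "m2": {"nodeType": "Stmt_ClassMethod"},
    "m2_name": {"nodeType": "Identifier", "name": "getName"},
}
# Each edge: (parent id, child id, field the child occupies in the parent).
child_edges: List[Tuple[str, str, str]] = [
    ("m1", "m1_name", "name"),
    ("m2", "m2_name", "name"),
]


def declared_name(node_id: str) -> Optional[str]:
    """Return the identifier text of a declaration, or None if it has none."""
    for parent, child, field_name in child_edges:
        if parent == node_id and field_name == "name":
            child_node = node_props[child]
            if child_node.get("nodeType") == "Identifier":
                return child_node.get("name")
    return None


for method_id in ("m1", "m2"):
    print(declared_name(method_id))
```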


## 10. Edge Traversal

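Navigate parent-child relationships using edges. This cell reuses `ast`, the `greet` function parsed in section 1; each edge's `field` property names the slot the child occupies in its parent.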

In [11]:
# Get a function node
func = ast.first_node(lambda n: n.node_type == "Stmt_Function")

if func:
    # Find child nodes using edges
    print(f"Function node: {func.id}")
    print("\nChild nodes:")
    
    for edge in ast.edges(lambda e: e.from_nid == func.id):
        child = ast.node(edge.to_nid)
        if child:
            field = edge.get("field", "unknown")
            print(f"  - {field}: {child.node_type} (line {child.start_line})")

Function node: node_1

Child nodes:
  - name: Identifier (line 3)
  - params: Param (line 3)
  - stmts: Stmt_Echo (line 4)
  - stmts: Stmt_Return (line 5)
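
Filtering the full edge list once per parent is fine for a small tree. For repeated lookups it can help to group children by `field` and order them by `index`, the two edge properties used in section 7. The sketch below does this on stand-in tuples; `raw_edges`, the node ids, and `children_by_field` are hypothetical and do not reflect the library's storage format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical stand-in edges: (parent id, child id, field, index within field).
raw_edges: List[Tuple[str, str, str, int]] = [
    ("node_1", "node_7", "stmts", 1),
    ("node_1", "node_2", "name", 0),
    ("node_1", "node_5", "stmts", 0),
    ("node_1", "node_3", "params", 0),
]


def children_by_field(parent_id: str) -> Dict[str, List[str]]:
    """Group a node's children by field, ordered by index within each field."""
    grouped: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for parent, child, field_name, index in raw_edges:
        if parent == parent_id:
            grouped[field_name].append((index, child))
    # Sort each group by index so list-valued fields keep source order.
    return {
        name: [child for _, child in sorted(pairs)]
        for name, pairs in grouped.items()
    }


print(children_by_field("node_1"))
```

Sorting by `index` matters for list-valued fields such as `stmts`, where the order of children is the order of statements.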


## Summary

This notebook demonstrated:

1. ✅ **Parsing**: Convert PHP code to AST
2. ✅ **Type-Safe Querying**: Find nodes with generic type support
3. ✅ **Properties**: Access node data via properties or dict-like syntax
4. ✅ **Traversal**: Navigate the AST structure
5. ✅ **Attributes**: Access PHP-Parser metadata (lines, positions)
6. ✅ **Code Generation**: Round-trip from AST back to PHP, with formatting normalized by the printer
7. ✅ **AST Transformation**: Modify and rewrite PHP code
8. ✅ **Edge Traversal**: Navigate parent-child relationships

### Key Features

- **Type-Safe**: Generic support with `AST[Node, Edge]`
- **Dynamic**: No hardcoded node types - all types from PHP-Parser
- **Pythonic**: Properties with snake_case naming
- **Flexible**: Both property and dict-like access
- **Powerful**: Full AST transformation capabilities
- **Complete**: Full access to PHP-Parser's features
- **Graph-based**: Powered by cpg2py for advanced queries

For more information, see the [README](../README.md) and [design documentation](design.md).