PythonStAn
PythonStAn is a Python static analysis framework designed for advanced program analysis and research.
It provides infrastructure for performing several kinds of static analysis on Python programs. The supported analysis domains include:
- Dataflow Analysis: Liveness analysis, reaching definition analysis, and def-use chains
- Pointer Analysis: k-CFA based pointer analysis with configurable context sensitivity
- Control Flow Analysis: CFG generation, interprocedural control flow graphs (ICFG)
- Abstract Interpretation: Analysis over configurable abstract domains
- Scope Analysis: Module and function scope management
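To make the dataflow entry above concrete, here is a minimal, self-contained sketch of liveness analysis on a toy control flow graph. It does not use PythonStAn's API. The `Block` layout (defs, uses, successors) and the `liveness` function are assumptions made purely for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy CFG: block name -> (defs, uses, successors).
# "uses" means variables read before being written inside the block.
Block = Tuple[Set[str], Set[str], List[str]]


def liveness(cfg: Dict[str, Block]) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Backward may-analysis: iterate until the live sets stop changing."""
    live_in: Dict[str, Set[str]] = {name: set() for name in cfg}
    live_out: Dict[str, Set[str]] = {name: set() for name in cfg}
    changed = True
    while changed:
        changed = False
        for name, (defs, uses, succs) in cfg.items():
            # live-out is the union of live-in over all successors
            out: Set[str] = set()
            for succ in succs:
                out |= live_in[succ]
            # live-in = uses, plus whatever is live afterwards and not redefined
            new_in = uses | (out - defs)
            if out != live_out[name] or new_in != live_in[name]:
                live_out[name], live_in[name] = out, new_in
                changed = True
    return live_in, live_out


cfg: Dict[str, Block] = {
    "entry": ({"x"}, set(), ["loop"]),          # x = 0
    "loop": ({"y"}, {"x"}, ["body", "exit"]),   # y = x < 10
    "body": ({"x"}, {"x"}, ["loop"]),           # x = x + 1
    "exit": (set(), {"x"}, []),                 # return x
}
live_in, live_out = liveness(cfg)
```

In this toy graph `x` is live across the loop, while `y` is never read and therefore never becomes live.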
The PythonStAn framework follows a layered architecture with clear separation of concerns:
pythonstan/world/
├── world.py # Global singleton managing analysis environment
├── pipeline.py # Main analysis execution pipeline
├── config.py # Configuration management
├── scope_manager.py # Module and scope management
├── namespace.py # Namespace resolution
└── import_manager.py # Import dependency tracking
The World class serves as a central coordinator, maintaining:
- Scope and module management
- Namespace resolution
- Class hierarchy information
- Import dependency tracking
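The "global singleton" role described above can be pictured with a small sketch. `ToyWorld`, `add_scope`, and `add_import` are hypothetical names chosen for illustration; the real `World` class in `world.py` may be structured quite differently.

```python
class ToyWorld:
    """Illustrative singleton coordinator; not PythonStAn's actual World class."""

    _instance = None

    def __new__(cls):
        # Create the shared instance on first use, then keep returning it.
        if cls._instance is None:
            inst = super().__new__(cls)
            inst.scopes = {}    # qualified name -> scope kind
            inst.imports = {}   # importing module -> set of imported modules
            cls._instance = inst
        return cls._instance

    def add_scope(self, qualname: str, kind: str) -> None:
        self.scopes[qualname] = kind

    def add_import(self, importer: str, imported: str) -> None:
        self.imports.setdefault(importer, set()).add(imported)


world = ToyWorld()
world.add_scope("pkg.mod", "module")
world.add_scope("pkg.mod.f", "function")
world.add_import("pkg.mod", "os")
```

Because every call to `ToyWorld()` returns the same object, any analysis can look up scopes and imports without the state being passed around explicitly.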
pythonstan/analysis/
├── analysis.py # Base analysis interfaces and configuration
├── dataflow/ # Dataflow analysis implementations
├── pointer/ # Pointer analysis with k-CFA
├── ai/ # Abstract interpretation framework
├── scope/ # Scope and closure analysis
└── transform/ # IR transformation pipeline
Analysis Types:
- Transform: IR transformations (AST → Three-address → CFG → SSA)
- Dataflow Analysis: Traditional dataflow frameworks (liveness, reaching definitions)
- Pointer Analysis: Context-sensitive pointer analysis using k-CFA
- Abstract Interpretation: Configurable abstract domains for program properties
pythonstan/ir/
├── ir_statements.py # IR statement definitions
└── ir_visitor.py # Visitor pattern for IR traversal
IR Pipeline:
- AST → Parse Python source code
- Three-Address Code → Normalize expressions and control flow
- CFG → Build control flow graphs
- SSA → Static single assignment form (TODO)
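As a rough illustration of the AST to three-address step, the sketch below lowers nested arithmetic into one-operation-per-line statements using only the standard `ast` module. The class name `ThreeAddressLowering`, the `$tN` temporary naming, and the textual output format are assumptions; PythonStAn's own IR statements are defined in `ir_statements.py` and will differ.

```python
import ast
from itertools import count
from typing import List


class ThreeAddressLowering(ast.NodeVisitor):
    """Toy lowering: handles names, constants, and +, -, *, / only."""

    OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def __init__(self) -> None:
        self.code: List[str] = []
        self._ids = count()

    def lower(self, node: ast.expr) -> str:
        """Return a name or literal holding the value of `node`."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in self.OPS:
            left = self.lower(node.left)
            right = self.lower(node.right)
            tmp = f"$t{next(self._ids)}"  # fresh temporary for this operation
            self.code.append(f"{tmp} = {left} {self.OPS[type(node.op)]} {right}")
            return tmp
        raise NotImplementedError(type(node).__name__)

    def visit_Assign(self, node: ast.Assign) -> None:
        value = self.lower(node.value)
        for target in node.targets:
            if not isinstance(target, ast.Name):
                raise NotImplementedError("only simple name targets are handled")
            self.code.append(f"{target.id} = {value}")


def lower_source(source: str) -> List[str]:
    lowering = ThreeAddressLowering()
    lowering.visit(ast.parse(source))
    return lowering.code


lowered = lower_source("a = b + c * 2")
# lowered == ["$t0 = c * 2", "$t1 = b + $t0", "a = $t1"]
```

Each emitted line has at most one operator, which is what makes the later CFG and dataflow stages simpler to write.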
The pipeline orchestrates analysis execution:
- Module Discovery: Parse entry file and discover dependencies
- IR Generation: Transform source to intermediate representations
- Graph Construction: Build CFG, call graphs, and ICFG
- Analysis Execution: Run configured analyses in dependency order
- Result Collection: Aggregate and store analysis results
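Running analyses "in dependency order" amounts to a topological sort over the analysis entries. The sketch below shows one way to do this with the standard library's `graphlib` (Python 3.9+), using the `name` and `prev_analysis` keys that appear in the configuration dictionary for `scripts/do_pipeline.py`. The function `order_analyses` and the sample entries other than `liveness` are hypothetical; this is not necessarily how the pipeline schedules analyses internally.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def order_analyses(analysis_configs: List[Dict]) -> List[str]:
    """Return analysis names so that every prerequisite comes first."""
    # Map each analysis to the set of analyses it depends on.
    graph = {
        cfg["name"]: set(cfg.get("prev_analysis", []))
        for cfg in analysis_configs
    }
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical configuration entries, trimmed to the two relevant keys.
sample = [
    {"name": "liveness", "prev_analysis": ["cfg"]},
    {"name": "cfg", "prev_analysis": ["three address"]},
    {"name": "three address", "prev_analysis": []},
]
order = order_analyses(sample)
```

For the sample above the only valid order is `three address`, `cfg`, `liveness`.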
# Install dependencies
poetry install
# Or using pip
pip install -r requirements.txt
scripts/do_pipeline.py is the main entry point for running customizable analysis pipelines.
Basic Usage:
python scripts/do_pipeline.py
Configuration: The script uses a configuration dictionary to specify:
CONFIG = {
"filename": "/path/to/your/file.py",
"project_path": "/path/to/project/root",
"library_paths": [
"/usr/lib/python3.9",
"/usr/lib/python3.9/site-packages"
],
"analysis": [
{
"name": "liveness",
"id": "LivenessAnalysis",
"description": "liveness analysis",
"prev_analysis": ["cfg"],
"options": {
"type": "dataflow analysis",
"ir": "ssa"
}
}
]
}
Available Analysis Types:
- "dataflow analysis": Liveness, reaching definitions
- "pointer analysis": k-CFA pointer analysis
- "transform": IR transformations
- "inter-procedure": Interprocedural analysis
scripts/do_pa.py is a specialized script that demonstrates the k-CFA pointer analysis.
Basic Usage:
python scripts/do_pa.py
Features:
- Multiple test cases (interprocedural, OOP, higher-order functions)
- Configurable k-CFA parameters
- Comprehensive result reporting
- Field sensitivity options
Test Cases Included:
- Simple Interprocedural: Basic function calls and data flow
- Complex OOP: Class hierarchies, method calls, object interactions
- Higher-Order Functions: Closures, function composition, callbacks
- Complex Containers: Nested data structures, container operations
- Exception Handling: Try/except blocks, error propagation
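One common reading of the `k` in k-CFA is the number of most recent call sites kept as the analysis context. The sketch below illustrates that idea only; `push_call_site` and the call-site labels are hypothetical and not taken from PythonStAn's pointer analysis.

```python
from typing import Tuple

# A context is the tuple of the most recent call-site labels.
Context = Tuple[str, ...]


def push_call_site(ctx: Context, call_site: str, k: int) -> Context:
    """Append a call site and keep only the k most recent ones."""
    if k == 0:
        # Context-insensitive: every call shares the empty context.
        return ()
    return (ctx + (call_site,))[-k:]


# Two calls to the same helper from different call sites.
ctx_a = push_call_site((), "main:10", k=1)
ctx_b = push_call_site((), "main:11", k=1)

# With k = 0 the two calls are merged into one context.
merged_a = push_call_site((), "main:10", k=0)
merged_b = push_call_site((), "main:11", k=0)

# With k = 2 the oldest call site is dropped once the limit is reached.
deep = push_call_site(("main:10", "f:3"), "g:7", k=2)
```

A larger `k` keeps more calls apart, which tends to improve precision at the cost of analyzing each function under more contexts.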
Configuration Options:
"options": {
"type": "pointer analysis",
"k": 2, # Context sensitivity depth
"obj_depth": 2, # Object allocation depth
"field_sensitivity": "attr", # Field sensitivity mode
"verbose": True
}
Analysis results are accessible through:
# `pipeline` is the pipeline object that ran the analyses
world = pipeline.get_world()
scopes = world.scope_manager.get_scopes()
# `scope` is assumed to be one entry of `scopes`
ir_forms = world.scope_manager.get_ir(scope, "three address form")
results = pipeline.analysis_manager.get_results("analysis_name")
Results typically include:
- Points-to information: Variable → object mappings
- Call graph data: Function call relationships
- Dataflow results: Live variables, reaching definitions
- Statistics: Analysis performance metrics
This project is licensed under the terms specified in the LICENSE file.