Client-Agnostic Framework for SAS Code Intelligence and Modernization
A MicroZahn Product (Software Division of STAT X1, Inc.)
Initial development: 2025–2026
S2X is a proprietary framework developed by MicroZahn, the software division of STAT X1, Inc., prior to and independent of any client engagements.
This repository provides a high-level overview and selected non-sensitive materials for demonstration purposes only.
The core engine, automation logic, and advanced capabilities are not included and remain confidential intellectual property.
S2X is a code intelligence and transformation-readiness framework designed to analyze legacy SAS programs and generate structured, auditable Source-to-Target (S2T) mappings — without requiring access to underlying data.
S2X is designed to operate at scale across hundreds to thousands of SAS programs in enterprise environments.
The system enables organizations to:
- Understand complex SAS ETL environments
- Extract business logic and transformation rules
- Prepare for SAS-to-Python or platform modernization
- Accelerate documentation and lineage efforts
S2X is available as part of consulting engagements, assessment initiatives, and enterprise modernization programs.
What traditionally required weeks or months of manual analysis can now be completed in hours.
- No PHI, PII, or production data required
- Operates on sanitized SAS programs
- Variable-level lineage
- Transformation logic extraction
- Rule traceability back to source code
- Handles complex macro-driven SAS logic
- Preserves true execution behavior in analysis
- Identifies datasets across DATA steps, PROC SQL, and ETL flows
- Supports impact analysis and dependency mapping
- Prepares SAS environments for Python, SQL, or cloud migration
- Enables accurate scoping before redevelopment begins
- Input: SAS code (DATA step, PROC SQL, macro-based pipelines)
- Phase 1: Parsing and normalization
- Phase 2: S2T (Source-to-Target) mapping generation
- Phase 3 (Planned): Automated Python and SQL ETL generation
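
As a rough illustration of what dataset identification and dependency mapping involve, the sketch below uses regular expressions to find the datasets written and read by DATA steps and PROC SQL, then walks the resulting map for impact analysis. It is not the S2X engine, which is not included in this repository. The function names, patterns, and sample program are all hypothetical, and real SAS syntax (dataset options, multiple outputs, macro-generated names) needs a proper parser rather than regular expressions.

```python
import re
from collections import defaultdict

# Deliberately simplified patterns (assumptions for illustration only).
DATA_STEP = re.compile(r"\bdata\s+([\w.]+)\s*;(.*?)\brun\s*;", re.I | re.S)
SET_MERGE = re.compile(r"\b(?:set|merge)\s+([^;]+);", re.I)
SQL_CREATE = re.compile(r"\bcreate\s+table\s+([\w.]+)\s+as\s+(.*?);", re.I | re.S)
SQL_SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.I)


def dataset_dependencies(sas_code: str) -> dict[str, set[str]]:
    """Map each output dataset to the set of datasets it reads from."""
    deps = defaultdict(set)
    for target, body in DATA_STEP.findall(sas_code):
        for sources in SET_MERGE.findall(body):
            deps[target.lower()].update(s.lower() for s in sources.split())
    for target, query in SQL_CREATE.findall(sas_code):
        deps[target.lower()].update(s.lower() for s in SQL_SOURCE.findall(query))
    return dict(deps)


def downstream_of(dataset: str, deps: dict[str, set[str]]) -> set[str]:
    """Impact analysis: datasets that directly or indirectly read `dataset`."""
    impacted, frontier = set(), {dataset.lower()}
    while frontier:
        # Next frontier: targets reading anything in the current frontier,
        # excluding ones already visited so cycles cannot loop forever.
        frontier = {t for t, srcs in deps.items() if srcs & frontier} - impacted
        impacted |= frontier
    return impacted


# Hypothetical two-step program: a DATA step feeding a PROC SQL join.
sample = """
data work.stage; set raw.transactions; amount_usd = amount * fx_rate; run;
proc sql;
  create table work.clean_transactions as
  select s.*, c.region from work.stage s left join ref.customers c
  on s.cust_id = c.cust_id;
quit;
"""
deps = dataset_dependencies(sample)
```

Here `downstream_of("raw.transactions", deps)` returns both `work.stage` and `work.clean_transactions`, which is the kind of answer an impact analysis needs before a source table is changed.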
S2X operates as a structured transformation pipeline:
Original SAS Code
↓
Normalized & Prepared Code
↓
Dataset Identification
↓
Macro-Aware Expansion
↓
Flattened Logical Representation
↓
Transformation Analysis Engine
↓
S2T Mapping Outputs (Excel / Structured Data)
Intermediate stages normalize and expand SAS logic into a structured, analyzable representation.
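
One way to picture independent stages that each hand a cleaner representation to the next is a list of text-to-text functions applied in order. The sketch below is an assumption made for illustration, not the proprietary pipeline: the stage names are hypothetical, only `/* ... */` comments are stripped, and macro handling is limited to global `%let` variables with no regard for definition order or macro scope.

```python
import re
from functools import reduce
from typing import Callable

Stage = Callable[[str], str]


def strip_comments(code: str) -> str:
    """Remove /* ... */ block comments ('* ...;' comments are not handled)."""
    return re.sub(r"/\*.*?\*/", "", code, flags=re.S)


def normalize_whitespace(code: str) -> str:
    """Collapse runs of whitespace so later stages see one canonical layout."""
    return re.sub(r"\s+", " ", code).strip()


def expand_let_variables(code: str) -> str:
    """Resolve &name references using values from %let statements."""
    values: dict[str, str] = {}

    def record(match: re.Match) -> str:
        values[match.group(1).lower()] = match.group(2).strip()
        return ""  # drop the %let statement once its value is captured

    def resolve(match: re.Match) -> str:
        # Unknown references are left untouched rather than guessed.
        return values.get(match.group(1).lower(), match.group(0))

    code = re.sub(r"%let\s+(\w+)\s*=\s*([^;]*);\s*", record, code, flags=re.I)
    # A single trailing dot terminates the reference, as in &lib..table.
    return re.sub(r"&(\w+)\.?", resolve, code)


def run_pipeline(code: str, stages: list[Stage]) -> str:
    """Apply each stage to the output of the previous one."""
    return reduce(lambda text, stage: stage(text), stages, code)


# Hypothetical input program.
sample = """
/* monthly load */
%let lib = work;
%let yr  = 2024;
data &lib..sales_&yr.;
  set raw.sales;
run;
"""
flattened = run_pipeline(
    sample, [strip_comments, normalize_whitespace, expand_let_variables]
)
print(flattened)  # data work.sales_2024; set raw.sales; run;
```

Because every stage has the same signature, a stage can be tested, replaced, or reordered without touching the others.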
Three real-world SAS-to-Python examples are available in the examples/ folder. Each example includes:
- SAS Code (with standard input dataset)
- Source-to-Target (S2T) Mapping – full variable lineage
- Python / Pandas Equivalent – production-ready code
| Example | Technique | Folder |
|---|---|---|
| 1 | DATA Step (name parsing, calculations, derived fields) | examples/data-step/ |
| 2 | PROC SQL (joins, aggregations, conditional logic) | examples/proc-sql/ |
| 3 | %MACRO + %INCLUDE (reusable modular ETL) | examples/macro-include/ |
All examples use the same input dataset and produce the same target output structure (work.clean_transactions), which makes side-by-side comparison easy.
Click any folder above to explore the full SAS → S2T → Python transformation.
S2X produces structured outputs such as:
- Source-to-Target mapping tables
- Variable-level transformation logic
- Input-to-output dataset relationships
- Rule-level traceability (IF / WHERE / CASE logic)
These outputs are designed for:
- Data engineers
- Business analysts
- Modernization teams
- Audit and compliance stakeholders
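
To make the shape of such outputs concrete, the sketch below models one S2T mapping row as a small record and writes rows out as CSV, standing in for the Excel or structured-data outputs mentioned earlier. The column names, the sample rows, and the file name `load_transactions.sas` are hypothetical; the actual S2X output schema is not published in this repository.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class S2TRow:
    """One variable-level mapping (hypothetical column set)."""
    target_dataset: str
    target_variable: str
    source_dataset: str
    source_variables: str  # comma-separated contributing inputs
    transformation: str    # expression as written in the SAS source
    rule_type: str         # e.g. ASSIGN, IF, WHERE, CASE
    source_file: str
    source_line: int       # line in the SAS program, for traceability


def to_csv(rows: list[S2TRow]) -> str:
    """Serialize mapping rows with a header row, one line per mapping."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(S2TRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(asdict(row) for row in rows)
    return buffer.getvalue()


# Illustrative rows only; not taken from a real program.
rows = [
    S2TRow("work.clean_transactions", "amount_usd", "raw.transactions",
           "amount,fx_rate", "amount * fx_rate", "ASSIGN",
           "load_transactions.sas", 12),
    S2TRow("work.clean_transactions", "size_band", "raw.transactions",
           "amount", "if amount > 1000 then 'LARGE'; else 'STANDARD'", "IF",
           "load_transactions.sas", 14),
]
output = to_csv(rows)
```

Keeping the source file and line on every row is what lets a reviewer trace a target variable back to the exact statement that produced it.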
- No embedded schemas, data, or business rules
- Portable across industries and environments
- Operates entirely on code structure
- Safe for restricted or regulated environments
- Every transformation traceable to source code
- Supports audit and validation workflows
- Independent processing stages
- Extensible to additional languages and platforms
Planned extensions include automated generation of Python and SQL-based ETL pipelines from S2T mappings.
- Multi-language code intelligence (SAS, SQL, ETL frameworks)
- Cross-platform lineage extraction
- Unified transformation mapping across data ecosystems
- SAS to Python migration planning
- Legacy ETL reverse engineering
- Rapid S2T documentation generation
- Data lineage and governance initiatives
- Pre-modernization system assessment
- docs/: Process documentation and architecture
- examples/: Sample inputs and outputs (sanitized) (coming soon)
- snippets/: Illustrative code patterns (non-proprietary)
This repository presents a high-level overview of the S2X framework, including workflow, architecture, and representative outputs.
The full engine, automation capabilities, and advanced implementation remain proprietary intellectual property of STAT X1, Inc.
📧 MikeKrizan@MicroZahn.com
📱 (612) 594-2954
🔗 LinkedIn
📄 Master’s Data Science Capstone – U.S. Health Insurance Cost Trends (2000–2023)
📄 Capstone Report Available → View/Download Full 75-page PDF
📍 Sioux Falls, SD
Education
M.S., Data Science – Utica University (May 2025)
B.S., Agricultural Studies (Economics/Statistics) – Iowa State University
This repository is provided for informational and demonstration purposes. All underlying methodologies, frameworks, and implementation logic are the intellectual property of STAT X1, Inc.