ssanjaychandra123/data-engineering-patterns

Data Engineering Patterns

Field notes from real data platform engagements. Microsoft Fabric, Azure Databricks, PySpark, and SQL. Free.

Every pattern names the assumption that quietly breaks production, explains what is actually happening under the hood, and gives you the decision to make instead.

Whether you are sitting an interview, stabilizing a live platform, or designing your first lakehouse, these are the patterns that separate a pipeline that runs from one you can trust.

One repository for everything data engineering: the single home for all of my data engineering material. It covers Microsoft Fabric, Azure Databricks, PySpark, and SQL today, with more platforms, tools, and patterns added over time. Star the repo and check back as it grows.


Microsoft Fabric Patterns

Microsoft Fabric Engineering Patterns

250 patterns spanning Pipelines, Lakehouse, Warehouse, Power BI, and the architecture decisions that tie them together.

| # | Topic | Download |
|---|-------|----------|
| 1 | Pipelines and Data Factory | Download PDF |
| 2 | Lakehouse and PySpark | Download PDF |
| 3 | Warehouse and SQL | Download PDF |
| 4 | Power BI in Fabric | Download PDF |
| 5 | Architecture Patterns | Download PDF |

Azure Databricks Patterns

Azure Databricks Engineering Patterns

350 patterns covering the full platform, from clusters and Delta Lake through Unity Catalog governance to the architecture and cost choices that keep it sustainable.

| # | Topic | Download |
|---|-------|----------|
| 1 | Clusters and Compute | Download PDF |
| 2 | Delta Lake | Download PDF |
| 3 | Workflows and Orchestration | Download PDF |
| 4 | Structured Streaming and Auto Loader | Download PDF |
| 5 | Unity Catalog | Download PDF |
| 6 | Databricks SQL and Photon | Download PDF |
| 7 | Platform and Cost Architecture | Download PDF |

PySpark

The PySpark Handbook

88 concepts across 112 pages. A practical, concepts-first companion for engineers writing production Spark across Fabric and Databricks.

| Title | Download |
|-------|----------|
| The PySpark Handbook for Fabric and Databricks | Download PDF |

SQL

The SQL Handbook

150+ patterns across 61 pages. A practical SQL companion that puts T-SQL on Fabric Warehouse and Spark SQL on Databricks side by side, covering dialect differences, Delta Lake, performance tuning, security, and the platform-specific gotchas that catch every engineer at least once.

| Title | Download |
|-------|----------|
| The SQL Handbook for Fabric and Databricks | Download PDF |

Found a gap or something that has moved on? Open an issue. These platforms evolve quickly and I keep the material current.


Compiled with Claude by Anthropic as a writing and research assistant. Every pattern was reviewed, edited, and validated by Sanjay Chandra.


Sanjay Chandra builds and advises on enterprise data platforms. If your team is wrestling with one of these problems at scale, let's talk.

LinkedIn · ssanjaychandra.com

About

600+ free patterns and concepts for data engineers on Azure Databricks, Microsoft Fabric, and PySpark. 12 books covering the full stack. Free forever.
