# Databricks Performance Optimization

## Spark Architecture

![spark-introduction](images/spark-introduction.png)
![spark-introduction-application-executing](images/spark-introduction-application-executing.png)
![spark-introduction-application-architecture](images/spark-introduction-application-architecture.png)
![spark-introduction-scenario-1](images/spark-introduction-scenario-1.png)
![spark-introduction-scenario-1-cont](images/spark-introduction-scenario-1-cont.png)
![spark-introduction-scenario-1-stage1](images/spark-introduction-scenario-1-stage1.png)
![spark-introduction-scenario-1-stage2](images/spark-introduction-scenario-1-stage2.png)
![spark-introduction-cost-optimization](images/spark-introduction-cost-optimization.png)

## Designing the Foundation

![spark-foundation-fundamentals](images/spark-foundation-fundamentals.png)
![spark-foundation-common-performance-bottleneck](images/spark-foundation-common-peformance-bottleneck.png)

### File Explosion
![spark-file-explosion](images/spark-file-explosion.png)
![spark-file-explosion-partition-write](images/spark-file-explosion-partition-write.png)
![spark-file-explosion-completed-queries](images/spark-file-explosion-completed-queries.png)

### Data Skipping
![spark-data-skipping](images/spark-data-skipping.png)
![spark-data-skipping-io-pruning](images/spark-data-skipping-io-pruning.png)
![spark-data-skipping-z-ordering](images/spark-data-skipping-z-ordering.png)
![spark-data-skipping-delta-lake-status](images/spark-data-skipping-delta-lake-status.png)
![spark-data-skipping-about-partitioning](images/spark-data-skipping-about-partitioning.png)
![spark-data-skipping-challenges-partitioning](images/spark-data-skipping-challanges-partitioning.png)
![spark-data-skipping-liquid-clustering](images/spark-data-skipping-liquid-clustering.png)
![spark-data-skipping-liquid-clustering-explain](images/spark-data-skipping-liquid-clustering-explain.png)
![spark-data-skipping-table-statistics](images/spark-data-skipping-table-statistics.png)
![spark-data-skipping-predictive-optimization](images/spark-data-skipping-predictive-optimization.png)
![spark-data-skipping-predictive-optimization-features](images/spark-data-skipping-predictive-optimization-features.png)


## Code Optimization

![spark-code-optimization](images/spark-code-optimization.png)
![spark-code-optimization-common-problems](images/spark-code-optimization-common-problems.png)

### Skew
![spark-code-skew](images/spark-code-skew.png)
![spark-code-skew-before-after](images/spark-code-skew-before-after.png)
![spark-code-handling-data-skew](images/spark-code-handling-data-skew.png)
![spark-code-skew-mitigation](images/spark-code-skew-mitigation.png)
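
One widely used skew mitigation is key salting. The sketch below is a pure-Python simulation, not Spark code, and it is an assumption that the slides above cover this technique. The data set, the `SALT_BUCKETS` value, and the helper name `largest_key_group` are hypothetical. Because all rows that share a key land in the same shuffle partition, the largest key group is a lower bound on the work of the slowest task.

```python
from collections import Counter

SALT_BUCKETS = 4  # hypothetical: how many sub-keys each original key is split into


def largest_key_group(keys):
    """Return the row count of the most frequent key."""
    return Counter(keys).most_common(1)[0][1]


# Hypothetical skewed data: one hot key holds 9,000 of 10,000 rows.
keys = ["hot"] * 9000 + [f"k{i % 100}" for i in range(1000)]

# Salting: append a small, evenly cycling suffix so one hot key
# becomes SALT_BUCKETS distinct keys that can be processed in parallel.
salted_keys = [f"{key}_{i % SALT_BUCKETS}" for i, key in enumerate(keys)]

before = largest_key_group(keys)
after = largest_key_group(salted_keys)
print(f"largest key group before salting: {before}")
print(f"largest key group after salting:  {after}")
```

In a real join the other side has to be replicated once per salt value so that every salted key still finds its match, which trades extra data volume for better balance. On recent Spark versions, Adaptive Query Execution can also split skewed join partitions automatically, so manual salting is usually a fallback.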

### Shuffle
![spark-code-shuffle](images/spark-code-shuffle.png)
![spark-code-shuffle-side-effect](images/spark-code-shuffle-side-effect.png)
![spark-code-shuffles-at-glance](images/spark-code-shuffles-at-glance.png)
![spark-code-shuffles-mitigation](images/spark-code-shuffles-mitigation.png)
![spark-code-shuffles-code-generate-data](images/spark-code-shuffles-code-generate-data.png)
![spark-code-shuffles-code-generate-data-2](images/spark-code-shuffles-code-generate-data-2.png)
![spark-code-shuffles-code-create-with-shuffles](images/spark-code-shuffles-code-create-with-shuffles.png)
![spark-code-shuffles-code-create-with-shuffles-jobs](images/spark-code-shuffles-code-create-with-shuffles-jobs.png)
![spark-code-shuffles-code-broadcast-join](images/spark-code-shuffles-code-broadcast-join.png)
![spark-code-shuffles-code-broadcast-join-jobs](images/spark-code-shuffles-code-broadcast-join-jobs.png)
![spark-code-shuffles-code-aggregations](images/spark-code-shuffles-code-aggregations.png)
![spark-code-shuffles-code-aggregations-jobs](images/spark-code-shuffles-code-aggregations-jobs.png)
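
The broadcast join referenced in the slides above avoids shuffling the large side of a join by copying the small side to every task. The following is a minimal pure-Python sketch of that idea, not Spark's implementation. The tables and the function name `broadcast_hash_join` are hypothetical.

```python
# Large "fact" side, already split into partitions with no particular key layout.
fact_partitions = [
    [(1, "order-a"), (2, "order-b")],
    [(3, "order-c"), (1, "order-d")],
    [(2, "order-e"), (4, "order-f")],
]

# Small "dimension" side that fits comfortably in memory.
dim_rows = [(1, "DE"), (2, "US"), (3, "JP")]


def broadcast_hash_join(partitions, small_rows):
    """Inner-join each partition against a lookup table built from the small side."""
    lookup = dict(small_rows)  # built once, then shared with every task
    joined = []
    for part in partitions:  # each loop iteration stands in for one task
        # Rows stay in their original partition: no data is exchanged between tasks.
        joined.append([(k, v, lookup[k]) for k, v in part if k in lookup])
    return joined


result = broadcast_hash_join(fact_partitions, dim_rows)
for i, part in enumerate(result):
    print(i, part)
```

In PySpark the equivalent hint is `pyspark.sql.functions.broadcast(small_df)` on the small side of the join, and Spark picks this strategy on its own when the small side is below `spark.sql.autoBroadcastJoinThreshold`.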

### Spill
![spark-code-spill](images/spark-code-spill.png)
![spark-code-spill-ram-to-disk-to-ram](images/spark-code-spill-ram-to-disk-to-ram.png)
![spark-code-spill-examples](images/spark-code-spill-examples.png)
![spark-code-spill-memory-disk](images/spark-code-spill-memory-disk.png)
![spark-code-spill-mitigations](images/spark-code-spill-mitigations.png)


### Serialization
![spark-code-serialization](images/spark-code-serialization.png)
![spark-code-serialization-problems](images/spark-code-serialization-problems.png)
![spark-code-serialization-problems-mitigation](images/spark-code-serialization-problems-mitigation.png)
![spark-code-serialization-code-generate-data](images/spark-code-serialization-code-generate-data.png)
![spark-code-serialization-code-computationally-expensive-python](images/spark-code-serialization-code-computationally-expensive-python.png)
![spark-code-serialization-code-computationally-expensive-python-add-partition](images/spark-code-serialization-code-computationally-expensive-python-add-partition.png)
![spark-code-serialization-code-computationally-expensive-python-add-partition-physical-plan](images/spark-code-serialization-code-computationally-expensive-python-add-partition-physical-plan.png)
![spark-code-serialization-code-sql-less-serialization](images/spark-code-serialization-code-sql-less-serialization.png)
![spark-code-serialization-code-sql-execute](images/spark-code-serialization-code-sql-execute.png)
![spark-code-serialization-code-sql-physical-plan](images/spark-code-serialization-code-sql-physical-plan.png)

## Fine-Tuning

### Choosing the Right Cluster

![spark-code-fine-tuning-right-cluster](images/spark-code-fine-tunning-right-cluster.png)
![spark-code-fine-tuning-right-cluster-types](images/spark-code-fine-tunning-right-cluster-types.png)
![spark-code-fine-tuning-right-cluster-autoscaling](images/spark-code-fine-tunning-right-cluster-autoscalling.png)
![spark-code-fine-tuning-right-cluster-spot-instances](images/spark-code-fine-tunning-right-cluster-spot-instances.png)
![spark-code-fine-tuning-right-cluster-Photon](images/spark-code-fine-tunning-right-cluster-Photon.png)
![spark-code-fine-tuning-right-cluster-optimization](images/spark-code-fine-tunning-right-cluster-optimization.png)

### Pick the Best Instance Type
![spark-code-find-best-instance-type](images/spark-code-find-best-instance-type.png)
![spark-code-find-best-instance-type-picking-machines](images/spark-code-find-best-instance-type-picking-machines.png)
![spark-code-find-best-instance-type-rules](images/spark-code-find-best-instance-type-rules.png)
![spark-code-find-best-instance-type-rules-machine](images/spark-code-find-best-instance-type-rules-machine.png)
![spark-code-find-best-instance-type-what-care](images/spark-code-find-best-instance-type-what-care.png)
![spark-code-find-best-instance-type-driver](images/spark-code-find-best-instance-type-driver.png)
![spark-code-find-best-instance-type-spot-market](images/spark-code-find-best-instance-type-spot-market.png)
![spark-code-find-best-instance-type-ifttt-step-1](images/spark-code-find-best-instance-type-ifttt-step-1.png)
![spark-code-find-best-instance-type-ifttt-step-2](images/spark-code-find-best-instance-type-ifttt-step-2.png)
![spark-code-find-best-instance-type-ifttt-step-3](images/spark-code-find-best-instance-type-ifttt-step-3.png)
![spark-code-find-best-instance-type-ifttt-step-4](images/spark-code-find-best-instance-type-ifttt-step-4.png)
![spark-code-find-best-instance-type-ifttt-step-5](images/spark-code-find-best-instance-type-ifttt-step-5.png)
![spark-code-find-best-instance-type-shuffle-partitions](images/spark-code-find-best-instance-type-shaffle-partitions.png)
![spark-code-find-best-instance-type-event-log](images/spark-code-find-best-instance-type-event-log.png)
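
For the shuffle-partition slide above, a rough way to pick a value for `spark.sql.shuffle.partitions` is to divide the shuffle size by a target partition size and round up to a multiple of the available cores. The sketch below is a rule of thumb under stated assumptions, not a Databricks formula. The 128 MiB target and the function name `suggest_shuffle_partitions` are assumptions and do not come from the slides.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def suggest_shuffle_partitions(shuffle_bytes, total_cores,
                               target_partition_bytes=128 * MIB):
    """Suggest a partition count that is a whole multiple of the core count."""
    # Partitions needed so that each one holds roughly the target size.
    by_size = math.ceil(shuffle_bytes / target_partition_bytes)
    # Round up to full "waves" so no core sits idle in the last wave.
    waves = max(1, math.ceil(by_size / total_cores))
    return waves * total_cores


# Hypothetical stage: 50 GiB of shuffle data on a cluster with 64 cores.
suggested = suggest_shuffle_partitions(50 * GIB, total_cores=64)
print(suggested)
```

The shuffle size for a stage can be read from the Spark UI. With Adaptive Query Execution enabled, Spark can coalesce small shuffle partitions at runtime, so a figure like this serves mainly as a starting point.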

