# What Are Notebooks?

This tutorial introduces the core ideas behind interactive notebooks and how to use them effectively in data workflows.

## Why Notebooks Exist

Notebooks combine executable code, narrative text, visualizations, and output in one document. They let you explore data, document findings, and share reproducible workflows in a single place.

## Key Building Blocks

- **Code cells** run snippets of code and display their results inline.
- **Markdown cells** capture explanations, notes, and headings using Markdown formatting.
- **Output** includes tables, text, charts, or rich media generated by the code.
- **The kernel** is the language runtime (for example, Python or Spark) that executes code cells and keeps their variables in memory between runs.
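
To make these building blocks concrete, the sketch below builds a tiny notebook as a plain Python dictionary. It assumes the Jupyter `.ipynb` JSON layout (nbformat 4); other notebook platforms may store things differently, and the cell contents and the `count_cells` helper are hypothetical.

```python
import json

# Hypothetical minimal notebook, assuming the Jupyter nbformat 4 layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {
        # The kernelspec names the kernel that should execute the code cells.
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        },
    },
    "cells": [
        {
            # Markdown cell: narrative text only, never executed.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Order summary\n", "Totals by region."],
        },
        {
            # Code cell: source plus the output captured when it last ran.
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(2 + 3)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["5\n"]},
            ],
        },
    ],
}


def count_cells(nb):
    """Return how many cells of each type the notebook dict contains."""
    counts = {}
    for cell in nb["cells"]:
        counts[cell["cell_type"]] = counts.get(cell["cell_type"], 0) + 1
    return counts


print(count_cells(notebook))  # {'markdown': 1, 'code': 1}

# The whole document is ordinary JSON, so it survives a round trip.
assert json.loads(json.dumps(notebook)) == notebook
```

Because output is stored next to the code that produced it, a saved notebook can be read without rerunning anything.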

## Typical Notebook Workflow

1. Start a kernel for your language (Python, PySpark, SQL, etc.).
2. Add a code cell to experiment with data or prototype logic.
3. Interleave Markdown cells to explain decisions and record observations.
4. Rerun cells as the notebook evolves. Execution order matters, because each cell runs against whatever state earlier executions left in the kernel.
5. Share the notebook or export to formats such as HTML or PDF for wider audiences.
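
Why execution order matters can be sketched without a notebook at all. The example below is a simplified model, not how a real kernel is implemented: a shared dictionary stands in for the kernel's memory, and each string stands in for a code cell. The cell contents and the `run_cells` helper are hypothetical.

```python
# Each string stands in for one code cell (hypothetical cell contents).
cells = [
    "orders = [120, 80, 100]",        # cell 1: load data
    "total = sum(orders)",            # cell 2: depends on cell 1
    "average = total / len(orders)",  # cell 3: depends on cells 1 and 2
]


def run_cells(order, namespace=None):
    """Run cells in the given order against one shared namespace.

    The namespace dict plays the role of the kernel's memory. Returns the
    namespace and the first NameError message, or None if every cell ran.
    """
    namespace = {} if namespace is None else namespace
    for index in order:
        try:
            exec(cells[index], namespace)
        except NameError as err:
            return namespace, str(err)
    return namespace, None


# Fresh "kernel", cells run bottom-up: cell 3 needs names that do not exist yet.
_, error = run_cells([2, 1, 0])
print("out-of-order run failed:", error)

# Fresh "kernel", cells run top to bottom: every dependency is satisfied.
state, error = run_cells([0, 1, 2])
print("in-order run average:", state["average"])
```

Passing an existing `namespace` back into `run_cells` models rerunning cells without restarting: names defined by earlier runs are still there, which is how a notebook can appear to work while depending on a cell that has since been deleted or edited.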

```python
# Example: aggregating data with PySpark
from pathlib import Path
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName('NotebookIntro').getOrCreate()
# If your environment already has a SparkSession named `spark`, getOrCreate will reuse it.

repo_root = Path.cwd()
if (repo_root / 'notebooks').exists():
    data_path = repo_root / 'notebooks' / 'data' / 'orders_demo.csv'
else:
    data_path = Path('..') / 'data' / 'orders_demo.csv'

df = (
    spark.read
    .option('header', True)
    .option('inferSchema', True)
    .csv(str(data_path))
)

summary = (
    df.groupBy('region')
      .agg(
          F.sum('orders').alias('total_orders'),
          F.avg('orders').alias('avg_orders'),
      )
      .orderBy('region')
)

summary.show()
```

## Tips for Effective Notebooks

- Restart the kernel and rerun all cells before sharing to ensure reproducibility.
- Keep cells focused: each should accomplish one clear, discrete task.
- Use headings, lists, and links to make the story easy to follow.
- Version notebooks alongside code to track changes over time.
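
One common way to keep versioned notebooks easy to diff is to strip outputs and execution counts before committing, since both change on every run. This is only one approach, not something the tips above require. The sketch assumes the Jupyter `.ipynb` JSON layout (nbformat 4); `strip_outputs` and the sample notebook are hypothetical.

```python
import copy


def strip_outputs(nb):
    """Return a copy of a notebook dict with code-cell outputs removed.

    Assumes the nbformat 4 layout, where each code cell carries an
    "outputs" list and an "execution_count". The input is left unchanged.
    """
    cleaned = copy.deepcopy(nb)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical notebook with one executed code cell and one Markdown cell.
executed = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "source": ["print('hello')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hello\n"]},
            ],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["Notes."]},
    ],
}

cleaned = strip_outputs(executed)
print(cleaned["cells"][0]["outputs"])          # []
print(cleaned["cells"][0]["execution_count"])  # None
```

To apply this to a file, load it with `json.load`, pass the result through `strip_outputs`, and write it back with `json.dump`. The trade-off is that readers of the repository no longer see results without running the notebook.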

## Exercises

- Create a new Markdown cell that summarizes the purpose of a recent analysis, then add a code cell underneath that prints a friendly greeting.
- Experiment with cell execution order: run the bottom half of the notebook first, then restart the kernel and rerun everything top-to-bottom to observe the difference.
- Export the notebook to HTML or Markdown and review how the narrative and outputs render for teammates who do not have a notebook environment.
