Hands-on Jupyter Notebook tutorials covering setup, core features, and best practices, while serving as a sandbox to learn micromamba, Git, and Make.

elecdot/jupyter-tutorials

Jupyter Notebook Learning Project

A comprehensive, beginner-friendly repository for learning notebook programming with Jupyter, VS Code, and WSL2 (Ubuntu 24.04). This project emphasizes clean, repeatable workflows and best practices from day one.

🎯 Project Goals

  • Learn Fundamentals: Master notebook workflows for data exploration, visualization, and basic machine learning
  • Build Clean Structure: Develop organized, modular project architecture avoiding messy outputs
  • Practice Best Habits: Version control, environment management, and code refactoring from day one
  • Mentor-Guided Learning: Use AI assistance for clear explanations and best practice guidance

📁 Current Project Structure

This project is set up with a complete development environment:

nb/
├── README.md        # This file - comprehensive project documentation
├── CODEX_INIT.md    # AI mentor instructions and learning context
├── Makefile         # Automation for common tasks (env, jupyter, cleaning)
├── env.yaml         # Micromamba environment specification
├── .gitignore       # Comprehensive ignore patterns for clean Git history
├── notebooks/       # Jupyter notebooks for tutorials
├── src/             # Reusable Python modules and functions
├── data/            # Datasets (ignored by Git, use .gitkeep)
└── outputs/         # Generated plots, reports, results
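
As a rough illustration of the layout above, the following Python sketch creates the four working directories and places a .gitkeep file in data/ so that Git can track the otherwise empty folder. It is not the recipe behind make init, which this README does not show; the function name init_project is hypothetical.

```python
from pathlib import Path

# Directory names taken from the project tree above.
PROJECT_DIRS = ("notebooks", "src", "data", "outputs")


def init_project(root: str = ".") -> list[Path]:
    """Create the project folders under `root` and return their paths."""
    created = []
    for name in PROJECT_DIRS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run repeatedly
        if name == "data":
            # An empty placeholder so the ignored data/ folder still exists in Git.
            (folder / ".gitkeep").touch()
        created.append(folder)
    return created
```

Calling init_project() from the repository root would produce the same folder skeleton as the tree, assuming none of the folders need special permissions.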

🚀 Quick Start Guide

Initial Setup (One-time)

Run these commands in order:

  1. Initialize project folders:

    make init

  2. Create the environment:

    make env-create

  3. Install Jupyter kernel:

    make kernel

  4. Verify everything works:

    make check

Daily Workflow

Start your learning session:

make lab           # Launch JupyterLab
# or
make notebook      # Launch classic Jupyter Notebook

Before committing work:

make clear-outputs # Clear notebook outputs
git add .
git commit -m "Add: tutorial on data visualization"
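
To show what clearing outputs means at the file level, here is a minimal sketch that strips outputs and execution counts from a notebook using only the standard library. It assumes the nbformat 4 JSON layout (a top-level "cells" list); the function names are hypothetical, and the actual recipe behind make clear-outputs is not shown in this README. A widely used alternative is jupyter nbconvert --clear-output --inplace.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Strip outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: str) -> None:
    """Rewrite the .ipynb file at `path` with its outputs removed."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    clear_outputs(notebook)
    nb_path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
```

Markdown cells are left untouched, so only generated content disappears from the diff.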

Environment Management with Makefile

Your Makefile provides these automated tasks:

| Command | Purpose |
| --- | --- |
| make or make help | Show all available commands |
| make init | Create project folders and basic .gitignore |
| make env-create | Create micromamba environment from env.yaml |
| make env-update | Update environment when you add packages |
| make env-remove | Remove the environment completely |
| make env-export | Export environment spec for reproducibility |
| make kernel | Install/refresh Jupyter kernel |
| make lab | Launch JupyterLab |
| make notebook | Launch classic Jupyter |
| make clear-outputs | Clear all notebook outputs |
| make clean | Remove caches and temporary files |
| make freeze-pip | Export pip requirements |
| make check | Verify tools and environment |
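
For a sense of what a tool check such as make check might cover, the sketch below looks up command-line tools on PATH with shutil.which. The tool list and the function name find_missing_tools are assumptions made for illustration; the real target may check more, or differently.

```python
import shutil

# Tools this README relies on; adjust the tuple if your setup differs.
REQUIRED_TOOLS = ("micromamba", "git", "make")


def find_missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from find_missing_tools() means every listed tool was found. A tool that is only defined as a shell function will be reported as missing, because shutil.which looks for executables.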

📚 Learning Workflow

Adding New Packages

Your env.yaml is designed for easy expansion:

# Current setup in env.yaml
name: nbenv
channels:
  - conda-forge
dependencies:
  - python
  - ipykernel
  - jupyter
  - jupyterlab
  # Add new packages here:
  # - pandas
  # - matplotlib
  # - seaborn
  # - scikit-learn
  # - pip:
  #   - some-pip-package

Workflow for adding packages:

  1. Edit env.yaml to add new dependencies
  2. Run make env-update to install them
  3. If needed, run make kernel to refresh the Jupyter kernel
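
After updating the environment, a quick way to confirm that new packages are visible to the running kernel is to test whether their modules can be found. This is a minimal sketch using importlib from the standard library; missing_modules is a hypothetical helper name. Note that import names can differ from package names (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec


def missing_modules(module_names) -> list[str]:
    """Return the top-level module names that cannot be found in this kernel."""
    return [name for name in module_names if find_spec(name) is None]
```

Running missing_modules(["pandas", "matplotlib", "seaborn", "sklearn"]) in a notebook cell returns the names that still need to be added to env.yaml. If a package was installed but is still reported, the notebook may be attached to a different kernel.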

Your Git Workflow

Your .gitignore is configured to keep your repository clean:

What's tracked ✅:

  • Source code (.py files in src/)
  • Notebooks (but outputs are cleared before commit)
  • Documentation and configuration files
  • Environment specifications (env.yaml)

What's ignored ❌:

  • Data files (data/ directory)
  • Generated outputs (outputs/ directory)
  • Python caches (__pycache__/, *.pyc)
  • Jupyter checkpoints (.ipynb_checkpoints/)
  • Environment files (.env, secrets)
  • Editor-specific files (.vscode/, .idea/)

Best practices:

# Before committing notebooks
make clear-outputs

# Commit workflow
git status
git add src/ notebooks/ README.md  # Be selective
git commit -m "Add: data loading utilities"

# Export environment state for reproducibility
make env-export  # Creates mamba-linux-64.lock

📚 Learning Progression & Project Management

Phase 1: Foundation Setup ✅ COMPLETED

  • Project structure established
  • Environment configuration (env.yaml)
  • Automation tools (Makefile)
  • Git configuration (.gitignore)
  • Documentation framework

Phase 2: Basic Notebook Skills 🎯 NEXT STEPS

  • Create first notebook: notebooks/01_getting_started.ipynb
    • Practice markdown cells and code cells
    • Learn about kernel management
    • Understand cell execution order
  • Environment exploration: notebooks/02_environment_setup.ipynb
    • Test package imports
    • Verify micromamba environment
    • Practice using Makefile commands
  • Data basics: notebooks/03_data_fundamentals.ipynb
    • Load sample datasets
    • Basic pandas operations
    • Simple visualizations with matplotlib

Phase 3: Data Science Workflow 🔮 PLANNED

  • Add data science packages (pandas, matplotlib, seaborn, numpy)
  • Create notebooks/04_data_exploration.ipynb
  • Build first src/ module for reusable functions
  • Practice notebook → module refactoring
  • Learn about data versioning (DVC introduction)

Phase 4: Advanced Topics 📈 FUTURE

  • Machine learning basics (scikit-learn)
  • Interactive visualizations (plotly, altair)
  • Notebook testing and quality assurance
  • Documentation generation from notebooks

Quick Commands Reference

Daily development:

make lab                    # Start JupyterLab
make clear-outputs         # Clean notebooks before Git
make clean                # Remove caches

Environment management:

make env-update           # After editing env.yaml
make kernel              # Refresh Jupyter kernel
make check              # Verify everything works

Project maintenance:

make env-export         # Backup environment state
git status             # Check what's changed
git add notebooks/ src/ # Stage specific changes

🛠️ Your Project Tools & Files

Key Files Explained

env.yaml - Your Environment Blueprint

name: nbenv               # Environment name (auto-detected by Makefile)
channels: [conda-forge]   # Package source (fast, up-to-date packages)
dependencies:             # What's installed
  - python                # Latest Python
  - ipykernel             # Jupyter kernel support
  - jupyter               # Classic notebook interface
  - jupyterlab            # Modern notebook interface

Makefile - Your Automation Hub

  • Smart environment detection: Reads env name from env.yaml
  • Micromamba integration: Uses fast package manager
  • Safe shell operations: Configured with error handling
  • Customizable paths: Override default directories (notebooks/, src/, data/)
  • Comprehensive help: Run make help anytime
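
The environment detection described above amounts to reading the top-level name key from env.yaml. How the Makefile does this is not shown here; the sketch below illustrates the idea in Python with a regular expression rather than a YAML parser, so it only handles a simple single-line name entry. The function name read_env_name is hypothetical.

```python
import re
from typing import Optional


def read_env_name(yaml_text: str) -> Optional[str]:
    """Return the value of the top-level `name:` key, or None if it is absent."""
    # Match an unindented `name:` line and capture up to whitespace or a comment.
    match = re.search(r"^name:\s*([^\s#]+)", yaml_text, flags=re.MULTILINE)
    if match is None:
        return None
    return match.group(1).strip("'\"")  # tolerate a quoted name
```

For the env.yaml shown in this README, read_env_name(open("env.yaml").read()) yields nbenv.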

.gitignore - Your Repository Guardian

  • Python-aware: Ignores __pycache__/, *.pyc, virtual envs
  • Jupyter-friendly: Excludes .ipynb_checkpoints/
  • Data-safe: Keeps large datasets out of Git
  • Editor-agnostic: Works with VS Code, PyCharm, vim, etc.
  • Security-conscious: Prevents committing secrets and env files

Workflow Integration

VS Code + Jupyter Setup:

  1. Open project in VS Code
  2. Install Python and Jupyter extensions
  3. Select kernel: Ctrl+Shift+P → "Python: Select Interpreter" → choose nbenv
  4. Create .ipynb files in notebooks/ folder
  5. Use make clear-outputs before Git commits

Command Line Workflow:

# Morning routine
make check                 # Verify environment health
make lab                  # Start JupyterLab

# Development cycle  
# ... work in notebooks ...
make clear-outputs        # Clean outputs
git add notebooks/01_*.ipynb
git commit -m "Add: basic data loading tutorial"

# Environment updates
# ... edit env.yaml to add packages ...
make env-update          # Install new packages
make kernel             # Refresh Jupyter kernel

⚠️ Common Pitfalls to Avoid

  1. Hidden State in Notebooks: Always restart kernel and run all cells to verify reproducibility
  2. Large Datasets in Git: Keep data files covered by .gitignore, and use a data versioning tool such as DVC for large datasets
  3. Environment Mismatch: Pin the package versions you depend on in env.yaml, or capture the resolved versions with make env-export
  4. Messy Notebooks: Regularly clean up, refactor reusable code to src/
  5. No Backups: Commit frequently, especially before major experiments
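
Hidden state (pitfall 1) often leaves a trace in the notebook file: after a clean "restart and run all", the executed code cells are numbered 1, 2, 3, ... from top to bottom. The sketch below checks for that pattern, assuming the nbformat 4 JSON layout; executed_in_order is a hypothetical helper, and it must run before outputs are cleared, because clearing also removes the execution counts.

```python
def executed_in_order(notebook: dict) -> bool:
    """True if executed code cells carry counts 1, 2, 3, ... from top to bottom."""
    counts = [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    # Cells that were never run (or were cleared) have no count and are skipped.
    executed = [count for count in counts if count is not None]
    return executed == list(range(1, len(executed) + 1))
```

A False result does not prove the notebook is broken, only that it was not produced by a single top-to-bottom run.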

📖 Learning Resources & Best Practices

Essential Tutorials (Recommended Order)

  1. Real Python: Jupyter Notebook Introduction - Start here
  2. Jupyter Notebook Beginner Guide - Official docs
  3. Pandas User Guide - Data manipulation
  4. Matplotlib Tutorials - Plotting basics

Notebook Best Practices

  • Structure: Clear markdown headers, single-concept cells, import everything upfront
  • Naming: Use numbered prefixes (01_, 02_) for tutorial sequence
  • Documentation: Explain your thinking in markdown cells
  • Reproducibility: Clear outputs before committing, restart kernel frequently
  • Modularity: Move reusable code to src/ modules when it appears in 2+ notebooks
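
The numbered-prefix naming rule can be checked mechanically. The sketch below assumes a two-digit prefix followed by a lowercase snake_case name, which matches the notebook names used in this README but is stricter than the rule as stated; the pattern and the function name are illustrative.

```python
import re

# Two digits, an underscore, then lowercase letters, digits, or underscores.
NOTEBOOK_NAME = re.compile(r"^\d{2}_[a-z0-9_]+\.ipynb$")


def follows_naming_convention(filename: str) -> bool:
    """True if `filename` looks like 01_getting_started.ipynb."""
    return NOTEBOOK_NAME.match(filename) is not None
```

Applied to the files in notebooks/, this flags names such as Untitled.ipynb that would sort outside the tutorial sequence.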

Code Organization Philosophy

Start in notebooks → Refactor to modules → Import back to notebooks

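# notebooks/01_data_exploration.ipynb
import sys
sys.path.append('../src')  # assumes the kernel's working directory is notebooks/

# data_utils and viz_utils are example modules you would create in src/
from data_utils import load_dataset, clean_data
from viz_utils import create_scatter_plot

# Now your notebook focuses on analysis, not utility code
df = load_dataset('../data/sample.csv')  # data/ is also one level up from notebooks/
clean_df = clean_data(df)
create_scatter_plot(clean_df, 'x', 'y')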
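
A relative path such as '../src' depends on the directory the kernel was started in. As one possible alternative, the sketch below walks upward from a starting directory until it finds a folder that contains src/, then puts that src/ directory on sys.path. The helper name add_src_to_path is hypothetical and is not part of this project.

```python
import sys
from pathlib import Path


def add_src_to_path(start=None):
    """Find the nearest ancestor folder containing src/ and add it to sys.path.

    Returns the src/ path that was found, or None if there is none.
    """
    current = Path(start or Path.cwd()).resolve()
    for folder in (current, *current.parents):
        src_dir = folder / "src"
        if src_dir.is_dir():
            if str(src_dir) not in sys.path:
                sys.path.insert(0, str(src_dir))  # avoid duplicate entries
            return src_dir
    return None
```

Calling add_src_to_path() in the first cell works whether the notebook is opened from the repository root or from notebooks/, as long as src/ exists somewhere above the working directory.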

Advanced Learning Path

  • Data Versioning: DVC for large datasets
  • Notebook Testing: nbval, pytest integration
  • Documentation: Sphinx, jupyter-book
  • Deployment: Voilà for interactive dashboards
  • Collaboration: JupyterHub, Git workflows with notebooks

⚠️ Common Pitfalls & Solutions

| Problem | Why It Happens | Solution |
| --- | --- | --- |
| "Kernel not found" | Jupyter can't see your environment | Run make kernel to install kernel |
| Import errors | Package not in environment | Add to env.yaml, run make env-update |
| Hidden state | Cells run out of order | Restart kernel, run all cells from top |
| Git conflicts | Notebook outputs cause merge issues | Use make clear-outputs before commits |
| Large repo size | Data files tracked by Git | Check .gitignore covers data/ directory |
| Environment drift | Packages installed but not documented | Use make freeze-pip or update env.yaml |
| Permission errors | Makefile shell issues | Check bash path with which bash |

🎯 Project Milestones & Goals

Immediate Next Steps (This Week)

  1. Run make init to create project directories
  2. Create first notebook: Start with notebooks/01_getting_started.ipynb
  3. Test environment: Import basic packages, create simple plots
  4. Practice Git workflow: Make first commit with cleared outputs

Short-term Goals (Next Month)

  • Complete 5 tutorial notebooks covering data basics
  • Create first reusable module in src/
  • Add pandas, matplotlib, seaborn to environment
  • Practice notebook → script → module workflow

Medium-term Goals (Next Quarter)

  • Build a complete data analysis project
  • Learn about data versioning and larger datasets
  • Explore machine learning basics with scikit-learn
  • Set up automated testing for your code

🤝 AI Mentor Integration

This project works with AI assistance (see CODEX_INIT.md). When asking for help:

  • Request clear explanations: "Explain step-by-step how to..."
  • Ask for best practices: "What's the best way to organize..."
  • Get structure guidance: "Should this code go in the notebook or src/?"
  • Learn from warnings: "What could go wrong if I..."
  • Seek pro tips: "What advanced techniques should I know about..."


🎉 Quick Start Checklist

Ready to begin? Follow this checklist:

  • Environment: Run make env-create and make kernel
  • Folders: Run make init to create project structure
  • Test: Run make check to verify everything works
  • Launch: Run make lab to start JupyterLab
  • First notebook: Create notebooks/01_getting_started.ipynb
  • Git setup: Run make clear-outputs, then your first commit

Happy Learning! 🚀

Remember: This setup emphasizes learning by doing with clean, reproducible workflows. Focus on understanding concepts while building good habits from day one. The automation tools are here to help you focus on learning, not fighting with environment setup.
