<a target="_blank" href="https://colab.research.google.com/github/lukebarousse/Int_SQL_Data_Analytics_Course/blob/main/0_Intro/2_Colab_Notebooks.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Running SQL in Colab Notebooks

## Overview
1. [Intro: Jupyter Notebooks & Colab](#intro-jupyter-notebooks--colab)
2. [Manipulating a Colab Notebook](#manipulating-a-colab-notebook)
3. [Markdown Basics](#markdown-basics)
4. [How to Run SQL Queries](#how-to-run-sql-queries)

---
## Intro: Jupyter Notebooks & Colab

### 📌 What Is a Jupyter Notebook?
- ⚙️ **Multi-Language**: Run Python, R, and SQL in one environment  
- 📝 **Live Code**: Execute commands on the spot and view instant results  
- 🏗️ **Integrated Docs**: Combine markdown explanations with code cells  

### ☁️ What Is Google Colab?
- 💰 **Free Access**: Cloud-based resources at no cost  
- 🌐 **No Installs**: Work instantly from any browser  
- 🔗 **Easy Sharing**: Collaborate like a Google Doc  

### 🎯 Why Use Colab for SQL?
- ⚡ **Fast Setup**: Avoid installing databases or editors  
- 🖥️ **All-In-One**: Write queries and see outputs side by side  
- 📝 **Rich Notes**: Keep course instructions next to your code  

---

## Manipulating a Colab Notebook

### Create a Colab Notebook:
#### Option 1: Start from Colab  
1. 🔗 Go to [colab.research.google.com](https://colab.research.google.com)  
2. ➕ Click **"New Notebook"**  

#### Option 2: Start from Google Drive  
1. 🔗 Go to [drive.google.com](https://drive.google.com)  
2. ➕ Click **"+ New"** (top-left corner)  
3. 🔽 Select **"More"**, then **"Google Colaboratory"**  

**⚠️ NOTE:** You must be logged into Google to use Colab.

### Adding and moving cells

Add new cells by using the **+ CODE** and **+ TEXT** buttons that appear when you hover between cells.

You can move a cell by selecting it and clicking **Cell Up** or **Cell Down** in the top toolbar.

### Markdown Cell vs. Python Cell

#### Markdown Example
This is a markdown cell.


#### Python Example

In [1]:
"This is a python cell"

'This is a python cell'

### How to Run a Cell

*Easy Way:* Click the run button.

*Pro Way:* With the cell selected, press **Cmd/Ctrl+Enter**.

*Other Pro Tips:*
* Press **Shift+Enter** to run the cell and move the selection to the next cell
* Press **Alt+Enter** to run the cell and insert a new code cell below it


In [2]:
2 + 2

4

### Cell Basics

A cell prints its result directly below it.

In [3]:
2 + 2

4

#### Make a Comment

What if you want to include a note in your code?  
- Put a `#` at the start of the line
- This is called a comment

A comment doesn't execute anything; the Python interpreter ignores it when running your code.

In [4]:
# This is a comment

print("What's Up, Data Nerds")

What's Up, Data Nerds


---
## Markdown Basics

Markdown is a lightweight markup language that uses characters like # for headings and * for emphasis to format text simply and intuitively.

| Element        | Markdown Syntax                        |
|----------------|----------------------------------------|
| Heading        | `# H1`<br>`## H2`<br>`### H3`          |
| Bold           | `**bold text**`                        |
| Italic         | `*italicized text*`                    |
| Blockquote     | `> blockquote`                         |
| Ordered List   | `1. First item`<br>`2. Second item`<br>`3. Third item` |
| Unordered List | `- First item`<br>`- Second item`<br>`- Third item`   |
| Code           | `` `code` ``                           |
| Horizontal Rule| `---`                                  |
| Link           | `[title](https://www.example.com)`     |
| Image          | `![alt text](image.jpg)`               |

[Here is more info on Markdown](https://www.markdownguide.org/basic-syntax/)

---

## 📌 Menu Walkthrough

### 📂 Sidebar Menu (Left-Top)
- 📖 **Table of Contents** – Navigate through notebook sections  
- 🔍 **Find and Replace** – Search and replace text within the notebook  
- 📊 **Variables** – View and manage variables in the session  
- 🔒 **Secrets** – Store and access sensitive information securely  
- 📁 **Files** – Manage and browse files in the Colab environment  

### ⚙️ Sidebar Menu (Left-Bottom)
- 🧩 **Code Snippets** – Access useful pre-written code snippets  
- 🎛️ **Command Palette** – Quickly access commands and shortcuts  
- 🖥️ **Terminal** – Open a command-line terminal  

### 🎛️ Top Menu Bar
- ⚡ **Runtime** – Controls running cells and restarting the notebook  

### 📊 Colab Header (Top)
- 💾 **RAM & Disk Usage** – Monitor available system resources  
- 🤖 **Colab AI** – AI-powered assistance (may or may not be available)  

### Runtime Deep Dive

**Run All:**
- ⚙️ Executes all notebook cells in sequence  

**Interrupt Execution:**
- 🛑 Stops the execution of the current cell  

**Restart Session:**
- 🔄 Resets the notebook's kernel, clearing all variables and state left over from executed code  
- ✨ Nothing runs afterward until you manually run cells again  

**Disconnect and Delete Runtime:**
- ❌ Stops the notebook, releases resources, and deletes any temporary files created during the session

---

## How to Run SQL Queries

We'll be running our SQL queries directly in our Jupyter Notebooks.

### Connecting to Database

Before writing any SQL queries, you **must** run the code block below.

> **Note:** Understanding the code in the next cell is neither important nor required. All it does is:
> - Sets up a connection to our database
> - Installs necessary tools to run SQL
> - Configures some settings to make our results look nice

In [5]:
import sys
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# If running in Google Colab, install PostgreSQL and restore the database
if 'google.colab' in sys.modules:
    # Install PostgreSQL
    !sudo apt-get install postgresql -qq > /dev/null 2>&1

    # Start PostgreSQL service (suppress output)
    !sudo service postgresql start > /dev/null 2>&1

    # Set password for the 'postgres' user to avoid authentication errors (suppress output)
    !sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'password';" > /dev/null 2>&1

    # Create the 'contoso_100k' database (suppress output)
    !sudo -u postgres psql -c "CREATE DATABASE contoso_100k;" > /dev/null 2>&1

    # Download the PostgreSQL .sql dump
    !wget -q -O contoso_100k.sql https://github.com/lukebarousse/Int_SQL_Data_Analytics_Course/releases/download/v.0.0.0/contoso_100k.sql

    # Restore the dump file into the PostgreSQL database (suppress output)
    !sudo -u postgres psql contoso_100k < contoso_100k.sql > /dev/null 2>&1

    # Switch from ipython-sql to jupysql
    !pip uninstall -y ipython-sql > /dev/null 2>&1
    !pip install jupysql > /dev/null 2>&1

# Load the sql extension for SQL magic
%load_ext sql

# Connect to the PostgreSQL database
%sql postgresql://postgres:password@localhost:5432/contoso_100k

# Enable automatic conversion of SQL results to pandas DataFrames
%config SqlMagic.autopandas = True

# Disable named parameters for SQL magic
%config SqlMagic.named_parameters = "disabled"

# Display pandas floats with two decimal places
pd.options.display.float_format = '{:.2f}'.format
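
The last line of the setup cell hands pandas an ordinary Python formatting function. As a minimal sketch of what that function does on its own (the sample numbers are made up for illustration):

```python
# The same formatter the setup cell assigns to pd.options.display.float_format
two_decimals = '{:.2f}'.format

# Hypothetical sample values; each float is rendered with exactly two decimals
formatted = [two_decimals(x) for x in (3379301.2, 2411195.64, 4.0)]
print(formatted)  # ['3379301.20', '2411195.64', '4.00']
```

This setting only changes how floats are displayed in DataFrames; the underlying values are not rounded.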

### Magic Commands

The following is a table of common magic commands:

| Symbol | Name           | Example            | Usage Explanation                                                                 |
|--------|----------------|--------------------|-----------------------------------------------------------------------------------|
| `%%`   | Cell Magic     | `%%timeit` | Applies the magic command to the entire cell; here it measures the execution time of all code in the cell. |
| `%`    | Line Magic     | `%timeit` | Applies the magic command to the rest of the current line only; here it measures the execution time of that line. |

In [6]:
%timeit 2 + 2

2.03 ns ± 0.0153 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)


With line magic, code on the next line still runs, but the magic command doesn't apply to it.

In [7]:
%timeit
2 + 2

4

Instead, we use cell magic.

In [8]:
%%timeit

2 + 2

2.05 ns ± 0.0194 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
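
For a sense of what `%timeit` is doing, here is a rough plain-Python sketch using the standard library's `timeit` module. The loop count is picked by hand here (an assumption; the magic command chooses it automatically), so the numbers will not match the output above exactly.

```python
import statistics
import timeit

LOOPS = 100_000  # loops per run, chosen by hand for this sketch
RUNS = 7         # number of runs, matching the "7 runs" in the output above

# Each entry is the total time in seconds for LOOPS executions of the statement
totals = timeit.repeat("2 + 2", repeat=RUNS, number=LOOPS)
per_loop = [t / LOOPS for t in totals]

# Report mean and standard deviation per loop, in nanoseconds
mean_ns = statistics.mean(per_loop) * 1e9
std_ns = statistics.stdev(per_loop) * 1e9
print(f"{mean_ns:.2f} ns ± {std_ns:.2f} ns per loop")
```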


### Common Magic Commands

| Command | Description |
|---------|-------------|
| `%timeit` | Measures the execution time of a single line of code, averaged over many runs |
| `%time` | Measures the execution time of a single run of a line of code (`%%time` times the entire cell) |
| `%load_ext` | Loads an extension into the notebook |
| `%config` | Configures the notebook's settings |
| `%matplotlib` | Sets the matplotlib backend (for example, `inline` shows plots in the notebook) |
| `%sql` | Executes a SQL query |
| `%pwd` | Prints the current working directory |
| `%cd` | Changes the current working directory |
| `%ls` | Lists the files in the current directory |
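
The three file-system commands in the table have close plain-Python counterparts in the standard `os` module. The sketch below is only an illustration of that mapping, not how the magic commands are implemented:

```python
import os

cwd = os.getcwd()                  # roughly what %pwd shows
entries = sorted(os.listdir(cwd))  # roughly what %ls lists
os.chdir(cwd)                      # like %cd <path>; here it "changes" to the same directory

print(cwd)
print(entries[:5])  # first few entries only
```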

### Writing Queries - `%%sql` & `%sql`

| Magic Command | Usage Explanation |
|----------------|-------------------|
| `%%sql` | Runs the entire cell as SQL, so a query can span multiple lines. |
| `%sql` | Runs only the rest of the current line as SQL. |  

> **Note:** We'll be using `%%sql` to write our SQL queries since it applies to the entire cell.

To write a SQL query, create a new code cell with the `%%sql` magic command at the top. Then write your query below it as usual.

```sql
%%sql 

SELECT *
FROM table_name
```

In [9]:
%%sql 

SELECT
    EXTRACT(YEAR FROM orderdate) AS year,
    SUM(netprice) AS total_year_net_revenue
FROM
    sales
GROUP BY
    year
ORDER BY
    year

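|   | year | total_year_net_revenue |
|---|------|------------------------|
| 0 | 2015 | 2411195.64 |
| 1 | 2016 | 3379301.2 |
| 2 | 2017 | 4229458.39 |
| 3 | 2018 | 7950000.89 |
| 4 | 2019 | 9972332.59 |
| 5 | 2020 | 3593779.5 |
| 6 | 2021 | 6742864.28 |
| 7 | 2022 | 13907299.92 |
| 8 | 2023 | 10069875.6 |
| 9 | 2024 | 2561994.76 |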
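
To experiment with the same kind of aggregation outside the course database, here is a minimal sketch using Python's built-in `sqlite3` module instead of PostgreSQL. The three rows are hypothetical stand-ins for the `sales` table, and because SQLite has no `EXTRACT`, the sketch swaps in `strftime('%Y', ...)` to pull out the year.

```python
import sqlite3

# Hypothetical rows standing in for the course's `sales` table
rows = [
    ("2015-03-01", 100.0),
    ("2015-07-15", 50.5),
    ("2016-01-20", 200.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (orderdate TEXT, netprice REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Same shape as the query above: one row per year with summed revenue
query = """
SELECT
    CAST(strftime('%Y', orderdate) AS INTEGER) AS year,
    SUM(netprice) AS total_year_net_revenue
FROM sales
GROUP BY year
ORDER BY year
"""
result = conn.execute(query).fetchall()
conn.close()

print(result)  # [(2015, 150.5), (2016, 200.0)]
```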


### More Details on `%%sql` & `%sql` - JupySQL

[JupySQL](https://jupysql.ploomber.io/) is a library that allows you to write SQL queries in Jupyter Notebooks.

Check out [sample_notebook.ipynb](/Resources/sample_notebook.ipynb) for more examples of how to use `%%sql` and `%sql`.