# 📌 How to use this notebook (please read once)

Before you start any exercise:

1. Go to the top menu and click Run → Run all.

![colab_top_bar_run](https://raw.githubusercontent.com/Haross/DB_pics_nt/main/colab_top_bar_run_all.png)

2. Wait about **1 minute** while everything loads.

3. Once everything finishes loading, scroll down to start working with the notebook.

> 👉 **You need to do this once per session (each time you open the notebook).**



## ✍️ Writing and running SQL queries

In this notebook you will see a SQL editor like the one below:

![SQL_editor_run_button](https://raw.githubusercontent.com/Haross/DB_pics_nt/main/colab_sql_gui.png)


Type your SQL queries inside the provided editor.

> Toolbar buttons (in the SQL editor)
>
> * ▶ Run query (blue button): runs your SQL and shows results.
>
> * ⟲ Revert to last saved: restores the last saved query for this exercise.
>
> * 🧹 Clear results: clears the displayed output (does not delete data).
>
> * ⌫ Clear editor: clears the query text (does not delete data or history).


Do not edit regular code cells unless you are explicitly asked to.

### About the validator

If the validator says your query is incorrect but you believe it is correct:

* Don’t panic 🙂

* The validator is a guideline, not a judge

* In rare cases, a correct query may not be recognized


## 🧠 About the SQL environment setup (do not edit)

You will see a section called “SQL Environment Setup (do not edit)”.

* You do not need to understand or modify it.

* This code is required for the notebook to work.

* It runs automatically when you click **Run all**.

> 👉 **You can ignore the SQL Environment Setup section completely.**

## 🔄 If the SQL editor does not let you type

Sometimes the SQL editor may not respond.

If that happens:

1. Click the ▶ Play button at the top-left of that same cell, as shown in the picture below.

![SQL_editor_run_button](https://raw.githubusercontent.com/Haross/DB_pics_nt/main/colab_sql_run_Cell_button.png)

2. The editor will refresh and work normally

## 📂 Viewing your previous queries (optional)

After you submit your first query, new files will appear in the left sidebar:

* `sql_query_latest.csv` → latest query per exercise

* `sql_query_log.csv` → all queries tried for each exercise

> ⚠️ These files do not appear until you run at least one query.

You do not need to open these files unless you are curious.

## 🚫 About hidden code

Some cells hide code and show a “Show code” button.

* Do not open these.

* If you open one by accident:

  1. Click the cell.

  2. Go to View → Show/Hide Code to hide it again.

![SQL_editor_run_button](https://raw.githubusercontent.com/Haross/DB_pics_nt/main/colab_view_show_hide_cells.png)



# **SQL Environment Setup (do not edit)**

In [30]:
# @title
#from google.colab import output
#output.enable_custom_widget_manager()

In [None]:
# @title
%%capture
!pip -q install ipython-sql
!pip -q install ipywidgets
%load_ext sql
%config SqlMagic.feedback=False
%config SqlMagic.autopandas=True
%config SqlMagic.style = '_DEPRECATED_DEFAULT'
DB_FILE = "class.db"
%sql sqlite:///class.db

In [None]:
# @title
%%capture
import os, sqlite3

conn = sqlite3.connect(DB_FILE)
cur = conn.cursor()

#conn.close()

In [None]:
# @title
import hashlib
import pandas as pd


def df_fingerprint(
    df: pd.DataFrame,
    *,
    sort_rows=True,
    sort_cols=False,
    normalize_whitespace=True,
    na_token="<NA>",
) -> tuple[str, dict]:
    """
    Returns (sha256_hex, meta) for df, canonicalized.
    - sort_rows: ignores row order (recommended)
    - sort_cols: ignores column order (optional)
    """
    x = df.copy()

    # Optional column sorting (usually you want to enforce column order, so keep False)
    if sort_cols:
        x = x.reindex(sorted(x.columns), axis=1)

    # Normalize cell values to strings consistently
    def norm(v):
        if pd.isna(v):
            return na_token
        s = str(v)
        if normalize_whitespace:
            s = " ".join(s.split())
        return s

    x = x.map(norm)

    # Sort rows by all columns to remove ordering differences
    if sort_rows and len(x.columns) > 0 and len(x) > 0:
        x = x.sort_values(by=list(x.columns), kind="mergesort").reset_index(drop=True)
    else:
        x = x.reset_index(drop=True)

    payload = x.to_csv(index=False)  # stable serialization
    h = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    meta = {"rows": int(len(df)), "cols": list(df.columns)}
    return h, meta


def make_df_validator_nospoilers(
    expected_hash: str,
    *,
    required_cols=None,
    exact_cols=False, #If True, the result must contain ONLY the required columns (no missing, no extra). Column order is ignored if sort_cols=True.
    expected_rows=None,
    sort_rows=True, # If True, rows are sorted before computing the fingerprint. This allows answers to be correct even if ORDER BY is missing.
    sort_cols=False, #If True, columns are sorted alphabetically before fingerprinting. This allows SELECT columns in any order to be accepted.
    hide_missing_cols=True,
    hide_row_count=False,
):
    required_cols = required_cols or []

    def validator(df: pd.DataFrame):
        problems = []
        structural_issue = False

        # --- columns ---
        if required_cols:
            missing = [c for c in required_cols if c not in df.columns]
            if missing:
                structural_issue = True
                if hide_missing_cols:
                    problems.append("Wrong number of columns! Make corrections and try again.")
                else:
                    problems.append(f"Missing column(s): {', '.join(missing)}")
                return False, problems

        if exact_cols and required_cols:
            if set(df.columns) != set(required_cols): # same columns order doesn't matter
                structural_issue = True
                return False, ["Wrong number of columns! Make corrections and try again."]

        # If structural issue, stop here (don't add extra hints like hash mismatch)
        if structural_issue:
            return False, problems

        # --- rows ---
        if expected_rows is not None and len(df) != expected_rows:
            if hide_row_count:
                return False, ["Wrong number of rows! Make corrections and try again."]
            # Same generic message for now: the expected row count is never revealed.
            return False, ["Wrong number of rows! Make corrections and try again."]

        # --- values ---
        got_hash, _ = df_fingerprint(df, sort_rows=sort_rows, sort_cols=sort_cols)
        if got_hash != expected_hash:
            return False, ["The result is not correct yet. Make corrections and try again."]

        return True, ["Nice, your output matches the expected result ✅"]

    return validator


import random


SUCCESS_MESSAGES = [
    "👏 Nice!",
    "👏 Great job",
    "👏 Good job",
    "👏 Keep up the good work!",
    "👏 I think you’re getting the hang of this!",
    "👏 Well played",
    "👏 Fantastic! Let’s keep it going",
    "👏 Nicely done",
]


In [None]:
# @title
%%sql
CREATE TABLE IF NOT EXISTS car (
    VIN TEXT PRIMARY KEY,
    BRAND TEXT NOT NULL,
    MODEL TEXT NOT NULL,
    PRICE REAL,
    PRODUCTION_YEAR INTEGER NOT NULL
);

INSERT OR IGNORE INTO car (VIN, BRAND, MODEL, PRICE, PRODUCTION_YEAR) VALUES
('LJCPCBLCX14500264', 'Ford', 'Focus', 8000.00, 2005),
('WPOZZZ79ZTS372128', 'Ford', 'Fusion', 12500.00, 2008),
('JF1BR93D7BG498281', 'Toyota', 'Avensis', 11300.00, 1999),
('KLATF08Y1VB363636', 'Volkswagen', 'Golf', 3270.00, 1992),
('1M8GDM9AXKP042788', 'Volkswagen', 'Golf', 13000.00, 2010),
('1HGCM82633A004352', 'Volkswagen', 'Jetta', 6420.00, 2003),
('1G1YZ23J9P5800003', 'Fiat', 'Punto', 5700.00, 1999),
('GS723HDSAK2399002', 'Opel', 'Corsa', NULL, 2007);


In [None]:
# @title
import sqlite3
import pandas as pd
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML
from pathlib import Path
import csv
from datetime import datetime
import html as _html



def make_sql_runner(
    conn,
    runner_id: str,
    default_sql=None,
    select_only=True,
    validator=None,
    dedupe=True,
    description_md=None,
    hint_enabled=False,
    hint_md=None,
    schema_tables=None,
):
    # ---------- helpers ----------
    def md_to_html(md: str) -> str:
        try:
            import markdown as _md
            return _md.markdown(md)
        except Exception:
            return "<br>".join(_html.escape(md).splitlines())

    # ---------- persistence ----------
    LOG_ALL_FILE = Path("sql_query_log.csv")
    LOG_LATEST_FILE = Path("sql_query_latest.csv")

    def _append_history(runner_id: str, sql: str, log_path: Path = LOG_ALL_FILE):
        is_new = not log_path.exists()
        with log_path.open("a", newline="", encoding="utf-8") as f:
            w = csv.writer(f)
            if is_new:
                w.writerow(["ts", "runner_id", "sql"])
            w.writerow([datetime.now().isoformat(timespec="seconds"), runner_id, sql])

    def _load_latest_map(latest_path: Path = LOG_LATEST_FILE) -> dict:
        if not latest_path.exists():
            return {}
        latest = {}
        with latest_path.open("r", newline="", encoding="utf-8") as f:
            r = csv.DictReader(f)
            for row in r:
                latest[row["runner_id"]] = row["sql"]
        return latest

    def _save_latest_map(latest: dict, latest_path: Path = LOG_LATEST_FILE):
        with latest_path.open("w", newline="", encoding="utf-8") as f:
            w = csv.writer(f)
            w.writerow(["runner_id", "sql"])
            for rid, sql in latest.items():
                w.writerow([rid, sql])

    latest_map = _load_latest_map()
    last_saved = latest_map.get(runner_id)
    if last_saved is not None:
        initial_sql = last_saved
    else:
        initial_sql = default_sql or ""
    # ---------- UI chrome (CSS) ----------
    display(HTML("""
      <style>
        /* =========================
          Global runner bounds
          ========================= */
        .sql-runner{
          max-width: 100% !important;
          box-sizing: border-box !important;
          padding-right: 18px;   /* keep resize handle away from notebook scrollbar */
          padding-bottom: 12px;
          overflow-x: hidden;
        }

        .sql-runner .widget-box,
        .sql-runner .widget-vbox,
        .sql-runner .widget-hbox{
          width: 100% !important;
          max-width: 100% !important;
          box-sizing: border-box !important;
        }

        /* =========================
          Description / Hint boxes
          ========================= */
        .sql-desc{
          border-left: 4px solid #1a73e8;
          background: #f5f9ff;
          padding: 10px 12px;
          margin: 6px 0 10px 0;
          border-radius: 6px;
          font-size: 14px;
          line-height: 1.5;
        }
        .sql-hintbox{
          border-left: 4px solid #fbbc04;
          background: #fff8e1;
          padding: 10px 12px;
          margin: 8px 0 10px 0;
          border-radius: 6px;
          font-size: 14px;
          line-height: 1.5;
        }

        /* =========================
          Editor + toolbar panel
          ========================= */
        .sql-runner .sql-panel{
          border: 1px solid #d0d7de;
          border-radius: 12px;
          background: #f6f8fa;
          overflow: hidden;
        }

        /* Textarea wrapper adapts to resized textarea */
        .sql-runner .widget-textarea{
          height: auto !important;
        }

        /* Base textarea behavior (resizable) */
        .sql-runner .widget-textarea textarea{
          height: 120px;                /* initial size; matches min-height below */
          min-height: 120px !important;
          resize: vertical !important;
          width: 100% !important;
          max-width: 100% !important;
          box-sizing: border-box !important;
        }

        /* Editor look */
        .sql-runner .sql-editor textarea{
          background: #ffffff;
          border: 0 !important;          /* panel provides border */
          border-radius: 0 !important;
          padding: 12px !important;
          font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
          font-size: 13px;
          line-height: 1.4;
        }

        /* Toolbar */
        .sql-runner .sql-toolbar{
          border-top: 1px solid #d0d7de;
          background: #f6f8fa;
          margin: 0 !important;
          padding: 6px 10px !important;
        }

        .sql-runner .sql-toolbar .widget-button{
          min-width: 40px !important;
          width: 40px !important;
          height: 40px !important;
          border-radius: 8px !important;
          font-size: 18px !important;

          background: transparent !important;
          border: 1px solid transparent !important;
          box-shadow: none !important;
          color: #24292f !important;
        }
        .sql-runner .sql-toolbar .widget-button:hover{
          background: #eaeef2 !important;
          border-color: #d0d7de !important;
        }
        .sql-runner .sql-toolbar .widget-button.mod-primary{
          background: #1a73e8 !important;
          border-color: #1a73e8 !important;
          color: #ffffff !important;
        }
        .sql-runner .sql-toolbar .widget-button.mod-primary:hover{
          filter: brightness(0.95);
        }
        .sql-runner .sql-toolbar .hint{
          color: #57606a;
          font-size: 12px;
          margin-left: 12px;
        }

        /* =========================
          Tabs results panel (dock)
          ========================= */

        .sql-runner .sql-tabs-panel{
          border: 1px solid #d0d7de;
          border-radius: 10px;
          background: #ffffff;
          overflow: hidden;
        }

        /* Remove default focus rings */
        .sql-runner .widget-tab:focus,
        .sql-runner .widget-tab :focus{
          outline: none !important;
          box-shadow: none !important;
        }

        /* Let outer panel own the border */
        .sql-runner .sql-tabs-panel .widget-tab{
          border: 0 !important;
          background: transparent !important;
        }

        /* Tab bar */
        .sql-runner .widget-tab > .p-TabBar{
          background: #f6f8fa !important;
          border-bottom: 1px solid #d0d7de !important;
          padding: 0 6px !important;
        }

        /* Tabs */
        .sql-runner .p-TabBar-tab{
          margin: 0 6px 0 0 !important;
          padding: 6px 12px !important;
          font-size: 13px !important;
          line-height: 18px !important;
          color: #57606a !important;

          background: transparent !important;
          border: 1px solid transparent !important;
          border-bottom: 0 !important;
          border-radius: 8px 8px 0 0 !important;

          position: relative; /* for ::after indicator */
        }

        .sql-runner .p-TabBar-tab:hover{
          background: #eaeef2 !important;
          border-color: #d0d7de !important;
        }

        /* Active tab */
        .sql-runner .p-TabBar-tab.p-mod-current{
          background: #ffffff !important;
          color: #24292f !important;
          border-color: #d0d7de !important;
          border-bottom: 1px solid #ffffff !important; /* merges into content */
          font-weight: 500 !important;
          z-index: 2;
        }

        /* Blue indicator INSIDE the tab (prevents "blue line below") */
        .sql-runner .p-TabBar-tab.p-mod-current::after{
          content: "";
          position: absolute;
          left: 12px;
          right: 12px;
          bottom: 4px;      /* <-- inside the tab; NOT -1px */
          height: 2px;
          background: #1a73e8;
          border-radius: 2px;
        }

        /* Tab content */
        .sql-runner .widget-tab > .p-TabPanel{
          padding: 10px !important;
          background: #ffffff !important;
        }


        /* --- HARD OVERRIDES: remove Lumino/Colab active-tab blue indicator --- */
        .sql-runner .p-TabBar-tab.p-mod-current{
          box-shadow: none !important;      /* Lumino often draws the blue line here */
          background-image: none !important;
        }

        /* Some themes use ::before as the underline */
        .sql-runner .p-TabBar-tab.p-mod-current::before{
          content: none !important;
          display: none !important;
        }

        /* Some themes apply focus-visible outline/underline */
        .sql-runner .p-TabBar-tab:focus-visible,
        .sql-runner .p-TabBar-tab.p-mod-current:focus-visible{
          outline: none !important;
          box-shadow: none !important;
        }



        /* =========================
          Remove Lumino inner divider inside tab contents
          ========================= */
        .sql-runner .sql-tabs-panel .widget-tab-contents,
        .sql-runner .sql-tabs-panel .p-TabPanel-tabContents{
          border: 0 !important;
          box-shadow: none !important;
          outline: none !important;
          background: transparent !important;
        }





        /* =========================
          Resizable tabs container (stable)
          ========================= */

        /* Make the OUTER bordered panel resizable */
        .sql-runner .sql-tabs-panel{
          resize: vertical;
          overflow: hidden;          /* important: prevents the tab bar from getting clipped/overlapped */
          min-height: 220px;         /* prevents collapsing into the tabs */
        }

        /* Force the Tab widget to use a vertical flex layout */
        .sql-runner .sql-tabs-panel .widget-tab{
          height: 100% !important;
          display: flex !important;
          flex-direction: column !important;
        }

        /* Tab bar stays fixed at top */
        .sql-runner .sql-tabs-panel .widget-tab > .p-TabBar{
          flex: 0 0 auto !important;
        }

        /* Tab content area becomes the flexible, scrollable region */
        .sql-runner .sql-tabs-panel .widget-tab > .p-TabPanel{
          flex: 1 1 auto !important;
          overflow: auto !important;
          min-height: 140px;         /* keeps content area usable even when resized smaller */
          box-sizing: border-box !important;
        }





        /* =========================
          Validation box
          ========================= */
        .sql-validation{
          position: relative;
          padding: 10px 38px 10px 12px; /* extra right padding for ✕ */
          margin: 8px 0 10px 0;
          border-radius: 6px;
          font-size: 14px;
          line-height: 1.5;
        }
        .sql-validation.ok{
          border-left: 4px solid #2e7d32;
          background: #e8f5e9;
        }
        .sql-validation.err{
          border-left: 4px solid #b00020;
          background: #ffebee;
        }
        .sql-validation .close{
          position: absolute;
          top: 8px;
          right: 10px;
          cursor: pointer;
          user-select: none;
          opacity: 0.65;
          font-weight: 700;
        }
        .sql-validation .close:hover{
          opacity: 1;
        }
        .sql-validation ul{
          margin: 6px 0 0 18px;
        }



      </style>
      """))


    # ---------- widgets ----------

    desc_widget = None
    if description_md:
        desc_widget = widgets.HTML(
            value=f"<div class='sql-desc'>{md_to_html(description_md)}</div>"
        )

    # Hint components (created only if enabled + text provided)
    show_hint_ui = bool(hint_enabled and hint_md)

    hint_btn = None
    hint_box = None
    if show_hint_ui:
        hint_btn = widgets.Button(
            description="💡",
            tooltip="Show/hide hint",
            layout=widgets.Layout(width="40px", height="40px"),
        )
        hint_visible = False

        hint_html = widgets.HTML(value=f"<div class='sql-hintbox'>{md_to_html(hint_md)}</div>")
        hint_box = widgets.Box([hint_html], layout=widgets.Layout(display="none"))

        def on_hint_click(_):
            nonlocal hint_visible
            hint_visible = not hint_visible
            hint_box.layout.display = "block" if hint_visible else "none"

        hint_btn.on_click(on_hint_click)


    # ---------- validation banner ----------

    validation_widget = widgets.HTML(value="")
    validation_nonce = 0

    def hide_validation():
        validation_widget.value = ""


    def show_validation(ok: bool, problems_or_msg):
        nonlocal validation_nonce
        validation_nonce += 1

        if isinstance(problems_or_msg, str):
            problems = [problems_or_msg] if problems_or_msg else []
        else:
            problems = list(problems_or_msg or [])

        cls = "ok" if ok else "err"

        if ok:
            title = random.choice(SUCCESS_MESSAGES)
            message = ""   # no extra text needed
        else:
            title = "🙁 Not correct yet"
            message = " ".join(problems)

        box_id = f"val_{runner_id}_{validation_nonce}"

        validation_widget.value = f"""
          <div id="{box_id}" class="sql-validation {cls}">
            <div class="close" onclick="document.getElementById('{box_id}').remove()">✕</div>
            <b>{_html.escape(title)}</b>
            { _html.escape(message) }
          </div>
        """


    # ---------- end validation banner ----------

    box = widgets.Textarea(
        value=initial_sql,
        placeholder="Type your SQL query here...",
        description="",
        layout=widgets.Layout(width="100%")
    )
    box.add_class("sql-editor")

    results_out = widgets.Output()
    schema_out = widgets.Output()

    results_box = widgets.Box(
        [results_out],
        layout=widgets.Layout(width="100%", padding="8px")
    )
    schema_box = widgets.Box(
        [schema_out],
        layout=widgets.Layout(width="100%",  padding="8px")
    )


    tabs = widgets.Tab(children=[results_box, schema_box], layout=widgets.Layout(width="100%"))
    tabs.set_title(0, "Query results")
    tabs.set_title(1, "Schema Database")

    # Toolbar buttons
    run_btn = widgets.Button(
        description="▶",
        tooltip="Run query",
        layout=widgets.Layout(width="40px", height="40px"),
        button_style="primary"
    )
    revert_btn = widgets.Button(description="⟲", tooltip="Revert to last saved", layout=widgets.Layout(width="40px", height="40px"))

    reset_btn = None
    if default_sql:
        reset_btn = widgets.Button(
            description="↩",
            tooltip="Reset to default SQL",
            layout=widgets.Layout(width="40px", height="40px")
        )

    clear_results_btn = widgets.Button(description="🧹", tooltip="Clear results output", layout=widgets.Layout(width="40px", height="40px"))
    clear_query_btn = widgets.Button(description="⌫", tooltip="Clear query editor", layout=widgets.Layout(width="40px", height="40px"))

    status = widgets.HTML('<span class="hint"></span>')

    def set_status(msg: str):
        status.value = f'<span class="hint">{msg}</span>' if msg else '<span class="hint"></span>'

    # Toolbar composition
    left_items = [run_btn]
    if hint_btn:
        left_items.append(hint_btn)


    left_items.append(revert_btn)
    if reset_btn:
        left_items.append(reset_btn)
    left_items.extend([clear_results_btn, clear_query_btn])

    left = widgets.HBox(left_items, layout=widgets.Layout(gap="8px", align_items="center"))
    right = widgets.HBox([status], layout=widgets.Layout(justify_content="flex-end", align_items="center"))

    toolbar = widgets.HBox(
        [left, right],
        layout=widgets.Layout(width="100%", align_items="center", justify_content="space-between")
    )
    toolbar.add_class("sql-toolbar")

    # ---------- schema renderer ----------
    def render_schema():
      with schema_out:
          clear_output()
          try:
              all_tables = pd.read_sql_query(
                  """
                  SELECT name
                  FROM sqlite_master
                  WHERE type='table' AND name NOT LIKE 'sqlite_%'
                  ORDER BY name;
                  """,
                  conn
              )["name"].tolist()

              if not all_tables:
                  display(HTML("<b>No tables found.</b>"))
                  return

              # Apply filter if provided
              if schema_tables:
                  # keep only tables that actually exist (preserve requested order)
                  tables = [t for t in schema_tables if t in all_tables]
                  missing = [t for t in schema_tables if t not in all_tables]

                  if missing:
                      display(HTML(
                          "<div style='margin:6px 0 10px 0;color:#b00020'>"
                          f"<b>Note:</b> table(s) not found: {', '.join(_html.escape(x) for x in missing)}"
                          "</div>"
                      ))
              else:
                  tables = all_tables

              if not tables:
                  display(HTML("<b>No matching tables to display.</b>"))
                  return

              items = []
              titles = []

              for t in tables:
                  info = pd.read_sql_query(f"PRAGMA table_info('{t}');", conn)
                  info = info[["name", "type", "notnull", "dflt_value", "pk"]]

                  out = widgets.Output()
                  with out:
                      display(info.style.format(na_rep="NULL").hide(axis="index"))

                  items.append(out)
                  titles.append(t)

              acc = widgets.Accordion(children=items)
              for i, t in enumerate(titles):
                  acc.set_title(i, t)

              acc.selected_index = 0 if len(tables) == 1 else None
              display(acc)

          except Exception as e:
              # Escape the message so characters like < or & in it cannot break the HTML
              display(HTML(f"<pre style='color:#b00020'>Error:\n{_html.escape(str(e))}</pre>"))

    def on_tab_change(change):
        if change["name"] == "selected_index" and change["new"] == 1:
            render_schema()

    tabs.observe(on_tab_change)

    # ---------- actions ----------
    def run_query(_):
        nonlocal last_saved, latest_map

        q = box.value.strip()
        with results_out:
            clear_output()

            if not q:
                display(HTML("<b>Please type a query.</b>"))
                tabs.selected_index = 0
                set_status("No query to run.")
                return

            norm = lambda s: " ".join(s.split())
            changed = (norm(q) != norm(last_saved or ""))

            if (not dedupe) or changed:
                _append_history(runner_id, q)
                latest_map = _load_latest_map()
                latest_map[runner_id] = q
                _save_latest_map(latest_map)
                last_saved = q

            if select_only and not q.lower().lstrip().startswith(("select", "with")):
                display(HTML("<b>Only SELECT/WITH queries are allowed.</b>"))
                tabs.selected_index = 0
                set_status("Blocked: only SELECT/WITH allowed.")
                return

            # disable toolbar during execution
            for b in (run_btn, revert_btn, clear_results_btn, clear_query_btn):
                b.disabled = True
            if reset_btn:
                reset_btn.disabled = True
            if hint_btn:
                hint_btn.disabled = True

            try:
                if q.lower().lstrip().startswith(("select", "with")):
                    df = pd.read_sql_query(q, conn)
                    display(df.style.format(na_rep="NULL").hide(axis="index"))
                    set_status(f"Returned {len(df)} row(s).")

                    if validator:
                        ok, problems = validator(df)   # (bool, list[str]) or (bool, str)
                        show_validation(ok, problems)
                    else:
                        hide_validation()

                else:
                    cur = conn.cursor()
                    cur.executescript(q)
                    conn.commit()
                    display(HTML("<b>✅ Query executed.</b>"))
                    set_status("Query executed.")

                tabs.selected_index = 0

            except Exception as e:
                # Escape the message so characters like < or & in it cannot break the HTML
                display(HTML(f"<pre style='color:#b00020'>Error:\n{_html.escape(str(e))}</pre>"))
                tabs.selected_index = 0
                set_status("Error running query.")
            finally:
                for b in (run_btn, revert_btn, clear_results_btn, clear_query_btn):
                    b.disabled = False
                if reset_btn:
                    reset_btn.disabled = False
                if hint_btn:
                    hint_btn.disabled = False

    def revert_query(_):
        nonlocal last_saved
        latest_map_local = _load_latest_map()
        saved = latest_map_local.get(runner_id)
        box.value = saved if saved is not None else (default_sql or "")
        set_status("Reverted to last saved." if saved is not None else "No saved query; reverted to the default.")

    def reset_to_default(_):
        if default_sql:
          box.value = default_sql
          set_status("Reset to default SQL.")

    def clear_results(_):
        results_out.clear_output()
        set_status("Cleared results output.")

    def clear_query(_):
        box.value = ""
        set_status("Cleared query editor.")

    run_btn.on_click(run_query)
    revert_btn.on_click(revert_query)
    if reset_btn:
        reset_btn.on_click(reset_to_default)
    clear_results_btn.on_click(clear_results)
    clear_query_btn.on_click(clear_query)

    # ---------- layout ----------
    elements = []
    if desc_widget:
        elements.append(desc_widget)
    if hint_box:
        elements.append(hint_box)
    elements.append(validation_widget)

    editor_panel = widgets.VBox([box, toolbar])
    editor_panel.add_class("sql-panel")

    spacer = widgets.Box(layout=widgets.Layout(height="10px"))


    #elements.extend([editor_panel, spacer,tabs])
    tabs_panel = widgets.VBox([tabs])
    tabs_panel.add_class("sql-tabs-panel")
    tabs_panel.layout.height = "230px"

    elements.extend([editor_panel, spacer, tabs_panel])

    ui = widgets.VBox(elements, layout=widgets.Layout(width="100%"))

    ui.add_class("sql-runner")
    display(ui)

    render_schema()
    set_status("Ready.")


In [None]:
# @title
#Professor- answer validator generator
solution_sql = """
SELECT * FROM car;
"""
solution_df = pd.read_sql_query(solution_sql, conn)
expected_hash, meta = df_fingerprint(solution_df, sort_rows=True,sort_cols=True)
expected_hash, meta
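
The fingerprint computed above is meant to stay the same when the rows come back in a different order. A minimal, self-contained sketch of that idea follows. It is a simplified stand-in that works on plain tuples rather than a DataFrame, and the helper name `simple_fingerprint` is hypothetical, not part of the notebook.

```python
import hashlib


def simple_fingerprint(rows, sort_rows=True, na_token="<NA>"):
    """Hash a list of row tuples (hypothetical, simplified stand-in for df_fingerprint)."""
    # Normalize every cell to a string: None becomes a fixed token,
    # and runs of whitespace are collapsed to single spaces.
    normalized = [
        tuple(na_token if v is None else " ".join(str(v).split()) for v in row)
        for row in rows
    ]
    # Sorting removes any dependence on the order the rows were returned in.
    if sort_rows:
        normalized.sort()
    payload = "\n".join(",".join(row) for row in normalized)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


rows_a = [("Ford", "Focus", 8000.0), ("Opel", "Corsa", None)]
rows_b = list(reversed(rows_a))  # same rows, different order

same_when_sorted = simple_fingerprint(rows_a) == simple_fingerprint(rows_b)
same_when_unsorted = (
    simple_fingerprint(rows_a, sort_rows=False)
    == simple_fingerprint(rows_b, sort_rows=False)
)
print(same_when_sorted, same_when_unsorted)
```

With sorting enabled the two hashes agree, which is why an answer can be accepted even when `ORDER BY` is missing.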

# What is SQL?

So, how do we get in touch with a database?  
We use **Structured Query Language**. Of course, no one actually uses the full name.  
We just call it **SQL**.

SQL is the standard language used to interact with relational databases.  
It allows us to retrieve, filter, and organize data stored in tables.


> 💡 **Did you know?**  
> There are many popular SQL databases, including SQLite, MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.  
> While they all support the standard SQL taught in this notebook, each database differs in the extra features and storage types it offers.

In this notebook, you will learn the basics of standard SQL, which is understood by virtually every relational database. With it, you will be able to write queries in any of these database environments.



# Queries

The instructions you will learn in this course are called **queries**.  
As the name suggests, queries are simply questions we ask about the data stored in a database.

A query tells the database what information we are interested in and under what conditions.  
The database then figures out how to find that information for us.

Databases can do more than just return stored data.  
They can also perform calculations, combine tables, and summarize results.  
You will see examples of this as we go.

> **How a Query Works**
>
>At a basic level, a SQL query works as follows:
>
>1. The user specifies a condition.
>2. The database scans its records and looks for rows that satisfy that condition.
>3. The matching records are returned as a table.

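The three steps above can be pictured with a few lines of plain Python. This is only a mental model under simplifying assumptions: the `people` list and the `run_query` helper are invented for illustration, and real database engines rely on indexes and query planners rather than a simple loop.

```python
# Hypothetical "table": each dict is one row.
people = [
    {"name": "Ana", "age": 34},
    {"name": "Ben", "age": 19},
    {"name": "Cleo", "age": 52},
]


def run_query(rows, condition):
    """Scan every row and keep the ones that satisfy the condition."""
    return [row for row in rows if condition(row)]


# Step 1: the user specifies a condition (here: age over 30).
# Steps 2 and 3: the rows are scanned and the matches are returned.
matches = run_query(people, lambda row: row["age"] > 30)
print(matches)
```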
We will start with very simple queries and introduce new ideas one at a time.  
By the end of the course, you will be able to write fairly complex queries.


## Types of SQL Commands

SQL commands are the basic building blocks used to perform operations on a database.

The following diagram summarizes the main categories of SQL commands and some common examples.

![SQL command categories](https://raw.githubusercontent.com/Haross/DB_pics_nt/main/sql_commands.webp)


- **Data Definition Language (DDL).** Used to define or change the structure of database objects such as tables.  

- **Data Manipulation Language (DML).**  Used to modify the data stored in the database.  

- **Data Query Language (DQL).**  Used to retrieve data from the database.  
  This is the part of SQL used to ask questions and get results back.

- **Data Control Language (DCL).**  Used to manage permissions and access to the database.  

- **Transaction Control Language (TCL).** Used to manage transactions and ensure data consistency.  


Since most data users learn SQL in order to interact with an **existing database**,  
this course focuses mainly on **DQL**.

In later sessions, we may briefly look at some of the other command types.


# Part 1: Selecting Data

To retrieve data from a SQL database, we use **SELECT statements**, often referred to simply as **queries**.

A query is a statement that describes:
- what data we are looking for,
- where to find it in the database, and
- optionally, how the data should be transformed before being returned.

Queries follow a specific syntax, which is what we will learn through the exercises in this lesson.

Given a table of data, the most basic query we can write selects
some columns (properties) from the table and returns rows (instances).
More specific variations of this idea will be introduced step by step.



## Your First SELECT Query

It’s time to run your first SQL query.

Data in a database is stored in tables. To see all the data stored in the `user` table, you can use the following query:

```sql
SELECT *
FROM user;
```

`SELECT` tells the database that you want to retrieve data.  
`FROM user` tells the database which table the data should come from.

The asterisk (`*`) means that you want to see **all columns** in the table.

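If you are curious what happens behind an editor like the one in this notebook, the sketch below runs the same query from Python with the built-in `sqlite3` module. The in-memory database, the `user` table, and its two rows are assumptions made up for this example; they are not the notebook's data.

```python
import sqlite3

# Hypothetical throwaway database; table and rows are invented for this sketch.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, name TEXT, age INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [(1, "Ana", 34), (2, "Ben", 19)],
)

cursor = demo.execute("SELECT * FROM user;")
column_names = [col[0] for col in cursor.description]  # names of the returned columns
rows = cursor.fetchall()  # one tuple per row

print(column_names)
for row in rows:
    print(row)
```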
> 💡 **Did you know?**  
> SQL commands usually end with a semicolon (`;`).  
> It works like a period at the end of a sentence and tells the database  
> that the command is complete.

> 📝 **Note**  
> In the examples, we use the `user` table to focus on the SQL syntax.
> The practice exercises will use a different table.


> ⚠️ **Common mistakes:**
> * Forgetting the semicolon
> * Misspelling column names
> * Using quotes incorrectly

Try it yourself in **Practice 1** below.

In [None]:
# @title Practice 1
validator = make_df_validator_nospoilers(
    expected_hash="e470e781818d142c321be9680e6f2668eacf239bbef44b2f4566d71f0397c5fd",
    required_cols=["VIN", "BRAND", "MODEL", "PRICE", "PRODUCTION_YEAR"],
    expected_rows=8,
    sort_rows=True
)
make_sql_runner(
    conn,
    runner_id="ex1",
    description_md="""
### Practice 1: First query
In our example database there is a car table, which contains information about a few cars.

Select all data from the car table and click Run and check code.
""",
    validator=validator,
    schema_tables=["car"]
    )



## Reading the Result

The result of a SQL query is returned as a table.

In this case, the `car` table has **five columns**:

- `vin` – short for vehicle identification number  
- `brand`  
- `model`  
- `price`  
- `production_year`

The column names appear at the top of the result table.

There are **eight rows** in the table, each representing a different car.

From the results, we can see:
- two Ford cars,
- one Toyota,
- three Volkswagen cars,
- one Fiat, and
- one Opel.

The price of the Toyota is \$11,300.  
The Ford cars are priced at \$8,000 and \$12,500.

Notice that the price for the Opel is **not specified**, which means the value is missing.


## Selecting One Column

So far, we have selected all columns from a table using the asterisk (`*`).

If you only want to retrieve a specific column, you can list its name instead.
For example, to get the names of all users, you can write:

```sql
SELECT name
FROM user;
```
This query returns the values from the `name` column for every row in the table.

Try it yourself in Practice 2 below.



In [None]:
# @title Practice 2
#make_sql_runner(conn, runner_id="ex2", select_only=False)
validator = make_df_validator_nospoilers(
    expected_hash="2d3a7b407056f589199ccfc7591a781777309123fedbde499d331d51058f6e7e",
    required_cols=["BRAND"],
    expected_rows=8,
    sort_rows=True
)
make_sql_runner(
    conn,
    runner_id="ex2",
    description_md="""
### Practice 2
Select brand names from the **car** table.
""",
    validator=validator,
    schema_tables=["car"]
)



## Selecting Multiple Columns

If you want to retrieve more than one column, you can list the column names
after `SELECT`, separated by commas.

For example, to get the names and ages of all users, you can write:

```sql
SELECT
  name,
  age
FROM user;
```

When selecting multiple columns, remember to separate each column name
with a comma (`,`).

In general, the syntax looks like this:

```sql
SELECT column, another_column, …
FROM mytable;
```
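As a small illustration, the sketch below lists two columns in a different order than they were created. It assumes Python's standard `sqlite3` module and an invented `user` table; the point is that the result contains only the listed columns, in the order they appear after `SELECT`.

```python
import sqlite3

# Hypothetical in-memory table, made up for this example.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, name TEXT, age INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [(1, "Ana", 34), (2, "Ben", 19)],
)

# Only the listed columns come back, in the order they are listed.
cursor = demo.execute("SELECT age, name FROM user;")
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)
print(rows)
```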

Try it yourself in **Practice 3** below.



In [None]:
# @title Practice 3
validator = make_df_validator_nospoilers(
    expected_hash="6c9f1bd96cf27b29750634a5dddbf213e6f919cc0688f700751545e41ec46d7a",
    required_cols=['MODEL', 'PRICE'],
    expected_rows=8,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex3",
    description_md="""
### Practice 3
Select model and price from table car.
""",
    validator=validator,
    schema_tables=["car"],
)


# Part 2: Logical Filtering

## Filtering Rows with WHERE

So far, our examples have returned all rows from a table.
When a table is small, this may be acceptable. However, real databases
often contain thousands or even millions of rows.

To return only specific rows, SQL provides the `WHERE` clause.
The `WHERE` clause filters rows by applying a condition to one or more columns.
Only rows that satisfy the condition are included in the result.

For example, the following query retrieves information for a single user
with a specific ID:

```sql
SELECT *
FROM user
WHERE id = 100;
```
In this query, `WHERE` introduces a condition.
The condition `id = 100` means that only rows where the id column
has the value `100` will be returned.

In general, the syntax looks like this:

```sql
SELECT column, another_column, …
FROM mytable
WHERE condition;
```
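To see the filter in action, here is a minimal sketch using Python's `sqlite3` module. The `user` table and its three rows are hypothetical. It also shows what happens when no row satisfies the condition: the query still succeeds and returns an empty result.

```python
import sqlite3

# Hypothetical throwaway database with invented rows.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, name TEXT)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(100, "Ana"), (101, "Ben"), (102, "Cleo")],
)

matching = demo.execute("SELECT * FROM user WHERE id = 100;").fetchall()
no_match = demo.execute("SELECT * FROM user WHERE id = 999;").fetchall()

print(matching)  # only the row whose id is 100
print(no_match)  # empty: no row satisfies the condition
```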

Try it yourself in Practice 4 below.



In [None]:
# @title Practice 4
validator = make_df_validator_nospoilers(
    expected_hash="5484b0a6c58fa1118d364dbc8885f96b565da14b872ff5204e34e337fbf19e82",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=2,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex4",
    description_md="""
### Practice 4
Select all columns for those cars which were produced (column production_year) in 1999.
""",
    validator=validator,
    schema_tables=["car"],
)


## Conditional operators

So far, we have used the equality operator (`=`) to filter rows.
SQL also provides other **comparison operators** that can be used in conditions.

For example, the following query selects users whose age is less than 20:
```sql
SELECT *
FROM user
WHERE age < 20;
```

In this case, the condition uses the less-than operator (`<`) instead of equality (`=`).
Only rows where the value in the age column is below 20 are returned.

Common comparison operators include:

* `<` (less than)
* `>` (greater than)
* `<=` (less than or equal to)
* `>=` (greater than or equal to)

These operators can be used in the same way inside a `WHERE` clause.

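The difference between the four operators is easiest to see on the boundary value itself. The sketch below, which assumes Python's `sqlite3` module and three invented ages (18, 20, and 25), runs each operator against the value 20.

```python
import sqlite3

# Hypothetical data: three users aged 18, 20, and 25.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER)")
demo.executemany("INSERT INTO user VALUES (?, ?)", [(1, 18), (2, 20), (3, 25)])

# The operators are hard-coded here, so building the query text is safe;
# never paste untrusted input into SQL this way.
ages_by_operator = {}
for op in ["<", ">", "<=", ">="]:
    query = f"SELECT age FROM user WHERE age {op} 20;"
    ages_by_operator[op] = sorted(age for (age,) in demo.execute(query))

for op, ages in ages_by_operator.items():
    print(f"age {op} 20 ->", ages)
```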
Try it yourself in Practice 5 below.


In [None]:
# @title Practice 5

validator = make_df_validator_nospoilers(
    expected_hash="cbe7a15cac42afc2a54ed91ee8ae29cc13b468787546ac25334d2ef8557a2129",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=3,
    sort_cols=True,
    sort_rows=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex5",
    description_md="""
### Practice 5
Select all columns for all cars with price higher than $10,000.
""",
    validator=validator,
    schema_tables=["car"],
)



## The not equal sign (`!=`)

In addition to the comparison operators seen so far, SQL also supports an
**inequality operator**.

In standard SQL, the not-equal operator is written as `<>`. Most databases, including SQLite, MySQL, and PostgreSQL, also accept `!=`, which is the form used in this notebook.

For example, the following query selects all users whose age is not 18:

```sql
SELECT *
FROM user
WHERE age != 18;
```

In this case, the condition filters out rows where the value in the age column
is equal to 18 and returns all others.

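A quick way to convince yourself that `!=` and `<>` behave the same is to run both. This sketch assumes SQLite (through Python's `sqlite3` module) and an invented `user` table with three ages.

```python
import sqlite3

# Hypothetical data: ages 17, 18, and 30.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER)")
demo.executemany("INSERT INTO user VALUES (?, ?)", [(1, 17), (2, 18), (3, 30)])

with_bang = sorted(demo.execute("SELECT age FROM user WHERE age != 18;").fetchall())
with_angle = sorted(demo.execute("SELECT age FROM user WHERE age <> 18;").fetchall())

print(with_bang)
print(with_angle)
```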
Try it yourself in **Practice 6** below.


In [None]:
# @title Practice 6
validator = make_df_validator_nospoilers(
    expected_hash="93bfcb7e314950eb790b2d52672c2a3bc6e735d6131ef31a4f5f0f471231c67d",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=6,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex6",
    description_md="""
### Practice 6
Select all columns for those cars which weren't produced in 1999.
""",
    validator=validator,
    schema_tables=["car"],
)

## Conditional operators and selecting columns

So far, we have seen how to select specific columns and how to filter rows
using conditional operators. These two ideas can be combined in a single query.

For example, the following query selects only the `id` and `age` columns
for users whose age is less than or equal to 21:

```sql
SELECT
  id,
  age
FROM user
WHERE age <= 21;
```

Instead of using the asterisk (`*`), the query lists only the columns
that should be returned.

Try it yourself in **Practice 7** below.



In [None]:
# @title Practice 7
validator = make_df_validator_nospoilers(
    expected_hash="2330fb8a70e07571d28f3c12c1a8c1cf21b556dc0db91a73abccb88bba3db1ad",
    required_cols=['BRAND', 'MODEL', 'PRODUCTION_YEAR'],
    expected_rows=5,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex7",
    description_md="""
### Practice 7
Select brand, model and production year of all cars cheaper than or equal to $11,300.
""",
    validator=validator,
    schema_tables=["car"],

)

## Logical operators – OR

So far, we have filtered rows using a single condition.
In some cases, we may want to return rows that satisfy **one condition or another**.

SQL provides logical operators to combine multiple conditions.
One of these operators is `OR`.

For example, the following query selects the `id` and `name` of users
who are older than 50 **or** shorter than 185 cm:


```sql
SELECT id, name
FROM user
WHERE age > 50
  OR height < 185;
```

In this query, a row is included in the result if at least one of the conditions
is true.

In SQL, the `OR` operator is **inclusive**, meaning that a row is also included
when **both** conditions are true, as the truth table below shows.

| Condition A | Condition B | A OR B |
|-------------|-------------|--------|
| False       | False       | False  |
| False       | True        | True   |
| True        | False       | True   |
| True        | True        | True   |

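The truth table can be reproduced with one row per combination. In this sketch (Python's `sqlite3` module, invented `user` rows), user 1 satisfies both conditions, user 2 only the age condition, user 3 only the height condition, and user 4 neither.

```python
import sqlite3

# Hypothetical rows, one for each line of the truth table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER, height INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, 60, 170),  # older than 50 AND shorter than 185
        (2, 60, 190),  # only older than 50
        (3, 30, 170),  # only shorter than 185
        (4, 30, 190),  # neither
    ],
)

query = "SELECT id FROM user WHERE age > 50 OR height < 185;"
ids = sorted(row[0] for row in demo.execute(query))
print(ids)  # user 4 is the only one left out
```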
Try it yourself in **Practice 8** below.



In [None]:
# @title Practice 8
validator = make_df_validator_nospoilers(
    expected_hash="c5c035f6c93672b6aafbb7a76731a0095c1f4b8d7ff4df13f84d3794786d646b",
    required_cols=['VIN'],
    expected_rows=5,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex8",
    description_md="""
### Practice 8
Select **vin**s of all cars which were produced before 2005 or whose price is below $10,000.
""",
    validator=validator,
    schema_tables=["car"],
)





## Logical operators – AND

In addition to `OR`, SQL also provides the logical operator `AND`,
which allows multiple conditions to be combined in a query.

For example, the following query selects the `id` and `name` of users
whose age is between 13 and 70:

```sql
SELECT
  id,
  name
FROM user
WHERE age <= 70
  AND age >= 13;
```
In this case, a row is included in the result only if both conditions
are true.

| Condition A | Condition B | A AND B |
|-------------|-------------|---------|
| False       | False       | False   |
| False       | True        | False   |
| True        | False       | False   |
| True        | True        | True    |

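Swapping `AND` for `OR` in the same query changes the result considerably. The sketch below assumes Python's `sqlite3` module and five made-up ages; with `AND` only the ages inside the range survive, while with `OR` every age satisfies at least one of the two conditions.

```python
import sqlite3

# Hypothetical ages, chosen to sit on and around the boundaries 13 and 70.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, 10), (2, 13), (3, 40), (4, 70), (5, 71)],
)

both = sorted(
    age for (age,) in demo.execute("SELECT age FROM user WHERE age <= 70 AND age >= 13;")
)
either = sorted(
    age for (age,) in demo.execute("SELECT age FROM user WHERE age <= 70 OR age >= 13;")
)

print(both)
print(either)
```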
Try it yourself in **Practice 9** below.


In [None]:
# @title Practice 9
validator = make_df_validator_nospoilers(
    expected_hash="420f8021994468b68c80d5aacaf74c3163a4284fe7898e1b8f5288ec562be0d9",
    required_cols=['VIN'],
    expected_rows=1,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex9",
    description_md="""
### Practice 9
Select **vins** of all cars which were produced after 1999 and are cheaper than $7,000.
""",
    validator=validator,
    schema_tables=["car"],
    )


## The BETWEEN operator

So far, we have used two conditions combined with `AND` to select users
whose age falls within a specific range.

For example:

```sql
SELECT id, name
FROM user
WHERE age <= 70
  AND age >= 13;
```

SQL also provides a more compact way to express this type of condition
using the `BETWEEN` operator:

```sql
SELECT
  id,
  name
FROM user
WHERE age BETWEEN 13 AND 70;
```

The `BETWEEN` operator selects rows where the value of a column falls
within a given range. In standard SQL, the boundary values (13 and 70)
are included in the result.

> 💡 **Did you know?**  
> In standard SQL, `BETWEEN` includes both boundary values.  
> When working with a new database system, it is still a good idea  
> to check a range condition against its boundary values, especially  
> for dates and times, where the upper bound is easy to get wrong.

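One way to check the boundaries yourself is to include rows that sit exactly on them. This sketch assumes SQLite via Python's `sqlite3` module and four invented ages; it compares `BETWEEN` with the equivalent pair of conditions joined by `AND`.

```python
import sqlite3

# Hypothetical ages just outside and exactly on the boundaries.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, 12), (2, 13), (3, 70), (4, 71)],
)

with_between = sorted(
    age for (age,) in demo.execute("SELECT age FROM user WHERE age BETWEEN 13 AND 70;")
)
with_and = sorted(
    age for (age,) in demo.execute("SELECT age FROM user WHERE age >= 13 AND age <= 70;")
)

print(with_between)  # 13 and 70 are both included
print(with_and)
```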
Try it yourself in **Practice 10** below.



In [None]:
# @title Practice 10
validator = make_df_validator_nospoilers(
    expected_hash="2f33b7242447bbf66a03ea3c45e6a2b7205f5a6b4111ce28102a50966a16a01c",
    required_cols=['VIN', 'BRAND', 'MODEL'],
    expected_rows=4,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex10",
    description_md="""
### Practice 10
Select **vin**, **brand**, and **model** of all cars which were produced between 1995 and 2005.
""",
    validator=validator,
    schema_tables=["car"],
)



## Logical operators – NOT

SQL also provides the logical operator `NOT`, which is used to negate a condition.
When `NOT` is applied, rows that would normally satisfy the condition are excluded
from the result.

For example, the following query selects all users whose age is **not** between
20 and 30:

```sql
SELECT *
FROM user
WHERE age NOT BETWEEN 20 AND 30;
```

In this case, the query returns all users except those whose age falls within
the range from 20 to 30.

The `NOT` operator simply reverses the result of a condition.

| Condition A | Not A |
|-------------|-------|
| False       | True  |
| True        | False |

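Because `BETWEEN` includes its boundaries, `NOT BETWEEN` excludes them. The sketch below (Python's `sqlite3` module, invented ages) also shows that `NOT` can be placed in front of a parenthesized condition with the same effect.

```python
import sqlite3

# Hypothetical ages around the range 20 to 30.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, 19), (2, 20), (3, 25), (4, 30), (5, 31)],
)

not_between = sorted(
    age for (age,) in demo.execute("SELECT age FROM user WHERE age NOT BETWEEN 20 AND 30;")
)
negated_and = sorted(
    age
    for (age,) in demo.execute("SELECT age FROM user WHERE NOT (age >= 20 AND age <= 30);")
)

print(not_between)  # 20 and 30 themselves are excluded
print(negated_and)
```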
Try it yourself in **Practice 11** below.




In [None]:
# @title Practice 11
validator = make_df_validator_nospoilers(
    expected_hash="f4a29b564d59a8e20cdd0a14578af4bd829a4dd65813537ccec5b449c94b45a0",
    required_cols=['VIN', 'BRAND', 'MODEL'],
    expected_rows=4,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)

make_sql_runner(
    conn,
    runner_id="ex11",
    description_md="""
### Practice 11
Select **vin**, **brand**, and **model** of all cars except for those produced between 1995 and 2005.
""",
    validator=validator,
    schema_tables=["car"]
    )


## Combining Multiple Conditions

When a query includes several conditions, parentheses can be used to control
how those conditions are evaluated.

Parentheses make the intended logic explicit and help avoid ambiguity,
especially when combining `AND` and `OR`.

For example, the following query selects users who are either older than 70
or younger than 13, **and** who are at least 180 cm tall:


```sql
SELECT
  id,
  name
FROM user
WHERE (age > 70 OR age < 13)
  AND (height >= 180);
```

In this query, the conditions inside the parentheses are evaluated together
before being combined with the remaining condition.

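Parentheses matter because SQL evaluates `AND` before `OR` when none are given. The following sketch, assuming Python's `sqlite3` module and three invented users, runs the query above with and without parentheses and gets different results.

```python
import sqlite3

# Hypothetical users: (id, age, height in cm).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER, height INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [(1, 75, 160), (2, 10, 185), (3, 40, 190)],
)

with_parens = sorted(
    row[0]
    for row in demo.execute(
        "SELECT id FROM user WHERE (age > 70 OR age < 13) AND (height >= 180);"
    )
)
# Without parentheses this reads as: age > 70 OR (age < 13 AND height >= 180)
without_parens = sorted(
    row[0]
    for row in demo.execute(
        "SELECT id FROM user WHERE age > 70 OR age < 13 AND height >= 180;"
    )
)

print(with_parens)
print(without_parens)  # user 1 (aged 75, 160 cm) now slips through
```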
Try it yourself in **Practice 12** below.



In [None]:
# @title Practice 12
validator = make_df_validator_nospoilers(
    expected_hash="9b8462dcb114ec242b4b39db5d4a0efe7888fee0046c33b4fae2d139aa237c29",
    required_cols=['VIN'],
    expected_rows=3,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex12",
    description_md="""
### Practice 12
Select the **vin** of all cars that were produced either before 1999 or after 2005, and whose price is either lower than $4,000 or greater than $10,000.
""",
    validator=validator,
    schema_tables=["car"]
)

# Part 3: Special Cases in Filtering

## Using Text Values

So far, we have used numeric values in `WHERE` clauses.
SQL also allows conditions to be based on **text values**.

When comparing text, the value must be written inside **single quotes**,
for example: `'example'`.

The following query selects the age of all users whose name is `Smith`:

```sql
SELECT age
FROM user
WHERE name = 'Smith';
```
> 💡 **Did you know?**  
> Whether text comparisons are case-sensitive depends on the database and its settings.  
> In many systems, `'Smith'` and `'SMITH'` are treated as different values,  
> so it is safest to match the capitalization stored in the table.

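As an illustration, the sketch below stores the same surname with two capitalizations. It assumes SQLite (through Python's `sqlite3` module), where `=` compares text case-sensitively by default; other databases may be configured differently. The table and rows are invented.

```python
import sqlite3

# Hypothetical rows: the same surname in two capitalizations.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (name TEXT, age INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [("Smith", 41), ("SMITH", 29), ("Jones", 35)],
)

# Text values go inside single quotes in the SQL statement.
mixed_case = demo.execute("SELECT age FROM user WHERE name = 'Smith';").fetchall()
upper_case = demo.execute("SELECT age FROM user WHERE name = 'SMITH';").fetchall()

print(mixed_case)
print(upper_case)
```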
Try it yourself in **Practice 13** below.

In [None]:
# @title Practice 13
validator = make_df_validator_nospoilers(
    expected_hash="db57cc776cc23b0db9b78ed234f54ae22cea886709040c02935985c7ac9e2ff6",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=2,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex13",
    description_md="""
### Practice 13
Select all columns of all Ford cars from the table.
""",
    validator=validator,
    schema_tables=["car"]
)

## The percentage sign (%)

So far, we have used exact text matches in `WHERE` clauses.
In some cases, we may want to match text values **partially** rather than exactly.

SQL provides the `LIKE` operator for this purpose.

For example, the following query selects all users whose name starts with the
letter `A`:
```sql
SELECT *
FROM user
WHERE name LIKE 'A%';
```
The `LIKE` operator allows the use of the percentage sign (`%`), which matches
any number of characters, including zero characters.

In this case, the query returns all users whose name begins with A,
regardless of what follows.

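The "including zero characters" part is easy to overlook, so the sketch below includes a user whose whole name is just `A`. It assumes SQLite via Python's `sqlite3` module and invented names. Note that SQLite's `LIKE` ignores case for ASCII letters by default, which not every database does; the data here is chosen so the result is the same either way.

```python
import sqlite3

# Hypothetical names; "A" tests the zero-characters case.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (name TEXT)")
demo.executemany("INSERT INTO user VALUES (?)", [("A",), ("Alice",), ("Bob",)])

starts_with_a = sorted(
    name for (name,) in demo.execute("SELECT name FROM user WHERE name LIKE 'A%';")
)
print(starts_with_a)
```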

Try it yourself in **Practice 14** below.



In [None]:
# @title Practice 14

validator = make_df_validator_nospoilers(
    expected_hash="38545b08614369b539f2e28c33c03e08a63385d14098f60a59569b9e1d54b454",
    required_cols=['VIN', 'BRAND', 'MODEL'],
    expected_rows=3,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex14",
    description_md="""
### Practice 14
Select **vin**, **brand**, and **model** of all cars whose brand begins with an F.
""",
    validator=validator,
    schema_tables=["car"]
)

## The percentage sign (%) continued

The percentage sign (`%`) can appear **anywhere** inside the pattern used with
the `LIKE` operator, and it can be used more than once.

For example, the following query selects all users whose name contains
the letter `A` anywhere in the text:
```sql
SELECT *
FROM user
WHERE name LIKE '%A%';
```

In this case, the pattern matches any name that **contains at least one A**,
regardless of its position in the string.

Because the percentage sign can also represent zero characters,
the name may begin or end with A.

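The position of `%` decides whether the pattern means "contains" or "ends with". This sketch assumes Python's `sqlite3` module and four invented names, and uses lowercase patterns so that the outcome does not depend on how a database treats upper and lower case.

```python
import sqlite3

# Hypothetical names for the pattern tests.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (name TEXT)")
demo.executemany(
    "INSERT INTO user VALUES (?)",
    [("Diana",), ("Brian",), ("Susan",), ("Bob",)],
)

contains_an = sorted(
    name for (name,) in demo.execute("SELECT name FROM user WHERE name LIKE '%an%';")
)
ends_with_an = sorted(
    name for (name,) in demo.execute("SELECT name FROM user WHERE name LIKE '%an';")
)

print(contains_an)   # "an" anywhere in the name
print(ends_with_an)  # "an" as the last two letters
```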
Try it yourself in **Practice 15** below.



In [None]:
# @title Practice 15
validator = make_df_validator_nospoilers(
    expected_hash="a692bf4fe307b8beeed7c81df5f13852914abf84f524883ca6e29f82b8114460",
    required_cols=['VIN'],
    expected_rows=2,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex15",
    description_md="""
### Practice 15
Select vin of all cars whose model ends with an s.
""",
    validator=validator,
    schema_tables=["car"]
)

## The underscore sign (_)

In addition to the percentage sign (`%`), SQL also supports the underscore
character (`_`) when using the `LIKE` operator.

The underscore matches **exactly one character**.

For example, the following query selects users whose name matches
the pattern `_atherine`:

```sql
SELECT *
FROM user
WHERE name LIKE '_atherine';
```

In this case, the pattern matches names such as `Catherine` or `Katherine`,
since the underscore replaces exactly one character at the beginning of the name.

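Unlike `%`, the underscore never matches zero characters or more than one. The sketch below assumes Python's `sqlite3` module; the four names are invented, with `Atherine` and `Ecatherine` added to show names that are one character too short and one too long for the pattern.

```python
import sqlite3

# Hypothetical names: two matches, one too short, one too long.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (name TEXT)")
demo.executemany(
    "INSERT INTO user VALUES (?)",
    [("Catherine",), ("Katherine",), ("Atherine",), ("Ecatherine",)],
)

matches = sorted(
    name for (name,) in demo.execute("SELECT name FROM user WHERE name LIKE '_atherine';")
)
print(matches)
```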
Try it yourself in **Practice 16** below.

In [None]:
# @title Practice 16
validator = make_df_validator_nospoilers(
    expected_hash="31d82d533247762c700175c3db543f3ebabb2262399560465a77b21241c442cc",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=3,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex16",
    description_md="""
### Practice 16
Select all columns for cars whose brand matches 'Volk_wagen'.
""",
    validator=validator,
    schema_tables=["car"]
    )

## Looking for NOT NULL Values

In a database table, some columns may contain `NULL` values.
A `NULL` value means that the data is **unknown or missing**.

For example, if a car has no recorded price, the value stored in the `price`
column is `NULL`. This does not mean the price is zero; it means the value is not known.

To check whether a column contains a value, SQL provides the condition
`IS NOT NULL`.

```sql
SELECT id
FROM user
WHERE middle_name IS NOT NULL;
```

This query selects only those users whose middle_name value is known.

Try it yourself in **Practice 17** below.


In [None]:
# @title Practice 17
validator = make_df_validator_nospoilers(
    expected_hash="0086b51a211e734dc4e67bb8776975950441f58c625b9ba0273c8167a05fb827",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=7,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex17",
    description_md="""
### Practice 17
Select all columns for each car whose price column isn't a NULL value.
""",
    validator=validator,
    schema_tables=["car"]
    )




## Looking for NULL Values

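`NULL` is a special marker in SQL: it stands for a missing value rather than being a value of its own.
Because of this, the equality operator (`=`) cannot be used to test for `NULL`; a condition such as `middle_name = NULL` is never true.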

To check whether a column contains a `NULL` value, SQL provides the condition
`IS NULL`, which is the opposite of `IS NOT NULL`.

```sql
SELECT id
FROM user
WHERE middle_name IS NULL;
```

This query returns only those users whose middle_name value is unknown.

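The sketch below puts `IS NULL`, `IS NOT NULL`, and the tempting but wrong `= NULL` side by side. It assumes Python's `sqlite3` module, where Python's `None` is stored as SQL `NULL`; the table and rows are made up for illustration.

```python
import sqlite3

# Hypothetical users; None is stored as NULL.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, middle_name TEXT)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, "Lee"), (2, None), (3, "Ray")],
)


def ids_where(condition):
    """Return the sorted ids of rows matching a hard-coded condition."""
    query = f"SELECT id FROM user WHERE {condition};"
    return sorted(row[0] for row in demo.execute(query))


is_null = ids_where("middle_name IS NULL")
is_not_null = ids_where("middle_name IS NOT NULL")
equals_null = ids_where("middle_name = NULL")  # runs without error, matches nothing

print(is_null, is_not_null, equals_null)
```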
Try it yourself in **Practice 18** below.


In [None]:
# @title Practice 18
validator = make_df_validator_nospoilers(
    expected_hash="d5eee7cf8f2fd1ffa49d8cd8995ff070efa5f8105fa2da0dbb99c75ae094a93f",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=1,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex18",
    description_md="""
### Practice 18
Select all columns for each car whose price is unknown (NULL).
""",
    validator=validator,
    schema_tables=["car"]
    )

## Comparisons with NULL

As seen in the previous sections, `NULL` is a special value that represents
missing or unknown information.

When a condition is applied to a column, such as `age < 70`, any rows where
the value of `age` is `NULL` are automatically excluded from the result.
This is because the condition cannot be evaluated when the value is unknown.

> 💡 **Important**  
> In SQL, `NULL` means **unknown**, not empty or zero.  
> Because of this, SQL cannot determine whether one `NULL` value is equal  
> to another `NULL` value.  
>  
> As a result, any comparison involving `NULL` (including `NULL = NULL`)  
> does not evaluate to true, and rows with `NULL` values are excluded from  
> comparison-based conditions unless `IS NULL` or `IS NOT NULL` is used.

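A consequence worth seeing once: a row with a `NULL` age appears in neither `age < 70` nor its apparent opposite, `age >= 70`. The following sketch assumes SQLite via Python's `sqlite3` module and three invented users; it also asks the database directly what `NULL = NULL` evaluates to.

```python
import sqlite3

# Hypothetical users; user 3 has an unknown age.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, age INTEGER)")
demo.executemany("INSERT INTO user VALUES (?, ?)", [(1, 50), (2, 80), (3, None)])

under_70 = sorted(row[0] for row in demo.execute("SELECT id FROM user WHERE age < 70;"))
at_least_70 = sorted(row[0] for row in demo.execute("SELECT id FROM user WHERE age >= 70;"))

# The comparison itself yields NULL, which Python receives as None.
null_equals_null = demo.execute("SELECT NULL = NULL;").fetchone()[0]

print(under_70, at_least_70)  # user 3 is in neither list
print(null_equals_null)
```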

Try it yourself in **Practice 19** below.



In [None]:
# @title Practice 19
validator = make_df_validator_nospoilers(
    expected_hash="0086b51a211e734dc4e67bb8776975950441f58c625b9ba0273c8167a05fb827",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=7,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex19",
    description_md="""
### Practice 19
Select all columns for cars whose **price** column is greater than or equal to zero.

Note that the Opel with an unknown price is **not** in the result.
""",
    validator=validator,
    schema_tables=["car"]
)


## Basic mathematical operators

SQL also allows basic mathematical operations to be used inside queries.
These operations can appear in `SELECT` statements as well as in `WHERE` clauses.

For example, the following query calculates an annual salary by multiplying
the monthly salary by 12 and then filters users based on that value:

```sql
SELECT *
FROM user
WHERE (monthly_salary * 12) > 50000;
```
In this query, the asterisk (`*`) inside the `WHERE` clause is the multiplication operator, not the "all columns" shorthand used after `SELECT`.
The result of the calculation is then compared to the value 50,000.

SQL supports the following basic mathematical operators:



* `+` addition
* `-` subtraction
* `*` multiplication
* `/` division

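Here is a small sketch of arithmetic in both places, the column list and the `WHERE` clause. It assumes SQLite through Python's `sqlite3` module and an invented `user` table. The last query illustrates a detail that varies between databases: in SQLite, dividing one integer by another discards the fractional part.

```python
import sqlite3

# Hypothetical users with invented monthly salaries.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE user (id INTEGER, monthly_salary INTEGER)")
demo.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, 4000), (2, 5000), (3, 3000)],
)

# Arithmetic in the column list: one computed value per row.
annual = sorted(demo.execute("SELECT id, monthly_salary * 12 FROM user;").fetchall())

# Arithmetic in the WHERE clause: filter on the computed value.
high_earners = sorted(
    row[0] for row in demo.execute("SELECT id FROM user WHERE (monthly_salary * 12) > 50000;")
)

# Integer division versus division with a decimal number (SQLite behavior).
division = demo.execute("SELECT 7 / 2, 7 / 2.0;").fetchone()

print(annual)
print(high_earners)
print(division)
```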

Try it yourself in **Practice 20** below.

In [None]:
# @title Practice 20
validator = make_df_validator_nospoilers(
    expected_hash="cbe7a15cac42afc2a54ed91ee8ae29cc13b468787546ac25334d2ef8557a2129",
    required_cols=['VIN', 'BRAND', 'MODEL', 'PRICE', 'PRODUCTION_YEAR'],
    expected_rows=3,
    sort_rows=True,
    sort_cols=True,
    exact_cols=True,
)
make_sql_runner(
    conn,
    runner_id="ex20",
    description_md="""
### Practice 20
Select all columns for cars with a tax amount over $2,000. The tax amount for all cars is 20% of their price. Multiply the **price** by **0.2** to get the tax amount.
""",
    validator=validator,
    schema_tables=["car"]
)


## In-class Exercise

In this exercise, you will combine several of the concepts introduced so far.

Imagine a customer who walks in and wants to know whether the database contains
any cars that meet their requirements.

In [None]:
# @title In-class exercise 1
make_sql_runner(
    conn,
    runner_id="inclass_ex1",
    description_md="""
### In-class Exercise 1
Select all columns of those cars that:

  *  were produced between 1999 and 2005,
  *  are not Volkswagens,
  *  have a model that begins with either **'P'** or **'F'**,
  *  have their price set.

  ⚠️ **Note:** This exercise does **not** include an automatic validator.
Your query will not be checked for correctness by the system.

""")
