<a target="_blank" href="https://colab.research.google.com/github/lukebarousse/Int_SQL_Data_Analytics_Course/blob/main/0_Intro/1_Intro.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Introduction

## How to Run SQL Queries

### In Jupyter Notebooks

The code block below automatically detects whether you're running in Google Colab or locally on your machine. In Colab, it also installs PostgreSQL and loads the course database.

In [None]:
import sys
import matplotlib.pyplot as plt
%matplotlib inline

# If running in Google Colab, install PostgreSQL and restore the database
if 'google.colab' in sys.modules:
    # Install PostgreSQL
    !sudo apt-get install postgresql -qq > /dev/null 2>&1

    # Start PostgreSQL service (suppress output)
    !sudo service postgresql start > /dev/null 2>&1

    # Set password for the 'postgres' user to avoid authentication errors (suppress output)
    !sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'password';" > /dev/null 2>&1

    # Create the 'contoso_100k' database (suppress output)
    !sudo -u postgres psql -c "CREATE DATABASE contoso_100k;" > /dev/null 2>&1

    # Download the PostgreSQL .sql dump
    !wget -q -O contoso_100k.sql https://github.com/lukebarousse/Int_SQL_Data_Analytics_Course/releases/download/v.0.0.0/contoso_100k.sql

    # Restore the dump file into the PostgreSQL database (suppress output)
    !sudo -u postgres psql contoso_100k < contoso_100k.sql > /dev/null 2>&1

    # Replace ipython-sql with jupysql
    !pip uninstall -y ipython-sql > /dev/null 2>&1
    !pip install jupysql > /dev/null 2>&1
        
# Load the sql extension for SQL magic (provided by jupysql or ipython-sql)
%load_ext sql

# Connect to the PostgreSQL database
%sql postgresql://postgres:password@localhost:5432/contoso_100k

# Enable automatic conversion of SQL results to pandas DataFrames
%config SqlMagic.autopandas = True
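
To make two pieces of the setup cell easier to inspect, here is a minimal sketch that isolates the Colab check and breaks the connection URL into its parts. The names `IN_COLAB`, `DB_URL`, and `connection` are illustrative and not part of the course code; the URL itself is copied from the cell above.

```python
import sys
from urllib.parse import urlsplit

# Same check the setup cell uses: the google.colab package is only
# present in sys.modules when the notebook runs on Colab.
IN_COLAB = "google.colab" in sys.modules

# Connection URL copied from the setup cell above.
DB_URL = "postgresql://postgres:password@localhost:5432/contoso_100k"

# Split the URL into the pieces the database driver needs.
parts = urlsplit(DB_URL)
connection = {
    "driver": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "database": parts.path.lstrip("/"),
}

print("Running in Colab:", IN_COLAB)
for key, value in connection.items():
    print(f"{key:>8}: {value}")
```

If you run PostgreSQL locally with a different user, password, or port, these are the parts of the URL you would likely need to change.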

In both environments, once the setup cell has run, you can write a SQL query by creating a new code block with the `%%sql` magic command at the top and writing your query below it as usual.

We'll be using PostgreSQL for all of our SQL queries.

In [None]:
%%sql 

SELECT
    EXTRACT(YEAR FROM orderdate) AS year,
    SUM(netprice) AS total_year_net_revenue
FROM
    sales
GROUP BY
    year
ORDER BY
    year
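
To try the logic of this query without a PostgreSQL server, the sketch below runs a close equivalent against an in-memory SQLite database using Python's built-in `sqlite3` module. The four rows are made-up sample data, not values from the course database, and SQLite's `strftime('%Y', ...)` stands in for PostgreSQL's `EXTRACT(YEAR FROM ...)`, so treat it as an illustration rather than the course setup.

```python
import sqlite3

# Hypothetical toy rows standing in for the course's `sales` table.
rows = [
    ("2022-03-14", 120.00),
    ("2022-11-02", 80.50),
    ("2023-01-20", 200.00),
    ("2023-07-09", 49.50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (orderdate TEXT, netprice REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# SQLite variant of the query above: strftime('%Y', ...) replaces
# PostgreSQL's EXTRACT(YEAR FROM ...).
query = """
SELECT
    CAST(strftime('%Y', orderdate) AS INTEGER) AS year,
    SUM(netprice) AS total_year_net_revenue
FROM
    sales
GROUP BY
    year
ORDER BY
    year
"""
yearly_revenue = conn.execute(query).fetchall()
conn.close()

# One (year, total) tuple per year, in ascending year order.
for year, revenue in yearly_revenue:
    print(year, revenue)
```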

### In Database Tool

Open your database tool, such as pgAdmin, and write your SQL query there.

**Insert image 🖼️ of SQL query in PGAdmin**

## What You Need to Know

### SQL

Below is what you should already know in SQL before taking this course (the functions below use PostgreSQL syntax):
1. **Basic** - `SELECT`, `FROM`, `WHERE`
2. **Comparisons** - `=`, `<>`, `>`, `<`, `>=`, `<=`
3. **Operations** - `+`, `-`, `*`, `/`
4. **Alias** - `AS`
5. **Wildcards** - `LIKE`, `%`, `_`
6. **Aggregation** - `SUM`, `COUNT`, `AVG`, `MIN`, `MAX`, `GROUP BY`, `HAVING`
7. **NULL values** - `IS NULL`, `IS NOT NULL`
8. **JOINs** - `LEFT JOIN`, `RIGHT JOIN`, `INNER JOIN`, `FULL OUTER JOIN`
9. **Order of Execution** - The order in which a query's clauses are executed
10. **Data Types** - integer, text, numeric, boolean, date, timestamp
11. **Manipulate** - `CREATE`, `INSERT`, `ALTER`, `DROP` *(optional)*
12. **Database Load** - Create database & load tables using `CREATE TABLE`, `ALTER TABLE` *(optional)*
13. **DATEs** - `::DATE`, `AT TIME ZONE`, `EXTRACT`
14. **Case Expression** - `CASE WHEN`
15. **Subqueries & CTEs** - `WITH`
16. **UNIONs** - `UNION`, `UNION ALL`
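
As a quick self-check, the sketch below combines several items from this list (a CTE, a `LEFT JOIN`, aggregation, `IS NULL`, and `CASE WHEN`) in one query. It is only an illustration: the `customer` and `orders` tables and their rows are hypothetical, and it runs on an in-memory SQLite database through Python's `sqlite3` module rather than on the course's PostgreSQL database. If you can predict the output before running it, you are probably ready for this course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and rows, used only for this self-check.
conn.executescript("""
CREATE TABLE customer (customerkey INTEGER, name TEXT);
CREATE TABLE orders (customerkey INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (1, 50.0), (1, 150.0), (2, 20.0);
""")

query = """
WITH customer_totals AS (
    SELECT
        c.name,
        COUNT(o.amount) AS order_count,  -- COUNT(column) skips NULLs
        SUM(o.amount) AS total_spent     -- NULL when there are no orders
    FROM customer AS c
    LEFT JOIN orders AS o
        ON o.customerkey = c.customerkey
    GROUP BY c.name
)
SELECT
    name,
    order_count,
    CASE
        WHEN total_spent IS NULL THEN 'no orders'
        WHEN total_spent >= 100 THEN 'high'
        ELSE 'low'
    END AS segment
FROM customer_totals
ORDER BY name
"""
segments = conn.execute(query).fetchall()
conn.close()

for row in segments:
    print(row)
```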

If you need a refresher, check out our other SQL course: [Beginner SQL](https://youtu.be/7mz73uXD9DA?si=cpI_1cUkJ7dgEdMT).

### Math

You should understand these mathematical concepts:
1. **Basic Arithmetic** - `+`, `-`, `*`, `/`
2. **Order of Operations** - PEMDAS (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction)
3. **Basic Algebra** - Variables in expressions, e.g., `SUM(value * multiplier)`
4. **Basic Statistics** - `AVG`, `COUNT`, Frequency analysis
5. **Percentages and Ratios** - Calculating proportions, e.g., `(part / total) * 100`
6. **Working with Ranges** - Numeric and date ranges, e.g., `BETWEEN 1 AND 100`
7. **Date/Time Concepts** - Days, weeks, months, years, date differences (in PostgreSQL, date subtraction or `AGE`), `INTERVAL`
8. **Logical Thinking** - Boolean logic: `AND`, `OR`, `NOT`
9. **Conditional Logic** - `CASE WHEN` as SQL's equivalent to **if/then statements**
10. **Optional** - Weighted averages, cumulative sums (e.g., `SUM(value) OVER (ORDER BY orderdate)`)
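
A few of these concepts are easier to see with numbers. The sketch below works through order of operations, a percentage, a weighted average, and a cumulative sum in plain Python; all values and variable names are made up for illustration and do not come from the course data.

```python
from itertools import accumulate

# Hypothetical values, chosen only to keep the arithmetic easy to follow.
prices = [10.0, 20.0, 30.0]
quantities = [1, 2, 7]

# Order of operations: multiplication binds tighter than addition.
without_parens = 2 + 3 * 4   # 2 + 12
with_parens = (2 + 3) * 4    # 5 * 4

# Percentage: (part / total) * 100
part, total = 25, 200
percentage = (part / total) * 100

# Plain average vs. weighted average (each price weighted by quantity).
plain_avg = sum(prices) / len(prices)
weighted_avg = sum(p * q for p, q in zip(prices, quantities)) / sum(quantities)

# Cumulative sum: a running total over the ordered values.
running_total = list(accumulate(prices))

print(without_parens, with_parens)
print(percentage)
print(plain_avg, weighted_avg)
print(running_total)
```

One difference to keep in mind when moving these formulas to SQL: in PostgreSQL, dividing one integer by another truncates the result, so a percentage computed from two integer columns usually needs a cast to `numeric` (or a multiplication by `100.0`) before the division.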


## Analysis

- Include overview
- Background
- Final project (results)