In [1]:
import pandas as pd

In [2]:
pd.read_sql_table('players', 'sqlite:///source/sql-exercises/exercises.db')

Unnamed: 0,name,games_played,wins,total_score
0,Gino B.,69,192,101
1,Big D,23,88,46
2,VD,18,49,57
3,Bossti,77,256,108


`sqlite:///source/sql-exercises/exercises.db`:
- Specifies the path to the SQLite database file (`exercises.db`).
- The `sqlite:///` prefix indicates that it's using an SQLite database.

`pd.read_sql_table('players', ...)`:
- Queries the `players` table
- Converts it into a **pandas DataFrame**
- Returns the table's contents in a structured format


In [3]:
pd.read_sql_query(
    'SELECT * FROM players',
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,name,games_played,wins,total_score
0,Gino B.,69,192,101
1,Big D,23,88,46
2,VD,18,49,57
3,Bossti,77,256,108


`sqlite:///source/sql-exercises/exercises.db`:
- Specifies the **SQLite database file** location (`exercises.db`).
- The `sqlite:///` prefix indicates that the database is an SQLite file.
    
`SELECT * FROM players`
- This **SQL query** retrieves **all columns** (`*` means "all") and **all rows** from the `players` table.

`pd.read_sql_query(...)`
- Executes the SQL query.
- Loads the query **results into a pandas DataFrame**.
- Returns a **structured table** that can be processed with pandas.


In [4]:
pd.read_sql(
    'SELECT * FROM players',
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,name,games_played,wins,total_score
0,Gino B.,69,192,101
1,Big D,23,88,46
2,VD,18,49,57
3,Bossti,77,256,108


`sqlite:///source/sql-exercises/exercises.db`:
- Specifies the **SQLite database file** location (`exercises.db`).
- The `sqlite:///` prefix indicates that the database is an SQLite file.
    
`SELECT * FROM players`:
- This **SQL query** retrieves **all columns** (`*` means "all") and **all rows** from the `players` table.

`pd.read_sql(...)`:
- Executes the SQL query (given a query string, `read_sql` behaves like `read_sql_query`).
- Loads the query **results into a pandas DataFrame.**
- Returns a **structured table** that can be processed with pandas.

In [5]:
pd.read_sql(
    'players',
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,name,games_played,wins,total_score
0,Gino B.,69,192,101
1,Big D,23,88,46
2,VD,18,49,57
3,Bossti,77,256,108


- **Reads data from the** `players` **table** in the SQLite database `exercises.db`.
- **Loads the table into a Pandas DataFrame**, making it easy to manipulate in Python.
- **Returns the entire table**: given a bare table name instead of a query, `pd.read_sql` behaves like `read_sql_table`, so the result matches `read_sql_query('SELECT * FROM players', ...)`.

In [6]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE country = "United Kingdom"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,01/12/2010 8:26,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,01/12/2010 8:26,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
5,536365,22752,SET 7 BABUSHKA NESTING BOXES,2,01/12/2010 8:26,7.65,17850.0,United Kingdom
6,536365,21730,GLASS STAR FROSTED T-LIGHT HOLDER,6,01/12/2010 8:26,4.25,17850.0,United Kingdom
7,536366,22633,HAND WARMER UNION JACK,6,01/12/2010 8:28,1.85,17850.0,United Kingdom
8,536366,22632,HAND WARMER RED POLKA DOT,6,01/12/2010 8:28,1.85,17850.0,United Kingdom
9,536367,84879,ASSORTED COLOUR BIRD ORNAMENT,32,01/12/2010 8:34,1.69,13047.0,United Kingdom


`pd.read_sql(...)`
- This **reads the result of an SQL query** into a Pandas DataFrame.
- It's useful for handling database data in Python.

`SELECT *`
- Retrieves **all columns** from the table.

`FROM transactions`
- Specifies the **table** from which the data is retrieved (`transactions`).

`WHERE country = "United Kingdom"`
- Filters the data to **only include rows where the** `country` **column is "United Kingdom"**.

`LIMIT 10`
- Retrieves only **the first 10 rows** that match the condition.
    
`'sqlite:///source/sql-exercises/exercises.db'`
- Specifies the **database file** to read from (`exercises.db` in the `sql-exercises` directory).
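
The combination of `WHERE` and `LIMIT` can be tried on a throwaway table. The following is a minimal sketch that assumes a made-up four-row `transactions` table in an in-memory SQLite database (not the real `exercises.db`) and uses the standard-library `sqlite3` module. It writes the string literal with single quotes, which is the standard SQL form; double quotes, as in the query above, work in SQLite only because it falls back to reading an unresolvable double-quoted identifier as a string.

```python
import sqlite3

# Hypothetical sample data; the real exercises.db is not used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (InvoiceNo TEXT, Quantity INTEGER, Country TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("536365", 6, "United Kingdom"),
        ("536370", 24, "France"),
        ("536366", 6, "United Kingdom"),
        ("536367", 32, "United Kingdom"),
    ],
)

# WHERE keeps only the UK rows; LIMIT caps the result at two of them.
rows = conn.execute(
    "SELECT * FROM transactions WHERE Country = 'United Kingdom' LIMIT 2"
).fetchall()
print(rows)
```

Three rows satisfy the condition, but only two are returned because of `LIMIT 2`.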

In [7]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE Quantity < 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,22752,SET 7 BABUSHKA NESTING BOXES,2,01/12/2010 8:26,7.65,17850.0,United Kingdom
1,536367,22623,BOX OF VINTAGE JIGSAW BLOCKS,3,01/12/2010 8:34,4.95,13047.0,United Kingdom
2,536367,22622,BOX OF VINTAGE ALPHABET BLOCKS,2,01/12/2010 8:34,9.95,13047.0,United Kingdom
3,536367,21754,HOME BUILDING BLOCK WORD,3,01/12/2010 8:34,5.95,13047.0,United Kingdom
4,536367,21755,LOVE BUILDING BLOCK WORD,3,01/12/2010 8:34,5.95,13047.0,United Kingdom
5,536367,21777,RECIPE BOX WITH METAL HEART,4,01/12/2010 8:34,7.95,13047.0,United Kingdom
6,536367,48187,DOORMAT NEW ENGLAND,4,01/12/2010 8:34,7.95,13047.0,United Kingdom
7,536368,22913,RED COAT RACK PARIS FASHION,3,01/12/2010 8:34,4.95,13047.0,United Kingdom
8,536368,22912,YELLOW COAT RACK PARIS FASHION,3,01/12/2010 8:34,4.95,13047.0,United Kingdom
9,536368,22914,BLUE COAT RACK PARIS FASHION,3,01/12/2010 8:34,4.95,13047.0,United Kingdom


- `SELECT *` → Retrieves **all columns** from the `transactions` table.
- `FROM transactions` → Specifies that the data should come from the **transactions** table.
- `WHERE Quantity < 5` → Filters the results, selecting only **rows where the Quantity column is less than 5.**
- `LIMIT 10` → Restricts the output to **10 rows only.**
- `sqlite:///source/sql-exercises/exercises.db` → Specifies the database file from which the data is being fetched.

In [None]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE quantity < 5 -- SQLite column names are case-insensitive, so this matches "Quantity"; using the schema's casing is still preferable
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

- Does the same as the code above
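
SQLite resolves column names without regard to case, which is why `quantity` and `Quantity` return the same rows. A minimal sketch on a made-up one-column table (the table contents are an assumption for illustration):

```python
import sqlite3

# Hypothetical sample table with a mixed-case column name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Quantity INTEGER)")
conn.executemany("INSERT INTO transactions VALUES (?)", [(2,), (5,), (8,)])

# The same filter, written with the schema's casing and in lower case.
exact_case = conn.execute(
    "SELECT * FROM transactions WHERE Quantity < 5"
).fetchall()
lower_case = conn.execute(
    "SELECT * FROM transactions WHERE quantity < 5"
).fetchall()

print(exact_case == lower_case)  # True
```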

In [None]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE quantity > 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

- **Queries the** `transactions` **table** from the SQLite database located at `'sqlite:///source/sql-exercises/exercises.db'`.
- **Filters rows where** `quantity` **is greater than 5** (`WHERE quantity > 5`).
- **Limits the output to 10 rows** (`LIMIT 10`).
- **Fetches and displays the results in a Pandas DataFrame.**

In [None]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE quantity <= 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

- `SELECT *` → Retrieves all columns from the `transactions` table.
- `FROM transactions` → Specifies that data should be fetched from the `transactions` table.
- `WHERE quantity <= 5` → Filters only rows where the `quantity` is **5 or less**.
- `LIMIT 10` → Returns only **the first 10 rows** that match the condition.
- `pd.read_sql(...)` → Executes the SQL query and loads the result into a **pandas DataFrame**.

In [None]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE quantity >= 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

- `SELECT *` → Retrieves **all columns** from the `transactions` table.  
- `FROM transactions` → Specifies that the query is retrieving data from the `transactions` table.  
- `WHERE quantity >= 5` → Filters results to **only include rows** where `quantity` is **5 or greater**.   
- `LIMIT 10` → Restricts the output to **only 10 rows** that meet the condition.  
- `pd.read_sql(...)` → Runs the SQL query and **loads the result into a pandas DataFrame.**  

In [12]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE quantity != 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,01/12/2010 8:26,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,01/12/2010 8:26,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
5,536365,22752,SET 7 BABUSHKA NESTING BOXES,2,01/12/2010 8:26,7.65,17850.0,United Kingdom
6,536365,21730,GLASS STAR FROSTED T-LIGHT HOLDER,6,01/12/2010 8:26,4.25,17850.0,United Kingdom
7,536366,22633,HAND WARMER UNION JACK,6,01/12/2010 8:28,1.85,17850.0,United Kingdom
8,536366,22632,HAND WARMER RED POLKA DOT,6,01/12/2010 8:28,1.85,17850.0,United Kingdom
9,536367,84879,ASSORTED COLOUR BIRD ORNAMENT,32,01/12/2010 8:34,1.69,13047.0,United Kingdom


- `SELECT *` → Retrieves **all columns** from the `transactions` table.
- `FROM transactions` → Specifies that the query is retrieving data from **the** `transactions` **table.**
- `WHERE quantity != 5` → **Filters the rows, excluding** those where `quantity` **is exactly 5.**
    - Larger `quantity` values may appear first because, without an `ORDER BY` clause, SQL does not guarantee any particular row order; SQLite returns rows in whatever order it happens to read them.
- `LIMIT 10` → **Restricts** the result to **only 10 rows.**
- `pd.read_sql(...)` → **Runs the SQL query** and loads the **output into a pandas DataFrame.**

In [13]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE
        country = "United Kingdom"
        AND quantity > 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,01/12/2010 8:26,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,01/12/2010 8:26,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,01/12/2010 8:26,3.39,17850.0,United Kingdom
5,536365,21730,GLASS STAR FROSTED T-LIGHT HOLDER,6,01/12/2010 8:26,4.25,17850.0,United Kingdom
6,536366,22633,HAND WARMER UNION JACK,6,01/12/2010 8:28,1.85,17850.0,United Kingdom
7,536366,22632,HAND WARMER RED POLKA DOT,6,01/12/2010 8:28,1.85,17850.0,United Kingdom
8,536367,84879,ASSORTED COLOUR BIRD ORNAMENT,32,01/12/2010 8:34,1.69,13047.0,United Kingdom
9,536367,22745,POPPY'S PLAYHOUSE BEDROOM,6,01/12/2010 8:34,2.1,13047.0,United Kingdom


- **Selects all columns** (`SELECT *`) from the **"transactions"** table.
- **Filters the results using** `WHERE`:
    - Only includes rows where `country = "United Kingdom"` (**limits results to UK transactions**).
    - Only includes rows where `quantity > 5` (**excludes purchases with 5 or fewer items**).
- **Limits the output to 10 rows** (`LIMIT 10`) → Ensures only the first 10 matching rows are retrieved.
- **Reads from an SQLite database** (`exercises.db`) and loads the result into a pandas DataFrame.

In [None]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE
        country = "United Kingdom"
        AND quantity <> 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

- **Selects all columns** (`SELECT *`) from the **"transactions"** table.
- **Filters the results using** `WHERE`:
    - Only includes rows where `country = "United Kingdom"` (**limits results to UK transactions**).
    - Only includes rows where `quantity <> 5` (**excludes purchases where quantity is exactly 5**).
- **Limits the output to 10 rows** (`LIMIT 10`) → Ensures only the first 10 matching rows are retrieved.
- **Reads from an SQLite database** (`exercises.db`) and loads the result into a pandas DataFrame.
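
`<>` and `!=` are two spellings of the same "not equal" operator. The sketch below, on a made-up table, checks that both return the same rows; it also includes a `NULL` quantity to show that a missing value is returned by neither form, because comparing `NULL` with anything yields `NULL` rather than true.

```python
import sqlite3

# Hypothetical sample data, including one missing quantity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Quantity INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?)", [(3,), (5,), (6,), (None,)]
)

with_bang = conn.execute(
    "SELECT Quantity FROM transactions WHERE Quantity != 5 ORDER BY Quantity"
).fetchall()
with_angle = conn.execute(
    "SELECT Quantity FROM transactions WHERE Quantity <> 5 ORDER BY Quantity"
).fetchall()

print(with_bang)                # [(3,), (6,)]
print(with_bang == with_angle)  # True
```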


In [None]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE
        country LIKE "United %"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

- `SELECT *` → Selects all columns from the `transactions` table.
- `FROM transactions` → Specifies the table to retrieve data from.
- `WHERE country LIKE "United %"` → Filters rows where the `country` name **starts with "United ".**
    - The `%` is a **wildcard** that matches **any characters** following `"United "` (including spaces).
- `LIMIT 10` → Limits the results to the **first 10 matching rows.**
- **Loads the query results into a Pandas DataFrame** using `pd.read_sql()`.

In [16]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country LIKE "United %"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


- `SELECT Country, CustomerID` → Retrieves only the `Country` and `CustomerID` columns.
- `FROM transactions` → Specifies the table to fetch the data from.
- `WHERE country LIKE "United %"` → Filters for records where the `Country` name **starts with** `"United "`.
- `LIMIT 10` → Limits the output to **only the first 10 matching rows**.
- **Loads the query results into a Pandas DataFrame** using `pd.read_sql()`.

In [17]:
pd.read_sql(
    """
    SELECT Country, CustomerID AS `Customer ID`
    FROM transactions
    WHERE
        country LIKE "United %"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,Customer ID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`` SELECT Country, CustomerID AS `Customer ID` ``:
- Selects two columns: `Country` and `CustomerID`.
- **Renames** `CustomerID` to `Customer ID` for better readability.
                                        
`FROM transactions`:
- Specifies the table from which data is retrieved.

`WHERE country LIKE "United %"`:
- Filters results to include only **countries that start with** `"United "`, including the trailing space (e.g., `"United Kingdom"`, `"United States"`, etc.).

`LIMIT 10`:
- Limits the output to **only the first 10 matching rows**.

**The query is executed using** `pd.read_sql()`
- The results are loaded into a **Pandas DataFrame** for analysis.
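
An alias changes only the name of the column in the result, not the table itself. The following sketch uses a made-up one-row table and reads the result column names from the cursor. It quotes the alias with double quotes, the standard SQL way to quote an identifier that contains a space; the backticks used in the query above are MySQL-style quoting that SQLite also accepts.

```python
import sqlite3

# Hypothetical one-row table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Country TEXT, CustomerID REAL)")
conn.execute("INSERT INTO transactions VALUES ('United Kingdom', 17850.0)")

cursor = conn.execute(
    'SELECT Country, CustomerID AS "Customer ID" FROM transactions'
)

# cursor.description holds one entry per result column; item 0 is its name.
column_names = [col[0] for col in cursor.description]
print(column_names)  # ['Country', 'Customer ID']
```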

In [18]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country LIKE "Uni%"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`SELECT Country, CustomerID`
- Retrieves the `Country` and `CustomerID` columns.

`FROM transactions`
- Specifies that data is fetched from the `"transactions"` table.

`WHERE country LIKE "Uni%"`
- **Filters rows where the** `Country` **starts with** `"Uni"`.
- `LIKE "Uni%"` means:
    - `"Uni%"` → Matches any country name **starting with** `"Uni"`.
    - Example matches: `"United Kingdom"`, `"United States"`, `"United Arab Emirates"`, etc.
    
`LIMIT 10`
- Restricts the output to **only the first 10 matching rows.**

`pd.read_sql()`
- The query is executed in Python, and the result is stored as a **Pandas DataFrame.**

In [19]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country LIKE "%Uni"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID


`SELECT Country, CustomerID`
- Retrieves only the Country and CustomerID columns.

`FROM transactions`
- Specifies the `"transactions"` table as the data source.

`WHERE country LIKE "%Uni"`
- **Filters rows where the** `Country` **ends with** `"Uni"`.
- `LIKE "%Uni"` means:
    - `%` **wildcard at the beginning** → Matches **any characters before** `"Uni"`.
    - Hypothetical matches: `"SomeUni"`, `"AnotherUni"`, `"RandomUni"`. No country in the table ends with `"Uni"`, which is why the result above is empty.
    
`LIMIT 10`
- Restricts the output to **only the first 10 matching rows.**

`pd.read_sql()`
- Executes the query in **Python using Pandas**, storing the result as a **DataFrame**.

In [20]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country LIKE "%dom"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`SELECT Country, CustomerID`:
- Retrieves **only the** `Country` **and** `CustomerID` columns.

`FROM transactions`:
- Specifies the "transactions" table as the data source.

`WHERE country LIKE "%dom"`:
- **Filters rows where the** `Country` **ends with** `"dom"`.
- The `LIKE "%dom"` condition means:
    - `%` **wildcard at the beginning** → Matches **any characters before** `"dom"`.
        - *Note:* A space counts as a character, so `%` also matches across the space in `"United Kingdom"`.
    - **Example matches:** `"United Kingdom"`, `"Randomdom"`, `"Somekingdom"`, etc.
    
`LIMIT 10`:
- Restricts the output to **only the first 10 matching rows.**

`pd.read_sql()`
- Executes the query in **Python using Pandas**, storing the result as a **DataFrame**.

In [21]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country LIKE "%dom%"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`SELECT Country, CustomerID`
- Retrieves only the **Country** and **CustomerID** columns.

`FROM transactions`
- Fetches data from the **transactions** table.

`WHERE country LIKE "%dom%"`
- Uses the `LIKE` operator to filter rows where the country name **contains "dom"** anywhere.
- The wildcard `%` means:
    - `"%dom%"` → Matches any country name that has `"dom"` anywhere in it.
    - Examples: ✅ `"United Kingdom"`, ✅ `"Randomdom Country"`, ❌ `"Germany"` (does not contain "dom")
    
`LIMIT 10`
- Restricts the output to **only 10 rows.**
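
The effect of where the `%` wildcard sits can be compared side by side. This is a minimal sketch on a made-up `countries` table (the table and the `matching` helper are assumptions for illustration). It also shows that SQLite's `LIKE` ignores case for ASCII letters by default.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?)",
    [("United Kingdom",), ("United Arab Emirates",), ("Unspecified",), ("Germany",)],
)


def matching(pattern):
    """Return the country names that match a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM countries WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [name for (name,) in rows]


starts_with_united = matching("United %")  # prefix match
ends_with_dom = matching("%dom")           # suffix match
contains_dom = matching("%dom%")           # substring match
ends_with_uni = matching("%Uni")           # nothing ends with "Uni"
lowercase_pattern = matching("united %")   # LIKE ignores ASCII case by default

print(starts_with_united)  # ['United Arab Emirates', 'United Kingdom']
print(ends_with_uni)       # []
```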

In [22]:
df = pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country LIKE "%dom%"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

`df = pd.read_sql(...)`:
- Runs an **SQL query** and stores the result in a **Pandas DataFrame** (`df`).

`SELECT Country, CustomerID`:
- Extracts only the **"Country"** and **"CustomerID"** columns.

`FROM transactions`
- Specifies the **"transactions"** table as the data source.

`WHERE country LIKE "%dom%"`
- Uses the `LIKE` operator to **filter** country names that contain the substring **"dom"** anywhere.
- The `%` wildcards on both sides allow any characters before and after `"dom"`:
    - `"United Kingdom"` ✅ (contains `"dom"`)
    - `"Randomdom Country"` ✅ (hypothetical example)
    - `"Germany"` ❌ (does not contain `"dom"`)
    
`LIMIT 10`
- Limits the results to **only 10 rows.**

`'sqlite:///source/sql-exercises/exercises.db'`
- Connects to the **SQLite database file** located at `source/sql-exercises/exercises.db`.

In [23]:
df

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


In [29]:
df.to_sql(
    'country_ids',
    'sqlite:///sample.sqlite',
    index=False,
    if_exists='replace'
)

10

`df.to_sql(...)`
- **Saves** the Pandas DataFrame (`df`) into an SQLite database as a new table.

`'country_ids'`
- The name of the **table** inside the database.

`'sqlite:///sample.sqlite'`
- Connects to an SQLite **database file** named `"sample.sqlite"`.
- If the file **does not exist**, SQLite will **automatically create it.**

`index=False`
- Prevents Pandas from saving the index column in the table.

`if_exists='replace'`
- **If the table** `'country_ids'` **already exists**, it is **deleted and replaced** with the new data.
- Alternative values:
    - `'fail'` → Raises an error if the table exists.
    - `'append'` → Adds new rows to the existing table.
    
`10` **(Output)**
- The function **returns the number of rows inserted** into the SQLite table.
- In this case, **10 rows** were saved.
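
The three `if_exists` modes can be compared on a small example. The sketch below makes two assumptions that differ from the cell above: it uses a made-up two-row DataFrame, and it passes an in-memory `sqlite3` connection instead of a `sqlite:///` URL so that no file is written.

```python
import sqlite3

import pandas as pd

# Hypothetical two-row DataFrame and an in-memory database.
conn = sqlite3.connect(":memory:")
df = pd.DataFrame(
    {"Country": ["United Kingdom", "France"], "CustomerID": [17850.0, 12583.0]}
)


def row_count():
    """Return the current number of rows in the country_ids table."""
    result = pd.read_sql("SELECT COUNT(*) AS n FROM country_ids", conn)
    return int(result["n"].iloc[0])


# 'replace' creates the table (or recreates it if it already exists).
df.to_sql("country_ids", conn, index=False, if_exists="replace")

# 'append' adds the same two rows again, giving four in total.
df.to_sql("country_ids", conn, index=False, if_exists="append")
rows_after_append = row_count()

# 'replace' drops the table and writes only the two current rows.
df.to_sql("country_ids", conn, index=False, if_exists="replace")
rows_after_replace = row_count()

# 'fail' (the default) raises ValueError because the table exists.
try:
    df.to_sql("country_ids", conn, index=False, if_exists="fail")
    fail_raised = False
except ValueError:
    fail_raised = True

print(rows_after_append, rows_after_replace, fail_raised)
```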

In [30]:
pd.read_sql('country_ids', 'sqlite:///sample.sqlite')

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`pd.read_sql(...)`
- **Reads a table or query result** from an SQLite database into a Pandas DataFrame.

`'country_ids'`
- The **name of the table** to retrieve from the database.

`'sqlite:///sample.sqlite'`
- Specifies the **SQLite database file** from which to retrieve the data.
- If the file exists, it connects and extracts the table data.


In [33]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country REGEXP "^[AB].*"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,Australia,12431.0
1,Australia,12431.0
2,Australia,12431.0
3,Australia,12431.0
4,Australia,12431.0
5,Australia,12431.0
6,Australia,12431.0
7,Australia,12431.0
8,Australia,12431.0
9,Australia,12431.0


`SELECT Country, CustomerID`
- Retrieves only the `Country` and `CustomerID` columns from the table.

`FROM transactions`
- Specifies the **table name** (`transactions`) where the data is stored.

`WHERE country REGEXP "^[AB].*"`
- Uses **regular expressions** (`REGEXP`) to filter country names:
    - `^` → **Matches the start of the string.**
    - `[AB]` → **Matches "A" or "B" as the first character.**
    - `.*` → **Matches any number of characters after "A" or "B".**
- This means it retrieves all countries **starting with "A" or "B"** (e.g., Australia, Austria, Belgium, Brazil).

`LIMIT 10`
- **Limits** the output to **10 rows only**.

`'sqlite:///source/sql-exercises/exercises.db'`
- Specifies the **SQLite database file** from which the data is retrieved.
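
SQLite defines the `REGEXP` operator but does not ship an implementation for it by default; the operator calls a user-supplied `regexp()` function. The query above most likely works because the database layer behind `pd.read_sql` registers such a function (an assumption about this setup). The sketch below registers one by hand with the standard-library `sqlite3` module, backed by Python's `re.search`, on a made-up `countries` table.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Implement `value REGEXP pattern`; SQLite passes the pattern first."""
    return value is not None and re.search(pattern, value) is not None


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)

# Hypothetical sample data.
conn.execute("CREATE TABLE countries (name TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?)",
    [("Australia",), ("Belgium",), ("France",), ("Saudi Arabia",)],
)

rows = conn.execute(
    "SELECT name FROM countries WHERE name REGEXP '^[AB]' ORDER BY name"
).fetchall()
print(rows)  # [('Australia',), ('Belgium',)]
```

With search semantics, `^[AB]` selects the same rows as `^[AB].*`: the trailing `.*` is optional because the match does not have to cover the whole string. `Saudi Arabia` is excluded because its `A` is not at the start.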

In [35]:
pd.read_sql(
    r"""
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country REGEXP "\w+ \w+"
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`SELECT Country, CustomerID`
- Retrieves only the `Country` and `CustomerID` columns from the table.

`FROM transactions`
- Specifies the **table name** (`transactions`) where the data is stored.

`WHERE country REGEXP "\w+ \w+"`
- Uses a **regular expression** (`REGEXP`) to filter country names:
    - `\w+` → **Matches a word** (one or more letters, digits, or underscores).
    - **Space (` `)** → Requires a **space between two words**.
    - `\w+` → **Matches another word** after the space.
- Because the pattern is not anchored with `^` and `$`, it selects country names containing **at least two words separated by a space**, not exactly two (assuming `REGEXP` searches anywhere in the string, as Python's `re.search` does).
- Examples:
    - ✅ `United Kingdom`
    - ✅ `New Zealand`
    - ✅ `United Arab Emirates` (the first two words already satisfy the pattern)
    - ❌ `Germany` (single word)
    - ❌ `South-Korea` (hyphen instead of a space)

`LIMIT 10`
- **Limits** the output to **10 rows only**.

`'sqlite:///source/sql-exercises/exercises.db'`
- Specifies the **SQLite database file** from which the data is retrieved.
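
The difference between an unanchored and an anchored pattern can be checked directly in Python. This sketch assumes that `REGEXP` behaves like `re.search` (matching anywhere in the string); the list of names is made up for illustration.

```python
import re

# Hypothetical country names.
names = ["United Kingdom", "United Arab Emirates", "Germany", "South-Korea"]

# Unanchored: any name containing "word space word" somewhere.
unanchored = [n for n in names if re.search(r"\w+ \w+", n)]

# Anchored with ^ and $: the whole name must be exactly two words.
exactly_two = [n for n in names if re.search(r"^\w+ \w+$", n)]

print(unanchored)   # ['United Kingdom', 'United Arab Emirates']
print(exactly_two)  # ['United Kingdom']
```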

In [36]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country ISNULL
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID


`SELECT Country, CustomerID`:
- Retrieves the `Country` and `CustomerID` columns from the table.

`FROM transactions`
- Specifies that the data is retrieved from the `transactions` table.

`WHERE country ISNULL`
- Filters rows where the `country` **column contains** `NULL` **values**.
- `ISNULL` **in SQLite** is a shorthand for: `WHERE COUNTRY IS NULL`
- **NULL** means that the value is **missing or unknown** (i.e., no data was entered for that field).

`LIMIT 10`
- Restricts the output to **10 rows only**.

`'sqlite:///source/sql-exercises/exercises.db'`
- Specifies the **SQLite database file** that stores the `transactions` table.

In [37]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country NOTNULL
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,United Kingdom,17850.0
1,United Kingdom,17850.0
2,United Kingdom,17850.0
3,United Kingdom,17850.0
4,United Kingdom,17850.0
5,United Kingdom,17850.0
6,United Kingdom,17850.0
7,United Kingdom,17850.0
8,United Kingdom,17850.0
9,United Kingdom,13047.0


`SELECT Country, CustomerID`
- Retrieves the `Country` and `CustomerID` columns from the table.

`FROM transactions`
- Specifies that the data is retrieved from the `transactions` table.

`WHERE country NOTNULL`
- `NOTNULL` is a valid SQLite operator, which is why the query runs and returns rows. It is SQLite-specific rather than standard SQL.
- The standard, portable equivalent is: `WHERE country IS NOT NULL`
- This filters rows where the `country` column **contains a value** (i.e., not missing).

`LIMIT 10`
- Retrieves only **10 rows** that match the condition.

`'sqlite:///source/sql-exercises/exercises.db'`
- Specifies the **SQLite database file** containing the `transactions` table.
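
SQLite accepts several spellings of the null checks. The sketch below counts matching rows for each spelling on a made-up three-row table (one row has a missing country); the `count_where` helper is hypothetical and only meant for these hard-coded conditions.

```python
import sqlite3

# Hypothetical sample data with one NULL country.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Country TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?)", [("France",), (None,), ("Germany",)]
)


def count_where(condition):
    """Count rows satisfying a hard-coded WHERE condition (trusted input only)."""
    query = f"SELECT COUNT(*) FROM transactions WHERE {condition}"
    return conn.execute(query).fetchone()[0]


counts = {
    condition: count_where(condition)
    for condition in [
        "Country NOTNULL",      # SQLite shorthand
        "Country NOT NULL",     # also accepted by SQLite
        "Country IS NOT NULL",  # standard SQL
        "Country ISNULL",       # SQLite shorthand
        "Country IS NULL",      # standard SQL
    ]
}
print(counts)
```

The three "not null" spellings each count 2 rows and the two "is null" spellings each count 1. `IS NULL` and `IS NOT NULL` are the forms that carry over to other databases.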

In [39]:
pd.read_sql(
    """
    SELECT StockCode || Description
    FROM transactions
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,StockCode || Description
0,85123AWHITE HANGING HEART T-LIGHT HOLDER
1,71053WHITE METAL LANTERN
2,84406BCREAM CUPID HEARTS COAT HANGER
3,84029GKNITTED UNION FLAG HOT WATER BOTTLE
4,84029ERED WOOLLY HOTTIE WHITE HEART.
5,22752SET 7 BABUSHKA NESTING BOXES
6,21730GLASS STAR FROSTED T-LIGHT HOLDER
7,22633HAND WARMER UNION JACK
8,22632HAND WARMER RED POLKA DOT
9,84879ASSORTED COLOUR BIRD ORNAMENT


`SELECT StockCode || Description`
- `||` **is the string concatenation operator** in SQLite.
- This **joins the** `StockCode` **and** `Description` values into a **single string**.

`FROM transactions`
- The data is retrieved from the `transactions` table.

`LIMIT 10`
- Restricts the output to **10 rows only**.
    
`'sqlite:///source/sql-exercises/exercises.db'`
- Specifies the **SQLite database file** that contains the `transactions` table.
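
Because `||` joins values with nothing in between, a separator usually has to be added explicitly. The sketch below does that on a made-up two-row table and also shows that concatenating with `NULL` yields `NULL`. The `label` alias and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data; the second row has no description.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (StockCode TEXT, Description TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("71053", "WHITE METAL LANTERN"), ("22139", None)],
)

rows = conn.execute(
    """
    SELECT StockCode || ' - ' || Description AS label
    FROM transactions
    ORDER BY StockCode
    """
).fetchall()
print(rows)  # [(None,), ('71053 - WHITE METAL LANTERN',)]
```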

In [40]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country NOTNULL
    ORDER BY Country
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,Australia,12431.0
1,Australia,12431.0
2,Australia,12431.0
3,Australia,12431.0
4,Australia,12431.0
5,Australia,12431.0
6,Australia,12431.0
7,Australia,12431.0
8,Australia,12431.0
9,Australia,12431.0


`SELECT Country, CustomerID`
- Retrieves the `Country` and `CustomerID` columns.

`FROM transactions`
- Fetches data from the `transactions` table.

`WHERE country NOTNULL`
- Ensures that **only rows where** `Country` **is NOT NULL** are included.
- `NOTNULL` is a SQLite-specific operator, so the query runs as written; the standard, portable form is `WHERE Country IS NOT NULL`.

`ORDER BY Country`
- **Sorts the results alphabetically by** `Country`.

`LIMIT 10`
- Limits the output to **10 rows**.

In [41]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country NOTNULL
    ORDER BY Country DESC
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,Unspecified,12363.0
1,Unspecified,12363.0
2,Unspecified,12363.0
3,Unspecified,12363.0
4,Unspecified,12363.0
5,Unspecified,12363.0
6,Unspecified,12363.0
7,Unspecified,12363.0
8,Unspecified,12363.0
9,Unspecified,12363.0


`SELECT Country, CustomerID`
- Retrieves the `Country` and `CustomerID` columns.

`FROM transactions`
- Fetches data from the `transactions` table.

`WHERE country NOTNULL`
- Ensures that **only rows where** `Country` **is NOT NULL** are included.
- `NOTNULL` is a SQLite-specific operator, so the query runs as written; the standard, portable form is `WHERE Country IS NOT NULL`.

`ORDER BY Country DESC`
- **Sorts the results in descending order** (`Z` to `A`).

`LIMIT 10`
- Limits the output to **10 rows**.

In [42]:
pd.read_sql(
    """
    SELECT Country, CustomerID
    FROM transactions
    WHERE
        country NOTNULL
    ORDER BY Country DESC, CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,CustomerID
0,Unspecified,
1,Unspecified,
2,Unspecified,
3,Unspecified,
4,Unspecified,
5,Unspecified,
6,Unspecified,
7,Unspecified,
8,Unspecified,
9,Unspecified,


`SELECT Country, CustomerID`
- Retrieves the `Country` and `CustomerID` columns.

`FROM transactions`
- Fetches data from the `transactions` table.

`WHERE country NOTNULL`
- Ensures that **only rows where** `Country` **is NOT NULL** are included.
- `NOTNULL` is a SQLite-specific operator, so the query runs as written; the standard, portable form is `WHERE Country IS NOT NULL`.

`ORDER BY Country DESC, CustomerID`
- **Sorts by** `Country` **in descending order** (`Z` to `A`).
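- If multiple rows have the same `Country`, **they are then sorted by** `CustomerID` in ascending order. SQLite treats `NULL` as smaller than any other value, so rows with a missing `CustomerID` come first, which is why the column is empty in the output above.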

`LIMIT 10`
- Limits the output to **10 rows**.
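
How SQLite places missing values in a multi-column sort can be seen on a small example. This sketch uses a made-up four-row table. The second query shows one way to push the `NULL` values to the end of each group: sorting first on the expression `CustomerID IS NULL`, which is 0 for present values and 1 for missing ones.

```python
import sqlite3

# Hypothetical sample data with one missing CustomerID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Country TEXT, CustomerID REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("Unspecified", 12363.0),
        ("Unspecified", None),
        ("Australia", 12431.0),
        ("Unspecified", 14265.0),
    ],
)

# Default ascending order: NULL sorts before every other value.
nulls_first = conn.execute(
    "SELECT Country, CustomerID FROM transactions "
    "ORDER BY Country DESC, CustomerID"
).fetchall()

# Sorting on "CustomerID IS NULL" first moves missing values to the end.
nulls_last = conn.execute(
    "SELECT Country, CustomerID FROM transactions "
    "ORDER BY Country DESC, CustomerID IS NULL, CustomerID"
).fetchall()

print(nulls_first[0])  # ('Unspecified', None)
print(nulls_last[0])   # ('Unspecified', 12363.0)
```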

In [43]:
pd.read_sql(
    """
    SELECT Country, CustomerID AS `Customer ID`
    FROM transactions
    WHERE
        country NOTNULL
    ORDER BY Country DESC, `Customer ID`
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,Customer ID
0,Unspecified,
1,Unspecified,
2,Unspecified,
3,Unspecified,
4,Unspecified,
5,Unspecified,
6,Unspecified,
7,Unspecified,
8,Unspecified,
9,Unspecified,


`` SELECT Country, CustomerID AS `Customer ID` ``
- Retrieves the `Country` and `CustomerID` columns.
- Renames `CustomerID` as `Customer ID` in the output.

`FROM transactions`
- Fetches data from the `transactions` table.

`WHERE country NOTNULL`
- **Filters out NULL values** in the `Country` column.
- `NOTNULL` is valid in SQLite but non-standard; the portable form is `WHERE Country IS NOT NULL`.

`` ORDER BY Country DESC, `Customer ID` ``
- **Sorts by** `Country` **in descending order** (`Z` to `A`).
- If multiple rows have the same `Country`, they are **sorted by** `Customer ID`.

`LIMIT 10`
- Limits the output to **10 rows**.


In [45]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE
        country NOTNULL
    ORDER BY Country DESC, CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,553857,23090,VINTAGE GLASS T-LIGHT HOLDER,12,19/05/2011 13:30,0.83,,Unspecified
1,553857,47021G,SET/6 BEAD COASTERS GAUZE BAG GOLD,48,19/05/2011 13:30,0.39,,Unspecified
2,553857,79030D,TUMBLER BAROQUE,24,19/05/2011 13:30,0.39,,Unspecified
3,553857,84877A,PINK ROUND COMPACT MIRROR,24,19/05/2011 13:30,1.25,,Unspecified
4,553857,22178,VICTORIAN GLASS HANGING T-LIGHT,12,19/05/2011 13:30,1.25,,Unspecified
5,553857,84913B,MINT GREEN ROSE TOWEL,8,19/05/2011 13:30,4.65,,Unspecified
6,553857,85125,SMALL ROUND CUT GLASS CANDLESTICK,3,19/05/2011 13:30,4.95,,Unspecified
7,553857,21380,WOODEN HAPPY BIRTHDAY GARLAND,6,19/05/2011 13:30,2.95,,Unspecified
8,553857,47590A,BLUE HAPPY BIRTHDAY BUNTING,3,19/05/2011 13:30,5.45,,Unspecified
9,553857,47590B,PINK HAPPY BIRTHDAY BUNTING,3,19/05/2011 13:30,5.45,,Unspecified


`SELECT *`
- Retrieves **all columns** from the `transactions` table.

`FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

`WHERE country NOTNULL`
- **Filters out** records where the `Country` column is **null (empty/missing values)**.
                                                            
`ORDER BY Country DESC, CustomerID`
- **Sorts by Country in descending order (Z → A)**.
- If multiple rows have the same `Country`, they are **sorted by** `CustomerID` **in ascending order (default).**
                                                                
`LIMIT 10`
- Restricts the output to only **10 rows**.
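
A small sketch of the two-level sort, assuming a made-up in-memory table with four hypothetical rows (not the real `exercises.db`). It shows that `Country` is sorted Z to A first, and that within one country SQLite puts a NULL `CustomerID` before any number in ascending order.

```python
import sqlite3

# Hypothetical rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (CustomerID REAL, Country TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (12743.0, "Unspecified"),
        (None, "Unspecified"),
        (12347.0, "Iceland"),
        (14265.0, "Unspecified"),
    ],
)

ordered = conn.execute(
    """
    SELECT Country, CustomerID
    FROM transactions
    ORDER BY Country DESC, CustomerID
    """
).fetchall()

for row in ordered:
    print(row)
```

The `Unspecified` rows come first, starting with the one that has no `CustomerID`, which matches the pattern in the output above.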

In [46]:
pd.read_sql(
    """
    SELECT ABS(UnitPrice)
    FROM transactions
    WHERE
        country NOTNULL
    ORDER BY Country DESC, CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,ABS(UnitPrice)
0,0.83
1,0.39
2,0.39
3,1.25
4,1.25
5,4.65
6,4.95
7,2.95
8,5.45
9,5.45


`SELECT ABS(UnitPrice)`
- Extracts the **absolute value** of `UnitPrice` (removes any negative signs).
- If there are **negative values** in `UnitPrice`, they are converted to positive.

`FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

`WHERE country NOTNULL`
- Filters out records where the `Country` column is **null (empty/missing values)**.
                                                            
`ORDER BY Country DESC, CustomerID`
- **Sorts by `Country` in descending order (Z → A)**.
- If multiple rows have the same `Country`, they are **sorted by** `CustomerID` **in ascending order**.
                                                            
`LIMIT 10`
- Restricts the output to only **10 rows**.

In [47]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    WHERE
        ABS(UnitPrice) > 10
    ORDER BY Country DESC, CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,559521,23007,SPACEBOY BABY GIFT SET,1,08/07/2011 16:26,16.95,,Unspecified
1,561658,22846,BREAD BIN DINER STYLE RED,1,28/07/2011 16:06,16.95,12743.0,Unspecified
2,561658,84813,SET OF 4 DIAMOND NAPKIN RINGS,1,28/07/2011 16:06,12.75,12743.0,Unspecified
3,561658,21843,RED RETROSPOT CAKE STAND,1,28/07/2011 16:06,10.95,12743.0,Unspecified
4,561658,22423,REGENCY CAKESTAND 3 TIER,1,28/07/2011 16:06,12.75,12743.0,Unspecified
5,559929,22485,SET OF 2 WOODEN MARKET CRATES,2,14/07/2011 9:10,12.75,14265.0,Unspecified
6,559929,21628,TRIANGULAR POUFFE VINTAGE,2,14/07/2011 9:10,14.95,14265.0,Unspecified
7,559929,22423,REGENCY CAKESTAND 3 TIER,1,14/07/2011 9:10,12.75,14265.0,Unspecified
8,564051,22839,3 TIER CAKE TIN GREEN AND CREAM,1,22/08/2011 13:32,14.95,14265.0,Unspecified
9,564051,22838,3 TIER CAKE TIN RED AND CREAM,1,22/08/2011 13:32,14.95,14265.0,Unspecified


`SELECT *`
- Selects **all columns** from the `transactions` table.
    
`FROM transactions`
- Specifies that the data is coming from the `transactions` table.

`WHERE ABS(UnitPrice) > 10`
- Filters the rows to **only include transactions where the absolute value of** `UnitPrice` **is greater than 10.**
- Rows with a negative `UnitPrice` below -10 (if present) also match, which a plain `UnitPrice > 10` would miss.

`ORDER BY Country DESC, CustomerID`
- **Sorts the results first by** `Country` **in descending order (Z → A)**.
- If multiple rows have the same `Country`, they are **sorted by** `CustomerID` **in ascending order**.

`LIMIT 10`
- Limits the output to **only 10 rows**.
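
To see what `ABS(...)` changes in a `WHERE` clause, here is a minimal sketch on a made-up in-memory table with four hypothetical prices, two of them negative. It compares the filter with and without `ABS`.

```python
import sqlite3

# Hypothetical prices for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (UnitPrice REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?)",
    [(16.95,), (-25.0,), (2.5,), (-3.0,)],
)

with_abs = conn.execute(
    "SELECT UnitPrice FROM transactions WHERE ABS(UnitPrice) > 10 ORDER BY UnitPrice"
).fetchall()

without_abs = conn.execute(
    "SELECT UnitPrice FROM transactions WHERE UnitPrice > 10 ORDER BY UnitPrice"
).fetchall()

print(with_abs)     # includes the large negative price
print(without_abs)  # only the large positive price
```

Note that `ABS` is applied only in the filter here, so the selected `UnitPrice` values keep their original sign.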

In [49]:
pd.read_sql(
    """
    SELECT DISTINCT CustomerID
    FROM transactions
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,CustomerID
0,17850.0
1,13047.0
2,12583.0
3,13748.0
4,15100.0
5,15291.0
6,14688.0
7,17809.0
8,15311.0
9,14527.0


`SELECT DISTINCT CustomerID`
- Selects only **unique (distinct)** `CustomerID` **values** from the table.
- If there are **duplicate customer IDs**, only **one instance** is kept.

`FROM transactions`
- Specifies that the data is coming from the `transactions` table.

`LIMIT 10`
- Restricts the output to **10 unique** `CustomerID` **values**. Without an `ORDER BY`, which 10 values are returned is not guaranteed.
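
A short sketch of `DISTINCT` on a made-up in-memory table (five hypothetical rows). It also shows that repeated NULLs collapse into a single NULL entry. An `ORDER BY` is added here so the result order is predictable.

```python
import sqlite3

# Hypothetical rows: two repeated IDs and two NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (CustomerID REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?)",
    [(17850.0,), (17850.0,), (None,), (None,), (13047.0,)],
)

unique_ids = conn.execute(
    "SELECT DISTINCT CustomerID FROM transactions ORDER BY CustomerID"
).fetchall()

print(unique_ids)
```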

In [50]:
pd.read_sql(
    """
    SELECT *
    FROM transactions
    GROUP BY CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536414,22139,,56,01/12/2010 11:52,0.0,,United Kingdom
1,541431,23166,MEDIUM CERAMIC TOP STORAGE JAR,74215,18/01/2011 10:01,1.04,12346.0,United Kingdom
2,537626,85116,BLACK CANDELABRA T-LIGHT HOLDER,12,07/12/2010 14:57,2.1,12347.0,Iceland
3,539318,84992,72 SWEETHEART FAIRY CAKE CASES,72,16/12/2010 19:09,0.55,12348.0,Finland
4,577609,23112,PARISIENNE CURIO CABINET,2,21/11/2011 9:51,7.5,12349.0,Italy
5,543037,21908,CHOCOLATE THIS WAY METAL SIGN,12,02/02/2011 16:01,2.1,12350.0,Norway
6,544156,21380,WOODEN HAPPY BIRTHDAY GARLAND,6,16/02/2011 12:33,2.95,12352.0,Norway
7,553900,37449,CERAMIC CAKE STAND + HANGING CAKES,2,19/05/2011 17:47,9.95,12353.0,Bahrain
8,550911,23201,JUMBO BAG ALPHABET,10,21/04/2011 13:11,2.08,12354.0,Spain
9,552449,22693,GROW A FLYTRAP OR SUNFLOWER IN TIN,24,09/05/2011 13:49,1.25,12355.0,Bahrain


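- **Executes an SQL query** on the SQLite database located at `'sqlite:///source/sql-exercises/exercises.db'`.

- The query:
    - **Selects all columns** (`SELECT *`) from the `transactions` table.
    - **Groups the results by** `CustomerID` (`GROUP BY CustomerID`), so each distinct `CustomerID` (including NULL) collapses to **one row**.
    - Because no aggregate function is used, the other columns are "bare" columns: SQLite fills them with values from an arbitrary row of each group. Many other databases reject such a query.
    - **Limits the output to 10 rows** (`LIMIT 10`).
- The resulting data is returned as a pandas DataFrame and displayed.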
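
The effect of `GROUP BY` without an aggregate can be sketched on a made-up in-memory table (four hypothetical rows, two columns). The first query mirrors the `SELECT *` pattern. The second is one possible alternative that states explicitly which value to keep per group, here the smallest `InvoiceNo`; that choice is an assumption for illustration.

```python
import sqlite3

# Hypothetical rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (InvoiceNo INTEGER, CustomerID REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(536414, None), (541431, 12346.0), (541432, 12346.0), (537626, 12347.0)],
)

# One row per group; InvoiceNo is taken from some row of each group.
bare = conn.execute(
    "SELECT * FROM transactions GROUP BY CustomerID"
).fetchall()

# Explicit aggregate: the value kept per group is well defined.
explicit = conn.execute(
    """
    SELECT MIN(InvoiceNo), CustomerID
    FROM transactions
    GROUP BY CustomerID
    ORDER BY CustomerID
    """
).fetchall()

print(len(bare))  # number of groups, NULL included
print(explicit)
```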

In [51]:
pd.read_sql(
    """
    SELECT CustomerID, COUNT(*)
    FROM transactions
    GROUP BY CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,CustomerID,COUNT(*)
0,,135080
1,12346.0,2
2,12347.0,182
3,12348.0,31
4,12349.0,73
5,12350.0,17
6,12352.0,95
7,12353.0,4
8,12354.0,58
9,12355.0,13


`SELECT CustomerID, COUNT(*)`
- Retrieves the `CustomerID` and the **count of transactions** for each customer.
- `COUNT(*)` counts **all rows (transactions)** for each `CustomerID`.
         
`FROM transactions`
- Specifies the **data source**, which is the `transactions` table.
         
`GROUP BY CustomerID`
- Groups all transactions **by** `CustomerID`.
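- Each unique `CustomerID` will have **one row** in the result. NULL forms its own group, so the first output row (`135080`) counts the transactions without a customer ID.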
- `COUNT(*)` calculates the **number of transactions per** `CustomerID`.
         
`LIMIT 10`
- Limits the output to **only the first 10 grouped results**.

In [52]:
pd.read_sql(
    """
    SELECT CustomerID, COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,CustomerID,COUNT(CustomerID)
0,,0
1,12346.0,2
2,12347.0,182
3,12348.0,31
4,12349.0,73
5,12350.0,17
6,12352.0,95
7,12353.0,4
8,12354.0,58
9,12355.0,13


- **Executes an SQL query** on the SQLite database located at `'sqlite:///source/sql-exercises/exercises.db'`.
- The SQL query:
    - Selects the `CustomerID` and counts the non-NULL `CustomerID` values in each group.
    - **Groups the results by** `CustomerID` (`GROUP BY CustomerID`), meaning each row represents a *unique customer*.
    - **Counts the number of transactions per** `CustomerID` (`COUNT(CustomerID)`). Unlike `COUNT(*)`, `COUNT(CustomerID)` skips NULL values, so the group without a customer ID shows `0` instead of `135080`.
    - **Limits the output to 10 rows** (`LIMIT 10`).
- The retrieved data is returned as a **pandas DataFrame** and displayed.
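
The difference between `COUNT(*)` and `COUNT(CustomerID)` can be reproduced on a made-up in-memory table (six hypothetical rows, three of them with a NULL `CustomerID`). This is a sketch, not the real data.

```python
import sqlite3

# Hypothetical rows: three without a customer ID, three with one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (CustomerID REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?)",
    [(None,), (None,), (None,), (12346.0,), (12346.0,), (12347.0,)],
)

counts = conn.execute(
    """
    SELECT CustomerID, COUNT(*), COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    ORDER BY CustomerID
    """
).fetchall()

for row in counts:
    print(row)
```

For the NULL group, `COUNT(*)` reports the number of rows while `COUNT(CustomerID)` reports zero. For every other group the two counts agree.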

In [53]:
pd.read_sql(
    """
    SELECT CustomerID, COUNT(CustomerID), *
    FROM transactions
    GROUP BY CustomerID
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,CustomerID,COUNT(CustomerID),InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID.1,Country
0,,0,536414,22139,,56,01/12/2010 11:52,0.0,,United Kingdom
1,12346.0,2,541431,23166,MEDIUM CERAMIC TOP STORAGE JAR,74215,18/01/2011 10:01,1.04,12346.0,United Kingdom
2,12347.0,182,537626,85116,BLACK CANDELABRA T-LIGHT HOLDER,12,07/12/2010 14:57,2.1,12347.0,Iceland
3,12348.0,31,539318,84992,72 SWEETHEART FAIRY CAKE CASES,72,16/12/2010 19:09,0.55,12348.0,Finland
4,12349.0,73,577609,23112,PARISIENNE CURIO CABINET,2,21/11/2011 9:51,7.5,12349.0,Italy
5,12350.0,17,543037,21908,CHOCOLATE THIS WAY METAL SIGN,12,02/02/2011 16:01,2.1,12350.0,Norway
6,12352.0,95,544156,21380,WOODEN HAPPY BIRTHDAY GARLAND,6,16/02/2011 12:33,2.95,12352.0,Norway
7,12353.0,4,553900,37449,CERAMIC CAKE STAND + HANGING CAKES,2,19/05/2011 17:47,9.95,12353.0,Bahrain
8,12354.0,58,550911,23201,JUMBO BAG ALPHABET,10,21/04/2011 13:11,2.08,12354.0,Spain
9,12355.0,13,552449,22693,GROW A FLYTRAP OR SUNFLOWER IN TIN,24,09/05/2011 13:49,1.25,12355.0,Bahrain


- **Runs an SQL query** on the SQLite database located at: `'sqlite:///source/sql-exercises/exercises.db'`  

`SELECT CustomerID, COUNT(CustomerID), *`:
- Selects the `CustomerID`.
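- Counts the non-NULL occurrences of `CustomerID` (`COUNT(CustomerID)`) to determine the number of transactions per customer.
- Uses `*` to append **all columns** of the `transactions` table. These are bare (non-aggregated) columns, so SQLite fills them from an arbitrary row of each group. `CustomerID` therefore appears twice in the output (the second copy is shown as `CustomerID.1`).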

`GROUP BY CustomerID`:
- Groups the results **by** `CustomerID`.

`LIMIT 10`:
- Limits the output to **10 rows**.

In [58]:
pd.read_sql(
    """
    SELECT CustomerID, COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    HAVING COUNT(*) >= 5
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,CustomerID,COUNT(CustomerID)
0,,0
1,12347.0,182
2,12348.0,31
3,12349.0,73
4,12350.0,17
5,12352.0,95
6,12354.0,58
7,12355.0,13
8,12356.0,59
9,12357.0,131


1️⃣ `SELECT CustomerID, COUNT(CustomerID)`
- Retrieves the `CustomerID` column.
- Uses `COUNT(CustomerID)` to count the rows with a non-NULL `CustomerID` in each group.

2️⃣ `FROM transactions`
- Specifies that data should be fetched from the `transactions` table.

3️⃣ `GROUP BY CustomerID`
- Groups the results by `CustomerID`, meaning each row represents **one unique customer** with their total transaction count.

4️⃣ `HAVING COUNT(*) >= 5`
- **Filters out customers who have fewer than 5 transactions.**
- `HAVING` is applied **after** `GROUP BY`, so its condition is evaluated on the grouped results.
- `COUNT(*)` counts all rows (transactions) in a group, including rows with a NULL `CustomerID`. That is why the NULL group stays in the result even though its `COUNT(CustomerID)` is `0`.
- **Only groups with 5 or more transactions are included** in the result.

5️⃣ `LIMIT 10`
- Restricts the result to **only 10 rows**.
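
A minimal sketch of the `HAVING` step on a made-up in-memory table (nine hypothetical rows), with the threshold lowered to 3 to keep the data small. It compares `HAVING COUNT(*)` with `HAVING COUNT(CustomerID)` to show why a NULL group can survive the filter.

```python
import sqlite3

# Hypothetical rows: 3 without a customer ID, 2 for 12346, 4 for 12347.
rows = [(None,)] * 3 + [(12346.0,)] * 2 + [(12347.0,)] * 4

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (CustomerID REAL)")
conn.executemany("INSERT INTO transactions VALUES (?)", rows)

# Filter on the number of rows per group.
having_all_rows = conn.execute(
    """
    SELECT CustomerID, COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    HAVING COUNT(*) >= 3
    ORDER BY CustomerID
    """
).fetchall()

# Filter on the number of non-NULL customer IDs per group.
having_non_null = conn.execute(
    """
    SELECT CustomerID, COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    HAVING COUNT(CustomerID) >= 3
    ORDER BY CustomerID
    """
).fetchall()

print(having_all_rows)  # NULL group kept, displayed with a count of 0
print(having_non_null)  # NULL group dropped
```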

In [59]:
pd.read_sql(
    """
    SELECT Country, COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    ORDER BY COUNT(CustomerID) DESC
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,COUNT(CustomerID)
0,United Kingdom,7983
1,EIRE,5903
2,United Kingdom,5128
3,United Kingdom,4642
4,United Kingdom,2782
5,United Kingdom,2491
6,Netherlands,2085
7,United Kingdom,1857
8,United Kingdom,1677
9,United Kingdom,1640


1️⃣ `SELECT Country, COUNT(CustomerID)`
- **Retrieves** two columns:
    - `Country`: a bare (non-aggregated) column, so SQLite takes its value from an arbitrary row of each group.
    - `COUNT(CustomerID)`: the number of rows with a non-NULL `CustomerID` in each group.

2️⃣ `FROM transactions`
- Specifies that data is being retrieved from the `transactions` table.

3️⃣ `GROUP BY CustomerID`
- Groups the results by **customer**, not by country, so each row in the output represents **one customer**. This is why `United Kingdom` appears several times in the output.

4️⃣ `ORDER BY COUNT(CustomerID) DESC`
- **Sorts** the customers in **descending order** based on their number of transactions (`COUNT(CustomerID)`).
- The **customer with the most transactions appears first**.

5️⃣ `LIMIT 10`
- Displays only **the top 10 customers** by transaction count, each labeled with a country.
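
The choice of `GROUP BY` column decides what one output row stands for. The sketch below uses a made-up in-memory table (nine hypothetical rows, three customers, two countries) and runs the same `SELECT` with both groupings.

```python
import sqlite3

# Hypothetical rows: each customer belongs to exactly one country.
rows = (
    [(1.0, "United Kingdom")] * 3
    + [(2.0, "United Kingdom")] * 2
    + [(3.0, "EIRE")] * 4
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (CustomerID REAL, Country TEXT)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)

# One row per customer: a country can appear several times.
per_customer = conn.execute(
    """
    SELECT Country, COUNT(CustomerID)
    FROM transactions
    GROUP BY CustomerID
    ORDER BY COUNT(CustomerID) DESC
    """
).fetchall()

# One row per country.
per_country = conn.execute(
    """
    SELECT Country, COUNT(CustomerID)
    FROM transactions
    GROUP BY Country
    ORDER BY COUNT(CustomerID) DESC
    """
).fetchall()

print(per_customer)
print(per_country)
```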

In [60]:
pd.read_sql(
    """
    SELECT Country, COUNT(*) AS `# of customers in country`
    FROM transactions
    GROUP BY Country
    ORDER BY `# of customers in country` DESC
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,# of customers in country
0,United Kingdom,495478
1,Germany,9495
2,France,8557
3,EIRE,8196
4,Spain,2533
5,Netherlands,2371
6,Belgium,2069
7,Switzerland,2002
8,Portugal,1519
9,Australia,1259


1️⃣ `` SELECT Country, COUNT(*) AS `# of customers in country` ``
- **Retrieves two columns:**
    - `Country`: The country associated with the transactions.
    - `COUNT(*)`: Counts the total number of transactions (rows) for each country.
    - The result is **renamed** as `# of customers in country`. The label is misleading: `COUNT(*)` counts transaction rows, not customers.

2️⃣ `FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

3️⃣ `GROUP BY Country`
- Groups the results by **Country**, meaning each row in the output represents **one country** and the total number of transactions for that country.

4️⃣ `` ORDER BY `# of customers in country` DESC ``
- **Sorts** the results in **descending order** based on the total number of transactions (`COUNT(*)`), referring to the column by its alias.
- The **country with the most transactions appears first**.

5️⃣ `LIMIT 10`
- Displays only **the top 10 countries** with the highest number of transactions.

In [64]:
pd.read_sql(
    """
    SELECT Country, COUNT(CustomerID) AS `# of transactions with a non-null customer id in country`
    FROM transactions
    GROUP BY Country
    ORDER BY 2 DESC
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,# of transactions with a non-null customer id in country
0,United Kingdom,361878
1,Germany,9495
2,France,8491
3,EIRE,7485
4,Spain,2533
5,Netherlands,2371
6,Belgium,2069
7,Switzerland,1877
8,Portugal,1480
9,Australia,1259


1️⃣ `SELECT Country, COUNT(CustomerID) AS ...`
- **Retrieves two columns:**
    - `Country`: The country associated with each transaction.
    - `COUNT(CustomerID)`: Counts **only the transactions where** `CustomerID` **is not NULL.**
    - The column is **renamed** as `# of transactions with a non-null customer id in country` for clarity.

2️⃣ `FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

3️⃣ `GROUP BY Country`
- Groups the results **by country**, meaning each row represents a **single country** with its corresponding number of non-null customer transactions.

4️⃣ `ORDER BY 2 DESC`
- **Sorts the results in descending order** based on the second column of the `SELECT` list (`COUNT(CustomerID)`).
- This ensures the **countries with the highest number of valid customer transactions appear first.**

5️⃣ `LIMIT 10`
- Displays **only the top 10 countries** with the highest number of transactions that have a non-NULL `CustomerID`.
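
`ORDER BY 2` refers to the second column of the `SELECT` list by position. The sketch below, on a made-up in-memory table (eight hypothetical rows) and with a short hypothetical alias `n`, checks that sorting by position and sorting by alias give the same result.

```python
import sqlite3

# Hypothetical rows; two of them have no customer ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Country TEXT, CustomerID REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("United Kingdom", 1.0),
        ("United Kingdom", 1.0),
        ("United Kingdom", 2.0),
        ("United Kingdom", None),
        ("Germany", 3.0),
        ("Germany", 3.0),
        ("France", 4.0),
        ("France", None),
    ],
)

query = """
    SELECT Country, COUNT(CustomerID) AS n
    FROM transactions
    GROUP BY Country
    ORDER BY {key} DESC
"""

by_position = conn.execute(query.format(key="2")).fetchall()
by_alias = conn.execute(query.format(key="n")).fetchall()

print(by_position)
print(by_position == by_alias)
```

Positional ordering is compact, but it silently changes meaning if the `SELECT` list is reordered, so an alias is often easier to maintain.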

In [63]:
pd.read_sql(
    """
    SELECT Country, COUNT(DISTINCT CustomerID) AS `# of unique customers in country`
    FROM transactions
    GROUP BY Country
    ORDER BY `# of unique customers in country` DESC
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,# of unique customers in country
0,United Kingdom,3950
1,Germany,95
2,France,87
3,Spain,31
4,Belgium,25
5,Switzerland,21
6,Portugal,19
7,Italy,15
8,Finland,12
9,Austria,11


1️⃣ `SELECT Country, COUNT(DISTINCT CustomerID) AS ...`
- **Retrieves two columns:**
    - `Country`: The country associated with each transaction.
    - `COUNT(DISTINCT CustomerID)`: **Counts unique customers** in each country (duplicates and NULL values are not counted).
    - The result is **renamed** as `# of unique customers in country` for clarity.

2️⃣ `FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

3️⃣ `GROUP BY Country`
- Groups the results **by country**, meaning each row represents a **single country** with its corresponding **unique customer count**.

4️⃣ `` ORDER BY `# of unique customers in country` DESC ``
- **Sorts the results in descending order** based on the number of **unique customers.**
- The **country with the most unique customers appears first.**

5️⃣ `LIMIT 10`
- Displays **only the top 10 countries** with the highest number of unique customers.
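
The three counting variants used in the last few cells can be compared side by side. This sketch uses a made-up in-memory table (six hypothetical rows) and is only meant to show how the numbers differ.

```python
import sqlite3

# Hypothetical rows: repeated customers and one missing customer ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (Country TEXT, CustomerID REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("United Kingdom", 1.0),
        ("United Kingdom", 1.0),
        ("United Kingdom", 2.0),
        ("United Kingdom", None),
        ("Germany", 3.0),
        ("Germany", 3.0),
    ],
)

counts = conn.execute(
    """
    SELECT Country,
           COUNT(*),                   -- all rows
           COUNT(CustomerID),          -- rows with a non-NULL customer ID
           COUNT(DISTINCT CustomerID)  -- unique non-NULL customer IDs
    FROM transactions
    GROUP BY Country
    ORDER BY Country
    """
).fetchall()

for row in counts:
    print(row)
```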

In [65]:
pd.read_sql(
    """
    SELECT Country, COUNT(DISTINCT CustomerID) AS `# of unique customers in country`
    FROM transactions
    GROUP BY Country
    ORDER BY Country
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,# of unique customers in country
0,Australia,9
1,Austria,11
2,Bahrain,2
3,Belgium,25
4,Brazil,1
5,Canada,4
6,Channel Islands,9
7,Cyprus,8
8,Czech Republic,1
9,Denmark,9


1️⃣ `SELECT Country, COUNT(DISTINCT CustomerID) AS ...`
- **Retrieves two columns:**
    - `Country`: The country associated with each transaction.
    - `COUNT(DISTINCT CustomerID)`: **Counts unique customers** in each country.
    - The result is **renamed** as `# of unique customers in country` for clarity.

2️⃣ `FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

3️⃣ `GROUP BY Country`
- Groups the results **by country**, meaning each row represents a **single country** with its corresponding **unique customer count**.

4️⃣ `ORDER BY Country`
- **Sorts the results alphabetically by country name, not by the number of unique customers.**
- To rank countries by their number of unique customers instead, sort by the count column in descending order, as in the previous query.

5️⃣ `LIMIT 10`
- Displays **only the first 10 countries** in alphabetical order.

In [66]:
pd.read_sql(
    """
    SELECT Country, COUNT(DISTINCT CustomerID) AS `# of unique customers in country`
    FROM transactions
    WHERE Quantity >= 10
    GROUP BY Country
    ORDER BY Country
    LIMIT 10
    """,
    'sqlite:///source/sql-exercises/exercises.db'
)

Unnamed: 0,Country,# of unique customers in country
0,Australia,9
1,Austria,11
2,Bahrain,1
3,Belgium,24
4,Brazil,1
5,Canada,4
6,Channel Islands,9
7,Cyprus,7
8,Czech Republic,1
9,Denmark,9


1️⃣ `SELECT Country, COUNT(DISTINCT CustomerID) AS ...`
- **Retrieves two columns:**
    - `Country`: The country associated with each transaction.
    - `COUNT(DISTINCT CustomerID)`: **Counts the number of unique customers per country**, ensuring that each customer is counted only once.
    - The column is **renamed** as `# of unique customers in country` for better readability.

2️⃣ `FROM transactions`
- Specifies that the data is being retrieved from the `transactions` table.

3️⃣ `WHERE Quantity >= 10`
- **Filters the rows** to **only include transactions where the** `Quantity` **is 10 or more.**
- `WHERE` runs before `GROUP BY`, so a customer is counted only if at least one of their transactions has a `Quantity` of 10 or more. Compared with the previous query, Bahrain drops from 2 to 1, Belgium from 25 to 24, and Cyprus from 8 to 7.

4️⃣ `GROUP BY Country`
- Groups the results **by country**, meaning each row represents a **single country** and its corresponding **count of unique customers.**

5️⃣ `ORDER BY Country`
- **Sorts the results alphabetically by country name, not by the number of unique customers.**

6️⃣ `LIMIT 10`
- Displays **only the first 10 countries in alphabetical order.**
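
Because `WHERE` removes rows before the groups are formed, it can lower a distinct count. The sketch below uses a made-up in-memory table (four hypothetical rows) and runs the query with and without the `Quantity` filter.

```python
import sqlite3

# Hypothetical rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (Country TEXT, CustomerID REAL, Quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("Bahrain", 1.0, 12),
        ("Bahrain", 2.0, 3),   # customer 2 never buys 10 or more
        ("Belgium", 3.0, 20),
        ("Belgium", 3.0, 1),   # customer 3 still has one large purchase
    ],
)

unfiltered = conn.execute(
    """
    SELECT Country, COUNT(DISTINCT CustomerID)
    FROM transactions
    GROUP BY Country
    ORDER BY Country
    """
).fetchall()

filtered = conn.execute(
    """
    SELECT Country, COUNT(DISTINCT CustomerID)
    FROM transactions
    WHERE Quantity >= 10
    GROUP BY Country
    ORDER BY Country
    """
).fetchall()

print(unfiltered)
print(filtered)
```

A customer stays in the count as long as one of their rows passes the filter, so only customers with no qualifying row disappear.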