# SQLite Basics
This notebook includes basic instructions about how to access a database, and read/write data from/to it.<br>
Here, we are using **SQLite** which is light-weight database management system. To communicate with a database in SQLite we use Structured Query Language a.k.a **SQL**. We use SQL to send a querry to the database manager. Then the database manager processes the query and sends the result back. The query is the instructions that tells database manager what to do.
In python we can use the **sqlite3** package for this putpose.


In [1]:
import sqlite3

## Connecting to a database
Let's start by connecting to a database. We can do that using `sqlite.connect` which returns a connection object.

In [72]:
# create a database connection to the SQLite database specified by the db_file
db_file = 'Sales.db'
conn = sqlite3.connect(db_file)


If you didn't get any error, it means you have successfully connected to the database. <br>
**Note:** If the file doesn't exist sqlite will create an empty database with the given name.<br><br>
Now we can start executing a query. Let's start by the simplest query:<br>
<code>
    SELECT * FROM {name of table}
</code>
<br>
This query returns all the rows and columns of a specific table in the database. To use this query first we need to know the names of the tables in the database and in general the structure of the database. 
<img src="DB.PNG" alt="Image not found" width="50%">
Let's get a list of all the customers

In [5]:
query = '''
SELECT * FROM Customers;

'''


**Note:** It is customary to use SQL key word all in caps but that is not necessary. "SELECT" and "select" do the same job.<br>
<br>Now that we have the query, let's send it to the database manager. To do that, we need to create a cursor object and ask the cursor to execute the query and return all the information.

In [6]:
cur = conn.cursor()
cur.execute(query)
output = cur.fetchall()
print(output)

[(1, 'Mr.', 'Jorge', 'Dunn', None, '0429937146', '48585127', 'sbrooks@hotmail.com', '99098 Rachael Cliff', 'New Jasonborough', 'Poland', '341571816096695', '12/20', '2019-06-17'), (2, 'Mrs.', 'Debra', 'Wall', '56978970', '0470205559', None, 'colleenalexander@taylor-brown.net', '9716 Andrea Place Apt. 808', 'Harrisberg', 'Portugal', '4846279439697392', '02/24', '2015-01-02'), (3, 'Dr.', 'Cynthia', 'Ellis', None, '0469055029', '82096582', 'vstewart@gmail.com', '21014 Jared Mount', 'Lopezmouth', 'Germany', '213179187652107', '01/30', '2014-07-21'), (4, 'Mr.', 'John', 'Harris', None, '0475137980', None, 'yarmstrong@hotmail.com', '54725 Todd Corners', 'New Timothy', 'Bahrain', '3567765520251007', '02/22', '2011-07-26'), (5, 'Mr.', 'Kyle', 'Jones', '21424091', '0404892875', '24628569', 'cindy97@yahoo.com', '97603 Joseph Terrace Apt. 079', 'Sherryland', 'Hong Kong', '4290875996151482', '07/29', '2013-09-26'), (6, 'Mrs.', 'Pamela', 'Dennis', '41586619', '0490981941', None, 'ljohnston@hotmail.c

The output is a list of rows in the table. Each row is returned in form of a tuple. Let's have a look at the first row:

In [75]:
output[0]

(1,
 'Mr.',
 'Jorge',
 'Dunn',
 None,
 '0429937146',
 '48585127',
 'sbrooks@hotmail.com',
 '99098 Rachael Cliff',
 'New Jasonborough',
 'Poland',
 '341571816096695',
 '12/20',
 '2019-06-17')

Let's turn it into a function to avoid writing the same lines of code over and over.

In [7]:
# This function executes a query and prints all the rows
def execute_query(conn, query):
    cur = conn.cursor()
    cur.execute(query)
    rows = cur.fetchall()
    for i,row in enumerate(rows):
        print(f'{i+1}. {row}')

## SQL clauses
### SELECT
SELECT is the most common clause in SQL. Using SELECT you can specify which columns should be returned. 


In [8]:
# Returns a list of first and last names of all customers
query = '''
SELECT FirstName, Surname FROM Customers;
'''
execute_query(conn,query)

1. ('Jorge', 'Dunn')
2. ('Debra', 'Wall')
3. ('Cynthia', 'Ellis')
4. ('John', 'Harris')
5. ('Kyle', 'Jones')
6. ('Pamela', 'Dennis')
7. ('Elizabeth', 'Carpenter')
8. ('Brett', 'Floyd')
9. ('Christopher', 'Liu')
10. ('Christine', 'Bell')
11. ('Donna', 'Lucas')
12. ('Nicole', 'Byrd')
13. ('Alex', 'Harris')
14. ('Diane', 'Mueller')
15. ('Douglas', 'Torres')
16. ('Amy', 'Torres')
17. ('Christopher', 'Anthony')
18. ('Jacob', 'Thomas')
19. ('Abigail', 'Merritt')
20. ('Kenneth', 'Martinez')
21. ('Kevin', 'Dominguez')
22. ('Christopher', 'Miller')
23. ('Drew', 'Spencer')
24. ('Shane', 'Henry')
25. ('Stephanie', 'Mccullough')
26. ('Jeffrey', 'Perez')
27. ('Timothy', 'Garrett')
28. ('Kelly', 'Smith')
29. ('Alec', 'Martinez')
30. ('Holly', 'Boyer')
31. ('Sean', 'Munoz')
32. ('Robert', 'Morris')
33. ('Michael', 'Moore')
34. ('Jennifer', 'Oliver')
35. ('Claudia', 'Clark')
36. ('Lisa', 'Collins')
37. ('John', 'Gonzalez')
38. ('Tracey', 'Williams')
39. ('Matthew', 'Bond')
40. ('Daniel', 'Jackson')
41

In [9]:
# Returns a list of first name, last name, and email for all customers
query = '''
SELECT FirstName, Surname, Email FROM Customers;
'''
execute_query(conn,query)

1. ('Jorge', 'Dunn', 'sbrooks@hotmail.com')
2. ('Debra', 'Wall', 'colleenalexander@taylor-brown.net')
3. ('Cynthia', 'Ellis', 'vstewart@gmail.com')
4. ('John', 'Harris', 'yarmstrong@hotmail.com')
5. ('Kyle', 'Jones', 'cindy97@yahoo.com')
6. ('Pamela', 'Dennis', 'ljohnston@hotmail.com')
7. ('Elizabeth', 'Carpenter', 'ericataylor@yahoo.com')
8. ('Brett', 'Floyd', 'gouldheidi@armstrong.info')
9. ('Christopher', 'Liu', 'jenniferbailey@gmail.com')
10. ('Christine', 'Bell', 'hsimmons@hotmail.com')
11. ('Donna', 'Lucas', 'nielsenjohn@johnson.com')
12. ('Nicole', 'Byrd', 'kchandler@gmail.com')
13. ('Alex', 'Harris', 'knorton@russell.com')
14. ('Diane', 'Mueller', 'nicole79@ortiz.biz')
15. ('Douglas', 'Torres', 'leahgonzales@ramos-soto.com')
16. ('Amy', 'Torres', 'vchristensen@hotmail.com')
17. ('Christopher', 'Anthony', 'eburns@hotmail.com')
18. ('Jacob', 'Thomas', 'crobinson@craig.net')
19. ('Abigail', 'Merritt', 'stephaniemclean@chavez.biz')
20. ('Kenneth', 'Martinez', 'singhjeremy@gmail.com

We need to specify which columns we want to be returned. If we need all the columns we use `*`.

One issue you might have is not knowing what the names of the columns are. So how do we get the names of the columns? There are a few ways to do that. Probably, the simplest one is using the `.description` property of the cursor. When you execute a query you can find the names of the columns in `.description` property of the cursor. To have a full list of columns, use `SELECT *`.

In [13]:
query = '''
SELECT * from Customers
'''

In [14]:
cur = conn.cursor()
cur.execute(query)
cur.description

(('CustomerID', None, None, None, None, None, None),
 ('Title', None, None, None, None, None, None),
 ('Firstname', None, None, None, None, None, None),
 ('Surname', None, None, None, None, None, None),
 ('PhoneNumber', None, None, None, None, None, None),
 ('Mobile', None, None, None, None, None, None),
 ('Fax', None, None, None, None, None, None),
 ('Email', None, None, None, None, None, None),
 ('Address', None, None, None, None, None, None),
 ('City', None, None, None, None, None, None),
 ('Country', None, None, None, None, None, None),
 ('CreditCardNumber', None, None, None, None, None, None),
 ('CreditCardExpiry', None, None, None, None, None, None),
 ('SignupDate', None, None, None, None, None, None))

**Note:** you don't need to use `.fetchall()` method. Just executing the query will give the names of the columns.

Another way is using the query below which will return names of all the tables, names of columns in each table, and the type of data in each column.

In [16]:
query= "SELECT * FROM sqlite_master WHERE type = 'table'"
cur = conn.cursor()
cur.execute(query)
output = cur.fetchall()
for row in output:
    for c in row: print(c)
    print('-'*50)

table
Customers
Customers
2
CREATE TABLE Customers(
    CustomerID INTEGER PRIMARY KEY,
    Title NVARCHARS(5),
    Firstname NVARCHAR(30),
    Surname NVARCHAR(30),
    PhoneNumber NVARCHAR(15),
    Mobile NVARCHAR(15),
    Fax NVARCHAR(15),
    Email NVARCHAR(50),
    Address NVARCHAR(50),
    City  NVARCHAR(20),
    Country  NVARCHAR(30),
    CreditCardNumber NVARCHAR(25),
    CreditCardExpiry NVARCHAR(5),
    SignupDate NVARCHAR(15)
    )
--------------------------------------------------
table
Products
Products
42
CREATE TABLE Products(
    ProductID INTEGER PRIMARY KEY,
    StockCode NVARCHARS(15),
    Category NVARCHAR(20),
    Color NVARCHAR(30),
    Weight SMALLINT,
    UnitPrice FLOAT
    )
--------------------------------------------------
table
Sales
Sales
47
CREATE TABLE Sales(
    SaleID INTEGER PRIMARY KEY,
    InvoiceNumber NVARCHARS(15),
    ProductID INTEGER,
    Quantity INTEGER,
    CustomerID INTEGER,
    PurchaseDate NVARCHAR(15),
    ShippedDate NVARCHAR(15),
   

### ORDER BY
By default the output of a query has no specific order. `ORDER BY` clause is used to sort data in the output. You can sort the data based on one or more columns. We can also specify whether we want the sorted output to be ascending or descending. This can be achieved by adding the following code to the query:<br>
`ORDER BY {name of column} {ASC or DESC}`<br>
ASC: Ascending<br>
DESC: Descending



In [17]:
# Returns a list of first name, last name, Country and email for all customers
query = '''
SELECT FirstName, Surname, Country, Email FROM Customers
ORDER BY Country ASC,Surname ASC, FirstName ASC;
'''
execute_query(conn,query)

1. ('Edward', 'Ball', 'Australia', 'mcclainstephen@yahoo.com')
2. ('Christine', 'Bell', 'Australia', 'hsimmons@hotmail.com')
3. ('Stephanie', 'Bennett', 'Australia', 'tylerross@odom.com')
4. ('James', 'Brown', 'Australia', 'allentricia@smith.com')
5. ('Rebecca', 'Cisneros', 'Australia', 'cassandragreene@campbell.com')
6. ('Mary', 'Compton', 'Australia', 'ngregory@hotmail.com')
7. ('Kevin', 'Dominguez', 'Australia', 'amber37@yahoo.com')
8. ('Samantha', 'Garcia', 'Australia', 'jamessmith@jenkins-smith.com')
9. ('Matthew', 'Greene', 'Australia', 'teresa39@gmail.com')
10. ('Wendy', 'Greene', 'Australia', 'brianblanchard@herring.com')
11. ('Michael', 'Gutierrez', 'Australia', 'pattersondaniel@weber-murphy.net')
12. ('Elizabeth', 'Hansen', 'Australia', 'lmiller@phelps.info')
13. ('William', 'Hayes', 'Australia', 'nbarker@yahoo.com')
14. ('Angela', 'Hernandez', 'Australia', 'samanthamanning@gmail.com')
15. ('Amy', 'Johnson', 'Australia', 'awilliams@hotmail.com')
16. ('Sara', 'Johnson', 'Austr

**Note:** In the query above, the output is ordered by Country and then last and first name.

### DISTINCT
This clause is used to remove duplicates in the result. It appears immediately after `SELECT` followed by a column name or a list of columns. If only one column is specified only the values on that column is used to identify duplicates. If multiple columns are specified, the combination of them will be used to identify duplicates.
<code>
SELECT DISTINCT {column}
FROM {table};
</code>

In [18]:
query='''
SELECT DISTINCT City
FROM Customers
ORDER BY City;
'''
execute_query(conn,query)

1. ('Adamberg',)
2. ('Adamport',)
3. ('Alexanderberg',)
4. ('Alexanderbury',)
5. ('Alexandraville',)
6. ('Aliciaton',)
7. ('Allenport',)
8. ('Allenview',)
9. ('Allisonfurt',)
10. ('Allisonview',)
11. ('Amandaside',)
12. ('Amandaville',)
13. ('Amymouth',)
14. ('Andersonfurt',)
15. ('Andreahaven',)
16. ('Andrewsmouth',)
17. ('Angelafurt',)
18. ('Angelamouth',)
19. ('Angelastad',)
20. ('Angelaview',)
21. ('Anitamouth',)
22. ('Anitaview',)
23. ('Anthonystad',)
24. ('Arellanotown',)
25. ('Arroyofort',)
26. ('Ashleyborough',)
27. ('Ashleymouth',)
28. ('Ashleytown',)
29. ('Atkinsontown',)
30. ('Austinhaven',)
31. ('Austinmouth',)
32. ('Ayalafort',)
33. ('Ayalatown',)
34. ('Baileybury',)
35. ('Bakerbury',)
36. ('Bakerhaven',)
37. ('Bakertown',)
38. ('Ballardville',)
39. ('Barberfort',)
40. ('Barnettberg',)
41. ('Barrystad',)
42. ('Bartlettborough',)
43. ('Bartlettmouth',)
44. ('Beasleyburgh',)
45. ('Beckville',)
46. ('Bellmouth',)
47. ('Benitezbury',)
48. ('Benjaminshire',)
49. ('Berryport',)


### WHERE
We can add conditions to the query string using `WHERE` clause.<br>
<code>
SELECT column_list
FROM table
WHERE search_condition;
</code>
<br>
SQLite uses the following steps:
1. Check the table in the FROM clause.
2. Evaluate the conditions in the WHERE clause to get the rows that met these conditions.
3. Make the final result set based on the rows in the previous step with columns in the SELECT clause.

The search condition in the WHERE has the following form:<br>
left_expression  COMPARISON_OPERATOR  right_expression<br><br>
e.g. 
<code> 
WHERE column_1>5
    
WHERE column_2 IN (1,2,3)

</code>
**List of comparison operators:**
<table>
<thead>
  <tr>
    <th>Operator</th>
    <th>Meaning</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>=</td>
    <td>Equal to</td>
  </tr>
  <tr>
    <td>&lt;&gt; or !=</td>
    <td>Not equal to</td>
  </tr>
  <tr>
    <td>&lt;</td>
    <td>Less than</td>
  </tr>
  <tr>
    <td>&gt;</td>
    <td>Greater than</td>

  </tr>
  <tr>
    <td>&lt;=</td>
    <td>Less than or equal to</td>

  </tr>
  <tr>
    <td>&gt;=</td>
    <td>Greater than or equal to</td>

  </tr>
</tbody>
</table>

**List of logical operators:**
<table>
<thead>
  <tr>
    <th>Operator</th>
    <th>Meaning</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>ALL</td>
    <td>returns 1 if all expressions are 1.</td>
  </tr>
  <tr>
    <td>AND</td>
    <td>returns 1 if both expressions are 1, and 0 if one of the expressions is 0.</td>
  </tr>
  <tr>
    <td>ANY</td>
    <td>returns 1 if any one of a set of comparisons is 1.</td>
  </tr>
  <tr>
    <td>BETWEEN</td>
    <td>returns 1 if a value is within a range.</td>
  </tr>
  <tr>
    <td>EXISTS</td>
    <td>returns 1 if a subquery contains any rows.</td>
  </tr>
  <tr>
    <td>IN</td>
    <td>returns 1 if a value is in a list of values.</td>
  </tr>
  <tr>
    <td>LIKE</td>
    <td>returns 1 if a value matches a pattern</td>
  </tr>
  <tr>
    <td>NOT</td>
    <td>reverses the value of other operators such as NOT EXISTS, NOT IN, NOT BETWEEN, etc.</td>
  </tr>
  <tr>
    <td>OR</td>
    <td>returns true if either expression is 1</td>
  </tr>
</tbody>
</table>

Let's create a list of all the Products having StockCode, Category, Weight, and UnitPrice in the result and sorted from high to low by the price.

In [19]:
query = '''
SELECT StockCode,Category,Weight, UnitPrice FROM Products
ORDER BY UnitPrice DESC;
'''
execute_query(conn,query)

1. ('YJ878305', 'Beauty', 2657, 117.55)
2. ('GU312892', 'Home', 2042, 92.53)
3. ('ST643511', 'Appliances', 3555, 91.76)
4. ('PY014422', 'Home', 2275, 87.8)
5. ('CC424913', 'Technology', 2106, 78.7)
6. ('DR752339', 'Appliances', 115, 78.53)
7. ('JX993569', 'Furniture', 3237, 75.89)
8. ('LD272389', 'Furniture', 3346, 72.06)
9. ('UC279555', 'Furniture', 352, 71.91)
10. ('SJ372509', 'Stationary', 1707, 71.86)
11. ('DF577320', 'Stationary', 1361, 71.62)
12. ('TU924951', 'Furniture', 1841, 68.11)
13. ('NE964215', 'Home', 1322, 66.94)
14. ('DD186136', 'Clothing', 1581, 66.25)
15. ('QD294233', 'Furniture', 678, 65.15)
16. ('WV650733', 'Furniture', 462, 63.68)
17. ('AQ988179', 'Sports', 3787, 63.67)
18. ('KD402977', 'Home', 1840, 61.5)
19. ('YQ263581', 'Sports', 1271, 61.4)
20. ('VP150049', 'Furniture', 6730, 59.13)
21. ('HA328125', 'Furniture', 712, 56.15)
22. ('WV168813', 'Clothing', 567, 55.86)
23. ('GK060340', 'Clothing', 637, 55.28)
24. ('VT624907', 'Clothing', 642, 53.83)
25. ('FH047820',

Now, let's do the same thing but this time only the products with price above $15.

In [20]:
query = '''
SELECT StockCode,Category,Weight, UnitPrice FROM Products
WHERE UnitPrice>15
ORDER BY UnitPrice DESC;
'''
execute_query(conn,query)

1. ('YJ878305', 'Beauty', 2657, 117.55)
2. ('GU312892', 'Home', 2042, 92.53)
3. ('ST643511', 'Appliances', 3555, 91.76)
4. ('PY014422', 'Home', 2275, 87.8)
5. ('CC424913', 'Technology', 2106, 78.7)
6. ('DR752339', 'Appliances', 115, 78.53)
7. ('JX993569', 'Furniture', 3237, 75.89)
8. ('LD272389', 'Furniture', 3346, 72.06)
9. ('UC279555', 'Furniture', 352, 71.91)
10. ('SJ372509', 'Stationary', 1707, 71.86)
11. ('DF577320', 'Stationary', 1361, 71.62)
12. ('TU924951', 'Furniture', 1841, 68.11)
13. ('NE964215', 'Home', 1322, 66.94)
14. ('DD186136', 'Clothing', 1581, 66.25)
15. ('QD294233', 'Furniture', 678, 65.15)
16. ('WV650733', 'Furniture', 462, 63.68)
17. ('AQ988179', 'Sports', 3787, 63.67)
18. ('KD402977', 'Home', 1840, 61.5)
19. ('YQ263581', 'Sports', 1271, 61.4)
20. ('VP150049', 'Furniture', 6730, 59.13)
21. ('HA328125', 'Furniture', 712, 56.15)
22. ('WV168813', 'Clothing', 567, 55.86)
23. ('GK060340', 'Clothing', 637, 55.28)
24. ('VT624907', 'Clothing', 642, 53.83)
25. ('FH047820',

The database puts `NULL` in the table where there is no data. When this data is transfered to Python `NULL` is converted to `None`.<br>
Let's get a list of customers without a Fax number.

In [21]:
query = '''
SELECT Surname,Country,Fax FROM Customers
WHERE Fax IS NULL;
'''
execute_query(conn,query)

1. ('Wall', 'Portugal', None)
2. ('Harris', 'Bahrain', None)
3. ('Dennis', 'Poland', None)
4. ('Carpenter', 'Poland', None)
5. ('Liu', 'Lithuania', None)
6. ('Bell', 'Australia', None)
7. ('Byrd', 'Italy', None)
8. ('Torres', 'Norway', None)
9. ('Torres', 'Iceland', None)
10. ('Anthony', 'Bahrain', None)
11. ('Thomas', 'Sweden', None)
12. ('Merritt', 'Cyprus', None)
13. ('Miller', 'Japan', None)
14. ('Mccullough', 'Iceland', None)
15. ('Perez', 'Iceland', None)
16. ('Garrett', 'Hong Kong', None)
17. ('Martinez', 'Netherlands', None)
18. ('Munoz', 'Israel', None)
19. ('Moore', 'Denmark', None)
20. ('Oliver', 'Brazil', None)
21. ('Clark', 'Hong Kong', None)
22. ('Collins', 'Greece', None)
23. ('Gonzalez', 'Spain', None)
24. ('Williams', 'Hong Kong', None)
25. ('Bond', 'Poland', None)
26. ('Jackson', 'Cyprus', None)
27. ('Johnson', 'Hong Kong', None)
28. ('Bauer', 'Netherlands', None)
29. ('Garcia', 'Greece', None)
30. ('Parker', 'SingaporeCzech Republic', None)
31. ('Garrett', 'Netherlan

**Note:** `WHERE` must come before `ORDER BY`.

In [22]:
# Only Customers from Australia
query = '''
SELECT CustomerId, FirstName, Surname FROM Customers
WHERE Country="Australia"
ORDER BY Surname DESC;
'''
execute_query(conn,query)

1. (905, 'Philip', 'Walsh')
2. (461, 'Mitchell', 'Wagner')
3. (262, 'Craig', 'Stevens')
4. (23, 'Drew', 'Spencer')
5. (501, 'Jonathan', 'Smith')
6. (824, 'Michael', 'Shaw')
7. (293, 'Cindy', 'Russell')
8. (470, 'Steven', 'Robinson')
9. (882, 'Michele', 'Reid')
10. (863, 'Theresa', 'Peck')
11. (471, 'Mackenzie', 'Owens')
12. (876, 'Stephanie', 'Nelson')
13. (224, 'Amber', 'Morales')
14. (441, 'Julie', 'Morales')
15. (585, 'Shannon', 'Mcconnell')
16. (327, 'Deborah', 'Martinez')
17. (372, 'Kenneth', 'Livingston')
18. (219, 'Heidi', 'Kramer')
19. (74, 'Sara', 'Johnson')
20. (171, 'Wendy', 'Johnson')
21. (602, 'Amy', 'Johnson')
22. (805, 'Angela', 'Hernandez')
23. (811, 'William', 'Hayes')
24. (672, 'Elizabeth', 'Hansen')
25. (815, 'Michael', 'Gutierrez')
26. (246, 'Matthew', 'Greene')
27. (808, 'Wendy', 'Greene')
28. (553, 'Samantha', 'Garcia')
29. (21, 'Kevin', 'Dominguez')
30. (424, 'Mary', 'Compton')
31. (756, 'Rebecca', 'Cisneros')
32. (218, 'James', 'Brown')
33. (369, 'Stephanie', 'B

### LIMIT
This clause is used to constrain the number of rows in the result.

In [23]:
# This query shows the 10 most expensive products.
query = '''
SELECT StockCode,Category,Weight, UnitPrice FROM Products
ORDER BY UnitPrice DESC
LIMIT 10;
'''
execute_query(conn,query)

1. ('YJ878305', 'Beauty', 2657, 117.55)
2. ('GU312892', 'Home', 2042, 92.53)
3. ('ST643511', 'Appliances', 3555, 91.76)
4. ('PY014422', 'Home', 2275, 87.8)
5. ('CC424913', 'Technology', 2106, 78.7)
6. ('DR752339', 'Appliances', 115, 78.53)
7. ('JX993569', 'Furniture', 3237, 75.89)
8. ('LD272389', 'Furniture', 3346, 72.06)
9. ('UC279555', 'Furniture', 352, 71.91)
10. ('SJ372509', 'Stationary', 1707, 71.86)


You can use `OFFSET` to skip a few rows.

In [26]:
# This query shows the next top 10 products with highest price.
# Since it is skipping the first 10 the result would be numbers 11-20
query = '''
SELECT StockCode,Category,Weight, UnitPrice FROM Products
ORDER BY UnitPrice DESC
LIMIT 10 OFFSET 10;
'''
execute_query(conn,query)

1. ('DF577320', 'Stationary', 1361, 71.62)
2. ('TU924951', 'Furniture', 1841, 68.11)
3. ('NE964215', 'Home', 1322, 66.94)
4. ('DD186136', 'Clothing', 1581, 66.25)
5. ('QD294233', 'Furniture', 678, 65.15)
6. ('WV650733', 'Furniture', 462, 63.68)
7. ('AQ988179', 'Sports', 3787, 63.67)
8. ('KD402977', 'Home', 1840, 61.5)
9. ('YQ263581', 'Sports', 1271, 61.4)
10. ('VP150049', 'Furniture', 6730, 59.13)


**Note:** `LIMIT` should always be used with `ORDER BY` so the rows are always in a specific order.

## Joins
As you have probably noticed the data in this database is in various tables. Each table contains a specific part of data. For instance, one table has the information about invoices and another table has information about customers. These tables are linked together so we can find which invoice belongs to which customer. To do this we use various types of `JOIN`. Each join clause determines how SQLite uses data from one table to match with rows in another table.<br>
Tables are connected using unique identifiers. For instance, __Sales__ table has a column called _CustomerId_. The same column name can be found in __Customers__ table. By looking up the customer id from invoices in the customers table we can find the information about the customer of each invoice.`JOIN` uses these identifiers to connect the tables and look up information.




### INNER JOIN


In [31]:
query = '''
SELECT PurchaseDate,FirstName, Surname, ProductID, Quantity
FROM Sales
INNER JOIN Customers 
    ON Sales.CustomerId = Customers.CustomerId
ORDER BY PurchaseDate
LIMIT 20;
'''

execute_query(conn,query)

1. ('2010-10-29', 'Cheyenne', 'Fletcher', 201, 2)
2. ('2010-11-23', 'Michael', 'Miller', 148, 2)
3. ('2010-11-23', 'Michael', 'Miller', 33, 1)
4. ('2010-11-23', 'Michael', 'Miller', 182, 1)
5. ('2011-01-02', 'Lisa', 'Terrell', 135, 1)
6. ('2011-01-05', 'Craig', 'Stevens', 127, 3)
7. ('2011-01-19', 'Craig', 'Stevens', 42, 1)
8. ('2011-01-19', 'Craig', 'Stevens', 261, 2)
9. ('2011-01-19', 'Craig', 'Stevens', 84, 1)
10. ('2011-02-15', 'Lauren', 'Dickerson', 289, 1)
11. ('2011-04-08', 'Natasha', 'Wolf', 205, 3)
12. ('2011-04-29', 'Katie', 'Thompson', 262, 3)
13. ('2011-04-29', 'Katie', 'Thompson', 77, 1)
14. ('2011-04-30', 'Michael', 'Miller', 247, 1)
15. ('2011-04-30', 'Michael', 'Miller', 48, 4)
16. ('2011-05-09', 'Jessica', 'Pierce', 39, 2)
17. ('2011-05-09', 'Jessica', 'Pierce', 211, 1)
18. ('2011-05-09', 'Isaac', 'Price', 129, 3)
19. ('2011-05-18', 'Amy', 'Griffin', 185, 2)
20. ('2011-05-18', 'Amy', 'Griffin', 160, 2)


In [32]:
# This query shows the purchase history of customers

query = '''
SELECT Surname, ProductID, Quantity, PurchaseDate
FROM Customers
INNER JOIN Sales ON
    Customers.CustomerID = Sales.CustomerID
ORDER BY Surname;
'''
execute_query(conn,query)

1. ('Acosta', 196, 1, '2017-04-08')
2. ('Acosta', 65, 1, '2019-02-17')
3. ('Acosta', 138, 1, '2019-02-17')
4. ('Adams', 43, 1, '2013-02-03')
5. ('Adams', 149, 2, '2013-02-03')
6. ('Adams', 59, 2, '2017-06-11')
7. ('Adams', 247, 1, '2017-06-11')
8. ('Adams', 69, 1, '2017-06-11')
9. ('Adams', 48, 2, '2019-11-10')
10. ('Adams', 223, 3, '2019-11-10')
11. ('Adams', 199, 1, '2013-11-10')
12. ('Adams', 88, 1, '2013-11-10')
13. ('Adams', 77, 1, '2013-11-10')
14. ('Adams', 211, 2, '2019-11-17')
15. ('Adams', 233, 1, '2018-03-18')
16. ('Adams', 200, 2, '2020-06-01')
17. ('Adams', 276, 1, '2020-06-14')
18. ('Adams', 58, 2, '2020-07-01')
19. ('Adams', 15, 1, '2018-11-01')
20. ('Adams', 13, 1, '2018-11-01')
21. ('Adams', 181, 1, '2019-04-30')
22. ('Adams', 156, 1, '2019-04-30')
23. ('Adams', 65, 3, '2020-07-29')
24. ('Aguirre', 286, 1, '2017-04-27')
25. ('Aguirre', 170, 2, '2020-06-26')
26. ('Alexander', 83, 1, '2015-01-15')
27. ('Alexander', 266, 2, '2015-11-09')
28. ('Alexander', 266, 1, '2015-11

1785. ('Lopez', 128, 2, '2020-03-23')
1786. ('Lopez', 119, 2, '2020-03-23')
1787. ('Lopez', 4, 1, '2020-03-23')
1788. ('Lopez', 285, 1, '2016-08-10')
1789. ('Lopez', 185, 2, '2016-08-10')
1790. ('Lopez', 135, 1, '2018-08-23')
1791. ('Lopez', 193, 2, '2018-08-23')
1792. ('Lopez', 129, 4, '2018-08-23')
1793. ('Lopez', 252, 3, '2017-03-26')
1794. ('Lopez', 206, 1, '2017-03-26')
1795. ('Lopez', 193, 1, '2017-03-26')
1796. ('Lopez', 3, 1, '2018-03-20')
1797. ('Lopez', 292, 2, '2019-12-01')
1798. ('Lopez', 182, 1, '2018-01-09')
1799. ('Lopez', 185, 3, '2018-01-09')
1800. ('Lopez', 70, 2, '2018-01-09')
1801. ('Lopez', 80, 2, '2014-01-27')
1802. ('Lowe', 171, 4, '2017-01-19')
1803. ('Lozano', 26, 4, '2018-02-05')
1804. ('Lozano', 97, 4, '2018-02-05')
1805. ('Lozano', 42, 2, '2018-02-05')
1806. ('Lozano', 113, 1, '2018-02-05')
1807. ('Lozano', 125, 1, '2016-01-12')
1808. ('Lozano', 80, 2, '2016-01-12')
1809. ('Lozano', 23, 4, '2016-01-12')
1810. ('Lozano', 12, 3, '2016-01-12')
1811. ('Lozano', 

### LEFT  JOIN
Also called `LEFT OUTER JOIN`

In [33]:
# This query shows the purchase history of customers

query = '''
SELECT Surname, ProductID, Quantity, PurchaseDate
FROM Customers
LEFT JOIN Sales ON
    Customers.CustomerID = Sales.CustomerID
ORDER BY Surname;
'''

execute_query(conn,query)

1. ('Acosta', 65, 1, '2019-02-17')
2. ('Acosta', 138, 1, '2019-02-17')
3. ('Acosta', 196, 1, '2017-04-08')
4. ('Adams', 43, 1, '2013-02-03')
5. ('Adams', 48, 2, '2019-11-10')
6. ('Adams', 77, 1, '2013-11-10')
7. ('Adams', 88, 1, '2013-11-10')
8. ('Adams', 149, 2, '2013-02-03')
9. ('Adams', 199, 1, '2013-11-10')
10. ('Adams', 223, 3, '2019-11-10')
11. ('Adams', 58, 2, '2020-07-01')
12. ('Adams', 65, 3, '2020-07-29')
13. ('Adams', 200, 2, '2020-06-01')
14. ('Adams', 276, 1, '2020-06-14')
15. ('Adams', 156, 1, '2019-04-30')
16. ('Adams', 181, 1, '2019-04-30')
17. ('Adams', 233, 1, '2018-03-18')
18. ('Adams', 13, 1, '2018-11-01')
19. ('Adams', 15, 1, '2018-11-01')
20. ('Adams', 59, 2, '2017-06-11')
21. ('Adams', 69, 1, '2017-06-11')
22. ('Adams', 211, 2, '2019-11-17')
23. ('Adams', 247, 1, '2017-06-11')
24. ('Adkins', None, None, None)
25. ('Aguirre', 170, 2, '2020-06-26')
26. ('Aguirre', 286, 1, '2017-04-27')
27. ('Alexander', None, None, None)
28. ('Alexander', 83, 1, '2015-01-15')
29. (

2111. ('Merritt', 18, 3, '2015-11-15')
2112. ('Merritt', 87, 1, '2020-04-11')
2113. ('Merritt', 116, 1, '2017-09-15')
2114. ('Merritt', 129, 1, '2015-11-15')
2115. ('Merritt', 157, 2, '2016-01-22')
2116. ('Merritt', 170, 3, '2015-11-15')
2117. ('Merritt', 198, 1, '2020-04-11')
2118. ('Merritt', 238, 2, '2017-09-15')
2119. ('Merritt', 263, 1, '2018-12-19')
2120. ('Merritt', 286, 1, '2016-01-22')
2121. ('Merritt', 288, 3, '2018-12-19')
2122. ('Meyer', 196, 3, '2018-12-11')
2123. ('Meyer', 29, 1, '2019-09-02')
2124. ('Meyers', 119, 3, '2017-06-01')
2125. ('Meyers', 160, 2, '2012-08-24')
2126. ('Meyers', 235, 1, '2012-08-24')
2127. ('Meza', 143, 2, '2015-11-29')
2128. ('Meza', 147, 1, '2015-11-29')
2129. ('Meza', 163, 1, '2015-11-29')
2130. ('Meza', 216, 1, '2017-10-14')
2131. ('Meza', 219, 1, '2020-04-06')
2132. ('Miles', 241, 2, '2018-12-09')
2133. ('Miles', 260, 2, '2018-09-09')
2134. ('Miles', 262, 1, '2018-12-09')
2135. ('Miller', 186, 1, '2019-11-09')
2136. ('Miller', 226, 1, '2020-0

3611. ('White', 139, 1, '2019-07-18')
3612. ('White', 281, 2, '2019-07-18')
3613. ('Whitney', 121, 6, '2016-03-08')
3614. ('Whitney', 244, 1, '2016-03-08')
3615. ('Whitney', 284, 1, '2018-06-11')
3616. ('Wilcox', 10, 1, '2019-11-22')
3617. ('Wilcox', 98, 1, '2017-09-08')
3618. ('Wilcox', 273, 2, '2019-11-22')
3619. ('Wiley', 33, 1, '2016-04-10')
3620. ('Wiley', 76, 2, '2018-02-28')
3621. ('Wiley', 190, 3, '2018-02-28')
3622. ('Wilkins', 18, 1, '2016-01-17')
3623. ('Wilkins', 264, 1, '2017-11-09')
3624. ('Wilkins', 285, 1, '2017-11-09')
3625. ('Wilkins', 15, 1, '2015-04-24')
3626. ('Wilkins', 18, 1, '2015-04-24')
3627. ('Wilkins', 92, 2, '2016-09-14')
3628. ('Wilkins', 113, 1, '2016-09-14')
3629. ('Wilkins', 290, 1, '2015-04-24')
3630. ('Williams', 45, 1, '2019-12-04')
3631. ('Williams', 51, 1, '2020-05-11')
3632. ('Williams', 97, 2, '2020-04-15')
3633. ('Williams', 202, 2, '2020-04-15')
3634. ('Williams', 225, 1, '2020-04-15')
3635. ('Williams', 253, 5, '2020-05-11')
3636. ('Williams',

**Note:** The top two cells are both showing the purchase history of customers. But one is using `INNER JOIN` and the other is using `LEFT JOIN`. So what's the difference? If you pay close attention you will see that the two tables don't have the same number of rows. The main difference between the two is that `LEFT JOIN` conserves all the entries on the left (customers) and if it can't find any purchases for that customer it will return `NULL`. However, `INNER JOIN` tries to find a match. If a customer doesn't have any purchases in the list it will not be shown in the final result.<br>
`INNER JOIN` and `LEFT JOIN` are the most common types of join. However, there are other types such as `CROSS JOIN`, `FULL OUTER JOIN`, `RIGHT JOIN`, etc. You can find more information about these types of join on https://www.sqlitetutorial.net/.


One trick that makes writing a query faster is using aliases to tables and column. The query below is the same as the one above. The only difference is use of aliases.

In [34]:

query = '''
SELECT Surname, ProductID, Quantity, PurchaseDate
FROM Customers as c
LEFT JOIN Sales as s ON
    c.CustomerID = s.CustomerID
ORDER BY Surname;
'''

execute_query(conn,query)

1. ('Acosta', 65, 1, '2019-02-17')
2. ('Acosta', 138, 1, '2019-02-17')
3. ('Acosta', 196, 1, '2017-04-08')
4. ('Adams', 43, 1, '2013-02-03')
5. ('Adams', 48, 2, '2019-11-10')
6. ('Adams', 77, 1, '2013-11-10')
7. ('Adams', 88, 1, '2013-11-10')
8. ('Adams', 149, 2, '2013-02-03')
9. ('Adams', 199, 1, '2013-11-10')
10. ('Adams', 223, 3, '2019-11-10')
11. ('Adams', 58, 2, '2020-07-01')
12. ('Adams', 65, 3, '2020-07-29')
13. ('Adams', 200, 2, '2020-06-01')
14. ('Adams', 276, 1, '2020-06-14')
15. ('Adams', 156, 1, '2019-04-30')
16. ('Adams', 181, 1, '2019-04-30')
17. ('Adams', 233, 1, '2018-03-18')
18. ('Adams', 13, 1, '2018-11-01')
19. ('Adams', 15, 1, '2018-11-01')
20. ('Adams', 59, 2, '2017-06-11')
21. ('Adams', 69, 1, '2017-06-11')
22. ('Adams', 211, 2, '2019-11-17')
23. ('Adams', 247, 1, '2017-06-11')
24. ('Adkins', None, None, None)
25. ('Aguirre', 170, 2, '2020-06-26')
26. ('Aguirre', 286, 1, '2017-04-27')
27. ('Alexander', None, None, None)
28. ('Alexander', 83, 1, '2015-01-15')
29. (

2273. ('Morgan', 88, 5, '2013-11-15')
2274. ('Morgan', 92, 1, '2013-11-15')
2275. ('Morgan', 163, 2, '2014-09-14')
2276. ('Morgan', 198, 2, '2013-11-15')
2277. ('Morgan', 211, 2, '2013-11-15')
2278. ('Morgan', 281, 2, '2019-07-09')
2279. ('Morris', 64, 1, '2019-05-12')
2280. ('Morris', 156, 1, '2019-05-12')
2281. ('Morris', 53, 2, '2020-06-10')
2282. ('Morris', 86, 1, '2017-06-12')
2283. ('Morris', 109, 3, '2020-06-10')
2284. ('Morris', 229, 3, '2020-06-10')
2285. ('Morris', 252, 7, '2019-05-08')
2286. ('Morris', 57, 1, '2020-04-26')
2287. ('Morris', 66, 2, '2020-04-26')
2288. ('Morris', 105, 1, '2019-11-11')
2289. ('Morris', 138, 1, '2020-04-26')
2290. ('Morris', 142, 3, '2019-11-11')
2291. ('Morris', None, None, None)
2292. ('Morris', 72, 4, '2019-09-28')
2293. ('Morris', 93, 1, '2019-07-12')
2294. ('Morrison', 79, 1, '2018-06-19')
2295. ('Morrison', 79, 4, '2015-02-18')
2296. ('Morrison', 103, 2, '2015-02-18')
2297. ('Morrison', 156, 1, '2018-06-19')
2298. ('Morrison', 223, 3, '2012

After `SELECT` we can also use math operations on the columns.

In [35]:
query = '''
SELECT InvoiceNumber, StockCode, Quantity, UnitPrice, Quantity*UnitPrice as Total,  CustomerID
FROM Sales as s
INNER JOIN Products as p
    ON s.ProductId = p.ProductId
ORDER BY InvoiceNumber DESC
'''
execute_query(conn,query)

1. ('202007318518', 'UT900480', 1, 23.45, 23.45, 932)
2. ('202007318518', 'OX204742', 1, 29.47, 29.47, 932)
3. ('202007313404', 'LF434461', 2, 27.01, 54.02, 916)
4. ('202007308826', 'RM419830', 2, 6.81, 13.62, 258)
5. ('202007299565', 'OX806183', 5, 21.45, 107.25, 968)
6. ('202007299565', 'JO827398', 4, 28.04, 112.16, 968)
7. ('202007299565', 'YN760953', 1, 26.58, 26.58, 968)
8. ('202007297766', 'LJ294158', 2, 41.37, 82.74, 114)
9. ('202007294555', 'CD076891', 1, 21.27, 21.27, 714)
10. ('202007294542', 'RM419830', 3, 6.81, 20.43, 404)
11. ('202007281061', 'VB192908', 1, 51.11, 51.11, 449)
12. ('202007280236', 'XJ792088', 1, 48.33, 48.33, 237)
13. ('202007273674', 'EB831613', 2, 12.31, 24.62, 676)
14. ('202007273674', 'NG216735', 1, 17.9, 17.9, 676)
15. ('202007268734', 'OO406570', 5, 28.87, 144.35, 850)
16. ('202007268734', 'KA323284', 1, 32.15, 32.15, 850)
17. ('202007262544', 'AQ988179', 3, 63.67, 191.01, 829)
18. ('202007262544', 'XX748942', 2, 16.46, 32.92, 829)
19. ('202007258439'

1419. ('201905285628', 'CP824418', 1, 7.83, 7.83, 459)
1420. ('201905285628', 'GS673129', 1, 9.1, 9.1, 459)
1421. ('201905282685', 'DK020939', 1, 8.66, 8.66, 844)
1422. ('201905282685', 'FF356985', 2, 8.56, 17.12, 844)
1423. ('201905282685', 'ZQ525720', 2, 5.99, 11.98, 844)
1424. ('201905282685', 'IK042267', 3, 28.23, 84.69, 844)
1425. ('201905282685', 'AV611930', 1, 46.36, 46.36, 844)
1426. ('201905271201', 'LZ422758', 3, 7.82, 23.46, 719)
1427. ('201905271201', 'DF577320', 6, 71.62, 429.72, 719)
1428. ('201905271201', 'AM850835', 1, 11.24, 11.24, 719)
1429. ('201905260045', 'CD076891', 1, 21.27, 21.27, 942)
1430. ('201905259466', 'WF268546', 2, 6.64, 13.28, 715)
1431. ('201905259466', 'AO469492', 5, 34.53, 172.65, 715)
1432. ('201905244603', 'FT918550', 2, 4.32, 8.64, 800)
1433. ('201905239218', 'WB045037', 3, 38.25, 114.75, 852)
1434. ('201905239218', 'DK720585', 2, 17.12, 34.24, 852)
1435. ('201905234568', 'QD294233', 2, 65.15, 130.3, 153)
1436. ('201905234568', 'CD076891', 2, 21.2

3435. ('201404114498', 'YT873595', 1, 6.56, 6.56, 384)
3436. ('201404114498', 'WI150710', 1, 6.75, 6.75, 384)
3437. ('201404062193', 'FO303231', 3, 17.98, 53.94, 227)
3438. ('201403241935', 'EW850027', 1, 22.77, 22.77, 472)
3439. ('201403241935', 'UC279555', 2, 71.91, 143.82, 472)
3440. ('201403238557', 'YU673093', 1, 36.96, 36.96, 503)
3441. ('201403165867', 'NW176696', 2, 19.43, 38.86, 157)
3442. ('201403114155', 'NE675700', 1, 13.91, 13.91, 1021)
3443. ('201403034734', 'WF268546', 1, 6.64, 6.64, 316)
3444. ('201402271443', 'XS324415', 2, 17.99, 35.98, 196)
3445. ('201402271443', 'YU673093', 1, 36.96, 36.96, 196)
3446. ('201402269701', 'AL225388', 2, 14.35, 28.7, 280)
3447. ('201402269701', 'SJ521162', 3, 15.88, 47.64, 280)
3448. ('201402245303', 'WC075791', 1, 10.13, 10.13, 979)
3449. ('201402245303', 'FH047820', 1, 53.57, 53.57, 979)
3450. ('201402224182', 'DU573649', 1, 11.1, 11.1, 547)
3451. ('201402224182', 'FK759406', 4, 11.04, 44.16, 547)
3452. ('201402195691', 'LF545577', 1, 

### GROUP BY
This clause allows us to summarise the result of a query. It returns only one row for every group of rows. The rows are summarised using an aggregate function such as MIN, MAX, COUNT, AVG, or SUM.


In [36]:
query = '''
SELECT InvoiceNumber, PurchaseDate, COUNT(StockCode) as ProductsCount,SUM(Quantity) as ItemsCount
FROM Sales as s
INNER JOIN Products as p
    ON s.ProductID = p.ProductID
GROUP BY InvoiceNumber
LIMIT 20
'''
execute_query(conn,query)

1. ('201010298367', '2010-10-29', 1, 2)
2. ('201011239479', '2010-11-23', 3, 4)
3. ('201101020547', '2011-01-02', 1, 1)
4. ('201101059019', '2011-01-05', 1, 3)
5. ('201101191070', '2011-01-19', 3, 4)
6. ('201102156752', '2011-02-15', 1, 1)
7. ('201104087983', '2011-04-08', 1, 3)
8. ('201104294178', '2011-04-29', 2, 4)
9. ('201104300913', '2011-04-30', 2, 5)
10. ('201105098599', '2011-05-09', 2, 3)
11. ('201105099480', '2011-05-09', 1, 3)
12. ('201105183462', '2011-05-18', 2, 4)
13. ('201105250465', '2011-05-25', 1, 5)
14. ('201106197449', '2011-06-19', 1, 1)
15. ('201107017720', '2011-07-01', 1, 4)
16. ('201107184758', '2011-07-18', 2, 4)
17. ('201107282433', '2011-07-28', 2, 9)
18. ('201108022852', '2011-08-02', 1, 1)
19. ('201108188005', '2011-08-18', 1, 3)
20. ('201109191411', '2011-09-19', 3, 6)


In the example above we first join `Sales` with `Products` to get a list of purchases with product details. Then we group them by invoice number, and aggregate the total invoice once using count (to find how many types of product was purchased) and once using sum (to find how many items were purchased). <br>

### HAVING
This clause adds conditions to the result of group by. It can be used as follows:
<code> 
SELECT column_1, column_2, aggregate_function (column_3)
FROM table
GROUP BY
	column_1, column_2
HAVING search_condition;
</code>

<br> __<font color="red">Note that the HAVING clause is applied after GROUP BY clause, whereas the WHERE clause is applied before the GROUP BY clause.</font>__

Let's repeat the last example, but this time return only the invoices with more than one type of product in them.

In [37]:
query = '''
SELECT FirstName, LastName, COUNT(Total) ,SUM(Total)
FROM invoices
INNER JOIN customers 
    ON invoices.CustomerId = customers.CustomerId
GROUP BY LastName, FirstName
HAVING SUM(Total)>40
'''
query = '''
SELECT InvoiceNumber, PurchaseDate, COUNT(StockCode) as ProductsCount,SUM(Quantity) as ItemsCount
FROM Sales as s
INNER JOIN Products as p
    ON s.ProductID = p.ProductID
GROUP BY InvoiceNumber
HAVING ProductsCount>1
LIMIT 20
'''

execute_query(conn,query)

1. ('201011239479', '2010-11-23', 3, 4)
2. ('201101191070', '2011-01-19', 3, 4)
3. ('201104294178', '2011-04-29', 2, 4)
4. ('201104300913', '2011-04-30', 2, 5)
5. ('201105098599', '2011-05-09', 2, 3)
6. ('201105183462', '2011-05-18', 2, 4)
7. ('201107184758', '2011-07-18', 2, 4)
8. ('201107282433', '2011-07-28', 2, 9)
9. ('201109191411', '2011-09-19', 3, 6)
10. ('201110174631', '2011-10-17', 3, 5)
11. ('201111213763', '2011-11-21', 3, 3)
12. ('201112173724', '2011-12-17', 2, 2)
13. ('201201317780', '2012-01-31', 4, 7)
14. ('201203035545', '2012-03-03', 2, 2)
15. ('201203181977', '2012-03-18', 2, 2)
16. ('201203279140', '2012-03-27', 3, 6)
17. ('201204091860', '2012-04-09', 2, 6)
18. ('201204152123', '2012-04-15', 2, 4)
19. ('201204302330', '2012-04-30', 2, 6)
20. ('201205173127', '2012-05-17', 2, 3)


### Subqueries

Subqueries are queries within queries. We can use a query to grab data and arrange it in a table and the write another query to work on the table we just created. This table is not saved in the database but is available until the end of the query.

Let's use what we have learned so far and create a list of top 10 invoices with the higherst values.

In [97]:
query = '''
SELECT InvoiceNumber, Quantity*UnitPrice as Total,  CustomerID
FROM Sales as s
INNER JOIN Products as p
    ON s.ProductId = p.ProductId
GROUP BY InvoiceNumber
ORDER BY Total DESC
LIMIT 10

'''
execute_query(conn,query)

1. ('201903032453', 555.1800000000001, 255)
2. ('201806066277', 482.13, 320)
3. ('201904173254', 473.04, 680)
4. ('201404161325', 439.0, 922)
5. ('201703177159', 439.0, 1006)
6. ('201603087949', 431.15999999999997, 276)
7. ('201912067383', 397.5, 587)
8. ('201702189438', 382.16, 825)
9. ('201605076164', 370.12, 861)
10. ('201905306914', 370.12, 540)


Now If we wanted to have Customers details in the result as well, we could use this as a subquery.

In [38]:
query = '''
SELECT InvoiceNumber, FirstName, Surname, Mobile, Total
FROM
    (SELECT InvoiceNumber, Quantity*UnitPrice as Total,  CustomerID
    FROM Sales as s
    INNER JOIN Products as p
        ON s.ProductId = p.ProductId
    GROUP BY InvoiceNumber
    ORDER BY Total DESC
    LIMIT 10) as invoices
INNER JOIN Customers as c
    ON invoices.CustomerID = c.CustomerID
'''
execute_query(conn,query)

1. ('201804304243', 'Erik', 'Kim', '0490187675', 647.71)
2. ('201806066277', 'Charles', 'Bishop', '0426258619', 482.13)
3. ('202002232320', 'Mark', 'Watson', '0447024997', 462.65)
4. ('201703177159', 'Brittany', 'Powers', '0462782029', 439.0)
5. ('201912067383', 'Joseph', 'Johnson', '0409593000', 397.5)
6. ('201706034938', 'Amy', 'Griffin', '0474235932', 379.45)
7. ('201905306914', 'Charles', 'Martin', '0471680001', 370.12)
8. ('201605076164', 'Devin', 'Barnes', '0461952271', 370.12)
9. ('202005261105', 'Megan', 'Smith', '0475260400', 359.3)
10. ('202007142772', 'Cindy', 'Russell', '0444757377', 352.65)


## CRUD
CRUD is an acronym for four main database operations: Create, Read, Update, and Delete.<br>
We have already used `SELECT` to read data from database. Now we can learn about other operations. Let's start by adding data to a database. But before that we need to create a new database and a new table inside it.


In [41]:
# create a new database
db_file = 'sampledb.db'
conn = sqlite3.connect(db_file)

__Note:__ When no file is found by the name we have entered, it will create a new database by that name.

We can create a table using `CREATE TABLE` followed by the name of the table. Then, we define name of each column, the type of data it contains, and its default value. <br>
We can use `NOT NULL` to specify that the data for a certain column cannot be left empty (NULL).


In [44]:
query = '''
CREATE TABLE employees (
    FirstName nvarchar(25) NOT NULL,
    Surname nvarchar(25) NOT NULL,
    PhoneNumber nvarchar(25) NOT NULL,
    Email nvarchar(40) DEFAULT ""
    )

'''
execute_query(conn,query)

__Note:__ FirstName, Surname, and PhoneNumber must be entered. But if the email is not entered it will be set to default value which is an empty string.

### Data types in SQL
We need to specify the type of data each column contains. The types of data supported by SQL are as follows:
- __Exact numeric:__ BOOLEAN, TINYINT, SMALLINT, INT, BIGINT
- __Approximate Numeric:__ FLOAT, DOUBLE
- __String:__ CHARACTER, VARCHAR, NCHAR, NVARCHAR 
- __Date/Time:__ DATE, DATETIME, TIME, YEAR

For details about each type visit [sqlite.org](https://www.sqlite.org/datatype3.html)

Now let's see what the table looks like.

In [57]:
query = '''
SELECT * FROM employees
'''
execute_query(conn,query)

1. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
2. ('Luke', 'Skywalker', '456123789', '')
3. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
4. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')
5. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')


Nothing is printed. Because the table is empty. To ensure the table has been created, let's use `.description` to get a list of columns in the table.

In [58]:
cur = conn.cursor()
cur.execute(query)
cur.description

(('FirstName', None, None, None, None, None, None),
 ('Surname', None, None, None, None, None, None),
 ('PhoneNumber', None, None, None, None, None, None),
 ('Email', None, None, None, None, None, None))

If you see the name of columns, then the table has been successfully created.

To add new rows to the table, we use `INSERT INTO` followed by the name of the table and the values we want to add. The values need to be in the same order as the columns.

In [59]:
query = '''
INSERT INTO employees
VALUES ("Darth","Vader","456123789","d.vader@deathstar.com")

'''
execute_query(conn,query)

In [60]:
execute_query(conn,'SELECT * FROM employees')

1. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
2. ('Luke', 'Skywalker', '456123789', '')
3. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
4. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')
5. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
6. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')


Now, let's add another row but this time we won't enter a value for email. Since we defined a default value for email, we expect to see it appear for the new row.

In [62]:
query = '''
INSERT INTO employees
VALUES ("Luke","Skywalker","456123789")

'''
execute_query(conn,query)

OperationalError: table employees has 4 columns but 3 values were supplied

There is an error: _`table employees has 4 columns but 3 values were supplied`_ <br>
The reason is the database manager doesn't know which value you have not entered.
The correct way to add data is not only specify the values, but also the name of the columns. This way we are clear in our instruction that which value belongs to which column.

In [63]:
query = '''
INSERT INTO employees(FirstName,Surname,PhoneNumber)
VALUES ("Luke","Skywalker","456123789")

'''
execute_query(conn,query)

In [64]:

execute_query(conn,"SELECT * FROM employees")

1. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
2. ('Luke', 'Skywalker', '456123789', '')
3. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
4. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')
5. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
6. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
7. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
8. ('Luke', 'Skywalker', '456123789', '')


We can add multiple rows of data simultaneously. Each row must be separated by __`,`__

In [65]:
query = '''
INSERT INTO employees(FirstName,Surname,PhoneNumber,Email)
VALUES ("Leia","Organa","404200501","princess@aldraan.gov"),
     ("Han","Solo","437294558","m.falcon@smagglers.com")
'''
execute_query(conn,query)

In [66]:
execute_query(conn,"SELECT * FROM employees")

1. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
2. ('Luke', 'Skywalker', '456123789', '')
3. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
4. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')
5. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
6. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
7. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
8. ('Luke', 'Skywalker', '456123789', '')
9. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
10. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')


### Update
Next operation to learn is `UPDATE`. It is used for change the values in the table.<br>
A simple form of updating is presented below. Right after `UPDATE` we specify the name of the table. Then use `SET` to specify the new value for the column. Using `WHERE` clause we can conditions and control which rows should be updated.

In [67]:
# Sets the value of Email column to NULL for the rows where email address is empty
query = '''
UPDATE employees 
SET Email = NULL
WHERE Email=""
'''
execute_query(conn,query)

In [68]:
execute_query(conn,"SELECT * FROM employees")

1. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
2. ('Luke', 'Skywalker', '456123789', None)
3. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
4. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')
5. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
6. ('Darth', 'Vader', '456123789', 'd.vader@deathstar.com')
7. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
8. ('Luke', 'Skywalker', '456123789', None)
9. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
10. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')


We can see that the empty email address is now updated to `None`.<br>
__Note:__ `None` is Python equivalent of `NULL`.


### Delete
Finally, to delete data from table we use `DELETE`. We need to specify the name of the table and conditions of the row(s).

In [69]:
query = '''
DELETE FROM employees
WHERE Surname="Vader"
'''
execute_query(conn,query)
execute_query(conn,"SELECT * FROM employees")

1. ('Luke', 'Skywalker', '456123789', None)
2. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
3. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')
4. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
5. ('Luke', 'Skywalker', '456123789', 'l.skywalker@deathstar.com')
6. ('Luke', 'Skywalker', '456123789', None)
7. ('Leia', 'Organa', '404200501', 'princess@aldraan.gov')
8. ('Han', 'Solo', '437294558', 'm.falcon@smagglers.com')


Adding the condition is important. If no condition is specified all the rows will be removed.

In [70]:
query = '''
DELETE FROM employees
'''
execute_query(conn,query)
execute_query(conn,"SELECT * FROM employees")

As you can see the query doesn't return anything, which suggests the table is empty.

When your work is done on a database make sure you close the connection.

In [71]:
conn.close()

In [76]:
query = '''
SELECT Country, COUNT (*) AS Number
    FROM Customers
    GROUP BY Country
    ORDER BY Country

'''



cur = conn.cursor()
cur.execute(query)
output = cur.fetchall()
print(output)

[('Australia', 35), ('Austria', 31), ('Bahrain', 53), ('Belgium', 29), ('Brazil', 42), ('Canada', 32), ('Cyprus', 41), ('Denmark', 38), ('Finland', 44), ('France', 44), ('Germany', 41), ('Greece', 36), ('Hong Kong', 41), ('Iceland', 30), ('Israel', 35), ('Italy', 34), ('Japan', 34), ('Lithuania', 34), ('Netherlands', 42), ('Norway', 45), ('Poland', 43), ('Portugal', 27), ('SingaporeCzech Republic', 31), ('Spain', 44), ('Sweden', 31), ('Switzerland', 24), ('USA', 36), ('United Kingdom', 37)]


## Exrecises
To make sure you learned writing a query try creating the following tables:
- How many customers are there in each country?
- How many products are in each category?
- A list of customers ordered by their total purchase (it can be across multiple invoices)
- A list of products orders by the number of time they have been ordered
- A list of total sales per country

## Further reading
- [SQLITE documentation](https://www.sqlite.org)
- [SQLITE tutorial](https://www.sqlitetutorial.net/sqlite-python)
- [W3schools](https://www.w3schools.com/sql)