### In this notebook we'll go over a brief introduction to the structure of the Sakila Database and setting up SQL in your Python Environment.
#### DISCLAIMER:
______________________________________________________________________________________________________
There are many ways to browse a SQL database. Throughout this workshop we will focus on learning SQL queries using a combination of Python, SQLite, and pandas. Please note that this is a fairly specific way of working with a SQL database, and it may not fit other, more general needs. The primary goal of this section is to teach you how to use SQL queries to retrieve information and load it into a pandas DataFrame. We will not cover more general topics such as relational databases, MySQL, or using a SQL console directly.

______________________________________________________________________________________________________

#### Step 1: Install SQLAlchemy
To start this appendix, install SQLAlchemy. If you are using the Anaconda installation of Python, you can do this
by typing conda install sqlalchemy; otherwise, pip install sqlalchemy works as well.

#### Step 2: Download SQLite Browser
Next, we will download a SQL browser. We will be using SQLite Browser because it is lightweight and free to use.

Download SQLite Browser here: http://sqlitebrowser.org/


#### All done! Now let's look at the database before diving into how to work with it in Python.

Now that we have seen an overview of what the database looks like, let's go ahead and learn how to communicate with it with Python and pandas.

Python ships with the sqlite3 module, which provides a lightweight disk-based database that doesn't require a separate server process. It's useful to prototype with SQLite and then port the code to a larger database system, like MySQL. Let's go ahead and import sqlite3 (and pandas as well).

## What is SQL?
SQL stands for Structured Query Language. It is the standard language for managing (storing and accessing) data held in relational database systems.
Relational databases require a predefined schema that determines the structure of your data before you work with it.

In [23]:
# Picture of the Sakila schema
from IPython.display import Image
Image(url="https://dev.mysql.com/doc/sakila/en/images/sakila-schema.png")

In [19]:
import sqlite3
import pandas as pd

To use the module, you must first create a Connection object that represents the database. If the database file already exists, sqlite3 will connect to it; if it does not exist, sqlite3 will create it automatically.

Let's make the connection!
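As a minimal sketch of the create-or-connect behaviour described above, the snippet below connects to a file that does not exist yet and then reconnects to it. The file name `example.db` and the table `demo` are hypothetical and live in a throwaway temporary directory; they are not part of the workshop's Sakila database.

```python
import os
import sqlite3
import tempfile

# Hypothetical path in a temporary directory (not the Sakila database file)
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
existed_before = os.path.exists(db_path)

# First connection: the file is missing, so SQLite creates a new database
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE demo (x INTEGER)")
con.commit()
con.close()

# Second connection: the file now exists, so we connect to the same database
con = sqlite3.connect(db_path)
tables = con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
con.close()

print(existed_before, os.path.exists(db_path), tables)
```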

In [20]:
# Connect to the database


In [22]:
# Set SQL query as a comment

# Use pandas to pass the SQL query using the connection from sqlite3

# Show the resulting DataFrame
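
One way the commented steps above might be filled in is sketched here. It assumes a small in-memory stand-in for the Sakila customer table (the rows are made up) so that it runs without the workshop's database file; in the workshop you would pass the path of your own Sakila SQLite file to `sqlite3.connect` instead.

```python
import sqlite3
import pandas as pd

# Assumption: an in-memory stand-in instead of the real Sakila file
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT, last_name TEXT);
    INSERT INTO customer VALUES (1, 'ANA', 'LOPEZ'), (2, 'BEN', 'KIM');
""")

# Set the SQL query as a (triple-quoted) string
query = """
SELECT first_name, last_name
FROM customer;
"""

# Use pandas to pass the SQL query using the sqlite3 connection
df = pd.read_sql_query(query, con)

# Show the resulting DataFrame
print(df.head())
```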


### SQL SELECT Statement
The SELECT statement is used to select data from a database. The result is then stored in a result table, sometimes called the result-set.

### Syntax for SQL SELECT
SELECT column_name FROM table_name

We could also select multiple columns:

SELECT column_name1,column_name2 
FROM table_name

Or we could select everything in a table using *

SELECT * FROM table_name

To see how this and multiple other queries work, we'll connect to the database and make a function that automatically takes in our query and returns a DataFrame.
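
A helper along these lines could look like the sketch below. The function name `sql_query` and the in-memory stand-in table are assumptions made for illustration (the rows are made up); with the real database you would connect to your Sakila SQLite file instead.

```python
import sqlite3
import pandas as pd

# Assumption: an in-memory stand-in for the Sakila customer table
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT, last_name TEXT);
    INSERT INTO customer VALUES (1, 'ANA', 'LOPEZ'), (2, 'BEN', 'KIM');
""")

def sql_query(query, connection=con):
    """Run a SQL query on the given connection and return a DataFrame."""
    return pd.read_sql_query(query, connection)

# Select a single column
names = sql_query("SELECT first_name FROM customer;")
print(names)
```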

#### Selecting Multiple Columns
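
A short sketch of a multi-column SELECT, assuming a made-up in-memory table that borrows column names from Sakila's customer table. It uses plain `sqlite3` to stay dependency-free.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT, last_name TEXT);
    INSERT INTO customer VALUES (1, 'ANA', 'LOPEZ'), (2, 'BEN', 'KIM');
""")

# Column names are separated by commas
query = "SELECT first_name, last_name FROM customer;"
# The same query string could also be passed to pandas.read_sql_query(query, con)
rows = con.execute(query).fetchall()
print(rows)
```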

#### Selecting Everything from table with *
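
As an illustration, the sketch below selects every column with * and reads the column names back from the cursor. The table is a made-up in-memory stand-in, not the real Sakila data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT, last_name TEXT);
    INSERT INTO customer VALUES (1, 'ANA', 'LOPEZ'), (2, 'BEN', 'KIM');
""")

cur = con.execute("SELECT * FROM customer;")
# cursor.description holds one entry per returned column; the name comes first
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```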

### Syntax for the SQL DISTINCT Statement

In a table, a column may contain duplicate values, and sometimes you only want to list the distinct (unique) values. The DISTINCT keyword returns only those distinct values.

SELECT DISTINCT column_name
FROM table_name;
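
To make the difference concrete, here is a small sketch comparing a plain SELECT with SELECT DISTINCT. The table and its rows are made up for illustration; only the column names are borrowed from Sakila.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT, store_id INTEGER);
    INSERT INTO customer VALUES (1, 'ANA', 1), (2, 'BEN', 2), (3, 'CARA', 1);
""")

# Without DISTINCT: one row per customer, so store 1 appears twice
all_stores = con.execute("SELECT store_id FROM customer;").fetchall()

# With DISTINCT: each store_id appears only once
distinct_stores = con.execute("SELECT DISTINCT store_id FROM customer;").fetchall()

print(len(all_stores), sorted(distinct_stores))
```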

### Syntax for the SQL WHERE

The WHERE clause filters records, extracting only those that fulfill a specified condition.

SELECT column_name
FROM table_name
WHERE column_name operator desired_value;

Note that there are a variety of comparison operators you can use in a WHERE clause.



<table>
<tr>
<th>Operator</th>
<th>Description</th>
</tr>
<tr>
<td>=</td>
<td> Equal</td>
</tr>
<tr>
<td>&lt;&gt;</td>
<td>Not equal. Note: In some versions of SQL this operator may be written !=</td>
</tr>
<tr>
<td>&gt;</td>
<td>Greater than</td>
</tr>
<tr>
<td>&lt;</td>
<td>Less than</td>
</tr>
<tr>
<td>&gt;=</td>
<td>Greater than or equal</td>
</tr>
<tr>
<td>&lt;=</td>
<td>Less than or equal</td>
</tr>
</table>

SQL requires single quotes around text values, while numeric fields are not enclosed in quotes.
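
The quoting rule and a few of the operators can be tried out with the sketch below. It assumes a made-up in-memory table rather than the real Sakila data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT, store_id INTEGER);
    INSERT INTO customer VALUES (1, 'ANA', 1), (2, 'BEN', 2), (3, 'CARA', 1);
""")

# Text value: wrapped in single quotes
by_name = con.execute(
    "SELECT customer_id FROM customer WHERE first_name = 'BEN';"
).fetchall()

# Numeric value: no quotes
by_store = con.execute(
    "SELECT first_name FROM customer WHERE store_id = 1;"
).fetchall()

# Not equal
other_stores = con.execute(
    "SELECT first_name FROM customer WHERE store_id <> 1;"
).fetchall()

print(by_name, sorted(by_store), other_stores)
```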

### Syntax for AND

The AND operator is used to filter records based on more than one condition.

The AND operator displays a record if both the first condition AND the second condition are true.
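
The general shape is `SELECT column_name FROM table_name WHERE condition1 AND condition2`. A minimal sketch, assuming a made-up in-memory table with Sakila-style column names:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT,
                           store_id INTEGER, active INTEGER);
    INSERT INTO customer VALUES
        (1, 'ANA', 1, 1), (2, 'BEN', 2, 1), (3, 'CARA', 1, 0);
""")

# Both conditions must hold: the customer is in store 1 AND is active
rows = con.execute("""
    SELECT first_name
    FROM customer
    WHERE store_id = 1 AND active = 1;
""").fetchall()
print(rows)
```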

### Syntax for OR

The OR operator displays a record if either the first condition OR the second condition is true.
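
The general shape is `SELECT column_name FROM table_name WHERE condition1 OR condition2`. A minimal sketch, assuming a made-up in-memory table with Sakila-style column names:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT,
                           store_id INTEGER, active INTEGER);
    INSERT INTO customer VALUES
        (1, 'ANA', 1, 1), (2, 'BEN', 2, 1), (3, 'CARA', 1, 0);
""")

# Either condition is enough: the customer is in store 2 OR is inactive
rows = con.execute("""
    SELECT first_name
    FROM customer
    WHERE store_id = 2 OR active = 0;
""").fetchall()
print(sorted(rows))
```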

Before we move on to wildcards, ORDER BY, and GROUP BY, let's take a look at aggregate functions.

* AVG() - Returns the average value.
* COUNT() - Returns the number of rows.
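* FIRST() - Returns the first value (available in some SQL dialects, but not in SQLite).
* LAST() - Returns the last value (available in some SQL dialects, but not in SQLite).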
* MAX() - Returns the largest value.
* MIN() - Returns the smallest value.
* SUM() - Returns the sum.

You can call any of these aggregate functions on a column to get the resulting values back.

The usual syntax is:

SELECT column_name, aggregate_function(column_name) <br/>
FROM table_name <br/>
WHERE column_name operator value
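
As a rough illustration, the sketch below applies several aggregate functions to the payments of one staff member. The payment table is a made-up in-memory stand-in, and the amounts are chosen to keep the arithmetic easy to follow.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE payment (payment_id INTEGER, staff_id INTEGER, amount REAL);
    INSERT INTO payment VALUES (1, 1, 2.5), (2, 1, 4.5), (3, 2, 6.0);
""")

# Aggregate over the rows that pass the WHERE filter (staff member 1 only)
count, total, average, smallest, largest = con.execute("""
    SELECT COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount)
    FROM payment
    WHERE staff_id = 1;
""").fetchone()

print(count, total, average, smallest, largest)
```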

## SQL Wildcards

A wildcard character can be used to substitute for any other characters in a string. In SQL, wildcard characters are used with the SQL LIKE operator. The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.

There are several wildcard operators:

<table>
<tr>
<th>Wildcard</th>
<th>Description</th>
</tr>
<tr>
<td>%</td>
<td>A substitute for zero or more characters</td>
</tr>
<tr>
<td>_</td>
<td>A substitute for a single character</td>
</tr>
<tr>
<td>[character_list]</td>
<td>Sets and ranges of characters to match (supported in some dialects such as SQL Server; SQLite's LIKE only supports % and _)</td>
</tr>
</table>
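
The two wildcards that SQLite's LIKE understands can be sketched as follows, using a made-up in-memory table of first names:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT);
    INSERT INTO customer VALUES (1, 'ANA'), (2, 'BEN'), (3, 'KEN'), (4, 'ANNE');
""")

# % matches zero or more characters: names starting with A
starts_with_a = con.execute(
    "SELECT first_name FROM customer WHERE first_name LIKE 'A%';"
).fetchall()

# _ matches exactly one character: three-letter names ending in EN
ends_with_en = con.execute(
    "SELECT first_name FROM customer WHERE first_name LIKE '_EN';"
).fetchall()

print(sorted(starts_with_a), sorted(ends_with_en))
```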

## SQL ORDER BY

The ORDER BY keyword is used to sort the result-set by one or more columns. The ORDER BY keyword sorts the records in ascending order by default. To sort the records in a descending order, you can use the DESC keyword. The syntax is:

SELECT column_name <br/>
FROM table_name <br/>
ORDER BY column_name ASC|DESC

Let's see it in action:
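
A minimal sketch, assuming a made-up in-memory payment table rather than the real Sakila data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE payment (payment_id INTEGER, amount REAL);
    INSERT INTO payment VALUES (1, 4.5), (2, 2.5), (3, 6.0);
""")

# Ascending is the default, so ASC can be omitted
ascending = con.execute(
    "SELECT amount FROM payment ORDER BY amount;"
).fetchall()

# DESC reverses the sort order
descending = con.execute(
    "SELECT amount FROM payment ORDER BY amount DESC;"
).fetchall()

print(ascending)
print(descending)
```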

## SQL GROUP BY 

The GROUP BY statement is used with the aggregate functions to group the results by one or more columns. The syntax is:

SELECT column_name, aggregate_function(column_name) <br/>
FROM table_name <br/>
WHERE column_name operator value <br/>
GROUP BY column_name; 

Let's see how it works.
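
A minimal sketch, assuming a made-up in-memory payment table: each staff member's payments are collapsed into one row holding a count and a total.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE payment (payment_id INTEGER, staff_id INTEGER, amount REAL);
    INSERT INTO payment VALUES (1, 1, 2.5), (2, 1, 4.5), (3, 2, 6.0);
""")

# One output row per staff_id, with aggregates computed within each group
rows = con.execute("""
    SELECT staff_id, COUNT(*), SUM(amount)
    FROM payment
    GROUP BY staff_id
    ORDER BY staff_id;
""").fetchall()
print(rows)
```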

## SQL NESTED SELECT
  
  
We can include a SELECT statement inside another SELECT statement.
Let's say we wanted to get all the payments where the amount was equal to the largest payment handled by staff member 1. How would we do that?
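
One possible reading of this question is sketched below: the inner SELECT finds the largest amount among staff member 1's payments, and the outer SELECT returns every payment with that amount, whoever handled it. The table is a made-up in-memory stand-in for Sakila's payment table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE payment (payment_id INTEGER, staff_id INTEGER, amount REAL);
    INSERT INTO payment VALUES
        (1, 1, 2.5), (2, 1, 4.5), (3, 2, 4.5), (4, 2, 6.0);
""")

# The inner query runs first and yields a single value (4.5 for this data);
# the outer query then compares every payment's amount against it
rows = con.execute("""
    SELECT payment_id, staff_id, amount
    FROM payment
    WHERE amount = (SELECT MAX(amount) FROM payment WHERE staff_id = 1)
    ORDER BY payment_id;
""").fetchall()
print(rows)
```

To restrict the result to staff member 1's own payments, the outer query would need its own `staff_id = 1` condition as well.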


## SQL JOINS
The JOIN statement is used to combine columns from two different tables in a relational database by matching rows on a related column.

In [16]:
Image(url="https://www.dofactory.com/Images/sql-joins.png")

What if we wanted to get data from two tables, say customer and payment?
We can do this explicitly with a JOIN statement, or implicitly with a WHERE clause.

Instead of using a JOIN statement, we can list both the customer and payment tables in the FROM clause and match their rows in a WHERE clause.
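
Both styles can be compared with the sketch below, which assumes made-up in-memory stand-ins for the customer and payment tables. For an inner join like this one, the explicit and implicit forms should return the same rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER, first_name TEXT);
    CREATE TABLE payment (payment_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'ANA'), (2, 'BEN'), (3, 'CARA');
    INSERT INTO payment VALUES (1, 1, 2.5), (2, 1, 4.5), (3, 2, 6.0);
""")

# Explicit join: the matching rule goes in the ON clause
explicit = con.execute("""
    SELECT c.first_name, p.amount
    FROM customer AS c
    JOIN payment AS p ON c.customer_id = p.customer_id
    ORDER BY p.payment_id;
""").fetchall()

# Implicit join: list both tables in FROM and match them in WHERE
implicit = con.execute("""
    SELECT c.first_name, p.amount
    FROM customer AS c, payment AS p
    WHERE c.customer_id = p.customer_id
    ORDER BY p.payment_id;
""").fetchall()

print(explicit)
print(explicit == implicit)
```

CARA has no payments in this sample data, so she does not appear in either result.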