# Introduction to SQL

|DATA TYPE | DESCRIPTION |
| ------ | ------ |
|NULL |The value is a NULL value|
|INTEGER |The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value|
|REAL |The value is a floating point value, stored in 8 bytes|
|TEXT |The value is a text string|
|BLOB |The data is stored exactly as it was input; used for binary data like images|
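
SQLite's built-in `typeof()` function reports which of these storage classes a value uses. The following is a minimal sketch, run through Python's standard `sqlite3` module against a throwaway in-memory database; the literal values are arbitrary examples.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# typeof() returns the storage class of each value as lowercase text.
row = conn.execute(
    "SELECT typeof(NULL), typeof(42), typeof(3.14), typeof('hello'), typeof(x'CAFE')"
).fetchone()
conn.close()

print(row)
```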

![](Schema2.png)

If the "Passengers" table were to be connected to the "Trips" table, which column in "Passengers" should be used as the foreign key? In other words, which column in "Passengers" allows it to be related to "Trips"?


### Using DB Browser

<center>
    <img src = 'images/dbrowse.png' />
</center>

##### `SELECT`

```sql
SELECT colnames
FROM tablename
WHERE conditions
GROUP BY colnames
HAVING conditions
ORDER BY colnames
```
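
To illustrate how these clauses fit together, the sketch below runs one query that uses all of them. The `albums` table and its rows are made up for this example, and the query is run through Python's standard `sqlite3` module so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, invented for this illustration.
conn.execute("CREATE TABLE albums (artist TEXT, year INTEGER, plays INTEGER)")
conn.executemany(
    "INSERT INTO albums VALUES (?, ?, ?)",
    [
        ("a", 1980, 100),
        ("a", 1984, 300),
        ("b", 1984, 50),
        ("b", 1986, 500),
        ("c", 1979, 900),
        ("d", 1990, 20),
    ],
)

query = """
SELECT artist, SUM(plays) AS total_plays   -- columns to return
FROM albums                                -- source table
WHERE year >= 1980                         -- filter rows before grouping
GROUP BY artist                            -- one output row per artist
HAVING SUM(plays) > 100                    -- filter groups after aggregation
ORDER BY total_plays DESC                  -- sort the final result
"""
rows = conn.execute(query).fetchall()
conn.close()

print(rows)
```

Artist `c` is dropped by `WHERE` (its only row is from 1979), and artist `d` is dropped by `HAVING` (its total is below the threshold).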

**Return whole table**

```sql
SELECT *
FROM torrents;
```

**Select specific columns**

```sql
SELECT groupName, artist, groupYear
FROM torrents;
```

**Limit output**

```sql
SELECT *
FROM torrents
LIMIT 5;
```

##### `WHERE`

```sql
SELECT groupName, artist, groupYear
FROM torrents
WHERE groupYear < 1985;
```

**Combining Conditions**

```sql
SELECT groupName, artist
FROM torrents
WHERE groupYear < 1990 AND groupYear > 1985;
```

**BETWEEN**

```sql
SELECT groupName, artist
FROM torrents
WHERE totalSnatched > 500 AND groupYear BETWEEN 1983 AND 1985;
```

**IN**

```sql
SELECT groupName, artist
FROM torrents
WHERE totalSnatched > 500 AND groupYear IN (1980, 1984, 1987);
```

**Sorting Results**

```sql
SELECT groupName, artist
FROM torrents
WHERE groupYear = 1984
ORDER BY totalSnatched;
```

**ASC** and **DESC**

```sql
SELECT groupName, artist
FROM torrents
WHERE groupYear = 1984
ORDER BY totalSnatched DESC;
```

##### PROBLEMS

We will use the example from w3schools [here](https://www.w3schools.com/sql/trysql.asp?filename=trysql_op_in).

1. Write a `SELECT` statement to select all columns from the `Employees` table.
2. What is the name of the customer from Germany in the `Customers` table?
3. Write a query to return the orders with quantity between 10 and 20 in the `OrderDetails` table.
4. Sort the orders in descending order based on quantity in the `OrderDetails` table and limit the results to 5.

**Create New Columns**

```sql
SELECT D02_total_plot * 2.4701
FROM Plots;
```

**Aliases**

```sql
SELECT D02_total_plot * 2.4701 AS D02_total_plot_converted
FROM Plots;
```

**Built-in Functions**

```sql
SELECT ROUND(D02_total_plot * 2.4701, 2) as D02_total_plot_converted
FROM Plots;
```

**Dates**

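SQLite's date functions work with text dates in `yyyy-mm-dd` format. The queries below use `substr` to pull the day, month, and year out of `A01_interview_date` and reassemble them in that order.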

```sql
SELECT A01_interview_date,
       substr(A01_interview_date,7,4) as year,
       substr(A01_interview_date,4,2) as month,
       substr(A01_interview_date,1,2) as day
FROM Farms;
```

```sql
SELECT A01_interview_date,
       substr(A01_interview_date,7,4) || '-' ||
       substr(A01_interview_date,4,2) || '-' ||
       substr(A01_interview_date,1,2) as converted_date
FROM Farms;
```

```sql
SELECT A01_interview_date,
       date(
       substr(A01_interview_date,7,4) || '-' ||
       substr(A01_interview_date,4,2) || '-' ||
       substr(A01_interview_date,1,2)
       ) as converted_date
FROM Farms;
```

```sql
SELECT A01_interview_date,
       date(
       substr(A01_interview_date,7,4) || '-' ||
       substr(A01_interview_date,4,2) || '-' ||
       substr(A01_interview_date,1,2)
       ) as converted_date
FROM Farms
ORDER BY converted_date;
```
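
To see why the conversion matters for sorting, the sketch below applies the same `substr` expression to a few sample strings and orders them both ways. It assumes the dates are stored as `dd/mm/yyyy` text, which is what the `substr` positions above imply. The sample values and the one-column `Farms` table are invented, and the queries are run through Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical one-column stand-in for the real Farms table.
conn.execute("CREATE TABLE Farms (A01_interview_date TEXT)")
conn.executemany(
    "INSERT INTO Farms VALUES (?)",
    [("17/11/2016",), ("05/12/2016",), ("21/04/2017",)],
)

# Same expression as in the queries above: reorder the pieces to yyyy-mm-dd.
converted = """
date(substr(A01_interview_date, 7, 4) || '-' ||
     substr(A01_interview_date, 4, 2) || '-' ||
     substr(A01_interview_date, 1, 2))
"""

# Sorting the raw text compares characters left to right, so the day decides.
raw_order = [r[0] for r in conn.execute(
    "SELECT A01_interview_date FROM Farms ORDER BY A01_interview_date")]

# Sorting the converted value is chronological, because yyyy-mm-dd text
# sorts in the same order as the dates themselves.
date_order = [r[0] for r in conn.execute(
    f"SELECT {converted} AS converted_date FROM Farms ORDER BY converted_date")]
conn.close()

print(raw_order)
print(date_order)
```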

**CASE**

```sql
SELECT Id, country,
       CASE country
           WHEN 'Moz' THEN 'Mozambique'
           WHEN 'Taz' THEN 'Tanzania'
       ELSE 'Unknown Country'
       END AS country_fullname
FROM Farms;
```

```sql
SELECT Id, A11_years_farm,
       CASE
           WHEN  A11_years_farm BETWEEN 1 AND 10 THEN '1-10'
           WHEN  A11_years_farm BETWEEN 11 AND 20 THEN '11-20'
           WHEN  A11_years_farm BETWEEN 21 AND 30 THEN '21-30'
           WHEN  A11_years_farm BETWEEN 31 AND 40 THEN '31-40'
           WHEN  A11_years_farm BETWEEN 41 AND 50 THEN '41-50'
           WHEN  A11_years_farm BETWEEN 51 AND 60 THEN '51-60'
       ELSE '> 60'       
       END AS A11_years_farm_range
FROM Farms;
```

**DISTINCT**

```sql
SELECT DISTINCT A06_province
FROM Farms;
```

**GROUP BY**


```sql
SELECT A08_ward, COUNT(*) as How_many
FROM Farms
GROUP BY A08_ward;
```


**Aggregation in groups**

```sql
SELECT A09_village,
       min(A11_years_farm) as min
FROM Farms
GROUP BY A09_village 
LIMIT 5;
```

**PROBLEMS**

1. Using the database in w3schools, write a query that will return a count of the countries in our `Customers` table.
2. How many countries have more than 10 records in the Customers table?
3. If you wanted the results from:
```sql
SELECT COUNT(*), Country 
FROM Customers 
GROUP BY Country;
```
to be *ordered by the count*, what could you add to the end of the query?
4. What price appears three and only three times in the "Products" table?



### JOINS

**Which Farms with more than 12 people in the household grow Maize?**

```sql
SELECT * FROM Crops
WHERE D_curr_crop = 'maize';
```



```sql
SELECT Id, B_no_membrs 
FROM Farms
WHERE B_no_membrs > 12;
```

```sql
SELECT a.Id, a.B_no_membrs,
       b.Id, b.D_curr_crop
FROM Farms AS a
JOIN Crops AS b
ON a.Id = b.Id AND a.B_no_membrs > 12 AND b.D_curr_crop = 'maize';
```
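
For an inner join, the filters can sit in the `ON` clause, as in the query above, or in a separate `WHERE` clause; both return the same rows. The sketch below checks this on a tiny, invented version of the `Farms` and `Crops` tables (only the columns used above, with made-up values), run through Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature tables; the real ones have many more columns.
conn.executescript("""
CREATE TABLE Farms (Id INTEGER, B_no_membrs INTEGER);
CREATE TABLE Crops (Id INTEGER, D_curr_crop TEXT);
INSERT INTO Farms VALUES (1, 15), (2, 4), (3, 13);
INSERT INTO Crops VALUES (1, 'maize'), (1, 'beans'), (2, 'maize'), (3, 'rice');
""")

# Filters inside ON, as in the query above.
on_rows = conn.execute("""
SELECT a.Id, a.B_no_membrs, b.D_curr_crop
FROM Farms AS a
JOIN Crops AS b
ON a.Id = b.Id AND a.B_no_membrs > 12 AND b.D_curr_crop = 'maize'
""").fetchall()

# Join condition in ON, row filters in WHERE.
where_rows = conn.execute("""
SELECT a.Id, a.B_no_membrs, b.D_curr_crop
FROM Farms AS a
JOIN Crops AS b ON a.Id = b.Id
WHERE a.B_no_membrs > 12 AND b.D_curr_crop = 'maize'
""").fetchall()
conn.close()

print(on_rows)
print(where_rows)
```

Only farm 1 qualifies: farm 2 grows maize but has too few household members, and farm 3 is large enough but does not grow maize.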

**PROBLEMS**

1. After running the query below, what is the `FirstName` of the employee responsible for `OrderID` 10255?
```sql
SELECT Orders.OrderID, Employees.FirstName FROM Orders 
LEFT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID;
```

**Example:** Find which employee is responsible for the most money spent by customers.

- Join the `OrderDetails` table with the `Products` table (and start to use aliases).
- Multiply the quantity by the price.
- Sum the total cost while grouping by `OrderID`.
- Finally, also sum `Quantity` so that the average price per item can be calculated later. Keep `OrderID`, as it is needed to relate this information to the `Employees` table.
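
The steps above can be sketched as follows. This is one possible version, not the only way to write it: the tables are cut-down, invented stand-ins for the w3schools `OrderDetails` and `Products` tables (only the columns needed here, with made-up rows), and the aliases `od`, `p`, `TotalCost`, and `TotalQuantity` are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature tables with made-up rows.
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER, Price REAL);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
INSERT INTO Products VALUES (1, 10.0), (2, 2.5);
INSERT INTO OrderDetails VALUES (100, 1, 2), (100, 2, 4), (101, 2, 10);
""")

per_order = conn.execute("""
SELECT od.OrderID,
       SUM(od.Quantity * p.Price) AS TotalCost,     -- quantity times price, summed
       SUM(od.Quantity)           AS TotalQuantity  -- kept for a later average
FROM OrderDetails AS od
JOIN Products AS p ON od.ProductID = p.ProductID
GROUP BY od.OrderID
ORDER BY od.OrderID
""").fetchall()
conn.close()

print(per_order)
```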



- First, add a join to the `Orders` table, which provides `EmployeeID`.
- Then add a join with the `Employees` table.
- Finally, modify the `GROUP BY` to find the total sales of each employee.
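
A sketch of where these last steps might lead is shown below. As before, the four tables are invented miniatures of the w3schools tables with made-up rows and employee names, and the aliases and output column names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature tables with made-up rows.
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER, Price REAL);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
CREATE TABLE Orders (OrderID INTEGER, EmployeeID INTEGER);
CREATE TABLE Employees (EmployeeID INTEGER, FirstName TEXT);
INSERT INTO Products VALUES (1, 10.0), (2, 2.5);
INSERT INTO OrderDetails VALUES (100, 1, 2), (100, 2, 4), (101, 2, 10), (102, 1, 1);
INSERT INTO Orders VALUES (100, 7), (101, 8), (102, 8);
INSERT INTO Employees VALUES (7, 'Ana'), (8, 'Ben');
""")

per_employee = conn.execute("""
SELECT e.FirstName,
       SUM(od.Quantity * p.Price) AS TotalSales,
       SUM(od.Quantity)           AS TotalQuantity
FROM OrderDetails AS od
JOIN Products  AS p ON od.ProductID = p.ProductID
JOIN Orders    AS o ON od.OrderID = o.OrderID        -- provides EmployeeID
JOIN Employees AS e ON o.EmployeeID = e.EmployeeID   -- provides FirstName
GROUP BY e.EmployeeID, e.FirstName                   -- one row per employee
ORDER BY TotalSales DESC                             -- biggest seller first
""").fetchall()
conn.close()

print(per_employee)
```

The first row of the result is the employee responsible for the most money spent.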

**PROBLEMS**

1. How many orders is the employee with first name "Laura" responsible for?
2. How many total items is "Speedy Express" responsible for shipping?
3. What `PostalCode` (from the `Customers` table) received the most expensive order?


### Connecting with Pandas

```python
#!pip install sqlalchemy

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///hiphop.sqlite')

sql = """
SELECT *
FROM torrents;
"""

df = pd.read_sql_query(sql, engine)

df.head()
```

| |groupName|totalSnatched|artist|groupYear|releaseType|groupId|id|
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
|0|superappin'|239|grandmaster flash & the furious five|1979|single|720949|0|
|1|spiderap / a corona jam|156|ron hunt & ronnie g & the sm crew|1979|single|728752|1|
|2|rapper's delight|480|sugarhill gang|1979|single|18513|2|
|3|rap-o clap-o / el rap-o clap-o|200|joe bataan|1979|single|756236|3|
|4|christmas rappin'|109|kurtis blow|1979|single|71818958|4|


**EXIT**

https://docs.google.com/forms/d/e/1FAIpQLSd5SzTaPSgeWj0ZmdltKlUM21cepqCRd2mmb2p52KUqSYCuGw/viewform?usp=sf_link