## WELCOME TO YOUR FIRST JUPYTER NOTEBOOK!!!
- Jupyter combines instructional text with cells in which you can run code, in our case SQL. We use it to make the lecture easier to follow.
- Jupyter runs through Python, so a few small extras are needed when executing SQL commands.


# Basic commands

|Key pressed| Result |
|--------------|----------|
|a| Add a cell above|
|b| Add a cell below|
|c| Copy a cell|
|v| Paste a cell|
|x| Cut a cell|
|dd| Delete a cell|
|Ctrl + z| Undo|
|Ctrl + Shift + z| Redo|
|Ctrl + Enter| Execute the code in a cell|


# Installing the SQL extension

The first step is to install the SQL extension for Jupyter (the ipython-sql package). We can do it with the following command:

In [2]:
!python3 -m pip install ipython-sql

You should consider upgrading via the '/usr/local/bin/python3 -m pip install --upgrade pip' command.


# Loading the installed extension
This is the second step now, and the first step from now on every time we want to start working with a database.
Load the SQL library.

In [4]:
%load_ext sql

# Using Jupyter to run SQL
Because Jupyter runs through Python, executing an SQL statement requires a minimal amount of extra syntax. In our case:
- The **%sql** before each command
- **\** to continue a line of code on a new line

# Connecting with the MIMIC III demo database

In [6]:
# Connect to the MIMIC database. Replace the path below with the location where you saved mimic3.db.
%sql sqlite://///"Users/leonidas/Desktop/AUTH\ HealthData/Scripts/SQL_HSDA/mimic3.db"

(pysqlite2.dbapi2.OperationalError) near "HealthData": syntax error
[SQL: HealthData/Scripts/SQL_HSDA/mimic3.db]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
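
The error message suggests that the connection string was cut at the space in `AUTH HealthData`, and that the rest of the path was then treated as SQL. As a check that does not depend on the `%sql` magic, here is a minimal sketch using Python's built-in `sqlite3` module, which accepts paths containing spaces. The folder, file and table below are hypothetical stand-ins created in a temporary directory, not the real MIMIC database.

```python
import os
import sqlite3
import tempfile

# Hypothetical stand-in for a folder whose name contains a space.
base_dir = tempfile.mkdtemp()
folder = os.path.join(base_dir, "AUTH HealthData")
os.makedirs(folder)
db_path = os.path.join(folder, "demo.db")

# sqlite3.connect takes the path as a plain string, so the space is harmless.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE patients (subject_id INTEGER, gender TEXT)")
conn.commit()

# Confirm that the database file was created and can be queried.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
conn.close()
print(tables)
```

If the `%sql` connection keeps failing, one option is to copy mimic3.db to a folder whose path has no spaces. Note also that in SQLAlchemy-style URLs an absolute path on macOS or Linux is normally written with four slashes, as in `sqlite:////absolute/path/to/file.db`.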


## Show the tables of the database
Let's start by finding out which tables our database mimic3.db contains:
- All information about the tables and fields can be found here: 
https://mimic.mit.edu/docs/iii/tables/

In [None]:
%sql SELECT name FROM sqlite_master WHERE type='table';
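
To see what the `sqlite_master` query does without needing mimic3.db, here is a small sketch with Python's built-in `sqlite3` module and an in-memory database. The two tables and the view `v_ids` are made up for illustration; they show that the `type='table'` filter leaves out other objects such as views.

```python
import sqlite3

# In-memory toy database; the tables and the view are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (subject_id INTEGER, ethnicity TEXT)")
conn.execute("CREATE TABLE patients (subject_id INTEGER, gender TEXT)")
conn.execute("CREATE VIEW v_ids AS SELECT subject_id FROM patients")

# Same query as in the notebook cell, sorted for a stable result.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]

# Without the filter, sqlite_master also lists the view.
all_objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name").fetchall()

print(table_names)
print(all_objects)
```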

## Show columns and other information

The **PRAGMA** command, in conjunction with the **table_info** function, returns one row for each column in the named table.


In [None]:
%sql PRAGMA table_info(admissions);

#### Columns in the result set include: 
> - The column name
> - Data type
> - Whether or not the column can be NULL 
> - The default value for the column. 
> - The "pk" column in the result set is zero for columns that are not part of the primary key, and is the index of the column in the primary key for columns that are part of the primary key.
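
The list above can be checked on a toy table. This is a minimal sketch with Python's built-in `sqlite3` module; the table definition is a simplified, hypothetical version of `admissions`, not the real MIMIC schema. Each returned row has the form `(cid, name, type, notnull, dflt_value, pk)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table definition for illustration only.
conn.execute(
    "CREATE TABLE admissions ("
    " row_id INTEGER PRIMARY KEY,"
    " subject_id INTEGER NOT NULL,"
    " ethnicity TEXT DEFAULT 'UNKNOWN')"
)

info = conn.execute("PRAGMA table_info(admissions)").fetchall()

for cid, name, col_type, notnull, dflt_value, pk in info:
    # notnull is 1 when the column was declared NOT NULL;
    # pk is 0 for columns outside the primary key.
    print(name, col_type, notnull, dflt_value, pk)
```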

## The SELECT command and friends ...

The SQL SELECT statement is used to retrieve records from one or more tables in your SQL database. The records retrieved are known as a result set. 

The skeleton of the command is:

`SELECT columnname FROM tablename ;`

Some useful advice before we continue:
> SQL keywords are case insensitive, and in SQLite so are table and column names. This means your command will run whether you write it in capital or small letters. To keep your code easy to read and debug, though, it is recommended to write SQL keywords in capital letters and table or column names in lower case, or exactly as they are defined in your database.  

Let's try out an example from the first table in alphabetical order that we have, admissions. Let's check out what is included.

In [None]:
%sql SELECT ethnicity FROM admissions;

To add more columns to our SELECT command, we separate the column names with commas ','. 

There **MUST NOT** be a comma after the last column name. 

`SELECT column1,column2,column3,...,columnn FROM table;`

In [None]:
%sql SELECT ethnicity,admittime,diagnosis FROM admissions;

### Exercise 
Try to retrieve the date of birth **dob** and date of death **dod** from the **patients** table. 

In [None]:
%sql

### Including all the columns 

If we would like to include all the columns of the table, we just use * instead of listing all the field (column) names. 

` SELECT * FROM tablename;`

In [None]:
%sql SELECT * FROM icustays;

## The DISTINCT command for removing duplicates

In case you need to retrieve unique rows and remove the duplicates, use the DISTINCT keyword right after the SELECT keyword:

` SELECT DISTINCT column1,... FROM table;`


In [None]:
%sql SELECT DISTINCT ethnicity FROM admissions;
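
To make the effect of DISTINCT visible, here is a small sketch using Python's built-in `sqlite3` module on a made-up table with repeated values. The five rows are hypothetical and only serve to compare the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (ethnicity TEXT)")
# Hypothetical rows with deliberate duplicates.
conn.executemany(
    "INSERT INTO admissions VALUES (?)",
    [("WHITE",), ("ASIAN",), ("WHITE",), ("ASIAN",), ("WHITE",)],
)

# Without DISTINCT every row is returned, duplicates included.
all_rows = conn.execute("SELECT ethnicity FROM admissions").fetchall()

# With DISTINCT each value appears once (sorted here for a stable result).
unique_rows = conn.execute(
    "SELECT DISTINCT ethnicity FROM admissions ORDER BY ethnicity").fetchall()

print(len(all_rows), unique_rows)
```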

## Exercise
Retrieve all the distinct combinations of the religion and marital status fields in the admissions table.

In [None]:
%sql 

## Sorting results using ORDER BY

The **ORDER BY** clause sorts the output of the query. The default is ascending order (**ASC**). You can always use the **DESC** keyword to reverse the order:


In [None]:
%sql SELECT DISTINCT subject_id,ethnicity FROM admissions ORDER BY ethnicity;

In [None]:
%sql SELECT DISTINCT subject_id,ethnicity FROM admissions ORDER BY ethnicity DESC;

## Sorting by multiple fields
Sorting can be extended to multiple fields, which are applied in the order in which they are listed.
For example, you can see all the possible combinations of ethnicity and marital status, sorted by both fields, using ORDER BY:

In [None]:
%sql SELECT DISTINCT ethnicity,marital_status FROM admissions ORDER BY \
ethnicity,marital_status;
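
The following sketch, using Python's built-in `sqlite3` module on a few made-up rows, shows how the second field only breaks ties within the first one, and that each field can take its own ASC or DESC. The data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (ethnicity TEXT, marital_status TEXT)")
# Hypothetical rows, inserted in no particular order.
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?)",
    [("WHITE", "SINGLE"), ("ASIAN", "MARRIED"),
     ("WHITE", "MARRIED"), ("ASIAN", "SINGLE")],
)

# Sort by ethnicity first, then by marital_status within each ethnicity.
both_asc = conn.execute(
    "SELECT DISTINCT ethnicity, marital_status FROM admissions "
    "ORDER BY ethnicity, marital_status").fetchall()

# Each field can have its own direction.
mixed = conn.execute(
    "SELECT DISTINCT ethnicity, marital_status FROM admissions "
    "ORDER BY ethnicity DESC, marital_status ASC").fetchall()

print(both_asc)
print(mixed)
```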

## LIMIT Statement
To limit your results, use **LIMIT** followed by the number of rows to return, e.g.

In [None]:
%sql SELECT DISTINCT subject_id,ethnicity FROM admissions ORDER BY ethnicity LIMIT 10;
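
As a quick illustration that LIMIT is applied after the sorting, here is a minimal sketch with Python's built-in `sqlite3` module. The five rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (subject_id INTEGER, ethnicity TEXT)")
# Hypothetical rows for illustration.
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?)",
    [(10, "WHITE"), (11, "ASIAN"), (12, "HISPANIC"),
     (13, "BLACK"), (14, "WHITE")],
)

# The rows are sorted first; LIMIT then keeps only the first three.
first_three = conn.execute(
    "SELECT subject_id, ethnicity FROM admissions "
    "ORDER BY ethnicity, subject_id LIMIT 3").fetchall()

print(first_three)
```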

### EXERCISE 

- Find the **description** of **caregivers** in alphabetical order and print the first 10 of them, from the **caregivers** table.

In [None]:
%sql SELECT DISTINCT description FROM caregivers ORDER BY description LIMIT 10;

## ALIASES

You can rename a column in your query output using the **AS** keyword, which makes your code more understandable and readable. To do so, write **AS** after the column you would like to rename, followed by the new name (alias). An alias is only a temporary name that exists for the duration of the query. You can also apply aliases to tables. The structure is as follows:


`SELECT column1 AS newcolumn1,... FROM table1;` 

In [None]:
%sql SELECT subject_id AS PatientNo, dob AS DateOfBirth FROM patients;
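
To confirm that an alias only changes the column names of the output and not the table itself, here is a small sketch with Python's built-in `sqlite3` module. The single row is hypothetical; `cursor.description` holds the column names of the result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (subject_id INTEGER, dob TEXT)")
# Hypothetical row for illustration.
conn.execute("INSERT INTO patients VALUES (1, '2100-01-01')")

# Column names of the result when aliases are used.
cur = conn.execute(
    "SELECT subject_id AS PatientNo, dob AS DateOfBirth FROM patients")
aliased_names = [col[0] for col in cur.description]

# The table itself keeps its original column names.
cur = conn.execute("SELECT * FROM patients")
original_names = [col[0] for col in cur.description]

print(aliased_names, original_names)
```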

## Filtering results. The WHERE Statement

The WHERE clause is used to filter the results, keeping only the rows that satisfy a given condition.

`SELECT column1, column2, ... FROM table1 WHERE filter1 ;`

The filter condition is built with operators such as the following:

|Operator	| Description |
|--------------|----------|
| AND, OR, NOT| Logical operators|
|=|	Equal|	
|>|	Greater than|	
|<|	Less than|	
|>=|	Greater than or equal|	
|<=|	Less than or equal|	
|<>|	Not equal. In some versions of SQL may be written as !=|	
|BETWEEN|	Between a certain range|
|LIKE|	Search for a pattern|	
|IN|	To specify multiple possible values for a column|

Let's start with a simple example to make things clearer. Filter the women in the **patients** table.

In [None]:
%sql SELECT * FROM patients WHERE gender = 'F' ;

As there are only two categories, alternative queries could be:
- %sql SELECT * FROM patients WHERE gender <> 'M' ;  
- %sql SELECT * FROM patients WHERE gender IS NOT 'M' ;
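
The three queries give the same result only while every row has a gender value. The sketch below uses Python's built-in `sqlite3` module on three made-up rows, one of them with a missing (NULL) gender, to show where `<>` and SQLite's `IS NOT` differ. Whether the real patients table contains NULL genders is not assumed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (subject_id INTEGER, gender TEXT)")
# Hypothetical rows; the third one has no gender recorded.
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [(1, "F"), (2, "M"), (3, None)],
)

def ids(condition):
    """Return the subject_ids matching a WHERE condition, in order."""
    query = "SELECT subject_id FROM patients WHERE " + condition + " ORDER BY subject_id"
    return [row[0] for row in conn.execute(query)]

equal_f = ids("gender = 'F'")        # only the row with 'F'
not_equal_m = ids("gender <> 'M'")   # a NULL comparison is not true, row 3 is dropped
is_not_m = ids("gender IS NOT 'M'")  # IS NOT treats NULL as a value, row 3 is kept

print(equal_f, not_equal_m, is_not_m)
```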

## Exercise 
Select the patients from the **admissions** table with subject_id between 10000 and 40000

In [None]:
%sql SELECT * FROM admissions WHERE subject_id BETWEEN 10000 AND 40000;

## LIKE statement
You can use LIKE to search for a pattern, a word, or part of a word using % and _. 
These are called wildcards:

|Wildcard | Explanation|
|--------------|----------|
|%|Allows you to match any string of any length (including zero length)|
|_|Allows you to match on a single character|

To better understand how it works, take a look at the following table:

|LIKE Operator|Description|
|--------------|----------|
|WHERE drug LIKE 'a%'	|Finds any values that start with "a"|
|WHERE drug LIKE '%a'	|Finds any values that end with "a"|
|WHERE drug LIKE '%or%'	|Finds any values that have "or" in any position|
|WHERE drug LIKE '_r%'	|Finds any values that have "r" in the second position|
|WHERE drug LIKE 'a_%'	|Finds any values that start with "a" and are at least 2 characters in length|
|WHERE drug LIKE 'a__%'	|Finds any values that start with "a" and are at least 3 characters in length|
|WHERE drug LIKE 'a%s'	|Finds any values that start with "a" and end with "s"|
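
A few of the patterns from the table can be tried on a toy list of drug names. This is a minimal sketch with Python's built-in `sqlite3` module; the six names are chosen for illustration and are not taken from the prescriptions table. Note that in SQLite, LIKE ignores case for ASCII letters by default, so 'a%' also matches names starting with "A".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prescriptions (drug TEXT)")
# Hypothetical drug names for illustration.
conn.executemany(
    "INSERT INTO prescriptions VALUES (?)",
    [("Aspirin",), ("Amiodarone",), ("Morphine",),
     ("Warfarin",), ("Insulin",), ("Propofol",)],
)

def matches(pattern):
    """Return the drug names matching a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT drug FROM prescriptions WHERE drug LIKE ? ORDER BY drug",
        (pattern,))
    return [row[0] for row in rows]

starts_with_a = matches("a%")    # starts with "a" (or "A")
ends_with_in = matches("%in")    # ends with "in"
contains_or = matches("%or%")    # "or" in any position
second_is_r = matches("_r%")     # "r" in the second position

print(starts_with_a, ends_with_in, contains_or, second_is_r)
```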


### EXERCISE 

- Find all generic drug names that contain the word *Magnesium* in the **prescriptions** table and are not *Magnesium Sulfate*.

In [None]:
%sql SELECT * FROM prescriptions WHERE drug_name_generic LIKE '%Magnesium%' \
AND drug_name_generic IS NOT 'Magnesium Sulfate';

## IN statement

**IN** keeps only the rows whose column value matches one of the values in a given list. E.g.  

In [None]:
%sql SELECT * FROM prescriptions WHERE drug IN \
('Magnesium Oxide','Magnesium Citrate','Magnesium Sulfate');

## Exercise 
Try to get the same result as above using the OR operator as an alternative.

In [None]:
%sql SELECT * FROM prescriptions WHERE drug = 'Magnesium Oxide' \
OR drug = 'Magnesium Citrate' \
OR drug = 'Magnesium Sulfate';
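
To check that the IN form and the OR form really select the same rows, here is a small sketch using Python's built-in `sqlite3` module. The table contents are made up for the comparison and are not real MIMIC prescriptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prescriptions (row_id INTEGER, drug TEXT)")
# Hypothetical rows for illustration.
conn.executemany(
    "INSERT INTO prescriptions VALUES (?, ?)",
    [(1, "Magnesium Oxide"), (2, "Aspirin"), (3, "Magnesium Citrate"),
     (4, "Magnesium Sulfate"), (5, "Magnesium Oxide")],
)

# Filter with a list of allowed values.
with_in = conn.execute(
    "SELECT * FROM prescriptions WHERE drug IN "
    "('Magnesium Oxide', 'Magnesium Citrate', 'Magnesium Sulfate') "
    "ORDER BY row_id").fetchall()

# Same filter written as a chain of OR conditions.
with_or = conn.execute(
    "SELECT * FROM prescriptions WHERE drug = 'Magnesium Oxide' "
    "OR drug = 'Magnesium Citrate' "
    "OR drug = 'Magnesium Sulfate' "
    "ORDER BY row_id").fetchall()

print(with_in == with_or, len(with_in))
```

IN is usually the more compact choice once the list grows beyond two or three values.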