**SQL**: <u>Structured Query Language</u> for *Relational databases*.<br>

* *Rows* aka **Records**<br>
* *Columns* aka **Fields**

# Connecting
**Engine connection blueprint**: <code>dialect+driver://username:password@host:port/database</code>
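
To make the parts of the blueprint concrete, here is a minimal sketch that splits a connection URL with Python's standard library. The URL, user, and password are made-up placeholders, not a real server.

```python
from urllib.parse import urlsplit

# Hypothetical URL following dialect+driver://username:password@host:port/database
url = "postgresql+psycopg2://alice:secret@localhost:5432/demo"

parts = urlsplit(url)
dialect, driver = parts.scheme.split("+")  # the scheme carries dialect and driver
database = parts.path.lstrip("/")          # the path carries the database name

print(dialect, driver, parts.username, parts.hostname, parts.port, database)
```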

In [1]:
#initializing sql magic in jupyter notebook
%load_ext sql

In [111]:
#sample postgresql connection
%sql postgresql+psycopg2://student:datacamp@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


'Connected: student@census'

## Table names

### PostgreSQL
<code>information_schema</code> is a built-in schema of read-only views that describe the objects in the current database (tables, columns, and so on). <code>'public'</code> is the default schema in which user-created tables live, so filtering on it returns only those tables.

In [5]:
%%sql
select table_name 
from information_schema.tables 
where table_schema='public'

 * postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
8 rows affected.


table_name
vrska
census1
data
data1
employees3
users
employees
employees_2


### SQLite

In [3]:
#connecting to local .sqlite db file
%sql sqlite:///sql_files/census.sqlite

'Connected: @sql_files/census.sqlite'

In [7]:
%%sql
select name from sqlite_master

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


name
census
state_fact
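
<code>sqlite_master</code> lists every schema object, not only tables. A minimal sketch with Python's built-in <code>sqlite3</code> module, using a throwaway in-memory database whose table and index names are illustrative only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE census (state TEXT, pop2000 INTEGER)")
conn.execute("CREATE TABLE state_fact (name TEXT, abbreviation TEXT)")
conn.execute("CREATE INDEX idx_state ON census (state)")

# Filter on type to keep tables and skip indexes, views, and triggers
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```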


## Column names

### PostgreSQL

In [2]:
#connecting to local postgres server
%sql postgresql://postgres:taskuarvuti@localhost:5432/postgres

'Connected: postgres@postgres'

In [3]:
%%sql
--table names
select table_name 
from information_schema.tables 
where table_schema='public'

 * postgresql://postgres:***@localhost:5432/postgres
11 rows affected.


table_name
cities
countries
languages
economies
currencies
populations
countries_plus
economies2010
economies2015
geogropued_countries


In [7]:
%%sql
--column names in languages
select column_name, data_type
from information_schema.columns
where table_name = 'languages'

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


column_name,data_type
lang_id,integer
code,character varying
name,character varying
percent,real
official,boolean
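
SQLite has no <code>information_schema</code>; its counterpart for column names is <code>PRAGMA table_info</code>. A minimal sketch, assuming an in-memory database and a cut-down, hypothetical version of the <code>languages</code> table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE languages (lang_id INTEGER, code TEXT, official BOOLEAN)")

# Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(languages)")]
print(columns)
```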


# Selecting columns
## SELECT

In [8]:
%%sql
--column names in census table
select * 
from census 
limit 0

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,sex,age,pop2000,pop2008


In [12]:
%%sql
select state,sex 
from census 
limit 3

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,sex
Illinois,M
Illinois,M
Illinois,M


### distinct

Selecting unique values only (dropping duplicates)

In [7]:
%%sql
select distinct state,sex
from census
limit 3

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,sex
Illinois,M
Illinois,F
New Jersey,M


### count
Returns the number of rows a query produces; combined with <code>DISTINCT</code>, it counts unique values.

In [11]:
%%sql
--number of unique states
select count(distinct state) as "# of unique states"
from census

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


# of unique states
51
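
A small sketch contrasting <code>DISTINCT</code>, <code>COUNT(*)</code>, and <code>COUNT(DISTINCT ...)</code> on four made-up rows (the <code>people</code> table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (state TEXT, sex TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [
    ("Ohio", "M"), ("Ohio", "M"), ("Ohio", "F"), ("Texas", "M"),
])

# DISTINCT applies to the whole (state, sex) combination, not to each column alone
distinct_pairs = conn.execute(
    "SELECT DISTINCT state, sex FROM people ORDER BY state, sex").fetchall()

# COUNT(*) counts rows; COUNT(DISTINCT state) counts unique states
total_rows, unique_states = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT state) FROM people").fetchone()
print(distinct_pairs, total_rows, unique_states)
```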


# Filtering rows
## WHERE
Filters rows by a condition. Common **comparison** operators:<br>
* <code>=</code> equal
* <code><></code> **not equal**
* <code><</code> less than
* <code>></code> greater than
* <code><=</code> less than or equal to
* <code>>=</code> greater than or equal to

In [13]:
%%sql
select state,pop2000
from census
where state = 'Ohio'
limit 3

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,pop2000
Ohio,76427
Ohio,75867
Ohio,76503


In [67]:
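%%sql
--pop2000 is not aggregated, so SQLite returns one arbitrary matching row per state; PostgreSQL would reject this query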
select state,pop2000
from census
where pop2000 >= 200000
group by state
order by pop2000 desc

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,pop2000
California,252494
New York,226378
Florida,221202


### and
Combine conditions; a row is returned only when every condition is true.

In [66]:
%%sql
select state,pop2000
from census
where pop2000 < 200000 and pop2000 > 150000
group by state
order by pop2000 desc

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,pop2000
Pennsylvania,173095
Texas,172223
California,160674
New York,156168


### or

In [65]:
%%sql
select state,pop2000
from census
where pop2000 < 500 or pop2000 > 250000
group by state
order by pop2000 desc

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,pop2000
California,252494
Wyoming,490
District of Columbia,481
Alaska,470


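Combining <code>AND</code> and <code>OR</code>: use parentheses to make the grouping explicit, because <code>AND</code> binds more tightly than <code>OR</code>.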

In [80]:
%%sql
select state,pop2000,pop2008
from census
where (state = 'California' or state = 'New York') and (pop2008 < pop2000)
group by state

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,pop2000,pop2008
California,272801,266225
New York,124725,121615
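
Why the parentheses matter: a sketch on four made-up rows showing that the same conditions select different rows once the parentheses are dropped (the <code>id</code> column is added here only to tell the rows apart):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE census (id INTEGER, state TEXT, pop2000 INTEGER, pop2008 INTEGER)")
conn.executemany("INSERT INTO census VALUES (?, ?, ?, ?)", [
    (1, "California", 100, 90),   # shrinking
    (2, "California", 100, 120),  # growing
    (3, "New York", 100, 110),    # growing
    (4, "New York", 100, 95),     # shrinking
])

with_parens = [r[0] for r in conn.execute(
    "SELECT id FROM census "
    "WHERE (state = 'California' OR state = 'New York') AND pop2008 < pop2000 "
    "ORDER BY id")]

# Without parentheses this reads: California OR (New York AND shrinking)
without_parens = [r[0] for r in conn.execute(
    "SELECT id FROM census "
    "WHERE state = 'California' OR state = 'New York' AND pop2008 < pop2000 "
    "ORDER BY id")]
print(with_parens, without_parens)
```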


### between
<code>BETWEEN</code> is **inclusive** at both ends of the range.

In [83]:
%%sql
select state,pop2000
from census
where pop2000 between 0 and 500
group by state
order by pop2000 desc

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,pop2000
Wyoming,490
District of Columbia,481
Alaska,470


### in
Matches a column against a list of values in a <code>WHERE</code> clause; shorthand for several <code>OR</code> comparisons.

In [91]:
%%sql
select state,age,pop2008
from census
where age in (7, 13, 45)
group by age
order by pop2008
limit 4

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


state,age,pop2008
Illinois,7,90940
Illinois,13,91661
Illinois,45,94278


In [101]:
%%sql
select * from state_fact limit 3

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


id,name,abbreviation,country,type,sort,status,occupied,notes,fips_state,assoc_press,standard_federal_region,census_region,census_region_name,census_division,census_division_name,circuit_court
13,Illinois,IL,USA,state,10,current,occupied,,17,Ill.,V,2,Midwest,3,East North Central,7
30,New Jersey,NJ,USA,state,10,current,occupied,,34,N.J.,II,1,Northeast,2,Mid-Atlantic,3
34,North Dakota,ND,USA,state,10,current,occupied,,38,N.D.,VIII,2,Midwest,4,West North Central,8


In [99]:
%%sql
select name,abbreviation,census_region_name
from state_fact
where census_region_name in ('West', 'East')
group by name

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
Done.


name,abbreviation,census_region_name
Alaska,AK,West
Arizona,AZ,West
California,CA,West
Colorado,CO,West
Hawaii,HI,West
Idaho,ID,West
Montana,MT,West
Nevada,NV,West
New Mexico,NM,West
Oregon,OR,West


### null
<code>NULL</code> represents a **missing** or **unknown** value. Test for it with <code>IS NULL</code>, not <code>= NULL</code>.

In [126]:
#connect to a db that contains missing values
%sql sqlite:///sql_files/chinook.db

'Connected: @sql_files/chinook.db'

In [127]:
%%sql
--list a few table names in the chinook db
select name from sqlite_master limit 5

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


name
albums
sqlite_sequence
artists
customers
employees


In [129]:
%%sql
select FirstName,LastName,Company 
from customers where Company is null
limit 3

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


FirstName,LastName,Company
Leonie,Köhler,
François,Tremblay,
Bjørn,Hansen,


An example of an <code>IS NOT NULL</code> query.

In [130]:
%%sql
select FirstName,LastName,Company
from customers
where Company is not null

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


FirstName,LastName,Company
Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.
František,Wichterlová,JetBrains s.r.o.
Eduardo,Martins,Woodstock Discos
Alexandre,Rocha,Banco do Brasil S.A.
Roberto,Almeida,Riotur
Mark,Philips,Telus
Jennifer,Peterson,Rogers Canada
Frank,Harris,Google Inc.
Jack,Smith,Microsoft Corporation
Tim,Goyer,Apple Inc.
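
A sketch of why <code>= NULL</code> never matches: any comparison with <code>NULL</code> evaluates to unknown rather than true, so only <code>IS NULL</code> finds the missing values. The two rows below are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (FirstName TEXT, Company TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [
    ("Leonie", None),          # Python None is stored as SQL NULL
    ("Frank", "Google Inc."),
])

equals_null = conn.execute(
    "SELECT FirstName FROM customers WHERE Company = NULL").fetchall()
is_null = [r[0] for r in conn.execute(
    "SELECT FirstName FROM customers WHERE Company IS NULL")]
is_not_null = [r[0] for r in conn.execute(
    "SELECT FirstName FROM customers WHERE Company IS NOT NULL")]
print(equals_null, is_null, is_not_null)
```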


### like
The <code>LIKE</code> operator matches **patterns** in text columns. Two *wildcards* are available:<br>
* <code>%</code> matches any sequence of zero or more characters.
* <code>_</code> matches exactly one character.

In [135]:
%%sql
select FirstName,LastName
from customers
where LastName like 'Ra%'

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


FirstName,LastName
Fernanda,Ramos
Frank,Ralston


In [141]:
%%sql
select FirstName,LastName,Phone 
from customers
where Phone like '%6_9%'

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


FirstName,LastName,Phone
Wyatt,Girard,+33 05 56 96 96 96
Isabelle,Mercier,+33 03 80 73 66 99
Hugh,O'Reilly,+353 01 6792424
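
The two wildcards side by side, sketched on four made-up last names. The helper <code>last_names</code> is a hypothetical convenience function for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (LastName TEXT)")
conn.executemany("INSERT INTO customers VALUES (?)",
                 [("Ramos",), ("Ralston",), ("Rocha",), ("Brooks",)])

def last_names(pattern):
    """Return the last names matching a LIKE pattern, sorted alphabetically."""
    query = "SELECT LastName FROM customers WHERE LastName LIKE ? ORDER BY LastName"
    return [row[0] for row in conn.execute(query, (pattern,))]

starts_with_ra = last_names("Ra%")    # % matches any run of characters
five_letters_r = last_names("R____")  # R followed by exactly four characters
print(starts_with_ra, five_letters_r)
```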


# Aggregation
Performing calculations on data in a database.

In [142]:
%%sql
select name from sqlite_master limit 7

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


name
albums
sqlite_sequence
artists
customers
employees
genres
invoices


In [144]:
%%sql
select * from invoices limit 2

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
1,2,2009-01-01 00:00:00,Theodor-Heuss-Straße 34,Stuttgart,,Germany,70174,1.98
2,4,2009-01-02 00:00:00,Ullevålsveien 14,Oslo,,Norway,171,3.96


## COUNT()

In [158]:
%%sql
select COUNT(BillingCountry) as invoices_from_Canada
from invoices 
where BillingCountry = 'Canada'

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


invoices_from_Canada
56
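
One detail worth knowing: <code>COUNT(column)</code> skips <code>NULL</code> values, while <code>COUNT(*)</code> counts every row. A minimal sketch on three made-up invoices:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (BillingCountry TEXT)")
conn.executemany("INSERT INTO invoices VALUES (?)",
                 [("Canada",), ("Canada",), (None,)])

# COUNT(*) counts rows; COUNT(BillingCountry) counts non-NULL values only
all_rows, with_country = conn.execute(
    "SELECT COUNT(*), COUNT(BillingCountry) FROM invoices").fetchone()
print(all_rows, with_country)
```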


## SUM()

In [150]:
%%sql
select SUM(Total) as total_of_invoices
from invoices

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


total_of_invoices
2328.600000000004


## AVG()

In [152]:
%%sql
select AVG(Total) as average_invoice
from invoices

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


average_invoice
5.651941747572825


## MAX() & MIN()

In [159]:
%%sql 
select MAX(Total) as max_invoice
from invoices 

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


max_invoice
25.86
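
<code>MIN()</code> works the same way as <code>MAX()</code>, and both can appear in one query. A sketch on three made-up invoice totals:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (Total REAL)")
conn.executemany("INSERT INTO invoices VALUES (?)", [(1.98,), (3.96,), (25.86,)])

# Several aggregates can be computed in a single pass over the table
min_invoice, max_invoice = conn.execute(
    "SELECT MIN(Total), MAX(Total) FROM invoices").fetchone()
print(min_invoice, max_invoice)
```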


# Aliasing
## AS

In [164]:
%sql @sqlite:///sql_files/census.sqlite

'Connected: @sql_files/census.sqlite'

In [165]:
%%sql
select * from census limit 1

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
   sqlite:///sql_files/chinook.db
Done.


state,sex,age,pop2000,pop2008
Illinois,M,0,89600,95012


In [167]:
%%sql
select state, pop2008 - pop2000 as pop_difference
from census
group by state
order by abs(pop_difference) desc
limit 5

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
   sqlite:///sql_files/chinook.db
Done.


state,pop_difference
Texas,40137
California,35406
Florida,21954
Arizona,14377
Georgia,13357


In [169]:
%%sql
select state, min(pop2008) as min_pop2008
from census

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
   sqlite:///sql_files/chinook.db
Done.


state,min_pop2008
Alaska,391


Find top populations in tens of thousands (the query divides by <code>1e4</code>; use <code>1e6</code> for millions).

In [174]:
%%sql
select state, pop2008 / 1e4 as pop_in_ten_thousands
from census
group by state
order by pop_in_ten_thousands desc
limit 5

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
   sqlite:///sql_files/chinook.db
Done.


state,pop_in_ten_thousands
California,28.79
Texas,21.236
New York,12.8088
Florida,11.8845
Illinois,9.5012
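
A sketch of unit scaling with an alias: dividing a raw count by <code>1e6</code> expresses it in millions, and the alias can be reused in <code>ORDER BY</code>. The state names and populations are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (state TEXT, pop INTEGER)")
conn.executemany("INSERT INTO totals VALUES (?, ?)",
                 [("B", 500000), ("A", 2500000)])

# 1e6 is a float literal, so the division is floating-point, not integer
in_millions = conn.execute(
    "SELECT state, pop / 1e6 AS pop_in_millions "
    "FROM totals ORDER BY pop_in_millions DESC").fetchall()
print(in_millions)
```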


# Sorting & Grouping

## ORDER BY
Sort results in ascending or descending order according to the values of one or more columns.

In [175]:
%sql @sql_files/chinook.db

'Connected: @sql_files/chinook.db'

In [179]:
%%sql
select name from sqlite_master where name not like 'sqlite_%' and name not like 'IFK%'

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


name
albums
artists
customers
employees
genres
invoices
invoice_items
media_types
playlists
playlist_track


In [194]:
%%sql
select * from tracks limit 1

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
1,For Those About To Rock (We Salute You),1,1,1,"Angus Young, Malcolm Young, Brian Johnson",343719,11170334,0.99


Order tracks by length, longest first.

In [204]:
%%sql
select Name, round(Milliseconds / 60000.0, 2) as minutes
from tracks
order by minutes desc
limit 5

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


Name,minutes
Occupation / Precipice,88.12
Through a Looking Glass,84.81
"Greetings from Earth, Pt. 1",49.34
The Man With Nine Lives,49.28
"Battlestar Galactica, Pt. 2",49.27

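
Sorting by more than one column: each column takes its own direction, and later columns only break ties in earlier ones. A minimal sketch on three made-up tracks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT, AlbumId INTEGER, Milliseconds INTEGER)")
conn.executemany("INSERT INTO tracks VALUES (?, ?, ?)", [
    ("a", 1, 100), ("b", 1, 300), ("c", 2, 200),
])

# Albums ascending (the default), longest track first within each album
ordered = [r[0] for r in conn.execute(
    "SELECT Name FROM tracks ORDER BY AlbumId ASC, Milliseconds DESC")]
print(ordered)
```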

## GROUP BY
Groups rows that share the same value(s) so that aggregate functions are computed once per group. Without <code>GROUP BY</code>, an aggregate collapses the whole table into a single row.

In [206]:
%sql @sqlite:///sql_files/census.sqlite

'Connected: @sql_files/census.sqlite'

In [211]:
%%sql
select sex, count(*)
from census

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
   sqlite:///sql_files/chinook.db
Done.


sex,count(*)
M,8772


In [212]:
%%sql
select sex, count(*)
from census
group by sex

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
 * sqlite:///sql_files/census.sqlite
   sqlite:///sql_files/chinook.db
Done.


sex,count(*)
F,4386
M,4386
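
Several aggregates can be computed per group at once. A sketch on three made-up rows (the <code>pop</code> column is a placeholder):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE census (sex TEXT, pop INTEGER)")
conn.executemany("INSERT INTO census VALUES (?, ?)",
                 [("F", 10), ("F", 20), ("M", 5)])

# One output row per distinct value of sex
per_sex = conn.execute(
    "SELECT sex, COUNT(*), SUM(pop) FROM census GROUP BY sex ORDER BY sex").fetchall()
print(per_sex)
```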


## HAVING
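Filters groups based on the result of an aggregate function. <code>WHERE</code> cannot do this because it is applied to individual rows before grouping.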

In [213]:
%sql sqlite:///sql_files/chinook.db

'Connected: @sql_files/chinook.db'

In [214]:
%%sql
select * from invoices limit 1

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
1,2,2009-01-01 00:00:00,Theodor-Heuss-Straße 34,Stuttgart,,Germany,70174,1.98


In [224]:
%%sql
select BillingCountry, count(*) as num_invoices
from invoices 
group by BillingCountry
order by num_invoices desc
limit 10

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


BillingCountry,num_invoices
USA,91
Canada,56
France,35
Brazil,35
Germany,28
United Kingdom,21
Portugal,14
Czech Republic,14
India,13
Sweden,7


In [228]:
%%sql
select BillingCountry, count(*) as num_invoices
from invoices 
group by BillingCountry
having count(BillingCountry) > 30
order by num_invoices desc
limit 10

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


BillingCountry,num_invoices
USA,91
Canada,56
France,35
Brazil,35


In [241]:
%%sql
select BillingCountry
from invoices
group by BillingCountry
having count(BillingCountry) > 30
order by count(BillingCountry) desc

   postgresql+psycopg2://student:***@postgresql.csrrinzqubik.us-east-1.rds.amazonaws.com:5432/census
   sqlite:///sql_files/census.sqlite
 * sqlite:///sql_files/chinook.db
Done.


BillingCountry
USA
Canada
France
Brazil
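
<code>WHERE</code> and <code>HAVING</code> can be combined: <code>WHERE</code> drops rows before grouping, <code>HAVING</code> drops groups afterwards. A minimal sketch on six made-up invoices:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (BillingCountry TEXT, Total REAL)")
conn.executemany("INSERT INTO invoices VALUES (?, ?)", [
    ("USA", 5.0), ("USA", 0.5), ("USA", 2.0),
    ("Canada", 3.0), ("Canada", 4.0),
    ("France", 9.0),
])

# Step 1 (WHERE): ignore invoices of 1.0 or less
# Step 2 (GROUP BY + HAVING): keep countries with at least 2 remaining invoices
result = conn.execute(
    "SELECT BillingCountry, COUNT(*) AS num_invoices "
    "FROM invoices "
    "WHERE Total > 1 "
    "GROUP BY BillingCountry "
    "HAVING COUNT(*) >= 2 "
    "ORDER BY num_invoices DESC, BillingCountry").fetchall()
print(result)
```

The USA count is 2 rather than 3 because the 0.5 invoice is removed before the groups are formed; France is dropped afterwards because only one invoice remains in its group.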
