## Set Theory

In [1]:
%load_ext sql

In [2]:
%sql sqlite:///db1.db

'Connected: @db1.db'

In [3]:
%sql select name from sqlite_master where type = 'table';

 * sqlite:///db1.db
Done.


name
cities
countries
countries_plus
currencies
economies
economies2010
economies2015
languages
populations


In [5]:
%sql select * from economies2010 limit 1;

 * sqlite:///db1.db
Done.


code,year,income_group,gross_savings
AFG,2010,Low income,37.133


In [6]:
%sql select * from economies2015 limit 1;

 * sqlite:///db1.db
Done.


code,year,income_group,gross_savings
AFG,2015,Low income,21.466


### Union
Combine the rows of two tables using UNION, which drops duplicate rows from the result.

In [8]:
%%sql
select * from economies2010
union
select * from economies2015
order by code,year
limit 5;

 * sqlite:///db1.db
Done.


code,year,income_group,gross_savings
AFG,2010,Low income,37.133
AFG,2015,Low income,21.466
AGO,2010,Upper middle income,23.534
AGO,2015,Upper middle income,-0.425
ALB,2010,Upper middle income,20.011
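
The de-duplicating behaviour of UNION is easier to see on a tiny example. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the tables t2010 and t2015 and their rows are made up for illustration and are not taken from db1.db.

```python
import sqlite3

# Toy stand-ins for economies2010 / economies2015 (hypothetical rows, not the real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t2010 (code text, year integer);
    create table t2015 (code text, year integer);
    insert into t2010 values ('AAA', 2010), ('BBB', 2010);
    -- the second row below repeats a row that is already in t2010
    insert into t2015 values ('AAA', 2015), ('BBB', 2010);
""")

union_rows = conn.execute("""
    select * from t2010
    union
    select * from t2015
    order by code, year
""").fetchall()

# The repeated ('BBB', 2010) row appears only once.
print(union_rows)
```

Four rows go in, three come out, because UNION compares whole rows and keeps one copy of each.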


Determine all distinct country codes that appear in either the cities or the currencies table. The result should have a single field called country_code.

In [9]:
%sql select * from cities limit 1;

 * sqlite:///db1.db
Done.


name,country_code,city_proper_pop,metroarea_pop,urbanarea_pop
Abidjan,CIV,4765000,,4765000


In [10]:
%sql select * from currencies limit 1;

 * sqlite:///db1.db
Done.


curr_id,code,basic_unit,curr_code,frac_unit,frac_perbasic
1,AFG,Afghan afghani,AFN,Pul,100


In [12]:
%%sql
select country_code from cities
union
select code from currencies
order by country_code
limit 1


 * sqlite:///db1.db
Done.


country_code
ABW
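
Note that the two queries select differently named columns (country_code and code), yet the result has a single field called country_code. In SQLite the column names of a compound query come from the first SELECT. A minimal sketch, assuming made-up tables cities_demo and currencies_demo in an in-memory database:

```python
import sqlite3

# Hypothetical miniature versions of cities and currencies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cities_demo (name text, country_code text);
    create table currencies_demo (code text, basic_unit text);
    insert into cities_demo values ('City One', 'BBB'), ('City Two', 'BBB');
    insert into currencies_demo values ('AAA', 'demo dollar'), ('BBB', 'demo pound');
""")

cur = conn.execute("""
    select country_code from cities_demo
    union
    select code from currencies_demo
    order by country_code
""")

# cursor.description holds one tuple per result column; item 0 is the column name.
column_names = [d[0] for d in cur.description]
codes = [row[0] for row in cur.fetchall()]
print(column_names, codes)
```

This is also why the ORDER BY clause refers to country_code rather than code.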


### Union All
UNION ALL combines the rows of two queries and keeps duplicates.

In [14]:
%%sql
select code, year from economies
union all
select country_code, year from populations
order by code, year
limit 5;

 * sqlite:///db1.db
Done.


code,year
ABW,2010
ABW,2015
AFG,2010
AFG,2010
AFG,2015
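
The repeated AFG,2010 row above is the kind of duplicate that UNION would have removed. One way to see the difference is to count the rows each operator returns. This is a sketch on made-up tables (econ_demo and pop_demo are hypothetical names), not on the real economies and populations data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table econ_demo (code text, year integer);
    create table pop_demo (country_code text, year integer);
    insert into econ_demo values ('AAA', 2010), ('AAA', 2015);
    -- ('AAA', 2010) is present in both tables
    insert into pop_demo values ('AAA', 2010), ('BBB', 2010);
""")

def count_rows(operator):
    """Count the rows produced by combining the two tables with the given set operator."""
    # operator is a fixed keyword ("union" or "union all"), never user input,
    # so interpolating it into the SQL string is safe here.
    sql = f"""
        select count(*) from (
            select code, year from econ_demo
            {operator}
            select country_code, year from pop_demo
        )
    """
    return conn.execute(sql).fetchone()[0]

n_union = count_rows("union")
n_union_all = count_rows("union all")
print(n_union, n_union_all)
```

UNION ALL returns every input row (4 here), while UNION returns only the 3 distinct ones.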


### Intersect
Repeat the UNION ALL exercise, this time returning only the (country code, year) records that appear in both the economies and populations tables.

In [20]:
%%sql
select code, year from economies
intersect
select country_code, year from populations
order by code, year
limit 5;

 * sqlite:///db1.db
Done.


code,year
AFG,2010
AFG,2015
AGO,2010
AGO,2015
ALB,2010
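
INTERSECT compares entire records, so a country code present in both tables is returned only for the years where the (code, year) pair matches. A small sketch on made-up data (econ_demo and pop_demo are hypothetical tables in an in-memory database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table econ_demo (code text, year integer);
    create table pop_demo (country_code text, year integer);
    insert into econ_demo values ('AAA', 2010), ('AAA', 2015), ('BBB', 2010);
    insert into pop_demo values ('AAA', 2010), ('BBB', 2010), ('BBB', 2015);
""")

common = conn.execute("""
    select code, year from econ_demo
    intersect
    select country_code, year from pop_demo
    order by code, year
""").fetchall()

# ('AAA', 2015) and ('BBB', 2015) each occur in only one table, so they are dropped.
print(common)
```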


Which countries also have a city with the same name as the country?

In [22]:
%%sql
select country_name from countries
intersect
select name from cities

 * sqlite:///db1.db
Done.


country_name
Hong Kong
Singapore


### Except
Get the names of cities in the cities table that are not listed as a capital in the countries table, returned as a single field.

In [25]:
%%sql
select name from cities
except
select capital from countries
order by name
limit 5

 * sqlite:///db1.db
Done.


name
Abidjan
Ahmedabad
Alexandria
Almaty
Auckland
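
EXCEPT is not symmetric: it returns rows from the first query that are absent from the second, and nothing from the second query ever appears in the result. A minimal sketch, assuming made-up tables cities_demo and countries_demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cities_demo (name text);
    create table countries_demo (capital text);
    insert into cities_demo values ('Alpha City'), ('Beta Town'), ('Gamma Port');
    insert into countries_demo values ('Beta Town'), ('Delta Ville');
""")

non_capitals = conn.execute("""
    select name from cities_demo
    except
    select capital from countries_demo
    order by name
""").fetchall()

# 'Beta Town' is removed because it is a capital;
# 'Delta Ville' never shows up because it is not in cities_demo.
print(non_capitals)
```

Swapping the two SELECT statements would instead list capitals that are missing from the cities table.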


### Semi-join
Use the concept of a semi-join to identify languages spoken in the Middle East.

In [27]:
%%sql
select code from countries
where region = 'Middle East'
limit 5

 * sqlite:///db1.db
Done.


code
ARE
ARM
AZE
BHR
GEO


In [28]:
%%sql
select distinct name
from languages
order by name
limit 5

 * sqlite:///db1.db
Done.


name
Afar
Afrikaans
Akyem
Albanian
Alsatian


In [30]:
%%sql
select distinct name from languages
where code in (select code from countries where region = 'Middle East')
order by name
limit 5

 * sqlite:///db1.db
Done.


name
Arabic
Aramaic
Armenian
Azerbaijani
Azeri
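
The IN subquery above is one way to write a semi-join: rows of languages are kept when a matching country exists, but no columns from countries are added and no rows are multiplied. The same filter can also be written with EXISTS. The sketch below compares the two forms on made-up tables (countries_demo, languages_demo and the region names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table countries_demo (code text, region text);
    create table languages_demo (code text, name text);
    insert into countries_demo values
        ('AAA', 'Region X'), ('BBB', 'Region X'), ('CCC', 'Region Y');
    insert into languages_demo values
        ('AAA', 'Lang 1'), ('BBB', 'Lang 1'), ('BBB', 'Lang 2'), ('CCC', 'Lang 3');
""")

# Semi-join written with IN, mirroring the notebook cell above.
with_in = conn.execute("""
    select distinct name from languages_demo
    where code in (select code from countries_demo where region = 'Region X')
    order by name
""").fetchall()

# The same semi-join written with a correlated EXISTS subquery.
with_exists = conn.execute("""
    select distinct name from languages_demo as l
    where exists (
        select 1 from countries_demo as c
        where c.code = l.code and c.region = 'Region X'
    )
    order by name
""").fetchall()

print(with_in, with_exists)
```

Both queries return the same two languages; DISTINCT is still needed because 'Lang 1' is spoken in two matching countries.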


### Diagnosing problems using anti-join

In [32]:
%sql select * from countries limit 1;

 * sqlite:///db1.db
Done.


code,country_name,continent,region,surface_area,indep_year,local_name,gov_form,capital,cap_long,cap_lat
AFG,Afghanistan,Asia,Southern and Central Asia,652090.0,1919,Afganistan/Afqanestan,Islamic Emirate,Kabul,69.1761,34.5228


In [36]:
%sql select * from currencies limit 1;

 * sqlite:///db1.db
Done.


curr_id,code,basic_unit,curr_code,frac_unit,frac_perbasic
1,AFG,Afghan afghani,AFN,Pul,100


In [33]:
%sql select count(*) from countries where continent = 'Oceania'

 * sqlite:///db1.db
Done.


count(*)
19


Get the different currencies used in Oceania.

In [37]:
%%sql
select c1.code, country_name, basic_unit as currency
from countries as c1
inner join currencies as c2
on c1.code = c2.code
where c1.continent = 'Oceania'
limit 5

 * sqlite:///db1.db
Done.


code,country_name,currency
AUS,Australia,Australian dollar
KIR,Kiribati,Australian dollar
MHL,Marshall Islands,United States dollar
NRU,Nauru,Australian dollar
PLW,Palau,United States dollar


Use an anti-join to find the Oceanian countries that are missing from the currencies table.

In [38]:
%%sql
select code, country_name from countries
where continent = 'Oceania'
and code not in (select code from currencies)

 * sqlite:///db1.db
Done.


code,country_name
ASM,American Samoa
FJI,Fiji Islands
GUM,Guam
FSM,"Micronesia, Federated States of"
MNP,Northern Mariana Islands
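
One caveat with writing an anti-join as NOT IN: if the subquery returns a NULL, the comparison evaluates to NULL for every non-matching row and the query returns nothing. The output above suggests that currencies.code has no NULLs, but that is an assumption. A NOT EXISTS version does not have this problem. The sketch below shows the difference on made-up tables (countries_demo and currencies_demo are hypothetical, and the NULL row is added on purpose):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table countries_demo (code text, country_name text);
    create table currencies_demo (code text, basic_unit text);
    insert into countries_demo values ('AAA', 'Country A'), ('BBB', 'Country B');
    -- one currency row has no country code at all
    insert into currencies_demo values ('AAA', 'demo dollar'), (null, 'orphan unit');
""")

# NOT IN: 'BBB' not in ('AAA', NULL) evaluates to NULL, so the row is filtered out.
not_in_rows = conn.execute("""
    select code from countries_demo
    where code not in (select code from currencies_demo)
""").fetchall()

# NOT EXISTS: the NULL row never matches c1.code, so 'BBB' is reported as missing.
not_exists_rows = conn.execute("""
    select code from countries_demo as c1
    where not exists (
        select 1 from currencies_demo as c2 where c2.code = c1.code
    )
""").fetchall()

print(not_in_rows, not_exists_rows)
```

When the key column may contain NULLs, NOT EXISTS (or adding "where code is not null" inside the NOT IN subquery) is the safer form.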
