# Project Title
### Data Engineering Capstone Project

#### Project Summary
--describe your project at a high level--

The project follows these steps:
* Step 1: Scope the Project and Gather Data
* Step 2: Explore and Assess the Data
* Step 3: Define the Data Model
* Step 4: Run ETL to Model the Data
* Step 5: Complete Project Write Up

In [1]:
# Do all imports and installs here
import os

import pandas as pd

In [2]:
# full immigration data
#os.listdir("../../data/18-83510-I94-Data-2016/")

['i94_apr16_sub.sas7bdat',
 'i94_sep16_sub.sas7bdat',
 'i94_nov16_sub.sas7bdat',
 'i94_mar16_sub.sas7bdat',
 'i94_jun16_sub.sas7bdat',
 'i94_aug16_sub.sas7bdat',
 'i94_may16_sub.sas7bdat',
 'i94_jan16_sub.sas7bdat',
 'i94_oct16_sub.sas7bdat',
 'i94_jul16_sub.sas7bdat',
 'i94_feb16_sub.sas7bdat',
 'i94_dec16_sub.sas7bdat']

In [3]:
#os.listdir("../../data2/")

['GlobalLandTemperaturesByCity.csv']

### Step 1: Scope the Project and Gather Data

#### Scope 
Explain what you plan to do in the project in more detail. What data do you use? What does your end solution look like? What tools did you use?

#### Describe and Gather Data 
Describe the data sets you're using. Where did they come from? What type of information is included?

In [2]:
# Read in the data here
airport_codes = pd.read_csv("../sample_data/airport-codes_csv.csv")
airport_codes.head()

Unnamed: 0,ident,type,name,elevation_ft,continent,iso_country,iso_region,municipality,gps_code,iata_code,local_code,coordinates
0,00A,heliport,Total Rf Heliport,11.0,,US,US-PA,Bensalem,00A,,00A,"-74.93360137939453, 40.07080078125"
1,00AA,small_airport,Aero B Ranch Airport,3435.0,,US,US-KS,Leoti,00AA,,00AA,"-101.473911, 38.704022"
2,00AK,small_airport,Lowell Field,450.0,,US,US-AK,Anchor Point,00AK,,00AK,"-151.695999146, 59.94919968"
3,00AL,small_airport,Epps Airpark,820.0,,US,US-AL,Harvest,00AL,,00AL,"-86.77030181884766, 34.86479949951172"
4,00AR,closed,Newport Hospital & Clinic Heliport,237.0,,US,US-AR,Newport,,,,"-91.254898, 35.6087"
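The `coordinates` column packs two numbers into one string. Judging from the sample rows (about -74.9 and 40.1 for an airport in Pennsylvania), the order appears to be longitude first, then latitude. Below is a minimal sketch of splitting it into numeric columns, using two rows copied from the output above. The column names `longitude` and `latitude` are assumptions, not part of the data set.

```python
import pandas as pd

# Two rows copied from the head() output above.
sample = pd.DataFrame(
    {
        "ident": ["00A", "00AA"],
        "coordinates": [
            "-74.93360137939453, 40.07080078125",
            "-101.473911, 38.704022",
        ],
    }
)

# Split on the ", " separator and cast both parts to float.
parts = sample["coordinates"].str.split(", ", expand=True).astype(float)

# Assumed order: longitude first, then latitude.
sample["longitude"] = parts[0]
sample["latitude"] = parts[1]

print(sample[["ident", "longitude", "latitude"]])
```

Keeping the original `coordinates` column until the split has been checked against a few known airports makes it easy to confirm the assumed order.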


In [3]:
imm_data = pd.read_csv("../sample_data/immigration_data_sample.csv")
imm_data.head()

Unnamed: 0.1,Unnamed: 0,cicid,i94yr,i94mon,i94cit,i94res,i94port,arrdate,i94mode,i94addr,...,entdepu,matflag,biryear,dtaddto,gender,insnum,airline,admnum,fltno,visatype
0,2027561,4084316.0,2016.0,4.0,209.0,209.0,HHW,20566.0,1.0,HI,...,,M,1955.0,7202016,F,,JL,56582670000.0,00782,WT
1,2171295,4422636.0,2016.0,4.0,582.0,582.0,MCA,20567.0,1.0,TX,...,,M,1990.0,10222016,M,,*GA,94362000000.0,XBLNG,B2
2,589494,1195600.0,2016.0,4.0,148.0,112.0,OGG,20551.0,1.0,FL,...,,M,1940.0,7052016,M,,LH,55780470000.0,00464,WT
3,2631158,5291768.0,2016.0,4.0,297.0,297.0,LOS,20572.0,1.0,CA,...,,M,1991.0,10272016,M,,QR,94789700000.0,00739,B2
4,3032257,985523.0,2016.0,4.0,111.0,111.0,CHM,20550.0,3.0,NY,...,,M,1997.0,7042016,F,,,42322570000.0,LAND,WT


* Unnamed: 0 : unique id
  * 1000 values
* cicid: unique id
  * 1000 values
* i94yr: year
  * Year of arrival
  * Single value (2016)
* i94mon: month
  * Month of arrival
  * All in April
* i94cit, i94res: country code (both valid and invalid)
  * Country code. i94cit does not equal i94res for 100+ records. i94cit could be country of previous transit, and i94res could be country of residence. Can convert from code to string.
* i94port: us port code (both valid and invalid)
  * i94 us port codes. No missing values. Can convert from code to string.
* arrdate: arrival date
  * Arrival date in US in SAS format. No missing values
* i94mode: mode of arrival
  * 4 different values. Air, sea, land and not reported
* i94addr: U.S. state address
  * U.S. state of address. About 5.9% missing
* depdate: departure date
  * Departure date from the U.S. in SAS format. About 4.9% missing (951 of 1000 present)
* i94bir: respondent's age
  * Not unique. No missing values
* i94visa: respondent's visa type
  * Not unique. 3 different values: business, pleasure, student. No missing values
* count: used for summary statistics
  * All values are 1
  * Can drop
* dtadfile: date added to I-94 files
  * Not unique. All values seem to be in 2016. No missing values. Each visit by the same person probably has its own add date
* visapost: Department of State post where the visa was issued
  * Not unique. About 61.8% missing
* occup: occupation in U.S.
  * 4 different values. Mostly nan. Might still keep since most visitors should not have jobs
* entdepa: arrival flag - admitted or paroled into the U.S.
  * 9 different values. No nan
* entdepd: departure flag - departed, lost I-94 or is deceased
  * Not unique. 10 different values. Some NaN, which may mean the visitor is still in the U.S.
* entdepu: update flag - either apprehended, overstayed, adjusted to perm residence
  * all nan. May drop
* matflag: match flag - match of arrival and departure records
  * Binary flag. Some missing values
* biryear: birth year
  * Not unique
* dtaddto: date to which admitted to U.S. (allowed to stay until)
  * Not unique. 99 different values. No nan
* gender: non-immigrant gender
  * Not unique. Some NaN
* insnum: ins num
  * Not unique. Mostly NA values
  * Can drop
* airline: airline used to arrive in U.S.
  * Not unique. 101 different types
* admnum: admission number
  * Unique value. No missing value
* fltno: flight number
  * Not unique. 502 different types
* visatype: class of admission legally admitting the non-immigrant to temporarily stay in U.S.
  * Not unique. 10 different types
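The notes above describe `arrdate` and `depdate` as SAS-format dates. SAS date values count days from 1 January 1960, so they can be converted with `pd.to_datetime` by setting that date as the origin. This is a minimal sketch; the helper name `sas_days_to_datetime` is hypothetical, and the sample values are the minimum, first-row, and maximum `arrdate` values seen in this notebook.

```python
import pandas as pd

SAS_EPOCH = "1960-01-01"  # SAS date values count days from this date


def sas_days_to_datetime(days: pd.Series) -> pd.Series:
    """Convert SAS day counts to pandas timestamps; NaN becomes NaT."""
    return pd.to_datetime(days, unit="D", origin=SAS_EPOCH)


# arrdate values taken from the sample data in this notebook.
arrdate = pd.Series([20545.0, 20566.0, 20574.0])
print(sas_days_to_datetime(arrdate))
```

With this conversion, 20545 maps to 2016-04-01 and 20574 to 2016-04-30, which agrees with the observation that every sample arrival falls in April 2016.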

In [9]:
imm_data.describe()

Unnamed: 0.1,Unnamed: 0,cicid,i94yr,i94mon,i94cit,i94res,arrdate,i94mode,depdate,i94bir,i94visa,count,dtadfile,entdepu,biryear,insnum,admnum
count,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,951.0,1000.0,1000.0,1000.0,1000.0,0.0,1000.0,35.0,1000.0
mean,1542097.0,3040461.0,2016.0,4.0,302.928,298.262,20559.68,1.078,20575.037855,42.382,1.859,1.0,20160420.0,,1973.618,3826.857143,69372370000.0
std,915287.9,1799818.0,0.0,0.0,206.485285,202.12039,8.995027,0.485955,24.211234,17.903424,0.386353,0.0,49.51657,,17.903424,221.742583,23381340000.0
min,10925.0,13208.0,2016.0,4.0,103.0,103.0,20545.0,1.0,20547.0,1.0,1.0,1.0,20160400.0,,1923.0,3468.0,0.0
25%,721442.2,1412170.0,2016.0,4.0,135.0,131.0,20552.0,1.0,20561.0,30.75,2.0,1.0,20160410.0,,1961.0,3668.0,55993010000.0
50%,1494568.0,2941176.0,2016.0,4.0,213.0,213.0,20560.0,1.0,20570.0,42.0,2.0,1.0,20160420.0,,1974.0,3887.0,59314770000.0
75%,2360901.0,4694151.0,2016.0,4.0,438.0,438.0,20567.25,1.0,20580.0,55.0,2.0,1.0,20160420.0,,1985.25,3943.0,93436230000.0
max,3095749.0,6061994.0,2016.0,4.0,746.0,696.0,20574.0,9.0,20715.0,93.0,3.0,1.0,20160800.0,,2015.0,4686.0,95021510000.0


In [50]:
imm_data[imm_data["visatype"]=="WB"]["i94visa"].unique()

array([1.])
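The check above shows that `WB` visas all carry `i94visa` code 1, which fits the reading that 1 means business. A sketch of turning the numeric codes into labels follows. The two dictionaries are assumptions based on the categories listed earlier (air, sea, land, not reported; business, pleasure, student) and should be verified against the label descriptions that accompany the I-94 data. The `i94mode` values come from the sample rows; the `i94visa` values and the new column names are illustrative only.

```python
import pandas as pd

# Assumed code-to-label mappings; verify against the I-94 label descriptions.
I94MODE_LABELS = {1: "Air", 2: "Sea", 3: "Land", 9: "Not reported"}
I94VISA_LABELS = {1: "Business", 2: "Pleasure", 3: "Student"}

# i94mode values as seen in the sample rows; i94visa values are illustrative.
sample = pd.DataFrame({"i94mode": [1.0, 1.0, 3.0], "i94visa": [2.0, 1.0, 2.0]})

# Both columns have no missing values in the sample, so an int cast is safe here.
sample["arrival_mode"] = sample["i94mode"].astype(int).map(I94MODE_LABELS)
sample["visa_category"] = sample["i94visa"].astype(int).map(I94VISA_LABELS)

print(sample)
```

Codes that are absent from a dictionary map to NaN, which makes unexpected values easy to spot with `isnull()`.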

In [46]:
imm_data["visatype"].value_counts(dropna=False)

WT     443
B2     356
WB      91
B1      61
GMT     27
F1      10
CP       5
F2       3
E2       3
M1       1
Name: visatype, dtype: int64

In [45]:
imm_data[imm_data["i94cit"] != imm_data["i94res"]]

Unnamed: 0.1,Unnamed: 0,cicid,i94yr,i94mon,i94cit,i94res,i94port,arrdate,i94mode,i94addr,...,entdepu,matflag,biryear,dtaddto,gender,insnum,airline,admnum,fltno,visatype
2,589494,1195600.0,2016.0,4.0,148.0,112.0,OGG,20551.0,1.0,FL,...,,M,1940.0,07052016,M,,LH,5.578047e+10,00464,WT
7,112205,232708.0,2016.0,4.0,113.0,135.0,NYC,20546.0,1.0,NY,...,,M,1983.0,06302016,F,,BA,5.547449e+10,00117,WT
12,1339656,2711583.0,2016.0,4.0,148.0,112.0,FTL,20559.0,2.0,,...,,M,1962.0,07132016,F,,VES,5.617586e+10,93724,WT
14,682005,1387607.0,2016.0,4.0,148.0,112.0,BOS,20552.0,1.0,MA,...,,M,1982.0,07062016,F,,AF,5.583339e+10,00338,WT
18,2283033,4668286.0,2016.0,4.0,746.0,158.0,SEA,20568.0,1.0,NV,...,,M,1970.0,10232016,M,,DL,9.443560e+10,00143,B2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
979,484243,1000074.0,2016.0,4.0,129.0,687.0,DAL,20550.0,1.0,CA,...,,M,1967.0,07042016,M,,AA,5.569490e+10,00940,WT
984,1540368,3133114.0,2016.0,4.0,148.0,112.0,FTL,20560.0,1.0,VA,...,,M,1973.0,07152016,M,,LH,5.629125e+10,00414,WB
990,2590789,5242730.0,2016.0,4.0,135.0,509.0,HAM,20572.0,1.0,MA,...,,M,1969.0,10272016,F,,DL,9.475131e+10,00560,B2
992,1920712,3874218.0,2016.0,4.0,148.0,112.0,SFR,20565.0,1.0,CA,...,,M,1967.0,07192016,,,LH,5.653427e+10,00454,WT


In [5]:
us_cities_demog = pd.read_csv("../sample_data/us-cities-demographics.csv")
us_cities_demog.head()

Unnamed: 0,City;State;Median Age;Male Population;Female Population;Total Population;Number of Veterans;Foreign-born;Average Household Size;State Code;Race;Count
0,Silver Spring;Maryland;33.8;40601;41862;82463;...
1,Quincy;Massachusetts;41.0;44129;49500;93629;41...
2,Hoover;Alabama;38.5;38040;46799;84839;4819;822...
3,Rancho Cucamonga;California;34.5;88127;87105;1...
4,Newark;New Jersey;34.6;138040;143873;281913;58...
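The output above shows every field squeezed into a single column, which suggests the file is semicolon-delimited rather than comma-delimited. Passing `sep=";"` to `pd.read_csv` should split the fields properly. The sketch below demonstrates this on an in-memory sample that keeps only the first six columns visible in the output; the real file has more columns.

```python
import io

import pandas as pd

# First six fields of three rows from the output above (remaining fields omitted).
raw = (
    "City;State;Median Age;Male Population;Female Population;Total Population\n"
    "Silver Spring;Maryland;33.8;40601;41862;82463\n"
    "Hoover;Alabama;38.5;38040;46799;84839\n"
    "Newark;New Jersey;34.6;138040;143873;281913\n"
)

# sep=";" tells pandas to split on semicolons instead of commas.
demog = pd.read_csv(io.StringIO(raw), sep=";")

print(demog.shape)
print(demog.dtypes)
```

For the actual file, the equivalent call would be `pd.read_csv("../sample_data/us-cities-demographics.csv", sep=";")`.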


In [7]:
# Temperature data
temperature = pd.read_csv("../sample_data/GlobalLandTemperaturesByCity.csv")
temperature.head()

Unnamed: 0,dt,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
0,1743-11-01,6.068,1.737,Århus,Denmark,57.05N,10.33E
1,1743-12-01,,,Århus,Denmark,57.05N,10.33E
2,1744-01-01,,,Århus,Denmark,57.05N,10.33E
3,1744-02-01,,,Århus,Denmark,57.05N,10.33E
4,1744-03-01,,,Århus,Denmark,57.05N,10.33E
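`Latitude` and `Longitude` are stored as strings with a hemisphere suffix, such as `57.05N` and `10.33E`. To use them numerically they can be converted to signed decimal degrees. This is a minimal sketch; the helper name `parse_coordinate` is hypothetical, and it assumes every value ends in exactly one of N, S, E, or W.

```python
def parse_coordinate(value: str) -> float:
    """Convert strings like '57.05N' or '10.33E' to signed decimal degrees."""
    magnitude = float(value[:-1])
    hemisphere = value[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"Unexpected hemisphere suffix in {value!r}")
    # South and west are negative by convention.
    return -magnitude if hemisphere in ("S", "W") else magnitude


print(parse_coordinate("57.05N"))
print(parse_coordinate("10.33E"))
```

On the full table this could be applied with `temperature["Latitude"].map(parse_coordinate)`, although with about 8.6 million rows a vectorized string approach may be faster.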


In [10]:
temperature["Country"].unique()

array(['Denmark', 'Turkey', 'Kazakhstan', 'China', 'Spain', 'Germany',
       'Nigeria', 'Iran', 'Russia', 'Canada', "Côte D'Ivoire",
       'United Kingdom', 'Saudi Arabia', 'Japan', 'United States',
       'India', 'Benin', 'United Arab Emirates', 'Mexico', 'Venezuela',
       'Ghana', 'Ethiopia', 'Australia', 'Yemen', 'Indonesia', 'Morocco',
       'Pakistan', 'France', 'Libya', 'Burma', 'Brazil', 'South Africa',
       'Syria', 'Egypt', 'Algeria', 'Netherlands', 'Malaysia', 'Portugal',
       'Ecuador', 'Italy', 'Uzbekistan', 'Philippines', 'Madagascar',
       'Chile', 'Belgium', 'El Salvador', 'Romania', 'Peru', 'Colombia',
       'Tanzania', 'Tunisia', 'Turkmenistan', 'Israel', 'Eritrea',
       'Paraguay', 'Greece', 'New Zealand', 'Vietnam', 'Cameroon', 'Iraq',
       'Afghanistan', 'Argentina', 'Azerbaijan', 'Moldova', 'Mali',
       'Congo (Democratic Republic Of The)', 'Thailand',
       'Central African Republic', 'Bosnia And Herzegovina', 'Bangladesh',
       'Switzerland'

In [None]:
temperature[temperature["Country"]=="United States"]

In [8]:
print(airport_codes.shape)
print(imm_data.shape)
print(us_cities_demog.shape)
print(temperature.shape)

(55075, 12)
(1000, 29)
(2891, 1)
(8599212, 7)


In [8]:
	
from pyspark.sql import SparkSession

# Spark session with the spark-sas7bdat package, needed to read SAS files
spark = (
    SparkSession.builder
    .config("spark.jars.packages", "saurfang:spark-sas7bdat:2.0.0-s_2.11")
    .enableHiveSupport()
    .getOrCreate()
)
df_spark = spark.read.format("com.github.saurfang.sas.spark").load("../../data/18-83510-I94-Data-2016/i94_apr16_sub.sas7bdat")


In [11]:
# Write to Parquet (overwriting any earlier run), then read it back
df_spark.write.mode("overwrite").parquet("sas_data")
df_spark = spark.read.parquet("sas_data")

### Step 2: Explore and Assess the Data
#### Explore the Data 
Identify data quality issues, like missing values, duplicate data, etc.

#### Cleaning Steps
Document steps necessary to clean the data

##### 2.1 Explore Airport Codes

In [6]:
# EDA
airport_codes.isnull().mean()

ident           0.000000
type            0.000000
name            0.000000
elevation_ft    0.127208
continent       0.503296
iso_country     0.004485
iso_region      0.000000
municipality    0.103059
gps_code        0.255016
iata_code       0.833155
local_code      0.479147
coordinates     0.000000
dtype: float64

In [18]:
airport_codes["continent"].unique()

array([nan, 'OC', 'AF', 'AN', 'EU', 'AS', 'SA'], dtype=object)

In [10]:
airport_codes[airport_codes["elevation_ft"].isnull()].head()

Unnamed: 0,ident,type,name,elevation_ft,continent,iso_country,iso_region,municipality,gps_code,iata_code,local_code,coordinates
93,01MD,seaplane_base,Annapolis Seaplane Base,,,US,US-MD,Annapolis,01MD,,01MD,"-76.456001, 38.999199"
391,06LA,heliport,Panther Helicopters Inc Heliport,,,US,US-LA,Belle Chasse,06LA,,06LA,"-90.02780151367188, 29.84600067138672"
451,07LA,closed,Air Oil Inc Nr 1 Heliport,,,US,US-LA,Harahan,,,,"-90.183701, 29.937099"
661,0CL2,heliport,Parking Lot Heliport,,,US,US-CA,Chula Vista,0CL2,,0CL2,"-117.084999084, 32.593898773199996"
705,0FD6,seaplane_base,Fulton Seaplane Base,,,US,US-FL,Sebastian,0FD6,,0FD6,"-80.48639678955078, 27.907499313354492"


In [14]:
airport_codes[airport_codes["continent"].isnull()].head()

Unnamed: 0,ident,type,name,elevation_ft,continent,iso_country,iso_region,municipality,gps_code,iata_code,local_code,coordinates
0,00A,heliport,Total Rf Heliport,11.0,,US,US-PA,Bensalem,00A,,00A,"-74.93360137939453, 40.07080078125"
1,00AA,small_airport,Aero B Ranch Airport,3435.0,,US,US-KS,Leoti,00AA,,00AA,"-101.473911, 38.704022"
2,00AK,small_airport,Lowell Field,450.0,,US,US-AK,Anchor Point,00AK,,00AK,"-151.695999146, 59.94919968"
3,00AL,small_airport,Epps Airpark,820.0,,US,US-AL,Harvest,00AL,,00AL,"-86.77030181884766, 34.86479949951172"
4,00AR,closed,Newport Hospital & Clinic Heliport,237.0,,US,US-AR,Newport,,,,"-91.254898, 35.6087"


In [15]:
airport_codes[~airport_codes["continent"].isnull()].head()

Unnamed: 0,ident,type,name,elevation_ft,continent,iso_country,iso_region,municipality,gps_code,iata_code,local_code,coordinates
223,03N,small_airport,Utirik Airport,4.0,OC,MH,MH-UTI,Utirik Island,K03N,UTK,03N,"169.852005, 11.222"
1111,0TT8,heliport,Dynasty Heliport,150.0,OC,MP,MP-U-A,"San Jose, Tinian",0TT8,,0TT8,"145.64199829101562, 14.963299751281738"
10134,9OG1,heliport,Barrigada Readiness Center Heliport,311.0,OC,GU,GU-U-A,Guam,9OG1,,9OG1,"144.812142, 13.475863"
10368,AAD,small_airport,Adado Airport,1001.0,AF,SO,SO-GA,Adado,,AAD,,"46.6375, 6.095802"
10369,AAXX,small_airport,Rothera Point Airport,,AN,AQ,AQ-U-A,Rothera Point,AAXX,,,"-68.1269931793, -67.5669411575"


In [13]:
airport_codes[airport_codes["name"]=="Panther Helicopters Inc Heliport"].head()

Unnamed: 0,ident,type,name,elevation_ft,continent,iso_country,iso_region,municipality,gps_code,iata_code,local_code,coordinates
391,06LA,heliport,Panther Helicopters Inc Heliport,,,US,US-LA,Belle Chasse,06LA,,06LA,"-90.02780151367188, 29.84600067138672"


In [24]:
airport_codes["iso_country"].unique()

array(['US', 'PR', 'MH', 'MP', 'GU', 'SO', 'AQ', 'GB', 'PG', 'AD', 'SD',
       'SA', 'AE', 'SS', 'ES', 'CN', 'AF', 'LK', 'SB', 'CO', 'AU', 'MG',
       'TD', 'AL', 'AM', 'MX', 'MZ', 'PW', 'NR', 'AO', 'AR', 'AS', 'AT',
       'ZZ', 'GA', 'AZ', 'BA', 'BB', 'BE', 'DE', 'BF', 'BG', 'GL', 'BH',
       'BI', 'IS', 'BJ', 'OM', 'XK', 'BM', 'KE', 'PH', 'BO', 'BR', 'BS',
       'CV', 'BW', 'FJ', 'BY', 'UA', 'LR', 'BZ', 'CA', 'CD', 'CF', 'CG',
       'MR', 'CH', 'CL', 'CM', 'MA', 'CR', 'CU', 'CY', 'CZ', 'SK', 'PA',
       'DZ', 'ID', 'GH', 'RU', 'CI', 'DK', 'NG', 'DO', 'NE', 'HR', 'TN',
       'TG', 'EC', 'EE', 'FI', 'EG', 'GG', 'JE', 'IM', 'FK', 'EH', 'NL',
       'IE', 'FO', 'LU', 'NO', 'PL', 'ER', 'MN', 'PT', 'SE', 'ET', 'LV',
       'LT', 'ZA', 'SZ', 'GQ', 'SH', 'MU', 'IO', 'ZM', 'FM', 'KM', 'YT',
       'RE', 'TF', 'ST', 'FR', 'SC', 'ZW', 'MW', 'LS', nan, 'ML', 'GM',
       'GE', 'GF', 'SL', 'GW', 'GN', 'SN', 'GR', 'GT', 'TZ', 'GY', 'SR',
       'DJ', 'HK', 'LY', 'HN', 'VN', 'KZ', 'RW', 'HT

In [23]:
airport_codes[airport_codes["iso_country"].isnull()].head()

Unnamed: 0,ident,type,name,elevation_ft,continent,iso_country,iso_region,municipality,gps_code,iata_code,local_code,coordinates
21422,FYAA,small_airport,Ai-Ais Airport,2000.0,AF,,NA-KU,Ai-Ais,FYAA,AIW,,"17.5966, -27.995"
21423,FYAB,small_airport,Aroab B Airport,3235.0,AF,,NA-KA,Aroab,FYAB,,,"19.633100509643555, -26.776100158691406"
21424,FYAK,small_airport,Aussenkehr Airport,970.0,AF,,NA-KA,Aussenkehr,FYAK,,,"17.4645, -28.4587"
21425,FYAM,small_airport,Aminuis Airstrip,4012.0,AF,,NA-OH,Aminuis,FYAM,,,"19.351699829101562, -23.655799865722656"
21426,FYAR,medium_airport,Arandis Airport,1905.0,AF,,NA-ER,Arandis,FYAR,ADI,,"14.979999542236328, -22.462200164794922"


In [25]:
airport_codes["municipality"].unique()

array(['Bensalem', 'Leoti', 'Anchor Point', ..., 'Sealand',
       'Grande Glorieuse', 'Mishima-Mura'], dtype=object)

In [26]:
airport_codes["gps_code"].unique()

array(['00A', '00AA', '00AK', ..., 'ZYYK', 'ZYYY', 'RJX7'], dtype=object)

In [27]:
airport_codes["iata_code"].unique()

array([nan, 'UTK', 'OCA', ..., 'SHE', 'YNJ', 'YKH'], dtype=object)

In [28]:
airport_codes["local_code"].unique()

array(['00A', '00AA', '00AK', ..., 'FAWT', 'ZEN', 'ZNC'], dtype=object)

Dealing with NaN values:
* elevation_ft: Leave the missing values as NaN. It is hard to infer the original elevation of the airports.
* continent: Replace NaN with 'NA'. The values present in the column are: {nan, OC: Oceania, AF: Africa, AN: Antarctica, EU: Europe, AS: Asia, SA: South America}. Based on a visual inspection, airports with a NaN continent are most likely in North America; `pd.read_csv` treats the string 'NA' as missing by default, which would explain why that code never appears.
* iso_country: Leave the missing values as NaN. It would be possible to fill in the actual country by looking up each airport's location, but that task is too time-consuming here.
* municipality: Leave the missing values as NaN. Same reason as iso_country.
* gps_code: Leave the missing values as NaN. Same reason as iso_country.
* iata_code: Leave the missing values as NaN. Same reason as iso_country.
* local_code: Leave the missing values as NaN. Same reason as iso_country.
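Some of these gaps may be parsing artifacts rather than truly missing data. By default `pd.read_csv` treats the literal string `NA` as missing, and `NA` is both a plausible continent code for North America and the ISO country code for Namibia (the rows with a missing `iso_country` above all have an `iso_region` starting with `NA-`). The sketch below compares the default behavior with `keep_default_na=False`. The raw rows are illustrative, and the assumption that the source file stores literal `NA` codes has not been verified here.

```python
import io

import pandas as pd

# Illustrative raw rows; the literal "NA" codes are an assumption about the file.
raw = (
    "ident,continent,iso_country,iata_code\n"
    "00A,NA,US,\n"
    "FYAA,AF,NA,AIW\n"
)

# Default parsing: the string "NA" is converted to NaN.
default_read = pd.read_csv(io.StringIO(raw))

# Only empty fields count as missing, so "NA" survives as a real value.
strict_read = pd.read_csv(io.StringIO(raw), keep_default_na=False, na_values=[""])

print(default_read.isnull().sum())
print(strict_read.isnull().sum())
```

If the assumption holds, re-reading the airport file this way would remove the need to impute `continent` and would recover the Namibian country codes.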

In [7]:
imm_data.isnull().mean()

Unnamed: 0    0.000
cicid         0.000
i94yr         0.000
i94mon        0.000
i94cit        0.000
i94res        0.000
i94port       0.000
arrdate       0.000
i94mode       0.000
i94addr       0.059
depdate       0.049
i94bir        0.000
i94visa       0.000
count         0.000
dtadfile      0.000
visapost      0.618
occup         0.996
entdepa       0.000
entdepd       0.046
entdepu       1.000
matflag       0.046
biryear       0.000
dtaddto       0.000
gender        0.141
insnum        0.965
airline       0.033
admnum        0.000
fltno         0.008
visatype      0.000
dtype: float64
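The missing-value fractions above can drive the decision about which columns to drop. This is a sketch only: the 0.95 threshold and the `KEEP_ANYWAY` set are hypothetical choices, with `occup` kept because the notes earlier suggest its missing values may still be informative. The fractions are copied from the output above.

```python
import pandas as pd

# Missing-value fractions for the sparsest columns, copied from the output above.
null_fraction = pd.Series(
    {"visapost": 0.618, "occup": 0.996, "entdepu": 1.000, "gender": 0.141, "insnum": 0.965}
)

THRESHOLD = 0.95         # hypothetical cut-off for "mostly missing"
KEEP_ANYWAY = {"occup"}  # columns retained despite being sparse

drop_candidates = [
    column
    for column, fraction in null_fraction.items()
    if fraction > THRESHOLD and column not in KEEP_ANYWAY
]

print(drop_candidates)
```

On the full frame, the same Series comes from `imm_data.isnull().mean()`, and the result could be passed to `imm_data.drop(columns=drop_candidates)`.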

In [8]:
us_cities_demog.isnull().mean()

City;State;Median Age;Male Population;Female Population;Total Population;Number of Veterans;Foreign-born;Average Household Size;State Code;Race;Count    0.0
dtype: float64

In [9]:
# Performing cleaning tasks here





### Step 3: Define the Data Model
#### 3.1 Conceptual Data Model
Map out the conceptual data model and explain why you chose that model

#### 3.2 Mapping Out Data Pipelines
List the steps necessary to pipeline the data into the chosen data model

### Step 4: Run Pipelines to Model the Data 
#### 4.1 Create the data model
Build the data pipelines to create the data model.

In [None]:
# Write code here

#### 4.2 Data Quality Checks
Explain the data quality checks you'll perform to ensure the pipeline ran as expected. These could include:
 * Integrity constraints on the relational database (e.g., unique key, data type, etc.)
 * Unit tests for the scripts to ensure they are doing the right thing
 * Source/Count checks to ensure completeness
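
The checks listed above can be sketched as small functions that raise an error when a table fails. The function names and the sample table are hypothetical; the `cicid` values come from the sample rows shown earlier. A real pipeline would run these against the final model tables.

```python
import pandas as pd


def check_unique_key(df: pd.DataFrame, key: str) -> None:
    """Integrity check: the key column must have no nulls and no duplicates."""
    if df[key].isnull().any():
        raise ValueError(f"Column {key!r} contains null values")
    duplicates = int(df[key].duplicated().sum())
    if duplicates:
        raise ValueError(f"Column {key!r} has {duplicates} duplicate value(s)")


def check_row_count(source_rows: int, loaded_rows: int) -> None:
    """Completeness check: the load must be non-empty and match the source count."""
    if loaded_rows == 0:
        raise ValueError("Loaded table is empty")
    if loaded_rows != source_rows:
        raise ValueError(f"Expected {source_rows} rows, found {loaded_rows}")


# Hypothetical loaded table built from three cicid values in the sample data.
loaded = pd.DataFrame({"cicid": [4084316, 4422636, 1195600]})

check_unique_key(loaded, "cicid")
check_row_count(source_rows=3, loaded_rows=len(loaded))
print("All quality checks passed")
```

Raising exceptions rather than printing warnings lets a scheduler or test runner stop the pipeline as soon as a check fails.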
 
Run Quality Checks

In [None]:
# Perform quality checks here

#### 4.3 Data dictionary 
Create a data dictionary for your data model. For each field, provide a brief description of what the data is and where it came from. You can include the data dictionary in the notebook or in a separate file.

### Step 5: Complete Project Write Up
* Clearly state the rationale for the choice of tools and technologies for the project.
* Propose how often the data should be updated and why.
* Write a description of how you would approach the problem differently under the following scenarios:
  * The data was increased by 100x.
  * The data populates a dashboard that must be updated on a daily basis by 7am every day.
  * The database needed to be accessed by 100+ people.