oianchevska/etl-project

Team Members:

  1. Oleksandra Ianchevska
  2. Surabhi Sood
  3. Zina Zachmann

Goal: Prepare a database for analyzing the correlation between Seattle pet license issuance and income by ZIP code.


ETL Project Guidelines

Project Proposal

Before you start writing any code, remember that you only have one week to complete this project. View this project as a typical assignment from work. Imagine a batch of data came in and you and your team are tasked with migrating it to a production database.

Take advantage of your Instructor and TA support during office hours and class project work time. They are a valuable resource and can help you stay on track.

Finding Data

Your project must use 2 or more sources of data. We recommend the following sites to use as sources of data:

You can also use APIs or data scraped from the web. However, get approval from your instructor first. Again, there is only a week to complete this!

Data Cleanup & Analysis

Once you have identified your datasets, perform ETL on the data. Make sure to plan and document the following:

  • The sources of data that you will extract from.

  • The type of transformation needed for this data (cleaning, joining, filtering, aggregating, etc.).

  • The type of final production database to load the data into (relational or non-relational).

  • The final tables or collections that will be used in the production database.

You will be required to submit a final technical report with the above information and steps required to reproduce your ETL process.
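The transformation types listed above (cleaning, filtering, aggregating, joining) can be sketched for this project's goal as follows. This is a minimal illustration, not the team's actual pipeline: the sample rows, the column names (`zip_code`, `median_income`, and so on), and the helper functions are all hypothetical, and only the Python standard library is used.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the column names are assumptions, not the real schema.
LICENSES_CSV = """license_number,species,zip_code
L001,Dog,98101
L002,Cat,98101-1234
L003,Dog, 98115
L004,Cat,
L005,Dog,98115
"""

INCOME_CSV = """zip_code,median_income
98101,70000
98115,95000
98199,110000
"""


def clean_zip(raw):
    """Return a 5-digit ZIP code, or None if the value is unusable."""
    zip5 = raw.strip()[:5]
    return zip5 if len(zip5) == 5 and zip5.isdigit() else None


def licenses_per_zip(csv_text):
    """Clean and aggregate: count licenses per cleaned ZIP code."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        zip5 = clean_zip(row["zip_code"])
        if zip5 is not None:  # filter out rows without a usable ZIP
            counts[zip5] += 1
    return counts


def join_with_income(counts, csv_text):
    """Inner join of license counts with income on ZIP code."""
    joined = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        zip5 = clean_zip(row["zip_code"])
        if zip5 in counts:
            joined.append((zip5, counts[zip5], int(row["median_income"])))
    return sorted(joined)


counts = licenses_per_zip(LICENSES_CSV)
result = join_with_income(counts, INCOME_CSV)
```

An inner join is used here so that only ZIP codes present in both sources reach the final table; a different join type may suit the analysis better if ZIP codes with no licenses should be kept.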

Project Report

At the end of the week, your team will submit a Final Report that describes the following:

  • Extract: your original data sources and how the data was formatted (CSV, JSON, pgAdmin 4, etc.).

  • Transform: what data cleaning or transformation was required.

  • Load: the final database, tables/collections, and why this was chosen.

Please upload the report to GitHub and submit a link to Bootcampspot.
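As a rough illustration of the Load step for a relational target, the sketch below writes transformed rows into a single table and reads them back. It uses an in-memory SQLite database purely as a stand-in so the example runs anywhere; the table name `zip_summary`, its columns, and the sample rows are hypothetical and are not taken from this project's actual database.

```python
import sqlite3

# Hypothetical transformed rows: (zip_code, license_count, median_income).
rows = [("98101", 2, 70000), ("98115", 2, 95000)]

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    """
    CREATE TABLE zip_summary (
        zip_code TEXT PRIMARY KEY,
        license_count INTEGER NOT NULL,
        median_income INTEGER NOT NULL
    )
    """
)
# Parameterized insert: one statement executed once per row.
conn.executemany("INSERT INTO zip_summary VALUES (?, ?, ?)", rows)
conn.commit()

# Read the table back to confirm the load succeeded.
loaded = conn.execute(
    "SELECT zip_code, license_count, median_income "
    "FROM zip_summary ORDER BY zip_code"
).fetchall()
conn.close()
```

Storing the ZIP code as text rather than an integer avoids losing leading zeros, which matters for ZIP codes outside the Seattle area.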


Copyright

Coding Boot Camp © 2019. All Rights Reserved.
