These are my solutions for each lesson of Udacity's "Data Wrangling with MongoDB" course.


Lesson 1: Data Extraction Fundamentals

Assessing the Quality of Data

Intro to Tabular Formats

Parsing CSV

Parsing XLS with XLRD

Intro to JSON

Using Web APIs
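
As an illustration of the CSV parsing topic above, here is a minimal sketch using Python's standard `csv` module. The sample data and the `parse_csv` helper are hypothetical and are not taken from the course files.

```python
import csv
import io

# Hypothetical sample data standing in for a course data file.
SAMPLE = """title,artist,year
Please Please Me,The Beatles,1963
Abbey Road,The Beatles,1969
"""


def parse_csv(text):
    """Return each data row as a dict keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = parse_csv(SAMPLE)
print(rows[0]["title"])
```

Note that `csv.DictReader` returns every value as a string, so numeric fields such as `year` still need an explicit conversion.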

Lesson 2: Data in More Complex Formats

Intro to XML

XML Design Principles

Parsing XML

Web Scraping

Parsing HTML
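
The XML parsing topic above could be sketched with the standard library's `xml.etree.ElementTree`. The document structure (`article`, `author`, `fnm`, `snm`) and the `get_authors` helper are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML snippet; the element names are assumptions.
SAMPLE_XML = """<article>
  <author><fnm>Ada</fnm><snm>Lovelace</snm></author>
  <author><fnm>Alan</fnm><snm>Turing</snm></author>
</article>"""


def get_authors(xml_text):
    """Collect first and last names from every <author> child of the root."""
    root = ET.fromstring(xml_text)
    authors = []
    for author in root.findall("author"):
        authors.append({
            "fnm": author.findtext("fnm"),
            "snm": author.findtext("snm"),
        })
    return authors


authors = get_authors(SAMPLE_XML)
print(authors)
```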

Lesson 3: Data Quality

What is Data Cleaning?

Sources of Dirty Data

Measuring Data Quality

A Blueprint for Cleaning

Auditing Validity

Auditing Accuracy

Auditing Completeness

Auditing Consistency

Auditing Uniformity

Lesson 4: Working with MongoDB

Data Modelling in MongoDB

Introduction to PyMongo

Field Queries

Projection Queries

Getting Data into MongoDB

Using mongoimport

Operators like $gt, $lt, $exists, $regex

Querying Arrays and using $in and $all Operators

Changing entries: update, $set, $unset
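
The query operators listed in this lesson are written as ordinary Python dictionaries when used from PyMongo. The sketch below only builds those dictionaries, so it runs without a database. The database, collection, and field names (`examples`, `cities`, `population`, `motto`, `tags`) are hypothetical.

```python
# Range query: population strictly between two bounds.
population_filter = {"population": {"$gt": 250000, "$lt": 500000}}

# Documents that have a "motto" field at all.
has_motto = {"motto": {"$exists": True}}

# Names starting with the letter X.
name_pattern = {"name": {"$regex": "^X"}}

# Array queries: $in matches any listed value, $all requires every one.
any_of = {"tags": {"$in": ["capital", "port"]}}
all_of = {"tags": {"$all": ["capital", "port"]}}

# Projection: return only the name, without the _id field.
projection = {"_id": 0, "name": 1}

# Update document: set one field and remove another.
update_spec = {"$set": {"verified": True}, "$unset": {"motto": ""}}

# With PyMongo installed and a server running, these would be used
# roughly as follows (connection details are assumptions):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017").examples
#   db.cities.find(population_filter, projection)
#   db.cities.update_many(has_motto, update_spec)
```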

Lesson 5: Analyzing Data

Examples of Aggregation Framework

The Aggregation Pipeline

Aggregation Operators: $match, $project, $unwind, $group

Multiple Stages Using a Given Operator
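
An aggregation pipeline is a list of stage documents applied in order, and the same operator may appear in more than one stage. The following is a minimal sketch; the `tweets` collection and its field names are assumptions for illustration.

```python
# Each dict is one stage; the output of a stage feeds the next one.
pipeline = [
    # Keep only tweets from one (hypothetical) time zone.
    {"$match": {"user.time_zone": "Brasilia"}},
    # Emit one document per hashtag in the array.
    {"$unwind": "$entities.hashtags"},
    # Count how often each hashtag text occurs.
    {"$group": {"_id": "$entities.hashtags.text", "count": {"$sum": 1}}},
    # A second $match stage, this time filtering the grouped results.
    {"$match": {"count": {"$gte": 2}}},
    # Rename _id to a friendlier field name in the output.
    {"$project": {"_id": 0, "hashtag": "$_id", "count": 1}},
]

stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)

# With PyMongo and a running server this would be executed as:
#   results = db.tweets.aggregate(pipeline)
```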

Lesson 6: Case Study - OpenStreetMap Data

Using Iterative Parsing for Large Data Files

OpenStreetMap XML Overview

Exercises around OpenStreetMap data

Final Project Instructions
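
The iterative parsing approach mentioned in Lesson 6 can be sketched with `xml.etree.ElementTree.iterparse`, which yields elements one at a time instead of loading the whole file first. The sample data and the `count_tags` helper are hypothetical; a real OSM extract would be opened from disk instead.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hypothetical stand-in for an OpenStreetMap XML extract.
SAMPLE_OSM = b"""<osm>
  <node id="1" lat="0.0" lon="0.0"><tag k="amenity" v="cafe"/></node>
  <node id="2" lat="0.1" lon="0.1"/>
  <way id="3"><nd ref="1"/><nd ref="2"/></way>
</osm>"""


def count_tags(source):
    """Count how many times each element name occurs in the document."""
    counts = Counter()
    # "end" events fire once an element has been read completely.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        # Drop the element's children, text and attributes to save memory;
        # the emptied element itself stays attached to its parent.
        elem.clear()
    return counts


tag_counts = count_tags(io.BytesIO(SAMPLE_OSM))
print(tag_counts)
```

For a file on disk, a path can be passed directly, for example `count_tags("map.osm")`, where the file name is a placeholder.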