GitHub - MUSA-620-Spring-2018/course-materials: Syllabus: MUSA 620

MUSA 620 - Data Wrangling and Data Visualization
University of Pennsylvania, School of Design

SCHEDULING

Class: Tuesdays from 9am to 12pm in Meyerson Hall, room B2.

Office hours: Monday from 4pm to 7pm. Email galkamaxd at gmail to schedule a time.

Instructor: Max Galka (galkamaxd at gmail dot com)

TA: Evan Cernea (ecernea at sas dot upenn dot edu)

OBJECTIVE

The purpose of this course is to familiarize students with the “pipeline” approach to data science. This involves the process of gathering data, storing the data, analyzing the data, and visualizing the data such that non-technical decision makers can make sense of it. The course is broken down accordingly into four sections.

Data collection: Students will learn how to gather data by way of web scraping, APIs, and other unstructured sources.
Databases: This part of the course teaches students how to store this data for efficient retrieval and analysis.
Analytics: Students will learn a range of machine-driven techniques for analyzing structured and unstructured data.
Data visualization: The last part of the course teaches students how to present the results of their analysis visually using R and the web application framework Shiny.

FORMAT

The course will be conducted in weekly sessions devoted to lectures, demonstrations, and in-class projects.

ASSIGNMENTS

There is one required final project at the end of the semester. Homework will be assigned before the close of class and will be due the following Tuesday by the end of day. Five of the homework assignments will be explicitly required. The remainder are optional, but will count toward the participation component of your final grade.

For the final project, students will replicate the pipeline approach on a dataset (or datasets) of their choosing. The final deliverable will be a web-based data visualization and accompanying description including a summary of the results and the methods used in each step of the process (collection, storage, analysis and visualization).

Final assignment

Assignment Q&A:

If you get stuck, the first step should always be to see if you can find the answer to your question online. In particular, Stack Overflow, Stack Exchange: GIS, and the rest of the Stack Exchange family are great resources.
You are encouraged to ask (and answer) questions via the Slack channel rather than by email, in case other students have the same question.
Evan and I are available for in-depth discussion of assignments during office hours.

GRADING

The grading breakdown is as follows: 50% for homework, 40% for the final project, and 10% for participation.

There will be five required homework assignments, due at the beginning of class. Late homework will be accepted for up to one week after the deadline, with a 10% deduction. Credit will not be given for homework that is more than one week late.

SOFTWARE

This course relies on the R statistical package in conjunction with Shiny and other associated extensions. For geospatial topics, we will also use QGIS.

SCHEDULE

Class #	Date	Topic	Homework*
Week 1	Jan 16	ggplot2, QGIS, data visualization fundamentals
Week 2	Jan 23	Data frames, tidyverse, map projections	Assign HW 1
Week 3	Jan 30	Geocoding/mapping: ggmap, sf (simple features) package
Week 4	Feb 6	Databases: Postgres, SQL
Week 5	Feb 13	Databases: PostGIS, spatial queries	Assign HW 2
Week 6	Feb 20	Web scraping 1: The DOM, web inspector
Week 7	Feb 27	Web scraping 2: CSS selectors, scraping dynamic pages	Assign HW 3
Spring Break
Week 8	Mar 13	Unstructured data: Twitter API
Week 9	Mar 20	Natural language processing: sentiment analysis	Assign HW 4
Week 10	Mar 27	Advanced data visualization
Week 11	Apr 3	Interactive maps: Leaflet
Week 12	Apr 10	Shiny	Assign HW 5
Week 13	Apr 17	Shiny
Week 14	Apr 24	In-class work on final projects

* Homework assignment dates are tentative and subject to change.
