This is the course website for BSDS 100: "Introduction to Data Science with R" at the University of San Francisco. Assignments, lecture notes, and open source code will all be available on this website.

BSDS 100: Intro to Data Science with R

James D. Wilson

Email: jdwilson4@usfca.edu

Class Time: TR, 9:55 - 11:40 AM in Harney 430

Office Hours: TR, 12:00 - 12:30 PM in Harney 107B

Book: R for Data Science by Hadley Wickham and Garrett Grolemund

Syllabus: Link

Course Learning Outcomes

By the end of this course, you will be able to

  • Proficiently wrangle, manipulate, and explore data using the R programming language
  • Utilize contemporary R libraries including ggplot2, tibble, tidyr, dplyr, knitr, and stringr
  • Visualize, present, and communicate trends in a variety of data types
  • Communicate results using R markdown and R Shiny
  • Formulate data-driven hypotheses using exploratory data analysis and introductory model building techniques

Course Overview

Assessment

The focus of this course is to provide you with the basic techniques for making informed, data-driven decisions using the R programming language. This is not a statistics course, but it will give you the intuition to form hypotheses about complex questions through visualization, wrangling, manipulation, and exploration of data. The course will be graded based on the following components:

  • Attendance (20%): You will lose 2% of this grade for every class you miss.
  • Assignments (40%): Computational assignments, to be completed using RStudio and the knitr package, will be assigned regularly throughout the course.
  • Case Studies (20%): You will be assigned applied case studies throughout the course that are to be completed using RStudio.
  • Final Project (20%): The final project will be a computational case study that brings together the techniques learned throughout the semester. The description for this project will be provided around the midpoint of the semester.
  • Extra Credit (+5%): Create a well-organized database of all R functions that you use throughout the semester. These include those mentioned in lectures, those introduced in homework, etc. Along with each function, give a brief description that details the use of the function. Also, organize these functions into categories according to their use.
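
One possible starting point for the extra-credit function database is sketched below. It is only an illustration, not a required format: the column names (`func`, `package`, `category`, `description`), the categories, and the sample entries are all assumptions chosen for the example. It uses base R only, so it runs without installing any packages.

```r
# A hypothetical layout for the function catalog: one row per function.
# Column names and categories are illustrative assumptions, not a required format.
catalog <- data.frame(
  func        = c("c", "length", "paste", "nchar", "factor", "read.csv"),
  package     = c("base", "base", "base", "base", "base", "utils"),
  category    = c("Data structures", "Data structures", "Strings",
                  "Strings", "Factors", "Input and output"),
  description = c(
    "Combine values into a vector",
    "Number of elements in a vector or list",
    "Concatenate strings",
    "Count the characters in each string",
    "Encode a vector as a categorical variable",
    "Read a comma-separated file into a data frame"
  ),
  stringsAsFactors = FALSE
)

# Sort by category, then by function name, so related functions sit together.
catalog <- catalog[order(catalog$category, catalog$func), ]
rownames(catalog) <- NULL

# Count how many functions have been recorded in each category.
functions_per_category <- table(catalog$category)

print(catalog)
print(functions_per_category)
```

Keeping one row per function makes it easy to append new entries as the semester goes on and to re-sort or filter by category later.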

Data Science Links and News

Schedule

Overall, this course will be split into two main parts: (1) learning the basics of how to code in R and (2) performing data analysis on real case studies and examples using data science techniques in R.

Introduction

| Topic | Reading | Assignment | Due Date | In Class Code |
| --- | --- | --- | --- | --- |
| Introduction - History of Data Science | Ch. 1 What is Data Science? | HW 1 | Thursday, 8/24 | |
| R and RStudio | | HW 2 | Tuesday, 8/29 | My First Code |
| R Packages and RMarkdown | | HW 3 | Tuesday, 9/5 | My First Knit |

Data Structures in R

| Topic | Reading | Assignment | Due Date | In Class Code |
| --- | --- | --- | --- | --- |
| Vectors, Matrices, and Arrays | | HW 4 | Tuesday, 9/12 | Data Structures I, Data Structures II |
| Lists and Data Frames | Ch. 20 in R for Data Science | | | Data Structures III |
| Tibbles | Ch. 10 in R for Data Science | HW 5 | Tuesday, 9/26 | Tibbles |
| String Analysis | Ch. 14 in R for Data Science | HW 6 | Thursday, 9/28 | String Analysis I |
| String Analysis 2 | Ch. 14 in R for Data Science | HW 7 | Thursday, 10/5 | String Analysis II |
| Factors | Ch. 15 in R for Data Science | | | Factors |

Data Wrangling and Plotting

| Topic | Reading | Assignment | Due Date | In Class Code |
| --- | --- | --- | --- | --- |
| Input and Output | | | | Input and Output |
| Plotting in R | | HW 8 | Friday, 10/27 | Plotting 1 |
| Wrangling Data | | | | |

Programming

| Topic | Reading | Assignment | Due Date | In Class Code |
| --- | --- | --- | --- | --- |
| Control Flow | Ch. 19 in R for Data Science | | | Functions 1 |
| Writing Functions | Ch. 19 in R for Data Science | | | Functions 2 |
| Functionals | Ch. 18 in R for Data Science | | | |

Statistical Modeling in R

| Topic | Reading | Assignment | Due Date | In Class Code |
| --- | --- | --- | --- | --- |
| Intro to Statistical Modeling in R | Ch. 23 and 24 in R for Data Science | | | |

Case Studies

| Case Study | Data | Date |
| --- | --- | --- |
| CS 1: Beer Review Analysis | beerdata.RData | September 12th, 2017 |
| CS 2: Text Mining | tweets.csv; stateoftheunion1790-2012.txt | September 28th, 2017 |
| CS 3: Building the Game of Blackjack | | November 8th, 2017 |

Final Project

| Description | Due Date |
| --- | --- |
| Project Signup | October 31st at 9:00 AM |
| Final Project Description | November 30th at 9:00 AM |

Important Dates

  • Monday, August 28th - Last day to add the class
  • Friday, September 8th - Census date. Last day to withdraw with tuition reversal
  • Tuesday, October 17th - Fall break! (no class)
  • Friday, November 3rd - Last day to withdraw
  • Thursday, November 23rd - Thanksgiving Holiday (no class)
  • Tuesday, December 5th - Last day of class