DSCI 511: Programming for Data Science

An overview of data structures, iteration, flow control, and program design relevant to data exploration and analysis, including when and how to use existing packages and libraries.

Course Learning Outcomes

By the end of the course, students are expected to:

  1. Translate fundamental programming concepts such as loops and conditionals into R and Python code.
  2. Translate a computational problem into efficient R and Python code.
  3. Understand and implement the basics of object-oriented programming in Python.
  4. Understand how to write functions in both R and Python.
  5. Produce human-readable code that incorporates best practices of programming and coding style.
  6. Predict the output of code based on R's scoping rules.
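
As a rough illustration of outcomes 1, 4, and 5, the following Python sketch combines a loop, a conditional, and a function documented with a NumPy-style docstring. It is not taken from the course materials; the function name `count_above` and the sample data are hypothetical.

```python
def count_above(values, threshold):
    """Count how many numbers in a sequence exceed a threshold.

    Parameters
    ----------
    values : list of float
        The numbers to check.
    threshold : float
        The cutoff; only values strictly greater than it are counted.

    Returns
    -------
    int
        The number of elements in `values` greater than `threshold`.
    """
    count = 0
    for value in values:        # loop over each element
        if value > threshold:   # conditional: count only values above the cutoff
            count += 1
    return count


# Hypothetical sample data, for illustration only
temperatures = [12.5, 18.0, 21.3, 9.8]
print(count_above(temperatures, 15))  # prints 2
```

Descriptive names and a docstring that states the inputs and the return value are one way to make a short function readable to others.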

Assessments

This is an assignment-based course. You'll be evaluated as follows:

| Assessment | Weight | Due Date | Location |
|---|---|---|---|
| Lab Assignment 1 | 15% | Saturday, Sept 15 at 18:00 | Submit to GitHub |
| Lab Assignment 2 | 15% | Saturday, Sept 22 at 18:00 | Submit to GitHub |
| Quiz 1 | 20% | Tuesday, Sept 25, 15:00-15:30 | Your lab room |
| Lab Assignment 3 | 15% | Saturday, Sept 29 at 18:00 | Submit to GitHub |
| Lab Assignment 4 | 15% | Wednesday, Oct 3 at 18:00 | Submit to GitHub |
| Quiz 2 | 20% | Friday, Oct 5, 14:40 | Wood IRC Room 4 |

Tip: Use the lecture learning objectives as beacons when studying for your quizzes!

Lecture Details

| Lecture | Topic | Pre-readings/Resources |
|---|---|---|
| 1 | Python datatypes and operators | Python documentation: standard data types and built-in functions; Think Python: variables, expressions and statements, lists, tuples and strings |
| 2 | Python control flow and functions | Python documentation: control flow and functions |
| 3 | Python program design and testing | Python Testing; PEP 257: Docstrings; NumPy docstring examples |
| 4 | Python classes, objects, modules and packages | Python documentation: objects and classes; modules and packages |
| 5-6 | The R landscape | adv-r: Data Structures, except for the "Attributes" and "Factors" sections; adv-r: Subsetting, up until (but not including) "Missing/out of bounds indices" |
| 7 | Environments and Scoping | adv-r: Functions, "Lexical Scoping" section; adv-r: Environments, "Environment Basics" and "Function Environments" sections (but don't worry about "binding environments" and "calling environments") |
| 8 | Programming for Humans | Style guide, Python: PEP 8; Style guide, R: adv-r: Style; adv-r: Exceptions and Debugging, sections "Debugging Tools" and "Defensive Programming" |

If time remains, we will cover:

| Topic | Relevant Text |
|---|---|
| R "Vocabulary" | adv-r: Vocabulary |

Annotated Resources

Here are prominent course resources that we will be referring to:

  1. Python documentation
    • The Python documentation is a great resource for learning Python, especially the Python tutorial.
  2. Think Python: How to Think Like a Computer Scientist
    • "How to Think Like a Computer Scientist" is a standard textbook for introductory programming courses. It includes case studies and exercises.
  3. Advanced R, by Hadley Wickham (free version online)
    • This is a prominent resource for R as a programming language, allowing the reader to dig deep into R. It assumes that readers already have some programming background. Its first part, Foundations, is closely aligned with the objectives of DSCI 511 and is therefore the textbook for the second half of the course. Gaining familiarity with this book will likely be an asset in your data science career.
  4. R swirl
    • For those new to R who want more practice with the basics.

Here are other resources that you might find useful:

Policies

Please see the general MDS policies.
