

The files in this repository contain lecture code for Data Analysis for R & Python (ECO 590 and ECO 389) at Pace University taught by Mary Kaltenberg.

I'm going to try a different style of presenting slides - we won't use them! Instead, I will share my Jupyter notebook, and we will go through the material together using the lecture code notebooks available in this repository.

Course Description:

In this course, you’ll learn the basics of coding for data analysis using R and Python. This is a project-based course to help you learn the practicalities of working with data, including how to find data sources and collect your own data, data cleaning, data visualization, and basic data analysis. You’ll apply your knowledge of statistics and econometrics with the ultimate goal of answering a research question in economics or business.

Python is a robust programming language with a powerful set of libraries that let you do anything from building a software program to machine learning. We will focus on web scraping, data cleaning, and visualization in Python using pandas, NumPy, Beautiful Soup, matplotlib, and seaborn, among others. In R, we will primarily focus on econometrics and visualization. This course is not meant to substitute for statistics or econometrics, but rather to complement them and show practical applications of concepts you've already learned.

R and Python are widely used in many scientific areas for data analysis. This course is an introduction to the R and Python programming languages for students without prior programming experience. Programming is like any other language, though: it requires practice, patience, and application to become fluent.
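As a rough illustration of the kind of data cleaning described above, here is a minimal sketch. It is not taken from the lecture notebooks: the sample data, the column names, and the `clean_growth` helper are all made up for this example, and it uses only the Python standard library rather than pandas so that it runs without any installs.

```python
import csv
import io
from statistics import mean

# Made-up sample data for illustration only (not from the course materials).
RAW = """country,year,gdp_growth
A,2019,2.5
B,2019,
C,2019,-0.5
D,2019,n/a
"""


def clean_growth(raw_text):
    """Return (country, growth) pairs, dropping rows whose value is missing or non-numeric."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        value = record["gdp_growth"].strip()
        try:
            growth = float(value)
        except ValueError:
            # Empty strings and placeholders such as "n/a" end up here.
            continue
        rows.append((record["country"], growth))
    return rows


cleaned = clean_growth(RAW)
average = mean(growth for _, growth in cleaned)
print(cleaned)
print(average)
```

In the course itself, this sort of step would more likely be done with pandas; the loop-and-conditional version is only meant to show the idea.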

## Lecture Overview

Classes are taught once a week for 3 hours. Most lectures take the entire three hours, including group exercises and weekly task reviews. Some lectures spill into others, and R lectures tend to be shorter.

I recommend teaching in the following order:

  1. Install and Intro

Overview of Jupyter Notebook, including installation instructions and various install issues.

  2. Python ABCs

Basic building blocks of Python coding

  3. Sentence Structures Count Python-ula

For loops, while statements, conditionals, and NumPy

  4. Let's get this data started Pandas

Overview of pandas and data wrangling

  5. New Data City Access Grants

Overview of servers and APIs

5a. Optional: Working with Servers

Basics of accessing servers via a terminal

  6. Webscraping

Beautiful Soup and Requests

  7. Webscraping 2 and Selenium

More examples with webscraping and an overview of Selenium (Chrome driver extensions, how to scrape JavaScript content/interactive websites)

  8. Becoming a Viz Kid

Data Visualization - matplotlib

  9. Becoming a Viz Kid part 2

Data Visualization - details with matplotlib (fig, ax, legends, ticks) and seaborn. Lecture on Bad Graphs (I typically start introducing R in this class)

  10. New R Kids on the Block - Intro to R

Introduction to R and RStudio - basics of the environment, translations of code from Python to R

  11. Data Wrangling and Intro to Regressions

Translations of Python to R for data manipulation/cleaning (Data analysis on its own is too short, so I begin regressions)

  12. Advanced Regressions

More detailed regression analysis in R & Stargazer

  13. Becoming an R Viz Kid


