Data_Analysis_Python_R
The files in this repository contain lecture code for Data Analysis for R & Python (ECO 590 and ECO 389) at Pace University taught by Mary Kaltenberg.
I'm going to try a different style of presenting slides - we won't use them! Instead, I will share my Jupyter notebook and we will go through the material together, with the lecture code notebooks available through this repository.
Course Description:
In this course, you’ll learn the basics of coding for data analysis using R and Python. This is a project-based course to help you learn the practicalities of working with data, which includes how to find data sources and collect your own data, data cleaning, data visualization, and basic data analysis. You’ll apply your knowledge of statistics and econometrics with the ultimate goal of answering a research question in economics or business.
Python is a robust programming language with a powerful set of libraries that can enable you to do anything from building a software program to machine learning. We will focus on web scraping, data cleaning, and visualization in Python using pandas, NumPy, Beautiful Soup, matplotlib, and seaborn, among others. In R, we will primarily focus on econometrics and visualization. This course is not meant to substitute for statistics or econometrics, but rather to complement them and show practical applications of concepts you've already learned.
These languages are widely used in many scientific fields for data analysis. This course is an introduction to the R and Python programming languages for students without prior programming experience. Programming is like any other language, though: it requires practice, patience, and application to become fluent.
## Lecture Overview

Classes are taught once a week for 3 hours. Most lectures take the entire three hours, including group exercises and weekly task reviews. Some lectures spill into others, and R lectures tend to be shorter.
I recommend teaching in the following order:
- Install and Intro
Overview of Jupyter Notebook including installation instructions and various install issues.
- Python ABCs
Basic building blocks of Python coding
- Sentence Structures Count Python-ula
For loops, while statements, conditionals, and NumPy
- Let's get this data started Pandas
Overview of pandas and data wrangling
- New Data City Access Grants
Overview of Servers and APIs
5a. Optional: Working with Servers
Basics of accessing servers via a terminal
- Webscraping
Beautiful Soup and Requests
- Webscraping 2 and Selenium
More webscraping examples and an overview of Selenium (Chrome driver extensions, how to scrape JavaScript content/interactive websites)
- Becoming a Viz Kid
Data Visualization - matplotlib
- Becoming a Viz Kid part 2
Data Visualization - details with matplotlib (fig, ax, legends, ticks) and seaborn. Lecture on Bad Graphs (I typically start introducing R in this class)
- New R Kids on the Block - Intro to R
Introduction to R and RStudio - basics of the environment, translations of code from Python to R
- Data Wrangling and Intro to Regressions
Translations of Python to R for data manipulation/cleaning (Data analysis on its own is too short, so I begin regressions)
- Advanced Regressions
More detailed regression analysis in R & Stargazer
- Becoming an R Viz Kid
ggplot2