This repository contains my code for the labs and exercises in "An Introduction to Statistical Learning" by James, Witten, Hastie, and Tibshirani. For each chapter, I used the R code they provided as a starting point and translated it to Python. Hope you enjoy!
A few things:
- I found it most helpful to read through each chapter first, then breeze through the videos in the course (I usually watched them at 2x speed), then try to implement the concepts in Python. Having a background in R made it much easier to translate the code, but it was still a challenge at times!
- Many online tutorials helped me out along the way. I could not have translated all the code without scikit-learn's documentation and Jordi Warmenhoven's ISLR-Python repo.
- After Chapter 6, I was itching to apply my new skills. I attempted a couple of competitions on Kaggle, but found that I was really struggling with processing the data. I decided to take a week to focus on my data preprocessing skills and on building a data pipeline. During that week, I built this pipeline.