high level overview of python for R users, data cleaning, preprocessing, modeling, model evaluation


Date: September 9, 2019
Author: Sylvia Tran

The work in this repository was designed for the LA R Users Group. It gives a high-level overview of Python for R users, covering data cleaning, preprocessing, and modeling.


  • The code provided uses Python 3.7.0
  • Environment setup is outside the scope of this repository
  • The work was done on macOS, so nuances specific to Windows are not addressed
  • The use of RandomForest is demonstrative; it is not intended to optimize hyperparameters or minimize loss
  • Forthcoming: an R <-> Python cheatsheet will be added to the ./slides-etc/ directory in the coming weeks

This Repository:

  • assets (pictures and .mov files for screen capture)
  • notebooks (Jupyter notebook that can be converted to a slide deck)
  • slides (holds slide deck as .html)
  • src (.py file as an example)

Ways to Learn Python:

A. Interactive Python (IPython) can be accessed from the Terminal in RStudio by

  1. starting from the working directory of choice
  2. running $ ipython

B. Jupyter notebook (.ipynb, interactive Python notebook)

  1. after downloading the repo, make a copy of the .ipynb file in the /notebooks folder
  2. take apart the code line by line, or go to town trying different things on the toy dataset
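
For R users trying things out in IPython or a notebook cell, a few pandas calls map closely onto familiar R functions. The snippet below is a minimal sketch, not code taken from the notebook: the DataFrame `df` and its columns `species` and `length` are made up for illustration, and the R equivalents in the comments are approximate.

```python
import pandas as pd

# Hypothetical toy data for illustration; not the dataset used in the notebook.
df = pd.DataFrame({
    "species": ["a", "b", "a", "c"],
    "length": [1.0, 2.5, None, 4.0],
})

first_rows = df.head(2)                # R: head(df, 2)
n_rows, n_cols = df.shape              # R: dim(df)
summary = df.describe()                # R: summary(df), numeric columns only
missing = df.isna().sum()              # R: colSums(is.na(df))
counts = df["species"].value_counts()  # R: table(df$species)

# R: aggregate(length ~ species, data = df, FUN = mean)
# pandas skips missing values in mean() by default.
mean_by_species = df.groupby("species")["length"].mean()
```

Running one line at a time and printing each result (for example `print(missing)`) is a quick way to see how the pandas objects differ from R data frames and vectors.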


  1. Importing Packages
  2. Loading Toy Datasets (sklearn) & using pandas
  3. Cursory Inspection (pandas & numpy)
  4. Light Cleaning (base python, pandas)
  5. Train-test-split (sklearn)
  6. Feature Scaling (sklearn)
  7. Model (sklearn)
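
The seven steps above can be sketched end to end as follows. This is an illustrative sketch under assumptions, not the notebook's actual code: the iris toy dataset, the 25% test split, the column renaming, and the `random_state` values are all choices made here for the example. In keeping with the note above, the RandomForest is demonstrative and untuned.

```python
# 1. Importing packages
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 2. Loading a toy dataset into a pandas DataFrame
#    (iris is an assumption; the notebook may use a different dataset)
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

# 3. Cursory inspection
n_rows, n_cols = df.shape
n_missing = int(df.isna().sum().sum())
class_counts = np.bincount(df["target"])

# 4. Light cleaning: tidy column names with base Python string methods,
#    e.g. "sepal length (cm)" becomes "sepal_length"
df.columns = [c.replace(" (cm)", "").replace(" ", "_") for c in df.columns]

# 5. Train-test split (stratified so class proportions are preserved)
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# 6. Feature scaling: fit on the training data only, then apply to both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 7. Model: a demonstrative, untuned random forest
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)
accuracy = model.score(X_test_scaled, y_test)
```

Tree-based models such as random forests do not generally need scaled features; the scaling step is kept here only to mirror the outline. Fitting the scaler on the training split alone avoids leaking information from the test set.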