An introduction to the data science workflow in Python
DC PyLadies | Thursday, 22 March 2018
These are the tutorial materials for the 22 March 2018 DC PyLadies meetup, An introduction to the data science workflow in Python.
These notebooks can be viewed and run interactively on Binder. Note that this may take a while to load, as it installs all the libraries.
You can also view these notebooks on GitHub (without interaction).
You can also download these notebooks and install everything on your local machine. To do so, choose a directory where you want to put the materials (path/of/your/choosing), go to that directory (cd path/of/your/choosing), clone these materials (git clone), move into the cloned repository, install the Python packages with pip, and then launch the Jupyter notebooks (jupyter notebook):

    cd path/of/your/choosing
    git clone email@example.com:angelaambroz/2018_03_pyladies.git
    cd 2018_03_pyladies
    pip install -r requirements.txt
    jupyter notebook

Table of Contents
- 0_Introduction - Welcome to the tutorial! How to get data from files, from databases, and from APIs. An introduction to
- 1_EDA - Exploratory data analysis, using matplotlib. An introduction to visualizations.
- 2_StatsML - Fitting a linear regression using three different libraries (including numpy) and comparing the results. Discussion of other libraries for machine learning and statistics.
- 3_Etc - Some recommended resources for learning more. Other things to learn.
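
As a rough preview of what 2_StatsML covers, the idea of fitting one linear regression in more than one way and comparing the results can be sketched with numpy alone. This is a minimal sketch, not the notebook's own code: the data are synthetic and made up for illustration, and the two approaches shown (np.polyfit and np.linalg.lstsq) are both numpy routines standing in for the separate libraries the notebook compares.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2x + 1 plus some noise.
rng = np.random.default_rng(seed=0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Approach 1: np.polyfit with degree 1 returns [slope, intercept].
slope_pf, intercept_pf = np.polyfit(x, y, deg=1)

# Approach 2: ordinary least squares on a design matrix with columns [x, 1].
design = np.column_stack([x, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
slope_ls, intercept_ls = coeffs

# The two fits should agree up to floating-point error.
print(f"polyfit: slope={slope_pf:.3f}, intercept={intercept_pf:.3f}")
print(f"lstsq:   slope={slope_ls:.3f}, intercept={intercept_ls:.3f}")
```

Both approaches solve the same least-squares problem, so the estimates should match closely and land near the slope and intercept used to generate the data.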