Here's all the code and examples from my book Data Science from Scratch. The code
directory contains Python 2.7 versions, and the code-python3
directory contains the Python 3 equivalents. (I tested them in 3.5, but they should work in any 3.x.)
July 2018: I am currently working on the second edition. It will be based on Python 3.6, will have much cleaner code, and will contain expanded coverage of deep learning, NLP, and whatever else I feel like adding. Stay tuned.
Each can be imported as a module, for example (after you cd into the /code directory):
from linear_algebra import distance, vector_mean
v = [1, 2, 3]
w = [4, 5, 6]
print distance(v, w)
print vector_mean([v, w])
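If you don't have the repository checked out, the two functions imported above can be approximated as follows. This is a minimal sketch, assuming `distance` is the Euclidean distance and `vector_mean` is the componentwise mean; the versions in `linear_algebra.py` may be written differently.

```python
import math

def distance(v, w):
    """Euclidean distance between two equal-length vectors (assumed behavior)."""
    return math.sqrt(sum((v_i - w_i) ** 2 for v_i, w_i in zip(v, w)))

def vector_mean(vectors):
    """Componentwise mean of a list of equal-length vectors (assumed behavior)."""
    n = float(len(vectors))  # float so the division is not truncated in Python 2
    return [sum(components) / n for components in zip(*vectors)]

v = [1, 2, 3]
w = [4, 5, 6]
print(distance(v, w))        # sqrt(27), about 5.196
print(vector_mean([v, w]))   # [2.5, 3.5, 4.5]
```

The single-argument `print(...)` calls run under both Python 2.7 and Python 3.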
Each can also be run from the command line to get a demo of what it does (and to execute the examples from the book):
python recommender_systems.py
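One common way for a single file to be both importable and runnable is to put its demo code behind a `__name__` check. The sketch below shows that general Python pattern with a hypothetical module; the names `square` and `demo` are illustrative and are not taken from the book's code.

```python
def square(x):
    """A function that other code can import."""
    return x * x

def demo():
    """Build the lines that the command-line demo prints."""
    return ["square(%d) = %d" % (n, square(n)) for n in range(3)]

if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    for line in demo():
        print(line)
```

Importing such a file gives access to `square` without printing anything, while running it with `python` prints the demo lines.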
Additionally, I've collected all the links from the book.
And, by popular demand, I made an index of functions defined in the book, by chapter and page number. The data is in a spreadsheet, and I also made a toy (experimental) searchable webapp.
- Introduction
- A Crash Course in Python
- Visualizing Data
- Linear Algebra
- Statistics
- Probability
- Hypothesis and Inference
- Gradient Descent
- Getting Data
- Working With Data
- Machine Learning
- k-Nearest Neighbors
- Naive Bayes
- Simple Linear Regression
- Multiple Regression
- Logistic Regression
- Decision Trees
- Neural Networks
- Clustering
- Natural Language Processing
- Network Analysis
- Recommender Systems
- Databases and SQL
- MapReduce
- Go Forth And Do Data Science