
This repository contains my implementation of the algorithms described in the book "Data Science From Scratch" by Joel Grus. Please scroll down for description of each file.


neerajkumarvaid/Data-Science-From-Scratch-Python


Data-Science-From-Scratch-Python

This repository contains my implementation of the algorithms described in the book "Data Science From Scratch" by Joel Grus. I used Python 3.7 to code these algorithms.

| S.No. | File name | Description |
| --- | --- | --- |
| 1 | classes_set.py | Illustrates how to create classes in Python. |
| 2 | visualization.py | Illustrates how to create bar, scatter and line plots. |
| 3 | vector_operations.py | Illustrates how to create and manipulate vectors in Python. |
| 4 | matrix_operations.py | Illustrates how to create and manipulate matrices in Python. |
| 5 | statistics.py | Computes basic statistics from the data, including mean, variance, covariance and correlation. |
| 6 | probability.py | Demonstrates basic probability operations in Python. |
| 7 | hypothesis_testing.py | Performs basic hypothesis testing in Python. |
| 8 | gardient_descent.py | Implementation of batch, mini-batch and stochastic gradient descent algorithms (IPython Notebook). |
| 9 | working_with_data.py | Demonstrates data correlation, rescaling and dimensionality reduction, including a PCA algorithm implemented from scratch (IPython Notebook). |
| 10 | machine_learning.py | Computes model assessment metrics such as accuracy, precision, recall and F1-score (IPython Notebook). |
| 11 | k_nearest_neighbors.py | Implementation of the k-nearest neighbors algorithm from scratch in Python (IPython Notebook). |
| 12 | naive_bayes.py | Implementation of the Naive Bayes algorithm from scratch in Python (IPython Notebook). |
| 13 | linear_regression.py | Implementation of the simple linear regression algorithm from scratch in Python (IPython Notebook). |
| 14 | multiple_regression.py | Implementation of the multiple linear regression algorithm from scratch in Python (IPython Notebook). |
| 15 | logistic_regression.py | Implementation of a simple logistic regression algorithm from scratch in Python (IPython Notebook). |
| 16 | decision_trees.py | Implementation of decision trees from scratch in Python (IPython Notebook). |
| 17 | neural_networks.py | Implementation of feed-forward neural networks and the backpropagation algorithm from scratch in Python (IPython Notebook). |
| 18 | deep_learning.py | Implementation of deep neural networks from scratch in Python, with various loss functions and optimization techniques, including network regularization using dropout (IPython Notebook). |
| 19 | clustering.py | Implementation of k-means and bottom-up hierarchical clustering from scratch in Python (IPython Notebook). |
| 20 | nlp.py | Implementation of popular natural language processing algorithms from scratch in Python, including bigrams, trigrams, topic modeling, word vectors and recurrent neural networks (IPython Notebook). |
| 21 | network_analysis.py | Demonstrates how to do simple network analysis in Python (IPython Notebook). |
| 22 | recommender_systems.py | Implementation of user-based and item-based collaborative filtering, and a matrix factorization algorithm, in Python (IPython Notebook). |
| 23 | databases_sql.py | An implementation of basic SQL operations in Python (IPython Notebook). |
| 24 | MapReduce.py | An implementation of mapper and reducer functions, with a few examples, in Python (IPython Notebook). |
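
To give a feel for the "from scratch" style, here is a minimal sketch of the model assessment metrics mentioned for machine_learning.py, computed from confusion-matrix counts using only plain Python. It is an illustration and is not taken from the files in this repository; the function names and the example counts are assumptions chosen for this example.

```python
# Illustrative sketch only: function names and counts are hypothetical,
# not copied from machine_learning.py.

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of positive predictions that were actually positive."""
    return tp / (tp + fp)

def recall(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of actual positives that the model identified."""
    return tp / (tp + fn)

def f1_score(tp: int, fp: int, fn: int, tn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp, fn, tn)
    r = recall(tp, fp, fn, tn)
    return 2 * p * r / (p + r)

# Made-up counts: 8 true positives, 2 false positives,
# 4 false negatives, 86 true negatives.
counts = (8, 2, 4, 86)
print(accuracy(*counts), precision(*counts), recall(*counts), f1_score(*counts))
```

The functions assume non-zero denominators; a more defensive version would need to decide what to return when there are no positive predictions or no actual positives.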
