We are writing a collaborative, open-source data science book so that anyone can join and help us!
10 projects, 10 chapters, 1 book, with the goal of making it easy for anyone to start with data science by focusing on its practical side.
If you want to be mentioned in the book and receive a free copy of the final ebook, please help us make the code better.
We plan the following chapters in the book:
Your Data Science setup
Anaconda (installation, updating Anaconda, installing new libraries)
Virtualenv (why you need it, creating new environments)
Jupyter Notebook (starting, popular shortcuts)
GitHub (creating an account, basic commands)
Google Colab (data science in the cloud, basic usage)
-
Analysing pharmaceutical sales data (Pandas, Matplotlib, Clustering, Regression)
-
Predicting House Pricing (KNN, XGBoost)
-
Introduction to Computer Vision with MNIST (Neural Networks)
-
Face recognition (Computer Vision)
-
Titanic Challenge (regression)
-
Clustering wine dataset with k-means and DBSCAN (Clustering)
-
PGA Tour 2010-2019 clustering (Clustering)
-
Sentiment Analysis with Twitter (NLP)
-
Cats and Dogs (Neural Networks)
-
IMDB Database (regression, data visualisation, advanced NLP)
Practice what you have learnt with these projects (more project ideas, without solutions)
Real-world examples of Data Science in business (how some of the algorithms are currently used in practice)
Next steps (what to learn next, books to read, etc.)