📊 All of the courses, assignments, exercises, mini-projects and books that I've done so far while teaching myself Machine Learning and Data Science.
coursera-deep-learning-deeplearning.ai
coursera-ibm-data-professional-certificate
coursera-ml-andrew-ng
dataquest-courses
fastai-ml-for-coders
interview
mini-projects
projects
reading-isl
.gitignore
README.md
good-articles.md

data-science-learning

The list of things I've finished so far while teaching myself Machine Learning and Data Science.

🔥 Projects

  • Setting up a café in Ho Chi Minh City: finding the best place to set up a new business (article, source).
  • Titanic: Machine Learning from Disaster (from Kaggle), predicting which passengers survived the Titanic shipwreck (source).
  • "Blue Book for Bulldozers" Kaggle competition.

I also do some mini-projects to understand the concepts. The HTML files (exported from the corresponding Jupyter Notebook files) and the "Open in Colab" files for the mini-projects below can be found here.

🤖 Algorithms

  • Convolutional Neural Network (CNN).
  • Decision Tree (my note).
  • Density-based Clustering.
  • Gaussian Naive Bayes.
  • Hierarchical Clustering.
  • K-Means Clustering.
  • K-Nearest Neighbors (KNN).
  • Linear Regression / Logistic Regression.
  • Neural Networks.
  • Perceptron.
  • Principal Component Analysis (PCA) (my note).
  • Random Forest (my note).
  • Recurrent Neural Network (RNN).
  • Singular Value Decomposition (SVD).
  • Stochastic Gradient Descent (SGD).
  • Support Vector Machine (SVM) (my note).
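
As a rough illustration of one entry in the list above (K-Nearest Neighbors), here is a minimal pure-Python sketch. The function name `knn_predict` and the toy data are hypothetical and are not taken from the notebooks in this repository; a real project would more likely use scikit-learn.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k nearest neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5))  # "a"
near_far = knn_predict(points, labels, (5.5, 5.5))     # "b"
```

The sketch needs Python 3.8 or later for `math.dist`, and it recomputes every distance for each query, which is fine for small data only.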

💡 Concepts

  • Activation functions.
  • Active learning (ML).
  • Cost function.
  • Confusion matrix.
  • Cross Validation (K-folds).
  • Decision boundary.
  • Gradient Descent.
  • GridSearch.
  • Functions: Sigmoid, ReLU.
  • F-test, p-value, f1-score, t-value, z-score.
  • Forward/Backward propagation.
  • Overfitting (High variance) / Underfitting (High bias).
  • Pipeline.
  • Plots / Charts: box plots, heat map plots, line plots, area plots, bar chart, choropleth map, waffle chart, factorplot.
  • Regular Expressions (RegEx).
  • Scaling.
  • Supervised Learning / Unsupervised Learning.
  • Train / Dev / Test sets.
  • Tuning.
  • Whitening.
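
To make a few of the concepts above concrete (the Sigmoid and ReLU functions, and Gradient Descent on a cost function), here is a small sketch using only the Python standard library. The names `sigmoid`, `relu` and `gradient_descent`, the quadratic cost and the learning rate are all illustrative assumptions, not code from this repository.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: squashes any real number into (0, 1)."""
    # Note: math.exp overflows for very large negative z; fine for a sketch.
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Assumed example cost: cost(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

With this learning rate each step shrinks the distance to the minimum by a constant factor, so `minimum` ends up very close to 3; a learning rate that is too large would make the iterates diverge instead.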

🎲 Tasks

  • Data Visualization.
  • Data Wrangling.
  • Model evaluation.
  • Preprocessing (texts, images, dates & times, structured data).
  • Testing.
  • Web Scraping.

🐍 Programming Languages

  • GraphQL: an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data.
  • Python: an interpreted, high-level, general-purpose programming language (my note).
  • R: a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.
  • Scala: a general-purpose programming language providing support for functional programming and a strong static type system.
  • SQL: a domain-specific language used in programming and designed for managing data held in a relational database management system, or for stream processing in a relational data stream management system.

⚙️ Frameworks & Platforms

  • Docker: a set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers.
  • Google Colab: a free cloud service, based on Jupyter Notebooks, for machine-learning education and research (my note).
  • Kaggle: an online community of data scientists and machine learners, owned by Google.
  • Hadoop: a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation.
  • PostgreSQL (Postgres): a free and open-source relational database management system emphasizing extensibility and technical standards compliance.
  • Spark: an open-source distributed general-purpose cluster-computing framework.

⚒️ Tools

  • Git: a distributed version-control system for tracking changes in source code during software development (my note).
  • Markdown: a lightweight markup language with plain text formatting syntax (my note).
  • Jupyter Notebook: an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text (my note).
  • Trello: a web-based Kanban-style list-making application.

📚 Libraries

A "ticked" library doesn't mean that I know or understand all of it, but I can easily use it with its documentation.

  • D3js: a JavaScript library for producing dynamic, interactive data visualizations in web browsers.
  • Keras: an open-source neural-network library written in Python.
  • Matplotlib: a plotting library for the Python programming language and its numerical mathematics extension NumPy.
  • NumPy: a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
  • OpenCV: a library of programming functions mainly aimed at real-time computer vision.
  • pandas: a software library written for the Python programming language for data manipulation and analysis.
  • Seaborn: a Python data visualization library based on matplotlib.
  • scikit-learn: a free software machine learning library for the Python programming language.
  • TensorFlow: a free and open-source software library for dataflow and differentiable programming across a range of tasks.

👨‍🏫 Courses

The unchecked courses are still in progress.

📖 Books

The unchecked books are still in progress.

🤖 GitHub repositories

🌏 Other resources


The descriptions of terms on this site are borrowed from Wikipedia.
