📊 All the courses, assignments, exercises, mini-projects and books that I have completed so far while teaching myself Machine Learning and Data Science.


A list of the things I have finished so far while teaching myself Machine Learning and Data Science.

🔥 Projects

  • Setting up a café in Ho Chi Minh City - finding the best place to set up a new business - article - source.
  • Titanic: Machine Learning from Disaster (from Kaggle) - predicting which passengers survived the Titanic shipwreck - source.
  • "Blue Book for Bulldozers" Kaggle competition.

I also do some mini-projects to understand the concepts. You can find the HTML files (exported from the corresponding Jupyter Notebook files) and the "Open in Colab" files for the mini-projects below here.

🤖 Algorithms

  • Convolutional Neural Network (CNN).
  • Decision Tree - my note.
  • Density-based Clustering.
  • Gaussian Naive Bayes.
  • Hierarchical Clustering.
  • K-Means Clustering.
  • K-Nearest Neighbors (KNN).
  • Linear Regression / Logistic Regression.
  • Neural Networks.
  • Perceptron.
  • Principal Component Analysis (PCA) - my note.
  • Random Forest - my note.
  • Recurrent Neural Network (RNN).
  • Singular Value Decomposition (SVD).
  • Stochastic Gradient Descent (SGD).
  • Support Vector Machine (SVM) - my note.

💡 Concepts

  • Activation functions.
  • Active learning (ML).
  • Cost function.
  • Confusion matrix.
  • Cross Validation (K-folds).
  • Decision boundary.
  • Gradient Descent.
  • GridSearch.
  • Functions: Sigmoid, ReLU.
  • F-test, p-value, f1-score, t-value, z-score.
  • Forward/Backward propagation.
  • Overfitting (High variance) / Underfitting (High bias).
  • Pipeline.
  • Plots / Charts: box plots, heat map plots, line plots, area plots, bar chart, choropleth map, waffle chart, factorplot.
  • Regular Expressions (RegEx).
  • Scaling.
  • Supervised Learning / Unsupervised Learning.
  • Train / Dev / Test sets.
  • Tuning.
  • Whitening.

🎲 Tasks

  • Data Visualization.
  • Data Wrangling.
  • Model evaluation.
  • Preprocessing (texts, images, dates & times, structured data).
  • Testing.
  • Web Scraping.

🐍 Programming Languages

  • GraphQL - an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data.
  • Python - an interpreted, high-level, general-purpose programming language - my note.
  • R - a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.
  • Scala - a general-purpose programming language providing support for functional programming and a strong static type system.
  • SQL - a domain-specific language used in programming and designed for managing data held in a relational database management system, or for stream processing in a relational data stream management system.

⚙️ Frameworks & Platforms

  • Docker - a set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers.
  • Google Colab - a free cloud service, based on Jupyter Notebooks, for machine-learning education and research - my note.
  • Kaggle - an online community of data scientists and machine learners, owned by Google.
  • Hadoop - a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation.
  • PostgreSQL (Postgres) - a free and open-source relational database management system emphasizing extensibility and technical standards compliance.
  • Spark - an open-source distributed general-purpose cluster-computing framework.

⚒️ Tools

  • Git - a distributed version-control system for tracking changes in source code during software development - my note.
  • Markdown - a lightweight markup language with plain text formatting syntax - my note.
  • Jupyter Notebook - an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text - my note.
  • Trello - a web-based Kanban-style list-making application.

📚 Libraries

A "ticked" library does not mean that I know or understand all of it, only that I can use it comfortably with its documentation!

  • D3js - a JavaScript library for producing dynamic, interactive data visualizations in web browsers.
  • Keras - an open-source neural-network library written in Python.
  • Matplotlib - a plotting library for the Python programming language and its numerical mathematics extension NumPy.
  • NumPy - a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
  • OpenCV - a library of programming functions mainly aimed at real-time computer vision.
  • pandas - a software library written for the Python programming language for data manipulation and analysis.
  • Seaborn - a Python data visualization library based on matplotlib.
  • scikit-learn - a free software machine learning library for the Python programming language.
  • TensorFlow - a free and open-source software library for dataflow and differentiable programming across a range of tasks.

👨‍🏫 Courses

The "non-checked" courses are still in progress!

📖 Books

The "non-checked" books are still in progress!

🤖 GitHub repositories

🌏 Other resources

The descriptions of terms on this site are borrowed from Wikipedia.
