A list of what I have finished so far while teaching myself Machine Learning and Data Science.
- My raw notes: rawnote.dinhanhthi.com (quickly capture ideas from the courses).
- My main notes: dinhanhthi.com/notes (well-written notes, not only for me).
- My learning log.
- Setting up a café in Ho Chi Minh City -- finding the best place to set up a new business -- article -- source.
- Titanic: Machine Learning from Disaster (from Kaggle) -- predicting which passengers survived the Titanic shipwreck -- source.
I also do some mini-projects to understand the concepts. You can find the HTML files (exported from the corresponding Jupyter notebooks) and the "Open in Colab" files for the mini-projects below here.
- Image compression using K-Means -- source -- Open in Colab -- my note
- An example for understanding the idea behind PCA -- source -- Open in Colab.
- Image compression using PCA -- source -- Open in Colab.
- PCA without scikit-learn -- source -- Open in Colab.
- Face Recognition using SVM -- source -- Open in Colab.
- XOR problem using SVM, to see the effect of gamma and C when using the RBF kernel -- source -- Open in Colab.
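To give a feel for the first mini-project in the list above, here is a minimal, library-free sketch of K-Means color quantization, the idea behind K-Means image compression. It is not the code from the linked notebook: the function names (`squared_distance`, `kmeans_quantize`), the flat list of RGB tuples standing in for an image, and the deterministic farthest-point initialization are all assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_quantize(pixels, k, iterations=10):
    """Map every pixel to one of k representative colors (hypothetical helper)."""
    # Assumption for this sketch: deterministic farthest-point initialization
    # instead of the random or k-means++ seeding a library would normally use.
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(
            max(pixels, key=lambda p: min(squared_distance(p, c) for c in centers))
        )

    def nearest(p):
        # Index of the center closest to pixel p.
        return min(range(k), key=lambda i: squared_distance(p, centers[i]))

    for _ in range(iterations):
        # Assignment step: group pixels by their nearest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            clusters[nearest(p)].append(p)
        # Update step: move each center to the mean color of its cluster.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))

    # The compressed image stores only k colors (the palette) plus one index per pixel.
    palette = [tuple(round(ch) for ch in c) for c in centers]
    compressed = [palette[nearest(p)] for p in pixels]
    return compressed, palette


if __name__ == "__main__":
    # A tiny made-up "image": two reddish and two bluish pixels.
    image = [(250, 0, 0), (0, 0, 250), (254, 4, 4), (4, 4, 254)]
    compressed_image, colors = kmeans_quantize(image, k=2)
    print(colors)
    print(compressed_image)
```

On a real image the pixel list would come from an image-loading library, and a tuned implementation such as scikit-learn's `KMeans` would be a more practical choice than this loop.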
- Anomaly Detection. -- my note
- Data Aggregation -- my note
- Data Overview. -- my note
- Data Visualization.
- Model evaluation.
- Preprocessing (texts, images, dates & times, structured data). -- my note
- Testing. -- my note
- Web Scraping.
- GraphQL -- an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data.
- Python -- an interpreted, high-level, general-purpose programming language -- my note.
- R -- a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.
- Scala -- a general-purpose programming language providing support for functional programming and a strong static type system.
- SQL -- a domain-specific language used in programming and designed for managing data held in a relational database management system, or for stream processing in a relational data stream management system.
- Apache Airflow -- my note
- Docker -- a set of platform as a service products that use OS-level virtualization to deliver software in packages called containers -- my note
- Google Colab -- a free cloud service, based on Jupyter Notebooks for machine-learning education and research -- my note.
- Google Kubernetes
- Hadoop -- a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation.
- Kaggle -- an online community of data scientists and machine learners, owned by Google.
- PostgreSQL (Postgres) -- a free and open-source relational database management system emphasizing extensibility and technical standards compliance.
- Spark -- an open-source distributed general-purpose cluster-computing framework.
- Bash -- my note
- Git -- a distributed version-control system for tracking changes in source code during software development -- my note.
- Markdown -- a lightweight markup language with plain text formatting syntax -- my note.
- Jupyter Notebook -- an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text -- my note.
- Trello -- a web-based Kanban-style list-making application.
A "ticked" library doesn't mean that I know or understand all of it (but I can easily use it with the help of its documentation)!
- D3js -- a JavaScript library for producing dynamic, interactive data visualizations in web browsers.
- Keras -- an open-source neural-network library written in Python.
- Matplotlib -- a plotting library for the Python programming language and its numerical mathematics extension NumPy. -- my note
- NumPy -- a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. -- my note
- OpenCV -- a library of programming functions mainly aimed at real-time computer vision.
- Pandas -- a software library written for the Python programming language for data manipulation and analysis. -- my note
- Plotly -- the front-end for ML and data science models.
- PyTorch -- my note
- Seaborn -- a Python data visualization library based on matplotlib.
- Scikit-learn -- a free software machine learning library for the Python programming language.
- TensorFlow -- a free and open-source software library for dataflow and differentiable programming across a range of tasks.
The "non-checked" courses are still in progress!
- Advanced Data Science with IBM Specialization on Coursera.
- Advanced Machine Learning with TensorFlow on Google Cloud Platform Specialization by Google Cloud Training on Coursera.
- Advanced Statistics for Data Science Specialization by Johns Hopkins University on Coursera.
- Anomaly Detection in Time Series Data with Keras by Coursera Project Network. -- my certificate
- CS231n: Convolutional Neural Networks for Visual Recognition by Stanford.
- Data Science Path on Codecademy. It contains 27 sub-courses covering all necessary knowledge about data science -- my certificate -- notes & codes.
- Data Scientist path & Data Engineer path on Dataquest. Both of them contain many sub-courses covering all about Data Science -- my note -- my certificate
- Deep Learning Specialization by Andrew Ng on Coursera. It contains 5 courses covering the foundations of Deep Learning (CNN, RNN, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, ...). It also includes many case-study projects -- my note -- my certificate.
- fast.ai's courses for Machine Learning and Deep Learning.
- IBM AI Engineering Professional Certificate on Coursera.
- IBM Data Science Professional Certificate on Coursera. It contains 9 sub-courses covering fundamental knowledge about data science -- my note -- my certificate.
- Introduction to Statistics with NumPy on Codecademy -- my certificate.
- Learn Python 3 on Codecademy -- my note -- my certificate.
- Learn SQL on Codecademy -- my certificate.
- Machine Learning by Andrew Ng on Coursera. It introduces a general idea about ML and some commonly used algorithms -- my note -- my certificate.
- Machine Learning Crash Course by Google.
- Machine Learning with TensorFlow on Google Cloud Platform Specialization by Google Cloud Training on Coursera.
- MIT Deep Learning
- Shervine Amidi's courses about Machine Learning, Deep Learning, AI, Stats (Stanford University)
- TensorFlow: Data and Deployment Specialization by deeplearning.ai on Coursera.
- TensorFlow in Practice Specialization by deeplearning.ai on Coursera. -- my note -- my certificate.
- TensorFlow Tutorials.
The "non-checked" books are still in progress!
- An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani.
- Deep Learning with Python by François Chollet.
- Dive into Deep Learning -- An interactive deep learning book with code, math, and discussions, based on the NumPy interface. -- GitHub.
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd edition) by Aurélien Géron.
- Machine Learning Yearning by Andrew Ng.
- Practical Machine Learning: A New Look at Anomaly Detection -- Ted Dunning & Ellen Friedman
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
- Awesome lists:
- Awesome Anomaly Detection -- A curated list of awesome anomaly detection resources.
- Awesome Big Data -- A curated list of awesome big data frameworks, resources and other awesomeness.
- Awesome Data Engineering -- A curated list of data engineering tools for software developers.
- Awesome Deep Learning -- A curated list of awesome Deep Learning tutorials, projects and communities.
- Awesome Deep learning papers and other resources -- Deep Learning and deep reinforcement learning research papers and some codes.
- Awesome Machine Learning -- A curated list of awesome Machine Learning frameworks, libraries and software.
- Awesome Public Datasets -- A topic-centric list of HQ open datasets.
- 120 Data Science Interview Questions -- Answers to 120 commonly asked data science interview questions.
- A Machine Learning Course with Python -- Machine Learning Course with Python. Refer to the course page for step-by-step explanations.
- Python Data Science Handbook -- Python Data Science Handbook: full text in Jupyter Notebooks.
- Homemade Machine Learning -- Python examples of popular machine learning algorithms with interactive Jupyter demos and math being explained.
- TensorFlow-Course -- Simple and ready-to-use tutorials for TensorFlow.
- Machine Learning & Deep Learning Tutorials -- ML and DL tutorials, articles and other resources.
- 100-Days-Of-ML-Code.
- Data science blogs -- A curated list of data science blogs.
- data-science-ipython-notebooks -- DS Python notebooks: DL (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
- Papers With Code -- a free and open resource with Machine Learning papers, code and evaluation tables.
- Chris Albon's notes -- Notes On Using Data Science & Artificial Intelligence To Fight For Something That Matters.
- Seeing Theory -- A visual introduction to probabilities and statistics.
- Collection of useful articles for understanding concepts in ML, AI and DS.
The descriptions of terms in this site are borrowed from Wikipedia.