
📊 All the courses, assignments, exercises, mini-projects and books that I have completed so far while teaching myself Machine Learning and Data Science.


ceydatekin/data-science-learning


📊 data-science-learning

A list of what I have finished so far while teaching myself Machine Learning and Data Science.

🔥 Projects

  • Setting up a café in Ho Chi Minh City -- find the best place to set up a new business -- article -- source.
  • Titanic: Machine Learning from Disaster (from Kaggle) -- predict which passengers survived the Titanic shipwreck -- source.

I also do some mini-projects to understand the concepts. You can find the HTML files (exported from the corresponding Jupyter Notebook files) and "Open in Colab" files for the mini-projects below here.

🎲 Tasks

  • Anomaly Detection -- my note
  • Data Aggregation -- my note
  • Data Overview -- my note
  • Data Visualization
  • Model Evaluation
  • Preprocessing (texts, images, dates & times, structured data) -- my note
  • Testing -- my note
  • Web Scraping

🐍 Programming Languages

  • GraphQL -- an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data.
  • Python -- an interpreted, high-level, general-purpose programming language -- my note.
  • R -- a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing.
  • Scala -- a general-purpose programming language providing support for functional programming and a strong static type system.
  • SQL -- a domain-specific language used in programming and designed for managing data held in a relational database management system, or for stream processing in a relational data stream management system.

⚙️ Frameworks & Platforms

  • Apache Airflow -- my note
  • Docker -- a set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers -- my note
  • Google Colab -- a free cloud service based on Jupyter Notebooks, for machine-learning education and research -- my note.
  • Google Kubernetes
  • Hadoop -- a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation.
  • Kaggle -- an online community of data scientists and machine learning practitioners, owned by Google.
  • PostgreSQL (Postgres) -- a free and open-source relational database management system emphasizing extensibility and technical standards compliance.
  • Spark -- an open-source distributed general-purpose cluster-computing framework.

⚒️ Tools

  • Bash -- my note
  • Git -- a distributed version-control system for tracking changes in source code during software development -- my note.
  • Markdown -- a lightweight markup language with plain text formatting syntax -- my note.
  • Jupyter Notebook -- an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text -- my note.
  • Trello -- a web-based Kanban-style list-making application.

📚 Libraries & Frameworks

A "ticked" library doesn't mean that I know or understand all of it (but I can easily use it with its documentation)!

  • D3.js -- a JavaScript library for producing dynamic, interactive data visualizations in web browsers.
  • Keras -- an open-source neural-network library written in Python.
  • Matplotlib -- a plotting library for the Python programming language and its numerical mathematics extension NumPy -- my note
  • NumPy -- a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays -- my note
  • OpenCV -- a library of programming functions mainly aimed at real-time computer vision.
  • Pandas -- a software library written for the Python programming language for data manipulation and analysis -- my note
  • Plotly -- the front-end for ML and data science models.
  • PyTorch -- my note
  • Seaborn -- a Python data visualization library based on Matplotlib.
  • Scikit-learn -- a free software machine learning library for the Python programming language.
  • TensorFlow -- a free and open-source software library for dataflow and differentiable programming across a range of tasks.

👨‍🏫 Courses

The "non-checked" courses are still in progress!

📖 Books

The "non-checked" books are still in progress!

🤖 GitHub repositories

🌏 Other resources


The descriptions of terms on this site are borrowed from Wikipedia.

Languages

  • Jupyter Notebook 56.1%
  • HTML 42.6%
  • MATLAB 1.3%