[SMTB-2022 course] How to teach 🤖 machines: simple ML examples.

  • This README is also available in Russian (original version).
  • Teaching materials online: GitHub repo (or the SMTB course folder).
  • Discord: co05-как-учить-машины-простые-примеры-про-ml

Status: SMTB-2022 track is over! I am always interested in improving the course for the next season (and it is better to do so in advance), so any comments / questions / suggestions are very welcome. Feel free to drop me an email any time.

[ 𝚺 ] Summary

Course point: Machine learning (ML) methods have a wide variety of applications: recognizing faces and car license plates, recommending movies and books, predicting accident probabilities or optimal protein folding structures, and the list goes on. (See a separate SMTB course on that protein folding thing!)

In this course we will look at simple examples and figure out how some fundamental ML models work. In particular, we will consider models that take some data as input and:

  • predict a numeric value (linear regression);
  • predict a yes-or-no answer (logistic regression);
  • capture non-linear relations (neural networks).

For each of these cases, we will do two things. First, we will consider a very simplified model to build an intuitive understanding of how it works. Second, we will build a small but practically reasonable model of that type, for some sort of realistic setting.

The main goal of the course is to build some confidence and encourage students to dive deeper into the topic if / when it becomes necessary.

Timeframe: Four classes, 50 minutes each. No mandatory home assignments.

Prerequisites: Working knowledge of high-school math (“Derivatives” and the “Chain rule”) would help, but is not strictly necessary: we will try to discuss the intuition behind the required concepts and terms. Examples will be implemented in Python, but programming skills are not really required; a willingness to read the existing code is enough.

Tech needed: a working local Python installation along with Jupyter Notebook will help. However, you will be just fine using Google Colab (in which case you only need a web browser and a free Google account). I also provide links to pre-rendered notebooks (nbviewer), so you can review them briefly without running Jupyter at all.

[ ☰ ] Course outline

The course comprises three topics, corresponding to three types of models. (The list is not exhaustive, of course; these are just illustrations of some fundamental models.)

Topic ① Linear Regression

Predicting linear relations.
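To give a very rough feel for the idea, here is a minimal sketch (not from the course notebooks; the data and numbers are made up for illustration) that fits a line to noisy data with numpy:

```python
import numpy as np

# Hypothetical data: y depends (roughly) linearly on x, plus some noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.5 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

# Fit y ≈ a * x + b by least squares.
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"estimated slope a ≈ {a:.2f}, intercept b ≈ {b:.2f}")
```

The estimates should land close to the “true” slope 2.5 and intercept 1.0 used to generate the data.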

Topic ② Logistic Regression

Now: what to do if we want to predict a yes-or-no answer?
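As a rough illustration (again, a sketch rather than the course code; the weights below are made up), the key trick is to squeeze a linear score through the sigmoid function, so the output can be read as the probability of a “yes”:

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# A made-up linear score: w and b would normally be learned from data.
w, b = 1.7, -4.0
x = np.array([1.0, 2.0, 3.0, 4.0])

p_yes = sigmoid(w * x + b)   # predicted probability of the "yes" class
print(np.round(p_yes, 2))    # small x -> close to 0, large x -> close to 1
```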

Topic ③ Neural Network (two sessions)

Non-linear relations and logits stacked one upon another – a three-node example.

  • 📝 practice: handwriting recognition (with “sticky tape and numpy”, and then with torch).
  • 📓 jupyter notebooks:
  • 💾 data: downloaded automatically with PyTorch (we will need torchvision); see the sketch below.
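To give a taste of the torch part, here is a simplified sketch (not the course notebook itself) of downloading MNIST via torchvision and running one training pass of a tiny fully-connected network; the layer sizes and hyperparameters are arbitrary:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Download MNIST (handwritten digits); torchvision fetches it automatically.
train_data = datasets.MNIST(
    root="data", train=True, download=True,
    transform=transforms.ToTensor(),
)
loader = DataLoader(train_data, batch_size=64, shuffle=True)

# A small fully-connected network: 28*28 pixels -> 32 hidden units -> 10 digit classes.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One pass over the data, just to show the shape of a training loop.
for images, labels in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```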

[ 👓 ] Further reading

The topic is way, way too broad: whatever we list here would not be enough 🤷. But let us try to highlight a few things that I (very subjectively) like:

  • Linear Algebra:
  • Probabilities and Co. – there are a few links in my previous-year intro course (make sure to check out the wonderful numerical illustration!). I would recommend trying different courses (Coursera / EdX / stepik / etc.) and perhaps different books, and seeing what works for you.
  • In general, about “Analytics”: I liked The Analytics Edge EdX course a lot (see also the book with the same title, by Bertsimas, O’Hair, and Pulleyblank). The wine and Framingham Risk Score examples (and a few other pretty cool ones!) are described there.
  • Also a few links on Calculus came up in a discussion (thanks, Alexey Matyash!):
  • Finally, on ML in general: I would recommend checking out the numerous “what to read” discussions on Reddit, Quora, and so on, and seeing what works best for you. I liked a few courses on Coursera / EdX, but unfortunately not all of them are available these days. One could start with the classic course by Andrew Ng on Coursera, or even CS229. In general, in my opinion, the various “certificates” are more or less useless, unlike the practical experience / projects you do during these courses. Practice (even on relatively simple projects) does add understanding and, as a bonus, leaves you with some code on GitHub that you can show just in case.
