ml curriculum
rasbt committed Nov 12, 2015
1 parent 3955f23 commit b8cec4b
Showing 14 changed files with 44 additions and 2 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -83,7 +83,7 @@ Excerpts from the [Foreword](./docs/foreword_ro.pdf) and [Preface](./docs/prefac
- [How important do you think having a mentor is to the learning process?](./faq/mentor.md)
- [Where are the best online communities centered around data science/machine learning or python?](./faq/ml-python-communities.md)
- [How would you explain machine learning to a software engineer?](./ml-to-a-programmer.md)

- [How would your curriculum for a machine learning beginner look like?](./ml-curriculum.md)

### Questions about ML Concepts

1 change: 1 addition & 0 deletions faq/README.md
@@ -27,6 +27,7 @@ Sebastian
- [How important do you think having a mentor is to the learning process?](./mentor.md)
- [Where are the best online communities centered around data science/machine learning or python?](./ml-python-communities.md)
- [How would you explain machine learning to a software engineer?](./ml-to-a-programmer.md)
- [How would your curriculum for a machine learning beginner look like?](./ml-curriculum.md)

### Questions about Machine Learning Concepts

2 changes: 1 addition & 1 deletion faq/datamining-vs-ml.md
@@ -1,4 +1,4 @@
# What are differences in research nature between the two fields: Machine Learning & Data Mining?
# What are differences in research nature between the two fields: Machine Learning & Data Mining?

In a nutshell, Data Mining is about the discovery of patterns in datasets or "gaining knowledge and insights" from data. Machine Learning is closely related, though. We can think of Machine Learning algorithms as one of the workhorses of Data Mining; most Data Mining approaches are based on Machine Learning algorithms. Maybe it helps to think of Data Mining as a pipeline of steps and approaches, where the use of a Machine Learning algorithm is one part of this pipeline.
Or in other words, Data Mining is not "just" Machine Learning. E.g., data visualization or summarization is also part of Data Mining. What I was trying to say is that Machine Learning is one part, one set of techniques, that is/are being used in Data Mining.
40 changes: 40 additions & 0 deletions faq/ml-curriculum.md
@@ -0,0 +1,40 @@
# How would your curriculum for a machine learning beginner look like?
If I had to put together a study plan for a beginner, I would probably start with an easy-going intro course such as

- Andrew Ng's [Machine Learning course on Coursera](https://class.coursera.org/ml-005/lecture)

![](./ml-curriculum/ng.png)

Next, I would recommend a good intro book on 'Data Mining' (data mining is essentially about extracting knowledge from data, mainly using machine learning algorithms). I can highly recommend the following book written by one of my former professors:

- P.-N. Tan, M. Steinbach, and V. Kumar. [Introduction to Data Mining](http://www-users.cs.umn.edu/~kumar/dmbook/index.php), (First Edition). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2005.

![](./ml-curriculum/tan.jpeg)

This book will provide you with a great overview of what's currently out there; you will not only learn about different machine learning techniques, but also learn how to "understand," "handle," and interpret data. Remember: without "good," informative data, a machine learning algorithm is practically worthless. Additionally, you will learn about alternative techniques, since machine learning is not always the only or the best solution to a problem.

> if all you have is a hammer, everything looks like a nail ...

Now, after completing the Coursera course and reading the Data Mining book, you will have a basic understanding of ML and a broader view of the field.
I don't want to self-advertise here, but I think my book would be a good follow-up to learn ML in more depth, understand the algorithms, learn about different data processing pipelines, evaluation techniques, and best practices, and learn how to put it into action using Python, NumPy, scikit-learn, and Theano so that you can start working on your personal projects.

![](./ml-curriculum/raschka.jpeg)

While you work on your individual projects, I would suggest deepening your (statistical learning) knowledge with one of the three books below:


- T. Hastie, R. Tibshirani, and J. Friedman. [The Elements of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/), second edition. Springer, 2009.
- C. M. Bishop. [Pattern Recognition and Machine Learning](http://www.springer.com/us/book/9780387310732). Springer, New York, 2006.
- Duda, Richard O., Peter E. Hart, and David G. Stork. [Pattern classification](http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471056693.html). John Wiley & Sons, 2012.

![](./ml-curriculum/three.png)

When you are through all of that and still hungry to learn more, I recommend

- [the Deep Learning book](http://www.iro.umontreal.ca/~bengioy/dlbook/) by Yoshua Bengio, Ian Goodfellow, and Aaron Courville. The release date is set around 2016, but the 613-page manuscript is already available as of today (online and for free).

![](./ml-curriculum/bengio.png)

- And in between, if you are looking for a less technical yet very inspirational free-time read, I highly recommend [Pedro Domingos's The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World](https://homes.cs.washington.edu/~pedrod/)

![](./ml-curriculum/domingos.png)
Binary file added faq/ml-curriculum/bengio.png
Binary file added faq/ml-curriculum/bishop.jpeg
Binary file added faq/ml-curriculum/domingos.png
Binary file added faq/ml-curriculum/duda.jpg
Binary file added faq/ml-curriculum/ng.png
Binary file added faq/ml-curriculum/raschka.jpeg
Binary file added faq/ml-curriculum/tan.jpeg
Binary file added faq/ml-curriculum/three.png
Binary file added faq/ml-curriculum/tibshirani.jpeg
1 change: 1 addition & 0 deletions faq/ml-to-a-programmer.md
@@ -30,6 +30,7 @@ In machine learning, we take data (e.g., e-mails), provide information about the


**machine learning:**

- results + data -> machine learning algorithm + computer -> set of rules
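
To make the arrow diagram above a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy data (number of exclamation marks per e-mail), the function name `learn_threshold_rule`, and the choice of a single-threshold rule as the "algorithm." It is not how a real spam filter works, but it shows results (labels) plus data going in and a rule coming out.

```python
# Hypothetical toy data: (number of exclamation marks in an e-mail, known result).
emails = [(0, "ham"), (1, "ham"), (2, "ham"), (5, "spam"), (7, "spam"), (9, "spam")]


def learn_threshold_rule(examples):
    """Pick the threshold t for the rule 'spam if count >= t' that makes
    the fewest mistakes on the labeled examples."""
    candidates = sorted({count for count, _ in examples})
    best_threshold, best_errors = None, len(examples) + 1
    for t in candidates:
        # Count examples where the rule's prediction disagrees with the label.
        errors = sum((count >= t) != (label == "spam") for count, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# results + data -> learning algorithm -> rule
threshold = learn_threshold_rule(emails)


def rule(count):
    """The learned 'set of rules': here just one threshold test."""
    return "spam" if count >= threshold else "ham"


print(threshold, rule(6), rule(1))
```

Note that nobody wrote the threshold by hand; it was derived from the labeled examples, which is the point of the diagram.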

