<h1 align="center">INTRODUCTION TO MACHINE LEARNING</h1>
<h2 align="left"><ins>Lesson Guide</ins></h2>

- [**A (VERY) BRIEF HISTORY OF STATISTICAL LEARNING**](#history)
- [**WHAT IS MACHINE LEARNING?**](#ml)
    - [**Machine Learning vs Deep Learning vs Neural Networks**](#vs)
    - [**How Do Machine Learning Algorithms Work?**](#work)
        - [**Learning a Function**](#learn)
        - [**Learning a Function To Make Predictions**](#func)
        - [**Techniques For Learning a Function**](#tech)
        - [**Applications of Machine Learning**](#app)
- [**REFERENCES**](#ref)

<a id='history'></a>
<h2 align="center">A (VERY) BRIEF HISTORY OF STATISTICAL LEARNING</h2>

- Early 19th century - Gauss and Legendre independently discovered the method of least squares.
- Sir Francis Galton - Regression to the mean (tall parents -> less tall children) and correlation. Cousin of Charles Darwin. Coined the term "eugenics" and discouraged people he judged to be of low intelligence from reproducing.
- Karl Pearson - Student of Galton. Father of mathematical statistics. Eugenicist and racist.
- Ronald Fisher - Father of modern statistics and experimental design. Developed ANOVA. Also a eugenicist and racist.

Machine Learning Timeline
- 1930's: Linear discriminant analysis - First classification method developed by Fisher (1936)
- 1950's: Perceptron and Neural Networks - Frank Rosenblatt
- 1960's: Nearest Neighbor, K-means clustering
- 1970's: Logistic regression
- 1980's: Decision Trees and other non-linear methods
- 1990's: Support Vector Machines (Vapnik)
- 2000's: Random Forest (Breiman), Deep Learning (Hinton)

<img src="./images/What is ML/1. History_of_artificial_intelligence.jpg" width=400 height=400/>

<a id='ml'></a>
<h2 align="center">WHAT IS MACHINE LEARNING?</h2>

**Machine learning** is a branch of artificial intelligence (AI) and computer science that focuses on using data and algorithms to imitate the way humans learn, gradually improving in accuracy.

Machine learning is an important component of the growing field of data science. Using statistical methods, algorithms that iteratively learn from the data are trained to make classifications or predictions, uncovering key insights within data mining projects without being explicitly programmed where to look. These insights subsequently drive decision making within applications and businesses, ideally impacting key growth metrics. As big data continues to grow, market demand for data scientists is expected to increase; they will be needed to help identify the most relevant business questions and the data required to answer them.

<a id='vs'></a>
<h3><ins>Machine Learning vs Deep Learning vs Neural Networks</ins></h3>

Since deep learning and machine learning tend to be used interchangeably, it’s worth noting the nuances between the two. Machine learning, deep learning, and neural networks are all sub-fields of artificial intelligence. More precisely, neural networks are a sub-field of machine learning, and deep learning is a sub-field of neural networks.

Deep learning and classical machine learning differ in how each algorithm learns. Deep learning automates much of the feature extraction piece of the process, eliminating some of the manual human intervention required and enabling the use of larger data sets. You can think of deep learning as "scalable machine learning". Classical, or "non-deep", machine learning is more dependent on human intervention to learn. Human experts determine the set of features used to distinguish between data inputs, which usually requires more structured data.

"Deep" machine learning can use labeled datasets, an approach known as supervised learning, to inform its algorithm, but it doesn’t necessarily require a labeled dataset. It can ingest unstructured data in its raw form (e.g. text, images) and automatically determine the set of features that distinguish different categories of data from one another. Unlike classical machine learning, it needs much less human intervention to process data, allowing us to scale machine learning in more interesting ways. Deep learning and neural networks are primarily credited with accelerating progress in areas such as computer vision, natural language processing, and speech recognition.

Neural networks, or artificial neural networks (ANNs), are composed of node layers: an input layer, one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to others and has an associated weight and threshold. If the output of any individual node is above the specified threshold value, that node is activated, sending data to the next layer of the network. Otherwise, no data is passed along to the next layer of the network. The “deep” in deep learning refers to the depth of layers in a neural network. A neural network that consists of more than three layers, inclusive of the input and output layers, can be considered a deep learning algorithm or a deep neural network. A neural network that has only two or three layers is just a basic neural network.
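
The weight-and-threshold behaviour described above can be sketched in a few lines. This is a minimal illustration, not a practical network: the names `step_neuron`, `forward`, and `xor_network` are made up for this example, the weights are hand-picked rather than learned, and "no data is passed along" is represented as an output of 0.

```python
def step_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


def forward(inputs, layers):
    """Pass inputs through successive layers of threshold neurons.

    `layers` is a list of layers; each layer is a list of
    (weights, threshold) pairs, one pair per neuron.
    """
    activations = inputs
    for layer in layers:
        activations = [step_neuron(activations, w, t) for w, t in layer]
    return activations


# Hypothetical hand-picked weights: a hidden layer with two neurons and one output neuron.
# Hidden neuron 1 fires for "x1 OR x2", hidden neuron 2 fires for "x1 AND x2".
# The output neuron fires when OR is on and AND is off, which is XOR.
xor_network = [
    [([1.0, 1.0], 0.5), ([1.0, 1.0], 1.5)],  # hidden layer
    [([1.0, -1.0], 0.5)],                    # output layer
]

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), forward([x1, x2], xor_network))
```

With these weights the four input pairs (0, 0), (0, 1), (1, 0), (1, 1) produce 0, 1, 1, 0. Counting the input layer, this toy network has three layers, so by the definition above it is a basic rather than a deep neural network.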

<img src="./images/What is ML/2. Ai_vs_ml.png" width=400 height=400/>

<a id='work'></a>
<h3><ins>How Do Machine Learning Algorithms Work?</ins></h3>

1. **A Decision Process:** In general, machine learning algorithms are used to make a prediction or classification. Based on some input data, which can be labeled or unlabeled, your algorithm will produce an estimate about a pattern in the data.
2. **An Error Function:** An error function evaluates the model's predictions. If there are known examples, it compares them with the predictions to assess the accuracy of the model.
3. **A Model Optimization Process:** If the model can fit the data points in the training set better, the weights are adjusted to reduce the discrepancy between the known examples and the model estimates. The algorithm repeats this evaluate-and-optimize process, updating weights autonomously until a threshold of accuracy has been met.
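
The three parts above can be sketched with a one-variable linear model trained by gradient descent. This is only one of many possible setups: the function names, the toy data, the learning rate, and the stopping tolerance are all assumptions chosen for illustration.

```python
def predict(x, w, b):
    """Decision process: produce an estimate from the input."""
    return w * x + b


def mean_squared_error(xs, ys, w, b):
    """Error function: compare the estimates against known examples."""
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, learning_rate=0.05, tolerance=1e-6, max_steps=10_000):
    """Optimization: adjust the weights until the error falls below a threshold."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(max_steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(x, w, b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        if mean_squared_error(xs, ys, w, b) < tolerance:
            break
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration, no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, mse={mean_squared_error(xs, ys, w, b):.2e}")
```

Because the toy data has no noise, the loop can drive the error below the tolerance and stop early; with noisy data the error would level off above zero and `max_steps` would end the loop instead.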

>"...think of the model as the specific representation learned from data and the algorithm as the process for learning it.
>$$\text{Model} = \text{Algorithm}(\text{Data})$$
>For example, a decision tree or a set of coefficients are a model and the C5.0 and Least Squares Linear Regression are algorithms to learn those respective models."
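
To make the distinction in the quote concrete, here is a small sketch in which the algorithm is ordinary least squares for a single input variable and the model is the pair of coefficients it returns. The function name `least_squares` and the data points are made up for this example.

```python
def least_squares(data):
    """Algorithm: ordinary least squares for one input variable.

    Takes a list of (x, y) pairs and returns the learned model,
    here just the pair of coefficients (slope, intercept).
    """
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in data)
    variance = sum((x - mean_x) ** 2 for x, _ in data)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for illustration.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

model = least_squares(data)  # Model = Algorithm(Data)
slope, intercept = model
print(f"model: y = {slope:.2f} * x + {intercept:.2f}")
```

Running the same algorithm on different data would return a different model, which is the point of the formula.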

<a id="learn"></a>
<h5 style="text-decoration:underline">Learning a Function</h5>

Machine learning algorithms are described as learning a target function ($f$) that best maps input variables (X) to an output variable (Y).
$$Y = f(X)$$ 
This is a general learning task where we would like to make predictions in the future (Y) given new examples of input variables (X). We don’t know what the function ($f$) looks like or its form. If we did, we would use it directly and would not need to learn it from data using machine learning algorithms. Learning it is harder than it may seem. There is also error (e) that is independent of the input data (X).
$$Y = f(X) + e$$
This error can arise, for example, from not having enough attributes to sufficiently characterize the best mapping from X to Y. It is called irreducible error because no matter how good we get at estimating the target function ($f$), we cannot reduce it.
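
A short simulation can illustrate why this error is irreducible. In this sketch the target function `f`, the noise level `noise_std`, and the sample size are all made up. In a real problem $f$ is unknown, but here we cheat and predict with the true $f$ to show that some error remains even then.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def f(x):
    """A made-up 'true' target function; in practice it is unknown."""
    return 3.0 * x + 2.0


noise_std = 0.5  # standard deviation of the error term e (an assumption)

xs = [random.uniform(0.0, 10.0) for _ in range(10_000)]
ys = [f(x) + random.gauss(0.0, noise_std) for x in xs]  # Y = f(X) + e

# Even predicting with the true f leaves an error close to the variance of e.
mse_true_f = sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(f"MSE using the true f: {mse_true_f:.3f}")
print(f"Variance of e:        {noise_std ** 2:.3f}")
```

The two printed numbers should be close. No estimate of $f$ can be expected to achieve a mean squared error below the variance of e on new data.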

<a id="func"></a>
<h5 style="text-decoration:underline">Learning a Function To Make Predictions</h5>

The most common type of machine learning is to learn the mapping $Y = f(X)$ to make predictions of Y for new X. This is called predictive modeling or predictive analytics and our goal is to make the most accurate predictions possible.<br>
&emsp;&emsp;&emsp;As such, we are not really interested in the shape and form of the function ($f$) that we are learning, only that it makes accurate predictions. Alternatively, we could learn the mapping $Y = f(X)$ to understand the relationship in the data better; this is called statistical inference. If that were the goal, we would use simpler methods and value understanding the learned model and the form of ($f$) above making accurate predictions.<br>
&emsp;&emsp;&emsp;When we learn a function ($f$) we are estimating its form from the data that we have available. As such, this estimate will have error. It will not be a perfect estimate of the underlying hypothetical best mapping from X to Y. Much time in applied machine learning is spent attempting to improve the estimate of the underlying function and, in turn, the performance of the predictions made by the model.
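
The sketch below shows an estimate of $f$ being learned from a finite, noisy sample and then used to predict for a new input. The underlying function `true_f`, the noise level, and the sample size are made up for illustration, and a least-squares line is just one possible estimator.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def true_f(x):
    """Made-up underlying function, unknown in a real problem."""
    return 3.0 * x + 2.0


def sample(n):
    """Draw n noisy observations of Y = f(X) + e."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_f(x) + random.gauss(0.0, 2.0) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Estimate f with a least-squares line; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


xs, ys = sample(200)
slope, intercept = fit_line(xs, ys)

# Prediction: use the estimate on a new input and compare with the true f.
new_x = 4.0
prediction = slope * new_x + intercept
print(f"estimated f(x) = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction at x={new_x}: {prediction:.2f} (true f gives {true_f(new_x):.2f})")
```

The estimated coefficients land near, but not exactly on, the true values of 3 and 2, which is the estimation error described above. Reading the fitted slope and intercept themselves, rather than only the prediction, corresponds to the inference use of the same model.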

<a id="tech"></a>
<h5 style="text-decoration:underline">Techniques For Learning a Function</h5>

Machine learning algorithms are techniques for estimating the target function ($f$) to predict the output variable (Y) given input variables (X). Different representations make different assumptions about the form of the function being learned, such as whether it is linear or nonlinear. Different machine learning algorithms make different assumptions about the shape and structure of the function and how best to optimize a representation to approximate it. This is why it is so important to try a suite of different algorithms on a machine learning problem: we cannot know beforehand which approach will be best at estimating the structure of the underlying function we are trying to approximate.
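
As a rough sketch of trying a suite of algorithms, the example below fits three simple estimators with different assumptions to the same made-up nonlinear data and compares them on held-out points. The data-generating function, the split sizes, the choice of `k`, and all function names are assumptions for this illustration.

```python
import math
import random

random.seed(2)  # fixed seed so the run is repeatable

# Made-up nonlinear data: Y = sin(X) + e.
xs = [random.uniform(0.0, 6.0) for _ in range(300)]
ys = [math.sin(x) + random.gauss(0.0, 0.1) for x in xs]
train_x, holdout_x = xs[:200], xs[200:]
train_y, holdout_y = ys[:200], ys[200:]


def fit_mean(train_x, train_y):
    """Baseline: always predict the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def fit_linear(train_x, train_y):
    """Least-squares line: assumes f is linear."""
    mean_x = sum(train_x) / len(train_x)
    mean_y = sum(train_y) / len(train_y)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
             / sum((x - mean_x) ** 2 for x in train_x))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_knn(train_x, train_y, k=5):
    """k-nearest neighbours: assumes only that nearby inputs have similar outputs."""
    pairs = list(zip(train_x, train_y))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def holdout_mse(model):
    """Mean squared error of a fitted model on the held-out points."""
    return sum((model(x) - y) ** 2
               for x, y in zip(holdout_x, holdout_y)) / len(holdout_x)


candidates = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}
scores = {name: holdout_mse(fit(train_x, train_y)) for name, fit in candidates.items()}

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:>6}: held-out MSE = {score:.4f}")
```

On this particular dataset the nearest-neighbour estimator should score best because the data was generated from a nonlinear function. On data that really is linear the ranking would likely change, which is why the comparison has to be run rather than assumed.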

<a id="app"></a>
<h5 style="text-decoration:underline">Applications of Machine Learning</h5>

<img src="./images/What is ML/3. Applications-of-ML.png" width=500 height=500/>

<a id="ref"></a>
<h3><ins>REFERENCES</ins></h3>

<ins>BOOKS</ins>
- [Master Machine Learning Algorithms by Jason Brownlee](https://machinelearningmastery.com/master-machine-learning-algorithms/)

<ins>ARTICLES / WEBSITES</ins>
- [Difference between machine-learning and deep-learning](https://www.geospatialworld.net/blogs/difference-between-ai%EF%BB%BF-machine-learning-and-deep-learning/)
- [IBM cloud education](https://www.ibm.com/au-en/cloud/learn/machine-learning#toc-machine-le-K7VszOk6)
- [Very Very Good Article on Machine Learning Algorithms](https://blogs.sas.com/content/subconsciousmusings/2020/12/09/machine-learning-algorithm-use/)
- [Inference vs Prediction by Matthias Doring](https://www.datascienceblog.net/post/commentary/inference-vs-prediction/)
- [Statology - Machine Learning Tutorials](https://www.statology.org/machine-learning-tutorials/)