# MCSE 6309 and EMoS 6309: 
# Machine Learning
## Introduction
### June, 2019

## What is Machine Learning?
Machine Learning is about designing algorithms that __automatically__ extract valuable information from data.

>Machine learning is the principal technology underpinning the recent advances in artificial intelligence. Machine learning is perhaps the principal technology behind two emerging domains: data science and artificial intelligence. The rise of machine learning is coming about through the availability of data and computation, but machine learning methodologies are fundamentally dependent on models. (Neil Lawrence, 2017)

$$ \text{data} \quad + \quad \text{model} \quad \xrightarrow{\text{compute}} \quad \text{prediction}
$$
* __data__: observations, which could be actively or passively acquired (meta-data).
* __model__: assumptions, based on previous experience (other data! transfer learning etc.), or beliefs about the regularities of the universe. Inductive bias.
* __prediction__: an action to be taken, a categorization, or a quality score.
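
The three ingredients above can be sketched in a few lines of Python. This is only a toy illustration: the observations are made up, and the proportional model `score = w * hours` is an assumption chosen for simplicity, not something the definition prescribes.

```python
# Data: made-up observations of (hours studied, exam score).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Model: the assumption (inductive bias) that score is proportional to hours,
# i.e. score = w * hours for some unknown w.
def model(w, x):
    return w * x

# Compute: choose the w that minimizes the squared error on the data.
# For this one-parameter model the least-squares solution is
# w = sum(x * y) / sum(x * x).
w = sum(x * y for x, y in data) / sum(x * x for x, _ in data)

# Prediction: apply the fitted model to a new, unseen input.
prediction = model(w, 4.0)
print(w, prediction)
```

Changing the model (for example, adding an intercept) changes the prediction even though the data stays the same, which is why the model is described as a set of assumptions.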

## Concepts of Machine Learning
Three concepts are at the core of machine learning: _data, a model, and learning_.

__Data__ is at the core of machine learning. Machine learning is inherently data driven, with the goal of designing general-purpose methodologies that extract valuable patterns from data, ideally without much domain-specific expertise. For example, given a large corpus of documents (e.g., books in many libraries), machine learning methods can be used to automatically find relevant topics that are shared across documents.

To achieve this goal, we design __models__ that are typically related to the process that generates data. For example, in a regression setting, a model would describe a function that maps inputs to real-valued outputs.
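
As one concrete illustration (linear regression is an assumed example here; the notes do not commit to a particular model), a regression model with parameters $\theta$ and $\theta_0$ maps an input $x \in \mathbb{R}^D$ to a real-valued output:

$$ f(x) = \theta^\top x + \theta_0, \qquad f : \mathbb{R}^D \to \mathbb{R}
$$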

A __model__ is said to learn from data if its performance on a given task improves after the data is taken into account. The goal is to find good models that generalize well to yet unseen data, which we may care about in the future. A __model__ is typically used to describe a process for generating data, similar to the dataset at hand. 

A _good model_:
* is a simplified version of the real (unknown) data-generating process, capturing aspects that are relevant for modeling the data and extracting hidden patterns from it.
* can be used to predict what would happen in the real world without performing real-world experiments.
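
A minimal sketch of why generalization matters, assuming made-up data that roughly follows `y = 2x + 1`. It compares a simple line fit with a hypothetical "memorizer" that stores the training pairs and falls back to the mean output for inputs it has never seen. Both models and all names are illustrative choices.

```python
# Made-up data that roughly follows y = 2x + 1.
train = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.1), (3.0, 6.9)]
test = [(4.0, 9.0), (5.0, 11.1)]  # unseen inputs

def fit_line(points):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    a = sxy / sxx
    return a, my - a * mx

def fit_memorizer(points):
    """Store every training pair; answer unknown inputs with the mean output."""
    table = dict(points)
    default = sum(table.values()) / len(table)
    return lambda x: table.get(x, default)

def mse(predict, points):
    """Mean squared error of a predictor on a set of (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)

a, b = fit_line(train)
line = lambda x: a * x + b
memo = fit_memorizer(train)

print("train MSE (line, memorizer):", mse(line, train), mse(memo, train))
print("test MSE  (line, memorizer):", mse(line, test), mse(memo, test))
```

The memorizer is perfect on the training data but poor on the unseen inputs, while the simpler line model carries over to them. This is the sense in which a good model is a useful simplification of the data-generating process.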

__Learning__: a way to automatically find patterns and structure in data by optimizing the parameters of the model.
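
One common way to optimize parameters is gradient descent. The sketch below assumes a linear model `y = w*x + b`, a mean squared error loss, made-up noise-free data, and a hand-picked learning rate. None of these choices come from the notes; they only make the idea concrete.

```python
# Made-up data generated from y = 3x + 1 (no noise, to keep the sketch simple).
data = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]

def loss(w, b):
    """Mean squared error of the model y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def gradient(w, b):
    """Partial derivatives of the loss with respect to w and b."""
    n = len(data)
    dw = sum(2 * (w * x + b - y) * x for x, y in data) / n
    db = sum(2 * (w * x + b - y) for x, y in data) / n
    return dw, db

w, b = 0.0, 0.0        # initial parameter values
learning_rate = 0.05   # assumed step size

# Learning: repeatedly move the parameters against the gradient of the loss.
for step in range(5000):
    dw, db = gradient(w, b)
    w -= learning_rate * dw
    b -= learning_rate * db

print(round(w, 3), round(b, 3), loss(w, b))
```

After enough steps the parameters approach the values that generated the data, so the "pattern" in the data has been recovered purely by optimization.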

## Important Terms for Intuitions
The term __algorithm__ is used in at least two different senses in the context of Machine Learning:
1. A system that makes predictions based on input data. We refer to these algorithms as __predictors__.
2. A system that adapts some internal parameters of the predictor so that it performs well on future unseen input data. We refer to this adaptation as __training__ a system.

Given a dataset and a suitable model, training the model means using the available data to optimize some parameters of the model with respect to a utility function that evaluates how well the model predicts the training data.
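
The two senses of "algorithm" can be kept apart in code. The sketch below assumes a hypothetical one-parameter threshold classifier and uses training accuracy as the utility function. The data and the function names `predictor`, `utility`, and `train` are illustrative.

```python
# Made-up labelled data: (feature value, class label).
data = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]

def predictor(threshold, x):
    """Sense 1: a predictor maps an input to a prediction."""
    return 1 if x > threshold else 0

def utility(threshold, examples):
    """Utility function: fraction of examples predicted correctly."""
    correct = sum(predictor(threshold, x) == y for x, y in examples)
    return correct / len(examples)

def train(examples):
    """Sense 2: training adapts the predictor's parameter to the data."""
    xs = sorted(x for x, _ in examples)
    # Candidate thresholds: midpoints between neighbouring feature values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(xs, xs[1:])]
    # Keep the candidate with the highest utility on the training data.
    return max(candidates, key=lambda t: utility(t, examples))

threshold = train(data)
print(threshold, utility(threshold, data), predictor(threshold, 5.0))
```

Here `train` is the training algorithm and `predictor` is what gets used afterwards on new inputs such as `5.0`.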

## Applications of Machine Learning
* Text classification
* Natural Language Processing
* Computer vision tasks, e.g. image recognition, face recognition
* Medical diagnosis
* Recommendation systems
* Games, e.g. AlphaGo
* Speech recognition, e.g. Siri, Google Home
* Self-driving cars, etc.

## Data Science
>Data Science is the process of formulating a quantitative question that can be answered with data, collecting and cleaning the data, analyzing the data and communicating the answer to the question to a relevant audience. 

### What question are you trying to answer with data?

>We define the field of data science to be the challenge of making sense of
the large volumes of data that have now become available through the
increase in sensors and the large interconnection of the internet.
Phenomena variously known as “big data” or “the internet of things”.
Data science differs from traditional statistics in that this data is not
necessarily collected with a purpose or experiment in mind. It is collected
by happenstance, and we try and extract value from it later. (Neil Lawrence, 2017)

Example: Can we predict the weather for Arusha from historical weather data? 

## Statistics
Statistics is the discipline of analyzing data. It intersects heavily with
data science, machine learning and, of course, traditional statistical
analysis.

Key activities that define the field:
1. Descriptive statistics (EDA, quantification, summarization, clustering)
2. Inference (estimation, sampling, variability, defining populations)
3. Prediction (machine learning, supervised learning)
4. Experimental Design (the process of designing experiments)
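
The first two activities can be sketched with Python's standard `statistics` module. The temperatures below are made up for illustration, and the interval uses a rough normal approximation (mean ± 1.96 standard errors), which is an assumption that is only loosely justified for such a small sample.

```python
import math
import statistics

# Made-up daily maximum temperatures in degrees Celsius (illustrative only).
temps = [24.0, 26.0, 25.0, 27.0, 23.0, 25.0, 26.0, 24.0]

# 1. Descriptive statistics: summarize the sample.
mean = statistics.mean(temps)
median = statistics.median(temps)
stdev = statistics.stdev(temps)  # sample standard deviation (divides by n - 1)

# 2. Inference: estimate the population mean with a rough 95% interval,
# using the normal approximation mean +/- 1.96 * standard error.
std_error = stdev / math.sqrt(len(temps))
interval = (mean - 1.96 * std_error, mean + 1.96 * std_error)

print(mean, median, round(stdev, 3), interval)
```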

## Machine Learning Approaches
Machine learning takes the approach of observing a system in practice
and emulating its behavior with mathematics. A key design decision in
building machine learning solutions is where to put the mathematical
function. Obtaining complex behavior in the resulting system can require
some imagination in the design process.

The classical machine learning approaches:
* supervised learning
* unsupervised learning
* reinforcement learning

1. Supervised Learning
    * Learn a model from a given set of input-output pairs, in order to predict the output of new inputs.
    * Further grouped into __regression__ and __classification__ problems.
2. Unsupervised Learning
    * Discover patterns and learn the structure of unlabelled data.
    * Examples: __distribution modeling__ and __clustering__.
3. Reinforcement Learning
    * Learn what actions to take in a given situation, based on rewards and penalties.
    * Example: consider teaching a dog a new trick. You cannot tell it what to do, but you can reward or punish it.
## Machine Learning vs Statistics

Machine Learning |Traditional statistics
--------------------------|-------------------------------
Emphasizes prediction| Emphasizes superpopulation inference
Evaluates results via prediction performance| Focuses on a priori hypotheses
Concern for overfitting but not model complexity per se| Simpler models preferred over complex ones (parsimony)
Emphasis on performance| Emphasis on parameter interpretability
Generalizability is obtained through performance on novel datasets| Statistical modeling or sampling assumptions
Concern over performance and robustness|Concern over assumptions and robustness

## End-to-end Data Science Approach

STEP 1: Define the goal

STEP 2: Data understanding and preparation
* Importing, cleaning, and manipulating your data
* Visualizing your data

STEP 3: Building your machine learning model
* Feature selection
* Model training
* Model validation

STEP 4: Model deployment
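
The four steps can be walked through on a toy problem. The sketch below is an assumption-laden illustration: the goal, the records, the field names (`morning`, `afternoon`, `station`), and the function `predict_afternoon` are all hypothetical, and "deployment" is reduced to exposing the fitted model as a function.

```python
# STEP 1: goal (hypothetical): predict afternoon temperature from morning temperature.

# STEP 2: data understanding and preparation.
# Made-up records; None marks a missing value.
raw = [
    {"morning": 15.0, "afternoon": 25.0, "station": "A"},
    {"morning": 16.0, "afternoon": 26.5, "station": "A"},
    {"morning": None, "afternoon": 27.0, "station": "B"},
    {"morning": 18.0, "afternoon": 28.0, "station": "B"},
    {"morning": 14.0, "afternoon": 24.5, "station": "A"},
    {"morning": 17.0, "afternoon": None, "station": "B"},
    {"morning": 19.0, "afternoon": 29.5, "station": "A"},
    {"morning": 20.0, "afternoon": 30.0, "station": "B"},
]
# Cleaning: drop records with missing values.
clean = [r for r in raw if r["morning"] is not None and r["afternoon"] is not None]

# STEP 3a: feature selection: keep only the morning temperature as input.
pairs = [(r["morning"], r["afternoon"]) for r in clean]

# Hold out the last two records for validation.
train, valid = pairs[:-2], pairs[-2:]

# STEP 3b: model training: least-squares line y = a*x + b.
n = len(train)
mx = sum(x for x, _ in train) / n
my = sum(y for _, y in train) / n
a = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
b = my - a * mx

# STEP 3c: model validation: mean absolute error on the held-out records.
mae = sum(abs(a * x + b - y) for x, y in valid) / len(valid)

# STEP 4: deployment: expose the fitted model as a function others can call.
def predict_afternoon(morning_temp):
    return a * morning_temp + b

print(len(clean), round(mae, 2), round(predict_afternoon(16.5), 1))
```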

## Mathematical Foundations of Machine Learning
It is important to understand the fundamental mathematical principles upon which more complicated machine learning systems are built. This understanding:
* facilitates creating new machine learning solutions
* helps in understanding and debugging existing approaches
* helps in learning about the inherent assumptions of the methodologies we are working with

## Pillars of Machine Learning
The four pillars of Machine Learning, all of which require a solid mathematical foundation:
1. Regression
2. Dimensionality Reduction
3. Density Estimation
4. Classification



![pillar](figsML/pillar.png)

## Further Reading
1. Neil Lawrence (2017), __[What is Machine Learning?](http://inverseprobability.com/2017/07/17/what-is-machine-learning)__
2. Deisenroth, M. et al. (2019), Mathematics for Machine Learning, _Chapter 1: Introduction and Motivation_. To be published by Cambridge University Press.