# What is Deep Learning?
- Cut through the noise to tell press hype apart from genuinely world-changing developments.

- Questions to be tackled:
  - What has Deep Learning achieved so far?
  - How significant is it?
  - Where are we headed next?
  - Should we believe the hype?

## Artificial Intelligence, Machine Learning and Deep Learning

|<img alt="AI, ML and DL Venn Diagram" src="./images/AI_ML_DL_venn.png" width="400"/>|
|:----:|
|*Artificial Intelligence, Machine Learning and Deep Learning*|

<br><br>

|Artificial Intelligence|Machine Learning|Deep Learning|
|:----------------------|:---------------|:------------|
|<ul><li>Started in the 1950s</li><li>A concise definition: *The effort to automate intellectual tasks normally performed by human beings.*</li><li>Superset of Machine Learning and Deep Learning, which also includes techniques that don't involve any learning.</li><li>Symbolic AI:<ul><li>Handcrafting a large number of explicit rules to manipulate knowledge</li><li>Dominant from the 1950s to the 1980s</li><li>Peaked in popularity in the 1980s, during the *expert systems* boom</li><li>Difficult to use for fuzzy problems, like Image Classification and Speech Recognition.</li><li>Hence Machine Learning arose.</li></ul></li></ul>|<ul><li>Ada Lovelace's remark on the *Analytical Engine*: "The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform... Its province is to assist us in making available what we are already acquainted with."</li><li>Alan Turing, when introducing the *Turing Test*, quoted the above remark as "Lady Lovelace's objection" and came to the conclusion that general-purpose computers could be capable of learning and originality.</li><li>Machine Learning arises from these questions:<ul><li>Could a computer go beyond what we know how to order it to perform and learn on its own how to perform a specified task?</li><li>Could a computer surprise us?</li><li>Can it define rules on its own by looking at data?</li></ul><br><img alt="Classical Programming vs Machine Learning" src="./images/classicalprogramming-vs-ml.png" style="align:center"/></li><li>It is trained rather than explicitly programmed.</li><li>It finds statistical structure in the data examples presented to it.</li><li>Tightly linked to mathematical statistics, but differs from statistics in several ways:<ul><li>ML deals with more complex data, in large amounts, for which classical statistical analysis such as Bayesian Analysis would be impractical.</li></ul></li></ul>|<ul><li>Next level of Machine Learning</li><li>Better way to find hidden representations in the data.</li><li>Learn about data representation in the section below.</li><li>*Deep* in Deep Learning refers to learning the input-to-output mapping through successive layers of increasingly meaningful representations.</li><li>The number of layers in a model is called the *depth* of the model.</li><li>Other apt names for the field could have been:<ul><li>Layered Representations Learning</li><li>Hierarchical Representations Learning</li></ul></li><li>Other ML approaches focus on learning only one or two layers of representations, while Deep Learning can have hundreds of layers of representations. Hence, those other approaches are sometimes called *shallow learning*.</li><li>Most often the layered representations of Deep Learning are learned using *neural networks*.</li><li>The term "Neural Networks" comes from neurobiology, and some of its central ideas were inspired by "our understanding" of the human brain. But it isn't a model of the human brain!</li><li>There is no evidence that the brain uses anything like the learning algorithms used by modern Deep Networks!</li></ul>|

### Data Representation Learning
- Here we will learn how Deep Learning differs from other Machine Learning algorithms.
- To do ML we need three things:
  - Input data points. Example: images of dogs and cats.
  - Examples of the expected output. Example: labels stating whether each image shows a dog or a cat.
  - A way to measure whether the algorithm is doing a good job. This is needed to measure the distance between the expected output and the algorithm's current output.

- An ML model (algorithm) transforms its input into meaningful outputs.
- This transformation is "learned" from exposure to known examples of inputs and outputs.
- Central problem in ML and DL - *meaningfully transform data*. Change the representation of the input data into something closer to the expected output.
- *Representation* - a different way to look at data.
- For example, a color image can be represented in RGB (red, green, blue) or in HSV (hue, saturation, value).
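
As a small illustration of "same data, different representation", here is a sketch that uses Python's standard `colorsys` module to express pixels in both RGB and HSV. The two pixel values are made up for the example. Note how the saturation of a colour is spread across all three RGB channels, but can be read directly from the HSV form.

```python
import colorsys

# Two made-up pixels, written as RGB with each channel scaled to [0, 1].
pure_red_rgb = (1.0, 0.0, 0.0)
pale_red_rgb = (1.0, 0.5, 0.5)

# The same pixels expressed as (hue, saturation, value).
pure_red_hsv = colorsys.rgb_to_hsv(*pure_red_rgb)
pale_red_hsv = colorsys.rgb_to_hsv(*pale_red_rgb)

print(pure_red_hsv)  # (0.0, 1.0, 1.0)
print(pale_red_hsv)  # (0.0, 0.5, 1.0)

# Converting back recovers the original numbers: nothing was lost,
# only the way of looking at the data changed.
round_trip = colorsys.hsv_to_rgb(*pure_red_hsv)
print(round_trip)    # (1.0, 0.0, 0.0)
```

A task like "make the image less saturated" only needs the second number in HSV, while "select the red pixels" is simpler in RGB. Which representation is better depends on the task.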

Let's take a concrete example!

![Data Representation Example](./images/data_representation_example.png)


In the new representation, we can use $x > 0$ to define the blue class and $x \le 0$ to define the red class.
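
The idea can be sketched in a few lines of Python. The points below are hypothetical (they are not the ones in the figure): blue points lie above the diagonal line $y = -x$ and red points below it. The function `change_coordinates` and the hand-picked 45° rotation are assumptions made for this illustration.

```python
import math

# Hypothetical data: blue points lie above the line y = -x, red points below it.
points = [(2.0, 1.0), (0.5, 1.5), (3.0, -1.0), (-1.0, -2.0), (-2.5, 0.5), (1.0, -3.0)]
labels = ["blue", "blue", "blue", "red", "red", "red"]


def change_coordinates(point, angle):
    """Return the point's coordinates after rotating the axes by `angle` radians."""
    x, y = point
    new_x = x * math.cos(angle) + y * math.sin(angle)
    new_y = -x * math.sin(angle) + y * math.cos(angle)
    return new_x, new_y


def classify(point):
    """The simple rule from the text: x > 0 is blue, otherwise red."""
    return "blue" if point[0] > 0 else "red"


# In the raw representation the rule gets one point wrong ...
raw_predictions = [classify(p) for p in points]

# ... but after a hand-picked coordinate change it separates the classes.
new_points = [change_coordinates(p, math.pi / 4) for p in points]
new_predictions = [classify(p) for p in new_points]

print(raw_predictions == labels)  # False
print(new_predictions == labels)  # True
```

Here the better representation was chosen by hand. The rule itself stayed the same; only the way of looking at the data changed.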

- Machine Learning algorithms perform these kinds of transformations automatically!
- *Learning* in the context of Machine Learning stands for an automatic search for better representations.
- Transformations performed by Machine Learning algorithms can be:
  - Coordinate changes
  - Linear projections (which may destroy some information)
  - Translations
  - Nonlinear operations (for example, selecting all points where $x > 0$)

- ML algorithms are not very "creative" at finding the best transformations. They only search for the best combination of transformations within a pre-defined set of operations, called a *hypothesis space*.
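
A minimal sketch of such a search, under loudly stated assumptions: the data is hypothetical, the hypothesis space is restricted to rotations of the axes in 15° steps followed by the rule "new x > 0 is blue", and accuracy is used as the measure of success. Real ML algorithms search much larger spaces and use smarter strategies than trying every candidate.

```python
import math

# Hypothetical data: blue points lie above the line y = -x, red points below it.
points = [(2.0, 1.0), (0.5, 1.5), (3.0, -1.0), (-1.0, -2.0), (-2.5, 0.5), (1.0, -3.0)]
labels = ["blue", "blue", "blue", "red", "red", "red"]


def predict(point, angle):
    """Rotate the axes by `angle`, then apply the rule: new x > 0 means blue."""
    x, y = point
    new_x = x * math.cos(angle) + y * math.sin(angle)
    return "blue" if new_x > 0 else "red"


def accuracy(angle):
    """Fraction of points classified correctly for a given rotation angle."""
    hits = sum(predict(p, angle) == t for p, t in zip(points, labels))
    return hits / len(points)


# The hypothesis space: the only transformations the search may consider.
hypothesis_space = [math.radians(degrees) for degrees in range(0, 360, 15)]

# "Learning" here is just picking the candidate with the best score.
best_angle = max(hypothesis_space, key=accuracy)

print(accuracy(0.0))         # score of the untransformed representation
print(accuracy(best_angle))  # score of the best candidate found
```

The search can never return a transformation outside `hypothesis_space`, which is the sense in which the algorithm is not "creative".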

- So what makes Deep Learning special?
  - Deep Learning tries to find layered hidden representations, whereas other ML approaches only find shallow representations.
  - Other ML algorithms are not very good at finding the complex representations needed for images and speech, whereas Deep Networks can reach near-human-level performance on such tasks. This is due to their layered representations!
  - Think of a Neural Network as a multi-stage information distillation operation, where information goes through successive filters and comes out increasingly *purified*!

![deep representations learnt by model](./images/deep_representations_learnt_by_model.png)

### Understanding How Deep Learning Works!
- Here we will see how Deep Networks map inputs to desired outputs through a sequence of simple data transformations (layers), which are learned by exposure to examples of input-target pairs.
- The transformation performed by a layer is *parameterized* by the *weights* it holds. *Weights* are a bunch of numbers, a.k.a. *parameters*.
- *Learning* means finding a set of values for the *weights* of all layers such that the network maps inputs correctly (read: optimally) to their associated targets.

![](./images/nn_parameterized_by_weights.png)
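
To make "parameterized by weights" concrete, here is a minimal pure-Python sketch. The names `dense_layer` and `forward`, the choice of a weighted sum followed by ReLU, and all the numbers are assumptions made for illustration; they are not taken from the figure.

```python
def dense_layer(inputs, weights, biases):
    """One layer: a weighted sum per output unit, followed by ReLU (assumed)."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU keeps positives, zeroes out negatives
    return outputs


def forward(inputs, layers):
    """Pass the input through every layer in order; len(layers) is the depth."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


inputs = [1.0, 2.0]

# Two networks with the same structure (depth 2) but different weights.
network_a = [
    ([[0.5, -1.0], [1.0, 1.0]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [0.5]),                    # layer 2: 2 inputs -> 1 unit
]
network_b = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[2.0, 1.0]], [0.0]),
]

output_a = forward(inputs, network_a)
output_b = forward(inputs, network_b)

print(output_a)  # [3.5]
print(output_b)  # [4.0]
```

The code of the two networks is identical; only the numbers stored in the layers differ, and those numbers alone decide what each network computes.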

- A Deep Network can contain tens of millions of parameters, and changing the value of one parameter affects the behavior of all the others. Finding the right set of weights can therefore be a daunting task.

- To control the output of a neural network, we must be able to measure how far the current output is from the desired output. This distance is computed by the *loss function*, also called the *objective function*.

![](./images/loss_measures_quality_of_output.png)
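
As one possible example of a loss function, the sketch below uses mean squared error. The choice of this particular loss and the numbers are assumptions made for illustration; the notes do not prescribe a specific loss.

```python
def mean_squared_error(predictions, targets):
    """Average squared distance between predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


targets = [1.0, 0.0, 1.0]

far_predictions = [0.0, 1.0, 0.0]    # wrong on every example
close_predictions = [0.9, 0.1, 0.8]  # nearly right on every example

far_loss = mean_squared_error(far_predictions, targets)
close_loss = mean_squared_error(close_predictions, targets)
perfect_loss = mean_squared_error(targets, targets)

print(far_loss)      # 1.0
print(close_loss)    # a small positive number
print(perfect_loss)  # 0.0
```

The lower the score, the closer the output is to the target, which is what makes the score usable as a feedback signal.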

- The trick is to use the loss score as a feedback signal to adjust the weights a little, in a direction that will lower the loss score for the current example. This adjustment is the job of an *optimizer*, which implements the *Backpropagation algorithm*, the central algorithm in Deep Learning.

![](./images/loss_used_as_feedback_signal.png)
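
A single adjustment step can be sketched for the smallest possible "network": one weight `w`, prediction `w * x`, and a squared-error loss. The model, the learning rate and the numbers are assumptions made for illustration. With one weight the gradient can be written by hand; in a real Deep Network, backpropagation is what computes the corresponding gradient for every weight in every layer.

```python
def loss(w, x, target):
    """Squared distance between the prediction w * x and the target."""
    return (w * x - target) ** 2


def gradient(w, x, target):
    """Derivative of the loss with respect to w: 2 * (w * x - target) * x."""
    return 2 * (w * x - target) * x


x, target = 2.0, 4.0   # one training example (the ideal weight would be 2.0)
w = 0.0                # current weight
learning_rate = 0.1    # how big "a little" is (assumed value)

loss_before = loss(w, x, target)

# Move the weight a little in the direction that lowers the loss.
w_new = w - learning_rate * gradient(w, x, target)

loss_after = loss(w_new, x, target)

print(loss_before)  # 16.0
print(loss_after)   # smaller than before
```

The sign of the gradient tells the optimizer which way to move the weight, and the learning rate controls how far.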

- Initially the weights are assigned random values, so naturally the output is far from the desired one and the loss score is high.
- With each example it is shown, the network adjusts its weights a little to perform better on that example. Repeating this is called the *training loop*. The final result is a *trained network*, whose weights produce outputs that are as close as they can be to the targets.
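
Putting the pieces together, here is a minimal sketch of a training loop. Everything specific in it is an assumption made for illustration: a two-parameter model `w * x + b` stands in for a network, the examples are generated from $y = 3x + 1$, and the learning rate and number of passes are arbitrary.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical input-target pairs generated from y = 3x + 1.
examples = [(x, 3.0 * x + 1.0) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]

# Weights start out random, so the first outputs are far from the targets.
w = random.uniform(-1.0, 1.0)
b = random.uniform(-1.0, 1.0)
learning_rate = 0.05


def average_loss(w, b):
    """Mean squared error of the model w * x + b over all examples."""
    return sum((w * x + b - t) ** 2 for x, t in examples) / len(examples)


initial_loss = average_loss(w, b)

# The training loop: show each example, nudge the weights to do better on it.
for epoch in range(200):
    for x, target in examples:
        error = w * x + b - target           # how far off the current output is
        w -= learning_rate * 2 * error * x   # gradient of error**2 w.r.t. w
        b -= learning_rate * 2 * error       # gradient of error**2 w.r.t. b

final_loss = average_loss(w, b)

print(initial_loss)  # high: random weights
print(final_loss)    # close to zero: trained weights
print(w, b)          # close to 3 and 1
```

Each individual update only improves the model on one example, but repeated over many passes the weights settle on values that work for all of them.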

### What Deep Learning has achieved so far
- Near-human-level image classification
- Near-human-level speech recognition
- Near-human-level handwriting transcription
- Improved machine translation
- Improved text-to-speech conversion
- Digital assistants such as Google Now and Amazon Alexa
- Near-human-level autonomous driving
- Improved ad targeting, as used by Google, Baidu and Bing
- Improved search results on the web
- Ability to answer natural language questions
- Superhuman Go playing