# **INTRODUCTION TO DEEP LEARNING** <br/>

Credits:
(Snippets taken from fastai deep learning course, DeepLearning.AI, Elements of statistical learning)

## **COMPUTER PROGRAMMING**
Given a set of numbers, write a program to return the maximum or minimum. <br>

Given the principal amount, rate of interest, loan duration, etc., generate a loan payment schedule. <br>

From a database table, return the records matching a search criterion, and so on. <br>

Computers mostly solve problems using algorithmic approaches that have been refined over the years. More complex problems call for more sophisticated algorithms, such as finding the shortest path between a pair of points or, given your geospatial coordinates, searching for places of interest nearby.

There is no learning involved here. The programmer knows the steps to solve the problem and writes them in a language the computer understands.
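
As a minimal sketch of this kind of explicit, step-by-step programming, the first two tasks above might look like the following. The function name `loan_schedule` and the sample figures are assumptions made for illustration, and the schedule uses the standard fixed-payment (annuity) formula.

```python
def loan_schedule(principal, annual_rate, months):
    """Return (month, payment, interest, principal_paid, balance) rows."""
    r = annual_rate / 12  # monthly interest rate
    # Fixed monthly payment; with 0% interest the principal is split evenly.
    payment = principal * r / (1 - (1 + r) ** -months) if r else principal / months
    rows, balance = [], principal
    for month in range(1, months + 1):
        interest = balance * r
        principal_paid = payment - interest
        balance -= principal_paid
        rows.append((month, payment, interest, principal_paid, max(balance, 0.0)))
    return rows


# Maximum and minimum of a set of numbers: the steps are fully known in advance.
numbers = [7, 2, 9, 4]
largest, smallest = max(numbers), min(numbers)

# Hypothetical loan: 1000 borrowed at 12% per year, repaid over 12 months.
schedule = loan_schedule(principal=1000.0, annual_rate=0.12, months=12)
for row in schedule[:3]:
    print("month %d: payment %.2f, interest %.2f, balance %.2f"
          % (row[0], row[1], row[2], row[4]))
```

Every step here was decided by the programmer; nothing in the program changes with experience.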
<br>
<br>
<br>

## **WHAT IS LEARNING**

### *How does a child learn to recognize objects around him/her?* <br>
### *How does a child learn a language?*
### *How do you learn to drive a vehicle?*

And many similar tasks that demand a degree of cognition.

Try solving one of these tasks with a conventional computer program.

**Write a program to recognize which digits are present in this image.** <br>
<img src="images/img1.jpg" style="width:400px;height:100px;">

We can split the problem of recognizing handwritten digits into two sub-problems. First, we would like a way of breaking an image containing many digits into a sequence of separate images, each containing a single digit. <br>

We humans solve this segmentation problem with ease, but it is challenging for a computer program to correctly break up the
image. Once the image has been segmented, the program then needs to classify each individual digit.

**Is a tweet about a disaster event?** <br>

<br>
<br>

<img src="images/img2.jpeg" style="width:600px;height:300px;">

**Given multiple objects in an image, which object is which?**

<br>
<br>
<br>

## BIOLOGICAL LEARNING SIMULATION VIA NEURAL NETWORKS

A biological neural network is composed of groups of chemically connected or functionally associated neurons. A single neuron may be connected to many other neurons, and the total number of neurons and connections in a network may be extensive. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic synapses and other connections are possible.

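There are approximately 100 billion neurons in the adult human brain (more recent estimates put the figure closer to 86 billion), and these form on the order of 10<sup>14</sup> to 10<sup>15</sup> synaptic connections.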
<br/>
<br/>
<br/>
<br/>
<img src="images/img4.png" style="width:400px;height:300px;">
<img src="images/img5.jpg" style="width:300px;height:300px;">

<br/>
<br/>
<br/>
<br/>

We would like the raw input image to be transformed into gradually higher levels of representation,
representing more and more abstract functions of the raw input, e.g., edges, local shapes, object parts

<img src="images/img6.jpg" style="width:300px;height:400px;">
<img src="images/img7.jpg" style="width:500px;height:400px;">

## AI VS MACHINE LEARNING VS DEEP LEARNING

<img src="images/img3.jpg" style="width:500px;height:500px;">
<br/>
<br/>

## **Artificial Intelligence**

Artificial intelligence is the capability of a computer system to mimic human cognitive functions such as learning and problem-solving. Through AI, a computer system uses math and logic to simulate the reasoning that people use to learn from new information and make decisions.

<b> *Kinds of Artificial Intelligence* </b>:

**1) Rule based systems**

- Constrained set of inputs and outputs which are well defined and known in advance. <br>
- Functioning simple enough to be modeled solely based on a set of predetermined rules. <br>
- Rules defined by domain experts. <br>
- Knowledge based system that doesn't learn, like a human born with limited knowledge. <br>
- Rules engines like drools in java, email monitoring system etc. <br>

**2) Rational / Problem solving / goal based agents**

- Use search strategies to solve a specific problem
    - Initial state 
    - Goal or end state
    - Search space (Environment in which search takes place, set of states and operations to change those states)
    - Goal test (function to observe current state and whether goal state is achieved)
    - Search tree
    - Operation or action ( represented by an edge that causes system to change its state) 
    - Solution (operation sequence that leads from initial to goal state)
    - Path cost (each operation has a cost associated)
    - Optimal solution (sequence of operations with lowest cost)
<br>
<br>
- Uninformed search 
    - Depth first search
    - Breadth first search
    - Uniform-cost search (Dijkstra's algorithm)
    - Bidirectional search etc.
<br>
<br>
- Informed search ( A* search etc.)    
<br>
- Adversarial search (games like chess, tic-tac-toe etc.)
    - Minimax
    - Alpha beta pruning 
<br/>
<br/>
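
The search vocabulary above (initial state, goal test, operations as edges, solution as a sequence of operations) can be sketched with breadth-first search. This is a minimal illustration, assuming a small hypothetical graph in which every operation has the same cost; the names `bfs_shortest_path` and `graph` are not from the original notes.

```python
from collections import deque


def bfs_shortest_path(graph, start, goal):
    """Return the path with the fewest edges from start to goal, or None."""
    frontier = deque([start])      # states waiting to be expanded
    parent = {start: None}         # doubles as the set of visited states
    while frontier:
        state = frontier.popleft()
        if state == goal:          # goal test
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]      # solution: states from initial to goal
        for nxt in graph.get(state, []):   # each edge is one operation
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None


# Hypothetical search space: keys are states, values are reachable states.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": []}
path = bfs_shortest_path(graph, "A", "F")
print(path)
```

Because breadth-first search expands states level by level, the first time it reaches the goal it has used the fewest possible operations. When operations have different costs, uniform-cost or A* search would be the usual choice instead.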

**3) Optimization techniques like hill climbing, simulated annealing etc.**
<br/>
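
Hill climbing can be sketched in a few lines. The version below is a minimal illustration over integers, assuming a hypothetical `score` function with a single peak; on functions with several peaks it would stop at whichever local maximum it reaches first, which is the weakness simulated annealing tries to address.

```python
def hill_climb(score, start, step=1, max_iters=1000):
    """Move to the better neighbour until no neighbour improves the score."""
    current = start
    for _ in range(max_iters):
        best = max((current - step, current + step), key=score)
        if score(best) <= score(current):
            return current         # no neighbour is better: a (local) maximum
        current = best
    return current


def score(x):
    # Hypothetical objective with its peak at x = 3.
    return -(x - 3) ** 2


best_x = hill_climb(score, start=-10)
print(best_x)
```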

**4) Genetic algorithms**
<br/>
<br/>
*Applications: Robotic vacuum cleaners, game playing like chess, sudoku solvers, the N-Queens problem, vehicle routing, and constraint satisfaction problems in general*

These techniques are still relevant for some use cases, but reinforcement learning (RL) is increasingly used in this space.
<br/>
<br/>
<br/>

## FROM PROGRAMMING COMPUTERS TO MACHINE LEARNING

<img src="images/img8.jpg">

Instead of telling the computer the exact steps required to solve a problem, show it examples of the problem to solve, and let it figure out how to solve it itself.
<br/>

<img src="images/img9.jpg">
<br/>
<br/>
<br/>

**Inputs** => Values that the model processes in order to produce its results, for instance taking image pixels as inputs and returning the classification "dog" as a result
<br/>

**Weights** => Weights are just variables that act on inputs, and a weight assignment is a particular choice of values for those variables
<br/>

**Results** => Predictions from the model based on input data and chosen weights

**Every weight assignment has some "actual performance"**
<br/>

**There must be an "automatic means" of testing that performance (a loss or cost function)**
<br/>

**There must also be a "mechanism" (i.e., another automatic process) for improving the performance by changing the weight assignments**
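
These three ingredients can be sketched with a model that has a single weight. This is a minimal illustration, assuming toy data generated by y = 2x, a mean squared error loss, and plain gradient descent as the improvement mechanism; all names and values are made up for the example.

```python
# Toy data: the "true" weight 2.0 is an assumption of this example.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def predict(w, x):
    """The model: its result depends on the input and on the weight."""
    return w * x


def loss(w):
    """Automatic means of testing performance: mean squared error."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad(w):
    """Derivative of the loss with respect to the weight."""
    return sum(2 * (predict(w, x) - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0          # initial weight assignment
lr = 0.01        # learning rate
history = []
for step in range(200):
    history.append(loss(w))
    w -= lr * grad(w)   # mechanism for improving the weight assignment

print("learned weight: %.4f, final loss: %.6f" % (w, loss(w)))
```

Each pass measures how well the current weight performs and nudges it in the direction that lowers the loss, so the weight moves towards 2.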

### Model <br/>
A model is a special kind of program: one that can do many different things, depending on its weights.
<br/>
<br/>

<img src="images/img10.jpg">
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>

## WHAT IS A NEURAL NETWORK

What we want is a **machine learning model** so flexible that it could be used to **solve any given problem** just by varying its weights. Amazingly enough, such a function exists: the neural network introduced above. If you regard a neural network as a mathematical function, it turns out to be extremely flexible depending on its weights. A mathematical result called the **universal approximation theorem** shows that, in theory, a sufficiently large network can approximate any continuous function to any desired level of accuracy. Because neural networks are so flexible, they are often a suitable kind of model in practice, and you can focus your effort on training them, that is, on **finding good weight assignments**.


Breaking down a neural network, the key parts: <br/>

1) The input features X and the target variable (label) y
2) One or more hidden layers with one or more hidden units
3) Weights and biases corresponding to each unit in each layer
4) The activation function (sigmoid, tanh, ReLU, etc.). It adds non-linearity and is critical to the process of learning and discovering features. Without activation functions, the neural network is just a linear function of its inputs, as in linear regression.
5) Forward propagation (also called the feed-forward or forward pass)
6) The loss function <br/>
    - Root mean squared error (RMSE) for regression problems
    - Cross entropy loss for classification problems
7) Backward propagation (backpropagation, or the backward pass)
8) Weight updates with a learning rate
9) Steps 5 to 8 are repeated under an underlying optimization algorithm such as stochastic gradient descent
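
The parts listed above can be put together in a small from-scratch sketch. It is an illustration only, not the walkthrough from the course material: it assumes the XOR truth table as data, one hidden layer of 4 tanh units, a sigmoid output with cross entropy loss, and hand-derived gradients. The layer size, learning rate and epoch count are arbitrary choices.

```python
import math
import random

random.seed(0)

# 1) Inputs X and targets y: the XOR truth table.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 1.0, 1.0, 0.0]

# 2-3) One hidden layer with H units, plus its weights and biases.
H = 4
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x):
    # 4-5) Forward pass: tanh activation in the hidden layer, sigmoid output.
    h = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(H)]
    p = sigmoid(sum(W2[j] * h[j] for j in range(H)) + b2)
    return h, p


def mean_loss():
    # 6) Binary cross entropy, averaged over the data set.
    total = 0.0
    for x, t in zip(X, y):
        _, p = forward(x)
        p = min(max(p, 1e-12), 1 - 1e-12)   # guard against log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(X)


initial_loss = mean_loss()
lr = 0.5
for epoch in range(2000):
    for x, t in zip(X, y):      # 9) one example at a time: stochastic gradient descent
        h, p = forward(x)
        # 7) Backward pass via the chain rule.
        d_out = p - t           # gradient at the output for sigmoid + cross entropy
        for j in range(H):
            d_hidden = d_out * W2[j] * (1 - h[j] ** 2)   # tanh'(z) = 1 - tanh(z)^2
            # 8) Weight updates scaled by the learning rate.
            W2[j] -= lr * d_out * h[j]
            W1[j][0] -= lr * d_hidden * x[0]
            W1[j][1] -= lr * d_hidden * x[1]
            b1[j] -= lr * d_hidden
        b2 -= lr * d_out
final_loss = mean_loss()

print("loss before: %.3f, after: %.3f" % (initial_loss, final_loss))
```

XOR is a common toy problem because no purely linear model can fit it, so any drop in loss here relies on the non-linear activation. How low the loss gets depends on the random initialisation; deep learning frameworks compute the backward pass automatically rather than by hand.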


Walkthrough of the implementation of a simple neural network:

<img src="images/AutoGrad.jpeg" style="width:800px;height:600px;">

## **Machine Learning (the biggest subset of AI):**
Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so.

A core objective of a learner is to **generalize from its experience**. Generalization in this context is the ability of a learning machine to perform accurately on new, unseen examples/tasks after having experienced a learning data set. The training examples come from some generally unknown probability distribution (considered representative of the space of occurrences) and the learner has to build a general model about this space that enables it to **produce sufficiently accurate predictions in new cases**.

Two major paradigms

**1) Classical machine learning**
<br/>
- Closely related to, but distinct from, statistics
- Statistical learning from data
- Pattern recognition
- Generalize from experience 
- Interpretable by humans
- Optimization at its heart
- Statistical / probabilistic techniques
<br/>
<br/>

**2) Deep learning via artificial neural networks**

<br/>
<br/>

## Major types of machine learning problems
###  Regression vs Classification 

Regression -> Target variable is continuous (e.g., a house price) <br/>
Classification -> Target variable is categorical (e.g., spam or not spam)
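
The difference is easiest to see side by side. The sketch below is a minimal illustration with made-up data: a least-squares line for the regression case and a nearest-centroid rule for the classification case. Both helpers (`fit_line`, `nearest_centroid`) are hypothetical names, chosen only to keep the example free of external libraries.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_centroid(train, x):
    """train maps each label to a list of 1-D feature values."""
    centroids = {label: sum(v) / len(v) for label, v in train.items()}
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Regression: the target (price) is a continuous number.
slope, intercept = fit_line([50, 80, 100], [150.0, 240.0, 300.0])
predicted_price = slope * 120 + intercept

# Classification: the target is one of a fixed set of categories.
label = nearest_centroid({"small": [40, 55, 60], "large": [120, 150, 170]}, 65)

print(round(predicted_price, 2), label)
```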

<br/>
<br/>


## Supervised vs Unsupervised learning

**1) Supervised learning - Labeled data** <br/>

Algorithms / approaches:

- Linear regression
- Logistic regression
- Decision trees
- Boosted trees
- Random forests
- Gradient boosting machines (XGBoost, LightGBM, CatBoost)
- Support vector machines
- Probabilistic techniques like Naive Bayes
- PCA / SVD (strictly speaking unsupervised dimensionality reduction, often used as a preprocessing step for the methods above)

These build on mathematical concepts such as statistics, probability, linear algebra, and calculus.

**2) Unsupervised learning - No labeled data**<br/>
Clustering, e.g., grouping related stories in Google News or finding communities in social graphs

The majority of practical use cases are supervised.
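
Clustering, the unsupervised example mentioned above, can be sketched with k-means on one-dimensional data. This is a minimal illustration with made-up points and hand-picked starting centers; real systems work on high-dimensional features and usually initialise the centers randomly.

```python
def kmeans_1d(points, centers, iters=10):
    """Alternate between assigning points to the nearest center and re-centering."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# No labels are given: the algorithm only sees the points themselves.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

The groups emerge from the structure of the data alone, which is what distinguishes this from the supervised setting.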
<br/>
<br/>

## **Machine learning vs Deep learning**

Classical ML algorithms tend to stop improving beyond a point as the amount of data increases.<br/>
Deep learning models typically keep improving as the amount of data and the size of the model grow. <br/>
<br/>
Classical machine learning => Tabular data (structured) <br/>
Deep learning techniques => Unstructured data (image, video, text, audio)
<br/>
<br/>
<br/>
<img src="images/img11.jpg" style="width:800px;height:500px;">
<br/>

 So what made deep neural networks work in the past decade? <br/>

 *Exponential increase in both compute (GPU computing) and data*


The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is built on ImageNet, a data set of more than 14 million labeled images; the challenge's classification task uses a subset covering 1000 categories.
Computer vision is one of the hardest problems in ML, so performance in the ImageNet challenge is a useful indicator of how ML has evolved.

In 2012, AlexNet (designed by Alex Krizhevsky in collaboration with Ilya Sutskever and Geoffrey Hinton), a deep learning model trained on GPUs, won the ImageNet challenge with a top-5 error more than 10.8 percentage points lower than that of the runner-up. It was the first time an ML model had achieved such a margin in a computer vision task, and it marked the beginning of the DL renaissance.

## **Fields in deep learning**

**Computer vision**

Convolutional neural networks <br/>
Image classification <br/>
Object detection <br/>
Face recognition <br/>

*Use cases*

Manufacturing industries (visual inspection for defects etc.) <br/>
Self driving cars <br/>
Surveillance, security operations <br/>
Classified defence use cases <br/>
<br/>

**Natural language processing**

Sequence models, recurrent neural networks (RNNs), LSTMs, GRUs, Transformers <br/>
Language models like GPT-3, Turing-NLG

*Use cases* 

Chatbots, text generation, question answering, text summarization <br/>
Machine translation -> Google Translate <br/>
Speech recognition -> Alexa, Google Assistant <br/>

## Programming languages:
Python / R / Julia

## Machine learning frameworks
NumPy, pandas, Matplotlib, seaborn, scikit-learn

## Deep learning frameworks

These frameworks provide automatic differentiation using computation graphs.
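
To give a feel for what automatic differentiation over a computation graph means, here is a minimal scalar sketch. It is an illustration written for these notes, not the API of PyTorch or TensorFlow: the `Value` class and its methods are hypothetical, and only addition and multiplication are supported.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents        # nodes this value was computed from
        self._backward_fn = None       # how to push gradient to the parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            self.grad += out.grad      # d(a + b)/da = 1
            other.grad += out.grad     # d(a + b)/db = 1
        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            self.grad += other.data * out.grad   # d(a * b)/da = b
            other.grad += self.data * out.grad   # d(a * b)/db = a
        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Order the graph so each node is processed after everything that uses it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


# y = w * x + b, built as a small computation graph.
w, x, b = Value(2.0), Value(3.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, x.grad, b.grad)
```

The forward computation records the graph, and `backward` walks it in reverse applying the chain rule. Frameworks apply the same idea to tensors and a much larger set of operations.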

1) PyTorch ecosystem (backed by Meta, formerly Facebook)
    - PyTorch
    - PyTorch Lightning
    - fastai etc.
<br/>
<br/>
2) TensorFlow ecosystem (backed by Google)
    - TensorFlow
    - Keras
<br/>
<br/>

### Transfer learning

### Inference on edge devices

### Google coral / Pixel TPU

**Bias in ML algorithms**

**DeepFakes**

### Skills required:

Attitudes <br/>
    - Curiosity / exploratory nature <br/>
    - Love for data <br/>

Skills that can be learned <br>
    - Mathematics/Statistics (20%)<br/>
    - Coding (80%)<br/>

With experience <br/>
    - Domain expertise <br/>

It is a multi-year journey.

The field is evolving rapidly.

You need to stay on the cutting edge by following research papers.