# Introduction to deep learning - Class notes 
(lec 2 deep learning playlist campusx)
## What is Deep Learning?

### Simple Definition
*   **Deep Learning** is a sub-field of **Artificial Intelligence (AI)** and **Machine Learning (ML)**.
*   It is **inspired by the structure of the human brain**.
*   Deep Learning algorithms attempt to **draw conclusions similar to those of humans** by continuously analysing data through a logical structure called a **Neural Network**.

### Technical Definition
*   Deep Learning is part of a broader family of Machine Learning methods that are based on **Artificial Neural Networks with representation learning**.
*   Deep Learning algorithms use **multiple layers to progressively extract higher-level features** from raw input. For example, in image processing, lower layers identify edges, while higher layers identify complex concepts like eyes, noses, or faces.

### Relationship with AI and Machine Learning
*   **Artificial Intelligence (AI)** is the broadest umbrella term, representing the human quest to create intelligent machines.
*   **Machine Learning (ML)** is a sub-field of AI where systems **learn from data** to figure out the relationship between input and output. ML algorithms primarily rely on **statistical techniques**.
*   **Deep Learning (DL)** is a sub-field of Machine Learning. It performs the same task as ML (finding input-output relationships) but uses a **different approach** based on a logical structure called a Neural Network.

### Neural Networks
*   Deep Learning is fundamentally based on a **logical structure called a Neural Network**, inspired by the human brain. Computer scientists aimed to mimic human intelligence.
*   A **simple Artificial Neural Network (ANN)** consists of:
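    *   **Perceptrons**: The fundamental units (neurons), usually drawn as circles in network diagrams.
    *   **Weights**: Numeric values attached to the connections between perceptrons (drawn as arrows). They determine how strongly one unit influences the next and are what the network learns during training.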
    *   **Layers**: Perceptrons are arranged in layers, including an **input layer**, an **output layer**, and one or more **hidden layers** in between.
*   The term "**deep**" in Deep Learning refers to the presence of **many hidden layers**.
*   **Types of Neural Networks** (besides ANN):
    *   **Convolutional Neural Networks (CNNs)**: Excellent for image data.
    *   **Recurrent Neural Networks (RNNs)**: Good for speech and textual data.
    *   **Generative Adversarial Networks (GANs)**: Can generate content like text or images.
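
To make the perceptron, weight, and layer vocabulary above concrete, here is a minimal sketch of a forward pass through a tiny network (2 inputs, 3 hidden units, 1 output). The weight values, the sigmoid activation, and the helper names `layer_forward` and `forward` are illustrative assumptions, not something taken from the lecture.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: each unit computes a weighted sum of the inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

# Hypothetical weights (one row per unit): 2 inputs -> 3 hidden units -> 1 output.
hidden_weights = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.7, -0.5, 0.2]]
output_biases = [0.05]

def forward(inputs):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(inputs, hidden_weights, hidden_biases)
    return layer_forward(hidden, output_weights, output_biases)

prediction = forward([1.0, 2.0])
print(prediction)  # a single value between 0 and 1
```

A "deep" network in this picture simply chains more `layer_forward` calls between the input and the output.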

## Why is Deep Learning So Famous? (Success Reasons)
Deep Learning's fame and recent surge in popularity are attributed to two main reasons:

1.  **Applicability**: Deep Learning algorithms are applicable across a **vast domain of problems**. This includes fields such as:
    *   Computer Vision
    *   Speech Recognition
    *   Natural Language Processing (NLP)
    *   Machine Translation
    *   Bioinformatics
    *   Drug Design
    *   Medical Imaging
    *   Climate Science
    *   Material Inspection
    *   Board Games (e.g., AlphaGo)

2.  **Performance**: In these diverse fields, Deep Learning has achieved **state-of-the-art results**, often **surpassing human experts' performance**. A notable example is **AlphaGo** beating the world champion in the game of Go (winning 4 out of 5 games). It has also had a significant impact on self-driving cars, medical science, and chemistry.

## Deep Learning vs. Machine Learning: Key Differences

| Feature             | Deep Learning                                                                                                                                                                                                                                                                                                                                                                                                     | Machine Learning                                                                                                                                                                                                                                                                                                                                                                                                      |
| :------------------ | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Data Dependency** | **Data-hungry**: Requires a **very large amount of data** (often millions of rows) to achieve reliable performance. Performance generally **keeps improving as more data is added**. | Can perform well with **less data** (hundreds or thousands of rows). Performance tends to **plateau** after a certain amount of data, even when more is added. |
| **Hardware**        | Requires **powerful hardware, specifically GPUs** (Graphics Processing Units), to handle large matrix multiplications and parallel processing efficiently. Training on CPUs would be very slow. | Can be trained on **simpler machines using CPUs**. |
| **Training Time**   | **Very high**: Training models can take **weeks or even months** on large datasets.                                                                                                                                                                                                                                                                                                                            | **Low**: Training models typically takes **minutes to hours**, sometimes up to a day or week.                                                                                                                                                                                                                                                                                                                        |
| **Prediction Time** | **Very fast** once trained.                                                                                                                                                                                                                                                                                                                                                                                   | Can vary; some algorithms (e.g., K-Nearest Neighbours) can have **slow prediction times**.                                                                                                                                                                                                                                                                                                                      |
| **Feature Selection** | Employs **Representation Learning** (also called Feature Learning/Extraction). The algorithm **automatically discovers and extracts relevant features** from the raw data. Lower layers extract primitive features (e.g., edges), while deeper layers extract more complex features (e.g., shapes, objects like eyes or faces). | Requires **manual Feature Engineering or Feature Selection/Extraction** by a **domain expert**. For example, in a dog/cat classifier, features like size and colour would need to be manually defined.                                                                                                                                                                                                                   |
| **Interpretability** | **Low (Black Box)**: It's difficult to understand *why* a Deep Learning model made a particular prediction. The automatic feature extraction means the exact features used are not explicitly known to humans. This is a disadvantage in situations requiring explainability (e.g., banning a user based on comments). | **High**: Many Machine Learning models (e.g., Linear Regression, Logistic Regression, Decision Trees) offer **high interpretability**. They can often explain the reasoning behind their predictions, allowing users to understand which features contributed most to the outcome.                                                                                                                                          |

*   **Conclusion**: Deep Learning does not replace Machine Learning. Both have their place, likened to a needle (ML) and a sword (DL) – you choose based on the task.
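
The prediction-time row of the table above mentions that K-Nearest Neighbours can be slow at prediction time. A minimal sketch of why, assuming a plain brute-force implementation (the function name `knn_predict` and the toy data are hypothetical): the model does no real training, so every prediction has to compare the query against every stored training point.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Each prediction measures the distance to every stored training point,
    # so the cost of one prediction grows with the size of the training set.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: (size, weight) in arbitrary units.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

print(knn_predict(train_points, train_labels, (2, 2)))  # cat
print(knn_predict(train_points, train_labels, (6, 5)))  # dog
```

A trained neural network, by contrast, only needs one forward pass per prediction, regardless of how many rows it was trained on.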

## Why Did Deep Learning Become So Famous After 2010?
Despite its origins in the 1960s, Deep Learning only gained significant traction around 2010. Several factors contributed to this:

1.  **Datasets**:
    *   Deep Learning is inherently **data-hungry**, requiring millions of data points.
    *   The **smartphone revolution** and **reduced internet pricing** (e.g., Reliance Jio in India around 2015-16) led to an **exponential increase in data generation**. As a rough illustration: if 'X' GB of data was generated from the beginning of human history up to 2015, about another 'X' GB was generated in 2015-2016 alone, and about '4X' GB in 2017-2018.
    *   While raw data was abundant, **labelled data** was crucial for training.
    *   Major companies (Microsoft, Google, Facebook) invested heavily in converting unlabelled data into labelled data and then **open-sourcing these public datasets**.
    *   These publicly available, high-quality datasets significantly **accelerated Deep Learning research and development**.
    *   **Examples of Public Datasets**:
        *   **Image Data**: Microsoft COCO (for object detection with bounding boxes and labels).
        *   **Video Data**: YouTube 8M (6.1 million YouTube videos).
        *   **Text Data**: SQuAD (150,000 question-answer pairs from Wikipedia).
        *   **Audio Data**: Google Audio Set (2 million sound clips across 600+ categories from YouTube).

2.  **Hardware (Computing Power)**:
    *   **Moore's Law** states that the number of transistors on a microchip doubles approximately every two years, while the cost halves. This continually improves hardware performance and reduces costs.
    *   Deep Learning requires processing **massive amounts of data** and performing **complex matrix operations**. CPUs were too slow for this.
    *   Around 2010, the realisation that **parallel processing** could greatly speed up matrix multiplications, similar to image rendering, led to the adoption of **GPUs**.
    *   **NVIDIA** launched **CUDA**, a parallel computing platform and programming model for its GPUs, which became a game-changer for Deep Learning. GPUs can **reduce training time by 10-20 times** compared to CPUs.
    *   Beyond GPUs, other specialised hardware emerged:
        *   **FPGAs (Field-Programmable Gate Arrays)**: Fast, low power, re-programmable, customisable, but expensive. Microsoft uses FPGAs extensively in its Bing search engine.
        *   **ASICs (Application-Specific Integrated Circuits)**: Custom-made chips, expensive to design initially but cheaper for mass production. Examples include:
            *   **TPUs (Tensor Processing Units)**: Developed by Google, specifically designed for Deep Learning model training. Available in Google Colab.
            *   **Edge TPUs**: For edge devices like drones, smartwatches, or smart glasses.
            *   **NPUs (Neural Processing Units)**: For accelerating Deep Learning operations on mobile devices (e.g., in smartphones).
    *   This shift to custom hardware for Deep Learning has drastically **accelerated research** by reducing training times.
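
The reason matrix multiplication suits parallel hardware can be sketched in a few lines: every output row (in fact every output cell) is an independent dot product, so the work can be split across many workers without coordination. The example below is only an illustration of that independence using Python threads; it is not GPU code and will not be faster in practice, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def dot(row, col):
    """Dot product of one row of A with one column of B."""
    return sum(a * b for a, b in zip(row, col))

def matmul_sequential(a, b):
    """Compute A x B one output cell at a time."""
    cols = list(zip(*b))
    return [[dot(row, col) for col in cols] for row in a]

def matmul_parallel(a, b, workers=4):
    """Same result, but each output row is handed to a separate worker."""
    cols = list(zip(*b))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # No output row depends on any other, so the order of execution is irrelevant.
        return list(pool.map(lambda row: [dot(row, col) for col in cols], a))

a = [[1, 2], [3, 4], [5, 6]]         # 3 x 2
b = [[7, 8, 9], [10, 11, 12]]        # 2 x 3

print(matmul_sequential(a, b))  # [[27, 30, 33], [61, 68, 75], [95, 106, 117]]
print(matmul_parallel(a, b) == matmul_sequential(a, b))  # True
```

A GPU exploits the same property with thousands of lightweight cores working on different output cells at once.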

3.  **Frameworks and Libraries**:
    *   Training Deep Learning models from scratch is difficult and time-consuming, necessitating libraries.
    *   **TensorFlow (by Google)**: Released in 2015, it was very powerful but initially **difficult to use**.
    *   **Keras**: A high-level library that runs on top of TensorFlow (it originally supported other backends as well), making model building much more user-friendly. Keras became the built-in high-level API of TensorFlow 2.0 (2019). This combination is widely used in **industry-driven applications**.
    *   **PyTorch (by Facebook)**: Started in 2016, it aimed to address TensorFlow's complexities and became popular among **researchers**. PyTorch and Caffe2 were later merged to form a complete framework for research and deployment.
    *   **AutoML tools**: Products like Google AutoML, Microsoft Custom Vision, and Apple Create ML emerged, offering GUI-based solutions to build and convert Deep Learning models without extensive coding, further boosting adoption.

4.  **Deep Learning Architectures**:
    *   Deep Learning architectures refer to **different ways of connecting nodes and weights** within a neural network. Experimenting to find the best architecture is time-consuming and resource-intensive.
    *   **Transfer Learning**: Researchers have developed and **pre-trained state-of-the-art architectures** on large datasets, making them **ready-to-use**. This allows practitioners to download and directly apply these high-performing architectures to their problems, significantly speeding up development.
    *   **Examples of prominent architectures**:
        *   **Image Classification**: ResNet.
        *   **Text Classification/NLP**: Transformers.
        *   **Image Segmentation**: U-Net.
        *   **Image Translation**: Pix2Pix.
        *   **Object Detection**: YOLO (You Only Look Once).
        *   **Speech Generation**: WaveNet.
    *   The availability of these pre-trained, high-performance architectures has made Deep Learning much more accessible and efficient.
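
One common form of transfer learning is to keep a pre-trained network frozen and train only a small new output layer on top of it. The sketch below imitates that idea in plain Python under loud assumptions: `FROZEN_WEIGHTS` is a made-up stand-in for a pre-trained layer, the dataset is a toy one, and the new "head" is a simple logistic regression. Real projects would load an actual pre-trained model through a framework instead.

```python
import math

# Hypothetical "pre-trained" layer: stands in for a network already trained
# on a large dataset. These weights are never updated below.
FROZEN_WEIGHTS = ((1.0, -1.0), (0.5, 0.5))

def extract_features(x):
    """Frozen feature extractor: reuse the pre-trained layer as-is."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in FROZEN_WEIGHTS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def head_output(features, weights, bias):
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)

def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new output layer with batch gradient descent."""
    features = [extract_features(x) for x in samples]
    weights, bias = [0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        errors = [head_output(f, weights, bias) - y for f, y in zip(features, labels)]
        weights = [
            w - lr * sum(e * f[i] for e, f in zip(errors, features)) / n
            for i, w in enumerate(weights)
        ]
        bias -= lr * sum(errors) / n
    return weights, bias

def predict(x, weights, bias):
    return 1 if head_output(extract_features(x), weights, bias) >= 0.5 else 0

# Toy task: label is 1 when the first value is larger than the second.
samples = [(2.0, 1.0), (3.0, 0.0), (1.0, 2.0), (0.0, 3.0)]
labels = [1, 1, 0, 0]

head_weights, head_bias = train_head(samples, labels)
predictions = [predict(x, head_weights, head_bias) for x in samples]
print(predictions)  # [1, 1, 0, 0]
```

Only two weights and one bias are learned here; the expensive part (the feature extractor) is reused unchanged, which is where the time saving comes from.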

5.  **Community (People)**:
    *   The collective efforts of passionate **researchers, engineers, teachers, and students** have been crucial to Deep Learning's success.
    *   People recognised the potential of Deep Learning to **transform the world** and invested their time and effort into its development.
    *   This strong community is considered a fundamental reason for Deep Learning's current status.

# Types of neural networks - Class notes
(lec 3 deep learning playlist campusx)


# Notes: Types of Neural Networks | History of Deep Learning | Applications of Deep Learning

This video is the second in the Deep Learning course. It covers additional theoretical aspects which are important for gaining a broader perspective before diving into practical topics like Perceptrons in the next video.

The three main topics covered in this video are:
1.  **Types of Neural Networks**
2.  **History of Deep Learning**
3.  **Applications of Deep Learning**

---

## 1. Types of Neural Networks

Neural Networks are logical structures inspired by the human brain that form the basis of Deep Learning. While an Artificial Neural Network (ANN) is the simplest, various types of neural networks are designed for specific tasks. This video discusses five prominent types:

1.  **Artificial Neural Networks (ANNs) / Multi-Layer Perceptrons (MLPs)**
    *   These are the **simplest and most fundamental** type of neural network, considered the "grandparent" of other types.
    *   They consist of **multiple perceptron layers** (input, hidden, output) connected by weights.
    *   **Applications**: They are effective for almost any **supervised machine learning problem**, including regression and classification. ANNs excel at capturing **non-linear relationships** in data, and adding more hidden layers increases this capability.

2.  **Convolutional Neural Networks (CNNs)**
    *   **Structure**: CNNs feature a special layer called a "**convolutional layer**".
    *   **Applications**: They are highly effective for **image and video processing applications**.
    *   **Examples**: Used in self-driving cars, and in healthcare for medical image analysis (e.g., detecting cancer from chest scans).
    *   **Key Figure**: **Yann LeCun** is considered the "father" of CNNs.
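
What a convolutional layer computes can be sketched as a small filter (kernel) sliding over the image and producing one number per position. The example below is a minimal illustration with no padding and stride 1; the kernel values are hand-picked to respond to vertical edges, whereas a real CNN learns its kernels during training. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the image patch currently under the kernel.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        output.append(row)
    return output

# 4x4 toy image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero middle column marks exactly where the edge is, which is the kind of low-level feature early CNN layers pick up.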

3.  **Recurrent Neural Networks (RNNs)**
    *   **Structure**: Unlike MLPs and CNNs, where information flows only forward, RNNs contain **recurrent connections**: the hidden state computed at one step of the sequence is fed back into the network at the next step, creating a "memory" effect.
    *   **Applications**: Ideal for processing **sequential data** such as speech, text, and time-series data.
    *   **Advanced Version**: **Long Short-Term Memory (LSTM)** networks are a very good and widely used type of RNN.
    *   **Examples**: Used in applications like Google's voice assistant for speech recognition.
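
The "memory" effect of an RNN can be sketched with a single recurrent unit: the hidden state from the previous step is combined with the current input to produce the new hidden state. The weights `w_x`, `w_h` and the `tanh` activation below are illustrative assumptions.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: mix the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.8, w_h=0.5, b=0.0):
    """Process a sequence one element at a time, carrying the hidden state along."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# Same inputs at steps 2 and 3, but a different first input.
states_a = run_rnn([1.0, 0.0, 0.0])
states_b = run_rnn([0.0, 0.0, 0.0])

print(states_a)  # the effect of the first input is still visible at step 3
print(states_b)  # [0.0, 0.0, 0.0]
```

Although both sequences end with the same two inputs, the final hidden states differ, because the first input is still "remembered" through the recurrent connection.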

4.  **Autoencoders**
    *   **Purpose**: Primarily used for **data compression and dimensionality reduction** without significant loss of quality.
    *   **Structure**: They typically have an input layer, one or more hidden layers (an "**encoder**" part that compresses the input), and an output layer that tries to reconstruct the original input from the compressed representation (a "**decoder**" part). The hidden layer is usually smaller than the input/output layers.
    *   **Applications**: Used to reduce the size of input data or to learn efficient data representations.
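
The encoder/decoder structure can be sketched with hand-set weights: a 4-value input is compressed to a 2-value code and then reconstructed. This is only an illustration of the shape of the computation. The matrices `ENCODER` and `DECODER` are hypothetical and chosen by hand for inputs of the form `[a, b, a, b]`; a real autoencoder learns its weights by minimising the reconstruction error.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Encoder: 4 inputs -> 2-value code (the smaller hidden layer).
ENCODER = [
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
]

# Decoder: 2-value code -> 4 outputs.
DECODER = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
]

def encode(x):
    return matvec(ENCODER, x)

def decode(code):
    return matvec(DECODER, code)

def reconstruction_error(x):
    """Squared difference between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

redundant = [3.0, 7.0, 3.0, 7.0]
print(encode(redundant))                             # [3.0, 7.0]
print(reconstruction_error(redundant))               # 0.0
print(reconstruction_error([1.0, 2.0, 3.0, 4.0]))    # 4.0
```

The redundant input survives compression without loss, while an input that does not fit the pattern is reconstructed with error. This is why autoencoders work best on data with structure they can exploit.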

5.  **Generative Adversarial Networks (GANs)**
    *   **Structure**: GANs are unique because they involve **two neural networks that compete against each other**.
        *   A **Generator** network creates new data (e.g., images, text).
        *   A **Discriminator** network tries to distinguish between real data and data generated by the Generator.
        *   The Generator's goal is to produce data so realistic that the Discriminator cannot tell it apart from real data.
    *   **Applications**: Excellent for **generating new content** that closely resembles real data (realistic "fakes" learned from real examples).
    *   **Examples**: Generating realistic human faces (of people who don't exist), image-to-image translation, and creating new artistic works.
    *   **Key Figure**: **Ian Goodfellow** is known as the "father" of GANs.
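
The competition between Generator and Discriminator can be expressed through two loss functions. The sketch below assumes the commonly used binary cross-entropy formulation, where `d_real` and `d_fake` are the Discriminator's estimated probabilities that a real sample and a generated sample are real. The numbers are made up for illustration and no networks are trained.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when the Discriminator rates real data near 1 and fake data near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Low when the Discriminator is fooled into rating fake data near 1."""
    return -math.log(d_fake)

# Hypothetical early stage: the Discriminator easily spots the fakes.
early_d = discriminator_loss(d_real=0.9, d_fake=0.1)
early_g = generator_loss(d_fake=0.1)

# Hypothetical later stage: the Discriminator can only guess (0.5 for everything).
late_d = discriminator_loss(d_real=0.5, d_fake=0.5)
late_g = generator_loss(d_fake=0.5)

print(early_d, early_g)
print(late_d, late_g)
```

As the Generator improves, its loss falls while the Discriminator's loss rises: each network's gain is the other's loss, which is the "adversarial" part of the name.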

---

## 2. History of Deep Learning

Deep Learning has a rich and often tumultuous history, marked by periods of intense research followed by "AI winters" where funding and interest waned. Its recent success began around 2012, but its roots go back to the 1950s and 60s.

*   **1950s-1960s: The Dawn of Perceptrons**
    *   During the Cold War, nations like the US and USSR invested heavily in computer science to gain technological supremacy.
    *   **Frank Rosenblatt** (1957) created the **Perceptron**, a fundamental unit that could learn.
    *   Rosenblatt promoted Perceptrons enthusiastically, claiming they would solve all AI problems.

*   **1969: The First AI Winter - The XOR Problem**
    *   **Marvin Minsky and Seymour Papert** (1969) published research highlighting the **limitations of Perceptrons**.
    *   They showed that a single-layer Perceptron **could not solve problems that are not linearly separable**, such as the XOR (exclusive OR) function.
    *   This discovery caused a significant drop in funding and interest in neural networks, leading to the **first "AI winter"**.
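
The XOR limitation is easy to check directly. The sketch below searches a coarse grid of weights for a single perceptron with a step activation and finds none that reproduces XOR, then shows that adding one hidden layer (with hand-set, hypothetical weights) solves it. The grid search is only an illustration, not a proof.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def perceptron(x1, x2, w1, w2, b):
    return step(w1 * x1 + w2 * x2 + b)

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def solves_xor(w1, w2, b):
    return all(perceptron(x1, x2, w1, w2, b) == y for (x1, x2), y in XOR.items())

# Try every combination of weights and bias from -4.0 to 4.0 in steps of 0.5.
grid = [i / 2 for i in range(-8, 9)]
single_layer_solutions = [
    (w1, w2, b) for w1 in grid for w2 in grid for b in grid if solves_xor(w1, w2, b)
]
print(single_layer_solutions)  # []

def two_layer_xor(x1, x2):
    """One hidden layer with two units is enough."""
    h_or = perceptron(x1, x2, 1, 1, -0.5)      # fires if at least one input is 1
    h_nand = perceptron(x1, x2, -1, -1, 1.5)   # fires unless both inputs are 1
    return perceptron(h_or, h_nand, 1, 1, -1.5)  # fires only if both hidden units fire

print([two_layer_xor(x1, x2) for (x1, x2) in XOR])  # [0, 1, 1, 0]
```

No single straight line separates the XOR outputs, but combining two lines (OR and NAND) through a second layer does, which is why multi-layer networks were needed.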

*   **1970s-1980s: Backpropagation Emerges (and is Ignored)**
    *   **Paul Werbos** (1970s), in his PhD thesis, introduced the **Backpropagation algorithm**, a method to effectively train multi-layer neural networks. This technique could solve the XOR problem and other non-linear functions. However, due to the prevailing AI winter, his work largely went unnoticed for over a decade.
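
The core of backpropagation is the chain rule applied layer by layer, from the output back to the input. Here is a minimal sketch on a two-layer chain with one weight per layer, assuming sigmoid activations and a squared-error loss (both are illustrative choices). The gradients from the backward pass are compared against a numerical estimate as a sanity check.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w1, w2, x, y):
    """Forward pass only: x -> hidden -> prediction -> squared error."""
    h = sigmoid(w1 * x)
    y_hat = sigmoid(w2 * h)
    return (y_hat - y) ** 2

def backprop(w1, w2, x, y):
    """Forward pass, then gradients via the chain rule (output to input)."""
    h = sigmoid(w1 * x)
    y_hat = sigmoid(w2 * h)

    d_y_hat = 2 * (y_hat - y)                # dL/dy_hat
    d_z2 = d_y_hat * y_hat * (1 - y_hat)     # through the output sigmoid
    grad_w2 = d_z2 * h                       # dL/dw2
    d_h = d_z2 * w2                          # pass the error back to the hidden unit
    d_z1 = d_h * h * (1 - h)                 # through the hidden sigmoid
    grad_w1 = d_z1 * x                       # dL/dw1
    return grad_w1, grad_w2

def numeric_grad(f, value, eps=1e-6):
    """Central finite-difference estimate of df/dvalue."""
    return (f(value + eps) - f(value - eps)) / (2 * eps)

w1, w2, x, y = 0.5, -0.3, 1.5, 1.0
grad_w1, grad_w2 = backprop(w1, w2, x, y)
check_w1 = numeric_grad(lambda v: loss(v, w2, x, y), w1)
check_w2 = numeric_grad(lambda v: loss(w1, v, x, y), w2)

print(grad_w1, check_w1)
print(grad_w2, check_w2)
```

Training then repeatedly nudges each weight in the direction opposite to its gradient, which is how a multi-layer network can learn functions such as XOR.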

*   **1986: Backpropagation's Resurgence & The Universal Approximator Theorem**
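    *   **Geoffrey Hinton** and his colleagues (David Rumelhart and Ronald Williams) re-introduced Backpropagation in a highly influential 1986 paper.
    *   They demonstrated that **multi-layer perceptrons trained with backpropagation** could learn useful internal representations and solve non-linear problems such as XOR. Separately, the universal approximation theorem (Cybenko, 1989; Hornik et al., 1989) established that such networks can approximate a very wide class of functions, which is why they are described as "**universal approximators**".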
    *   This re-sparked interest in neural networks, but practical limitations remained: scarcity of labelled data and insufficient computing power.

*   **1990s-Early 2000s: The Second AI Winter**
    *   Neural networks once again faced challenges, partially due to the rise of other powerful machine learning algorithms like **Support Vector Machines (SVMs)** and **Random Forests**, which were less data-hungry and less computationally intensive. This led to another period of reduced interest and funding.

*   **2006: Deep Learning's Comeback - Unsupervised Pre-training**
    *   **Geoffrey Hinton** published another pivotal paper in 2006, introducing the concept of **unsupervised pre-training** and **Deep Belief Networks**.
    *   This work showed that deep neural networks could be trained effectively by pre-training them one layer at a time, achieving excellent results on datasets like MNIST. This marked the beginning of Deep Learning's modern era.

*   **2012: ImageNet Moment - The GPU Revolution**
    *   The **ImageNet Large Scale Visual Recognition Challenge (ILSVRC)** in 2012 was a watershed moment.
    *   **Hinton's team** (Alex Krizhevsky and Ilya Sutskever) used a deep convolutional neural network (AlexNet) trained on **GPUs** to achieve a top-5 error rate of about **15%**, far ahead of the runner-up, which used traditional methods and scored about 26%.
    *   This dramatic improvement demonstrated the power of deep neural networks and GPUs, leading to massive investment from tech giants like Google and Facebook.

*   **2014-Present: Rapid Advancements and Widespread Adoption**
    *   **Ian Goodfellow** (2014) introduced **Generative Adversarial Networks (GANs)**, opening new possibilities for generating realistic content.
    *   **AlphaGo** (2016): Google DeepMind's AlphaGo defeated world champion Lee Sedol in the game of Go (4-1), a feat previously thought to be decades away. This captured global attention and showcased AI's advanced capabilities.
    *   Deep Learning has become mainstream, integrated into countless products and services. This rapid progress is attributed to several factors:
        *   **Abundance of Data**: The smartphone revolution and cheap internet led to an exponential increase in data generation and the creation of large, open-source labelled datasets.
        *   **Powerful Hardware**: Advances in GPUs (NVIDIA CUDA), and the development of specialised hardware like TPUs, FPGAs, and NPUs, drastically accelerated training times.
        *   **Advanced Frameworks**: The development of user-friendly Deep Learning libraries like TensorFlow (Google) and PyTorch (Facebook) democratised access and accelerated research.
        *   **Pre-trained Architectures**: The availability of state-of-the-art, pre-trained models (e.g., ResNet, Transformers, YOLO) through **transfer learning** allowed practitioners to achieve high performance without building models from scratch.
        *   **Strong Community**: The collective efforts of researchers, engineers, educators, and students have been fundamental to Deep Learning's success and continued advancement.

---

## 3. Applications of Deep Learning

Deep Learning has permeated and transformed various fields, achieving state-of-the-art results and often surpassing human performance. Some notable applications include:

*   **Self-Driving Cars**
    *   Companies like Tesla and Google (Waymo) use Deep Learning to enable vehicles to understand their surroundings, navigate, and make decisions, learning from vast amounts of road and traffic data. The long-term goal is fully autonomous vehicles.

*   **Gaming**
    *   Deep Learning is used to create AI agents that can learn to play and master complex games. **AlphaGo** beating the world Go champion in 2016 is a prime example. Google's DeepMind also developed AI that can play Atari games.

*   **Chatbots and Virtual Assistants**
    *   Tools like Google Now, Siri, Cortana, and Alexa use Deep Learning (especially RNNs and Transformers) for real-time conversation, understanding context, and providing intelligent responses. Their performance has significantly improved since 2015.

*   **Image and Video Processing**
    *   **Image Classification/Object Detection**: Identifying and categorising objects within images (e.g., Google Photos automatically categorises pictures of friends, pets, or cars).
    *   **Image Colourisation**: Converting black-and-white photos into colour.
    *   **Image Caption Generation**: Automatically generating descriptive text for images.
    *   **Image Restoration**: Improving the quality of old or low-resolution photos.
    *   **Audio Generation for Video**: Generating matching sound for silent (muted) videos.

*   **Natural Language Processing (NLP) & Translation**
    *   **Real-time Language Translation**: Google Translate and similar tools use Deep Learning to translate text and speech in real-time, aiding communication across language barriers.
    *   **Text Classification**: Classifying text into categories (e.g., sentiment analysis, spam detection).
    *   **Sign Language Translation**: Converting sign language into text or speech for improved accessibility.

*   **Content Generation**
    *   **Generating Realistic Faces**: GANs can create images of human faces that are indistinguishable from real ones, even though the people do not exist.
    *   **Music Generation**: Deep Learning models can compose original musical pieces.
    *   **Handwriting Generation**: Creating realistic handwritten text.
    *   **Story Generation**: AI can write creative stories.
    *   **Code Generation**: Deep Learning can generate code based on descriptions.

*   **Medical and Scientific Research**
    *   **Drug Research**: Accelerating drug discovery and identifying issues in drug development.
    *   **Medical Imaging**: Assisting in the analysis of medical scans for diagnosis (e.g., detecting cancer).
    *   **Climate Science**: Analysing complex climate data.

*   **Finance & Other Industries**
    *   **Fraud Detection**: Identifying fraudulent transactions and activities.
    *   **Customer Service**: Enhancing chatbots and virtual agents for improved customer support.
    *   **Education**: Personalising learning experiences.
    *   **Material Inspection**: Quality control in manufacturing.

Deep Learning's ability to be applied across such a wide array of problems with state-of-the-art performance has cemented its place as a transformative technology. The theoretical aspects covered in these two videos are important before moving to practical topics, with the next video starting with **Perceptrons**, the building block of MLPs.