# Introduction to Deep Learning Systems

**Primary reference:** 
[CMU 10-414/714: Deep Learning Systems](https://dlsyscourse.org/lectures/)
- Fall 2024
- Instructors: J. Zico Kolter and Tianqi Chen

This course provides an introduction to how modern deep learning systems work.
You will learn the underlying concepts of these systems, such as
automatic differentiation, neural network architectures, optimization, and efficient
operations on hardware like GPUs.

To solidify your understanding, the homeworks have you build **needle** from scratch
(a deep learning library loosely similar to PyTorch) and
implement many common architectures in it.


## Why study deep learning?

**Deep Learning Systems** (DLS) solved problems considered hard prior to 2010, e.g. obtaining SOTA scores on tasks and challenges such as [ImageNet](https://www.image-net.org/challenges/LSVRC/), [CASP](https://en.wikipedia.org/wiki/CASP), the game of Go$^1$, and text & image generation:

[1] Game-tree complexity of roughly $250^{150} \approx 10^{360}$, assuming about 250 legal moves per position over games of about 150 moves.

<img src="img/01-0.png">

<img src="img/01-1.png">

## Why study deep learning systems?

### Reason #1. To build deep learning systems

Despite the dominance of deep learning libraries such as TensorFlow and PyTorch, the
playing field in this space is remarkably fluid (see, e.g., the recent emergence of JAX).
You may want to work on developing existing frameworks (virtually all of which are
open source), or developing your own new frameworks for specific tasks.

DLS is not just for the "big players":

<img src="img/01-2.png">

**Controversial claim.** The single largest driver of widespread adoption of deep
learning has been the creation of easy-to-use automatic differentiation libraries:

<img src="img/01-3.png">


### Reason #2. To use existing systems more effectively 

Understanding how the internals of existing deep learning systems work lets you
use them much more efficiently. For example, you can make your custom 
non-standard layer run (much) faster in
TensorFlow / PyTorch by understanding how these
operations are executed. Understanding deep learning systems is a "superpower" that will let you
accomplish your research aims much more efficiently.


<img src="img/01-6.png">
<img src="img/01-5.png">
<img src="img/01-4.png">

### Reason #3. Deep learning systems are fun!

Despite their seeming complexity, the core underlying algorithms behind deep
learning systems (**automatic differentiation** + **gradient-based optimization**) are
extremely simple. Unlike (say) operating systems, you could probably write a “reasonable” deep
learning library in <2000 lines of (dense) code.

Then there is the first time you build your own automatic differentiation library and
realize that you can take the gradient of a gradient without knowing how you would even go
about deriving it mathematically (e.g., the gradient of the gradient of a for-loop 😵‍💫).
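
To make this concrete, here is a minimal sketch of reverse-mode automatic differentiation over scalars, followed by a few steps of gradient descent. The names `Value`, `topo_order`, and `grad` are hypothetical and chosen only for illustration; this is not needle's API or the course's implementation. Because every backward rule is itself written with `Value` operations, the gradient that comes back is another graph, so it can be differentiated again.

```python
class Value:
    """Scalar node in a computation graph (illustrative only, not needle's API)."""

    def __init__(self, data, parents=(), grad_fn=None):
        self.data = float(data)
        self.parents = parents
        # grad_fn maps the upstream gradient (a Value) to one gradient per parent.
        self.grad_fn = grad_fn

    def __add__(self, other):
        other = _wrap(other)
        return Value(self.data + other.data, (self, other), lambda g: (g, g))

    def __mul__(self, other):
        other = _wrap(other)
        return Value(self.data * other.data, (self, other),
                     lambda g: (g * other, g * self))

    __radd__ = __add__
    __rmul__ = __mul__


def _wrap(x):
    """Promote plain numbers to constant graph nodes."""
    return x if isinstance(x, Value) else Value(x)


def topo_order(root):
    """List every node reachable from root, parents before children."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)

    visit(root)
    return order


def grad(output, wrt):
    """Return d(output)/d(wrt) as a Value, so it can be differentiated again."""
    grads = {id(output): Value(1.0)}
    for node in reversed(topo_order(output)):  # children before parents
        g = grads.get(id(node))
        if g is None or node.grad_fn is None:
            continue
        for parent, parent_grad in zip(node.parents, node.grad_fn(g)):
            if id(parent) in grads:
                grads[id(parent)] = grads[id(parent)] + parent_grad
            else:
                grads[id(parent)] = parent_grad
    return grads.get(id(wrt), Value(0.0))


# Gradient of the gradient of a for-loop: the loop computes y = x**4.
x = Value(2.0)
y = Value(1.0)
for _ in range(4):
    y = y * x
dy = grad(y, x)    # 4 * x**3
d2y = grad(dy, x)  # 12 * x**2
print(y.data, dy.data, d2y.data)  # 16.0 32.0 48.0

# Gradient-based optimization: minimize (w - 3)**2 with plain gradient descent.
w = Value(0.0)
for _ in range(50):
    diff = w + (-3.0)
    loss = diff * diff
    w = Value(w.data - 0.1 * grad(loss, w).data)
print(round(w.data, 3))  # 3.0
```

A real library works on tensors rather than scalars and supports many more operators, but the structure (record a graph, walk it in reverse topological order, accumulate gradients) is the same idea at small scale.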

## Elements of deep learning systems

We will touch on the following elements throughout the course:

| Element | What it covers |
|:--|:--|
| **Compose** | multiple tensor operations to build modern machine learning models | 
| **Transform** | a sequence of operations (forward & backward computation, e.g. AD) |
| **Accelerate** | computation via specialized hardware |
| **Extend** |  more hardware backends, more operators |

## Prerequisites

- Systems programming 
- Linear algebra
- Other mathematical background: e.g., calculus, probability, basic proofs
- Python and C++ development$^1$
- Basic prior experience with ML

[1] [C++ crash course](https://www.youtube.com/watch?v=9Myk2vcK8s8)

## Learning objectives

- Understand the basic functioning of modern deep learning libraries
  - Including concepts like automatic differentiation and gradient-based optimization
- Be able to implement several standard deep learning architectures
  - MLPs, ConvNets, RNNs, Seq2Seq, Transformers, truly from scratch
- Understand how hardware acceleration (e.g., on GPUs) works under the hood
- Be able to develop your own highly efficient code for modern DL

## Course instructors

<img src="img/01-7.png">
<img src="img/01-8.png">