# DL mechanisms

This page covers concepts traditionally associated with deep learning.

**Deep learning** is a subfield of machine learning built around neural networks, models loosely inspired by the human brain.

## Recurrent

The recurrent mechanism is an approach to processing data that is typically sequential. The main idea is to reuse information about how previous elements of the sequence were processed when processing the following ones.

Mathematically, it can be written as:

$$h_t = f(x_t W^T_1 + b_1 + h_{t-1} W^T_2 + b_2)$$

Where:  
- $x_t$: input at the $t$-th step.  
- $h_t$: vector that describes hidden state at the $t$-th step.  
- $W_1$: weights associated with the input.  
- $W_2$: weights associated with the state.  
- $b_1$: bias associated with the input.  
- $b_2$: bias associated with the state.  
- $f$: activation function, typically a hyperbolic tangent.

Each $h_t$ depends on $x_t$ and $h_{t-1}$. In turn, $h_{t-1}$ depends on $x_{t-1}$ and $h_{t-2}$, and so on recursively, so $h_t$ carries information about every input up to step $t$.
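To make the recursion concrete, the first two steps can be unrolled as below. This is only a sketch: the initial state $h_0$ is not defined above and is an assumption here (it is commonly taken to be a zero vector).

$$h_1 = f(x_1 W^T_1 + b_1 + h_0 W^T_2 + b_2)$$

$$h_2 = f(x_2 W^T_1 + b_1 + h_1 W^T_2 + b_2) = f\left(x_2 W^T_1 + b_1 + f(x_1 W^T_1 + b_1 + h_0 W^T_2 + b_2)\, W^T_2 + b_2\right)$$

The same weights $W_1$, $W_2$ and biases $b_1$, $b_2$ are reused at every step; only the input and the previous state change.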

The resulting hidden states $h_t$, for $t = 1, \dots, n$, can then be used in subsequent steps to describe the process we are interested in.
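The recurrence above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `rnn_forward`, the zero initial state `h0`, the array shapes, and the random weights are all assumptions made for the example, and $f$ is taken to be the hyperbolic tangent.

```python
import numpy as np


def rnn_forward(x, W1, b1, W2, b2, h0=None):
    """Apply h_t = tanh(x_t W1^T + b1 + h_{t-1} W2^T + b2) along a sequence.

    x:      (n, input_size) array, one row per step
    W1:     (hidden_size, input_size) input weights
    W2:     (hidden_size, hidden_size) state weights
    b1, b2: (hidden_size,) biases
    h0:     (hidden_size,) initial state; zeros if omitted (an assumption)
    """
    hidden_size = W1.shape[0]
    h = np.zeros(hidden_size) if h0 is None else h0
    states = []
    for x_t in x:
        # The previous state h feeds into the computation of the next one.
        h = np.tanh(x_t @ W1.T + b1 + h @ W2.T + b2)
        states.append(h)
    # Row t-1 of the result holds h_t.
    return np.stack(states)


# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
n, input_size, hidden_size = 5, 3, 4
x = rng.normal(size=(n, input_size))
W1 = rng.normal(size=(hidden_size, input_size))
W2 = rng.normal(size=(hidden_size, hidden_size))
b1 = np.zeros(hidden_size)
b2 = np.zeros(hidden_size)

H = rnn_forward(x, W1, b1, W2, b2)
print(H.shape)  # (5, 4)
```

The function returns all hidden states rather than only the last one, so later steps can use either the full sequence of states or just the final state.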

For a more detailed explanation, see the [Recurrent](dl_mechanisms/recurrent.ipynb) page.

## Transformer

The transformer is a deep learning architecture that was introduced in the paper [Attention Is All You Need](https://papers.nips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf).

The architecture is shown in the following picture:

![](dl_mechanisms/transformer_files/schema.svg)

For details, see the [Transformer](dl_mechanisms/transformer.ipynb) page.