# <center>Foundations of Deep Learning for the Social Sciences:<br>Day 1 Python Tutorial</center>

This notebook will explore the concepts and models we discussed today using the [Keras](https://keras.io/) and [PyTorch](https://pytorch.org/) Python packages.

Familiarity with Python may be helpful but is not strictly necessary. For those who are familiar with R: Python is syntactically similar to R, but it is more of an [object-oriented language](https://en.wikipedia.org/wiki/Object-oriented_programming), whereas R is more of a [functional language](https://en.wikipedia.org/wiki/Functional_programming). In practice, this means that in Python you'll see a lot of syntax that looks like `object.attribute`. For example, you might define a statistical model (the object), then access its coefficients (the attribute) by typing `model.coef`. In R, on the other hand, you typically use a function to get the parameters: `coef(model)`. In my experience, this isn't too difficult to adjust to.
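To make the `object.attribute` pattern concrete, here is a minimal sketch using a toy class. The names `LinearModel`, `coef`, `intercept`, and `predict` are hypothetical and chosen only for illustration; they do not come from any particular statistics package.

```python
# Hypothetical toy class; `LinearModel` and its attributes are illustrative only.
class LinearModel:
    def __init__(self, coef, intercept):
        self.coef = coef            # stored as an attribute of the object
        self.intercept = intercept

    def predict(self, x):
        # A method is a function attached to the object.
        return self.coef * x + self.intercept


model = LinearModel(coef=2.0, intercept=1.0)
print(model.coef)          # attribute access: object.attribute
print(model.predict(3.0))  # method call: object.method(arguments)
```

The R analogue of the first `print` line would be a function call such as `coef(model)`.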

Python does have the upper hand, however, in that its deep learning packages are further developed and better supported than R's. Hence, most of our tutorials will be in Python. There are now R wrappers for TensorFlow and PyTorch, so I will also provide some R code for comparison.

Just like in R, in Python you need to import the packages you intend to use. In Python, packages are imported as objects called *modules* that you are welcome to rename for convenience. The functions included in each package can be used by typing `module.function()` (with appropriate arguments); this may feel similar to typing `package::function()` in R.
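As a small sketch of the import syntax, the example below uses two modules from Python's standard library (`math` and `statistics`) rather than the deep learning packages, so it runs without installing anything. The alias `stats` is an arbitrary choice.

```python
import math                 # import a module under its own name
import statistics as stats  # import a module and rename it for convenience

root = math.sqrt(16.0)                 # module.function(arguments)
average = stats.mean([1.0, 2.0, 3.0])  # the renamed module works the same way
print(root, average)
```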

Let's begin by importing the deep learning packages we'll be using in this tutorial.

In [None]:
import tensorflow as tf
from tensorflow import keras
import torch

## Tensors: The Fundamental Deep Learning Data Structure

Tensors are the basic objects used for computing in deep learning frameworks. In the deep learning context, a tensor is just a multidimensional array. In statistics, we usually only work with 0-dimensional arrays (scalars), 1-dimensional arrays (vectors), and 2-dimensional arrays (matrices):

\begin{equation}
    \underbrace{a}_{\text{a scalar}}, \qquad
    \mathbf{a} = \underbrace{\begin{bmatrix}
        a_1 \\
        a_2 \\
        a_3 \\
    \end{bmatrix}}_{\text{a vector}}, \qquad
    \mathbf{A} = \underbrace{\begin{bmatrix}
        a_{1, 1} & a_{1, 2} & a_{1, 3} \\
        a_{2, 1} & a_{2, 2} & a_{2, 3} \\
        a_{3, 1} & a_{3, 2} & a_{3, 3} \\
    \end{bmatrix}}_{\text{a matrix}}.
\end{equation}

In deep learning frameworks, however, arrays can have more dimensions than just two. For example, a 3-dimensional tensor is a stack of matrices:

![3d-tensor.jpg](attachment:3d-tensor.jpg)

As an example, we might use a 3D tensor to hold a longitudinal data set where we have 30 people answer 10 questions at 5 time points. In this case, we'd typically format our data as a $5 \times 30 \times 10$ tensor. This is rather elegant: instead of having a list of five $30 \times 10$ matrices that must be operated on separately, all our data is contained in a single array structure that we can perform operations on directly.

Creating tensors in Keras/TensorFlow and PyTorch is straightforward.

## Univariate Logistic Regression

As in the first lecture, we will begin by demonstrating some basic deep learning concepts using logistic regression. This will also introduce the syntax used by Keras and PyTorch before we get into more complicated deep learning examples.

Let's start with a simple univariate logistic regression model:
\begin{equation}
    \hat{y} = \sigma(wx + b),
\end{equation}
where $w$ is a slope, $x$ is a predictor, $b$ is an intercept, $\sigma(z) = 1 / (1 + \exp[-z])$ is the inverse logistic link or *sigmoid* function, and $\hat{y}$ is the predicted probability that the outcome is equal to $1$. This is the model we used in lecture when we introduced backpropagation and stochastic gradient methods.
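Before turning to the deep learning frameworks, it may help to evaluate this model by hand. The NumPy sketch below computes $\hat{y}$ for a few predictor values; the values of `w`, `b`, and `x` are illustrative assumptions, not estimates from data.

```python
import numpy as np


def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))


# Illustrative values (assumptions, not estimates from data).
w, b = -0.5, 3.0
x = np.array([0.0, 6.0, 12.0])

y_hat = sigmoid(w * x + b)  # predicted probabilities that y = 1
print(y_hat)
```

With a negative slope, the predicted probability falls as $x$ grows, and it equals exactly $0.5$ where $wx + b = 0$ (here, at $x = 6$).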

We'll begin by defining this model using Keras. Keras is a high-level interface to [TensorFlow](https://www.tensorflow.org/), one of Python's major deep learning frameworks. In Keras and TensorFlow

In [None]:
import numpy as np

In [None]:
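# Import from tensorflow.keras for consistency with the imports at the top.
from tensorflow.keras import models, layers, activations

# A single Dense unit with a sigmoid activation computes sigma(w * x + b),
# i.e., univariate logistic regression.
model = models.Sequential()
model.add(layers.Dense(units = 1,
                       activation = activations.sigmoid))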
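As a quick check that this architecture really is $\hat{y} = \sigma(wx + b)$, the sketch below rebuilds the same one-unit model with an explicit input size, sets the parameters by hand, and computes a prediction. The parameter values ($w = -0.5$, $b = 3$) and the input ($x = 6$) are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

# Same architecture as the model above, with an explicit input of one
# predictor so the weights exist before any data are passed through.
model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(units=1, activation="sigmoid"),
])

# Illustrative parameter values (assumptions): slope w = -0.5, intercept b = 3.
model.set_weights([np.array([[-0.5]], dtype="float32"),  # kernel, shape (1, 1)
                   np.array([3.0], dtype="float32")])    # bias, shape (1,)

x = np.array([[6.0]], dtype="float32")  # one observation, one predictor
y_hat = model.predict(x, verbose=0)     # sigma(-0.5 * 6 + 3) = sigma(0)
print(y_hat)
```

The Dense layer's kernel plays the role of the slope $w$ and its bias plays the role of the intercept $b$.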

### Backpropagation in Action

In [None]:
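# Forward pass for a single observation (x = 6, y = 1).
x = torch.tensor([6.])
w = torch.tensor([-0.5], requires_grad = True)
b = torch.tensor([3.], requires_grad = True)
z = w * x + b
z.retain_grad()      # keep the gradient of this intermediate tensor
y_hat = z.sigmoid()
y_hat.retain_grad()  # likewise for the predicted probability
y = torch.tensor([1.])
# Binary cross-entropy loss.
L = -(y * y_hat.log() + (1 - y) * (1 - y_hat).log())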
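The cell above only runs the forward pass. One way to complete the picture is to call `L.backward()` and compare the gradients PyTorch computes against the chain rule worked out by hand. The sketch below repeats the forward pass so that it is self-contained; the names `dL_dz`, `dL_dw`, and `dL_db` are introduced here for illustration.

```python
import torch

x = torch.tensor([6.])
y = torch.tensor([1.])
w = torch.tensor([-0.5], requires_grad=True)
b = torch.tensor([3.], requires_grad=True)

# Forward pass; retain_grad() keeps gradients of intermediate tensors.
z = w * x + b
z.retain_grad()
y_hat = z.sigmoid()
y_hat.retain_grad()
L = -(y * y_hat.log() + (1 - y) * (1 - y_hat).log())

# Backward pass: autograd applies the chain rule from L back to w and b.
L.backward()

# Hand-derived chain-rule gradients for comparison.
dL_dz = (y_hat - y).detach()  # dL/dz = y_hat - y for sigmoid + cross-entropy
dL_dw = dL_dz * x             # dL/dw = dL/dz * dz/dw = dL/dz * x
dL_db = dL_dz                 # dL/db = dL/dz * dz/db = dL/dz

print(y_hat.grad, z.grad, w.grad, b.grad)
print(dL_dw, dL_db)
```

Here $z = 0$ and $\hat{y} = 0.5$, so $\partial L / \partial z = \hat{y} - y = -0.5$, which gives $\partial L / \partial w = -3$ and $\partial L / \partial b = -0.5$.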

## Multivariate Logistic Regression

## Multilayer Perceptrons

## Recurrent Neural Networks

### The Vanishing Gradient Problem

### Gated Architectures: The Long Short-Term Memory Network

## Convolutional Neural Networks

### Convolution to Find Image Edges

### Classifying Handwritten Digits