# H1 Hello Class!
Machine Learning for the Arts | Prof. Twomey | [ml.roberttwomey.com](https://ml.roberttwomey.com)

## Overview

This notebook has two sections.

[Section 1: Working with Markdown](https://colab.research.google.com/github/roberttwomey/ml-art-code/blob/master/exercises/h1-watch-me-learn.ipynb#scrollTo=iB-V7YYh0ZDa&line=7&uniqifier=1) asks you to practice with a little bit of markdown styling.

[Section 2: Watch Me Learn](https://colab.research.google.com/github/roberttwomey/ml-art-code/blob/master/exercises/h1-hello-class.ipynb#scrollTo=fwBSSIin23n7&line=7&uniqifier=1) has you pick a text file to use as training data and then train a minimal language model from scratch. This part is based on a character-level model written by Andrej Karpathy (@karpathy). The model is a recurrent neural network ([RNN](https://en.wikipedia.org/wiki/Recurrent_neural_network)); we will learn more about what that is later.

When you are done with Section 1 and Section 2, export your file as a Jupyter notebook (.ipynb), and also save a copy as a PDF. Submit both files as Homework 1 on Canvas.

# Section 1: Working with Markdown

## 1. First off, let's make this notebook your own!

Go up to the top cell of this notebook and change the title to something that you like better.

Under the title and above the Overview section, add a block of text with your name, email, and some other info of your choice. Try playing around with Markdown and HTML styling.

## 2. Add a text cell below this one
Insert an image in that cell. Include some text of your choice.

## 3. Add a Python (code) cell below that.
Write some Python code that does something. A good place to start might be just a print statement:

```python
print("Hello Class!")
```

## 4. Add a variable to your code cell.

Try some code like the following (you will need to copy it into a new code cell).

```python
x = "roberto"
```
or

```python
the_answer = 10
```

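## 5. Add one more code cell where you print the value of that variable:

```python
# Print whichever variable(s) you defined in step 4.
# Printing a name you never defined raises a NameError.
print(x)
print(the_answer)
```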

## End of Section 1
Congrats, that's the end of Section 1.

# Section 2: Watch Me Learn
Here we are going to do a little machine learning. Just two parts:

1. Download a text file that you want the language model to learn from.
2. Train the language model and watch it learn from your text file.

This corresponds to the two parts of a typical ML project: **finding training data** and **training a model**.

NOTE: The language model below is a character-level Recurrent Neural Network (RNN) written by Andrej Karpathy. It is only about 100 lines of code! Source: [https://gist.github.com/karpathy/d4dee566867f8291f086](https://gist.github.com/karpathy/d4dee566867f8291f086)

## Step 1: Gather Training Data

You need to download a text file to work with.

First, find a text file online that you want to use for this project. (Or grab a text file that you already own.) This should be a plain text file (ending in .txt), i.e. not a Microsoft Word file, PDF, or anything like that.

In the cell below, replace the URL (**https://raw.github...**) with the address to a text file you want the model to learn from.

Then run the cell. This downloads the file to the notebook's current directory.

**NOTES**:
- You can click on the files icon at left to see the files in your current directory.
- From our intro notebooks, you know that the `!` runs a command on the computer that is running this notebook.
- **wget** is a program on most Linux-based computers that downloads a file from the internet.
- So **!wget https://raw.github...** runs a command to download a file from the internet to our current directory.

```python
!wget https://raw.githubusercontent.com/roberttwomey/ml-art-code/master/intro/script.txt
```

^^^ Check out the output above ^^^. Can you make sense of it? What do you think happened?
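One way to confirm that the download worked is to peek at the start of the file from Python. The sketch below is not part of the original notebook: `preview_text` is a hypothetical helper, and it assumes the file is named `script.txt` (the default download above) and can be read with the `ISO-8859-1` encoding that Step 2 uses.

```python
import os

def preview_text(path, n_chars=300, encoding="ISO-8859-1"):
    """Return the first n_chars characters of a text file (hypothetical helper)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read(n_chars)

# Assumption: the file downloaded above is called script.txt.
filename = "script.txt"
if os.path.exists(filename):
    print("%s is %d bytes" % (filename, os.path.getsize(filename)))
    print(preview_text(filename))
else:
    print("%s not found; check the wget output above." % filename)
```

If the preview shows readable text, the file is probably fine to train on. If it shows HTML tags, the URL likely pointed at a web page rather than a raw text file.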

## Step 2: Source Code for the RNN
This was written by Andrej Karpathy. Do you know who he is?

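Run the following cell to run the basic imports and declare the helper functions for our Recurrent Neural Network. If you downloaded a different file in Step 1, change `'script.txt'` in the `open(...)` line to the name of your file.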

```python
import numpy as np

# data I/O
data = open('script.txt', 'r', encoding='ISO-8859-1').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
print('data has %d characters, %d unique.' % (data_size, vocab_size))
char_to_ix = { ch:i for i,ch in enumerate(chars) }
ix_to_char = { i:ch for i,ch in enumerate(chars) }

# hyperparameters
hidden_size = 100 # size of hidden layer of neurons
seq_length = 25 # number of steps to unroll the RNN for
learning_rate = 1e-1

# model parameters
Wxh = np.random.randn(hidden_size, vocab_size)*0.01 # input to hidden
Whh = np.random.randn(hidden_size, hidden_size)*0.01 # hidden to hidden
Why = np.random.randn(vocab_size, hidden_size)*0.01 # hidden to output
bh = np.zeros((hidden_size, 1)) # hidden bias
by = np.zeros((vocab_size, 1)) # output bias

def lossFun(inputs, targets, hprev):
  """
  inputs,targets are both list of integers.
  hprev is Hx1 array of initial hidden state
  returns the loss, gradients on model parameters, and last hidden state
  """
  xs, hs, ys, ps = {}, {}, {}, {}
  hs[-1] = np.copy(hprev)
  loss = 0
  # forward pass
  for t in range(len(inputs)):
    xs[t] = np.zeros((vocab_size,1)) # encode in 1-of-k representation
    xs[t][inputs[t]] = 1
    hs[t] = np.tanh(np.dot(Wxh, xs[t]) + np.dot(Whh, hs[t-1]) + bh) # hidden state
    ys[t] = np.dot(Why, hs[t]) + by # unnormalized log probabilities for next chars
    ps[t] = np.exp(ys[t]) / np.sum(np.exp(ys[t])) # probabilities for next chars
    loss += -np.log(ps[t][targets[t],0]) # softmax (cross-entropy loss)
  # backward pass: compute gradients going backwards
  dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
  dbh, dby = np.zeros_like(bh), np.zeros_like(by)
  dhnext = np.zeros_like(hs[0])
  for t in reversed(range(len(inputs))):
    dy = np.copy(ps[t])
    dy[targets[t]] -= 1 # backprop into y. see http://cs231n.github.io/neural-networks-case-study/#grad if confused here
    dWhy += np.dot(dy, hs[t].T)
    dby += dy
    dh = np.dot(Why.T, dy) + dhnext # backprop into h
    dhraw = (1 - hs[t] * hs[t]) * dh # backprop through tanh nonlinearity
    dbh += dhraw
    dWxh += np.dot(dhraw, xs[t].T)
    dWhh += np.dot(dhraw, hs[t-1].T)
    dhnext = np.dot(Whh.T, dhraw)
  for dparam in [dWxh, dWhh, dWhy, dbh, dby]:
    np.clip(dparam, -5, 5, out=dparam) # clip to mitigate exploding gradients
  return loss, dWxh, dWhh, dWhy, dbh, dby, hs[len(inputs)-1]

def sample(h, seed_ix, n):
  """
  sample a sequence of integers from the model
  h is memory state, seed_ix is seed letter for first time step
  """
  x = np.zeros((vocab_size, 1))
  x[seed_ix] = 1
  ixes = []
  for t in range(n):
    h = np.tanh(np.dot(Wxh, x) + np.dot(Whh, h) + bh)
    y = np.dot(Why, h) + by
    p = np.exp(y) / np.sum(np.exp(y))
    ix = np.random.choice(range(vocab_size), p=p.ravel())
    x = np.zeros((vocab_size, 1))
    x[ix] = 1
    ixes.append(ix)
  return ixes
```

You should see a message about your data if the cell ran properly (for example: `data has 82463 characters, 95 unique.`).
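To see what `char_to_ix`, `ix_to_char`, and the 1-of-k (one-hot) vectors inside `lossFun` look like, here is a small sketch on a toy string. The string `toy_data` and every name prefixed with `toy_` are made up for illustration. The characters are sorted here, unlike in the cell above, only so that the mapping is the same on every run.

```python
import numpy as np

toy_data = "hello"                      # made-up stand-in for your text file
toy_chars = sorted(set(toy_data))       # unique characters, in a fixed order
toy_char_to_ix = {ch: i for i, ch in enumerate(toy_chars)}
toy_ix_to_char = {i: ch for i, ch in enumerate(toy_chars)}
print(toy_char_to_ix)

# Inputs are character indices; targets are the same sequence shifted by one,
# so the model learns to predict the next character.
toy_inputs = [toy_char_to_ix[ch] for ch in toy_data[:-1]]   # "hell"
toy_targets = [toy_char_to_ix[ch] for ch in toy_data[1:]]   # "ello"
print(toy_inputs, toy_targets)

# One-hot column vector for the first input character.
x = np.zeros((len(toy_chars), 1))
x[toy_inputs[0]] = 1
print(x.ravel())

# Decoding indices back into text recovers the original characters.
print("".join(toy_ix_to_char[ix] for ix in toy_targets))
```

The training loop does the same thing with your whole file: it slices out `seq_length` characters at a time as inputs and uses the slice shifted by one character as targets.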

The next cell declares some variables and will actually run the training loop.

**Run the following cell**.

Watch it run for a while and see how the outputs change. The loop runs **forever** until you interrupt it. At some point, click the stop button on the cell, then export your .ipynb and your .pdf to submit for the homework.

```python
n, p = 0, 0
mWxh, mWhh, mWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
mbh, mby = np.zeros_like(bh), np.zeros_like(by) # memory variables for Adagrad
smooth_loss = -np.log(1.0/vocab_size)*seq_length # loss at iteration 0
while True:
  # prepare inputs (we're sweeping from left to right in steps seq_length long)
  if p+seq_length+1 >= len(data) or n == 0:
    hprev = np.zeros((hidden_size,1)) # reset RNN memory
    p = 0 # go from start of data
  inputs = [char_to_ix[ch] for ch in data[p:p+seq_length]]
  targets = [char_to_ix[ch] for ch in data[p+1:p+seq_length+1]]

  # sample from the model now and then
  if n % 1000 == 0:
    sample_ix = sample(hprev, inputs[0], 100)
    txt = ''.join(ix_to_char[ix] for ix in sample_ix)
    print('----\n %s \n----' % (txt, ))

  # forward seq_length characters through the net and fetch gradient
  loss, dWxh, dWhh, dWhy, dbh, dby, hprev = lossFun(inputs, targets, hprev)
  smooth_loss = smooth_loss * 0.999 + loss * 0.001
  if n % 1000 == 0: print('iter %d, loss: %f' % (n, smooth_loss)) # print progress

  # perform parameter update with Adagrad
  for param, dparam, mem in zip([Wxh, Whh, Why, bh, by],
                                [dWxh, dWhh, dWhy, dbh, dby],
                                [mWxh, mWhh, mWhy, mbh, mby]):
    mem += dparam * dparam
    param += -learning_rate * dparam / np.sqrt(mem + 1e-8) # adagrad update

  p += seq_length # move data pointer
  n += 1 # iteration counter
```
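The Adagrad update at the bottom of the loop divides each parameter's step by the square root of its accumulated squared gradients, so parameters that have already seen large gradients take smaller steps over time. Here is a minimal sketch of the same update rule on a made-up one-parameter problem, minimizing `(w - 3)**2`. The names `w`, `mem_w`, `lr`, and the target value 3 are assumptions for illustration only and are not part of the notebook.

```python
import numpy as np

w = np.array([0.0])          # a single made-up parameter, starting at 0
mem_w = np.zeros_like(w)     # running sum of squared gradients (Adagrad memory)
lr = 1e-1                    # same value as learning_rate in the notebook

for step in range(2000):
    grad = 2 * (w - 3.0)     # gradient of (w - 3)**2 with respect to w
    mem_w += grad * grad     # accumulate squared gradient
    w += -lr * grad / np.sqrt(mem_w + 1e-8)   # same form as the update above
    if step % 500 == 0:
        print("step %d, w = %.4f" % (step, w[0]))

print("final w = %.4f" % w[0])
```

The printed values of `w` should creep toward 3, quickly at first and then more slowly as `mem_w` grows. The RNN applies this rule to every entry of `Wxh`, `Whh`, `Why`, `bh`, and `by` at once.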

Watch it run for a while and see how the outputs change. **Answer the following questions in the empty text box below**:
1. How many iterations did you get up to? Each iteration shows the network one training example: a sequence of `seq_length` (25) characters.
2. What is the final loss for your network (**loss**)? Where did the loss start? The loss function, or loss score, is the main measure of learning in a neural network.
3. How do the outputs look at the end? Is the text intelligible? Does it make sense? Does the output sample resemble the training data that you showed to the network?
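
For question 2, it can help to know where the starting value of `smooth_loss` comes from. Before any training, the network's predictions are close to a uniform guess over the vocabulary, so the loss per character is about `-log(1/vocab_size)`, summed over the `seq_length` characters in one sequence. The sketch below uses the 95 unique characters from the example message earlier as an assumption; your own `vocab_size` will differ. The helper `per_char_loss` is hypothetical and not part of the original notebook.

```python
import numpy as np

vocab_size = 95    # assumption: taken from the example message; use your own value
seq_length = 25    # same as the notebook's hyperparameter

# Loss of a uniform guess over the vocabulary, summed over one sequence.
initial_loss = -np.log(1.0 / vocab_size) * seq_length
print("expected starting loss: %.2f" % initial_loss)

def per_char_loss(sequence_loss, seq_length=seq_length):
    """Average cross-entropy per character (hypothetical helper)."""
    return sequence_loss / seq_length

# exp(per-character loss) is the perplexity: roughly how many characters the
# model is choosing between at each step. At the start it equals vocab_size.
print("starting perplexity: %.1f" % np.exp(per_char_loss(initial_loss)))
```

A lower smoothed loss means the model assigns higher probability to the actual next characters in your text, so comparing your final loss against this starting value gives a rough sense of how much the network learned.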

## End of Section 2

Don't forget to submit your homework on Canvas.

Export your file as a Jupyter notebook (.ipynb), and also save a copy as a PDF. Give them descriptive filenames (for instance *Twomey-H1.ipynb* and *Twomey-H1.pdf*, but with your own last name).

Submit both files as Homework 1 on Canvas.