# RNN from scratch in PyTorch
> In this post, we will implement an RNN from scratch in PyTorch and use it to build a character-level language model.

- toc: true 
- badges: true
- comments: true
- categories: [RNN, Language Modeling, Pytorch]

Let's first install the two extra packages we need, `wget` and `d2l`, and then import the libraries.

In [9]:
!pip install wget


In [2]:
!pip install d2l==0.16.1


In [25]:
import torch
from torch import nn, optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import numpy as np
from d2l import torch as d2l
import os
import wget
import re

## What is an RNN?

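An RNN, or Recurrent Neural Network, is just a fancy name for a neural network with a loop in it: at every time step the network receives the current input together with its own hidden state from the previous step. Because the same weights are reused at each step and the hidden state carries information forward, RNNs handle sequential data such as text really well.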
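The looping idea can be sketched in a few lines of plain Python before any PyTorch is involved. This is only a minimal illustration, assuming a single hidden unit, a `tanh` activation, and hand-picked weights; the names `rnn_step`, `run_rnn`, `w_xh`, `w_hh` and `b_h` are hypothetical and are not part of the implementation built in this post.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One recurrence step of a single-unit RNN:
    h_t = tanh(w_xh * x_t + w_hh * h_{t-1} + b_h)."""
    return math.tanh(w_xh * x + w_hh * h_prev + b_h)

def run_rnn(inputs, w_xh=0.5, w_hh=0.8, b_h=0.0, h0=0.0):
    """Apply the same step, with the same weights, to every element
    of the sequence and collect the hidden state after each step."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b_h)  # the "loop": h feeds back in
        states.append(h)
    return states

# Only the first input is non-zero, yet the later hidden states are
# still non-zero: the hidden state remembers what came before.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

A real RNN replaces the scalars with weight matrices and vectors, but the structure of the loop stays the same.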

## Reading the Dataset

To get started, we load the text of H. G. Wells' *The Time Machine*. This is a fairly small corpus of just over 30,000 words, but it is enough for what we want to illustrate.

In [26]:
def download_data():
    """Download the time machine dataset"""
    DATA_URL="http://d2l-data.s3-accelerate.amazonaws.com/timemachine.txt"
    wget.download(DATA_URL)

In [27]:
if not os.path.exists("timemachine.txt"):
    download_data()

In [28]:
def read_data():
    """Load the time machine dataset into a list of text lines."""
    with open('timemachine.txt', 'r') as f:
        lines = f.readlines()
    return [re.sub('[^A-Za-z]+', ' ', line).strip().lower() for line in lines]

In [29]:
lines = read_data()
print(f'# text lines: {len(lines)}')
print(lines[0])
print(lines[10])

# text lines: 3221
the time machine by h g wells
twinkled and his usually pale face was flushed and animated the
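To see what the cleaning step inside `read_data` does, here is a small sketch that applies the same regular expression to a single line. The helper name `clean_line` and the sample string are made up for illustration. Every run of non-letter characters collapses into one space, and the result is stripped and lowercased.

```python
import re

def clean_line(line):
    """Same cleaning as in read_data: keep letters only, collapse
    everything else into single spaces, strip, and lowercase."""
    return re.sub('[^A-Za-z]+', ' ', line).strip().lower()

# Illustrative input with punctuation, digits and a trailing newline.
sample = "The Time Machine, by H. G. Wells [1898]\n"
cleaned = clean_line(sample)
print(cleaned)            # the time machine by h g wells

# For a character-level model, each character (including the space)
# is one token.
print(list(cleaned)[:8])  # ['t', 'h', 'e', ' ', 't', 'i', 'm', 'e']
```

After cleaning, the text contains only the 26 lowercase letters and the space, which keeps the character vocabulary very small.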


![](https://pythonmachinelearning.pro/wp-content/uploads/2017/10/Unrolled-RNN.png.webp "Unrolled Recurrent Neural Network. Less scary now, isn't it? Source: [Mohit Deshpande](https://pythonmachinelearning.pro/recurrent-neural-networks-for-language-modeling/)")