Files and code for the ODSC tutorial "Sequence Modelling with Deep Learning".

Repository contents:
- Book_1_A_Game_of_Thrones.txt
- Book_2_A_Clash_of_Kings.txt
- Book_3_A_Storm_of_Swords.txt
- Book_4_A_Feast_for_Crows.txt
- Book_5_A_Dance_with_Dragons.txt
- GoT_Language_Model.ipynb
- Prerequisites_Sequence_Modelling.md
- README.md
- Season_1_Subtitles.json
- Season_2_Subtitles.json
- Season_3_Subtitles.json
- Season_4_Subtitles.json
- Season_5_Subtitles.json
- Season_6_Subtitles.json
- Season_7_Subtitles.json
- big_network.png
- bigger_network.png
- books.jpg
- final_trained_GoT_language_model.h5
- lm_data.png
- requirements.txt
- small_network.png


# Sequence Modelling with Deep Learning

This is the code and accompanying files for the practical portion of this ODSC tutorial.

## Setting up

This tutorial uses scientific Python libraries (numpy, pandas) and the Keras API on top of a TensorFlow backend. The exact version requirements are listed in `requirements.txt`.

You can set up the environment in a virtualenv using these commands:

virtualenv tutorial_env
source tutorial_env/bin/activate  # activate environment
pip3 install -r requirements.txt  # install requirements
jupyter notebook                  # launch notebook

A more fiddly but more general approach, in case you run into issues (e.g. the wrong Python version, or the notebook not picking up the new environment), is to register the environment as its own Jupyter kernel:

virtualenv --python=/usr/local/bin/python3 tutorial_env
source tutorial_env/bin/activate
pip3 install -r requirements.txt
pip3 install ipykernel
python3 -m ipykernel install --user --name tutorial_env --display-name "tutorial_env_kernel"
jupyter notebook

Then, in the notebook, select the "tutorial_env_kernel" kernel via Kernel > Change kernel.
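Either way, once the notebook server is running, a quick way to confirm the environment is wired up correctly is to import the main libraries and print their versions (a minimal sanity-check snippet, not part of the repository; the versions to expect are whatever is pinned in `requirements.txt`):

```python
# Sanity check: run this in a notebook cell (or as a script) inside tutorial_env.
import numpy as np
import pandas as pd
import tensorflow as tf
import keras

print("numpy:", np.__version__)
print("pandas:", pd.__version__)
print("tensorflow:", tf.__version__)
print("keras:", keras.__version__)
```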

## Tutorial Abstract

Much real-world data is sequential – think speech, text, DNA, stock prices, financial transactions and customer action histories. Modern methods for modelling sequence data are often deep learning-based, built from either recurrent neural networks (RNNs) or attention-based Transformers. A tremendous amount of research progress has recently been made in sequence modelling, particularly in its application to NLP problems. However, the inner workings of these sequence models can be difficult to dissect and intuitively understand.

This presentation/tutorial will start from the basics and gradually build upon concepts in order to impart an understanding of the inner mechanics of sequence models – why do we need specific architectures for sequences at all, when standard feed-forward networks could be used? How do RNNs actually handle sequential information, and why do LSTM units help networks retain information over longer time spans? How can Transformers do such a good job at modelling sequences without any recurrence or convolutions?
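To give a flavour of the first two questions: a vanilla RNN applies the same weights at every time step and carries information forward only through its hidden state. Below is an illustrative NumPy sketch of that recurrence (the dimensions and weight names are made up for the example and are not taken from the tutorial code):

```python
import numpy as np

# Toy sizes, chosen only for illustration.
input_dim, hidden_dim, seq_len = 8, 16, 5

rng = np.random.default_rng(0)
W_xh = 0.1 * rng.normal(size=(hidden_dim, input_dim))   # input -> hidden weights
W_hh = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))  # hidden -> hidden (the recurrence)
b_h = np.zeros(hidden_dim)

x = rng.normal(size=(seq_len, input_dim))  # one toy sequence of 5 time steps
h = np.zeros(hidden_dim)                   # initial hidden state

for t in range(seq_len):
    # The same weights are reused at every step; information about earlier
    # inputs reaches later steps only through the hidden state h.
    h = np.tanh(W_xh @ x[t] + W_hh @ h + b_h)

print(h.shape)  # (16,)
```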

In the practical portion of this tutorial, attendees will learn how to build their own LSTM-based language model in Keras. A few other use cases of deep learning-based sequence modelling will be discussed – including sentiment analysis (prediction of the emotional valence of a piece of text) and machine translation (automatic translation between different languages).
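The language model itself is built step by step in `GoT_Language_Model.ipynb`. As a rough sketch of the kind of architecture involved – and only a sketch: the vocabulary size, sequence length and layer sizes below are placeholders rather than the values used in the notebook – a word-level LSTM language model in Keras looks something like this:

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

vocab_size = 10000  # placeholder: size of the tokenised vocabulary
seq_length = 50     # placeholder: number of tokens fed in per training example

model = Sequential([
    # Map each token id to a dense vector.
    Embedding(vocab_size, 128, input_length=seq_length),
    # The LSTM reads the sequence and summarises it in its final hidden state.
    LSTM(256),
    # Predict a probability distribution over the next token.
    Dense(vocab_size, activation="softmax"),
])

model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.summary()
```

The Embedding layer turns token ids into dense vectors, the LSTM consumes them one step at a time, and the final softmax Dense layer scores every word in the vocabulary as the possible next token.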

The goals of this presentation are to provide an overview of popular sequence-based problems, impart an intuition for how the most commonly-used sequence models work under the hood, and show that quite similar architectures are used to solve sequence-based problems across many domains.
