
Write-mate

Aim

The project aims to build a neural network based on LSTMs and an attention mechanism that generates realistic handwritten text from input text.

Description

The project is an implementation of Alex Graves' 2013 paper Generating Sequences With Recurrent Neural Networks. It processes input text and outputs realistic-looking handwriting using stacked LSTM layers combined with an attention mechanism. The output is a sequence of (x, y) co-ordinates of the handwriting, which can be plotted to obtain the required longhand sentences.

Theory

The model uses stacked long short-term memory (LSTM) layers and an attention mechanism.

LSTM:

An LSTM can use its memory to generate complex, realistic sequences containing long-range structure. It uses purpose-built memory cells to store information, which makes it better at finding and exploiting long-range dependencies in the data.
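As a rough illustration of how those memory cells work, here is a minimal single-time-step LSTM cell in NumPy. The names, sizes, and weight layout are illustrative only, not the project's actual code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4*hidden, input+hidden);
    the gates decide what to store, what to forget, and what to emit."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[:hidden])               # input gate
    f = sigmoid(z[hidden:2 * hidden])     # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden]) # output gate
    g = np.tanh(z[3 * hidden:])           # candidate cell update
    c = f * c_prev + i * g                # memory cell carries long-range info
    h = o * np.tanh(c)                    # hidden state passed to the next layer
    return h, c

# toy usage: input size 3, hidden size 4
rng = np.random.default_rng(0)
x = rng.normal(size=3)
h0, c0 = np.zeros(4), np.zeros(4)
W = rng.normal(scale=0.1, size=(16, 7))
b = np.zeros(16)
h1, c1 = lstm_step(x, h0, c0, W, b)
```

Because the forget gate multiplies the previous cell state rather than overwriting it, information can survive across many time steps.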

Attention mechanism

The number of co-ordinates used to write each character varies greatly with style, size, pen speed, and so on. The attention mechanism creates a soft window that dynamically determines an alignment between the text and the pen locations; that is, it helps the model learn which character to write next.
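The soft window in the Graves paper is a mixture of Gaussians over character positions: at each pen step, weights phi(u) say how much attention character u receives. A sketch in NumPy (array shapes and the toy numbers are illustrative):

```python
import numpy as np

def soft_window(alpha, beta, kappa, U):
    """Graves-style soft window over a length-U character sequence.
    alpha: (K,) importances, beta: (K,) widths, kappa: (K,) centres.
    Returns phi with shape (U,): phi(u) = sum_k alpha_k * exp(-beta_k * (kappa_k - u)^2)."""
    u = np.arange(U)
    phi = (alpha[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - u) ** 2)).sum(axis=0)
    return phi

# toy example: 3 window components attending over 5 characters,
# all centred on character index 2
alpha = np.array([1.0, 0.5, 0.2])
beta = np.array([1.0, 1.0, 1.0])
kappa = np.array([2.0, 2.0, 2.0])
phi = soft_window(alpha, beta, kappa, U=5)
```

The window vector fed to the LSTM stack is then `phi @ onehots`, a blend of the attended characters; because kappa is predicted as a monotonically increasing offset, the window slides forward through the text as writing proceeds.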

Mixture Density Network (MDN)

Mixture Density Networks are neural networks that can measure their own uncertainty, which helps capture the randomness of handwriting. They output parameters μ (mean), σ (standard deviation), and ρ (correlation) for several bivariate Gaussian components, along with a weight π for each component that gauges its contribution to the output.
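Given those parameters, generating a pen point means first choosing a component by its weight π and then sampling from that component's correlated 2-D Gaussian. A minimal sketch in NumPy (the parameter values are made up for illustration):

```python
import numpy as np

def sample_mdn(pi, mu, sigma, rho, rng):
    """Sample one (x, y) point from a bivariate Gaussian mixture.
    pi: (K,) weights summing to 1; mu: (K, 2) means;
    sigma: (K, 2) std-devs; rho: (K,) correlations."""
    k = rng.choice(len(pi), p=pi)        # pick a component by its weight
    sx, sy = sigma[k]
    r = rho[k]
    cov = np.array([[sx * sx,     r * sx * sy],
                    [r * sx * sy, sy * sy]])  # correlated 2-D covariance
    return rng.multivariate_normal(mu[k], cov)

rng = np.random.default_rng(0)
pi = np.array([0.7, 0.3])
mu = np.array([[0.0, 0.0], [5.0, 5.0]])
sigma = np.array([[1.0, 1.0], [0.5, 0.5]])
rho = np.array([0.0, 0.2])
point = sample_mdn(pi, mu, sigma, rho, rng)
```

Sampling rather than taking the most likely point is what gives each generated handwriting sample its natural variation.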

Model Architecture

Handwriting prediction

It consists of 3 stacked LSTM layers. The output is fed to a mixture density layer, which provides the means, variances, correlations, and weights of 20 mixture components.
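With 20 components, the size of the mixture density layer's output per time step can be worked out directly (following the parameterisation in the Graves paper, where an end-of-stroke probability adds one extra unit):

```python
K = 20  # number of mixture components
# eos + weights + 2-D means + 2-D std-devs + correlations
outputs = 1 + K + 2 * K + 2 * K + K
print(outputs)  # 121
```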

Handwriting synthesis

The model consists of 3 LSTM layers stacked on top of each other, with an added input from the character sequence mediated by the window layer.

The inputs (x, y, eos) are taken from the IAM-OnDB online handwriting database, where x and y are the pen co-ordinates and eos marks the points in the sequence where the pen is lifted off the whiteboard.

Onehots is the one-hot representation of the input text.
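One-hot encoding maps each character to a vector that is zero everywhere except at that character's index in the alphabet. A small sketch (the alphabet here is illustrative, not the project's actual character set):

```python
import numpy as np

def one_hot(text, alphabet):
    """Encode text as a (len(text), len(alphabet)) one-hot matrix."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    onehots = np.zeros((len(text), len(alphabet)))
    for t, ch in enumerate(text):
        onehots[t, index[ch]] = 1.0   # exactly one 1 per row
    return onehots

oh = one_hot("hi", "abcdefghijklmnopqrstuvwxyz ")
```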

The output is an (x, y, eos) sequence representing the handwritten form of the input text, which can be plotted to obtain the final result.
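Plotting the output means accumulating the pen points into absolute positions and breaking the trajectory into strokes wherever eos fires. A sketch, assuming the model emits pen offsets (dx, dy) as in the Graves paper:

```python
import numpy as np

def to_strokes(seq):
    """Split a sequence of (dx, dy, eos) rows into strokes of absolute
    (x, y) points; eos == 1 marks a pen lift that ends the stroke."""
    coords = np.cumsum(seq[:, :2], axis=0)  # offsets -> absolute positions
    strokes, start = [], 0
    for t, eos in enumerate(seq[:, 2]):
        if eos == 1:
            strokes.append(coords[start:t + 1])
            start = t + 1
    if start < len(seq):
        strokes.append(coords[start:])      # trailing stroke with no final lift
    return strokes

# toy sequence: two strokes of two points each
seq = np.array([[1, 0, 0],
                [1, 1, 1],
                [0, 1, 0],
                [1, 0, 1]], dtype=float)
strokes = to_strokes(seq)
```

Each stroke can then be drawn as a separate line (for example with matplotlib), so the pen-lift gaps between words and letters are preserved.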

Future works

Some of the ways in which this model could be extended are listed below:

  • Controlling the width of the output handwriting and introducing variance to give it a more realistic look.
  • Generating handwriting in the style of a particular writer; this can be achieved with primed sampling.

Contributors

Acknowledgements

Resources

Deep Learning courses
