What you get is what you see: A visual markup decompiler #19

Open
flrngel opened this issue Jul 3, 2018 · 0 comments
https://arxiv.org/abs/1609.04938

1. Abstract

  • the model is trained end-to-end
  • the model combines a convolutional network with a recurrent network
  • a standard domain-specific LaTeX OCR system reaches about 25% accuracy, while the paper's model reaches 75%

2. Introduction

  • OCR requires joint processing of image and text data
  • WYGIWYS is a simple extension of the attention-based encoder-decoder model
  • the paper introduces the IM2LATEX-100K dataset

3. Problem: image-to-markup generation

  • the authors define the image-to-markup problem as converting a rendered source image to target presentational markup
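
One way to write this problem down, as a sketch only: the symbols below (x for the rendered image, y for the markup token sequence, Σ for the token vocabulary, H and W for image height and width) are notation assumed for these notes, and a single-channel image is assumed for simplicity.

```latex
\[
  x \in \mathbb{R}^{H \times W}, \qquad
  y = (y_1, \dots, y_T), \quad y_t \in \Sigma
\]
\[
  p(y \mid x) = \prod_{t=1}^{T} p(y_t \mid y_1, \dots, y_{t-1}, x), \qquad
  \hat{y} = \arg\max_{y} \, p(y \mid x)
\]
```

The model is trained to assign high probability to the reference markup, and at test time it searches for a high-probability sequence ŷ whose rendering should match x.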

4. Model


Convolutional Network

  • the convolutional network does not use a fully connected layer
    • this preserves the locality of the CNN features so that visual attention can be applied to them
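
To see why dropping the fully connected layer matters, it can help to trace the spatial size of the feature map. The sketch below uses a hypothetical layer stack (three 3x3 convolutions with padding 1, each followed by 2x2 max pooling) and a hypothetical 64x256 input; these are not the paper's actual settings. The point is only that the output is still a 2-D grid whose cells map back to image regions.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output length of one spatial dimension after a conv or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


def feature_grid_shape(height, width, layers):
    """Trace (H, W) through a stack of conv/pool layers with no fully connected layer."""
    for _kind, kernel, stride, pad in layers:
        height = conv_out(height, kernel, stride, pad)
        width = conv_out(width, kernel, stride, pad)
    return height, width


# Hypothetical stack (an assumption for illustration, not the paper's architecture).
LAYERS = [
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
]

# A 64x256 input ends up as an 8x32 grid of feature vectors.
grid_h, grid_w = feature_grid_shape(64, 256, LAYERS)
print(grid_h, grid_w)
```

A fully connected layer would flatten this grid into one vector, and attention would no longer have separate locations to choose between.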

Row Encoder

  • Show, Attend and Tell shows that an image feature grid can be fed directly into the decoder
    • OCR, however, involves significant relative sequential order information
    • so running an RNN over each row of the feature grid helps in two ways:
      • the encoder can easily learn left-to-right order
      • the RNN can use the surrounding horizontal context to refine the hidden representation
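
A toy version of the row-encoder idea, as a sketch only: a plain tanh RNN with scalar weights stands in for whatever recurrent cell the paper uses, and the zero initial state, the weights `w_h` and `w_x`, and the function names are all assumptions made for illustration.

```python
import math


def rnn_step(h_prev, x, w_h, w_x):
    """One elementwise tanh RNN step (toy stand-in for an LSTM cell)."""
    return [math.tanh(w_h * h + w_x * v) for h, v in zip(h_prev, x)]


def encode_rows(grid, w_h=0.5, w_x=1.0):
    """Run the RNN left to right over every row of a feature grid.

    grid[r][c] is the feature vector (list of floats) at row r, column c.
    The result has the same shape; cell (r, c) depends on columns 0..c of row r.
    """
    encoded = []
    for row in grid:
        h = [0.0] * len(row[0])  # fresh hidden state at the start of each row (assumed)
        out_row = []
        for x in row:
            h = rnn_step(h, x, w_h, w_x)
            out_row.append(h)
        encoded.append(out_row)
    return encoded


# 2 rows x 2 columns, 1 feature per cell.
grid = [[[1.0], [0.0]],
        [[0.0], [0.0]]]
encoded = encode_rows(grid)
print(encoded[0][1])  # non-zero: the cell has seen the feature to its left
```

The grid shape is unchanged, so attention can still pick a location, but each location now carries horizontal context from its own row.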

Decoder

  • uses an attention mechanism (Bahdanau attention)
  • uses beam search at test time
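
Both decoder ingredients can be sketched in plain Python. `additive_attention` follows the usual Bahdanau-style form, score_i = v · tanh(W_q q + W_k k_i), followed by a softmax; `beam_search` is a generic version driven by a hypothetical `step_fn(prefix)` that returns log-probabilities for the next token. The names, shapes and the toy distribution below are assumptions for illustration, not the paper's implementation.

```python
import math


def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def additive_attention(query, keys, w_q, w_k, v):
    """Return (weights, context) for scores e_i = v . tanh(W_q query + W_k key_i)."""
    q_proj = matvec(w_q, query)
    scores = []
    for key in keys:
        k_proj = matvec(w_k, key)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return weights, context


def beam_search(step_fn, start, end, beam_size=3, max_len=20):
    """step_fn(prefix) -> {token: log_prob}; returns the best-scoring token list."""
    beams = [(0.0, [start])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, logp in step_fn(seq).items():
                candidates.append((score + logp, seq + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            (finished if seq[-1] == end else beams).append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was reached
    return max(finished, key=lambda c: c[0])[1]


# Toy next-token distribution (hypothetical) where the greedy first choice is a trap.
def toy_step(prefix):
    table = {
        "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
        "a": {"x": math.log(0.5), "y": math.log(0.5)},
        "b": {"z": math.log(0.9), "</s>": math.log(0.1)},
    }
    return table.get(prefix[-1], {"</s>": 0.0})


print(beam_search(toy_step, "<s>", "</s>", beam_size=2))
```

With a beam of 1 the search commits to "a" because it looks best after one step; a wider beam keeps "b" alive long enough to find the more probable full sequence.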

5. Dataset

Tokenization

  • character-based models did not perform well
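
One plausible reason is sequence length: splitting on characters breaks a command such as \alpha into six symbols. The sketch below compares a character split with a simple regex tokenizer that keeps LaTeX commands whole. The regex is an assumption made for these notes, not the tokenizer used in the paper.

```python
import re

# A command (\frac), an escaped symbol (\{), or any single non-space character.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|\S")


def tokenize_latex(markup):
    """Split LaTeX markup into command-level tokens."""
    return TOKEN_RE.findall(markup)


def tokenize_chars(markup):
    """Split LaTeX markup into single characters, ignoring whitespace."""
    return [ch for ch in markup if not ch.isspace()]


formula = r"\frac{a}{b} + \alpha_1"
print(tokenize_latex(formula))
print(len(tokenize_latex(formula)), len(tokenize_chars(formula)))
```

The decoder has to emit fewer symbols per formula with command-level tokens, at the cost of a larger vocabulary.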

Optional: Normalization

  • a modified KaTeX is used to produce normalized data
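
Normalization matters because different markup can render to the same image, for example `x^2_i` and `x_i^2`. The sketch below shows one illustrative rule on a token list: always put the subscript before the superscript. The paper's actual rule set lives in its modified KaTeX and is not reproduced in these notes; the function names and the rule itself are assumptions.

```python
def read_arg(tokens, i):
    """Return the index just past the argument starting at tokens[i].

    An argument is either a single token or a balanced {...} group.
    """
    if tokens[i] != "{":
        return i + 1
    depth = 0
    for j in range(i, len(tokens)):
        if tokens[j] == "{":
            depth += 1
        elif tokens[j] == "}":
            depth -= 1
            if depth == 0:
                return j + 1
    raise ValueError("unbalanced braces")


def normalize_scripts(tokens):
    """Rewrite every `^ A _ B` as `_ B ^ A` so both orders share one form.

    Limitation: scripts nested inside a swapped argument are copied unchanged.
    """
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == "^" and i + 1 < len(tokens):
            sup_end = read_arg(tokens, i + 1)
            has_sub = sup_end + 1 < len(tokens) and tokens[sup_end] == "_"
            if has_sub:
                sub_end = read_arg(tokens, sup_end + 1)
                out += tokens[sup_end:sub_end] + tokens[i:sup_end]
                i = sub_end
                continue
        out.append(tokens[i])
        i += 1
    return out


print(normalize_scripts(["x", "^", "2", "_", "{", "i", "j", "}"]))
```

After this step both spellings map to the same target sequence, so the model is not penalized for choosing one valid ordering over the other.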

My Notes

  • each GitHub implementation of the paper uses a different loss function