
Sequential output #5

Open
gchrupala opened this issue Feb 13, 2015 · 7 comments

@gchrupala
Contributor

It would be helpful to have an example of a network configuration where a label is predicted for each element in the sequence. This is a common scenario in NLP (e.g. named entity recognition).

@Newmu
Contributor

Newmu commented Feb 13, 2015

Agreed, an example would be nice. Setting the seq_output argument of any of the recurrent layers to True should work, but I've only tested this for language modeling with a softmax output.

Do you have a suggestion for a good sequence prediction dataset for NER or POS or something else that the example can be trained with?

@gchrupala
Contributor Author

Actually, language modeling would be one good example, as training data is practically unlimited.

For NER, the most commonly used English dataset is the CoNLL 2003 data (http://www.cnts.ua.ac.be/conll2003/ner/). The annotations are publicly available; the corresponding text is available free of charge from NIST.

For Spanish and Dutch, there are publicly available NER data from CoNLL 2002: http://www.cnts.ua.ac.be/conll2002/ner/
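
For reference, the CoNLL NER files in both shared tasks use the same plain-text format: one token per line with whitespace-separated tag columns (the NER tag last), and a blank line between sentences; the 2003 English files also contain -DOCSTART- markers. A minimal reader (a hypothetical helper, not part of Passage) could look like this:

def read_conll(path):
    # yields one sentence at a time as a list of (token, ner_tag) pairs
    sentence = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('-DOCSTART-'):
                if sentence:  # blank line or document marker ends a sentence
                    yield sentence
                    sentence = []
            else:
                cols = line.split()
                sentence.append((cols[0], cols[-1]))  # token and its NER tag
    if sentence:  # handle a file that doesn't end in a blank line
        yield sentence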

@gchrupala
Contributor Author

> Setting the seq_output argument of any of the recurrent layers to True should work, but I've only tested this for language modeling with a softmax output.

I've tried to make this work using seq_output=True but I must be missing something:

# attempting next-character prediction on a toy string
from passage.preprocessing import Tokenizer
from passage.layers import OneHot, SimpleRecurrent, Dense
from passage.models import RNN

tokenizer = Tokenizer(min_df=1, character=True)
data = tokenizer.fit_transform(["Lorem ipsum."])
X = [data[0][:-1]]  # all characters but the last
Y = [data[0][1:]]   # the same sequence shifted by one
layers = [
    OneHot(n_features=tokenizer.n_features),
    SimpleRecurrent(seq_output=True),
    Dense(size=tokenizer.n_features, activation='softmax')
]
model = RNN(layers=layers, cost='BinaryCrossEntropy')
model.fit(X, Y)

When compiling this I get a dimension mismatch: ValueError: Input dimension mis-match. (input[0].shape[2] = 11, input[4].shape[2] = 14). I've been reading the code in models.py and layers.py and can't quite see what's going wrong here.
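
A guess at the cause, not verified against the library internals: with seq_output=True the softmax emits a distribution over tokenizer.n_features at every timestep, while Y above is a list of raw integer ids, so the cost ends up comparing tensors of different widths. One way to check whether the shapes are the problem is to one-hot encode the targets to the same width as the output (a diagnostic sketch only; the one_hot helper is mine, and whether RNN accepts targets in this shape is untested):

import numpy as np

def one_hot(ids, n_features):
    # (seq_len,) integer ids -> (seq_len, n_features) 0/1 matrix
    out = np.zeros((len(ids), n_features), dtype='float32')
    out[np.arange(len(ids)), ids] = 1.0
    return out

Y = [one_hot(data[0][1:], tokenizer.n_features)]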

@simonhughes22

It looks like they have sequence labelling there as an option with seq_output=True. Can someone provide a working example, using some dummy data or the provided data, of how to make that work?

@Newmu
Contributor

Newmu commented Apr 3, 2015

Update on this: clean output-sequence support starts to get into a rabbit hole of refactoring and/or interface ugliness that's still being figured out. Alpha support is working on the sequence_output branch but isn't clean yet.

Still chewing on this one to figure out the best way forward without compromising ease of use or overly complicating the codebase/interface. I have a feeling we're going to start making specific classes like LanguageModel to take care of some of the details.

Here's an example for language modeling using a softmax output, training on fixed-length context sequences from a collection of documents:

from passage.preprocessing import Tokenizer
from passage.layers import Embedding, GatedRecurrent, Dense
from passage.models import RNN
from passage.theano_utils import intX
from passage.iterators import SortedPadded
import theano.tensor as T

trX = load_list_of_text_documents()  # placeholder: load your own corpus as a list of strings

tokenizer = Tokenizer(min_df=10, character=False, max_features=10000)
trX = tokenizer.fit_transform(trX)

# next-token targets: shift each sequence by one, truncate to 100 tokens
trY = [x[1:][:100] for x in trX]
trX = [x[:-1][:100] for x in trX]

layers = [
    Embedding(size=512, n_features=tokenizer.n_features),
    GatedRecurrent(size=512, seq_output=True),  # emit a state per timestep
    Dense(size=tokenizer.n_features, activation='softmax')
]

iterator = SortedPadded(y_pad=True, y_dtype=intX)

model = RNN(layers=layers, cost='seq_cce', iterator=iterator, Y=T.imatrix())
model.fit(trX, trY, n_epochs=1)

Let me know if you have any suggestions on the API/changes.
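
For the tagging case this issue asks about, the same recipe should carry over: the targets become one integer label per token instead of the shifted tokens. A hedged sketch, assuming labels are already integer-encoded, with load_labeled_sentences and n_labels as placeholders:

# hypothetical adaptation to per-token labeling (NER/POS tagging); untested
trX, trY = load_labeled_sentences()             # parallel lists: sentences, per-token label ids
trX = tokenizer.fit_transform(trX)              # map tokens to ids as above
n_labels = 9                                    # e.g. the CoNLL 2003 NER tag set size

layers = [
    Embedding(size=512, n_features=tokenizer.n_features),
    GatedRecurrent(size=512, seq_output=True),  # one state per token
    Dense(size=n_labels, activation='softmax')  # tag distribution per timestep
]

model = RNN(layers=layers, cost='seq_cce',
            iterator=SortedPadded(y_pad=True, y_dtype=intX), Y=T.imatrix())
model.fit(trX, trY, n_epochs=1)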

@gchrupala
Contributor Author

Thanks! I like the idea of having separate classes, e.g. LanguageModel. Keeping all these independent optional arguments like seq_output=True, y_pad=True, cost='seq_cce', and Y=T.imatrix() coordinated is going to be a headache.
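
For illustration, here is roughly what such a class could look like; LanguageModel below is hypothetical, not part of Passage, and just pins the options that have to agree:

import theano.tensor as T
from passage.models import RNN
from passage.iterators import SortedPadded
from passage.theano_utils import intX

class LanguageModel(RNN):
    # hypothetical convenience wrapper: fixes the coordinated options
    # (padded integer targets, sequence cross-entropy, matrix-typed Y)
    def __init__(self, layers, **kwargs):
        super(LanguageModel, self).__init__(
            layers=layers,
            cost='seq_cce',
            iterator=SortedPadded(y_pad=True, y_dtype=intX),
            Y=T.imatrix(),
            **kwargs)

The recurrent layer inside layers would still need seq_output=True; folding that in as well is where the interface questions start.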

@zxcvbn97

Hi there,

I'm building an RNN to assign a label to each element in the sequence (along the lines of this blog post) for activity recognition based on location.

[image: diagram of the intended network]

Assume each input location is a 4x1 vector, so a sequence of length n has shape 4xn.
Each output activity is a 3x1 vector, and each location has one output activity.

How would I set up the layers in the RNN? Is my input the Embedding layer and the output the Dense layer?

Thanks!
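
In case it's useful, the per-timestep stack from the example above is probably the right shape here: a recurrent layer with seq_output=True followed by a Dense softmax of size 3, giving one activity distribution per location. One caveat: Passage's examples feed discrete token ids through Embedding or OneHot, so real-valued 4-dimensional inputs may need extra plumbing. A sketch under that assumption:

# hypothetical stack for 3-way per-timestep activity labels; untested
layers = [
    GatedRecurrent(size=128, seq_output=True),  # one hidden state per location
    Dense(size=3, activation='softmax')         # activity distribution per timestep
]
model = RNN(layers=layers, cost='seq_cce',
            iterator=SortedPadded(y_pad=True, y_dtype=intX), Y=T.imatrix())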
