
Sequential output #5

Open
gchrupala opened this issue Feb 13, 2015 · 7 comments

@gchrupala
Contributor

It would be helpful to have an example of a network configuration where a label is predicted for each element in the sequence. This is a common scenario in NLP (e.g. named entity recognition).

@Newmu
Contributor

Newmu commented Feb 13, 2015

Agreed, an example would be nice. Setting the seq_output argument of any of the recurrent layers to True should work, but I've only tested this for language modeling with a softmax output.

Do you have a suggestion for a good sequence prediction dataset for NER or POS or something else that the example can be trained with?

@gchrupala
Contributor Author

Actually, language modeling would be one good example, as training data is practically unlimited.

For NER, the most commonly used English dataset is the CoNLL 2003 data (http://www.cnts.ua.ac.be/conll2003/ner/). The annotations are publicly available; the corresponding text is available free of charge from NIST.

For Spanish and Dutch, there are publicly available NER data from CoNLL 2002: http://www.cnts.ua.ac.be/conll2002/ner/
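
For reference, the CoNLL NER files in both shared tasks use the same plain-text format: one token per line with whitespace-separated tag columns (the NER tag last), and a blank line between sentences; the 2003 English files also contain -DOCSTART- markers. A minimal reader (a hypothetical helper, not part of Passage) could look like this:

def read_conll(path):
    # yields one sentence at a time as a list of (token, ner_tag) pairs
    sentence = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('-DOCSTART-'):
                if sentence:  # blank line or document marker ends a sentence
                    yield sentence
                    sentence = []
            else:
                cols = line.split()
                sentence.append((cols[0], cols[-1]))  # token and its NER tag
    if sentence:  # handle a file that doesn't end in a blank line
        yield sentence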

@gchrupala
Contributor Author

> Setting the seq_output argument of any of the recurrent layers to True should work, but I've only tested this for language modeling with a softmax output.

I've tried to make this work using seq_output=True but I must be missing something:

# attempting next-character prediction on a toy string
from passage.preprocessing import Tokenizer
from passage.layers import OneHot, SimpleRecurrent, Dense
from passage.models import RNN

tokenizer = Tokenizer(min_df=1, character=True)
data = tokenizer.fit_transform(["Lorem ipsum."])
X = [data[0][:-1]]  # all characters but the last
Y = [data[0][1:]]   # the same sequence shifted by one
layers = [
    OneHot(n_features=tokenizer.n_features),
    SimpleRecurrent(seq_output=True),
    Dense(size=tokenizer.n_features, activation='softmax')
]
model = RNN(layers=layers, cost='BinaryCrossEntropy')
model.fit(X, Y)

When compiling this I get a dimension mismatch: ValueError: Input dimension mis-match. (input[0].shape[2] = 11, input[4].shape[2] = 14). I've been reading the code in models.py and layers.py and can't quite see what's going wrong here.
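
A guess at the cause, not verified against the library internals: with seq_output=True the softmax emits a distribution over tokenizer.n_features at every timestep, while Y above is a list of raw integer ids, so the cost ends up comparing tensors of different widths. One way to check whether the shapes are the problem is to one-hot encode the targets to the same width as the output (a diagnostic sketch only; the one_hot helper is mine, and whether RNN accepts targets in this shape is untested):

import numpy as np

def one_hot(ids, n_features):
    # (seq_len,) integer ids -> (seq_len, n_features) 0/1 matrix
    out = np.zeros((len(ids), n_features), dtype='float32')
    out[np.arange(len(ids)), ids] = 1.0
    return out

Y = [one_hot(data[0][1:], tokenizer.n_features)]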

@simonhughes22

It looks like they have sequence labelling there as an option with seq_output=True. Can someone provide a working example, using some dummy data or the provided data, of how to make that work?

@Newmu
Contributor

Newmu commented Apr 3, 2015

Update on this: clean output-sequence support starts to get into a rabbit hole of refactoring and/or interface ugliness that's still being figured out. Alpha support is working on the sequence_output branch but isn't clean yet.

Still chewing on this one to figure out the best way forward without compromising ease of use or overly complicating the codebase/interface. I have a feeling we're going to start making specific classes like LanguageModel to take care of some of the details.

Here's an example for language modeling using a softmax output, training on fixed-length context sequences from a collection of documents:

from passage.preprocessing import Tokenizer
from passage.layers import Embedding, GatedRecurrent, Dense
from passage.models import RNN
from passage.theano_utils import intX
from passage.iterators import SortedPadded
import theano.tensor as T

trX = load_list_of_text_documents()  # placeholder: load your own corpus as a list of strings

tokenizer = Tokenizer(min_df=10, character=False, max_features=10000)
trX = tokenizer.fit_transform(trX)

# next-token targets: shift each sequence by one, truncate to 100 tokens
trY = [x[1:][:100] for x in trX]
trX = [x[:-1][:100] for x in trX]

layers = [
    Embedding(size=512, n_features=tokenizer.n_features),
    GatedRecurrent(size=512, seq_output=True),  # emit a state per timestep
    Dense(size=tokenizer.n_features, activation='softmax')
]

iterator = SortedPadded(y_pad=True, y_dtype=intX)

model = RNN(layers=layers, cost='seq_cce', iterator=iterator, Y=T.imatrix())
model.fit(trX, trY, n_epochs=1)

Let me know if you have any suggestions on the API/changes.
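
For the tagging case this issue asks about, the same recipe should carry over: the targets become one integer label per token instead of the shifted tokens. A hedged sketch, assuming labels are already integer-encoded, with load_labeled_sentences and n_labels as placeholders:

# hypothetical adaptation to per-token labeling (NER/POS tagging); untested
trX, trY = load_labeled_sentences()             # parallel lists: sentences, per-token label ids
trX = tokenizer.fit_transform(trX)              # map tokens to ids as above
n_labels = 9                                    # e.g. the CoNLL 2003 NER tag set size

layers = [
    Embedding(size=512, n_features=tokenizer.n_features),
    GatedRecurrent(size=512, seq_output=True),  # one state per token
    Dense(size=n_labels, activation='softmax')  # tag distribution per timestep
]

model = RNN(layers=layers, cost='seq_cce',
            iterator=SortedPadded(y_pad=True, y_dtype=intX), Y=T.imatrix())
model.fit(trX, trY, n_epochs=1)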

@gchrupala
Contributor Author

Thanks! I like the idea of having separate classes, e.g. LanguageModel. Keeping all these independent optional arguments like seq_output=True, y_pad=True, cost='seq_cce', and Y=T.imatrix() coordinated is going to be a headache.
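
For illustration, here is roughly what such a class could look like; LanguageModel below is hypothetical, not part of Passage, and just pins the options that have to agree:

import theano.tensor as T
from passage.models import RNN
from passage.iterators import SortedPadded
from passage.theano_utils import intX

class LanguageModel(RNN):
    # hypothetical convenience wrapper: fixes the coordinated options
    # (padded integer targets, sequence cross-entropy, matrix-typed Y)
    def __init__(self, layers, **kwargs):
        super(LanguageModel, self).__init__(
            layers=layers,
            cost='seq_cce',
            iterator=SortedPadded(y_pad=True, y_dtype=intX),
            Y=T.imatrix(),
            **kwargs)

The recurrent layer inside layers would still need seq_output=True; folding that in as well is where the interface questions start.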

@zxcvbn97

Hi there,

I'm building an RNN to assign a label to each element in the sequence (along the lines of this blog post) for activity recognition based on location.

[image: diagram of the intended network]

Assume each input location is a 4x1 vector, so a sequence of length n has shape 4xn.
Each output activity is a 3x1 vector, and each location has one output activity.

How would I set up the layers in the RNN? Is my input the Embedding layer and the output the Dense layer?

Thanks!
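
In case it's useful, the per-timestep stack from the example above is probably the right shape here: a recurrent layer with seq_output=True followed by a Dense softmax of size 3, giving one activity distribution per location. One caveat: Passage's examples feed discrete token ids through Embedding or OneHot, so real-valued 4-dimensional inputs may need extra plumbing. A sketch under that assumption:

# hypothetical stack for 3-way per-timestep activity labels; untested
layers = [
    GatedRecurrent(size=128, seq_output=True),  # one hidden state per location
    Dense(size=3, activation='softmax')         # activity distribution per timestep
]
model = RNN(layers=layers, cost='seq_cce',
            iterator=SortedPadded(y_pad=True, y_dtype=intX), Y=T.imatrix())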
