It would be nice to have two new wrappers: one that sorts the examples in a batch according to a given key function, and one that unpacks batches into a stream of individual examples (like itertools.chain does). Used one after another, these two wrappers would make the data stream more uniform, which can yield a significant speedup in some cases, e.g. when the examples are sequences and it is desirable to have sentences of similar length in a batch.
Altogether, it should look like this:
data_stream = DataStream(dataset, iteration_scheme=ShuffledScheme(2000))
data_stream = Sort(data_stream, key=_get_input_length)
# this one has long segments of sorted examples: uniform batches can be formed
data_stream = Unpack(data_stream)
# a data stream with uniform batches!
data_stream = BatchDataStream(data_stream, iteration_scheme=Constant(10))
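To make the idea concrete independently of any library, here is a minimal sketch in plain Python. The helper names `sort_batches`, `unpack`, and `rebatch` are hypothetical and are not part of Blocks or Fuel; they only mimic what the proposed `Sort` and `Unpack` wrappers, followed by re-batching, would do on ordinary iterables.

```python
import random
from itertools import chain, islice


def sort_batches(batches, key):
    """Yield each batch with its examples sorted by `key` (hypothetical Sort)."""
    for batch in batches:
        yield sorted(batch, key=key)


def unpack(batches):
    """Flatten a stream of batches into a stream of examples (hypothetical Unpack)."""
    return chain.from_iterable(batches)


def rebatch(examples, batch_size):
    """Group a stream of examples into consecutive batches of `batch_size`."""
    iterator = iter(examples)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Toy data: 40 "sentences" with random lengths between 1 and 30 tokens.
rng = random.Random(0)
sentences = [[0] * rng.randint(1, 30) for _ in range(40)]

big_batches = rebatch(sentences, 20)                 # large batches of 20
sorted_batches = sort_batches(big_batches, key=len)  # sort within each large batch
examples = unpack(sorted_batches)                    # long sorted runs of examples
uniform_batches = list(rebatch(examples, 5))         # small batches of similar length
```

Because each large batch is sorted before it is unpacked, every small batch is cut from a sorted run and therefore contains sentences of similar length, so less padding is needed. Shuffling still happens at the level of the large batches.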
Maybe this issue should be moved to Fuel.