Conversation
@nelson-liu also interested in your feedback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic!
thanks, this looks great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
tutorials/misc/migrating-to-0.4.0.md
Outdated
In addition, the base `DatasetReader` constructor now takes a `lazy: bool` parameter, | ||
which means that your subclass constructor should also take that parameter | ||
(unless you don't want to allow laziness, but why wouldn't you?) | ||
and explicitly pass it the superclass constructor: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pass it to"?
tutorials/misc/migrating-to-0.4.0.md
Outdated
# whatever other initialization you need | ||
``` | ||
|
||
(For the reasoning behind this change, see the [Laziness tutorial](https://github.com/allenai/allennlp/blob/master/tutorials/getting_started/laziness.md).) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove the parentheses here.
tutorials/misc/migrating-to-0.4.0.md
Outdated
|
||
# CHANGES YOU ARE MUCH LESS LIKELY TO RUN INTO | ||
|
||
If you only ever use our the command line tools (`python -m allennlp.run ...`) to train / evaluate models, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could say something in here about writing model / dataset reader code ("use our command line tools" doesn't make it obvious that you also shouldn't have to worry about anything else if you only wrote Models
or DatasetReaders
). Also, you have "use our the command line tools".
tutorials/misc/migrating-to-0.4.0.md
Outdated
|
||
In 0.4.0, `DatasetReader.read()` returns an `Iterable[Instance]`, | ||
which could be a list of instances or could produce them lazily. | ||
This means that the indexing and tensorization needs to happen elsewhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A sentence saying where the logic moved would be nice.
tutorials/misc/migrating-to-0.4.0.md
Outdated
|
||
As `Dataset` no longer exists, we replaced `Vocabulary.from_dataset()` | ||
with `Vocabulary.from_instances()`, which accepts an `Iterable[Instance]`. | ||
In particulary, you'd most likely call this with the results of one or more calls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"particulary"
|
||
To handle tensorization, | ||
0.4.0 introduces the notion of a `Batch`, | ||
which is basically just a list of `Instance`s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above you have Tensor
s (with a space) and here you have it with no space. Not sure which is right.
|
||
Furthermore, each `Instance` now knows whether it's been indexed, | ||
so in the eager case (when all instances stay in memory), | ||
the indexing only happens in the first iteration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like "This means that your first epoch will bit a little bit slower than all other epochs" would be good, I think.
* migration guide * tweaks * address PR feedback
my plan is to link to this from the release notes.
let me know if I missed anything or explained anything poorly.