Skip to content
This repository has been archived by the owner on Jan 19, 2019. It is now read-only.

Tests and fixes to new data api #407

Open
wants to merge 20 commits into
base: new_data_api
Choose a base branch
from

Conversation

DeNeutoy
Copy link
Contributor

No description provided.

@staticmethod
def from_params(params: Params):
choice = params.pop_choice('type', list(token_indexers.keys()), default_to_first_choice=True)
return token_indexers[choice].from_params(params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you putting this code? This was intentional, actually, and makes constructing objects from params a lot easier. If it's not here, it has to be duplicated every time you want to construct an object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't have this here because it creates circular imports - token_indexers are imported from __init__ but haven't been defined yet. We also can't import the classes directly because they inherit from this class.

Copy link
Contributor Author

@DeNeutoy DeNeutoy Jun 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea to fix this would be to have parameter checking only at the Model level. This would remove the need for the from_params on everything and we could just log parameters inside classes. This would have the benefit that the log would show where the parameter was defined, which it currently doesn't because all the logging happens in params. The replicability is nice, but we now achieve that with Dockerised experiments, so maybe it is less useful. I don't know about you but I have never really used the logged parameters past a "those look right" sort of check anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just move the import inside the method to remove the circular import. We really do want everything to be constructed from_params, so that we can have a single configuration file that runs a whole experiment (i.e., it's not just parameter checking that this gets us). Having this here makes it so that things like this work. Otherwise it's more complicated to do the construction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will do

from ..vocabulary import Vocabulary
from .token_indexer import TokenIndexer, TokenType
from deep_qa.common.util import pad_sequence_to_length
from deep_qa.common import Params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be relative, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@DeNeutoy
Copy link
Contributor Author

DeNeutoy commented Jun 22, 2017

In summary this PR does the following few things:

  • Moves sort_by_padding from the Dataset to the DataGenerator
  • Adds tests for : Vocabulary, Dataset, all concrete Fields, Embeddings, TokenIndexers, DatasetReaders(The reader ones could be more comprehensive, probably).
  • Roughly fixes the DataGenerator to work with the new Dataset class; I haven't fixed the tests for this, because MattG wrote them and they rely on some FakeInstances, which I'm not exactly sure how to translate to the new api.
  • Adds getter methods to the Fields, Instance and Dataset classes, called things like fields() for the Instance class, and tokens() for the TextField etc. I think it's important to be able to interact with the different classes so that writing functions for mapping predictions back to text etc is easy.

@matt-peters
Copy link

This looks great to me, from my somewhat quick look 👍 . I didn't see any big API changes in this, just testing and cleanup, right? It's nice to have usage examples in the test suite, thank you for adding them!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants