meta_data_vocab comprises sentences, not tokens #16

Open
Aatlantise opened this issue Jan 21, 2022 · 2 comments

@Aatlantise

Hello,

It looks like the meta_data_vocab used as an argument for model declaration is not in a format familiar to me: the vocabulary seems to be composed of sentences rather than tokens.

I tried omitting meta_data_vocab as an input, since it seems to be an optional argument, but that also fails due to a snippet of code that invokes meta_data_vocab.itos.

>>> META_DATA.vocab.itos
['<unk>', 'A trial run run on this initialization sentence initializes the OpenIE6 open information extractor .']
>>> meta_data_vocab.itos
['<unk>', 'A trial run run on this initialization sentence initializes the OpenIE6 open information extractor .']

Is meta_data_vocab meant to look like this? I was trying to declare a model that could be used for predicting any given input text, but meta_data_vocab seems to prevent this, tying each model instance to one specific predict_fp.

Many thanks!

@SaiKeshav
Collaborator

Hi, thank you for your interest in our work. Yes, the meta_data_vocab you have is correct: it contains the actual sentences themselves, so that when we print the final predictions of the system, we can print the corresponding sentence along with them.
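
For context, a sentence-level itos like this arises naturally when the meta-data field is built without tokenization. A minimal sketch, assuming a legacy torchtext Field with sequential=False (an assumption about the setup, not OpenIE6's actual data code):

```python
# A minimal sketch, assuming the legacy torchtext Field API (the import path
# is torchtext.legacy.data on torchtext >= 0.9), of how a non-sequential
# field yields sentence-level vocab entries: with sequential=False the string
# is never tokenized, so each full sentence becomes a single itos entry.
from torchtext import data

META_DATA = data.Field(sequential=False, use_vocab=True)
fields = [("meta_data", META_DATA)]
sent = ("A trial run run on this initialization sentence initializes "
        "the OpenIE6 open information extractor .")
examples = [data.Example.fromlist([sent], fields)]
META_DATA.build_vocab(data.Dataset(examples, fields))
print(META_DATA.vocab.itos)  # ['<unk>', 'A trial run run on ... extractor .']
```

Because the whole string is treated as one token, the vocab doubles as an index-to-sentence lookup table, which is exactly what the printing step needs.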

To achieve what you want, one simple solution is to pass the meta_data_vocab to the forward function instead of at initialization time. Does that make sense?
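
A minimal sketch of this idea, with hypothetical class and argument names (Extractor, meta_indices) rather than OpenIE6's actual signatures:

```python
# Accept meta_data_vocab in forward() instead of in __init__, so one model
# instance can serve any predict file. All names here are illustrative.
import torch
import torch.nn as nn

class Extractor(nn.Module):
    def __init__(self, hidden_dim=8):
        super().__init__()
        # note: no meta_data_vocab stored at construction time
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, embeddings, meta_indices, meta_data_vocab):
        scores = self.proj(embeddings)
        # recover the original sentences so predictions can be printed
        # alongside the text they came from
        sentences = [meta_data_vocab.itos[i] for i in meta_indices]
        return scores, sentences

class StubVocab:  # stand-in for the torchtext vocab
    itos = ['<unk>', 'A trial run run on this initialization sentence ...']

model = Extractor()
scores, sents = model(torch.randn(1, 8), [1], StubVocab())
print(sents)  # ['A trial run run on this initialization sentence ...']
```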

@Aatlantise
Author

Many thanks for your input! I was able to do what I wanted by declaring the model without meta_data_vocab and then setting model._meta_data_vocab at a later time, before each inference. A natural follow-up question: would I be able to run inference on each input text without having to declare a new trainer object every time? The software allows me to do so, but it seems to reuse some indexing from previous dataloader objects.
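
A sketch of this workaround; the model and vocab builder below are stand-ins for OpenIE6's real objects, but _meta_data_vocab is the attribute name used above. The model is created once, and the vocab for the current input is attached right before each inference:

```python
# Declare the model once without meta_data_vocab, then swap in a fresh vocab
# per input. SimpleNamespace objects stand in for the real model and vocab.
from types import SimpleNamespace

def build_meta_vocab(sentences):
    # stand-in for the torchtext vocab built over the current predict file
    return SimpleNamespace(itos=['<unk>'] + list(sentences))

model = SimpleNamespace()  # declared once, without any meta_data_vocab
for predict_fp_sentences in (["Sentence one ."], ["Sentence two ."]):
    model._meta_data_vocab = build_meta_vocab(predict_fp_sentences)
    print(model._meta_data_vocab.itos)  # vocab now matches this input only
```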
