This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

WTQ dataset reader and models moved from iterative-search-semparse #2764

Merged

pdasigi merged 24 commits into allenai:master from wtq_variable_free_model on Jun 20, 2019

Conversation

pdasigi
Member

@pdasigi pdasigi commented Apr 26, 2019

Fixes #2691

Most of the tests are passing locally. I'll merge this after they all pass and after I evaluate the trained models with the new code and get the same numbers. Also, I suppose there are a bunch of things that can now be deleted (type declarations, SEMPRE-related stuff...). I'll add those deletions to this PR as well.

@pdasigi pdasigi changed the title Dataset reader and models moved from iterative-search-semparse WTQ dataset reader and models moved from iterative-search-semparse Apr 26, 2019
@pdasigi
Member Author

pdasigi commented Apr 30, 2019

@matt-gardner This PR is finally ready to be reviewed!

Contributor

@matt-gardner matt-gardner left a comment


This is awesome @pdasigi! It looks like it was a whole lot of work, thanks for doing this. I had a few questions, but it looks very good overall.

@@ -32,28 +32,19 @@
class WikiTablesDatasetReader(DatasetReader):
"""
This ``DatasetReader`` takes WikiTableQuestions ``*.examples`` files and converts them into
``Instances`` suitable for use with the ``WikiTablesSemanticParser``. This reader also accepts
pre-processed JSONL files produced by ``scripts/preprocess_wikitables.py``. Processing the
Contributor

We put this caching into the base DatasetReader recently, which is (I assume) why you just removed this code; maybe also because the corenlp preprocessing is already done, and you're reading those by default? Have you tried using the caching functionality with this at all, or is it already fast enough that it doesn't matter?
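The caching being discussed can be sketched outside AllenNLP with a generic pickle-based wrapper. The function and names below are illustrative, not the base DatasetReader's actual API; they just show the idea of paying the slow read once and loading serialized instances afterwards:

```python
import os
import pickle
from typing import Callable, Iterable, List


def read_with_cache(read_fn: Callable[[str], Iterable[dict]],
                    file_path: str,
                    cache_path: str) -> List[dict]:
    """Read instances via read_fn, caching the result on disk.

    On the first call the (possibly slow) reader runs and its output is
    pickled; later calls just load the pickle and skip preprocessing.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as cache_file:
            return pickle.load(cache_file)
    instances = list(read_fn(file_path))
    with open(cache_path, "wb") as cache_file:
        pickle.dump(instances, cache_file)
    return instances
```

Whether this pays off depends on how expensive the reader is; for pre-processed JSONL input it may already be fast enough not to matter.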

Member Author

I was not aware of the caching functionality. That sounds great! I removed this because I thought reading in the preprocessed files was not too slow.

target_values_field = MetadataField(target_values)
world = WikiTablesLanguage(table_context)
world_field = MetadataField(world)
# Note: Not passing any feature extractors when instantiating the field below. This will make
Contributor

You don't want this to be configurable anymore?

# pylint: enable=protected-access
instance = self._dataset_reader.text_to_instance(question_text, # type: ignore
table_rows,
tokenized_question=tokenized_question)
table_rows)
Contributor

Does this need additional processing? Does this have to be formatted as a corenlp-tagged table?

Member Author

It is just the raw table content. I added a comment here and updated the docstring of text_to_instance to make that clear.

logical_form = world.get_logical_form(action_strings)
lisp_string = state.extras[batch_index]
if self._executor.evaluate_logical_form(logical_form, lisp_string):
logical_form = world.action_sequence_to_logical_form(action_strings)
Contributor

We can execute action sequences directly, which is definitely recommended here, so you don't have to go back and forth between strings. It'll make things at least a little faster.
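The round trip being avoided here can be illustrated with a toy language in which an action sequence is a postfix program (hypothetical, not the WikiTables grammar): executing the sequence directly on a stack skips rendering a logical-form string and re-parsing it.

```python
from typing import Callable, Dict, List

# Toy operator inventory; real grammars have typed productions instead.
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def execute_action_sequence(actions: List[str]) -> int:
    """Evaluate a postfix action sequence directly on a stack,
    without first building a lisp-style logical form string."""
    stack: List[int] = []
    for action in actions:
        if action in OPS:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[action](left, right))
        else:
            stack.append(int(action))
    if len(stack) != 1:
        # Mirrors an unfinished action sequence from beam search.
        raise ValueError("unfinished action sequence")
    return stack[0]
```

The direct path does the same work per action as string construction would, but only once, and with no parser in the loop.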

Member Author

Done. Thanks! Added a method WikiTablesLanguage.evaluate_action_sequence.

action_strings = [action_mapping[(i, action_index)] for action_index in action_indices]
has_logical_form = False
try:
logical_form = world[i].action_sequence_to_logical_form(action_strings)
Contributor

Here, too, just execute the action sequence directly, instead of converting to a logical form first. If we need to add a world.evaluate_action_sequence(actions, targets), that's fine.

It'll simplify the logic below, also, as you don't need to worry about a ParsingError.

Member Author

But we're also measuring whether the parser is producing valid logical forms (has_logical_form), for which we need to make the logical forms. The value of that metric is usually 1.0, but it's nice to have that additional check.

Contributor

That should already be guaranteed by construction - we only allow actions that produce valid logical forms. I don't think it's worth slowing down training to test something that should have been tested in a unit test.

Contributor

We'll also get ExecutionErrors if something was up with the action sequence, which is another place you could catch this. This would catch two kinds of issues, which are the only ones that we should run into, anyway: (1) cases where we were loose in our type system, but have stricter execution requirements, like around lists vs. single items; and (2) cases where the action sequence didn't terminate, but we're trying to evaluate it anyway (keeping unfinished sequences at the end of beam search).
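The handling suggested above, where both loose-typing and unfinished-sequence failures surface as execution errors, can be sketched as follows. ExecutionError and evaluate_sequence are illustrative stand-ins for the library's actual error type and metric code:

```python
from typing import Callable, List


class ExecutionError(Exception):
    """Raised when an action sequence cannot be executed."""


def evaluate_sequence(execute: Callable[[List[str]], object],
                      actions: List[str],
                      target: object) -> bool:
    """Execute an action sequence and compare it to the target denotation.

    Any ExecutionError (e.g. from an unfinished sequence kept at the end
    of beam search, or from a type mismatch at execution time) simply
    counts the prediction as incorrect instead of crashing training.
    """
    try:
        return execute(actions) == target
    except ExecutionError:
        return False
```

This keeps the denotation-accuracy loop free of any logical-form reconstruction.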

Member Author

I was referring to action sequences that terminated early, hence producing incomplete logical forms. Do you have an estimate of how much slowdown this could cause?

Contributor

No, but anything you do on the CPU is time that's not utilizing the GPU for training. And those action sequences will be caught with execution errors anyway, so there's no reason to go to a logical form.

Contributor

Or you could just instruct the beam search to not include unfinished sequences, which is probably the right thing to do, anyway...

Member Author

Okay, removing the has_logical_form metric.

Member Author

Hmm.. one more thing. We do need to show the logical form in the demo, and to be able to convert the action sequence into a logical form, we need the corresponding world objects. I think this is the best place to make a logical form and store it in the output dict. Also, this method is only called during validation, so should not significantly slow down training.

Contributor

Hmm, yeah, fair enough, that's a good point. You could pass the world object in the output dictionary, and create the logical form in decode, but it's probably fine to just put it here during validation.
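The compromise reached here, producing logical forms only for inspection rather than during training, can be sketched like this (the names are illustrative, not the model's actual decode method):

```python
from typing import Callable, Dict, List


def decode_output(action_strings: List[str],
                  to_logical_form: Callable[[List[str]], str],
                  training: bool) -> Dict[str, object]:
    """Build the model's output dict, adding a human-readable logical
    form only when not training (e.g. for validation or a demo)."""
    output: Dict[str, object] = {"best_action_sequence": action_strings}
    if not training:
        # String conversion is CPU work we can afford to skip in training.
        output["logical_form"] = to_logical_form(action_strings)
    return output
```

Since this path only runs at validation or demo time, the CPU cost of building the string does not slow down GPU training.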

@@ -163,17 +163,14 @@ def get_table_knowledge_graph(self) -> KnowledgeGraph:
self._table_knowledge_graph = KnowledgeGraph(entities, dict(neighbors), entity_text)
return self._table_knowledge_graph

# TODO (pradeep): Make a ``read_from_json`` method similar to what we had in ``TableQuestionKnowledgeGraph``
Contributor

Do we need that for a demo?

Member Author

I don't think so. Removing that TODO.

lines: List[List[str]]) -> Tuple[List[Dict[str, Dict[str, str]]],
Dict[str, Set[str]]]:
"""
This method will be called only when we do not have tagged information from CoreNLP. That is, when we are
Contributor

Is it possible to just call corenlp's python API here instead of this? Or would that require a call to java, which we're trying to avoid? It probably would...

Member Author

I don't think we're missing out on a lot by not using CoreNLP here.
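A rough sketch of what processing untagged table rows might involve: deriving entity names from raw header text with simple normalization instead of CoreNLP tags. The function and the `string_column:` naming convention below are illustrative assumptions, not the reader's actual scheme:

```python
import re
from typing import Dict, List


def table_rows_to_entities(lines: List[List[str]]) -> Dict[str, List[str]]:
    """Map a simple entity name for each header cell to the raw strings
    in its column, given untagged table rows (header row first)."""
    header, *rows = lines

    def normalize(text: str) -> str:
        # Lowercase and collapse non-word characters into underscores.
        return re.sub(r"\W+", "_", text.strip().lower()).strip("_")

    columns: Dict[str, List[str]] = {}
    for index, cell in enumerate(header):
        name = f"string_column:{normalize(cell)}"
        columns[name] = [row[index] for row in rows]
    return columns
```

Heuristics like this lose CoreNLP's POS and NER information, which matches the comment above that not much is lost for this use case.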

@@ -320,6 +320,21 @@ def get_agenda(self,
agenda.append(f"Number -> {number}")
return agenda

@staticmethod
def is_instance_specific_entity(entity_name: str) -> bool:
Contributor

Hmm, seems like we should have a function for this on the base class, but maybe it doesn't fit... I'll think about this some more.

@schmmd
Member

schmmd commented May 30, 2019

@pdasigi would like to train a model with this code before merging this PR.

@pdasigi pdasigi force-pushed the wtq_variable_free_model branch 2 times, most recently from 146983f to 457d8da Compare June 17, 2019 21:50
@pdasigi
Member Author

pdasigi commented Jun 17, 2019

I spent a lot of time trying to reproduce results from the best model we got using the code in iterative-search-semparse (reported in this paper). I made some progress in getting there, but it's not fully done. I think the biggest discrepancy still remaining is that generic types (like ComparableColumn) are handled differently here while inducing a grammar. In the current logic in this repo, we add terminals of multiple types, whereas there I had hierarchical types.

I may find other minor differences after I fix this issue.

That said, I think reproducing the result is beyond the scope of this PR. Given that this PR is already huge, and it has important changes (like removing the JDK dependency), I've decided to merge this now and open another PR with changes that ensure we get the same result. Any objections @matt-gardner?

@matt-gardner
Contributor

Is the result close?

@pdasigi
Member Author

pdasigi commented Jun 20, 2019

@matt-gardner Update: I trained an MML model, followed by an ERM model initialized from it using this code. The dev accuracies are both only 1 point behind those from comparable models using the old code. I think that's close enough, and I'm going to merge this PR (after resolving conflicts). I'll spend some time running experiments to try out the things we discussed offline yesterday, and if they give better results, I'll open a new PR.

@pdasigi pdasigi merged commit 0fbd1ca into allenai:master Jun 20, 2019
@pdasigi pdasigi deleted the wtq_variable_free_model branch June 20, 2019 19:21
reiyw pushed a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019
WTQ dataset reader and models moved from iterative-search-semparse (allenai#2764)

* added dataset reader

* moved dataset reader

* moved models

* removed unnecessary except

* fixed the name of the model in docs

* change trainer_test

* minor fixes

* removed unnecessary modules

* removed unnecessary tests

* removed table knowledge graph

* removed table knowledge graph

* updated knowledge graph field test with new context

* fixed docs after removing table knowledge graph

* got the predictor tests passing

* fixed a bug in checking if data is tagged

* minor fixes

* addressed PR comments and added a test for reading untagged tables

* fixed a silly doc issue

* removed jdk dependency

* more language changes

* undid changes made for backwards compatibility

* pylint and mypy fixes

* better error raising
Successfully merging this pull request may close these issues:

Install jre instead of jdk