
Pruning method in WTQ #88

Closed · sophgit opened this issue Nov 18, 2020 · 9 comments

sophgit commented Nov 18, 2020

Hello,

I am new to this topic and I'm currently trying to use the pruning/filtering method for long tables in the WTQ notebook.
I tried using the flag --prune_columns in the prediction function, but it still gives me "Can't convert interaction: error: Sequence too long".
What are the necessary steps to filter/prune long tables during prediction?

Thank you in advance.

ghost commented Nov 18, 2020

Thanks for your interest in TAPAS!

Can you provide some more details? In particular, the exact example (question + table) you're trying to process?

sophgit (Author) commented Nov 18, 2020

Thank you for your quick response. The questions asked were:

result2 = predict(holiday_list_of_list, [
    "Which people are there?",
    "What is the start date of Brittas Südfrankreich Urlaub?",
    "End date of Brittas Südfrankreich Urlaub?",
    "What is the total Duration of Britta Glatts Holidaystyle Urlaub?",
])

This is what the table looks like; it contains 36 rows:

[screenshot of the table]

The predictions worked perfectly when I dropped the last column, "TESTCATEGORY". But when I leave it in the dataframe, I get the error mentioned above.

eisenjulian (Collaborator) commented:

Thanks for the quick response @sophgit. To make debugging easier, do you mind sharing the table in a computer-friendly format, for example a list of lists? Even better, if you can share a colab that reproduces the error, that would be great; you can do that via Google Drive or by saving a GitHub gist from the Save menu.
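
For example, a list of lists in the shape the colab's predict helper expects: header row first and every cell a string. (The values below are made-up placeholders; only the TESTCATEGORY column name is from your screenshot.)

holiday_list_of_list = [
    ["Person", "Start date", "End date", "Duration", "TESTCATEGORY"],  # header row
    ["Britta Glatt", "01.06.2020", "14.06.2020", "14", "Südfrankreich"],
    ["Max Mustermann", "02.07.2020", "09.07.2020", "8", "Holidaystyle"],
]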

sophgit (Author) commented Nov 18, 2020

[shared the table/colab here; the attachment did not survive this export]

ghost commented Nov 18, 2020

Yes, we can open it.

I think the problem is that the current CLI call:

  ! python -m tapas.run_task_main \
    --task="WTQ" \
    --output_dir="results" \
    --noloop_predict \
    --test_batch_size={len(queries)} \
    --tapas_verbosity="ERROR" \
    --compression_type= \
    --reset_position_index_per_cell \
    --init_checkpoint="tapas_model/model.ckpt" \
    --bert_config_file="tapas_model/bert_config.json" \
    --mode="predict" 2> error \
    --prune_columns

only runs the predictions; it assumes that all TF examples have already been created.
The prune_columns flag doesn't affect prediction; it is only read in the CREATE_DATA mode.

The actual conversion that needs the pruning happens in the convert_interactions_to_examples function.
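
For comparison, this is roughly the kind of call where the flag would actually be read (a sketch; "wtq_data" is a placeholder and assumes the WTQ shared-task files have been downloaded into an input_dir, which the prediction colab does not set up):

  ! python -m tapas.run_task_main \
    --task="WTQ" \
    --input_dir="wtq_data" \
    --output_dir="results" \
    --prune_columns \
    --mode="create_data"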

ghost commented Nov 18, 2020

To add pruning to the colab you will have to create a token selector:

from tapas.utils import pruning_utils

# vocab_file and max_seq_length are the values already defined earlier in the
# colab (the model's vocab.txt and 512, respectively).
token_selector = pruning_utils.HeuristicExactMatchTokenSelector(
    vocab_file,
    max_seq_length,
    pruning_utils.SelectionType.COLUMN,
    use_previous_answer=True,
    use_previous_questions=True,
)
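
Note that the selector only needs to be created once; it can then be reused for every interaction. With SelectionType.COLUMN the pruning works at whole-column granularity, roughly: columns are scored by exact token matches against the question and kept until the example fits into max_seq_length.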

and then you can call it just before calling the converter:

    # Inside convert_interactions_to_examples, right before the existing
    # add_numeric_values(...) call:
    interaction = token_selector.annotated_interaction(interaction)
    number_annotation_utils.add_numeric_values(interaction)
    for i in range(len(interaction.questions)):
      try:
        yield converter.convert(interaction, i)
      except ValueError as e:
        print(f"Can't convert interaction: {interaction.id} error: {e}")

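Putting the pieces together, a minimal sketch of what the colab's convert_interactions_to_examples ends up looking like with pruning wired in (this assumes the interaction-building code from the standard WTQ prediction colab, plus the converter and token_selector defined above):

from tapas.protos import interaction_pb2
from tapas.utils import number_annotation_utils

def convert_interactions_to_examples(tables_and_queries):
  """Converts (table, queries) pairs to TF examples, pruning long tables first."""
  for idx, (table, queries) in enumerate(tables_and_queries):
    interaction = interaction_pb2.Interaction()
    for position, query in enumerate(queries):
      question = interaction.questions.add()
      question.original_text = query
      question.id = f"{idx}-0_{position}"
    for header in table[0]:
      interaction.table.columns.add().text = header
    for line in table[1:]:
      row = interaction.table.rows.add()
      for cell in line:
        row.cells.add().text = cell
    # New: prune columns with the token selector so the example fits.
    interaction = token_selector.annotated_interaction(interaction)
    number_annotation_utils.add_numeric_values(interaction)
    for i in range(len(interaction.questions)):
      try:
        yield converter.convert(interaction, i)
      except ValueError as e:
        print(f"Can't convert interaction: {interaction.id} error: {e}")
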
When I tried this, I realized there was a problem with apache_beam not being properly installed.
I had to work around it like this:

import apache_beam as beam

# pruning_utils only needs beam's metrics counters here, so replace them
# with no-op fakes that mirror beam.metrics.Metrics.counter(...).inc().
class FakeCounter:
  def inc(self, n=1):
    pass

class FakeMetrics:
  def counter(self, namespace, name):
    return FakeCounter()

class FakeMetricsModule:
  def __init__(self):
    self.Metrics = FakeMetrics()

beam.metrics = FakeMetricsModule()
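
(All this does is turn beam.metrics.Metrics.counter(...).inc() into a no-op, which seems to be the only part of beam this code path touches; the underlying install stays broken, so the restart fix below is the cleaner option.)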

ghost commented Nov 18, 2020

Looks like the apache_beam thing can also be fixed by restarting the runtime. See #89 for details.

sophgit (Author) commented Nov 19, 2020

Thank you so much!!! It seems to work. At least I don't get an error anymore and it does predict. Unfortunately the answers to the questions above are mainly incorrect now, but I'll see if I can work with that. :)

ghost commented Nov 19, 2020

Great that it's working for you now.

I am closing this issue; feel free to open a new one for any model quality problems, and we can see if there is something we can do about it.

ghost closed this as completed Nov 19, 2020