
How to execute own queries? #5

Open

kev2513 opened this issue Feb 4, 2021 · 9 comments
kev2513 commented Feb 4, 2021

Hello, I would like to insert my own questions and databases, but when I try to change the Spider JSON files I get the error:

RuntimeError: Error(s) in loading state_dict for EncDecModel:
	size mismatch for decoder.rule_logits.2.weight: copying a param with shape torch.Size([97, 128]) from checkpoint, the shape in current model is torch.Size([76, 128]).
	size mismatch for decoder.rule_logits.2.bias: copying a param with shape torch.Size([97]) from checkpoint, the shape in current model is torch.Size([76]).
	size mismatch for decoder.rule_embedding.weight: copying a param with shape torch.Size([97, 128]) from checkpoint, the shape in current model is torch.Size([76, 128]).
	size mismatch for decoder.node_type_embedding.weight: copying a param with shape torch.Size([55, 64]) from checkpoint, the shape in current model is torch.Size([49, 64]).

Is there an elegant solution for testing my own data?
Thanks in advance!

kev2513 changed the title from "How to train with own queries?" to "How to execute own queries?" on Feb 4, 2021
Impavidity added a commit to Impavidity/gap-text2sql that referenced this issue Feb 5, 2021
Impavidity (Contributor) commented

Hey,
Thanks for your interest in our work. You can check out pull request #6 once it is merged; I think you will be able to run your own database and queries based on the notebook I provided.

Let me know if it works for you, and whether you have any further questions.

Peng

kev2513 (Author) commented Feb 6, 2021

Hello Peng,

Thank you very much for your quick response! I tried the notebook and it worked 👍 I will let you know if I have any questions. Have a nice weekend.

Kevin

kev2513 (Author) commented Feb 6, 2021

Hello Peng,

I ran further tests and noticed that the response sometimes contains the word 'terminal', for example:

Query: department with budget greater then 10 billion

Answer: SELECT department.Department_ID FROM department WHERE department.Budget_in_Billions > 'terminal'

I guess 'terminal' should be replaced by values contained in the query. How can this replacement be achieved?

Sincerely
Kevin

Impavidity (Contributor) commented

Hey Kevin,

Thanks for your question. The terminal will usually be a cell value: a float/integer or a string.
Filling it in usually requires a value-copy mechanism, which the model currently does not support.

However, there is a simple workaround:
If the value is a number, you can detect it in the utterance and fill it directly into the generated SQL. For string values, you can match n-grams of the utterance against the cell values in the database: if an n-gram matches, it is likely the string value for the corresponding column.

I have a script that does this, but it will take some time to clean it up and make it public. You can try the method yourself in the meantime, since it is fairly simple; a rough sketch follows.
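Something along these lines (a rough Python sketch of the idea, assuming a sqlite3 Spider-style database; the function names are mine and not part of this repo):

import re
import sqlite3

def detect_numbers(utterance):
    # Numeric literals in the question, e.g. "10" in "budget greater then 10 billion".
    return re.findall(r"\d+(?:\.\d+)?", utterance)

def match_string_value(utterance, db_path, table, column, max_n=4):
    # Match word n-grams of the utterance against the distinct cell values
    # of one column; longer n-grams are tried first.
    words = utterance.lower().split()
    conn = sqlite3.connect(db_path)
    rows = conn.execute(f'SELECT DISTINCT "{column}" FROM "{table}"').fetchall()
    conn.close()
    values = {str(r[0]).lower() for r in rows if r[0] is not None}
    for n in range(max_n, 0, -1):
        for i in range(len(words) - n + 1):
            candidate = " ".join(words[i:i + n])
            if candidate in values:
                return candidate
    return None

def fill_numeric_terminals(sql, utterance):
    # Replace each 'terminal' placeholder with the next number found in the
    # utterance; string-typed columns would use match_string_value instead.
    for number in detect_numbers(utterance):
        sql = sql.replace("'terminal'", number, 1)
    return sql

print(fill_numeric_terminals(
    "SELECT department.Department_ID FROM department "
    "WHERE department.Budget_in_Billions > 'terminal'",
    "department with budget greater then 10 billion"))
# SELECT department.Department_ID FROM department
#   WHERE department.Budget_in_Billions > 10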

I will try to make the script public as soon as possible in case you do not implement it yourself.

Peng

kev2513 (Author) commented Feb 7, 2021

Hey Peng,

Thank you very much for your explanation. I will try my best :)

Sincerely
Kevin

thecodemakr commented
Hi @Impavidity @kev2513, I get the following error when trying the notebook:

WARNING <class 'seq2struct.models.enc_dec.EncDecModel.Preproc'>: superfluous {'name': 'EncDec'}
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-21-d986dbd802ee> in <module>()
----> 1 inferer = Inferer(infer_config)

4 frames
/content/gap-text2sql/rat-sql-gap/seq2struct/commands/infer.py in __init__(self, config)
     34             registry.lookup('model', config['model']).Preproc,
     35             config['model'])
---> 36         self.model_preproc.load()
     37 
     38     def load_model(self, logdir, step):

/content/gap-text2sql/rat-sql-gap/seq2struct/models/enc_dec.py in load(self)
     54 
     55         def load(self):
---> 56             self.enc_preproc.load()
     57             self.dec_preproc.load()
     58 

/content/gap-text2sql/rat-sql-gap/seq2struct/models/spider/spider_enc.py in load(self)
   1272 
   1273     def load(self):
-> 1274         self.tokenizer = BartTokenizer.from_pretrained(self.data_dir)
   1275 
   1276 

/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, *inputs, **kwargs)
   1138 
   1139         """
-> 1140         return cls._from_pretrained(*inputs, **kwargs)
   1141 
   1142     @classmethod

/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py in _from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
   1244                     ", ".join(s3_models),
   1245                     pretrained_model_name_or_path,
-> 1246                     list(cls.vocab_files_names.values()),
   1247                 )
   1248             )

OSError: Model name 'data/spider-bart/nl2code-1115,output_from=true,fs=2,emb=bart,cvlink/enc' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). We assumed 'data/spider-bart/nl2code-1115,output_from=true,fs=2,emb=bart,cvlink/enc' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.

Can you please help me figure out which step I am missing?

kev2513 (Author) commented Feb 28, 2021

Hello @thecodemakr,

I got the same issue when executing Inference; running the following command solved the problem for me:

python run.py preprocess experiments/spider-configs/gap-run.jsonnet

(also execute the Preprocess dataset step in advance)
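For context, the call that fails in the traceback is a tokenizer load from a local directory; a minimal reproduction (the path is copied from the traceback):

from transformers import BartTokenizer

# from_pretrained() treats this as a local directory and looks for
# vocab.json and merges.txt inside it. As far as I can tell, the
# preprocess command above is what writes those files, which is why
# the load succeeds after preprocessing.
enc_dir = "data/spider-bart/nl2code-1115,output_from=true,fs=2,emb=bart,cvlink/enc"
tokenizer = BartTokenizer.from_pretrained(enc_dir)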

alan-ai-learner commented
Can you please tell me how long this command should run: "python run.py preprocess experiments/spider-configs/gap-run.jsonnet"? I have been running it for about an hour.

roburst2 commented

@thecodemakr I am also facing the same OSError as above while running the notebook. How did you resolve it?
