
Questions about Database access in Serve Mode #32

Closed · adityay121 opened this issue Dec 2, 2021 · 9 comments
Labels: question (Further information is requested)

@adityay121

Hey @tscholak, I have been fiddling with your model for a while now and I love the work you have done. I just wanted to ask a few questions about the files that go into the ./database folder when you deploy it in serve mode. As per the README, the format is supposed to be like the one shown below:

database/
  my_1st_database/
    my_1st_database.sqlite
  my_2nd_database/
    my_2nd_database.sqlite

I am just wondering about the content of each of these files: are they supposed to contain both the schema and the rows of data?

Another thing: for the use case I want to try your model on, my data is stored on a Postgres server on AWS. I can't really convert these databases to SQLite or even export the data to a local machine (company policy). How would you suggest getting the model working with such a setup, and what changes would I have to make?

Thank you for taking the time.

@adityay121
Author

As a follow-up to this, I have also seen that the model returns the expected output by executing the generated SQL query. In the use case I want to test the model on, it's not really possible for me to feed the model with data (rows in SQL tables). I am confused about why the model needs data when, as far as I understand, it only needs the database schema as input; could you please help me understand that?

@tscholak added the question label Dec 4, 2021
@tscholak
Collaborator

tscholak commented Dec 4, 2021

Hi!

I am just wondering about the content in each of these files, are they supposed to have both the schema and the rows of data?

The files must contain the (portion of the) schema the model is expected to work on. They may otherwise be empty. If they contain rows with real entity names, the model may perform better: the inference pipeline searches the database for entities that match nouns and phrases from the user's question and, if found, adds that information to the model's input. The model can then use it to generate a more accurate SQL query.
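
For illustration, here is a minimal sketch of creating such a schema-only database file with Python's sqlite3 module (the table and column names below are made-up placeholders):

    import os
    import sqlite3

    # Create an empty database file that contains only the schema, no rows.
    os.makedirs("database/my_1st_database", exist_ok=True)
    conn = sqlite3.connect("database/my_1st_database/my_1st_database.sqlite")
    conn.executescript("""
        CREATE TABLE singer (
            singer_id INTEGER PRIMARY KEY,
            name      TEXT,
            country   TEXT
        );
        CREATE TABLE concert (
            concert_id INTEGER PRIMARY KEY,
            singer_id  INTEGER REFERENCES singer(singer_id),
            venue      TEXT
        );
    """)
    conn.commit()
    conn.close()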

@tscholak
Collaborator

tscholak commented Dec 4, 2021

how would you suggest [I] get the model working [if] I can't really convert [the Postgres databases] to SQLite?

You can reproduce the schema in SQLite. That gets you something that works OK in many cases.
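
As a rough sketch of how you might mirror a Postgres schema into an empty SQLite file (assuming psycopg2 is installed, you have read access to information_schema, and using a deliberately naive type mapping; the connection details are placeholders):

    import os
    import sqlite3

    import psycopg2  # assumed to be installed

    # Read the table/column layout from Postgres.
    pg = psycopg2.connect(host="my-host", dbname="my_db", user="me", password="...")
    cur = pg.cursor()
    cur.execute("""
        SELECT table_name, column_name, data_type
        FROM information_schema.columns
        WHERE table_schema = 'public'
        ORDER BY table_name, ordinal_position
    """)
    tables = {}
    for table, column, dtype in cur.fetchall():
        tables.setdefault(table, []).append((column, dtype))

    def to_sqlite_type(pg_type: str) -> str:
        # Naive Postgres -> SQLite type mapping; extend as needed.
        if "int" in pg_type:
            return "INTEGER"
        if pg_type in ("numeric", "real", "double precision"):
            return "REAL"
        return "TEXT"

    # Recreate the same tables, empty, in a local SQLite file.
    os.makedirs("database/my_db", exist_ok=True)
    lite = sqlite3.connect("database/my_db/my_db.sqlite")
    for table, columns in tables.items():
        cols = ", ".join(f'"{name}" {to_sqlite_type(dtype)}' for name, dtype in columns)
        lite.execute(f'CREATE TABLE "{table}" ({cols})')
    lite.commit()
    lite.close()

Note that this sketch drops constraints such as primary and foreign keys, so you may want to add those to the SQLite schema by hand.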

@tscholak
Collaborator

tscholak commented Dec 4, 2021

why does the model need data when it only needs the input database schema?

The model can work with a database that is empty except for the schema. In that case, the queries will return empty results.

@tscholak
Collaborator

tscholak commented Dec 6, 2021

@adityay121 I take your thumbs-up reaction as a sign that the issue was sufficiently addressed, was it not?

@adityay121
Author

adityay121 commented Dec 7, 2021

Hey @tscholak, sorry I forgot to reply to you. Yes, most of my questions have been addressed above, thank you. Out of curiosity, how can I increase the sequence length of the output predictions, in case I want a longer output? I also wanted to know your opinion on the following:

  • Can the engineering tricks from the Terraformer paper help reduce the inference time for this model?

@adityay121
Author

adityay121 commented Dec 7, 2021

Hi @tscholak, sorry to bother you. I was hoping you could help me understand the following issues that I am facing.

  • I get the warning Token indices sequence length is longer than the specified maximum sequence length for this model (some number > 512). Running this sequence through the model will result in indexing errors.

    • After some time I get the line make: *** [Makefile:173: serve] Error 137, the uvicorn server terminates, and I get a Connection aborted message on my client side.
    • The database I am using for this case has many columns, which I believe is the culprit. I get this when using the default parameters in serve.json. From what I know, this occurs because the model is a smaller variant of the pretrained model on your Hugging Face page. Would changing the model id in serve.json help in using a larger model, or does it have to do with the RAM / VRAM on the server's end?
  • I have tried to deploy the model on my personal machine with an Nvidia GPU and have installed the Nvidia Container Runtime as well, but I still get an error that the docker daemon is unable to detect CUDA.

    • Do you think that I have to change something in the Makefile?
  • In serve mode I don't really want the model to run the generated SQL query; I just want the query itself. Do I need to change something in the serve.py file?

@adityay121 reopened this Dec 7, 2021
@tscholak
Collaborator

tscholak commented Dec 8, 2021

Hi @adityay121!

how can I increase the sequence length of the output predictions

This is a configuration parameter. You can set max_target_length to a higher value in the config file. The default is 256 tokens, I believe.
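
Roughly speaking, that value ends up as the maximum generation length of the underlying Hugging Face model. A standalone sketch (the input string is only schematic, and the checkpoint is one of those from the Hugging Face page):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("tscholak/cxmefzzi")
    model = AutoModelForSeq2SeqLM.from_pretrained("tscholak/cxmefzzi")

    inputs = tokenizer(
        "How many singers do we have? | concert_singer | singer : singer_id, name, country",
        return_tensors="pt",
    )
    output_ids = model.generate(
        inputs.input_ids,
        max_length=512,  # corresponds to max_target_length; raise this for longer SQL
        num_beams=4,
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))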

Can using the engineering tricks from the Terraformer paper help reduce the inference time for this model?

In theory, yes. However, this paper introduces a new transformer architecture that needs to be trained from scratch. As far as I can see, it is not possible to convert an existing pre-trained and/or fine-tuned T5 model to that architecture to make use of the sparse optimizations described in the paper. If Google releases a T5 model checkpoint based on the Terraformer architecture and Huggingface adds an implementation in transformers, then there will be an opportunity to fine-tune on Spider and benefit from the better inference time. Without these things, we are out of luck.

Token indices sequences length is longer ...

This is a warning that is shown once when the number of input tokens exceeds the maximum (512 tokens, I believe). In that case, the input is truncated and inference is run with the remainder. While this is a lossy intervention, it is not a critical error, and your program will continue working afterwards.
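
A minimal illustration of the lossy truncation the pipeline falls back to (using the plain t5-small tokenizer as a stand-in):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("t5-small")

    long_input = "question | a schema with many, many columns " * 100
    # Without truncation=True, the tokenizer emits the warning you saw;
    # with it, the input is silently cut at the model's maximum length.
    encoded = tokenizer(long_input, truncation=True, max_length=512, return_tensors="pt")
    print(encoded.input_ids.shape)  # at most (1, 512)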

After some time I get the following line ...

This could be the issue described in #20. Try increasing the memory allocated to docker. 8 GB is the absolute minimum; I am using up to 64 GB.

The database that I am using for this case has many columns ...

The bigger the input to the model, the larger the memory requirements. The smaller the model, the smaller those requirements.

I have tried to deploy the model on my personal machine with an Nvidia GPU ... Do you think that I have to change something on the Makefile?

Yes, you have to make sure that the CUDA device is visible to PyTorch in the docker container. Run the Python interpreter, import torch, and call torch.cuda.is_available(). If that doesn't give True, then try using nvidia-docker instead of docker in the Makefile. If you don't have/use nvidia-docker, try using the method described in https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html. You also have to use a recent GPU driver that is compatible with CUDA 11.2.
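
That is, inside the container:

    import torch

    # Must print True, otherwise inference will fall back to the CPU.
    print(torch.cuda.is_available())
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))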

I would just want the query which has been generated.

For that, you need to change the serving code. See https://github.com/ElementAI/picard/blob/e37020b6eee18bff865d9d2ba852bd636f3ed777/seq2seq/serve_seq2seq.py#L133.
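
As a hypothetical sketch of the kind of change (the names here do not match serve_seq2seq.py exactly):

    from fastapi import FastAPI

    app = FastAPI()

    def generate_sql(question: str, db_id: str) -> str:
        # Placeholder for the repo's text-to-SQL pipeline call.
        ...

    @app.get("/ask/{db_id}/{question}")
    def ask(db_id: str, question: str):
        sql = generate_sql(question, db_id)
        # Return the generated query instead of executing it against the database.
        return {"query": sql}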

@adityay121
Author

Hello Dr. @tscholak, thank you for taking the time out and helping me get things done 🙏
