
Slow performance #8

Closed
guptam opened this issue May 13, 2020 · 18 comments

@guptam

guptam commented May 13, 2020

Hi,

Thanks for releasing this as open source. Wonderful concept, and very different from other seq2seq or ln2sql-like approaches.

I am facing performance issues when trying the SQA prediction notebook (using SQA Large). It takes more than 60 seconds on a dual-GPU machine to evaluate the model and return a response to a query. Is this normal? How can we improve the prediction time?

Thanks,
Manish

@eisenjulian
Collaborator

Hello @guptam, thanks for the interest and the question. To better understand your problem, can you confirm whether the GPUs are being used? Is that time consistent with what you get on Google Colab? You can change the verbosity flag and look at the error log file to see if anything is going wrong with accelerator usage.

While the notebook is a good example for seeing some predictions, there's a lot of overhead: every time you run the prediction cell a new Python runtime is spun up, and the full model has to be loaded from the checkpoint. That's probably taking up most of those 60 seconds, so if you want to predict on multiple examples you probably want to do it at once for all of them, which is already supported by the evaluation script.
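
For example, something along these lines writes all converted examples into one file so the model only has to be loaded once for the whole batch (a rough sketch only: convert_to_tf_example is a hypothetical stand-in for the conversion helper used in the notebook, and the file name is arbitrary):

import tensorflow.compat.v1 as tf

queries = [...]  # (question, table) pairs prepared beforehand

# Write every converted example into a single TFRecord file, then run the
# evaluation script once over that file instead of once per question.
with tf.io.TFRecordWriter("all_examples.tfrecord") as writer:
    for question, table in queries:
        example = convert_to_tf_example(question, table)  # hypothetical helper
        writer.write(example.SerializeToString())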

@guptam
Author

guptam commented May 13, 2020

Hi @eisenjulian. Thanks for the quick response.

I am not using Colab. TF is using both GPUs. I will try separating the model loading from the evaluation. I was just doing a quick test and might have missed this part. I will create a new predict function and use a pre-loaded model.

Regards,
Manish

@eldhosemjoy

Hi @guptam, I believe the main overhead for prediction comes from loading the checkpoints, as well as from writing the input out as a tf_example file and then writing the predictions to another file. The current code in the Google Colab actually runs a predict task that generates two files, one of which is empty. I guess moving the file creation and model loading out of the prediction path would give you improved performance. Are you looking for a script that only does prediction?

@guptam
Author

guptam commented May 16, 2020

Hi @eldhosemjoy. Yes, only looking for a script to predict using an already loaded model.

@eldhosemjoy

Hi @guptam, here is a suggestion for implementing the prediction script.

You will need to:

  • Remove the dataset-creation step that writes the protos to a file, and instead keep the feature protos in memory as a collection.
  • Convert those proto features using the existing code in tapas/experiments/prediction_utils.py, removing the tf session on lines 57 to 68.
  • Generate the features as int32 (this also needs a change in the proto-creation file, i.e. tf_example_utils.py).
  • Take the first example and send it to the separated estimator model function (def model_fn) from tapas/models/tapas_classifier_model.py at line 919, which returns the predictions.
  • Use a session and graph to validate the prediction, and adapt write_predictions from tapas/experiments/prediction_utils.py so it returns a JSON.

I hope this helps you build a prediction script; it should definitely improve the performance.

Later on, you could use tf.placeholders for your features and invoke the model_fn using a session and graph.
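
As a very rough sketch of that pattern (not code from the repo: it assumes model_fn follows the standard Estimator signature (features, labels, mode, params), the feature names and shapes are illustrative, and build_feature_dict is a hypothetical helper producing the int32 features described above):

import tensorflow.compat.v1 as tf

MAX_SEQ_LENGTH = 512  # assumption; use the sequence length the checkpoint expects

graph = tf.Graph()
with graph.as_default():
    # Placeholders for an illustrative feature layout; the real TAPAS model
    # needs additional inputs such as row/column ids.
    features = {
        "input_ids": tf.placeholder(tf.int32, [1, MAX_SEQ_LENGTH]),
        "input_mask": tf.placeholder(tf.int32, [1, MAX_SEQ_LENGTH]),
        "segment_ids": tf.placeholder(tf.int32, [1, MAX_SEQ_LENGTH]),
    }
    # Build the prediction graph once; spec.predictions holds the output tensors.
    spec = model_fn(features, labels=None, mode=tf.estimator.ModeKeys.PREDICT, params={})
    saver = tf.train.Saver()

sess = tf.Session(graph=graph)
saver.restore(sess, "/path/to/model.ckpt")  # restore the weights a single time

def predict(question, table):
    feed = build_feature_dict(question, table)  # hypothetical feature builder
    return sess.run(spec.predictions,
                    feed_dict={features[k]: feed[k] for k in features})

This keeps the graph and weights in memory, so each call to predict only pays for the forward pass.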

I believe there are even better ways and would love to hear about them.

Thanks

@eisenjulian
Collaborator

To add a clarification: the current script, as used in the notebook, should work as-is to predict on a large number of examples, with only a minimal change to dump all of your examples into a single tf_example file. Only when running it multiple times with just one example at a time is there a lot of overhead, both because the model is loaded again every time run_task_main.py is run, and because size-1 batches waste GPU/TPU parallelism.

On the other hand, if you want to load the model in the notebook, the easiest way would be to copy the contents of the run_task_main.py file into a notebook as a starting point. The estimator object which is defined here contains the model in memory and can be used to train and/or predict.
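
For example, something along these lines keeps the model in memory between calls (a sketch only: build_tapas_estimator stands in for the estimator construction you would copy out of run_task_main.py, and convert_to_features for the example conversion done in the notebook):

import tensorflow.compat.v1 as tf

# Build the estimator once; this is the expensive step.
estimator = build_tapas_estimator(model_dir="/path/to/checkpoint")  # hypothetical

def predict(question, table):
    def input_fn():
        # Wrap a single converted example in a size-1 batch. Batching several
        # questions together makes much better use of the GPU/TPU.
        features = convert_to_features(question, table)  # hypothetical
        return tf.data.Dataset.from_tensors(features).batch(1)
    return list(estimator.predict(input_fn=input_fn))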

I hope this helps; otherwise, please give us more info to help us understand your use case.

@eldhosemjoy

eldhosemjoy commented May 18, 2020

@eisenjulian Absolutely. I suggested the above option for single-example prediction in a real-time, chat-based UI. To be more specific, I guess @guptam was looking to leverage the repo and the model for single predictions, more towards a service/API-level adaptation of run_task_main.py, if I have understood it right.

@eisenjulian
Collaborator

Thanks for the clarification. If what you are looking for is a service that does predictions in real time, there are a few alternatives:

  • Create the estimator object as I described before and predict when needed from a web server. This is not the recommended approach for production use cases.
  • Export a SavedModel from the estimator (https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators) and load it with a TensorFlow Serving server. That service receives serialized tf_examples, so you will need to create the tf_examples before calling it, just as is done in the notebook; you will likely do this in a Python server that loads the vocabulary file and the tapas lib. A sketch of the export step is below.
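
As a minimal sketch of the export step, assuming the estimator object is already built (the feature spec is illustrative only; the real feature names and shapes must match what tf_example_utils.py puts into the tf_examples):

import tensorflow.compat.v1 as tf

MAX_SEQ_LENGTH = 512  # assumption

# Parse the serialized tf_examples the server receives into model features.
feature_spec = {
    "input_ids": tf.io.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
    "input_mask": tf.io.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
    "segment_ids": tf.io.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
}
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)

# Writes a timestamped SavedModel directory that TensorFlow Serving can load.
export_path = estimator.export_saved_model("/path/to/export", serving_input_fn)

The exported directory can then be served with something like tensorflow_model_server --model_name=tapas --model_base_path=/path/to/export --rest_api_port=8501, and queried with serialized tf_examples built the same way as in the notebook.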

@monuminu

@eldhosemjoy

Can you please share your approach, as I am also trying to do the same? It would save a lot of time.

@eldhosemjoy

@monuminu, @guptam - I have implemented the prediction as a service. You can have a look here - TAPAS Service Adaptation

@Akshaysharma29

Hi @eldhosemjoy, thanks for sharing your work, but while trying to use your code on Colab I am facing the issue below:

path of model:/content/temp/tapas/model/tapas_sqa_base.pb
---------------------------------------------------------------------------
DecodeError                               Traceback (most recent call last)
<ipython-input-14-c82d684a8d52> in <module>()
----> 1 tapaspredictor  = TapasPredictor()

2 frames
<ipython-input-13-5e395447c7d6> in load_frozen_graph(self, frozen_graph_filename)
     80     with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
     81         graph_def = tf.GraphDef()
---> 82         graph_def.ParseFromString(f.read())
     83 
     84     # Then, we import the graph_def into a new Graph and returns it

DecodeError: Error parsing message

@eldhosemjoy

@Akshaysharma29 you will need to do a git lfs pull or clone. The model is an LFS object.

@Akshaysharma29

Akshaysharma29 commented Sep 25, 2020

@eldhosemjoy thanks for the quick response.
OK, I will try it.

@eldhosemjoy

git lfs clone https://github.com/eldhosemjoy/tapas.git
This will pull the model into the repository.
@Akshaysharma29 You could take a look at this and run it directly from the directory - https://github.com/eldhosemjoy/tapas/blob/master/test/class_test.py

@Akshaysharma29

Akshaysharma29 commented Sep 28, 2020

Thanks, @eldhosemjoy, it's working. Which SQA model version did you convert?

@eldhosemjoy

@Akshaysharma29 The SQA Base - https://storage.googleapis.com/tapas_models/2020_04_21/tapas_sqa_base.zip

@rahulyadav02

rahulyadav02 commented Dec 24, 2020

Hi @eldhosemjoy ,
I'm also facing an issue with slow performance.
In your repo TAPAS Service Adaptation, you have created a new class for prediction, where you are loading the saved model from the config.json file.
Can you help me with where exactly in the repo you are saving the model?

@TheurgicDuke771

git lfs clone https://github.com/eldhosemjoy/tapas.git
This will have the model pulled in to the repository.
@Akshaysharma29 You could take a look at this and you can directly run it from the directory - https://github.com/eldhosemjoy/tapas/blob/master/test/class_test.py

Hi @eldhosemjoy, while running class_test.py I am getting this error: question_id = example["question_id"][0, 0].decode("utf-8") TypeError: 'Example' object is not subscriptable
