
How to overcome memory issues when predicting large batches of data? #69

Closed
SaadAhmed433 opened this issue Oct 18, 2022 · 3 comments

@SaadAhmed433

Hello team,

I have a dataset of about 8000 comments; each comment is around 6 to 8 words (some are shorter, with only 2 words).

The problem is that I am unable to get the predictions because I run out of GPU memory during the process. To work around this, I loop over the comments in batches and append the results to a DataFrame:

comments_list = comments["text"].to_list()
df = pd.DataFrame()

for i in range(0, len(comments_list), 32):
    comms = comments_list[i : i + 32]
    results = Detoxify("original", device=device).predict(comms)
    results = pd.DataFrame(results)
    df = df.append(results, ignore_index=True)

Is there a more efficient way of doing this than writing a for loop?

Currently I have a 16GB Tesla T4 GPU.

Thanks!

@laurahanu
Collaborator

Hello, you should initialise the model only once and assign it to a variable, so that it isn't re-created at each iteration:

model = Detoxify("original", device=device)
comments_list = comments["text"].to_list()
df = pd.DataFrame()

for i in range(0, len(comments_list), 32):
    comms = comments_list[i : i + 32]
    results = model.predict(comms)
    results = pd.DataFrame(results)
    df = df.append(results, ignore_index=True)

Now you should be able to use a bigger batch size as well.
Hope this helps!
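As a further tweak (an editor's sketch, not part of the original answer): `DataFrame.append` was deprecated and later removed in pandas 2.0, and appending inside the loop copies the frame each time. Collecting per-batch frames in a list and concatenating once avoids both issues. The `predict_in_batches` helper name and the batch size of 64 are assumptions; the `model.predict` call mirrors the Detoxify usage shown above.

```python
import pandas as pd

def predict_in_batches(model, texts, batch_size=64):
    """Run model.predict over `texts` in chunks and return one DataFrame.

    Collects each batch's results in a list and concatenates once at the
    end, which is idiomatic for modern pandas (no DataFrame.append).
    """
    frames = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i : i + batch_size]
        # model.predict returns a dict of {label: list_of_scores}
        frames.append(pd.DataFrame(model.predict(batch)))
    return pd.concat(frames, ignore_index=True)
```

Usage would then be a single call, e.g. `df = predict_in_batches(model, comments_list, batch_size=32)`, with the batch size tuned to whatever fits in GPU memory.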

@SaadAhmed433
Author

Thanks for the suggestion and for pointing that out. The change worked remarkably well.

Previously it averaged around 7 minutes to get all the predictions; now everything is done in about 8 seconds.

@laurahanu
Collaborator

Great, glad it helped!
