[FEA] Model inference locally #359
Comments
@Ahanmr Do you mean model accuracy by "model performance"? If yes, you can use `trainer.evaluate()`, and for a TensorFlow model you can do the equivalent; you don't need an inference server for that, if that's your question.
So, the following lines of code helped me get the metrics for the test set, but I wanted to clarify whether I'm doing this correctly before I go ahead and deploy this:

```python
eval_metrics = trainer.evaluate(eval_dataset=eval_data_paths, metric_key_prefix='eval')
for key in sorted(eval_metrics.keys()):
    print("  %s = %s" % (key, str(eval_metrics[key])))
```

After this, to get the predictions on a batch of product_id sessions, I do the following:

```python
eval_dataload = get_dataloader(eval_data_paths, train_args.per_device_eval_batch_size)
batch = next(iter(eval_dataload))
```

and `batch` looks like this:

```
{'product_id-list_seq': tensor([[  642,   642,   642,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0],
        [ 2962,  2962,  2962,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0],
        [12252, 10221, 18489, 20159, 31245, 10221, 18885, 12252, 18885,
         12252, 18885, 12252,  2886, 20159,  2886, 20159,  2886, 31245,
         78571, 10221],
        [ 5005,  1839,  3112, 15067,  3112,   455,  3503,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0],
        ...
        [ 6589,  1881,   185,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0]], device='cuda:0')}
```

Then finally, on running `model(batch)`, I got the following output dictionary. I just wanted to clarify: what are `labels` here? Is this the target or the predicted output?

```python
output = model(batch)
```

```
{'loss': tensor(12.9607, device='cuda:0', grad_fn=<NllLossBackward0>),
 'labels': tensor([   642,    642,   2962,   2962,  10221,  18489,  20159,  31245,  10221,
          18885,  12252,  18885,  12252,  18885,  12252,   2886,  20159,   2886,
          20159,   2886,  31245,  78571,  10221,   1839,   3112,  15067,   3112,
            455,   3503,  27648,  33940, 371869,   1693,   3809,   1693,   1152,
          39043,  20117, 162305,  42138,   1976,  11139,   1920,    746,    746,
            322,   3780,   3121, 185678,   3121,   3780,   3121,   3780,   3121,
           3121,  73064,  73064,  73064, 419261, 147073,  95499, 175334,  59296,
           1497,   2602,   2602, 462300,  50554,   7634,  50554,  74959,  50554,
         225943,   3525,   1264,    501,   1264,     79,   6027,   6027,  31012,
             21,      6,    306,   4363,  14283,  30137,   4363,  30137, 224529,
         143820, 224522,    925,    576,   1902,  17982,  25200, 162768, 162772,
         203166, 162772, 162768,  85980,  62426,   1881,    185],
        device='cuda:0'),
 'predictions': tensor([[ -40.8026,  -41.2528,  -13.0302,  ...,  -41.4139,  -41.3386,
           -41.0184],
         [ -40.8026,  -41.2528,  -13.0302,  ...,  -41.4139,  -41.3386,
           -41.0184],
         [ -57.8077,  -57.5196,  -21.6697,  ...,  -58.0113,  -55.8364,
           -57.8259],
         ...,
         [ -16.0957,  -16.1184,  -10.7048,  ...,  -16.0890,  -16.1076,
           -16.0958],
         [-103.6853, -102.6834,  -13.4621,  ..., -100.6037, -104.4472,
          -104.3079],
         [ -70.6631,  -70.4196,   -5.5336,  ...,  -71.1709,  -70.4627,
           -70.9110]], device='cuda:0', grad_fn=<LogSoftmaxBackward0>),
 'pred_metadata': {},
 'model_outputs': []}
```

So, I extracted …
I am also confused about this. Is there a way to get sorted predictions for each session along with their labels (the predicted output)?
@SaadAhmed433 The model generates predicted logit values, and you need to convert them to the recommended item_ids. See this example for how we do it with a util function (cell 23). Hope that helps.
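A minimal sketch of that conversion, assuming the `predictions` tensor holds per-item log-softmax scores as in the output dict above (`top_k` and the variable names are illustrative, not the notebook's util function):

```python
import torch

top_k = 10
# output['predictions'] has shape (num_sessions, num_items);
# higher scores mean more likely next items.
scores, encoded_item_ids = torch.topk(output['predictions'], k=top_k, dim=-1)
# encoded_item_ids[i] holds the top-k *encoded* item ids for session i,
# sorted from most to least likely; mapping them back to the original
# item ids is discussed further down the thread.
print(encoded_item_ids[0])
```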
Yes, I have been looking at this notebook for some time now, and after a bit of experimentation I was able to make it work for my use case. Hopefully the implementation is correct @rnyak.

The batch contains 32 sessions. I store the prediction logits in a `response` variable, then load the data that I have pre-processed. I made a very slight change to the util function because I am not using the tritonclient inference server, and finally I print the output.
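The original snippet from this comment was not preserved, but here is a sketch of what such a tritonclient-free adaptation could look like; `print_top_k_per_session` and `top_k` are illustrative names, not the notebook's actual util function:

```python
import torch

def print_top_k_per_session(output, top_k=10):
    # 'output' is the dict returned by calling model(batch) locally, so the
    # logits can be read directly instead of parsing a Triton InferResult.
    scores, encoded_ids = torch.topk(output['predictions'], k=top_k, dim=-1)
    for i in range(encoded_ids.shape[0]):
        print(f"session {i}: top-{top_k} encoded item ids = {encoded_ids[i].tolist()}")

print_top_k_per_session(model(batch))
```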
@SaadAhmed433 how did you define the method "get_dataloader"?
@kontrabas380 I followed the tutorial example given in this notebook
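For reference, a sketch of how such a helper can be defined. This is an assumption about the API, not the notebook's exact code: it assumes the `MerlinDataLoader` class from recent Transformers4Rec releases (older notebooks used `NVTabularDataLoader`), and that `schema` and `max_sequence_length` match the training setup:

```python
from transformers4rec.torch.utils.data_utils import MerlinDataLoader

def get_dataloader(paths_or_dataset, batch_size=32):
    # schema is the Merlin schema the model was trained with;
    # max_sequence_length must match the value used at training time.
    return MerlinDataLoader.from_schema(
        schema,
        paths_or_dataset,
        batch_size,
        max_sequence_length=20,
        shuffle=False,
    )
```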
Hi All, first of all, thank you for sharing this awesome code with us! I'm afraid there's an issue with local inference as explained above: if the masking layer is retained in a trained model and not disabled (I'm not sure there's an easy way to turn it off or remove it), then at inference time it will replace the embedding of the last element of every sequence with a learnable masking embedding. To my understanding, this effectively makes the model ignore the last, most recently consumed item.
@SaadAhmed433: About the local prediction: in your code, `batch = next(iter(trainer.get_eval_dataloader()))` only returns 32 sessions. Do you know how to loop through the whole file to get all sessions? Does the eval dataloader remember the last batch read position, so that we can just keep calling `next` until the batches are exhausted, or does `next(iter(...))` just randomly return 32 records? Thanks.

`test_path = ['./data/sessions_by_day_mind/6/test.parquet']`
Yes, you can keep calling `next` and you will get the next batch in sequence; the iterator remembers where it ended and will start from the new position. Create the iterator once from the dataloader, then loop over the length of the dataloader, calling `next` on each step, as sketched below.
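A minimal sketch of that loop, reusing `trainer.get_eval_dataloader()` from the question above and assuming `model(batch)` returns the output dict shown earlier in the thread:

```python
import torch

eval_dataloader = trainer.get_eval_dataloader()
batch_iter = iter(eval_dataloader)  # create the iterator once

model.eval()
all_predictions = []
with torch.no_grad():
    # one next() call per batch; the iterator advances through the file
    for _ in range(len(eval_dataloader)):
        batch = next(batch_iter)
        output = model(batch)
        all_predictions.append(output['predictions'].cpu())

predictions = torch.cat(all_predictions)  # (num_sessions, num_items)
```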
@SaadAhmed433 thanks for the detailed explanation. On the last section, the …
@rnyak Thanks for the explanation. Are you sure the printed predictions in this example are item IDs? Or are they indices of items? I searched the initial dataset and cannot find some of these numbers as item ids. I wish the example would go one extra step and print exactly the recommendation from the initial dataset.
@hosseinkalbasi These are categorified (encoded) item ids. You need to convert/map them back to the original item ids. In order to do that, you need to find the `categories` folder, where you can see the parquet file.
Thank you @rnyak for the detailed response, interesting! I understand now!
The index column of that parquet file holds the encoded ids; basically, if you read it into a dataframe and reset the index, you will get your encoded item id column and the original ids in the same cudf dataframe.
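A sketch of that mapping, assuming the NVTabular `Categorify` convention of writing a `categories/unique.<column>.parquet` file whose row index is the encoded id; the exact path, and the `encoded_item_ids` tensor from the earlier top-k sketch, are assumptions:

```python
import cudf  # pandas.read_parquet works the same way on CPU

# Row index = encoded item id, 'product_id' column = original item id.
unique_items = cudf.read_parquet('categories/unique.product_id.parquet')
mapping = unique_items.reset_index().rename(columns={'index': 'encoded_id'})
print(mapping.head())  # encoded id and original id side by side

# Map the model's top-k encoded ids back to the original product ids.
original_ids = unique_items['product_id'].iloc[encoded_item_ids.cpu().numpy().ravel()]
print(original_ids)
```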
Thank you @rnyak! This is so much clearer for me now. I can now resolve recommendations. Detailed comparisons: first I checked how many item ids we have in the first place, and then how many encoded item ids we have after the NVTabular workflow has been applied. All good so far! However, if we take a closer look at the prediction (the Triton Inference Server response), the length of each prediction row does not match those counts. Is this expected @rnyak? I expected the length of each row of the prediction to be the number of encoded item ids.
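The code and outputs from this comparison were not preserved; here is a sketch of the kind of check being described, with assumed file paths and column names:

```python
import cudf

# Unique item ids in the raw data (path and column name are assumptions).
raw = cudf.read_parquet('./data/interactions.parquet')
print('raw item ids:', raw['product_id'].nunique())

# Encoded item ids after the NVTabular workflow has been applied.
unique_items = cudf.read_parquet('categories/unique.product_id.parquet')
print('encoded item ids:', len(unique_items))

# Width of one prediction row returned by the model or Triton.
print('prediction row length:', output['predictions'].shape[1])
```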
@hosseinkalbasi what's your filtered batch? Did you apply any filtering? What's the code before the code block you shared above? By the way, I assume you are running the end-to-end example notebook? Note that in the NVT pipeline we apply a minimum session length filter, meaning sessions shorter than 2 are not considered.
Hi everyone, may I ask you for help? Specifically, I tried to replicate the approach above: I truncated my … However, I am getting a smaller RecallAt compared to running … I debugged the code and also found that the predictions inside the … Thanks a lot.
🚀 Feature request
I wanted to check if it is currently possible to perform model inference locally in place of setting up a Triton server. If yes, where is the documentation currently available for this?
Motivation
Inference currently appears to require the nvcr.io/nvidia/merlin/merlin-inference:21.09 image. But is there a workaround to run inference locally?
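Pulling the thread together, a minimal local-inference sketch under the assumptions discussed above (`model(batch)` returning the output dict shown earlier, and the categories parquet convention for decoding the ids):

```python
import torch

model.eval()
with torch.no_grad():
    batch = next(iter(trainer.get_eval_dataloader()))
    output = model(batch)
    # Convert per-item log-softmax scores to the top-10 encoded item ids.
    scores, encoded_ids = torch.topk(output['predictions'], k=10, dim=-1)

# encoded_ids can then be mapped back to original item ids via the
# categories/unique.<item_col>.parquet file, as discussed above.
```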