
Unexpected behaviour with dataset.mapAsync and fitDataset #1450

Closed

tafsiri opened this issue Mar 27, 2019 · 4 comments

Comments

@tafsiri
Contributor

tafsiri commented Mar 27, 2019

TensorFlow.js version

"@tensorflow/tfjs": "1.0.0",

Browser version

Node 11.11

Describe the problem or feature request

1. After creating a dataset pipeline with batch and mapAsync operations and passing it to model.fitDataset, the mapAsync function runs over the whole dataset once, and then once again for each batch.

I've made a repo to reproduce this, which I'll link below, but here is some of the output I get.

The code looks like this https://github.com/tafsiri/use-text-classifier/blob/fitdataset/training/train.js#L63
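For reference, since the linked file can move, here is a minimal sketch of the shape of that pipeline. The model, the 512-wide embedding size, epochs: 5, and the loadExamples/embed helpers are all assumptions for illustration, not the original code.

const tf = require('@tensorflow/tfjs');

// Hypothetical 10-class classifier over 512-d sentence embeddings (assumed sizes).
const model = tf.sequential({
  layers: [tf.layers.dense({inputShape: [512], units: 10, activation: 'softmax'})],
});
model.compile({optimizer: 'adam', loss: 'categoricalCrossentropy', metrics: ['accuracy']});

let counter = 0;

async function main() {
  const dataset = tf.data
    .array(await loadExamples())        // hypothetical: yields {xs, ys} records
    .batch(4)
    .mapAsync(async (batch) => {
      console.log('counter', counter++);
      console.time('Embedding a batch');
      const xs = await embed(batch.xs); // hypothetical async embedding step
      console.timeEnd('Embedding a batch');
      return {xs, ys: batch.ys};
    });

  await model.fitDataset(dataset, {
    epochs: 5,
    callbacks: {
      onBatchEnd: (batch, logs) => console.log('onBatchEnd', batch, logs),
    },
  });
}

main();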

counter 0
Embedding a batch: 1729.892ms
counter 1
Embedding a batch: 1643.588ms
counter 2
Embedding a batch: 1329.843ms
counter 3
Embedding a batch: 1706.367ms
counter 4
Embedding a batch: 1489.202ms
counter 5
Embedding a batch: 1704.473ms
counter 6
Embedding a batch: 1431.451ms
counter 7
Embedding a batch: 1136.306ms
counter 8
Embedding a batch: 919.442ms
counter 9
Embedding a batch: 1476.211ms
counter 10
Embedding a batch: 1453.095ms
counter 11
// This continues until all the batches are consumed, and then I get
counter 32
Embedding a batch: 1770.664ms
onBatchEnd 0 { batch: 0, size: 4, loss: 2.2711334228515625, acc: 0.5 }
counter 33
Embedding a batch: 2392.485ms
onBatchEnd 1 { batch: 1, size: 4, loss: 2.256931781768799, acc: 0.75 }
counter 34
Embedding a batch: 1544.857ms
onBatchEnd 2 { batch: 2, size: 4, loss: 2.3282790184020996, acc: 0 }
counter 35

This initial pass through all the batches happens at the start of each epoch.

2. When using mapAsync without batch, it seems to not always wait for the mapped function to finish. I'll link to sample code below, but in this case I get:

counter 0
Embedding a sentence: 843.928ms
counter 1
Embedding a sentence: 536.952ms
counter 2
counter 3
counter 4
counter 5
Embedding a sentence: 1144.132ms
counter 6
Embedding a sentence: 42.071ms
counter 7
counter 8
counter 9
counter 10
Embedding a sentence: 992.742ms
counter 11
Embedding a sentence: 242.740ms
counter 12
Embedding a sentence: 318.635ms

The counter seems to increment multiple times before some of the tensor operations complete.

The code looks like this https://github.com/tafsiri/use-text-classifier/blob/fitdataset/training/train.js#L23
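The unbatched variant has roughly this shape (same hypothetical helpers as the sketch above, inside the same async context). One reading of the interleaved output above is that the dataset iterator requests the next element before the previous mapAsync promise has settled, though that is speculation about the internals.

  const unbatched = tf.data
    .array(await loadExamples())        // hypothetical: yields {xs, ys} records
    .mapAsync(async (example) => {
      console.log('counter', counter++);
      console.time('Embedding a sentence');
      const xs = await embed(example.xs); // hypothetical per-sentence embedding
      console.timeEnd('Embedding a sentence');
      return {xs, ys: example.ys};
    })
    .batch(4);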

Code to reproduce the bug / link to feature request

To reproduce 1: clone this repo https://github.com/tafsiri/use-text-classifier/tree/fitdataset, switch to the fitdataset branch, go into the training folder, and run node train.js.

To reproduce 2: do the same as above, but comment out lines 124-130 in train.js and uncomment lines 133-140.

@shmishra99
Contributor

Hi @tafsiri,
Apologies for the late response.
Please let me know if your issue is resolved in the latest npm version, 4.4.0. If it is not resolved yet, kindly share reproducible code; the code link you provided is broken. Thanks!

@google-ml-butler

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you.

@google-ml-butler

Closing as stale. Please @mention us if this needs more attention.

