
Consistent Dataset Handling #28

Open
AmitMY opened this issue Feb 19, 2022 · 5 comments

Comments

@AmitMY

AmitMY commented Feb 19, 2022

Very nice repo and documentation!

I think this repository can benefit from using https://github.com/sign-language-processing/datasets as data loaders.

It is fast, consistent across datasets, and allows loading videos / poses from multiple datasets.
If a dataset you are using is not there, you can ask for it or add it yourself; it is a breeze.

The repo supports many datasets, multiple pose estimation formats, binary pose files, fps and resolution manipulations, and dataset disk mapping.
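
For instance, here is a rough sketch of what loading with such options could look like (the dataset name and the exact config values are only illustrative):

import tensorflow_datasets as tfds

import sign_language_datasets.datasets  # registers the datasets with tfds
from sign_language_datasets.datasets.config import SignDatasetConfig

# Illustrative: downscaled 12-fps video plus holistic poses
config = SignDatasetConfig(name="example", version="1.0.0",
                           include_video=True, fps=12, resolution=(256, 256),
                           include_pose="holistic")
rwth_phoenix2014_t = tfds.load('rwth_phoenix2014_t', builder_kwargs={"config": config})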

Finally, this would make this repo less complex. This repo does pre-training and fine-tuning, the other repo does datasets, and they could be used together.

Please consider :)

@GokulNC
Member

GokulNC commented Feb 21, 2022

Thanks for this suggestion, @AmitMY. Interesting! We will check it out in detail and get back to you here.
We are not familiar with using tfds, so we'll have to see if there are any setbacks in using it for our case.

Also, it would be great if you could share some resources/pointers on how to get started with creating a custom tfds dataset from .pose files in the way your datasets library expects (probably as an .md file in your repo itself).

One challenge in our case is that we use PyTorch Lightning in this repo. So we're not sure how those dataloader flows could be used with tfds.

@AmitMY
Author

AmitMY commented Feb 21, 2022

Thanks for being open to this.

TensorFlow has many tutorials on adding datasets, including https://www.tensorflow.org/datasets/add_dataset

But just looking at the code of one existing dataset might also be useful.
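
If it helps, the generic tfds pattern is a GeneratorBasedBuilder. Here is a minimal sketch (the feature layout and the load_samples helper are hypothetical placeholders, not the actual .pose conventions of the library):

import tensorflow as tf
import tensorflow_datasets as tfds

class MyPoseDataset(tfds.core.GeneratorBasedBuilder):
    """Hypothetical minimal builder for a custom pose dataset."""
    VERSION = tfds.core.Version("1.0.0")

    def _info(self):
        return tfds.core.DatasetInfo(
            builder=self,
            features=tfds.features.FeaturesDict({
                "id": tfds.features.Text(),
                "gloss": tfds.features.Text(),
                # variable-length sequence of per-frame keypoints (here: 75 points, xyz)
                "pose": tfds.features.Sequence(
                    tfds.features.Tensor(shape=(75, 3), dtype=tf.float32)),
            }),
        )

    def _split_generators(self, dl_manager):
        return {"train": self._generate_examples("path/to/train")}

    def _generate_examples(self, path):
        # load_samples is a placeholder for reading your .pose files
        for i, sample in enumerate(load_samples(path)):
            yield str(i), {"id": sample["id"], "gloss": sample["gloss"], "pose": sample["pose"]}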

Regarding PyTorch Lightning: that is no problem. I have consistently used tfds with PyTorch without any issues.

The simplest way would be to just make it all NumPy: https://www.tensorflow.org/datasets/api_docs/python/tfds/as_numpy

But you can also perform whatever operations you want on the tfds dataset (batching, mapping, prefetching, shuffling, etc.) and then call as_numpy per batch to stay memory efficient.
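
For example (the dataset name is a placeholder):

import tensorflow_datasets as tfds

ds = tfds.load("my_dataset", split="train")  # placeholder name

# tf.data ops run lazily, so memory stays bounded;
# tfds.as_numpy then yields each batch as a dict of NumPy arrays
ds = ds.shuffle(1000).batch(32).prefetch(1)
for batch in tfds.as_numpy(ds):
    ...  # feed the NumPy arrays to torch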

Please let me know if there's anything concrete that you are not sure about, and I'll see if I can make an example.

@Prem-kumar27
Member

Prem-kumar27 commented Feb 21, 2022

Thanks @AmitMY

Currently, our data pipeline lazily loads the pose data only for the current batch of videos, applies augmentations to it, and then feeds it to the model. We also use PyTorch Lightning's LightningDataModule for this. This can be found here.

We are not sure how to use TFDS's dataset module here. One way would be to convert the whole TFDS dataset to torch tensors and wrap it with the torch Dataset class, but this would require the whole dataset to be in memory. Is there any other way to do this?

Basically, iterating over batches is handled by PyTorch Lightning in our case, so we are not sure how to make use of TFDS here.

@AmitMY
Author

AmitMY commented Feb 21, 2022

How about wrapping the tfds dataset with a generic wrapper that yields the data as torch tensors?

import itertools

import tensorflow_datasets as tfds

import sign_language_datasets.datasets  # registers the datasets with tfds
from sign_language_datasets.datasets.config import SignDatasetConfig
from sign_language_datasets.utils.torch_dataset import TFDSTorchDataset

# Fast download and load dataset using TFDS
config = SignDatasetConfig(name="holistic-poses", version="1.0.0", include_video=False, include_pose="holistic")
dicta_sign = tfds.load(name='dicta_sign', builder_kwargs={"config": config})

# Convert to torch dataset
train_dataset = TFDSTorchDataset(dicta_sign["train"])

for datum in itertools.islice(train_dataset, 0, 10):
    print(datum)

Which, in this example, returns the following dictionary for each datum:

{
    "gloss": "ERLAUBNIS2", 
    "hamnosys": "\xee\x83\xa9\xee\x80\x85\xee\x80\x8c\xee\x81\xb2\xee\x80\x90\xee\x80\xa0\xee\x80\xbf\xee\x83\xa2\xee\x81\x82\xee\x81\x99\xee\x83\x91\xee\x83\xa7\xee\x81\x92\xee\x83\xa3\xee\x83\xa2\xee\x82\x90\xee\x82\xaa\xee\x80\xb1\xee\x80\xbc\xee\x83\xa3", 
    "id": "54_DGS", 
    "pose": {
      "data": tensor([[[[ 9.4747e+01,  8.0048e+01, -1.2109e-04],
              [ 9.8266e+01,  7.4415e+01,  2.2603e-03],
              [ 1.0062e+02,  7.4430e+01, -3.8285e-03],
              ...,
              [ 6.1661e+01,  1.7587e+02, -3.8705e-02]]]]), 
      "conf": tensor([[[1.0000, 1.0000, 1.0000,  ..., 1.0000, 1.0000, 1.0000]],
            ...,
            [[1.0000, 1.0000, 1.0000,  ..., 1.0000, 1.0000, 1.0000]]]), 
      "fps": 25
    }, 
    "signed_language": "DGS", 
    "spoken_language": "de", 
    "text": "Erlaubnis", 
    "video": "https://www.sign-lang.uni-hamburg.de/dicta-sign/portal/concepts/dgs/54.webm"
}
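
And if useful, here is a rough sketch of how this could plug into your PyTorch Lightning setup (assuming TFDSTorchDataset behaves like a regular iterable torch dataset; the collate function is only a placeholder for padding variable-length pose sequences):

import pytorch_lightning as pl
from torch.utils.data import DataLoader

class SignDataModule(pl.LightningDataModule):
    # Hypothetical LightningDataModule wrapping the TFDS-backed torch dataset
    def __init__(self, train_dataset, batch_size=32, collate_fn=None):
        super().__init__()
        self.train_dataset = train_dataset
        self.batch_size = batch_size
        self.collate_fn = collate_fn  # e.g. pad pose sequences to equal length

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size,
                          collate_fn=self.collate_fn)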

@Prem-kumar27
Member

Thanks. I think this could work.
We will try this and get back to you if we have any questions.
