Improvements to to_tensorflow, multi worker transforms, dynamic tensor slicing #689

Merged
merged 24 commits into from Mar 21, 2021

AbhinavTuli (Contributor) commented Mar 16, 2021

  • to_tensorflow now accepts a new argument that restricts iteration to the specified tensors, which speeds up iteration when the dataset contains several extra tensors.
  • Caching within to_tensorflow has been improved for tensors with dynamic shapes (previously only the current sample was saved in the cache).
  • Adds the option to specify None as the compressor when defining the schema.
  • Adds the ability to slice dynamically shaped tensors and obtain a list, instead of iterating over the samples one by one.
  • Transform logic has been modified to work properly with multiple workers.
  • A dataset copy test that was interrupted midway used to affect all subsequent test runs. This has now been fixed.
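
For illustration only, the slicing behaviour described in the fourth item can be sketched in plain Python. DynamicTensorSketch is a hypothetical class written for this example and is not Hub's implementation; it assumes variable-length 1-D samples stored back to back, and shows why a slice over such samples comes back as a list rather than a single rectangular array.

```python
from itertools import accumulate


class DynamicTensorSketch:
    """Hypothetical stand-in for a tensor whose samples vary in length."""

    def __init__(self, samples):
        samples = [list(s) for s in samples]
        # Store all values back to back and remember where each sample starts.
        self._flat = [value for sample in samples for value in sample]
        self._offsets = [0, *accumulate(len(sample) for sample in samples)]

    def __len__(self):
        return len(self._offsets) - 1

    def _read(self, i):
        # Cut one sample out of the flat storage using its start/end offsets.
        return self._flat[self._offsets[i]:self._offsets[i + 1]]

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Samples differ in length, so they cannot form one rectangular
            # array; a slice therefore yields a list with one entry per sample.
            return [self._read(i) for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("sample index out of range")
        return self._read(index)


tensor = DynamicTensorSketch([[1, 2, 3], [4], [5, 6]])
first_two = tensor[0:2]  # one call instead of a loop over indices
```

Here first_two holds one entry per sample, each with its own length, which is the shape of result the item above describes.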

Relevant tests have been added to test_dataset and test_converters and can be used to try out the new features.
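
As a rough sketch of the idea behind the first item (passing only certain tensors to the iterator), the generator below fetches only the requested tensors for each sample. The names iterate_selected, keys, loaders and make_loader are assumptions made for this example; they are not taken from the PR or from the to_tensorflow signature.

```python
def iterate_selected(loaders, num_samples, keys=None):
    """Yield one dict per sample, fetching only the selected tensors.

    loaders maps a tensor name to a function(index) that fetches one sample.
    When keys is None, every tensor is fetched.
    """
    selected = list(loaders) if keys is None else list(keys)
    missing = [name for name in selected if name not in loaders]
    if missing:
        raise KeyError(f"unknown tensors: {missing}")
    for index in range(num_samples):
        # Tensors outside `selected` are never read, which is where the
        # time saving comes from when reads are expensive.
        yield {name: loaders[name](index) for name in selected}


reads = []  # records which tensors were actually fetched


def make_loader(name):
    """Build a fake loader that logs every read (stands in for real I/O)."""
    def load(index):
        reads.append(name)
        return f"{name}[{index}]"
    return load


loaders = {name: make_loader(name) for name in ("image", "label", "metadata")}
batch = list(iterate_selected(loaders, num_samples=2, keys=["image", "label"]))
```

After this runs, reads contains only "image" and "label" entries, so the extra "metadata" tensor was never touched.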

@github-actions
Locust summary

Git references

Initial: 85a8004
Terminal: d88c5fb

hub/api/integrations.py
Changes:
hub/api/dataset.py
Changes:
hub/store/dynamic_tensor.py
Changes:
hub/api/datasetview.py
Changes:
hub/compute/transform.py
Changes:
  • Name: Transform
    Type: class
    Changed lines: 25
    Total lines: 414
    Changes:
hub/store/metastore.py
Changes:
hub/api/tests/test_converters.py
Changes:
hub/api/tests/test_dataset.py
Changes:
hub/utils.py
Changes:
  • Name: batchify
    Type: function
    Changed lines: 4
    Total lines: 11

    @codecov
    codecov bot commented Mar 17, 2021

    Codecov Report

    Merging #689 (4507364) into master (58c65c4) will decrease coverage by 0.04%.
    The diff coverage is 97.46%.

    @@            Coverage Diff             @@
    ##           master     #689      +/-   ##
    ==========================================
    - Coverage   89.14%   89.10%   -0.05%     
    ==========================================
      Files          58       58              
      Lines        4266     4296      +30     
    ==========================================
    + Hits         3803     3828      +25     
    - Misses        463      468       +5     
    Impacted Files                 Coverage Δ
    hub/api/tensorview.py          90.40% <ø> (-0.10%) ⬇️
    hub/api/integrations.py        87.53% <90.90%> (-0.11%) ⬇️
    hub/api/dataset.py             89.57% <100.00%> (-0.85%) ⬇️
    hub/api/dataset_utils.py       95.31% <100.00%> (+1.02%) ⬆️
    hub/api/datasetview.py         91.50% <100.00%> (ø)
    hub/compute/transform.py       93.57% <100.00%> (+0.15%) ⬆️
    hub/store/dynamic_tensor.py    88.77% <100.00%> (+0.23%) ⬆️
    hub/store/metastore.py         90.65% <100.00%> (ø)
    hub/utils.py                   81.67% <100.00%> (+0.28%) ⬆️
    ... and 1 more

    Δ = absolute <relative> (impact), ø = not affected, ? = missing data

    imshashank (Contributor) left a comment

    Minor comment. Please address it; after that, this is ready to merge.

    hub/api/tests/test_dataset.py (outdated, resolved)
    AbhinavTuli merged commit 272107c into master on Mar 21, 2021
    kristinagrig06 deleted the fixes/improvements branch on May 31, 2021 13:00