[AL-2353] Add delay for transform progress bar updates #2515

Merged
merged 2 commits into from Aug 4, 2023

Conversation

FayazRahman
Contributor

🚀 🚀 Pull Request

Impact

  • Bug fix (non-breaking change which fixes expected existing functionality)
  • Enhancement/New feature (adds functionality without impacting existing logic)
  • Breaking change (fix or feature that would cause existing functionality to change)

Description

  • When compute functions are very fast, progress bar updates become the slowest part of the chain.
  • Throttle progress bar updates to once every 5 seconds.
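
A minimal sketch of the throttling idea described above, not the code merged in this PR: increments are accumulated locally and forwarded to the progress bar only once the interval has elapsed. The names `ThrottledProgress`, `sink`, `clock`, and `flush` are hypothetical; the 5-second default mirrors the interval mentioned in the description.

```python
import time
from typing import Callable


class ThrottledProgress:
    """Accumulate progress increments and forward them at most once per interval.

    Hypothetical helper for illustration only; not part of deeplake.
    """

    def __init__(
        self,
        sink: Callable[[int], None],
        interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink          # receives the batched increment
        self._interval = interval  # minimum seconds between forwarded updates
        self._clock = clock        # injectable so the behavior can be tested
        self._pending = 0
        self._last_update = clock()

    def update(self, value: int) -> None:
        """Record progress; forward it only if the interval has elapsed."""
        self._pending += value
        now = self._clock()
        if now - self._last_update >= self._interval:
            self._emit(now)

    def flush(self) -> None:
        """Forward whatever is still pending, e.g. when the work finishes."""
        if self._pending:
            self._emit(self._clock())

    def _emit(self, now: float) -> None:
        self._sink(self._pending)
        self._pending = 0
        self._last_update = now
```

Calling `flush()` once at the end is one way to make sure the final increments are not lost when the work completes inside an interval.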

Things to be aware of

Things to worry about

Additional Context

@codecov

codecov bot commented Aug 3, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.01% ⚠️

Comparison is base (1c6db8a) 84.90% compared to head (5e38cbb) 84.89%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2515      +/-   ##
==========================================
- Coverage   84.90%   84.89%   -0.01%     
==========================================
  Files         328      328              
  Lines       38837    38846       +9     
==========================================
+ Hits        32973    32979       +6     
- Misses       5864     5867       +3     
Flag        Coverage Δ
unittests   84.89% <100.00%> (-0.01%) ⬇️


Files Changed                                              Coverage Δ
deeplake/core/vectorstore/deeplake_vectorstore.py          89.15% <ø> (ø)
...lake/core/vectorstore/test_deeplake_vectorstore.py      60.42% <100.00%> (+0.25%) ⬆️
deeplake/util/transform.py                                 95.40% <100.00%> (+0.07%) ⬆️

... and 1 file with indirect coverage changes


@tatevikh tatevikh requested a review from levongh August 3, 2023 15:31
@nvoxland
Contributor

nvoxland commented Aug 4, 2023

Do the performance improvements you saw still hold if you push the logic up to deeplake.core.compute.provider? @FayazRahman

        def sub_func(*args, **kwargs):
            callback_data = {
                "progress": 0,
                "last_progress_update_time": 0,
            }

            def pg_callback(value: int):
                callback_data["progress"] = callback_data["progress"] + value
                if (
                    time.time() - callback_data["last_progress_update_time"]
                    > TRANSFORM_PROGRESSBAR_UPDATE_INTERVAL
                    or callback_data["progress"] == total_length - 1
                ):
                    progress_queue.put(callback_data["progress"])
                    callback_data["last_progress_update_time"] = time.time()
                    callback_data["progress"] = 0

            return func(pg_callback, *args, **kwargs)

@FayazRahman
Contributor Author

@nvoxland It should remain, but I would advise against that in order to keep the pg_callback in the base class general.
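
A rough sketch of the layering argued for here, under stated assumptions: the generic runner passes the callback through untouched, and the caller decides whether to batch updates. `run_with_progress` is a hypothetical stand-in for a provider-level function and `make_throttled` is a hypothetical caller-side wrapper; neither is the deeplake API.

```python
import time
from typing import Callable, Iterable, Tuple

ProgressCallback = Callable[[int], None]


def run_with_progress(items: Iterable[int], on_progress: ProgressCallback) -> int:
    """Generic runner: reports every processed item and applies no update policy."""
    total = 0
    for item in items:
        total += item
        on_progress(1)
    return total


def make_throttled(
    on_progress: ProgressCallback,
    interval: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[ProgressCallback, Callable[[], None]]:
    """Caller-side policy: batch increments and return (callback, flush)."""
    state = {"pending": 0, "last": clock()}

    def flush() -> None:
        # Forward anything accumulated so far and restart the interval.
        if state["pending"]:
            on_progress(state["pending"])
        state["pending"] = 0
        state["last"] = clock()

    def callback(value: int) -> None:
        state["pending"] += value
        if clock() - state["last"] >= interval:
            flush()

    return callback, flush
```

With this split, a caller that wants every update passes its callback directly, while a caller with very fast compute functions wraps it first and calls `flush()` when the run ends.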

@FayazRahman FayazRahman merged commit ac76dea into main Aug 4, 2023
8 of 11 checks passed
@FayazRahman FayazRahman deleted the fy_fix_transform branch August 4, 2023 19:44
3 participants