Introducing Task Run Locking for Enhanced Concurrency Control in Gokart #353

Merged

mski-iksm
Contributor

@mski-iksm mski-iksm commented Feb 26, 2024

Introducing Task Run Locking for Enhanced Concurrency Control in Gokart

tl;dr

  • Introduces task run locking in Gokart for better concurrency control.
  • Prevents redundant task executions in distributed setups.
  • Updates and adds documentation for efficient multi-worker execution.
  • Implements backoff strategies for handling task lock exceptions.
  • Enhances efficiency and reliability of task execution in Gokart.

Summary

This pull request makes running tasks on multiple workers in a Gokart/Luigi pipeline more efficient and reliable. It adds new documentation on efficient multi-worker execution, updates the task conflict prevention mechanism, and retries with a backoff strategy when a task lock exception is raised. Together, these changes prevent redundant task executions and make task locking more robust in distributed environments.

Changes

  • Documentation Addition: Added a new documentation file efficient_run_on_multi_workers.rst that guides users on how to improve efficiency when running similar Gokart pipelines on multiple workers. This includes strategies to skip completed tasks and suppress the execution of tasks already being run by another worker.

  • Documentation Update: Updated the index.rst to include the new documentation in the User Guide section.

  • Task Conflict Prevention Lock: Renamed using_task_cache_collision_lock.rst to using_task_task_conflict_prevention_lock.rst to better reflect the mechanism's purpose. The documentation within has also been updated to align with the new naming convention and clarify the prevention of task cache conflicts.

  • Code Enhancements:

    • Modified gokart/build.py to include backoff strategies when encountering TaskLockException, allowing for automatic retrying with exponential backoff until a maximum number of tries or wait time is reached.
    • Updated task_lock.py and task_lock_wrappers.py to support the new locking mechanism during task execution (run method), ensuring that tasks are not executed redundantly across workers.
    • Added a new module wrap_run_with_lock.py to facilitate wrapping the task's run method with a lock, preventing simultaneous execution of the same task by multiple workers.
    • Adjusted gokart/task.py to automatically apply run locking based on task configuration, enhancing task execution efficiency in distributed environments.
  • Dependency Addition: Added the backoff library to pyproject.toml and updated poetry.lock accordingly. This library is used to implement the exponential backoff strategy when handling task lock exceptions.
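
A minimal, standard-library-only sketch of the run-locking idea described under Code Enhancements. It is an illustration under stated assumptions, not the code in this PR: `InMemoryLockRegistry`, `with_run_lock`, and the `is_complete` callback are hypothetical names, `TaskLockException` is a local stand-in class rather than an import from gokart, and the in-memory registry only stands in for whatever shared lock store the workers actually use.

```python
import functools
import threading


class TaskLockException(Exception):
    """Local stand-in: raised when another worker already holds the task's lock."""


class InMemoryLockRegistry:
    """Hypothetical stand-in for a lock store shared between workers."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def acquire(self, key):
        # Non-blocking: report failure instead of waiting for the holder.
        with self._guard:
            if key in self._held:
                return False
            self._held.add(key)
            return True

    def release(self, key):
        with self._guard:
            self._held.discard(key)


def with_run_lock(run_func, lock_key, registry, is_complete):
    """Return a version of run_func that holds lock_key while it executes."""

    @functools.wraps(run_func)
    def wrapper(*args, **kwargs):
        if not registry.acquire(lock_key):
            raise TaskLockException(f"{lock_key} is being run by another worker")
        try:
            # Re-check completion after taking the lock: another worker
            # may have produced the output in the meantime.
            if is_complete():
                return None
            return run_func(*args, **kwargs)
        finally:
            registry.release(lock_key)

    return wrapper
```

In this sketch a worker that loses the race gets a `TaskLockException` right away instead of blocking, which leaves the decision about when to retry to the caller.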

Impact

  • Efficiency: These changes reduce redundant task executions in distributed environments, so fewer compute resources are wasted.
  • Reliability: Enhances the reliability of task execution in concurrent scenarios by preventing task cache conflicts and ensuring that tasks are not executed more than necessary.
  • Usability: The addition of documentation provides clear guidance to users on how to leverage these new features, improving the overall usability of Gokart for distributed task execution.

Testing

  • Updated existing tests to reflect changes in the task locking mechanism.
  • Added new tests to cover the functionality of retrying task execution with exponential backoff upon encountering lock exceptions.
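
The retry behaviour these tests target can be sketched with the standard library alone. This is not the `backoff` library and not the code in gokart/build.py: `retry_on_task_lock` and all of its parameters are hypothetical, `TaskLockException` is a local stand-in class, and the sketch omits jitter for simplicity.

```python
import time


class TaskLockException(Exception):
    """Local stand-in: raised when a task is locked by another worker."""


def retry_on_task_lock(func, max_tries=5, max_wait=60.0,
                       base_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call func, retrying on TaskLockException with exponentially growing delays.

    Gives up (re-raising the exception) once max_tries calls have failed or
    the next delay would push the total waiting time past max_wait seconds.
    """
    waited = 0.0
    delay = base_delay
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except TaskLockException:
            out_of_tries = attempt == max_tries
            out_of_time = waited + delay > max_wait
            if out_of_tries or out_of_time:
                raise
            sleep(delay)
            waited += delay
            delay *= factor  # 1s, 2s, 4s, ... with the default arguments
```

Passing `sleep` as a parameter lets a test substitute a recording function, so the delay schedule can be checked without actually waiting.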

Documentation

  • Added comprehensive documentation on efficient execution strategies on multiple workers.
  • Updated existing documentation to reflect the renaming and functionality changes in task conflict prevention.

@mski-iksm mski-iksm changed the title from "add task run lock" to "Introducing Task Run Locking for Enhanced Concurrency Control in Gokart" Feb 26, 2024
@mski-iksm mski-iksm marked this pull request as ready for review March 3, 2024 14:50
Collaborator

@yokomotod yokomotod left a comment

LGTM


Skip completed tasks with `complete_check_at_run`
-------------------------------------------------
By setting `gokart.TaskOnKart.complete_check_at_run` to True, the existence of the cache can be rechecked at run() time.
Member

When should we set this to be false? I feel this can be always True :)

Member

Agree with this solution #358 :)

mski-iksm and others added 3 commits March 17, 2024 14:48
Co-authored-by: Keisuke OGAKI <hikingko1@gmail.com>
…m:mski-iksm/gokart into feature/add_run_lock_to_gokart_taskonkart
@Hi-king Hi-king merged commit 77165c3 into m3dev:master Mar 17, 2024
5 checks passed
@Hi-king
Member

Hi-king commented Mar 17, 2024

@mski-iksm THX for adding brand new feature 👍
