[Jobs] Merge Jobs Support into Master #2925

Merged
merged 21 commits into from Jan 13, 2020

Commits on Jan 7, 2020

  1. Add Job protos

    Adds protocol buffers for implementing Jobs in swarmkit.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    788e8a6
  2. Add minimal Replicated Job Orchestrator and Tests

    Adds orchestrators for replicated and global jobs, and the basic tests.
    This commit exists mainly to keep the Ginkgo test code in its own commit.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    67ceea9
  3. Add service reconciliation for replicated jobs

    Adds service reconciliation logic for the replicated jobs orchestrator.
    This code does not function in production, and is not actually called
    except from the tests.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    a4da9ec
  4. Add global job orchestrator skeleton

    Expands the skeleton structure of the global jobs orchestrator.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    3b160c8
  5. Refactor replicated job orchestrator and add initialization

    Refactors the replicated job orchestrator to make testing simpler, and
    then adds initialization logic to it.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    06fabe5
  6. Add store event logic to replicated jobs orchestrator

    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    8140d06
  7. Refactor global job orchestrator

    Refactors the global job orchestrator along the same lines as the
    replicated job orchestrator, in order to better decouple the
    event-driven orchestrator logic from the reconciliation logic.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    e5962e7
  8. Refactor Jobs Orchestrators

    It became evident in the process of writing the Global Jobs orchestrator
    that the Orchestrators required by both Replicated and Global jobs are
    essentially identical. This commit merges them into one combined
    orchestrator, which dispatches to the appropriate Reconcilers to do the
    actual work.
    
    Unlike existing services, these orchestrators can be combined because
    the requirements of jobs are much simpler than those of services.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    0d91115
  9. Add controlapi support for job services

    Adds support to the controlapi for creating and updating job-mode
    services. This still does not include correct plumbing to execute
    job-type services.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    fb8a719
  10. Wire up jobs orchestrator to manager

    Adds the jobs orchestrator to the swarmkit manager. Jobs orchestrator
    will now start and run with the manager.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    b9321f7
  11. Add jobs orchestrator to controlapi integration tests

    Adds the beginnings of the integration tests between the controlapi and
    the jobs orchestrator. These tests don't actually check much more than
    creation right now, as update handling for the jobs orchestrator is
    pending, but further tests will be able to leverage the groundwork here
    to a high degree.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    5fb683f
  12. Support restarting failing jobs tasks

    In order to make the jobs reconcilers work correctly with the restart
    supervisor, they have been altered to never replace failed tasks
    directly. Replacing failed tasks is the purview of the restart
    supervisor. The jobs reconcilers will only create new tasks when needed.
    
    Additionally, this alters the behavior of the replicated job reconciler
    with regard to slots -- each new task will get a new slot, and when the
    job is completed, there will be a Completed task in each slot from 0 to
    TotalCompletions-1.
    
    Then, makes the tweaks necessary for the Restart Supervisor to support
    Jobs, which are different from other services in that they deliberately
    have a desired state of Completed.
    
    Finally, wires up the replicated and global orchestrators to call the
    restart supervisor to restart tasks that have failed.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    9fec897
  13. Update components to support jobs

    Updates manager components to support jobs, which have a desired state
    of Completed.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    42e7797
  14. Update ListServiceStatuses for Jobs

    Updates the ListServiceStatuses RPC to work with jobs. Includes adding a
    new field to the responses showing the number of completed Tasks in a
    job.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    6ad2213
  15. Add support for updating global jobs

    Adds support for updating global jobs by adding code to shut down tasks
    belonging to previous job iterations.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    39596e5
  16. Add support for updating replicated jobs

    Adds support for updating replicated jobs by adding code to shut down
    tasks belonging to previous job iterations.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    9959bf7
  17. Add controlapi integration test for updating replicated jobs

    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    17cd5b1
  18. Revert #2899

    This reverts the commits from #2899, which shut down tasks of old
    iterations. This is the wrong approach, and the right approach will be
    implemented in a later commit.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    f61b9f7
  19. Fix some jobs TODOs

    Addresses a couple of TODOs in the jobs orchestrator.
    
    Most notably, updates the reconcilers to set the DesiredState of Tasks
    belonging to older job iterations to Remove, which will cause them to be
    cleaned up and deleted.
    
    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    5367704
  20. Add tests for jobs integration with RestartSupervisor

    Signed-off-by: Drew Erny <drew.erny@docker.com>
    dperny committed Jan 7, 2020
    2345d92

Commits on Jan 10, 2020

  1. Remove global job node creation time check

    Removes code from the global job reconciler that prevented global jobs
    from executing on newly created nodes. This behavior probably would not
    have worked well in real-world use, and closes off a few use cases (like
    using a global job to perform some initialization on newly-created
    nodes).
    
    Signed-off-by: Drew Erny <derny@mirantis.com>
    dperny committed Jan 10, 2020
    eae24f8