WIP: Automatically restart Buildbot master on configuration change #505

Commits on Dec 27, 2016

  1. Disable Python bytecode caching for Buildbot master

    For some reason (likely due to our "graceful" restarts of Buildbot),
    when Buildbot is restarted after its configuration is updated,
    it does not read the updated `.py` files, but instead uses the existing
    compiled bytecode (`.pyc`) files, which still reflect the old
    configuration. These stale bytecode files are also not regenerated,
    so the new configuration does not (fully) take effect.
    
    To work around this, disable bytecode caching for the Buildbot master
    entirely to avoid using out-of-date bytecode.
    
    saltfs-migration: Delete all `.pyc` files (recursively)
    in the Buildbot master directory (/home/servo/buildbot/master).
    aneeshusa committed Dec 27, 2016
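The cleanup in the migration note can be sketched as a short Python helper (an illustrative sketch, not the PR's actual migration code; the default path comes from the commit message):

```python
import os


def remove_bytecode(master_dir="/home/servo/buildbot/master"):
    """Recursively delete compiled `.pyc` files so Python must re-read the
    updated `.py` sources on the next Buildbot start. Returns the deleted
    paths, which can be useful for logging."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(master_dir):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

Preventing the cache from being written again typically amounts to exporting `PYTHONDONTWRITEBYTECODE=1` (or running Python with `-B`) in the environment that launches the master.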
  2. Upgrade Buildbot master DB on upgrade

    When upgrading the Buildbot master version,
    or starting from a fresh deploy (no existing database),
    the Buildbot database must be upgraded in order
    for Buildbot to start normally.
    (The DB is usually created during `buildbot create-master`,
    but we want to avoid checking in the database.)
    Add a `cmd.run` state that only runs if the Buildbot version is changed,
    using an `onchanges` requisite.
    
    The upgrade script requires that Buildbot is not running,
    presumably to avoid conflicting updates,
    so Buildbot must be stopped before running the upgrade.
    We want to perform a clean stop, but the built-in `buildbot stop`
    command is nonblocking and returns immediately, without waiting for
    the existing Buildbot instance to finish.
    Buildbot has blocking-stop functionality built in but not exposed,
    so add a small helper script that stops Buildbot and blocks until
    it has finished shutting down, and invoke it before the upgrade.
    
    An alternative would be to use Upstart or Salt directly
    to stop Buildbot, since a clean shutdown boils down to sending SIGUSR1
    to the Buildbot process (only if one is running), in the Unix tradition.
    However, this would be hard to integrate with: in particular, we need to
    wait for the existing Buildbot process to finish running, and the
    easiest way to do that from a Salt state (without writing a custom Salt
    state) is to start a process that does the waiting, hence the stop script.
    
    Note that the upgrade-master command also adds various other cruft
    to the master directory; the Buildbot internal upgradeDatabase API
    is not called because it is layered in Twisted Reactor/inline callback
    goop, and it is simpler to just call the CLI command.
    
    Also update the Buildbot master states to be more strict
    about using requisites for better ordering control,
    and re-order/space out states for a better reading flow.
    aneeshusa committed Dec 27, 2016
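The stop-and-wait behavior can be sketched as a standalone helper. This is an assumption-laden sketch, not the PR's actual script (which uses Buildbot's internal stop support): it reads the conventional `twistd.pid` file, sends SIGUSR1 (the clean-shutdown signal mentioned above), and polls until the old process has exited.

```python
import os
import signal
import time


def stop_and_wait(pidfile, poll_interval=1.0):
    """Ask the Buildbot master recorded in `pidfile` (conventionally
    `twistd.pid` in the master basedir) to shut down cleanly, then block
    until the process is gone."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        return  # no (valid) pid file: Buildbot is not running

    try:
        os.kill(pid, signal.SIGUSR1)  # request a clean shutdown
    except OSError:
        return  # stale pid file: the process is already gone

    while True:
        try:
            os.kill(pid, 0)  # signal 0: existence check, sends nothing
        except OSError:
            return  # process has exited; stop is complete
        time.sleep(poll_interval)
```

Invoking something like this before the `upgrade-master` step gives the blocking behavior the `onchanges`-guarded `cmd.run` state needs.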
  3. Automatically queue Buildbot master restarts

    Buildbot comes with built-in functionality for clean restarts,
    which entail starting a new Buildbot process that does the following:
    - Cleanly shut down the existing buildmaster (existing instance)
      by waiting for pending builds to be finished,
      ending the existing process.
    - Start a new buildmaster in the new process, taking over.
    
    Note that the new process becomes the new daemon,
    and thus needs to linger/be kept alive and managed.
    
    Upstart's built-in restart functionality hard-kills the existing
    process, which is undesirable; it's also hard to make Upstart wait
    for pending builds before stopping the existing process,
    as only Buildbot knows about any pending builds.
    Additionally, Buildbot has limited reload functionality,
    but it has many pitfalls, gotchas, and inconsistencies
    (e.g. reloading imported modules doesn't work),
    and it is not recommended for customized installations like ours.
    
    Note that when a Buildbot clean restart is requested, there are multiple
    processes running simultaneously.
    Model this in Upstart by using an "instance" job,
    which makes it easy to queue restarts and monitor them,
    without having to leave processes running in e.g. tmux or screen.
    
    Add an additional Upstart task to spawn a single instance of the
    Buildbot master service at boot-up,
    or during a Salt deploy if no instances are alive.
    More instances of the buildbot-master job can be started manually,
    or automatically by Salt, to queue up restarts.
    (The original process will exit gracefully after some time.)
    All instances will stop gracefully on shutdown.
    
    Note that the switch to an instance job means each separate instance
    must be differentiated by a `reason` variable, which is not used for
    any other purpose; this is automatically set to the current date/time,
    suffixed with some metadata about the reason for the instance.
    This has a side effect of creating separate Upstart logs
    (in `/var/log/upstart` for each instance), but the choice of reason
    keeps the logs sorted by date.
    
    Finally, automate this by having Salt queue a new restart job instance
    if anything about the Buildbot master configuration (or package)
    changes.
    aneeshusa committed Dec 27, 2016
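The instance-job arrangement described above might look roughly like the following Upstart job. This is an illustrative sketch, not the PR's actual configuration: the file name, stanzas, and `exec` invocation are assumptions.

```
# Sketch of /etc/init/buildbot-master.conf (illustrative, not the PR's file).
# Each instance is keyed by a unique `reason`; starting a second instance
# queues a clean restart alongside the instance it will replace, e.g.:
#   start buildbot-master reason="2016-12-27T12:00:00 config change"
instance $reason

chdir /home/servo/buildbot/master

# Assumed invocation: the new process asks the running master to finish
# pending builds and exit (via SIGUSR1), then takes over as the new daemon.
exec buildbot restart /home/servo/buildbot/master
```

On the Salt side, the queueing this commit describes would then presumably amount to a `cmd.run` of `start buildbot-master reason=...` (with the date-prefixed reason) guarded by `onchanges` requisites on the configuration and package states.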