[RFC] Maintenance III: background maintenance #680

derrickstolee · 2020-07-23T16:24:01Z

This is based on v3 of Part II (ds/maintenance-part-2) [1].

[1] https://lore.kernel.org/git/pull.696.v3.git.1598380599.gitgitgadget@gmail.com/

This RFC is intended to show how I hope to integrate true background maintenance into Git. As opposed to my original RFC [2], this entirely integrates with cron (through crontab [-e|-l]) to launch maintenance commands in the background.

[2] https://lore.kernel.org/git/pull.597.git.1585946894.gitgitgadget@gmail.com/

Some preliminary work is done to allow a new --schedule option that tells the command which tasks to run based on a maintenance.<task>.schedule config option. The timing is not enforced by Git, but instead is expected to be provided as a hint from a cron schedule.

A new for-each-repo builtin runs Git commands on every repo in a given list. Currently, the list is stored as a config setting, allowing a new maintenance.repos config list to store the repositories registered for background maintenance. Others may want to add a --file=<file> option for their own workflows, but I focused on making this as simple as possible for now.

The updates to the git maintenance builtin include new register/unregister subcommands and start/stop subcommands. The register subcommand initializes the config while the start subcommand does everything register does plus update the cron table. The unregister and stop commands reverse this process.

The very last patch is entirely optional. It sets a recommended schedule based on my own experience with very large repositories. I'm open to other suggestions, but these are ones that I think work well and don't cause a "rewrite the world" scenario like running nightly 'gc' would do.

I've been testing this scenario on my macOS laptop for a while and my Linux machine. I have modified my cron task to provide logging via trace2 so I can see what's happening. A future direction here would be to add some maintenance logs to the repository so we can track what is happening and diagnose whether the maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron by design. The proper thing to do on Windows will come later. Perhaps this command should be marked as unavailable on Windows somehow, or at least a better error than "cron may not be available on your system". I did find that that message is helpful sometimes: macOS worker agents for CI builds typically do not have cron available.

Updates since RFC v2

Update the cron schedule with three lines saying "run hourly except at midnight", "run daily except on first day of week", and "run weekly". This avoids parallel processes competing for the object database lock.
Update the --schedule= and 'maintenance..schedule' config options. This is reflected in the recommended schedule at the end.
Drop the *.lastRun config option. It was going to trash config files but it is also not needed by the new cron schedule.

I expect this to be my final RFC version before restarting the thread with a v1 next week. Please throw any and all critique at the plan here!

Updates since RFC v1

Some fallout from rewriting the option parsing in "Maintenance I"
This applies cleanly on v3 of "Maintenance II"
Several helpful feedback items from Đoàn Trần Công Danh are applied.
There is an unresolved comment around the use of approxidate("now"). These calls are untouched from v1.

Thanks,
-Stolee

Cc: sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com

cc: Derrick Stolee stolee@gmail.com

gitgitgadget · 2020-08-20T14:38:03Z

Documentation/config/maintenance.txt

@@ -1,10 +1,20 @@
+maintenance.auto::


On the Git mailing list, Đoàn Trần Công Danh wrote (reply to this):

On 2020-08-19 17:16:43+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > From: Derrick Stolee <dstolee@microsoft.com> > > Users may want to run certain maintenance tasks only so often. Update > the local config with a new 'maintenance.<task>.lastRun' config option > that stores the timestamp just before running the maintenance task. > > I selected the timestamp before the task, as opposed to after the task, > for a couple reasons: > > 1. The time the task takes to execute should not contribute to the > interval between running the tasks. If a daily task takes 10 minutes > to run, then every day the execution will drift by at least 10 > minutes. > > 2. If the task fails for some unforseen reason, it would be good to > indicate that we _attempted_ the task at a certain timestamp. This > will avoid spamming a repository that is in a bad state. > > Signed-off-by: Derrick Stolee <dstolee@microsoft.com> > --- > Documentation/config/maintenance.txt | 5 +++++ > builtin/gc.c | 16 ++++++++++++++++ > git-gvfs-helper | Bin 0 -> 11171736 bytes > t/helper/test-gvfs-protocol | Bin 0 -> 10946928 bytes Look like those 2 files should be added into .gitignore, no? -- Danh

gitgitgadget · 2020-08-20T14:53:38Z

Documentation/config/maintenance.txt

@@ -1,10 +1,29 @@
+maintenance.auto::


On the Git mailing list, Đoàn Trần Công Danh wrote (reply to this):

On 2020-08-19 17:16:44+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > +static void fill_schedule_info(struct maintenance_task *task, > + const char *config_name, > + timestamp_t schedule_delay) > +{ > + timestamp_t now = approxidate("now"); I see this pattern in both previous patch and this patch, should we create a helper (if not exist) to get current timestamp instead, parsing "now" every now and then is not a good idea, in my very opinionated opinion. > + char *value = NULL; > + struct strbuf last_run = STRBUF_INIT; > + int64_t previous_run; > + > + strbuf_addf(&last_run, "maintenance.%s.lastrun", task->name); > + > + if (git_config_get_string(last_run.buf, &value)) > + task->scheduled = 1; > + else { > + previous_run = git_config_int64(last_run.buf, value); > + if (now >= previous_run + schedule_delay) > + task->scheduled = 1; > + } > + > + free(value); > + strbuf_release(&last_run); > +} > + > static void initialize_task_config(void) > { > int i; > @@ -1359,6 +1387,7 @@ static void initialize_task_config(void) > > for (i = 0; i < TASK__COUNT; i++) { > int config_value; > + char *config_str; > > strbuf_setlen(&config_name, 0); > strbuf_addf(&config_name, "maintenance.%s.enabled", > @@ -1366,6 +1395,20 @@ static void initialize_task_config(void) > > if (!git_config_get_bool(config_name.buf, &config_value)) > tasks[i].enabled = config_value; > + > + strbuf_setlen(&config_name, 0); It looks like we have a simple and better named alias for this: strbuf_reset(&config_name) _reset has 400+ occurences in this code base, compare to 20 of _setlen -- Danh

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/20/2020 10:51 AM, Đoàn Trần Công Danh wrote: > On 2020-08-19 17:16:44+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: >> +static void fill_schedule_info(struct maintenance_task *task, >> + const char *config_name, >> + timestamp_t schedule_delay) >> +{ >> + timestamp_t now = approxidate("now"); > > I see this pattern in both previous patch and this patch, > should we create a helper (if not exist) to get current timestamp > instead, parsing "now" every now and then is not a good idea, in my > very opinionated opinion. Parsing "now" is not that much work, and it is done only once per maintenance task. To make a helper that avoids these string comparisons (specifically to avoid iterating through the "special" array in date.c) is unlikely to be worth the effort and code duplication. If you mean it would be good to use a macro here, then that would be easy: #define approxidate_now() approxidate("now") One important thing for using this over time(NULL) is that we really want this to work with GIT_TEST_DATE_NOW. >> + char *value = NULL; >> + struct strbuf last_run = STRBUF_INIT; >> + int64_t previous_run; >> + >> + strbuf_addf(&last_run, "maintenance.%s.lastrun", task->name); >> + >> + if (git_config_get_string(last_run.buf, &value)) >> + task->scheduled = 1; >> + else { >> + previous_run = git_config_int64(last_run.buf, value); >> + if (now >= previous_run + schedule_delay) >> + task->scheduled = 1; >> + } >> + >> + free(value); >> + strbuf_release(&last_run); >> +} >> + >> static void initialize_task_config(void) >> { >> int i; >> @@ -1359,6 +1387,7 @@ static void initialize_task_config(void) >> >> for (i = 0; i < TASK__COUNT; i++) { >> int config_value; >> + char *config_str; >> >> strbuf_setlen(&config_name, 0); >> strbuf_addf(&config_name, "maintenance.%s.enabled", >> @@ -1366,6 +1395,20 @@ static void initialize_task_config(void) >> >> if (!git_config_get_bool(config_name.buf, &config_value)) >> tasks[i].enabled = config_value; >> + >> + strbuf_setlen(&config_name, 0); > > It looks like we have a simple and better named alias for this: > > strbuf_reset(&config_name) > > _reset has 400+ occurences in this code base, compare to 20 of _setlen Makes sense. Thanks. -Stolee

gitgitgadget · 2020-08-20T15:02:07Z

.gitignore

@@ -67,6 +67,7 @@
 /git-filter-branch


On the Git mailing list, Đoàn Trần Công Danh wrote (reply to this):

On 2020-08-19 17:16:45+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > From: Derrick Stolee <dstolee@microsoft.com> > > It can be helpful to store a list of repositories in global or system > config and then iterate Git commands on that list. Create a new builtin > that makes this process simple for experts. We will use this builtin to > run scheduled maintenance on all configured repositories in a future > change. Nice, I like this new command. However, I'm not sure if we could declare this command as plumbing or porcelain command. I guess this command is meant more for scripting purpose, hence, it should be plumbing, thus we need to define clear protocol for this command, e.g, where it will redirect other command output, error to, where for-each-repo write its own output/error. Also, I think it would be nice to declare this is experimental for now. Like we declared git-switch and git-restore. -- Danh > > The test is very simple, but does highlight that the "--" argument is > optional. > > Signed-off-by: Derrick Stolee <dstolee@microsoft.com> > --- > .gitignore | 1 + > Documentation/git-for-each-repo.txt | 45 ++++++++++++++++++++++ > Makefile | 1 + > builtin.h | 1 + > builtin/for-each-repo.c | 58 +++++++++++++++++++++++++++++ > git.c | 1 + > t/t0068-for-each-repo.sh | 30 +++++++++++++++ > 7 files changed, 137 insertions(+) > create mode 100644 Documentation/git-for-each-repo.txt > create mode 100644 builtin/for-each-repo.c > create mode 100755 t/t0068-for-each-repo.sh > > diff --git a/.gitignore b/.gitignore > index a5808fa30d..5eb2a2be71 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -67,6 +67,7 @@ > /git-filter-branch > /git-fmt-merge-msg > /git-for-each-ref > +/git-for-each-repo > /git-format-patch > /git-fsck > /git-fsck-objects > diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt > new file mode 100644 > index 0000000000..83b06db410 > --- /dev/null > +++ b/Documentation/git-for-each-repo.txt > @@ -0,0 +1,45 @@ > +git-for-each-repo(1) > +==================== > + > +NAME > +---- > +git-for-each-repo - Run a Git command on a list of repositories > + > + > +SYNOPSIS > +-------- > +[verse] > +'git for-each-repo' --config=<config> [--] <arguments> > + > + > +DESCRIPTION > +----------- > +Run a Git commands on a list of repositories. The arguments after the > +known options or `--` indicator are used as the arguments for the Git > +command. > + > +For example, we could run maintenance on each of a list of repositories > +stored in a `maintenance.repo` config variable using > + > +------------- > +git for-each-repo --config=maintenance.repo maintenance run > +------------- > + > +This will run `git -C <repo> maintenance run` for each value `<repo>` > +in the multi-valued config variable `maintenance.repo`. > + > + > +OPTIONS > +------- > +--config=<config>:: > + Use the given config variable as a multi-valued list storing > + absolute path names. Iterate on that list of paths to run > + the given arguments. > ++ > +These config values are loaded from system, global, and local Git config, > +as available. If `git for-each-repo` is run in a directory that is not a > +Git repository, then only the system and global config is used. > + > +GIT > +--- > +Part of the linkgit:git[1] suite > diff --git a/Makefile b/Makefile > index 65f8cfb236..7c588ff036 100644 > --- a/Makefile > +++ b/Makefile > @@ -1071,6 +1071,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o > BUILTIN_OBJS += builtin/fetch.o > BUILTIN_OBJS += builtin/fmt-merge-msg.o > BUILTIN_OBJS += builtin/for-each-ref.o > +BUILTIN_OBJS += builtin/for-each-repo.o > BUILTIN_OBJS += builtin/fsck.o > BUILTIN_OBJS += builtin/gc.o > BUILTIN_OBJS += builtin/get-tar-commit-id.o > diff --git a/builtin.h b/builtin.h > index 17c1c0ce49..ff7c6e5aa9 100644 > --- a/builtin.h > +++ b/builtin.h > @@ -150,6 +150,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix); > int cmd_fetch_pack(int argc, const char **argv, const char *prefix); > int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix); > int cmd_for_each_ref(int argc, const char **argv, const char *prefix); > +int cmd_for_each_repo(int argc, const char **argv, const char *prefix); > int cmd_format_patch(int argc, const char **argv, const char *prefix); > int cmd_fsck(int argc, const char **argv, const char *prefix); > int cmd_gc(int argc, const char **argv, const char *prefix); > diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c > new file mode 100644 > index 0000000000..5bba623ff1 > --- /dev/null > +++ b/builtin/for-each-repo.c > @@ -0,0 +1,58 @@ > +#include "cache.h" > +#include "config.h" > +#include "builtin.h" > +#include "parse-options.h" > +#include "run-command.h" > +#include "string-list.h" > + > +static const char * const for_each_repo_usage[] = { > + N_("git for-each-repo --config=<config> <command-args>"), > + NULL > +}; > + > +static int run_command_on_repo(const char *path, > + void *cbdata) > +{ > + int i; > + struct child_process child = CHILD_PROCESS_INIT; > + struct strvec *args = (struct strvec *)cbdata; > + > + child.git_cmd = 1; > + strvec_pushl(&child.args, "-C", path, NULL); > + > + for (i = 0; i < args->nr; i++) > + strvec_push(&child.args, args->v[i]); > + > + return run_command(&child); > +} > + > +int cmd_for_each_repo(int argc, const char **argv, const char *prefix) > +{ > + static const char *config_key = NULL; > + int i, result = 0; > + const struct string_list *values; > + struct strvec args = STRVEC_INIT; > + > + const struct option options[] = { > + OPT_STRING(0, "config", &config_key, N_("config"), > + N_("config key storing a list of repository paths")), > + OPT_END() > + }; > + > + argc = parse_options(argc, argv, prefix, options, for_each_repo_usage, > + PARSE_OPT_STOP_AT_NON_OPTION); > + > + if (!config_key) > + die(_("missing --config=<config>")); > + > + for (i = 0; i < argc; i++) > + strvec_push(&args, argv[i]); > + > + values = repo_config_get_value_multi(the_repository, > + config_key); > + > + for (i = 0; !result && i < values->nr; i++) > + result = run_command_on_repo(values->items[i].string, &args); > + > + return result; > +} > diff --git a/git.c b/git.c > index 24f250d29a..1cab64b5d1 100644 > --- a/git.c > +++ b/git.c > @@ -511,6 +511,7 @@ static struct cmd_struct commands[] = { > { "fetch-pack", cmd_fetch_pack, RUN_SETUP | NO_PARSEOPT }, > { "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP }, > { "for-each-ref", cmd_for_each_ref, RUN_SETUP }, > + { "for-each-repo", cmd_for_each_repo, RUN_SETUP_GENTLY }, > { "format-patch", cmd_format_patch, RUN_SETUP }, > { "fsck", cmd_fsck, RUN_SETUP }, > { "fsck-objects", cmd_fsck, RUN_SETUP }, > diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh > new file mode 100755 > index 0000000000..136b4ec839 > --- /dev/null > +++ b/t/t0068-for-each-repo.sh > @@ -0,0 +1,30 @@ > +#!/bin/sh > + > +test_description='git for-each-repo builtin' > + > +. ./test-lib.sh > + > +test_expect_success 'run based on configured value' ' > + git init one && > + git init two && > + git init three && > + git -C two commit --allow-empty -m "DID NOT RUN" && > + git config run.key "$TRASH_DIRECTORY/one" && > + git config --add run.key "$TRASH_DIRECTORY/three" && > + git for-each-repo --config=run.key commit --allow-empty -m "ran" && > + git -C one log -1 --pretty=format:%s >message && > + grep ran message && > + git -C two log -1 --pretty=format:%s >message && > + ! grep ran message && > + git -C three log -1 --pretty=format:%s >message && > + grep ran message && > + git for-each-repo --config=run.key -- commit --allow-empty -m "ran again" && > + git -C one log -1 --pretty=format:%s >message && > + grep again message && > + git -C two log -1 --pretty=format:%s >message && > + ! grep again message && > + git -C three log -1 --pretty=format:%s >message && > + grep again message > +' > + > +test_done > -- > gitgitgadget > -- Danh

Some commands run 'git maintenance run --auto --[no-]quiet' after doing their normal work, as a way to keep repositories clean as they are used. Currently, users who do not want this maintenance to occur would set the 'gc.auto' config option to 0 to avoid the 'gc' task from running. However, this does not stop the extra process invocation. On Windows, this extra process invocation can be more expensive than necessary. Allow users to drop this extra process by setting 'maintenance.auto' to 'false'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

derrickstolee · 2020-08-25T18:38:03Z

/submit

gitgitgadget · 2020-08-25T18:40:41Z

Submitted as pull.680.v2.git.1598380805.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-680/derrickstolee/maintenance/scheduled-v2

To fetch this version to local tag pr-680/derrickstolee/maintenance/scheduled-v2:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-680/derrickstolee/maintenance/scheduled-v2

gitgitgadget · 2020-08-25T21:47:52Z

Documentation/config/maintenance.txt

@@ -1,3 +1,8 @@
+maintenance.auto::


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Derrick Stolee <dstolee@microsoft.com> > > Some commands run 'git maintenance run --auto --[no-]quiet' after doing > their normal work, as a way to keep repositories clean as they are used. > Currently, users who do not want this maintenance to occur would set the > 'gc.auto' config option to 0 to avoid the 'gc' task from running. > However, this does not stop the extra process invocation. OK, that is because the configuration is checked on the other side, and the new check is implemented on this side before we decide to spawn the maintenance task. It sounds like a change worth having without even waiting for the "git maintenance" to materialize ;-). > @@ -1868,8 +1869,13 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task, > > int run_auto_maintenance(int quiet) > { > + int enabled; > struct child_process maint = CHILD_PROCESS_INIT; > > + if (!git_config_get_bool("maintenance.auto", &enabled) && > + !enabled) > + return 0; So in a repository without this configuration, get_bool would fail and we do not short-circuit. Only if the value get_bool sees is false, we return without running the command. Makes sense.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/25/2020 5:44 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Derrick Stolee <dstolee@microsoft.com> >> >> Some commands run 'git maintenance run --auto --[no-]quiet' after doing >> their normal work, as a way to keep repositories clean as they are used. >> Currently, users who do not want this maintenance to occur would set the >> 'gc.auto' config option to 0 to avoid the 'gc' task from running. >> However, this does not stop the extra process invocation. > > OK, that is because the configuration is checked on the other side, > and the new check is implemented on this side before we decide to > spawn the maintenance task. > > It sounds like a change worth having without even waiting for the > "git maintenance" to materialize ;-). True. This one could be pulled out of Part III and placed any time after Part I (outside of test script context conflicts). The only reason to wait until after Part I is that the name "maintenance.auto" implies a connection to the maintenance builtin instead of gc. Before the maintenance builtin, it would be natural to see "gc.auto=0" and do this same behavior, but that will not be general enough in the future. If you prefer, I can pull this out into a series on its own to be tracked separately. Thanks, -Stolee

On the Git mailing list, Junio C Hamano wrote (reply to this):

Derrick Stolee <stolee@gmail.com> writes: > Before the maintenance builtin, it would be natural to see "gc.auto=0" > and do this same behavior, but that will not be general enough in the > future. I do think it is an excellent change to move the check done in the need_to_gc() to check the value of gc.auto to run_auto_gc() for exactly the reason the log message of this patch gives. And after the function is renamed to run_auto_maintenance(), at some point the variable that gets checked would also be updated, and we'd eventually reach the same state, I would think. But it is so small a change that it probably is not worth the book-keeping burden of remembering that the maintenance topic needs to build on the patch to update auto-gc. > If you prefer, I can pull this out into a series on its own to be > tracked separately. So let's leave it as-is. Thanks.

gitgitgadget · 2020-08-25T21:54:06Z

Documentation/config/maintenance.txt

@@ -1,10 +1,20 @@
+maintenance.auto::


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > I selected the timestamp before the task, as opposed to after the task, > for a couple reasons: > > 1. The time the task takes to execute should not contribute to the > interval between running the tasks. ... as long as the run time is sufficiently shorter than the interval, that is. If a task takes 10-30 minutes depending on how dirty the repository is, it does not make sense to even try to run it every 15 minutes. > If a daily task takes 10 minutes > to run, then every day the execution will drift by at least 10 > minutes. That is not incorrect per-se, but it does not tell us why drifting by 10 minutes is a bad thing. > 2. If the task fails for some unforseen reason, it would be good to > indicate that we _attempted_ the task at a certain timestamp. This > will avoid spamming a repository that is in a bad state. Absolutely. > +static void update_last_run(struct maintenance_task *task) > +{ > + timestamp_t now = approxidate("now"); > + struct strbuf config = STRBUF_INIT; > + struct strbuf value = STRBUF_INIT; > + strbuf_addf(&config, "maintenance.%s.lastrun", task->name); > + strbuf_addf(&value, "%"PRItime"", now); So is this essentially meant as a human-unreadable opaque value, like we have in the commit object header lines? I do not have a strong opinion, but it would be nice to allow curious to casually read it. Perhaps "git config --type=timestamp maintenance.lastrun" can be taught to pretty print its value?

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/25/2020 5:52 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> I selected the timestamp before the task, as opposed to after the task, >> for a couple reasons: >> >> 1. The time the task takes to execute should not contribute to the >> interval between running the tasks. > > ... as long as the run time is sufficiently shorter than the > interval, that is. If a task takes 10-30 minutes depending on how > dirty the repository is, it does not make sense to even try to run > it every 15 minutes. Definitely. The lock on the object database from earlier prevents these longer-than-anticipated tasks from stacking. >> If a daily task takes 10 minutes >> to run, then every day the execution will drift by at least 10 >> minutes. > > That is not incorrect per-se, but it does not tell us why drifting > by 10 minutes is a bad thing. True. >> 2. If the task fails for some unforseen reason, it would be good to >> indicate that we _attempted_ the task at a certain timestamp. This >> will avoid spamming a repository that is in a bad state. > > Absolutely. > >> +static void update_last_run(struct maintenance_task *task) >> +{ >> + timestamp_t now = approxidate("now"); >> + struct strbuf config = STRBUF_INIT; >> + struct strbuf value = STRBUF_INIT; >> + strbuf_addf(&config, "maintenance.%s.lastrun", task->name); >> + strbuf_addf(&value, "%"PRItime"", now); > > So is this essentially meant as a human-unreadable opaque value, > like we have in the commit object header lines? I do not have a > strong opinion, but it would be nice to allow curious to casually > read it. Perhaps "git config --type=timestamp maintenance.lastrun" > can be taught to pretty print its value? Good idea. I will think on this. Of course, we already have --type=expiry-date, which does the opposite. Perhaps this config value should be a human-readable date and then be parsed into a timestamp in-process using git_config_expiry_date()? I have mixed feelings on using that format, because it can store both a fixed or relative datetime. The *.lastRun config really wants a _fixed_ datetime, but the *.schedule config (in the next patch) would want a _relative_ datetime. This also allows things like "now" or "never", so it presents a lot of flexibility for users. A nightmare to test, but perhaps that flexibility is useful. (Of course, in another thread you mentioned multiple `crontab` lines, which might make this entire discussion irrelevant. I'll follow up there.) Thanks, -Stolee

On the Git mailing list, Junio C Hamano wrote (reply to this):

Derrick Stolee <stolee@gmail.com> writes: >>> 1. The time the task takes to execute should not contribute to the >>> interval between running the tasks. >> >> ... as long as the run time is sufficiently shorter than the >> interval, that is. If a task takes 10-30 minutes depending on how >> dirty the repository is, it does not make sense to even try to run >> it every 15 minutes. > > Definitely. The lock on the object database from earlier prevents these > longer-than-anticipated tasks from stacking. Hmph, I actually was (anticipating|hoping) that you would give a good argument for having maintenance subsystem in change of scheduling rather than cron, as it can monitor how the already running job is goind and skip one cycle if needed. The above is instead a good argument that independent cron jobs can still coordinate and there is no central and custom scheduler in the form of 'maintenance run'. >>> 2. If the task fails for some unforseen reason, it would be good to >>> indicate that we _attempted_ the task at a certain timestamp. This >>> will avoid spamming a repository that is in a bad state. >> >> Absolutely. Somebody already mentioned that using the configuration file for recordkeeping may not be a good idea, and I tend to agree, by the way. I may want to periodically take a snapshot of my configuration to notice and remember changes I made myself intentionally (e.g. switched access method of a hosting site from ssh:// to https://, added a new branch that builds on something else, etc.) by comparing the snapshot with previous ones (and might even put it under version-control) and mechanical noise would interfere with it.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/26/2020 1:03 PM, Junio C Hamano wrote: > Derrick Stolee <stolee@gmail.com> writes: > >>>> 1. The time the task takes to execute should not contribute to the >>>> interval between running the tasks. >>> >>> ... as long as the run time is sufficiently shorter than the >>> interval, that is. If a task takes 10-30 minutes depending on how >>> dirty the repository is, it does not make sense to even try to run >>> it every 15 minutes. >> >> Definitely. The lock on the object database from earlier prevents these >> longer-than-anticipated tasks from stacking. > > Hmph, I actually was (anticipating|hoping) that you would give a > good argument for having maintenance subsystem in change of > scheduling rather than cron, as it can monitor how the already > running job is goind and skip one cycle if needed. The above is > instead a good argument that independent cron jobs can still > coordinate and there is no central and custom scheduler in the form > of 'maintenance run'. While the lock does prevent concurrent 'maintenance run' commands from colliding and causing unpredictable behavior as they both modify the object database, this does not help ensure that maintenance tasks actually happen if certain tasks are fired independently by cron and consistently collide. This is the main motivation for me using a single crontab entry. More discussion of all of the tradeoffs is in [1]. [1] https://lore.kernel.org/git/bd4e18b7-6265-73e7-bc1a-a7d647eafd0a@gmail.com/ >>>> 2. If the task fails for some unforseen reason, it would be good to >>>> indicate that we _attempted_ the task at a certain timestamp. This >>>> will avoid spamming a repository that is in a bad state. >>> >>> Absolutely. > > Somebody already mentioned that using the configuration file for > recordkeeping may not be a good idea, and I tend to agree, by the > way. I may want to periodically take a snapshot of my configuration > to notice and remember changes I made myself intentionally > (e.g. switched access method of a hosting site from ssh:// to > https://, added a new branch that builds on something else, etc.) by > comparing the snapshot with previous ones (and might even put it > under version-control) and mechanical noise would interfere with it. I will think of another way to handle this, then. If we cannot infer that "this task was launched, therefore it is due to run" from an optimal cron schedule, then I'll probably create a new file in the .git repository that stores these values. That file would be in the config format to make parsing easy. Thanks, -Stolee

gitgitgadget · 2020-08-25T22:03:10Z

Documentation/config/maintenance.txt

@@ -1,10 +1,29 @@
+maintenance.auto::


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Derrick Stolee <dstolee@microsoft.com> > > A user may want to run certain maintenance tasks based on frequency, not > conditions given in the repository. For example, the user may want to > perform a 'prefetch' task every hour, or 'gc' task every day. To assist, > update the 'git maintenance run --scheduled' command to check the config > for the last run of that task and add a number of seconds. The task > would then run only if the current time is beyond that minimum > timestamp. > > Add a '--scheduled' option to 'git maintenance run' to only run tasks > that have had enough time pass since their last run. This is done for > each enabled task by checking if the current timestamp is at least as > large as the sum of 'maintenance.<task>.lastRun' and > 'maintenance.<task>.schedule' in the Git config. This second value is > new to this commit, storing a number of seconds intended between runs. > > A user could then set up an hourly maintenance run with the following > cron table: > > 0 * * * * git -C <repo> maintenance run --scheduled The scheme has one obvious drawback. An hourly crontab entry means your maintenance.*.schedule that is finer grained than an hour increment will not run as expected. You'd need to take all the schedule intervals and take their GCD to come up with the frequency of the single crontab entry. Wouldn't it make more sense to have N crontab entries for N tasks you want to run periodically, each with their own frequency controlled by crontab? That way, you do not need to maintain maintenance.*.schedule configuration variables and the --scheduled option. It might make maintenance.*.lastrun timestamps unneeded, which would be an added plus to simplify the system quite drastically. Most importantly, that would be the way crontab users are most used to in order to schedule their periodical jobs, so it is one less thing to learn.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/25/2020 6:01 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Derrick Stolee <dstolee@microsoft.com> >> >> A user may want to run certain maintenance tasks based on frequency, not >> conditions given in the repository. For example, the user may want to >> perform a 'prefetch' task every hour, or 'gc' task every day. To assist, >> update the 'git maintenance run --scheduled' command to check the config >> for the last run of that task and add a number of seconds. The task >> would then run only if the current time is beyond that minimum >> timestamp. >> >> Add a '--scheduled' option to 'git maintenance run' to only run tasks >> that have had enough time pass since their last run. This is done for >> each enabled task by checking if the current timestamp is at least as >> large as the sum of 'maintenance.<task>.lastRun' and >> 'maintenance.<task>.schedule' in the Git config. This second value is >> new to this commit, storing a number of seconds intended between runs. >> >> A user could then set up an hourly maintenance run with the following >> cron table: >> >> 0 * * * * git -C <repo> maintenance run --scheduled > > The scheme has one obvious drawback. An hourly crontab entry means > your maintenance.*.schedule that is finer grained than an hour > increment will not run as expected. You'd need to take all the > schedule intervals and take their GCD to come up with the frequency > of the single crontab entry. My intention for the *.schedule is that it is not an _exact_ frequency, but instead a lower bound on the frequency. That can be shelved for now as we discuss this setup: > Wouldn't it make more sense to have N crontab entries for N tasks > you want to run periodically, each with their own frequency > controlled by crontab? That way, you do not need to maintain > maintenance.*.schedule configuration variables and the --scheduled > option. It might make maintenance.*.lastrun timestamps unneeded, > which would be an added plus to simplify the system quite > drastically. Most importantly, that would be the way crontab users > are most used to in order to schedule their periodical jobs, so it > is one less thing to learn. I had briefly considered setting up crontab entries for each task (and possibly each repo) but ended up with these complications: 1. Maintenance frequency differs by task, so we need to split the crontab by task. But we can't just split everything because we do not want multiple tasks running at the same time on one repository. We would need to group the tasks and have one entry saying "git maintenance run --task=<task1> --task=<task2> ..." for all tasks in the group. 2. Different repositories might want different tasks at different frequencies, so we might need to split the crontab by repository. Again, we likely want to group repositories by these frequencies because a user could have 100 registered repositories and we don't really want to launch 100 parallel processes. 3. If we want to stop maintenance, then restart it, we need to clear the crontab and repopulate it, which would require iterating through all "registered" repositories to read their config for frequencies. 4. On macOS, editing the crontab doesn't require "sudo" but it _does_ create a pop-up(!) to get permission from the user. It would be good to minimize how often we edit the crontab and instead use config edits to change frequencies. With these things in mind, here is a suggested alternative design: Let users specify a schedule frequency among this list: hourly, daily, weekly, monthly. We then set the following* crontab: 0 * * * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=hourly 0 0 * * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=daily 0 0 * * 0 git for-each-repo --config=maintenance.repos maintenance run --scheduled=weekly 0 0 0 * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=monthly *Of course, there is some care around "$path/git --exec-path=$path" that I drop for ease here. Then, "git maintenance (start|stop)" can be just as simple as we have now: write a fixed schedule every time. The problem here is that cron will launch these processes in parallel, and then our object-database lock will cause some to fail! If anyone knows a simple way to tell cron "run hourly _except_ not at midnight" then we could let the "daily" schedule also run the "hourly" jobs, for instance. Hopefully that pattern could be extended to the weekly and monthly collisions. Alternatively, we could run every hour and then interpret from config if the current "hour" matches one of the schedules ourselves. So, the crontab would be this simple: 0 * * * * git for-each-repo --config=maintenance.repos maintenance run --scheduled and then we would internally decide "is this the midnight hour?" and "is this the first day of the week?" and "is this the first day of the month?" to detect if we should run the daily/weekly/monthly tasks. While it adds more time-awareness into Git, it does avoid the parallel task collisions. There are some concerns here related to long-running tasks delaying sequential runs of "git -C <repo> maintenance run --scheduled" causing the "is this the midnight hour?" queries to fail and having nightly/weekly/monthly maintenance be skipped accidentally. This motivates the *.lastRun config giving us some guarantee of _eventually_ running the tasks, just _not too frequently_. I hope this launches a good discussion to help us find a good cron schedule strategy. After we land on a suitable strategy, I'll summarize all of these subtleties in the commit message for posterity. Hopefully, the current way that I integrate with crontab and test that integration (in PATCH 6/7) could also be reviewed in parallel with this discussion. I'm very curious to see how that could be improved. Thanks, -Stolee

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/26/2020 11:30 AM, Derrick Stolee wrote: > Let users specify a schedule frequency among this list: hourly, daily, > weekly, monthly. We then set the following* crontab: > > 0 * * * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=hourly > 0 0 * * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=daily > 0 0 * * 0 git for-each-repo --config=maintenance.repos maintenance run --scheduled=weekly > 0 0 0 * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=monthly > > *Of course, there is some care around "$path/git --exec-path=$path" > that I drop for ease here. Jeff Hostetler pointed out the following details in the crontab documentation [1]: Ranges of numbers are allowed. Ranges are two numbers separated with a hyphen. The specified range is inclusive. For example, 8-11 for an 'hours' entry specifies execution at hours 8, 9, 10, and 11. The first number must be less than or equal to the second one. [1] https://man7.org/linux/man-pages/man5/crontab.5.html This means we could try this schedule: 0 1-23 * * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=hourly 0 0 * * 1-6 git for-each-repo --config=maintenance.repos maintenance run --scheduled=daily 0 0 1-30 * 0 git for-each-repo --config=maintenance.repos maintenance run --scheduled=weekly 0 0 0 * * git for-each-repo --config=maintenance.repos maintenance run --scheduled=monthly And it should behave this way: Run --scheduled=hourly every hour, except at midnight. This runs all "hourly" tasks. Run --scheduled=daily at midnight, except on Sunday. This runs all "hourly" and "daily" tasks. Run --scheduled=weekly at midnight Sunday, except on the first day of the month. This runs all "hourly", "daily", and "weekly" tasks. Run --scheduled=monthly at midnight on the first day of the month. This runs all scheduled tasks. There is some subtlety between whether the "weekly" runs should be a subset of "monthly" and maybe the easiest way to handle that would be to not support "monthly" and have only "hourly", "daily", and "weekly" options for now. This should get around all of the parallel issues and allow us to drop the *.lastRun config option. Thoughts? Thanks, -Stolee

gitgitgadget · 2020-08-25T22:20:58Z

.gitignore

@@ -67,6 +67,7 @@
 /git-filter-branch


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > +SYNOPSIS > +-------- > +[verse] > +'git for-each-repo' --config=<config> [--] <arguments> > + ... > +--config=<config>:: > + Use the given config variable as a multi-valued list storing > + absolute path names. Would it make sense to allow this config to be read from the current repository, I wonder. It is probably designed to be written to either ~/.gitconfig or /etc/gitconfig because it is probably a need that is not per-repository to list repositories for various purposes specified by the config key, but I suspect there _might_ be a good use case for storing some custom list of repositories in the configuration file local to a repository, but it is not quite obvious what it is. If we have a good example, we may want to spell it out---that would help future readers who wonder about this (just like I am doing now). Also, if we do read from local config, should there be a way to say "ah, you may have read values from /etc/gitconfig and ~/.gitconfig, but please forget them---I have a full list I care when you are running in this repository", i.e. clear the list. It is purely a convention and there is no built-in mechanism for this in the config API, but often it is signalled by giving an empty string as a value. By the way, I do not have a good concrete suggestion, but can we use something better than <config> as the placeholder? I first thought this was naming the name of a file that lists repositories, not the config variable name in our usual config namespace. > +static int run_command_on_repo(const char *path, > + void *cbdata) Is that on repo or in repo? When I saw "-C" on the command line, I immediately thought of "in repo". > +{ > + int i; > + struct child_process child = CHILD_PROCESS_INIT; > + struct strvec *args = (struct strvec *)cbdata; > + > + child.git_cmd = 1; > + strvec_pushl(&child.args, "-C", path, NULL); > + > + for (i = 0; i < args->nr; i++) > + strvec_push(&child.args, args->v[i]); Would strvec_pushv() work, or is args->v[] not NULL terminated? > + return run_command(&child); > +} > + values = repo_config_get_value_multi(the_repository, > + config_key); Not your fault, but it is a bit unsatisfactory that we do not have special "type" meant for paths in the config API, unlike the parse-options API where there is a "filename" type that is a bit richer than a vanilla "string" type by allowing "prefix" handling. For the purposes of this, as the values are limited to absolute/full pathnames, it does not hurt as much, though. Thanks.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/25/2020 6:19 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> +SYNOPSIS >> +-------- >> +[verse] >> +'git for-each-repo' --config=<config> [--] <arguments> >> + ... >> +--config=<config>:: >> + Use the given config variable as a multi-valued list storing >> + absolute path names. > > Would it make sense to allow this config to be read from the current > repository, I wonder. It is probably designed to be written to > either ~/.gitconfig or /etc/gitconfig because it is probably a need > that is not per-repository to list repositories for various purposes > specified by the config key, but I suspect there _might_ be a good > use case for storing some custom list of repositories in the > configuration file local to a repository, but it is not quite > obvious what it is. > > If we have a good example, we may want to spell it out---that would > help future readers who wonder about this (just like I am doing now). > > Also, if we do read from local config, should there be a way to say > "ah, you may have read values from /etc/gitconfig and ~/.gitconfig, > but please forget them---I have a full list I care when you are > running in this repository", i.e. clear the list. It is purely a > convention and there is no built-in mechanism for this in the config > API, but often it is signalled by giving an empty string as a value. I guess I should test this, but if I ask for a multi-valued config, will I not get _all_ of the results from /etc/gitconfig, ~/.gitconfig, AND .git/config? That was my expectation, which is why I don't specify "local" or "global" config anywhere in the discussion. > By the way, I do not have a good concrete suggestion, but can we use > something better than <config> as the placeholder? I first thought > this was naming the name of a file that lists repositories, not the > config variable name in our usual config namespace. Sure. How about "<key>"? >> +static int run_command_on_repo(const char *path, >> + void *cbdata) > > Is that on repo or in repo? When I saw "-C" on the command line, I > immediately thought of "in repo". "in" is better. >> +{ >> + int i; >> + struct child_process child = CHILD_PROCESS_INIT; >> + struct strvec *args = (struct strvec *)cbdata; >> + >> + child.git_cmd = 1; >> + strvec_pushl(&child.args, "-C", path, NULL); >> + >> + for (i = 0; i < args->nr; i++) >> + strvec_push(&child.args, args->v[i]); > > Would strvec_pushv() work, or is args->v[] not NULL terminated? Yeah, pushv should work. >> + return run_command(&child); >> +} > > >> + values = repo_config_get_value_multi(the_repository, >> + config_key); > > Not your fault, but it is a bit unsatisfactory that we do not have > special "type" meant for paths in the config API, unlike the > parse-options API where there is a "filename" type that is a bit > richer than a vanilla "string" type by allowing "prefix" handling. > For the purposes of this, as the values are limited to absolute/full > pathnames, it does not hurt as much, though. Interesting. Noted. Thanks, -Stolee

gitgitgadget · 2020-08-26T12:45:55Z

On the Git mailing list, Michal Suchánek wrote (reply to this):

On Wed, Aug 19, 2020 at 05:16:41PM +0000, Derrick Stolee via GitGitGadget wrote:
> This is based on ds/maintenance-part-2, but with some local updates to
> review feedback. It won't apply cleanly right now. This RFC is for early
> feedback and not intended to make a new tracking branch until v2.
> 
> This RFC is intended to show how I hope to integrate true background
> maintenance into Git. As opposed to my original RFC [1], this entirely
> integrates with cron (through crontab [-e|-l]) to launch maintenance
> commands in the background.
> 
> [1] 
> https://lore.kernel.org/git/pull.597.git.1585946894.gitgitgadget@gmail.com/
> 
> Some preliminary work is done to allow a new --scheduled option that
> triggers enabled tasks only if they have not been run in some amount of
> time. The timestamp of the previous run is stored in the 
> maintenance.<task>.lastRun config value while the interval is stored in the 
> maintenance.<task>.schedule config value.
This changes the config file from read-mostly to continuously updated. Is
that desirable?
In particular it significanly increases the risk of race with the user
editing the file.
I think timestamps are not configuration and should be written to some
other file.
Or is there already a core git feature that continuously updates the
config file?

Thanks

Michal

gitgitgadget · 2020-08-27T14:55:59Z

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

A user may want to run certain maintenance tasks based on frequency, not conditions given in the repository. For example, the user may want to perform a 'prefetch' task every hour, or 'gc' task every day. To assist, update the 'git maintenance run' command to include a '--schedule=<frequency>' option. The allowed frequencies are 'hourly', 'daily', and 'weekly'. These values are also allowed in a new config value 'maintenance.<task>.schedule'. The 'git maintenance run --schedule=<frequency>' checks the '*.schedule' config value for each enabled task to see if the configured frequency is at least as frequent as the frequency from the '--schedule' argument. We use the following order, for full clarity: 'hourly' > 'daily' > 'weekly' Use new 'enum schedule_priority' to track these values numerically. The following cron table would run the scheduled tasks with the correct frequencies: 0 1-23 * * * git -C <repo> maintenance run --scheduled=hourly 0 0 * * 1-6 git -C <repo> maintenance run --scheduled=daily 0 0 * * 0 git -C <repo> maintenance run --scheduled=weekly This cron schedule will run --scheduled=hourly every hour except at midnight. This avoids a concurrent run with the --scheduled=daily that runs at midnight every day except the first day of the week. This avoids a concurrent run with the --scheduled=weekly that runs at midnight on the first day of the week. Since --scheduled=daily also runs the 'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily' tasks, we will still see all tasks run with the proper frequencies. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

It can be helpful to store a list of repositories in global or system config and then iterate Git commands on that list. Create a new builtin that makes this process simple for experts. We will use this builtin to run scheduled maintenance on all configured repositories in a future change. The test is very simple, but does highlight that the "--" argument is optional. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

In preparation for launching background maintenance from the 'git maintenance' builtin, create register/unregister subcommands. These commands update the new 'maintenance.repos' config option in the global config so the background maintenance job knows which repositories to maintain. These commands allow users to add a repository to the background maintenance list without disrupting the actual maintenance mechanism. For example, a user can run 'git maintenance register' when no background maintenance is running and it will not start the background maintenance. A later update to start running background maintenance will then pick up this repository automatically. The opposite example is that a user can run 'git maintenance unregister' to remove the current repository from background maintenance without halting maintenance for other repositories. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

Add new subcommands to 'git maintenance' that start or stop background maintenance using 'cron', when available. This integration is as simple as I could make it, barring some implementation complications. The schedule is laid out as follows: 0 1-23 * * * $cmd maintenance run --schedule=hourly 0 0 * * 1-6 $cmd maintenance run --schedule=daily 0 0 * * 0 $cmd maintenance run --schedule=weekly where $cmd is a properly-qualified 'git for-each-repo' execution: $cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo where $path points to the location of the Git executable running 'git maintenance start'. This is critical for systems with multiple versions of Git. Specifically, macOS has a system version at '/usr/bin/git' while the version that users can install resides at '/usr/local/bin/git' (symlinked to '/usr/local/libexec/git-core/git'). This will also use your locally-built version if you build and run this in your development environment without installing first. This conditional schedule avoids having cron launch multiple 'git for-each-repo' commands in parallel. Such parallel commands would likely lead to the 'hourly' and 'daily' tasks competing over the object database lock. This could lead to to some tasks never being run! Since the --schedule=<frequency> argument will run all tasks with _at least_ the given frequency, the daily runs will also run the hourly tasks. Similarly, the weekly runs will also run the daily and hourly tasks. The GIT_TEST_CRONTAB environment variable is not intended for users to edit, but instead as a way to mock the 'crontab [-l]' command. This variable is set in test-lib.sh to avoid a future test from accidentally running anything with the cron integration from modifying the user's schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our tests to check how the schedule is modified in 'git maintenance (start|stop)' commands. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

The 'git maintenance (register|start)' subcommands add the current repository to the global Git config so maintenance will operate on that repository. It does not specify what maintenance should occur or how often. If a user sets any 'maintenance.<task>.schedule' config value, then they have chosen a specific schedule for themselves and Git should respect that. However, in an effort to recommend a good schedule for repositories of all sizes, set new config values for recommended tasks that are safe to run in the background while users run foreground Git commands. These commands are generally everything but the 'gc' task. Author's Note: I feel we should do _something_ to recommend a good schedule to users, but I'm not 100% set on this schedule. This is the schedule we use in Scalar and VFS for Git for very large repositories using the GVFS protocol. While the schedule works in that environment, it is possible that "normal" Git repositories could benefit from something more obvious (such as running 'gc' weekly). However, this patch gives us a place to start a conversation on what we should recommend. For my purposes, Scalar will set these config values so we can always differ from core Git's recommendations. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

derrickstolee · 2020-08-28T15:44:23Z

/submit

gitgitgadget · 2020-08-28T15:45:46Z

Submitted as pull.680.v3.git.1598629517.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-680/derrickstolee/maintenance/scheduled-v3

To fetch this version to local tag pr-680/derrickstolee/maintenance/scheduled-v3:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-680/derrickstolee/maintenance/scheduled-v3

derrickstolee · 2020-08-31T19:00:56Z

Here is a conversation in the Git IRC channel about what to do for the next version:

dscho · 2020-09-01T08:34:12Z

Here is a conversation in the Git IRC channel about what to do for the next version:

And here is a link to those logs: https://colabti.org/irclogger/irclogger_log/git-devel?date=2020-08-31#l93

derrickstolee · 2020-09-04T12:48:42Z

Closing in preparation of a non-RFC version based on a different branch.

derrickstolee force-pushed the stolee/maintenance/gc branch from 3233e60 to f24db77 Compare July 23, 2020 16:45

derrickstolee force-pushed the maintenance/scheduled branch from f293b85 to 7f5103c Compare July 23, 2020 16:45

derrickstolee mentioned this pull request Jul 23, 2020

Maintenance builtin, allowing 'gc --auto' customization #671

Closed

This was referenced Jul 26, 2020

Weekly Digest (19 July, 2020 - 26 July, 2020) #682

Closed

Weekly Digest (19 July, 2020 - 26 July, 2020) #683

Closed

derrickstolee force-pushed the maintenance/scheduled branch 3 times, most recently from 172a297 to e885a97 Compare July 30, 2020 18:01

derrickstolee force-pushed the stolee/maintenance/gc branch from f24db77 to aed9183 Compare July 30, 2020 18:57

derrickstolee force-pushed the maintenance/scheduled branch 2 times, most recently from 08d04e9 to 70194b8 Compare July 30, 2020 19:04

derrickstolee force-pushed the stolee/maintenance/gc branch 2 times, most recently from eebd684 to 39eb83a Compare July 30, 2020 20:23

derrickstolee force-pushed the maintenance/scheduled branch from 70194b8 to c6f5465 Compare July 30, 2020 20:23

weekly-digest bot mentioned this pull request Aug 2, 2020

Weekly Digest (26 July, 2020 - 2 August, 2020) #690

Closed

derrickstolee force-pushed the maintenance/scheduled branch from c6f5465 to b9ba549 Compare August 3, 2020 19:27

derrickstolee force-pushed the stolee/maintenance/gc branch from 39eb83a to 5004df1 Compare August 3, 2020 19:28

derrickstolee force-pushed the maintenance/scheduled branch from b9ba549 to 364eda2 Compare August 4, 2020 18:47

derrickstolee force-pushed the stolee/maintenance/gc branch from 5004df1 to 9d068d3 Compare August 4, 2020 18:47

derrickstolee force-pushed the maintenance/scheduled branch from 364eda2 to 76219c1 Compare August 4, 2020 19:54

derrickstolee force-pushed the stolee/maintenance/gc branch from 9d068d3 to 33175e5 Compare August 4, 2020 19:54

derrickstolee force-pushed the maintenance/scheduled branch from 76219c1 to 3c7ba30 Compare August 6, 2020 13:38

derrickstolee changed the title ~~[WIP] Maintenance builtin: background maintenance~~ [WIP] Maintenance III: background maintenance Aug 6, 2020

derrickstolee mentioned this pull request Aug 6, 2020

Maintenance II: prefetch, loose-objects, incremental-repack tasks #696

Closed

derrickstolee force-pushed the maintenance/scheduled branch from 3c7ba30 to 3f33223 Compare August 7, 2020 14:40

weekly-digest bot mentioned this pull request Aug 9, 2020

Weekly Digest (2 August, 2020 - 9 August, 2020) #697

Closed

derrickstolee force-pushed the maintenance/scheduled branch from 3f33223 to 0593806 Compare August 12, 2020 13:32

derrickstolee force-pushed the stolee/maintenance/gc branch from 407c123 to d9d8607 Compare August 14, 2020 01:51

derrickstolee force-pushed the maintenance/scheduled branch 2 times, most recently from be78607 to 7ae4293 Compare August 17, 2020 14:13

gitgitgadget bot reviewed Aug 20, 2020

View reviewed changes

derrickstolee force-pushed the maintenance/scheduled branch 2 times, most recently from 1b8050f to ec1c1e1 Compare August 25, 2020 16:02

derrickstolee force-pushed the stolee/maintenance/gc branch from 0c43c64 to 69dd5b2 Compare August 25, 2020 16:02

derrickstolee force-pushed the stolee/maintenance/gc branch from 69dd5b2 to e9bb32f Compare August 25, 2020 17:23

derrickstolee force-pushed the maintenance/scheduled branch from ec1c1e1 to 9ecabeb Compare August 25, 2020 17:23

gitgitgadget bot reviewed Aug 25, 2020

View reviewed changes

derrickstolee force-pushed the maintenance/scheduled branch from 9ecabeb to 7c6b051 Compare August 28, 2020 14:11

derrickstolee added 5 commits August 28, 2020 10:33

derrickstolee force-pushed the maintenance/scheduled branch from 7c6b051 to 62e8db8 Compare August 28, 2020 14:42

derrickstolee closed this Sep 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Maintenance III: background maintenance #680

[RFC] Maintenance III: background maintenance #680

derrickstolee commented Jul 23, 2020 •

edited

Loading

gitgitgadget bot Aug 20, 2020

gitgitgadget bot Aug 20, 2020

gitgitgadget bot Aug 24, 2020

gitgitgadget bot Aug 20, 2020

derrickstolee commented Aug 25, 2020

gitgitgadget bot commented Aug 25, 2020

gitgitgadget bot Aug 25, 2020

gitgitgadget bot Aug 26, 2020

gitgitgadget bot Aug 26, 2020

gitgitgadget bot Aug 25, 2020

gitgitgadget bot Aug 26, 2020

gitgitgadget bot Aug 26, 2020

gitgitgadget bot Aug 27, 2020

gitgitgadget bot Aug 25, 2020

gitgitgadget bot Aug 26, 2020

gitgitgadget bot Aug 27, 2020

gitgitgadget bot Aug 25, 2020

gitgitgadget bot Aug 26, 2020

gitgitgadget bot commented Aug 26, 2020

gitgitgadget bot commented Aug 27, 2020

derrickstolee commented Aug 28, 2020

gitgitgadget bot commented Aug 28, 2020

derrickstolee commented Aug 31, 2020

dscho commented Sep 1, 2020

derrickstolee commented Sep 4, 2020

[RFC] Maintenance III: background maintenance #680

[RFC] Maintenance III: background maintenance #680

Conversation

derrickstolee commented Jul 23, 2020 • edited Loading

Updates since RFC v2

Updates since RFC v1

gitgitgadget bot Aug 20, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 20, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 24, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 20, 2020

Choose a reason for hiding this comment

derrickstolee commented Aug 25, 2020

gitgitgadget bot commented Aug 25, 2020

gitgitgadget bot Aug 25, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 26, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 26, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 25, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 26, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 26, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 27, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 25, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 26, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 27, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 25, 2020

Choose a reason for hiding this comment

gitgitgadget bot Aug 26, 2020

Choose a reason for hiding this comment

gitgitgadget bot commented Aug 26, 2020

gitgitgadget bot commented Aug 27, 2020

derrickstolee commented Aug 28, 2020

gitgitgadget bot commented Aug 28, 2020

derrickstolee commented Aug 31, 2020

dscho commented Sep 1, 2020

derrickstolee commented Sep 4, 2020

derrickstolee commented Jul 23, 2020 •

edited

Loading