New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: rebase-builtin #505

Open
wants to merge 52 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@prertik
Copy link
Contributor

prertik commented Jun 10, 2018

No description provided.

@prertik

This comment has been minimized.

Copy link
Contributor Author

prertik commented Jun 10, 2018

@@ -26,6 +27,31 @@ static int use_builtin_rebase(void)
strbuf_release(&out);
return ret;
}
static int finish_rebase(){

}

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

I think this would need at least a TODO comment and a return 0;

argv_array_pushf(&child_env, "--preserve-merges", NULL);
status = run_command_v_opt_cd_env(child_argv, RUN_USING_SHELL, NULL,
child_env.argv);
if(status == 0) {

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

Nit: space missing betwen if and (.

child_env.argv);
if(status == 0) {
finish_rebase();
}

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

The brackets can be removed when a if clause has only one statement

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

If you keep them, at least have the following else just after the closing bracket.

if(status == 0) {
finish_rebase();
}
else if(status == 2) {

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

Nit: space missing betwen if and (.

else if(status == 2) {
apply_autostash();
remove_dir_recursively("GIT_DIR=%s/rebase-apply" ||
"GIT_DIR=%s/rebase-merge");

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

remove_dir_recursively() takes 2 arguments.

I don't see how this can even compile.

@chriscool

This comment has been minimized.

Copy link
Contributor

chriscool commented Jun 10, 2018

Please at least try to compile all your commits before asking us to review them.

close_all_packs(the_repository->objects);
run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
remove_dir_recursively(git_path_rebase_apply());
remove_dir_recursively(git_path_rebase_merge());

This comment has been minimized.

@chriscool

chriscool Jun 10, 2018

Contributor

remove_dir_recursively() takes 2 arguments, so I don't see how this can even compile.

@chriscool

This comment has been minimized.

Copy link
Contributor

chriscool commented Jun 10, 2018

Also please squash the commits that should be squashed, so that we don't review commits that are not relevant any more because you have already changed them later. For example I feel like I wasted time reviewing your "WIP" commit because related code has changed a lot in the following "WIP with some major changes" commit.

@prertik prertik force-pushed the prertik:wip-rebase branch from 9db462a to cbfa73a Jun 11, 2018

const struct object_id *head = &head_commit->object.oid;
unlink("$(git rev-parse --git-path REBASE_HEAD)"); //NEEDS HELP.
//apply_autostash();
const char *argv_gc_auto[] = { "gc", "--auto", NULL };

This comment has been minimized.

@chriscool

chriscool Jun 11, 2018

Contributor

This is a declaration after a statement (the unlink() call). We try to avoid those.

This comment has been minimized.

@dscho

dscho Jun 12, 2018

Member

Compile with DEVELOPER=1 to catch such issues.

This comment has been minimized.

@stefanbeller

stefanbeller Jun 21, 2018

Contributor

While it is convenient to use a NULL terminated array and call run_command_v_opt on it directly, there were problems with it in the past: It is hard to modify the arguments if we ever want to. (For example if we wanted to add a new parameter, the array would be made one longer as { "gc", "--auto", NULL, NULL }; and then we could place the parameter to add in the spot of the first NULL; but then we have to carefully count how many NULLs we have to have in this array. In recent times, I think, we'd prefer a higher level data structure, found argv-array.h; an example for running a sub process is found in builtin/pull.c#run_rebase for example:

    struct argv_array args = ARGV_ARRAY_INIT;
    argv_array_push(&args, "rebase");
    ...
    if (opt_autostash == 0)
        argv_array_push(&args, "--no-autostash");
    ...
    ret = run_command_v_opt(args.argv, RUN_GIT_CMD);
    ```
which I think would also be a good idea here, although we do not yet know which parameters we might want to add in the future. 
struct strbuf reflog_message = STRBUF_INIT;
struct commit *head_commit;
const struct object_id *head = &head_commit->object.oid;
unlink("$(git rev-parse --git-path REBASE_HEAD)"); //NEEDS HELP.

This comment has been minimized.

@chriscool

chriscool Jun 11, 2018

Contributor

You may want to use GIT_PATH_FUNC(...) to define git_path_rebase_head() like you did for git_path_rebase_merge() above.

This comment has been minimized.

@phillipwood

phillipwood Jun 11, 2018

Contributor

Or use delete_ref() as the sequencer code does

* user should see them.
*/
close_all_packs(the_repository->objects);
run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);

This comment has been minimized.

@chriscool

chriscool Jun 11, 2018

Contributor

I don't understand why the above lines are indented like this.

remove_dir_recursively(git_path_rebase_apply(), 1); //Is this the correct way?
remove_dir_recursively(git_path_rebase_merge(), 1);
die("Nothing to do");
argv_array_clear(&child_args);

This comment has been minimized.

@chriscool

chriscool Jun 11, 2018

Contributor

Indentation issue.

enum rebase_type mode;

static int use_builtin_rebase(void)
{

This comment has been minimized.

@stefanbeller

stefanbeller Jun 11, 2018

Contributor

This whole function could be replaced by git_config_get_bool("rebase.usebuiltin", &ret) ? (I seem to be missing the need why we need to fork off a new process to load that boolean, as this is not part of a shell -> translation, but part of the setup of a wrapper, i.e. new code?)

This comment has been minimized.

@prertik

prertik Jun 12, 2018

Author Contributor

Yes. You're right. I will replace this function in the coming iterations as this has no more need. It was from the start as a placeholder.

This comment has been minimized.

@dscho

dscho Jun 12, 2018

Member

No, that's not right. The whole point of the spawning exercise is to avoid going through the setup_git_directory() dance too early (and lose the information about the current directory!)

Please study the commit message of the difftool commit that introduced this paradigm. If you deviate from the paradigm in any way, you will most likely introduce a bug.

This comment has been minimized.

@chriscool

chriscool Jun 12, 2018

Contributor

@dscho ok but then it would be nice if that was explained here in the commit message or in a comment (or both).

This comment has been minimized.

@prertik

prertik Jun 12, 2018

Author Contributor

Ok. @dscho This seems to be the commit message.

The current version of the builtin difftool does not, however, make full
use of the internals but instead chooses to spawn a couple of Git
processes, still, to make for an easier conversion. There remains a lot
of room for improvement, left for a later date.

I'm sorry for not reading the message given by the commit properly.
And I totally agree with you.
@chriscool I will add a comment for the WIP commit and for the patch I'll be adding this detail to the commit message.

This comment has been minimized.

@phillipwood

phillipwood Jun 12, 2018

Contributor

But that comment does not explain why you're avoiding calling git_config_get_bool(). It would be better to explain that you're avoiding the setup dance and why. See the initial commit adding builtin/difftool.c (be8a90e) where there is a code comment explaining this.

This comment has been minimized.

@phillipwood

phillipwood Jun 12, 2018

Contributor

@dscho Why is it important to avoid changing directories when rebase doesn't have paths on the command line like difftool does?

This comment has been minimized.

@dscho

dscho Jun 12, 2018

Member

@phillipwood it is important in general, because code changes. So while the current code may not rely on the subdirectory in which it was started, future iterations might want to. To stay consistent is the sanest way to handle this: i.e. the way I suggested, to spawn.

In rebase -i's case, you further have to take into account that there are potentially scripts making use of GIT_PREFIX (IIRC one of my scripts does, or at least did).

This comment has been minimized.

@dscho

dscho Jun 12, 2018

Member

FWIW there is one way you could avoid spawning, that was developed in the meantime: the early config machinery (use read_early_config()).

But the idea is to use this technique only for the time until the project is completed, so I would be totally cool with the simpler-to-reason-about spawning approach.

@prertik prertik changed the title WIP rebase-builtin WIP: rebase-builtin Jun 12, 2018

@dscho

This comment has been minimized.

Copy link
Member

dscho commented Jun 12, 2018

Maybe working in --am after this would be better. What do you think?

I think that --preserve-merges is so uninteresting at this stage that it should be the last mode to be supported. The default mode of rebase, i.e. --am should be the one you optimize for, i.e. the one you start with.

@dscho

This comment has been minimized.

Copy link
Member

dscho commented Jun 13, 2018

@prertik any reaction to my reaction? I really wonder why you want to focus on the --preserve-merges mode when that is soon to be deprecated?

@prertik

This comment has been minimized.

Copy link
Contributor Author

prertik commented Jun 14, 2018

@dscho, yeah there was "double placement reaction" 😄
Ok, so I am now planning on type=am mode but was thinking of implementing the thing you said here before that:

The autostash part can be handled in a later patch, and should definitely leverage the apply_autostash() function that was introduced to support finishing interactive rebases in the sequencer; to make that work with the type=am mode, you will want to use get_dir() and teach it about the rebase-apply directory, and you will want to make apply_autostash() non-static (but keep get_dir() static).

@dscho

This comment has been minimized.

Copy link
Member

dscho commented Jun 15, 2018

I am worried that we get ahead of ourselves... so far, we have no working code, not even for the simplest use case: git rebase <upstream>.

I don't think it is wise to even investigate anything else until that works.

We really need to focus on one task at a time, and to try to get it done in the most efficient manner, so that we have a chance to finish this project in time: it's already mid-June...

@prertik prertik force-pushed the prertik:wip-rebase branch 2 times, most recently from 4c5cfb7 to d39ed0b Jun 19, 2018

@prertik

This comment has been minimized.

Copy link
Contributor Author

prertik commented Jun 19, 2018

@dscho

This comment has been minimized.

Copy link
Member

dscho commented Jun 19, 2018

@prertik could you please adjust the second commit's message? It should probably start by something like builtin/rebase: support running "git rebase <upstream>" and go into details how you did it.

Also, please change the commit subject from "This patch offers to provide a minimal rebase builtin." in the first patch, to reflect in a short, concise semi-sentence what this does ("rebase: refactor common shell functions into their own file" or some such, followed by "In the next comit, we will source git-rebase--am directly from the builtin rebase, therefore the functions from git-rebase.sh that are used by that backend would be missing." or something like that).

Also: have you performed a quick test whether the code works so far? Like, ./git --exec-path="$PWD" -c rebase.usebuiltin=true rebase HEAD^?

I would also like to split out the work we have done yesterday to add and execute that script snippet into its own commit. Ideally, I would like to see the current code split into three commits:

  1. introduce the bare-bones builtin (renaming git-rebase.sh to git-legacy-rebase.sh) that uses that config setting to hand off to the legacy rebase by default
  2. refactor the shell functions in git-rebase.sh that are used in the backends into git-rebase--common.sh.
  3. add the minimal code to make git rebase <upstream> work.

Note: I am not all that certain that the code already works, there are probably quite a few variables missing that we need to set in the env argv array. I will be available again in 1-2h to work on this if you have questions.

@prertik prertik force-pushed the prertik:wip-rebase branch 2 times, most recently from 90d8cee to c67435f Jun 19, 2018

@stefanbeller
Copy link
Contributor

stefanbeller left a comment

Could you mention that commit that introduced the difftool change? ("This commit imitates the strategy that was used to convert the difftool to a builtin, see be8a90e (difftool: add a skeleton for the upcoming builtin, 2017-01-17) for details: This commit renames git-rebase to ...-legacy... ") Maybe highlight the issues again? (For example you could repeat that we need to spawn a process to find out if we run the builtin or legacy rebase as that keeps the directory that we are in correct)

@stefanbeller
Copy link
Contributor

stefanbeller left a comment

In the second paragraph of the commit message there is a "ideally called, ideally named" I think one of them is superfluous?

@prertik prertik force-pushed the prertik:wip-rebase branch from c67435f to 46d16e0 Jun 20, 2018

@prertik

This comment has been minimized.

Copy link
Contributor Author

prertik commented Jun 20, 2018

How 'bout now @stefanbeller?

prertik added some commits Jul 13, 2018

builtin rebase: support --rerere-autoupdate
The `--rerere-autoupdate` option allows rerere to update the index with
resolved conflicts. This commit follows closely the equivalent part of
`git-legacy-rebase.sh`.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support --committer-date-is-author-date
This option is simply handed down to `git am` by way of setting the
`git_am_opt` variable that is handled by the `git-rebase--am` backend.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `ignore-whitespace` option
This commit adds support for the `--ignore-whitespace` option
of the rebase command. This option is simply passed to the
`--am` backend.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
Merge branch 'builtin-rebase-minimal'
This patch series provides the bare minimum to run more than the trivial
rebase (i.e. `git rebase <upstream>`).

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
Merge branch 'builtin-rebase-actions'
With this patch series, the builtin rebase learned about all the rebase
actions (such as --continue, --skip, etc).

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `ignore-date` option
This commit adds support for `--ignore-date` which is passed to `git am`
to easily change the dates of the rebased commits.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `keep-empty` option
The `--keep-empty` option can be used to keep the commits that do not
change anything from its parents in the result.

While the scripted version uses `interactive_rebase=implied` to indicate
that the rebase needs to use the `git-rebase--interactive` backend in
non-interactive mode as fallback when figuring out which backend to use,
the C version needs to use a different route because the backend will
already be chosen during the `parse_options()` call.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `--autosquash`
This commit adds support for the `--autosquash` option which is used to
automatically squash the commits marked as `squash` or `fixup` in their
messages. This is converted following `git-legacy-rebase.sh` closely.

This option can also be configured via the Git config setting
rebase.autosquash. To support this, we also add a custom
rebase_config() function in this commit that will be used instead (and
falls back to) git_default_config().

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `fork-point` option
This commit adds support for `--fork-point` and `--no-fork-point`.
This is converted as-is from `git-legacy-rebase.sh`.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
merge-base --fork-point: extract libified function
We need this functionality in the builtin rebase.

Note: to make this function truly reusable, we have to switch the call
get_merges_many_dirty() to get_merges_many() because we want the commit
flags to be reset (otherwise, subsequent get_merge_bases() calls would
obtain incorrect results). This did not matter when the function was
called in `git rev-parse --fork-point` because in that command, the
process definitely did not traverse any commits before exiting.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `--gpg-sign` option
This commit introduces support for `--gpg-sign` option which is used
to GPG-sign commits.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `-C` and `--whitespace=<type>`
This commit converts more code from the shell script version to the
builtin rebase. In this instance, we just have to be careful to
keep support for passing multiple `--whitespace` options, as the
shell script version does so, too.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `--autostash` option
To support `--autostash` we introduce a function `apply_autostash()`
just like in `git-legacy-rebase.sh`.

Rather than refactoring and using the same function that exists in
`sequencer.c`, we go a different route here, to avoid clashes with
the sister GSoC project that turns the interactive rebase into a
builtin.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `--exec`
This commit adds support for the `--exec` option which takes a shell
command-line as argument. This argument will be appended as an `exec
<cmd>` command after each line in the todo list that creates a commit in
the final history.  commands.

Note: while the shell script version of `git rebase` assigned the empty
string to `cmd` by default, we *unset* it here because the code looks
nicer and it does not change the behavior.

The `--exec` option requires `--interactive` machinery.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support `--allow-empty-message` option
This commit introduces the `--allow-empty-message` option to
`builtin/rebase.c`. The motivation behind this option is: if there are
empty messages (which is not allowed in Git by default, but can be
imported from different version control systems), the rebase will fail.

Using `--allow-empty-message` overrides that behaviour which will allow
the commits having empty messages to continue in rebase operation.

Note: a very recent change made this the default in the shell scripted
`git rebase`, therefore the builtin rebase does the same.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support --rebase-merges[=[no-]rebase-cousins]
The mode to rebase non-linear branches is now supported by the builtin
rebase, too.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: fast-forward to onto if it is a proper descendant
When trying to rebase onto a direct descendant of HEAD, we can
take a shortcut and fast-forward instead. This commit makes it so.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: optionally pass custom reflogs to reset_head()
In the next patch, we will make use of that in the code that
fast-forwards to `onto` whenever possible.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: show progress when connected to a terminal
In this commit, we pass `--progress` to the `format-patch` command if
stderr is connected to an interactive terminal, unless we're in quiet
mode.

This `--progress` option will be used in `format-patch` to show progress
reports on stderr as patches are generated.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: use no-op editor when interactive is "implied"
Some options are only handled by the git-rebase--interactive backend,
even if run non-interactively. For this awkward situation (run
non-interactively, but use the interactive backend), the shell scripted
version of `git rebase` introduced the concept of an "implied
interactive rebase". All it does is to replace the editor by a dummy one
(`:` is the Unix command that takes arbitrary command-line parameters,
ignores them and simply exits with success).

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: add support for custom merge strategies
When running a rebase in non-am mode, it uses the recursive merge to
cherry-pick the commits, and the rebase command allows to configure
the merge strategy to be used in this operation.

This commit adds that support to the builtin rebase.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: optionally auto-detect the upstream
The `git rebase` command, when called without the `<upstream>`
command-line argument, automatically looks for the upstream
branch configured for the current branch.

With this commit, the builtin rebase learned that trick, too.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: support --root
This option allows to rebase entire histories up to, and including, the
root commit.

The conversion from the shell script is straight-forward, apart from
the fact that we do not have to write an empty tree in C.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
builtin rebase: error out on incompatible option/mode combinations
While working on the GSoC project to convert the rebase command to a
builtin, the rebase command learned to error out on certain command-line
option combinations that cannot work, such as --whitespace=fix with
--interactive.

This commit converts that code.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
rebase: default to using the builtin rebase
Now that the builtin rebase is feature-complete, we should use it by
default. Let's keep the legacy scripted version around for the time
being; Once the builtin rebase is well-tested enough, we can remove
`git-legacy-rebase.sh`.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
Merge branch 'builtin-rebase-options'
This patch series completes the support for all rebase options in the
builtin rebase.

This converts the remaining command-line options.
Merge branch 'builtin-rebase-rest'
This converts all what is remaining from the shell scripted version of
`git rebase` to the builtin version.

@prertik prertik force-pushed the prertik:wip-rebase branch from d98687f to 0046209 Aug 8, 2018

Merge branch 'default-to-builtin-rebase'
Now that the builtin rebase learned all the tricks of the shell scripted
version, we can default to the builtin one.

Signed-off-by: Pratik Karki <predatoramigo@gmail.com>

@prertik prertik force-pushed the prertik:wip-rebase branch from 0046209 to 3358abd Aug 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment