Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Repository::in_progress_operation() #382

Merged
merged 17 commits into from
Apr 18, 2022

Conversation

inferiorhumanorgans
Copy link

This function is designed to mimic libgit2's Repository::state() function.

@inferiorhumanorgans
Copy link
Author

The obvious omission here is test coverage. I'm not sure what you had in mind. For my project I've been setting up repos as minimum viable cases and tarring them up so as to avoid depending on a specific version of git being installed locally. The tests I've used for this PR live here:

https://github.com/inferiorhumanorgans/git-verification

Copy link
Owner

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great work, I am loving it!

I also took a look at the tests and would love to see them ported like so. The git-testtools crate allows to do what you need. Note that it makes the following assumptions:

  • The git binary is recent enough
  • the git binary is a stable interface to git.

With this in mind, you are able to get a Repository on the result of a shell script, where the following lines are a mandatory preamble.

#!/bin/bash
set -eu -o pipefail

git init -q
git config commit.gpgsign false

From there it should be possible to adapt the shell scripts in the test repository mentioned here and run the tests.

Thanks a lot, can't wait to get this merged.

git-repository/src/repository/state.rs Outdated Show resolved Hide resolved
git-repository/src/repository/state.rs Outdated Show resolved Hide resolved
@Byron
Copy link
Owner

Byron commented Apr 14, 2022

By the way, I love the idea of freezing the script output into tar files and untarring them on the fly. This is something one could use to reduce the time tests take on CI especially on windows which is awkwardly slow in executing processes.

It will probably take some work but I am inspired to find a way to support on-demand freezing of slow-running fixture shell scripts which would be picked up transparently if available, and which are tied to the version of the fixture that created them. I would, however, avoid putting these in to the repository and instead use git-lfs which github has support for.

Alex Zepeda added 7 commits April 14, 2022 00:28
1.) Install gnu-sed on the macOS instances

2.) Remember that which(1) sets a non-zero exit code if it doesn't find
each item you've specified.  In this case we're searching for 'gsed' and
'sed' in the hopes of finding at least one.

The ultimate goal is to prefer gsed if it exists as systems with a BSD
userland tend to install GNU tools with a 'g' prefix such that GNU sed
is accessible as 'gsed'.

GNU sed lets us easily (in a somewhat portable way) edit the second line
of a sequence so that we can change one 'pick' to 'edit'.  All while
pretending to be an interactive text editor to make git happy.
@inferiorhumanorgans
Copy link
Author

Apologies for all the CI churn. I did, however, notice something interesting. The Windows test instance gave a reasonably descriptive error message upon failure but the Ubuntu test instance did not. This is all the Ubuntu test showed:

stderr: ', tests/tools/src/lib.rs:115:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

In this case it should be the same failure on both platforms.

@Byron
Copy link
Owner

Byron commented Apr 14, 2022

These seem to be different errors though, maybe the shell on MacOS has trouble?

MacOS
2022-04-14T09:48:59.9196440Z stderr: /Users/runner/work/gitoxide/gitoxide/git-repository/tests/fixtures/make_rebase_i_repo.sh: line 33: [: 2 35 1: integer expression expected
2022-04-14T09:48:59.9196950Z fatal: --preserve-merges was replaced by --rebase-merges
2022-04-14T09:48:59.9197450Z ', tests/tools/src/lib.rs:115:9
2022-04-14T09:48:59.9197760Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2022-04-14T09:48:59.9197950Z 
2022-04-14T09:48:59.9197960Z 
2022-04-14T09:48:59.9198020Z failures:
Windows
2022-04-14T09:40:43.0699609Z stderr: which: no gsed in (/mingw64/bin:/usr/bin:/c/Users/runneradmin/bin:/d/a/gitoxide/gitoxide/target/debug/build/curl-sys-7e85c957548a17f0/out/build:/d/a/gitoxide/gitoxide/target/debug/build/libgit2-sys-a405fc67d8aee456/out/build:/d/a/gitoxide/gitoxide/target/debug/build/libssh2-sys-0acf82cde4d4f0d1/out/build:/d/a/gitoxide/gitoxide/target/debug/build/libz-sys-a02af73321b6f02b/out/lib:/d/a/gitoxide/gitoxide/target/debug/build/wepoll-ffi-169df33785c3c0f8/out:/d/a/gitoxide/gitoxide/target/debug/deps:/d/a/gitoxide/gitoxide/target/debug:/c/Users/runneradmin/.rustup/toolchains/stable-x86_64-pc-windows-msvc/lib/rustlib/x86_64-pc-windows-msvc/lib:/c/Users/runneradmin/.cargo/bin:/c/Users/runneradmin/.rustup/toolchains/stable-x86_64-pc-windows-msvc/bin:/c/Program Files/MongoDB/Server/5.0/bin:/c/aliyun-cli:/c/vcpkg:/c/cf-cli:/c/Program Files (x86)/NSIS:/c/tools/zstd:/c/Program Files/Mercurial:/c/hostedtoolcache/windows/stack/2.7.5/x64:/c/cabal/bin:/c/ghcup/bin:/c/tools/ghc-9.2.2/bin:/c/Program Files/dotnet:/c/mysql/bin:/c/Program Files/R/R-4.1.3/bin/x64:/c/SeleniumWebDrivers/GeckoDriver:/c/Program Files (x86)/sbt/bin:/c/Program Files (x86)/GitHub CLI:/bin:/c/Program Files (x86)/pipx_bin:/c/hostedtoolcache/windows/go/1.17.8/x64/bin:/c/hostedtoolcache/windows/Python/3.7.9/x64/Scripts:/c/hostedtoolcache/windows/Python/3.7.9/x64:/c/hostedtoolcache/windows/Ruby/2.5.9/x64/bin:/c/tools/kotlinc/bin:/c/hostedtoolcache/windows/Java_Temurin-Hotspot_jdk/8.0.322-6/x64/bin:/c/npm/prefix:/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/c/ProgramData/kind:/c/Program Files/Eclipse Foundation/jdk-8.0.302.8-hotspot/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0:/c/Windows/System32/OpenSSH:/c/ProgramData/Chocolatey/bin:/c/Program Files/Docker:/c/Program Files/PowerShell/7:/c/Program Files/Microsoft/Web Platform Installer:/c/Program Files/dotnet:/c/Program Files/Microsoft SQL Server/130/Tools/Binn:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/170/Tools/Binn:/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/c/Program Files (x86)/Microsoft SQL Server/110/DTS/Binn:/c/Program Files (x86)/Microsoft SQL Server/120/DTS/Binn:/c/Program Files (x86)/Microsoft SQL Server/130/DTS/Binn:/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn:/c/Program Files (x86)/Microsoft SQL Server/150/DTS/Binn:/c/Program Files/nodejs:/c/Program Files/OpenSSL/bin:/c/Strawberry/c/bin:/c/Strawberry/perl/site/bin:/c/Strawberry/perl/bin:/c/ProgramData/chocolatey/lib/pulumi/tools/Pulumi/bin:/c/Program Files/TortoiseSVN/bin:/c/Program Files/CMake/bin:/c/ProgramData/chocolatey/lib/maven/apache-maven-3.8.5/bin:/c/Program Files/Microsoft Service Fabric/bin/Fabric/Fabric.Code:/c/Program Files/Microsoft SDKs/Service Fabric/Tools/ServiceFabricLocalClusterManager:/cmd:/mingw64/bin:/usr/bin:/c/Program Files/GitHub CLI:/c/tools/php:/c/Program Files (x86)/sbt/bin:/c/SeleniumWebDrivers/ChromeDriver:/c/SeleniumWebDrivers/EdgeDriver:/c/Program Files/Amazon/AWSCLIV2:/c/Program Files/Amazon/SessionManagerPlugin/bin:/c/Program Files/Amazon/AWSSAMCLI/bin:/c/Program Files (x86)/Google/Cloud SDK/google-cloud-sdk/bin:/c/Program Files (x86)/Microsoft BizTalk Server:/c/Program Files/LLVM/bin:/c/Users/runneradmin/.dotnet/tools:/c/Users/runneradmin/.cargo/bin:/c/Users/runneradmin/AppData/Local/Microsoft/WindowsApps)
2022-04-14T09:40:43.0705473Z D:\a\gitoxide\gitoxide\git-repository\tests\fixtures\make_rebase_i_repo.sh: line 33: bc: command not found
2022-04-14T09:40:43.0706030Z fatal: --preserve-merges was replaced by --rebase-merges
2022-04-14T09:40:43.0706440Z ', tests\tools\src\lib.rs:115:9
2022-04-14T09:40:43.0706836Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@inferiorhumanorgans
Copy link
Author

inferiorhumanorgans commented Apr 14, 2022

The root problem was that I didn't realize which(1) would return non-zero in the absence of any one of the programs passed to it. That led me down a rabbit hole of trying to detect the git version when in reality the issue was in finding the path to a suitable sed. In terms of cross-platform scripting with the tests, lessons learned:

  • Windows instances come with a modern bash and git. Quixotically bc is missing.
  • Windows instances come with an oddly verbose version of which(1)
  • Mac instances come with a newer, non-Apple git but leave the ancient bash in place.
  • The fixtures' shebang points at /bin/bash but the test harness runs whichever bash instance is in the path first. In my case I still have the system /bin/bash but the newer homebrew version comes first in the path.

My goal is to take a look at the rest of your feedback in the next few days.

@Byron
Copy link
Owner

Byron commented Apr 14, 2022

Thanks a lot for the summary! When reading this I thought I was happy that the whole fixture setup affair worked so well for me thus far.

Regarding feedback, there isn't more except for the two minor changes that are probably fixed already (I yet have to take a look), and the notes about using the archive-technique in gitoxides test-suite can be ignored as it's already in the works in #384 . Even though I think ultimately I will only turn it on on windows and probably only for slow-running fixture setups, it's great to have previous states archived in case one needs to get old fixture runs back.

@Byron
Copy link
Owner

Byron commented Apr 15, 2022

Thanks for the update. I took another look and noticed that for some reason GitHub is all hung up on something so won't run CI while displaying the commit list like this:

Screen Shot 2022-04-16 at 07 56 39

I assume it's the same for you, but not having CI is a bit of an issue. Maybe it's because of the conflict in ci.yml?

In any case, please let me know once I should take a look again for a merge attempt. Thanks.

@inferiorhumanorgans
Copy link
Author

Correct. The whole point of most of these commits is to ensure test coverage. I've merged the latest changes from main into the topic branch. At a high level the only difference between the CI config for this branch and main should be that GNU sed is installed on macOS.

@inferiorhumanorgans
Copy link
Author

inferiorhumanorgans commented Apr 16, 2022

To the other point, cross platform shell scripting is a bit of a black art and that's a large part of why I split out the generation of the repositories from the tests on my own projects. This PR isn't too bad in that regard, but it would be nice to find a more portable solution to feigning an interactive git operation than relying on GNU sed.

The other thing I've noticed is that if a fixture fails the artifacts are left in place. What are your thoughts on deleting or moving them out of the way? The issue I've had is that if I'm trying to do some quick and dirty A/B testing I will likely be using the same two variants (a.k.a. the same two CRC values) and have to manually remove the generated directory to ensure the fixture is run again.

Alternatively what about using the fixture's mtime as part of the data to hash?

@Byron
Copy link
Owner

Byron commented Apr 16, 2022

The issue I've had is that if I'm trying to do some quick and dirty A/B testing I will likely be using the same two variants (a.k.a. the same two CRC values) and have to manually remove the generated directory to ensure the fixture is run again.

I think the current default behaviour is confusing and undesirable. Instead it would be better to cleanup automatically while allowing to set some environment variable to keep the previous version to help with debugging (my previous default case :D). I understand that if some directory is there the next time it runs it happily takes the state of the previous failed run. You are very welcome to make these adjustments in this PR.

Alternatively what about using the fixture's mtime as part of the data to hash?

Is this question obsolete by the statement above?

@davidkna
Copy link
Contributor

Not sure if it is appropriate for this function, as it is more expensive, but it would be very nice to also get progress information on the states (starship implementation).
I'd also be happy to submit a PR with that in this function or a new function.

@inferiorhumanorgans
Copy link
Author

inferiorhumanorgans commented Apr 16, 2022

@Byron Yeah, if the repos will get cleaned up after each run then there's no need to worry about a different hash. However I think that's more appropriate for a different PR.

@davidkna FWIW there was a bit of related discussion in #255. My use case is for a shell prompt as well and in fact this was the next feature on my list. I don't think getting the current status of the sequencer is all that expensive (read a couple files), but the bigger question is whether it's more appropriate to have the state enum contain that information or not. e.g.

enum State {// Should CherryPick and Revert use the Interactive suffix here?
  CherryPickSequence { current_operation: usize, total_operations: usize },
  RebaseInteractive { current_operation: usize, total_operations: usize },
  RevertSequence { current_operation: usize, total_operations: usize },
}

Or:

struct SequencerState {
    pub completed_operations: usize,
    pub total_operations: usize,

    // Commit currently subjected to an edit request, if any
    pub edit_commit: Option<Id>,

    // Other rebase state? See also:
    // https://github.com/git/git/blob/master/sequencer.c
}

impl Repository {
    pub fn sequencer_progress(&self) -> Option<SequencerState>;
}

I don't have any particularly strong feelings either way.

@inferiorhumanorgans
Copy link
Author

Apologies I just fat fingered that.

@davidkna
Copy link
Contributor

@inferiorhumanorgans Thanks for the update! I don't have strong opinions on the API, either. I feel like the enum approach is more intuitive, but a struct could help reduce code duplication.

@inferiorhumanorgans
Copy link
Author

inferiorhumanorgans commented Apr 16, 2022

I think it depends on the intended use case. For something like a prompt generator the only important metadata is the number of operations (completed, total) so having that as part of the enum is convenient but not so much so that it's worth having multiple paths to get at the same information.

However git stores a bunch of other information that IMO doesn't belong in the enum and may be worth exposing for other use cases.

@davidkna thoughts?

@Byron
Copy link
Owner

Byron commented Apr 17, 2022

Regarding the test failure, you could merge main as I believe to have applied a fix for that. It's that lock-creation failures appear to sometimes trigger a different kind of IO error on windows in highly concurrent situations.

When seeing the conversation about the potential performance impact of the obtaining the status for everything, I realized that it's probably more in line with the existing API to make it more lazy.

This can be achieved using a platform, like so:

fn status(&self) -> Platform<'_> {}

impl<'a> Platform<'a> {
    fn rebase(&self) -> git_rebase::Status {}
}

fn main() {
    repo.status().rebase()
}

An empty git-rebase crate was added to main as well.

I think this means the enum can be replaced with smaller structures, which probably fit well into git_repository::status::… if needed.

I hope this helps for the issues brought up here around efficiency and usability.

Byron added a commit that referenced this pull request Apr 17, 2022
git-rebase can be useful to hold status types for the rebase operation
and anticipates that eventually this will be implemented using gitoxide.

git-lfs anticipates its implementation and reserves the name.
@inferiorhumanorgans
Copy link
Author

I think the current implementation of Repository::in_progress_operation() makes sense as-is since it's lightweight (both the implementation and the concept). From my end it pretty much looks like what I'd like to see merged into main.

Agreed on keeping the API lazy. My preference is to keep more detailed info separate from the in_progress_operation return value. What sort of information were you thinking status() would return? In terms of separate crates, is the intent to have one per git sub-command? I know most of this discussion is around rebase behavior but cherry-pick and revert present similar opportunities.

The workflow could look something like this:

match repo.in_progress_operation() {
  None => {},
  // Should RepositoryState be renamed to something like RepositoryOperation?
  op @ RepositoryState::RebaseInteractive,
  | op @ RepositoryState::CherryPickSequence,
  | op @ RepositoryState::RevertSequence => {
    let seq = repo.sequencer_state()?;
    if let Some(id) = seq.commit_being_edited {
        print_ps1_segment("{:?} editing: {}", op, id);
    } else {
        print_ps1_segment("{:?} {}/{}", op, seq.completed_operations, seq.total_operations);
    }
  }
  op => print_ps1_segment("{:?}", op)
}

@Byron
Copy link
Owner

Byron commented Apr 17, 2022

Now I see, the status quo is tailor made for this use-case, and pretty good at that. Thanks for elaborating.

Let me pivot back to the previous stance so we can merge as is once all tests have been ported/migrated.

For context on the git-rebase crate. It's expected that quite some logic goes to each of these major areas of git functionality so one will be able to use cargo features to toggle them off, reducing the compile times to the minimum necessary to get what an application needs.

@inferiorhumanorgans
Copy link
Author

Yeah, this was intended to be a lightweight check for where a lot of details aren't needed. I think that the next logical step is to flesh out how to expose interactive rebase state (since that seems to have the most interest).

@Byron
Copy link
Owner

Byron commented Apr 18, 2022

I'd be happy if we could get this version tested by integrating all tests there are in the repository you created before and place all adjustments/changes in another PR, keeping the scope of this one smaller.

@inferiorhumanorgans
Copy link
Author

inferiorhumanorgans commented Apr 18, 2022

Yeah I agree 100%. The get rebase status discussion belongs in a different issue. The tests I didn't carry over from https://github.com/inferiorhumanorgans/git-verification/ don't apply here. They are for validating branch information on empty repositories and for counting untracked files and are unrelated to in progress operations.

However the following RepositoryState variants do not have tests:
ApplyMailbox, ApplyMailboxRebase, Bisect, CherryPickSequence, Merge, RevertSequence.

@Byron Byron self-assigned this Apr 18, 2022
Byron added a commit that referenced this pull request Apr 18, 2022
Byron added a commit that referenced this pull request Apr 18, 2022
Byron added a commit that referenced this pull request Apr 18, 2022
@inferiorhumanorgans
Copy link
Author

So I was going to punt on writing tests for the other states until I sat down to write a test to cover the CherryPickSequence state and caught a couple logic errors. I'll have tests for Bisect, CherryPickSequence, Merge, and RevertSequence shortly. If your test refactoring is stable I'd be happy to shape the four additional tests accordingly.

I've never used the mailbox stuff so I'm not sure where to begin with that.

@Byron Byron merged commit 9679d6b into Byron:main Apr 18, 2022
@Byron
Copy link
Owner

Byron commented Apr 18, 2022

I've never used the mailbox stuff so I'm not sure where to begin with that.

I am happy about any test you contribute, so no need to force it with mailboxes :). Generally, it should be possible to create an email to send patches, save the mailbox format, and apply that while causing a conflict maybe. I only have done the mail-creation once in my life so unfortunately can't help more here either.

The refactoring is done now and the additional tests could be submitted in a new PR.

And: Thanks a lot for your contribution :)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants