Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

9 tests rely on commands like ln -s making copies instead of symlinks on Windows #1443

Closed
EliahKagan opened this issue Jul 5, 2024 · 4 comments · Fixed by #1444
Closed
Assignees

Comments

@EliahKagan
Copy link
Contributor

EliahKagan commented Jul 5, 2024

Current behavior 😯

Background

As detailed in #1442, ln -s in Git Bash (or other MSYS2 environments) in Windows does not ordinarily attempt to create symlinks, but instead copies files. Its default behavior of copying files is independent of whether the current user has the ability to create symlinks, and also independent of the core.symlinks Git configuration variable. The behavior is also not directly related to Git's behavior of creating regular files when core.symlinks is not set to true, because when Git creates a regular file instead of a symlink, that file still represents a symlink in the index and contains its target path, rather than being a copy of the target.

ln -s is used in about 14 of gitoxide's fixture scripts.

Although ln -s does not typically attempt to create symlinks, it will attempt to do so--and succeed if the user has the ability to create them without UAC elevation--if the default behavior of MSYS2's emulated symlink system call is customized in the MSYS environment variable. As noted in git-for-windows/git#3122 (comment) and #1442, one way to do this is to set MSYS=winsymlinks:nativestrict.

One should be careful about doing this automatically, because the user may already have it set; like the CYGWIN variable that it is based on, it supports a number of options, not all of which relate to symlinks, which can be set together in its value by separating them with spaces (or tabs). Therefore, if in the future gitoxide test fixtures automatically enable winsymlinks:nativestrict, they should probably do so in a way that preserves unrelated options in $MSYS if present. However, such subtleties can usually be ignored in manual testing on a system where MSYS is known to be unset.

Test failures with MSYS=winsymlinks:nativestrict

Originally my goal was to find out if any tests documented as failing in #1358 were failing due to ln -s or other Unix-style commands (e.g., tar to unpack an archive with symlinks) not really creating symlinks, such that they would pass with MSYS=winsymlinks:nativestrict. It turns out that there was exactly one such test; see #1358 (comment).

However, the bigger effect is that there are nine new failures! This is to say that some tests have been depending on ln -s, or other commands that would ordinarily create symlinks, not attempting to do so.

Failures when using pre-generated archives

Using pre-generated archives by not defining GIX_TEST_IGNORE_ARCHIVES, I ran:

MSYS=winsymlinks:nativestrict cargo nextest run --all --no-fail-fast

This produced 9 failures:

     Summary [ 360.891s] 2363 tests run: 2354 passed (12 slow, 1 leaky), 9 failed, 9 skipped
        FAIL [   0.695s] gix-archive::archive from_tree::basic_usage_internal
        FAIL [   0.706s] gix-archive::archive from_tree::basic_usage_tar
        FAIL [   0.032s] gix-archive::archive from_tree::basic_usage_zip
        FAIL [   0.020s] gix-status-tests::worktree stack::paths_not_going_through_symlink_directories_are_ok_and_point_to_correct_item
        FAIL [   0.024s] gix-status-tests::worktree status::index_as_worktree::modified
        FAIL [   0.031s] gix-status-tests::worktree status::index_as_worktree::refresh
        FAIL [   0.029s] gix-status-tests::worktree status::index_as_worktree::unchanged
        FAIL [   0.061s] gix-worktree-state-tests::worktree state::checkout::allow_or_disallow_symlinks
        FAIL [   3.897s] gix-worktree-stream::stream from_tree::will_provide_all_information_and_respect_export_ignore
error: test run failed

See this gist for the full test output of this run. Other than the information in this issue description, that gist contains the most important information relevant to this issue. The next subsection covers the effect of setting both MSYS and GIX_TEST_IGNORE_ARCHIVES, but for the purpose of showing that their effects are mostly independent, so it may be of less interest.

Failures when ignoring pre-generated archives

I independently tried it with:

MSYS=winsymlinks:nativestrict GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run --all --no-fail-fast

This gave 24 failures:

     Summary [ 891.930s] 2363 tests run: 2339 passed (22 slow, 2 leaky), 24 failed, 9 skipped
        FAIL [   0.948s] gix-archive::archive from_tree::basic_usage_internal
        FAIL [   1.089s] gix-archive::archive from_tree::basic_usage_tar
        FAIL [   0.998s] gix-archive::archive from_tree::basic_usage_zip
        FAIL [   7.922s] gix-glob::glob pattern::matching::compare_baseline_with_ours
        FAIL [   8.021s] gix-pathspec::pathspec parse::baseline
        FAIL [   0.018s] gix-pathspec::pathspec parse::valid::glob_negations_are_always_literal
        FAIL [   0.018s] gix-pathspec::pathspec parse::valid::whitespace_in_pathspec
        FAIL [   1.988s] gix-pathspec::pathspec search::files
        FAIL [   4.424s] gix-pathspec::pathspec search::prefixes_are_always_case_sensitive
        FAIL [ 180.755s] gix-ref-tests::refs packed::iter::performance
        FAIL [   0.436s] gix-status-tests::worktree stack::paths_not_going_through_symlink_directories_are_ok_and_point_to_correct_item
        FAIL [   0.792s] gix-status-tests::worktree status::index_as_worktree::modified
        FAIL [   0.027s] gix-status-tests::worktree status::index_as_worktree::refresh
        FAIL [   0.909s] gix-status-tests::worktree status::index_as_worktree::unchanged
        FAIL [   5.599s] gix-submodule::submodule file::baseline::common_values_and_names_by_path
        FAIL [  17.483s] gix-submodule::submodule file::is_active_platform::pathspecs_matter_even_if_they_do_not_match
        FAIL [  30.376s] gix-submodule::submodule file::is_active_platform::submodules_with_active_config_are_considered_active_or_inactive
        FAIL [  38.142s] gix-submodule::submodule file::is_active_platform::submodules_with_active_config_override_pathspecs
        FAIL [  22.849s] gix-submodule::submodule file::is_active_platform::without_any_additional_settings_all_are_inactive_if_they_have_a_url
        FAIL [   5.158s] gix-submodule::submodule file::is_active_platform::without_submodule_in_index
        FAIL [   0.804s] gix-worktree-state-tests::worktree state::checkout::allow_or_disallow_symlinks
        FAIL [   1.531s] gix-worktree-stream::stream from_tree::will_provide_all_information_and_respect_export_ignore
        FAIL [   0.218s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches
        FAIL [   0.117s] gix::gix revision::spec::from_bytes::regex::with_known_revision::contained_string_matches_in_unanchored_regex_and_disambiguates_automatically
error: test run failed

But most of those failures are those that occur without setting the MSYS environment variable and were reported in #1358. The new failures are the tests that failed above without GIX_TEST_IGNORE_ARCHIVES.

See this other gist for the full test output from this run without using pre-generated archives.

Not necessarily specific to ln -s

These results are not necessarily specific to ln -s. This does not necessarily produce the same effect as setting the environment variable only for ln -s calls, because the MSYS behavior it is affecting is not specifically the behavior of ln -s, but rather the behavior of any executable that links to msys-2.0.dll and makes use of the POSIX symlink call that it emulates. Some more details on this distinction are given in #1442.

Note that fully native executables, including git, do not link to msys-2.0.dll, so whether they attempt to create symlinks is unaffected by the MSYS environment variable. In the MSYS2 documentation, this is correlated with the "C library" column in this table.

Expected behavior 🤔

When ln -s creates symbolic links, that should not cause any tests to fail on any system. If other POSIX commands such as tar are also involved in this, the same applies: when such a command ordinarily would create symbolic links but may not always do so on Windows, its ability to do so on Windows should not break the tests.

If special-casing is in place to work around the difficulty creating symlinks on Windows by treating their presence as unexpected, then this disparity with other platforms should be eliminated when possible. If not possible, the special-casing should be made more robust, so it does not assume the absence of features that may be present in reasonable configurations on non-broken systems.

If commands such as ln -s are used in test fixtures, or invoked by other means in testing, in any situations where they are really not intended to create symbolic links, then either a different command should be used, or the necessary configuration change should be made, such as removing symlink-related options from the value of $MSYS when present. My guess, however, is that calls to ln -s in gitoxide's test suite are not deliberately intended to copy a file rather than create a symlink.

Git behavior

Not applicable.

Steps to reproduce 🕹

Test commands are all shown above. I ran gix clean -xde between tests.

The Windows 10 system I tested on has "developer mode" enabled, which permits me to create symlinks without UAC elevation; this is important to the results in that it demonstrates that the problem is not the inability of ln -s to create symlinks when it tries, but rather that these nine failures seem to be due to it actually creating them.

@EliahKagan EliahKagan changed the title 9 tests rely on commands like ln -s making copies instead of symlinks on Windows 9 tests rely on ln -s et al. making copies instead of symlinks on Windows Jul 5, 2024
@EliahKagan EliahKagan changed the title 9 tests rely on ln -s et al. making copies instead of symlinks on Windows 9 tests rely on commands like ln -s making copies instead of symlinks on Windows Jul 5, 2024
@Byron
Copy link
Owner

Byron commented Jul 5, 2024

Thanks a lot for reporting!

My guess, however, is that calls to ln -s in gitoxide's test suite are not deliberately intended to copy a file rather than create a symlink.

It's true, the intend is for it to actually create a symlink, even on Windows. It was pretty clear that this didn't happen though, so I started to rely on using archives instead. These are also known to not create symlinks on Windows, I think, and it's strange that MSYS=winsymlinks:nativestrict cargo nextest run --all --no-fail-fast shows any difference. That variable shouldn't affect anything at all as the untar of archives is done in a Rust crate which at least in theory should be able to create symlinks. It definitely shouldn't be affected by the MSYS environment variable. But apparently, it is, which I consider a feature even as I always wanted symlinks to be created on Windows as well.

9 failing tests

I looked into one of the failing tests, gix-archive/tests/archive.rs and noticed that it clearly assumed there are no symlinks on WIndows. Fixing this would be easy, and probably the other tests are tuned like this as well.
In theory, this behaviour should be consistent even with archives being ignored.

In theory, testing with archives should be the same as testing without, and I'd hope that this MSYS flag is the key to (nearly) getting there as it should allow for feature-parity between the platforms.

Testing without archives on Windows

There is a way to partition tests and only run a subset of them in a deterministic fashion. To keep Windows runtimes in check and not let them pass 25 minutes, one could use this with a controlling script that runs random, non-repeating partitions until the time runs out. The partitioning should help to over time run all tests, instead of running only the first X tests each time.

https://nexte.st/docs/ci-features/partitioning/

Maybe this is something you could establish once all tests are fixed.

@Byron Byron self-assigned this Jul 5, 2024
@Byron
Copy link
Owner

Byron commented Jul 5, 2024

I think the reason for these 9 tests failing is that they don't have archives to begin with, so the scripts are always executed and thus affected by MSYS settings.

@Byron Byron mentioned this issue Jul 5, 2024
2 tasks
@EliahKagan
Copy link
Contributor Author

I think the reason for these 9 tests failing is that they don't have archives to begin with

I'm relieved that the reason--even if not the needed fix--is straightforward.

It was pretty clear that this didn't happen though, so I started to rely on using archives instead.

It seems to me that pre-generated archives actually have an advantage beyond working around known platforms limitations and speeding up test runs: they provide some protection against unknown platform-specific problems that would cause fixtures to generate the wrong thing, especially if the tests are being run fairly regularly both with and without using the committed archives.

There is a way to partition tests and only run a subset of them in a deterministic fashion. To keep Windows runtimes in check and not let them pass 25 minutes, one could use this with a controlling script that runs random, non-repeating partitions until the time runs out.

I wonder if there is also a way to effectively compare the archives that get generated (though what kinds of variations are correct would differ across tests that use them).

Byron added a commit that referenced this issue Jul 5, 2024
This will only reliably work on with developer setups, but that
seems fair to assume.
If this causes problems, it's fine to make it opt-in as well.
@Byron
Copy link
Owner

Byron commented Jul 5, 2024

I wonder if there is also a way to effectively compare the archives that get generated (though what kinds of variations are correct would differ across tests that use them).

I think I forgot to mention that the idea would be to use nextest as runner but without using test archives to force actually executing the scripts to assure they work. Maybe, instead of complicating the thing with partitioning, the job could also be run but setup so nobody has to wait for it. Ideally, failures get communicated by email but otherwise are not blocking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants