Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix repository name collision #1596

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ink-splatters
Copy link

Two versions of tests/demo/doc/mañana.txt existed in the repository at the same time, each of them with unique filename.

This was possible because of git ill-formed approach utilizing filesystem-dependent filename representations by default.

It's likely that the mentioned file was added twice: both on Linux and macOS, and due to filename encoding differences between the systems, git permitted it.

This PR fixes #1595 by:

  • first removing the affected file
  • changing faulty git defaults ( basically - enabling UTF-8 support)
  • re-adding the file

at the same time, each of them with unique filename which led
to naming collision.

TL;DR: because of file names encoded differently on macOS and Linux by
default, files with umlauts and similar marks, added both on Linux and macOS
at the same path may result in naming collision.

The proposed fix is to enable Unicode precomposition (basically UTF-8
support) in macOS version of git, and disable `quotePath`:

```
git config --global core.precomposeunicode true
git config --global core.quotePath false
```

then re-add files of the concern:

```
git rm --cached ...
git add
```

https://stackoverflow.com/questions/34549040/git-not-displaying-unicode-file-names

	deleted:    "man\314\203ana.txt"
	deleted:    "ma\303\261ana.txt"
Copy link

codecov bot commented Apr 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (master@ae49a49). Click here to learn what that means.
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #1596   +/-   ##
=========================================
  Coverage          ?   97.28%           
=========================================
  Files             ?       48           
  Lines             ?     4607           
  Branches          ?        0           
=========================================
  Hits              ?     4482           
  Misses            ?      125           
  Partials          ?        0           
Flag Coverage Δ
unittests 97.28% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@yajo
Copy link
Member

yajo commented Apr 20, 2024

Tbh I never liked having that file around. The file is used in a couple of tests:

@pytest.mark.xfail(
condition=platform.system() in {"Darwin", "Windows"},
reason="OS without proper UTF-8 filesystem.",
strict=True,
)
def test_path_filter(tmp_path_factory: pytest.TempPathFactory) -> None:
src, dst = map(tmp_path_factory.mktemp, ("src", "dst"))
file_excluded: Mapping[StrOrPath, bool] = {
"x.exclude": True,
"do_not.exclude!": False,
# dir patterns and their negations
Path("exclude_dir", "x"): True,
Path("exclude_dir", "please_copy_me"): False, # no mercy
Path("not_exclude_dir", "x!"): False,
# unicode patterns
"mañana.txt": True,
"mañana.txt": False,
"manana.txt": False,
}

assert_file(tmp_path, "doc", "mañana.txt")

copier/tests/test_copy.py

Lines 234 to 246 in 69662e3

@pytest.mark.xfail(
condition=platform.system() == "Darwin",
reason="Mac claims to use UTF-8 filesystem, but behaves differently.",
strict=True,
)
def test_exclude_file(tmp_path: Path) -> None:
print(f"Filesystem encoding is {sys.getfilesystemencoding()}")
# This file name is b"man\xcc\x83ana.txt".decode()
render(tmp_path, exclude=["mañana.txt"])
assert not (tmp_path / "doc" / "mañana.txt").exists()
# This file name is b"ma\xc3\xb1ana.txt".decode()
assert (tmp_path / "doc" / "mañana.txt").exists()
assert (tmp_path / "doc" / "manana.txt").exists()

The tests were skipped on non-linux OS.

I think we should refactor those tests and create the src template in a tmp dir during the test using build_file_tree.

Do you think you could do that?

@yajo
Copy link
Member

yajo commented Apr 20, 2024

The most weird thing is that tests still pass, so I guess they were not doing what I thought 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Name collision which renders repo formally broken and causes several side effects
2 participants