Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[local] Refuse to create lakectl locals on case-insensitive filesystems #7650

Merged

Conversation

arielshaqed
Copy link
Contributor

Of course, allow it anyway if --forced to.

Fixes #7645.

Still need to add it to `lakectl local init`.

To check this (on Linux):

```sh
dd if=/dev/zero of=/tmp/my-loopback bs=1k count=10000
sudo losetup -f /tmp/my-loopback
losetup --all
mkfs -t ext4 -O casefold -E encoding_flag=strict /dev/loop11
mkdir /tmp/ifs; sudo mount /dev/loop11 /tmp/ifs/

mkdir /tmp/ifs/dir
chattr +F /tmp/ifs/dir

```

Obviously this won't really work in a CI/CD container, where there are
externalities that we do not control.
Copy link

github-actions bot commented Apr 10, 2024

🎊 PR Preview 0a737f7 has been successfully built and deployed to https://treeverse-lakeFS-preview-pr-7650.surge.sh

🕐 Build time: 0.011s

🤖 By surge-preview

Copy link

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed

@arielshaqed
Copy link
Contributor Author

arielshaqed commented Apr 10, 2024

How was this tested?

Unit tests are easy, and included. But for something that depends on the OS interface they are not enough. A system check is also needed!

I managed to perform a manual system check for acceptance. This is hard because developer machines should NOT have case-insensitive filesystems -- precisely for the reasons that we have this PR!

Luckily ext4 (and, apparently, ext2 and ext3) can be configured to allow case-insensitive directories. To set it up on Linux:

# Create a loopback file
dd if=/dev/zero of=/tmp/my-loopback bs=1k count=10000
# Create a loopback device
sudo losetup -f /tmp/my-loopback
# Get the name of the created device (I got /dev/loop11)
losetup --all
# Format an ext4 filesystem
sudo mkfs -t ext4 -O casefold -E encoding_flags=strict /dev/loop11
# Mount it
mkdir /tmp/ifs; sudo mount /dev/loop11 /tmp/ifs/
# (Give permissions...)

# Create a case-insensitive directory
mkdir /tmp/ifs/dir
chattr +F /tmp/ifs/dir

# Now use /tmp/ifs/dir for testing...

So I did this and verified that I get an error message on a case-insensitive directory under /tmp/ifs/dir, and that I can --force it into a warning but still succeed.

This is an acceptance check not a test!

Obviously this won't really work in a CI/CD container: there are
externalities that we cannot control. Examples:

  • Might be running on a Linux system whose ext* filesystem was compiled without casefold support.
  • Might be running on a Linux system with an non-ext* filesystem.
  • Might not have permissions to perform these actions.

All of these issues can change between GitHub runner releases, and none of them can be fixed from inside a test container - they all manifest at the VM level.

OTOH, this is an esoteric setup.

@arielshaqed arielshaqed requested a review from N-o-Z April 10, 2024 06:30
@arielshaqed arielshaqed added bug Something isn't working area/UI Improvements or additions to UI area/lakectl Issues related to lakeFS' command line interface (lakectl) include-changelog PR description should be included in next release changelog lakectl-local labels Apr 10, 2024
Copy link
Member

@N-o-Z N-o-Z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I don't have a lot of comments on the code itself.
However, I feel we need to understand the implications of this change. Blocking since this will break users who are currently using lakectl local on FSs that are case insensitive. I agree that his is a bug that needs to be fixed but I believe we must first communicate it to all the relevant stakeholders

@@ -107,5 +110,6 @@ var localCloneCmd = &cobra.Command{
func init() {
withGitIgnoreFlag(localCloneCmd)
withSyncFlags(localCloneCmd)
withForceFlag(localCloneCmd, "Force clone even on case-insensitive filesystem")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a "force" flag that does something differently.
Can we please rename it to something like --ignore-case-insensitive or something of the sorts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to --allow-case-insensitive.

@arielshaqed
Copy link
Contributor Author

Thanks, I don't have a lot of comments on the code itself. However, I feel we need to understand the implications of this change. Blocking since this will break users who are currently using lakectl local on FSs that are case insensitive. I agree that his is a bug that needs to be fixed but I believe we must first communicate it to all the relevant stakeholders

I disagree, asking @ozkatz for guidance please.

  1. We do not block users of lakectl local with a case-insensitive filesystem: they will get an error message when they try to create a new local, and the error message tells them to set a flag.
    • Would you be happier to set this on a Viper configuration? Then users could set it once and for all.
  2. Users of lakectl local who are already on a case-insensitive filesystem are playing with fire!

Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Will wait for additional comments about desire UX here.

@@ -107,5 +110,6 @@ var localCloneCmd = &cobra.Command{
func init() {
withGitIgnoreFlag(localCloneCmd)
withSyncFlags(localCloneCmd)
withForceFlag(localCloneCmd, "Force clone even on case-insensitive filesystem")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to --allow-case-insensitive.

@arielshaqed arielshaqed force-pushed the bug/7645-report-case-insensitive-filename-in-lakectl-local branch from 818b1f4 to 0a737f7 Compare April 14, 2024 15:54
@talSofer
Copy link
Contributor

@arielshaqed Looking into the solution we chose, I’d like to suggest that we soften it a bit. Instead of dying by default on inits that involve dirs on case-insensitive FSs, print a warning and document this very well in our public docs and lakectl local best practices.
The reasoning behind my suggestion is that the current version of the fix means that most macOS and Windows users won’t be able to use lakectl local without forcing it (because the default FSs on both systems are case-insensitive). In addition, this is consistent with how Git works, and users can mitigate the problem by following file naming conventions.

- Don't break existing behaviour
- Stay compatible with Git, which does even less
- Don't tell off Windows and MacOS default filesystems
@arielshaqed
Copy link
Contributor Author

Thanks, @talSofer & @N-o-Z . I downgraded everything only to warn.

PTAL.

@arielshaqed arielshaqed requested a review from N-o-Z April 15, 2024 10:11
Copy link
Member

@N-o-Z N-o-Z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks,
I have one request regarding the IsCaseInsensitiveLocation function

Warning(fmt.Sprintf(CaseInsensitiveWarningMessageFormat, path))
}
if err != nil {
Warning(fmt.Sprintf("Check whether directory '%s' is case-insensitive: %s", path, err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Warning(fmt.Sprintf("Check whether directory '%s' is case-insensitive: %s", path, err))
Warning(fmt.Sprintf("Check whether directory '%s' is case-insensitive failed: %s", path, err))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not done. The word "failed" does nothing to help with a real error message. Compare

Check whether directory './foo' is case-insensitive: Permission denied.

with

Check whether directory './foo' is case-insensitive failed: Permission denied.

Bash does the same thing:

echo foo > x
bash: x: Permission denied

So does every other UN*Xy program:

echo foo | tee x
tee: x: Permission denied

@@ -116,6 +119,17 @@ Use "%s checkout..." to sync with the remote or run "lakectl local clone..." wit
}
}

func warnOnCaseInsensitiveDirectory(path string) {
isCaseInsensitive, err := fileutil.IsCaseInsensitiveLocation(fileutil.OSFS{}, path)
if isCaseInsensitive {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to switch the ifs. I know it doesn't really matter in this case but it still hurts my eyes :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -39,6 +39,8 @@ var localCloneCmd = &cobra.Command{
DieFmt("directory '%s' exists and is not empty", localPath)
}

warnOnCaseInsensitiveDirectory(localPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just better to add this check to the local root command?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not done.

Given that we plan to allow Windows and MacOS users to work like this, I will not warn them about something that they can do nothing about it and that we want to support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here. The issue is relevant to any local command we perform, so why add it only in 2 sub commands?
Also, not maintainable, the next command we add which might perform a sync we will most likely forget to add this check

return false, fmt.Errorf("touch %s: %w", path, err)
}
defer func() {
_ = fs.Remove(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to find a solution to the problem where remove fails as it affects the locally synced directory. The file will be synced to remote and the user is not aware of it.
I understand we don't want to use tmpfile here because the tmp folder might be on a different FS. But I think we need a solution for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is literally not possible to fix. OTOH it is also not possible to fail here unless a process with user privileges is actively trying to sabotage the user. I could return an error and panic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed f2f: print an warning message.

Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Not sure which of the changes is requested; unfortunately I object to many of them, and am not sure what to do about one.

I suggest panicking when we cannot Remove the test file. Would that resolve your requested changes?

@@ -116,6 +119,17 @@ Use "%s checkout..." to sync with the remote or run "lakectl local clone..." wit
}
}

func warnOnCaseInsensitiveDirectory(path string) {
isCaseInsensitive, err := fileutil.IsCaseInsensitiveLocation(fileutil.OSFS{}, path)
if isCaseInsensitive {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Warning(fmt.Sprintf(CaseInsensitiveWarningMessageFormat, path))
}
if err != nil {
Warning(fmt.Sprintf("Check whether directory '%s' is case-insensitive: %s", path, err))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not done. The word "failed" does nothing to help with a real error message. Compare

Check whether directory './foo' is case-insensitive: Permission denied.

with

Check whether directory './foo' is case-insensitive failed: Permission denied.

Bash does the same thing:

echo foo > x
bash: x: Permission denied

So does every other UN*Xy program:

echo foo | tee x
tee: x: Permission denied

@@ -39,6 +39,8 @@ var localCloneCmd = &cobra.Command{
DieFmt("directory '%s' exists and is not empty", localPath)
}

warnOnCaseInsensitiveDirectory(localPath)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not done.

Given that we plan to allow Windows and MacOS users to work like this, I will not warn them about something that they can do nothing about it and that we want to support.

return false, fmt.Errorf("touch %s: %w", path, err)
}
defer func() {
_ = fs.Remove(path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is literally not possible to fix. OTOH it is also not possible to fail here unless a process with user privileges is actively trying to sabotage the user. I could return an error and panic.

@arielshaqed arielshaqed requested a review from N-o-Z April 15, 2024 13:49
Warn first if we failed to detect.  This will always produce the same
output, and is easier for readers.
@N-o-Z
Copy link
Member

N-o-Z commented Apr 15, 2024

Thanks!

Not sure which of the changes is requested; unfortunately I object to many of them, and am not sure what to do about one.

I suggest panicking when we cannot Remove the test file. Would that resolve your requested changes?

I think panicking is a bit drastic, but I do see this bug coming back to us sooner than later. Can we at least issue a an error or warning message in the defer function if the remove fails?

@arielshaqed arielshaqed force-pushed the bug/7645-report-case-insensitive-filename-in-lakectl-local branch from 2dd3bf2 to f6fc38f Compare April 17, 2024 12:32
Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks; PTAL!

return false, fmt.Errorf("touch %s: %w", path, err)
}
defer func() {
_ = fs.Remove(path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed f2f: print an warning message.

Copy link
Member

@N-o-Z N-o-Z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!!!

@arielshaqed arielshaqed force-pushed the bug/7645-report-case-insensitive-filename-in-lakectl-local branch from f6fc38f to d0c1821 Compare April 17, 2024 12:47
@arielshaqed
Copy link
Contributor Author

Thanks!

@arielshaqed arielshaqed merged commit 173d9de into master Apr 17, 2024
35 checks passed
@arielshaqed arielshaqed deleted the bug/7645-report-case-insensitive-filename-in-lakectl-local branch April 17, 2024 14:36
emulatorchen pushed a commit to emulatorchen/lakeFS that referenced this pull request May 27, 2024
…ms (treeverse#7650)

* Add case-insensitive filesystem checker

* Add case-insensitive check to `lakectl local clone`

Still need to add it to `lakectl local init`.

To check this (on Linux):

```sh
dd if=/dev/zero of=/tmp/my-loopback bs=1k count=10000
sudo losetup -f /tmp/my-loopback
losetup --all
mkfs -t ext4 -O casefold -E encoding_flag=strict /dev/loop11
mkdir /tmp/ifs; sudo mount /dev/loop11 /tmp/ifs/

mkdir /tmp/ifs/dir
chattr +F /tmp/ifs/dir

```

Obviously this won't really work in a CI/CD container, where there are
externalities that we do not control.

* Mention that Git also fails on a case-insensitive filesystem

* [lint] Drop unneeded "_" receiver

* [CR] Rename --force to --allow-case-insensitive

* [CR] *Only* warn on case-insensitive filesystem, never fail

- Don't break existing behaviour
- Stay compatible with Git, which does even less
- Don't tell off Windows and MacOS default filesystems

* [CR] Change order of warning messages for case insensitivity

Warn first if we failed to detect.  This will always produce the same
output, and is easier for readers.

* [CR] Warn when failed to file used to test case insensitivity
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/lakectl Issues related to lakeFS' command line interface (lakectl) area/UI Improvements or additions to UI bug Something isn't working include-changelog PR description should be included in next release changelog lakectl-local
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Detect case-insensitive filesystem during lakectl local {init,checkout}
3 participants