do *not* enforce UTF-8 #306

Closed
wants to merge 1 commit into from

anarcat
Contributor

@anarcat anarcat commented May 18, 2022

i have tried to convert a real, live workstation to ZFS and had to recreate all filesystems because of this silly setting. It actually failed to copy source code from the supysonic project (which was eventually removed, but still). I don't think we should suggest such an advanced setting by default and, in general, I find that this guide suggests too many exotic things, instead of focusing on "just install the thing with ZFS".
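
As a rough illustration of the failure described above: a dataset created with utf8only=on rejects file names that are not valid UTF-8, so a copy onto it stops at the first such name. The sketch below lists those names in a source tree before a migration. It is not part of this PR; `find_non_utf8_names` is a hypothetical helper, and Python's strict UTF-8 decoder is only assumed to be a close stand-in for the check ZFS performs.

```python
import os


def find_non_utf8_names(root):
    """Return paths (as bytes) under root whose last component is not valid UTF-8."""
    offenders = []
    # Walking a bytes path makes os.walk yield raw bytes names,
    # so nothing is decoded or repaired before we inspect it.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")  # strict decoding: raises on invalid bytes
            except UnicodeDecodeError:
                offenders.append(os.path.join(dirpath, name))
    return offenders
```

Calling `find_non_utf8_names("/home")` on the old filesystem would return the paths to rename (or exclude) before copying to a utf8only dataset; an empty list suggests the copy should not trip over this setting.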

@rlaager rlaager self-assigned this May 18, 2022
@rlaager
Member

rlaager commented May 18, 2022

> in general, I find that this guide suggests too many exotic things, instead of focusing on "just install the thing with ZFS".

Can you point to any others? There definitely were some of those in the past.

@anarcat anarcat changed the title do *not* enforce UTF-8 Draft: do *not* enforce UTF-8 May 18, 2022
@anarcat anarcat changed the title Draft: do *not* enforce UTF-8 WIP: do *not* enforce UTF-8 May 18, 2022
@anarcat
Contributor Author

anarcat commented May 18, 2022

> > in general, I find that this guide suggests too many exotic things, instead of focusing on "just install the thing with ZFS".
>
> Can you point to any others? There definitely were some of those in the past.

  • the bpool has a lot of settings, not sure they are all relevant
  • -O encryption=aes-256-gcm is the default, AFAIK
  • dnodesize=auto and the xattr stuff are optimisations, maybe they could be left out? or at least regrouped?
  • there are also a lot of mountpoints created, it's a bit excessive

that's what i can think of off the top of my head.

also, i noticed other occurrences of normalization=formD that this PR doesn't patch, so i marked it as WIP for now.
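
For readers unfamiliar with the setting: as I understand the ZFS property documentation, normalization=formD makes the filesystem compare file names after Unicode normalization to form D (names are still stored as given), and setting it turns utf8only on unless told otherwise, which is why this PR is titled around UTF-8. The sketch below only illustrates what form D comparison means, using Python's `unicodedata`; `same_name_under_form_d` is a hypothetical helper, not ZFS code.

```python
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent


def same_name_under_form_d(a, b):
    """Compare two names after normalizing both to NFD (canonical decomposition)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)
```

The two strings differ byte for byte, yet `same_name_under_form_d(composed, decomposed)` is true, so with form D comparison they would count as the same file name.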

@rlaager
Member

rlaager commented May 19, 2022

> • the bpool has a lot of settings, not sure they are all relevant

The features settings on the bpool can be simplified with compatibility=grub2 once that feature lands in the distros. I've made a note about this.

Though in looking at this, I probably should submit a PR to expand the grub2 features list.

> • -O encryption=aes-256-gcm is the default, AFAIK

It didn't use to be. That said, encryption=on isn't that much simpler. It's still a setting to set. It's arguably useful to use the explicit algorithm name so people can see what it is using and change it if they want. But I can change this to encryption=on.

> • dnodesize=auto and the xattr stuff are optimisations, maybe they could be left out? or at least regrouped?

I don't know what you mean here. xattrs are enabled by default on modern systems (because systemd wants them), and we should store them using the more efficient option, right? Setting dnodesize=auto is required to actually take advantage of the large_dnode feature; the default is legacy.

> • there are also a lot of mountpoints created, it's a bit excessive

Unfortunately, the FHS intermingles system and user data. The trade-off here is technical correctness vs simplicity.

The list expanded somewhat when Ubuntu added support in the installer and wrote zsys (which is now basically dead). I spent a lot of time discussing this with Canonical. They said they had a lot of experience with this from efforts to make Ubuntu run on mobile devices with read-only root filesystems.

Many of them are optional, if you're not using the thing. Now that I'm moving away from zsys, I should perhaps separate those out (like I used to).

rlaager added a commit that referenced this pull request May 19, 2022
encryption=aes-256-gcm is now the default.

anarcat mentioned this in PR #306.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
@anarcat
Contributor Author

anarcat commented May 19, 2022 via email

I have tried to convert a real, live workstation to ZFS and had to
recreate all filesystems because of this silly setting. It actually
failed to copy [source code from the supysonic project][1] (which was
[eventually removed][2], but still).

This commit addresses *all* such incantations I could find, but keeps a
reference to it in the **Hints** section or, if it's not present,
the **Notes** section, mentioning it was removed and why. This is so
people *can* still add it if they want.

 [1]: https://github.com/spl0k/supysonic/tree/270fa9883b2f2bc98f1482a68f7d9022017af50b/tests/assets/%E
 [2]: spl0k/supysonic#183
@anarcat
Contributor Author

anarcat commented May 19, 2022

i rerolled this to include all known incantations of normalization=, this is now ready for review.

@anarcat anarcat changed the title WIP: do *not* enforce UTF-8 do *not* enforce UTF-8 May 19, 2022
@ghost

ghost commented May 19, 2022

I'm the maintainer of the Arch, NixOS, RedHat and Fedora guides (derived partially from the Debian guide) and I cannot agree with the premise of this pull request.

The blog post explicitly deals with existing installations, so it does not apply to the guides, which deal with new installations, and should therefore be removed from them. New installations should enforce UTF-8; see below.

UTF-8 remains desirable because of interoperability. The fact that the problematic file path mentioned in the original description was eventually removed speaks volumes. In that pull request, the malformed UTF-8 string:

  • caused scripts to break on Debian
  • broke macOS (which ZFS also aims to support)

Therefore, I think using UTF-8 is a good default for most users.
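
The thread does not say how those scripts failed. As one illustration of the general failure mode, assuming a Python script that has to print or re-encode file names, the sketch below shows a name that is not valid UTF-8 surviving decoding (via the `surrogateescape` error handler that Python's `os` layer uses) and then failing as soon as it must be written out as strict UTF-8. `to_text` and `can_emit_as_utf8` are hypothetical helpers for this example only.

```python
def to_text(raw_name):
    """Decode a raw file name, smuggling invalid bytes through as lone surrogates."""
    return raw_name.decode("utf-8", "surrogateescape")


def can_emit_as_utf8(name):
    """Return True if a decoded name can be written out as strict UTF-8."""
    try:
        name.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# 0xE9 is "é" in Latin-1 but is not a valid sequence on its own in UTF-8.
latin1_name = to_text(b"caf\xe9.txt")
utf8_name = to_text("café.txt".encode("utf-8"))
```

Here `can_emit_as_utf8(utf8_name)` is true and `can_emit_as_utf8(latin1_name)` is false: any step that logs, serializes, or transmits the second name as UTF-8 raises unless it handles the case explicitly.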

If you insist on having malformed paths in your filesystem, you can simply adjust the guide to your own needs, or create a disk image and experiment on that instead.

We are not enforcing the sensible UTF-8 default on anybody -- we only inform potential users of sensible defaults and document the most useful options. It is up to you to decide whether to use the defaults or not.

The UTF-8 option is also a particularly thorny one, in that it must be decided at pool creation time and cannot be changed later.

@rlaager
Member

rlaager commented May 20, 2022

I'm going to consider this "wontfix". I understand that utf8only is not without controversy. But it's been the default in the guide for quite some time and it hasn't generated significant complaints. The Ubuntu developers didn't object to this with the installer work and it apparently hasn't generated complaints for them either. With each passing year, non-UTF-8 encoded filenames are less and less likely. I do point out the other side of this, so people can make their own choices.

@rlaager rlaager closed this May 20, 2022
rlaager added a commit that referenced this pull request May 20, 2022
This reduces the number of features I need to enable explicitly.

As discussed in #306 and #307.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
rlaager added a commit that referenced this pull request May 20, 2022
The major change for Ubuntu is to mark many of these as optional, like
with Debian and the old guides.  As I move away from zsys, this will end
up being more like the old way.

This was also discussed in #306 and #307, but this change has
trade-offs.  It can reduce the number of datasets created on the
system, but it does so by increasing the complexity to read and follow
the guide.

Then I just harmonized Debian with Ubuntu.  Aside from whitespace,
reordering, rewording, the substantive changes were to drop /opt and
add /var/lib/NetworkManager.

Signed-off-by: Richard Laager <rlaager@wiktel.com>