GNU Grep 2.13 causes build failure in glib on ZFS #829

Closed
ryao opened this issue Jul 13, 2012 · 5 comments
Labels
Type: Documentation Indicates a requested change to the documentation

Comments

@ryao
Contributor

ryao commented Jul 13, 2012

A failure involving sparse files occurs when building glib when using GNU Grep 2.13. There is an open bug report on this in the Gentoo bug tracker:

https://bugs.gentoo.org/show_bug.cgi?id=425668

This behavior is triggered by the following commit to GNU Grep and reverting it has been shown to prevent the problem:

http://git.savannah.gnu.org/cgit/grep.git/commit/?id=582cdfacf297181c2c5ffec83fd8a3c0f6562fc6

It does not occur on any other Linux filesystem. This might be related to bug #764, although I have no evidence for that. The only similarity is that the ZFS POSIX Layer is doing something strange.

@dechamps
Contributor

dechamps commented Aug 2, 2012

I don't see how this is a bug in ZFS. stat(2) on my system has the following declaration:

 blkcnt_t  st_blocks;  /* number of 512B blocks allocated */

Here is a comment taken from the grep commit:

If the file has fewer blocks than would be needed to represent its data, then it must have at least one hole.

That's just wrong. If the file is compressed by the file system, it can be allocated using fewer blocks, but that doesn't make it a sparse file. The author of the grep commit is treating st_blocks, which is meant as an informational hint, as proof that the file is sparse. That's a very, very audacious interpretation of the semantics of st_blocks.
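To make the objection concrete, here is a minimal Python sketch of the kind of comparison described in the quoted comment. It is not the grep code itself; the function names and the block counts in the example are hypothetical, chosen only to show how a compressed file with no holes can satisfy the same test as a sparse one.

```python
import os

BLOCK_UNIT = 512  # st_blocks is counted in 512-byte units, per stat(2)


def heuristic_says_sparse(st_size: int, st_blocks: int) -> bool:
    """Return True when fewer bytes are allocated than the file's length."""
    return st_blocks * BLOCK_UNIT < st_size


def check_path(path: str) -> bool:
    """Apply the same comparison to a real file via os.stat()."""
    st = os.stat(path)
    return heuristic_says_sparse(st.st_size, st.st_blocks)


# Hypothetical numbers: a 1 MiB file of compressible text that a compressing
# filesystem stores in 128 KiB (256 blocks). The file has no holes.
print(heuristic_says_sparse(1024 * 1024, 256))   # True: a false positive

# The same 1 MiB file with every byte allocated (2048 blocks).
print(heuristic_says_sparse(1024 * 1024, 2048))  # False
```

The comparison cannot tell "blocks missing because of holes" apart from "blocks missing because the data was compressed", which is the ambiguity pointed out above.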

Has this been discussed with the grep authors, or should someone send a mail to their mailing list so that, hopefully, they realize that with complex filesystems like ZFS and btrfs, the number of blocks allocated for a file has little to do with its actual contents? Why should we make ZFS lie about the number of blocks it allocates for a file (which is useful information for the administrator: that's what makes du reliable) just because someone happened to misuse the field in one popular program?

You can break grep or du. Pick your poison.
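For comparison, a program that really needs to know whether a file has holes can ask the kernel directly instead of inferring it from st_blocks. The sketch below uses lseek(2) with SEEK_HOLE through Python's os module. This is an illustration of one alternative, not a description of what grep does; the helper name `has_hole` is hypothetical, and SEEK_HOLE support depends on the platform and filesystem.

```python
import os
import tempfile


def has_hole(path):
    """Return True if a hole starts before end of file, False if not,
    or None when SEEK_HOLE is unavailable or rejected."""
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_hole is None:
        return None
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return False
        try:
            # Offset of the first hole at or after offset 0. Every file has
            # an implicit hole at EOF, so a file without holes reports `size`.
            first_hole = os.lseek(fd, 0, seek_hole)
        except OSError:
            return None
        return first_hole < size
    finally:
        os.close(fd)


# Usage: a small file written densely with non-zero bytes.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 8192)
    f.flush()
    dense_result = has_hole(f.name)
print(dense_result)
```

Unlike the block-count comparison, this answer does not change when the filesystem compresses the data, and st_blocks can keep reporting the real allocation for du.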

@behlendorf
Contributor

Indeed this is an issue with grep.

I don't believe anyone has brought this to their attention yet. We should, since they're going to have to make the fix.

@ryao
Contributor Author

ryao commented Aug 3, 2012

I believe that they are already aware:

http://git.savannah.gnu.org/cgit/grep.git/commit/?id=2f0255e9f4cc5cc8bd619d1f217902eb29b30bc2

I have asked people who are affected to test the upstream patch, but so far, no one has replied.

Also, I have received reports that this issue has been reproduced on btrfs (under certain kernel versions) and NFS.

@nedbass
Contributor

nedbass commented Aug 21, 2012

I noticed grep 2.14 was released yesterday with a fix for this bug.

@behlendorf
Contributor

Yay. OK, we'll chalk this one up to documentation: grep 2.13 should be avoided, and we'll simply advise people to update once grep 2.14 appears in their distributions.
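For anyone who wants to check a build host before compiling glib, a small Python sketch along these lines could flag the affected release. It assumes the first line of `grep --version` contains the version number in the usual GNU form, such as `grep (GNU grep) 2.13`; the function names are hypothetical.

```python
import re
import subprocess

AFFECTED = (2, 13)  # the release this issue advises avoiding


def parse_grep_version(version_output):
    """Extract (major, minor) from the first line of `grep --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def installed_grep_is_affected():
    """Run the local grep and compare its version with the affected one."""
    result = subprocess.run(
        ["grep", "--version"], capture_output=True, text=True, check=False
    )
    return parse_grep_version(result.stdout) == AFFECTED


print(parse_grep_version("grep (GNU grep) 2.13") == AFFECTED)  # True
print(parse_grep_version("grep (GNU grep) 2.14") == AFFECTED)  # False
```

Non-GNU grep implementations may print something different or nothing at all, in which case `parse_grep_version` returns `None` and the host is treated as not affected.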

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023
If there is an unclean shutdown (e.g. kernel panic or power loss) while
a txg is in progress, the agent may have created some objects for the
in-progress txg, which are no longer needed.  These objects are
destroyed when the agent next starts, so that they are not leaked.  The
code that implements this, `PoolState::cleanup_data_objects()`, assumes
that the last object written is `ObjectVersion(None)`, i.e. the reclaim
code has not rewritten the last object.

Unfortunately, this assumption is not true, which leads to the deletion
of the last object, which is still in use.  The bug can be triggered on
any agent restart, even if the system did not reboot.

This PR addresses the issue by taking into account the ObjectVersion
when determining the last-valid object’s key.

The problem was introduced by openzfs#668.  The fix is simple because the
version suffix is `-#`, rather than `/#` which was used earlier in the
development of openzfs#668.

4 participants