GNU Grep 2.13 causes build failure in glib on ZFS #829

ryao · 2012-07-13T23:02:30Z

Building glib fails when GNU Grep 2.13 is used, and the failure involves sparse files. There is an open bug report for this in the Gentoo bug tracker:

https://bugs.gentoo.org/show_bug.cgi?id=425668

This behavior is triggered by the following commit to GNU Grep, and reverting it has been shown to prevent the problem:

http://git.savannah.gnu.org/cgit/grep.git/commit/?id=582cdfacf297181c2c5ffec83fd8a3c0f6562fc6

It does not occur on any other Linux filesystem. This might be related to bug #764, although I have no evidence for that. The only similarity is that the ZFS POSIX Layer is doing something strange.

dechamps · 2012-08-02T21:52:28Z

I don't see how this is a bug in ZFS. stat(2) on my system has the following declaration:

 blkcnt_t  st_blocks;  /* number of 512B blocks allocated */

Here is a comment taken from the grep commit:

If the file has fewer blocks than would be needed to represent its data, then it must have at least one hole.

That's just wrong. If the file is compressed by the file system, then it can be allocated using fewer blocks, but that doesn't mean it's a sparse file. The author of the grep commit is using st_blocks, which is meant as an informational hint, as proof that the file is sparse. That's a very, very audacious interpretation of the semantics of st_blocks.
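To make the objection concrete, here is a minimal Python sketch of the inference quoted from the commit comment. The function name `blocks_imply_hole` and the sample numbers are hypothetical illustrations, not code taken from grep. The only fact used from the thread is that st_blocks counts 512-byte units.

```python
ST_BLOCK_UNIT = 512  # st_blocks is counted in 512-byte units per stat(2)


def blocks_imply_hole(st_size, st_blocks):
    """Hypothetical restatement of the heuristic quoted above.

    Returns True when the allocated bytes cannot cover the file's length,
    which the commit comment takes as proof that the file has a hole.
    """
    return st_blocks * ST_BLOCK_UNIT < st_size


# Illustrative numbers only: a 1 MiB file with no holes whose contents a
# compressing filesystem managed to store in 128 KiB (256 units).
compressed_dense = blocks_imply_hole(st_size=1 << 20, st_blocks=256)

# The same file stored uncompressed occupies 2048 units of 512 bytes.
uncompressed_dense = blocks_imply_hole(st_size=1 << 20, st_blocks=2048)

print(compressed_dense, uncompressed_dense)
```

Under these assumed numbers the heuristic flags the compressed file as having a hole even though every byte of it is real data, which is the misclassification described above.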

Has this been discussed with the grep authors, or should someone send a message to their mailing list so that, hopefully, they realize that with complex filesystems like ZFS and btrfs, the number of blocks allocated for a file has little to do with its actual contents? Why should we make ZFS lie about the number of blocks it allocates for a file (which is useful information for the administrator: that's what makes du reliable) just because someone happened to misuse the field in one popular program?
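For reference, the two quantities being compared here can be read from a single stat call. This is a small sketch assuming a Unix-like system where Python exposes `st_blocks`; the helper name `apparent_and_allocated` is made up for illustration.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for path.

    The apparent size is st_size. The allocated figure is st_blocks
    converted from 512-byte units, which is the kind of number an
    administrator relies on when checking disk usage.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


# Write a small file and inspect both figures. The allocated value depends
# on the filesystem (block size, compression, delayed allocation), so it is
# printed rather than predicted.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 4096)
    f.flush()
    apparent, allocated = apparent_and_allocated(f.name)

print("apparent:", apparent, "allocated:", allocated)
```

The apparent size is determined by what was written, while the allocated figure is whatever the filesystem chooses to report, so the two can legitimately differ in either direction.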

You can break grep or du. Pick your poison.
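If a program really needs to know whether a file has holes, one alternative on systems that support it is to ask the kernel directly with `lseek` and `SEEK_HOLE` instead of inferring it from st_blocks. The sketch below is an assumption about how that could look in Python, not a description of what grep does or did. The helper name `has_hole` is hypothetical, and `os.SEEK_HOLE` is only available on some platforms.

```python
import os
import tempfile


def has_hole(path):
    """Return True or False if the OS can answer, or None if it cannot.

    Relies on lseek(fd, 0, SEEK_HOLE) returning the offset of the first
    hole. A file without holes reports only the implicit hole at end of
    file, so the returned offset equals the file size.
    """
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_hole is None:
        return None  # platform does not expose SEEK_HOLE
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return False  # an empty file has no data and no interior hole
        try:
            first_hole = os.lseek(fd, 0, seek_hole)
        except OSError:
            return None  # filesystem rejected the request
        return first_hole < size
    finally:
        os.close(fd)


# A file written densely with non-zero bytes should not report a hole.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 8192)
    f.flush()
    dense_result = has_hole(f.name)

print(dense_result)
```

This leaves st_blocks free to report what is actually allocated, so neither grep nor du has to be broken.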

behlendorf · 2012-08-03T02:31:33Z

Indeed this is an issue with grep.

I don't believe anyone has brought this to their attention yet. We should, since they're going to have to make the fix.

ryao · 2012-08-03T02:34:33Z

I believe that they are already aware:

http://git.savannah.gnu.org/cgit/grep.git/commit/?id=2f0255e9f4cc5cc8bd619d1f217902eb29b30bc2

I have asked people who are affected to test the upstream patch, but so far, no one has replied.

Also, I have received reports that this issue has been reproduced on btrfs (under certain kernel versions) and NFS.

nedbass · 2012-08-21T20:46:56Z

I noticed grep 2.14 was released yesterday with a fix for this bug.

behlendorf · 2012-08-21T22:10:39Z

Yay. OK, we'll treat this issue as documentation that grep 2.13 should be avoided, and simply advise people to update to the fixed version as it appears in their distributions.

If there is an unclean shutdown (e.g. kernel panic or power loss) while a txg is in progress, the agent may have created some objects for the in-progress txg, which are no longer needed. These objects are destroyed when the agent next starts, so that they are not leaked. The code that implements this, `PoolState::cleanup_data_objects()`, assumes that the last object written is `ObjectVersion(None)`, i.e. the reclaim code has not rewritten the last object. Unfortunately, this assumption is not true, which leads to the deletion of the last object, which is still in use. The bug can be triggered on any agent restart, even if the system did not reboot. This PR addresses the issue by taking into account the ObjectVersion when determining the last-valid object’s key. The problem was introduced by openzfs#668. The fix is simple because the version suffix is `-#`, rather than `/#` which was used earlier in the development of openzfs#668.

behlendorf mentioned this issue Aug 7, 2012

Fedora 17: cannot compile anymore openzfs/spl#141

Closed

behlendorf closed this as completed Aug 21, 2012
