Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FIEMAP support #7545

Closed
wants to merge 1 commit into from
Closed

Add FIEMAP support #7545

wants to merge 1 commit into from

Conversation

behlendorf
Copy link
Contributor

Description

The FIEMAP ioctl is the standard Linux user space mechanism for inspecting the physical layout of a file on disk. The ioctl returns a list of extents each of which describes a region of the file. Compatible blocks are merged in to larger extents when they are physically contiguous and have identical flags.

The following per-extent flags are supported by ZFS.

FIEMAP_EXTENT_LAST           - The last extent in the mapping
FIEMAP_EXTENT_UNKNOWN        - Set on gang blocks
FIEMAP_EXTENT_DELALLOC       - Dirty extents not yet written
FIEMAP_EXTENT_ENCODED        - Set on compressed extents
FIEMAP_EXTENT_DATA_ENCRYPTED - Set on encrypted extents
FIEMAP_EXTENT_NOT_ALIGNED    - Extent is not block aligned
FIEMAP_EXTENT_DATA_INLINE    - Set on embedded block pointers
FIEMAP_EXTENT_UNWRITTEN      - Set on holes (normally not reported)
FIEMAP_EXTENT_MERGED         - Multiple block pointers were merged
FIEMAP_EXTENT_SHARED         - Set on deduplicated extents

The following flags are supported when requesting extents. Note that *_COPIES, *_NOMERGE, and *_HOLE are currently specific to ZFS but are applicable to other Linux file systems.

FIEMAP_FLAG_SYNC             - Sync the file first
FIEMAP_FLAG_COPIES           - Include all copies of an extent
FIEMAP_FLAG_NOMERGE          - Don't merge blocks in to extents
FIEMAP_FLAG_HOLES            - Include unwritten holes as an extent

Finally, the following reserved fields in the fiemap_extent structure have been utilized and should be reserved to prevent future conflicts.

.fe_reserved[0]   - Unique device ID; top-level VDEV ID for ZFS
.fe_reserved64[0] - Physical length of an extent

Future work:

  • The FIEMAP_FLAG_XATTR flag is not supported. Implementing this should be relatively straight forward if it is needed. This would be a handy way to determine if the xattrs have been stored in the dnode, spill block, or an external object.

  • The FIEMAP_FLAG_CACHE flag is not supported. We rely on the ARC to keep the right blocks cached.

  • The lseek(2) SEEK_HOLE and SEEK_DATA flags still rely on the dmu_offset_next() function. The zpl_llseek() function can be updated to use FIEMAP to quickly determine the next extent. This has the advantage of correctly accounting for newly dirtied or freed blocks without forcing a txg sync.

The FIEMAP interface is fully described at:

Motivation and Context

FIEMAP is a fundamental Linux interface and should be fully implemented. With this functionality added Linux utilites like filefrag(8) work as expected on a ZFS filesystem. See issue #264.

One useful example of the kind of information provided by FIEMAP is the ability to easily check if a file is file is compressed on disk.

$ sudo zfs set compress=on testpool/testfs
$ cp /bin/true /testpool/testfs/
$ fiemap -vs /testpool/testfs/true 
Extents: 1
ID   Logical (Start-End Length)              Physical (Start-End Length)             Dev Flags
0    0x000000000000-0x0000000078b7     30904 0x00000008fe00-0x0000000945ff     18432 0   last,encoded

Or the layout of a larger file on disk.

$ sudo zfs set recordsize=1m testpool/testfs
$ dd if=/dev/urandom of=/testpool/testfs/file bs=8M count=1
$ fiemap -vs /testpool/testfs/file 
Extents: 2
ID   Logical (Start-End Length)              Physical (Start-End Length)             Dev Flags
0    0x000000000000-0x0000002fffff   3145728 0x0000090dea00-0x0000093de9ff   3145728 0   merged
1    0x000000300000-0x0000007fffff   5242880 0x0000093f8e00-0x0000098f8dff   5242880 0   last,merged

How Has This Been Tested?

Nine new test cases have been added to the ZTS which test all of the supported FIEMAP_FLAG_*'s which can be used, and verify all of the possible FIEMAP_EXTENT_*'s flags which can be reported. Additionally, a new test utility called fiemap was added which can be used to verify a correct extent mapping was returned. The caller may specify the expected location of holes, data, extent flags, device ids, or number of extents. It's effectively a souped up version of filefrag(8) designed for validation.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

The FIEMAP ioctl is the standard Linux user space mechanism for
inspecting physical layout of a file on disk.  The ioctl returns
a list of extents each of which describes a region of the file.
Compatible blocks are merged in to larger extents when they are
physically contiguous and have identical flags.

The following per-extent flags are supported by ZFS.

  FIEMAP_EXTENT_LAST           - The last extent in the mapping
  FIEMAP_EXTENT_UNKNOWN        - Set on gang blocks
  FIEMAP_EXTENT_DELALLOC       - Dirty extents not yet written
  FIEMAP_EXTENT_ENCODED        - Set on compressed extents
  FIEMAP_EXTENT_DATA_ENCRYPTED - Set on encrypted extents
  FIEMAP_EXTENT_NOT_ALIGNED    - Extent is not block aligned
  FIEMAP_EXTENT_DATA_INLINE    - Set on embedded block pointers
  FIEMAP_EXTENT_UNWRITTEN      - Set on holes (normally not reported)
  FIEMAP_EXTENT_MERGED         - Multiple block pointers were merged
  FIEMAP_EXTENT_SHARED         - Set on deduplicated extents

The following flags are supported when requesting extents.  Note
that *_COPIES, *_NOMERGE, and *_HOLE are currently specific to
ZFS but are applicable to other Linux file systems.

  FIEMAP_FLAG_SYNC             - Sync the file first
  FIEMAP_FLAG_COPIES           - Include all copies of an extent
  FIEMAP_FLAG_NOMERGE          - Don't merge blocks in to extents
  FIEMAP_FLAG_HOLES            - Include unwritten holes as an extent

Finally, the following reserved fields in the fiemap_extent structure
have been utilized and should be reserved to prevent future conflicts.

  .fe_reserved[0]   - Unique device ID; top-level VDEV ID for ZFS
  .fe_reserved64[0] - Physical length of an extent

Future work:

* The FIEMAP_FLAG_XATTR flag is not supported.  Implementing this
  should be relatively straight forward if it is needed.  This would
  be a handy way to determine if the xattrs have been stored in the
  dnode, spill block, or an external object.

* The FIEMAP_FLAG_CACHE flag is not supported.  We rely on the ARC
  to keep the right blocks cached.

* The lseek(2) SEEK_HOLE and SEEK_DATA flags still rely on the
  dmu_offset_next() function.  The zpl_llseek() function can be
  updated to use FIEMAP to quickly determine the next extent.
  This has the advantage of correctly accounting for newly dirtied
  or freed blocks without forcing a txg sync.

The FIEMAP interface is fully described at:

* https://www.kernel.org/doc/Documentation/filesystems/fiemap.txt

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#264
@behlendorf behlendorf added Type: Feature Feature request or new feature Status: Work in Progress Not yet ready for general review labels May 18, 2018
@behlendorf behlendorf requested a review from ahrens May 18, 2018 00:05
@behlendorf behlendorf mentioned this pull request May 18, 2018
@codecov
Copy link

codecov bot commented May 18, 2018

Codecov Report

Merging #7545 into master will decrease coverage by <.01%.
The diff coverage is 78.74%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7545      +/-   ##
==========================================
- Coverage   77.42%   77.41%   -0.01%     
==========================================
  Files         336      339       +3     
  Lines      107529   108468     +939     
==========================================
+ Hits        83251    83975     +724     
- Misses      24278    24493     +215
Flag Coverage Δ
#kernel 78.38% <91.55%> (+0.41%) ⬆️
#user 66.14% <63.79%> (-0.26%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 43eb39d...15f728a. Read the comment docs.

Copy link
Contributor

@mschilli87 mschilli87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a native speaker but to me the use of 'in to' made it hard to
understand the documentation and with 'into' it would be more clear to me. Feel
free to ignore if I'm wrong or misunderstood.

Great to see this feature coming, btw! 🎉

* - FIEMAP_FLAG_SYNC - Sync extents before reporting
* - FIEMAP_FLAG_XATTR - Report extents used by xattrs (Unsupported)
* - FIEMAP_FLAG_COPIES - Report all data copies (ZFS-only)
* - FIEMAP_FLAG_NOMERGE - Never merge blocks in to extents (ZFS-only)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

fiemap_remove

# While the data is dirty only a single delalloc extent should be
# reported. Once the dirty data is converted in to holes it should
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

fiemap_verify -h -F "unwritten:all"
fiemap_remove

# Verify "merged" is set on blocks merged in to extents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

# Since it's unlikely that all the blocks can be merged due their
# physical offset and device id a large number are written and only
# a single merge is required.
log_note "Verify 'merged' is set on data blocks merged in to extents"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

fiemap_verify -s -F "merged:>0"
fiemap_remove

# Verify "merged" is set on holes merged in to extents if requested.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?


# Verify "merged" is set on holes merged in to extents if requested.
# Holes always merge so there must be only one.
log_note "Verify 'merged' is set on holes merged in to extents"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

fiemap_verify -s -H 0:$((BS*8)):1
fiemap_remove

# Write blocks in to holes: ...X..XX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

fiemap_remove

# Write blocks in to holes: ...X..XX
log_note "Write blocks in to holes"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

#endif

/*
* Request that each block be reported and not merged in to an extent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in to -> into?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix

{
zfs_fiemap_entry_t *fe;

fe = kmem_zalloc(sizeof (zfs_fiemap_entry_t), KM_SLEEP);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check fe != NULL

@tonyhutter
Copy link
Contributor

I wrote a small fuzz tester for the ioctl here. After hundreds of millions of iterations it never once crashed the kernel!

Copy link
Member

@ahrens ahrens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIEMAP_EXTENT_NOT_ALIGNED - Extent is not block aligned

What does that mean? I would think that we are reporting on ZFS blocks, which are by definition ZFS-block-aligned. Is there some other sort of block at play here?

Note that *_COPIES, *_NOMERGE, and *_HOLE are currently specific to ZFS but are applicable to other Linux file systems.

Does that mean that we need to get these flags merged into the linux kernel, otherwise someone else might use the flag value to mean something else?

.fe_reserved[0] - Unique device ID; top-level VDEV ID for ZFS
.fe_reserved64[0] - Physical length of an extent

Likewise for these, what if someone else decides to use these for a different purpose?

FIEMAP_FLAG_SYNC - Sync the file first

I assume that means that we would do a txg_wait_synced() if (and only if) the file is dirty? (looks like this is wrong, from reading the code; at a minimum we should document it more precisely).

The lseek(2) SEEK_HOLE and SEEK_DATA flags still rely on the dmu_offset_next() function. The zpl_llseek() function can be updated to use FIEMAP to quickly determine the next extent. This has the advantage of correctly accounting for newly dirtied or freed blocks without forcing a txg sync.

That would be awesome!


/*
* The number of blocks times the block size may exceed the file size
* when there are pending dirty blocks which have not be written.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a while to parse this... I think you mean:
The fill count reported by FIEMAP (times the block size) may be larger than the file size reported by stat() when there are pending dirty blocks which have not be written.
But I don't understand how that would be the case. There can't be more pending dirty blocks than the size of the file (as reported by stat()). That said, I guess there could be a race condition between the calls to fiemap and stat, where the file size legitimately shrinks between them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it can for this special case where FIEMAP_FLAG_NOMERGE is passed and we return the fill count plus the pending dirty blocks. The dirty blocks might overwrite an existing block so we over report slightly. This might also be possible when the copies property is set larger than 1.

Now that's fine since if the actual extents are requested you'll just allocate a slightly larger buffer than needed. The returned extents will be accurate.

*/
static void
dbuf_add_dirty_map(list_t *list, range_tree_t *dirty_tree,
range_tree_t *free_tree)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make an assertion about the appropriate locks being held in the dnode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

{
dbuf_dirty_record_t *dr;

for (dr = list_head(list); dr != NULL; dr = list_next(list, dr)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can declare dr in the loop:
for (dbuf_dirty_record_t *dr = ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

dmu_buf_impl_t *db = dr->dr_dbuf;

if (db->db_buf == NULL)
(void) dbuf_read(db, NULL, DB_RF_MUST_SUCCEED);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? Seems like it could be pretty heavyweight... and potentially cause locking issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, yes I meant to remove this before pushing.

*/
boolean_t dirty = B_FALSE;
for (dnode_t *mls_dn = multilist_sublist_head(mls); mls_dn != NULL;
mls_dn = multilist_sublist_next(mls, mls_dn)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be pretty expensive, right? It's O(number of dirty dnodes). Could we use multilist_link_active(), or some other flag specific to the dnode to make this determination?

Copy link
Contributor Author

@behlendorf behlendorf May 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see if I can find a better way to do this.

/*
* When FIEMAP_FLAG_NOMERGE is set and the number of extents for the
* entire file are being requested. Then in this special case walking
* the entire indirect block tree is not required.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about:

+	 * If FIEMAP_FLAG_NOMERGE is set and the number of extents for the
+	 * entire file are being requested, this is a special case where walking
+	 * the entire indirect block tree is not required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

if (BP_GET_FILL(bp) > 0) {
czb.zb_blkid = i;
error = zfs_fiemap_visit_indirect(spa, dnp, bp,
&czb, zfs_fiemap_cb, (void *)fm);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could take arbitrarily long (hours+). No writes will be processed during that time, since we are holding the txg open. This seems like a huge performance problem (and denial-of-service attack). I think we need to rearchitect this to get the info in chunks, dropping locks periodically so that this does not pause the entire storage pool while in progress.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. The FIEMAP ioctl itself is allowed to return sub-regions of the file if needed (e.g. the extent array is not large enough) and is expected to be re-called until FIEMAP_EXTENT_LAST is set, similar to read() or readdir() needing multiple calls.

* to fully complete in order to avoid using stale block pointers.
*/
if (dirty_txg > syncing_txg)
txg_wait_synced(spa_get_dsl(spa), syncing_txg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this... I think it's saying that if the dnode is dirty in a txg after the currently-syncing txg, then we will wait for the currently-syncing txg to finish syncing. Why do we do that?

At a high level, can you explain how we deal with pending changes, in the syncing txg; in the quiescing txg; and in the open txg? It seems like it could be hard to determine what's going on in the syncing txg, since there's a bunch of i/os in flight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll extent the block comment above this function to explain at a high level how this is intended to work and why it's OK.

dn->dn_object, dnp->dn_nlevels - 1, 0);

for (int i = 0; i < MIN(dnp->dn_nblkptr, fm->fm_copies); i++) {
blkptr_t *bp = &dnp->dn_blkptr[i];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be running during spa_sync()? If so, we need the proper locks to prevent the dnode_phys from changing (specifically it's BP's). See dmu_buf_lock_parent().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this is being protected by the dn_mtx which should prevent spa_sync() from changing the dnode_phys. Which is pretty heavy handed and goes to your larger point of this potentially taking a long time. I'll see what can be done tp chunk this up so never end up stalling the entire pool.

/*
* Recursively walk the indirect block tree for a dnode_phys_t and call
* the provided callback for all block pointers traversed.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your idea that since we are walking via the BP's and ARC here, the DMU could be doing whatever it wants while we're in the middle of this? And we are sure that the blocks on disk won't be overwritten because we are holding the txg open so not enough txg's could pass to free and reallocate a block (due to TXG_DEFER_SIZE)? What if !defer_allowed in metaslab_sync_done()?

@ahrens
Copy link
Member

ahrens commented May 18, 2018 via email

* The following flags have been submitted for inclusion in future
* Linux kernels and the filefrag(8) utility.
*/
#define fe_device_reserved fe_reserved[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lustre uses this field as fe_device. It probably makes sense to be consistent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will fix.

#endif

/*
* Extent is shared with other space. Introduced in 2.6.33 kernel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, this means that the extent is shared with "another file" for "reflinked" files. It doesn't mean that e.g. the data is shared in the same block as an inode or with an xattr. I don't think ZFS supports per-file reflink yet, but it probably could because it is already COW.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that this flag is specifically reserved for reflinked files? This flag is currently being set on extents which are shared by multiple files due to deduplication. Is that an approriate use?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think dedupe files are fine. I think the implementation is OK, just the description here makes it sound like the data extent might be shared with some non-data (e.g. inline file data which should be FIEMAP_EXTENT_DATA_INLINE, or different parts of the block are shared between different files like FIEMAP_EXTENT_DATA_TAIL). Maybe something like:

* Extent data is deduplicated/referenced by multiple files.  Introduced in 2.6.33 kernel. 

* On success, zero is returned, the count argument is set to the number of
* unallocated blocks (holes), and the bs argument is set to the block size
* (if it is not NULL). On error, a non-zero errno is returned and the values
* in count and bs are undefined.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this needs to be implemented on non-ZFS filesystems that don't understand FIEMAP_FLAG_NOMERGE it would be possible to retrieve the full FIEMAP info and then walk the extents to count the allocated blocks (or holes between the logical end of one extent and start of the next). It wouldn't be as efficient as just getting the fill count from ZFS, but would at least give you the "sparseness" of the file.


if (BP_IS_HOLE(bp)) {
fe->fe_logical_len = fm->fm_block_size;
fe->fe_flags |= FIEMAP_EXTENT_UNWRITTEN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed recently, FIEMAP_EXTENT_UNWRITTEN is not really the same as a hole. An unwritten extent is for allocated blocks that have a flag in the extent/BP to indicate they should be read back as zero, but have not had any data/zeroes written to them.

Unwritten extents are used by fallocate() to reserve disk space efficiently in non-COW filesystems (e.g. ext4 or XFS) for large databases or video files. However, that doesn't make sense for COW filesystems, since the "reserved" blocks could never be used directly (though I guess they could be used to prevent other files from consuming that space). I'd rather not conflate these concepts, especially since FIEMAP_EXTENT_UNWRITTEN extents will normally reference allocated blocks, but that is not the case here.

fe->fe_flags |= FIEMAP_EXTENT_UNWRITTEN;
} else if (BP_IS_EMBEDDED(bp)) {
fe->fe_logical_len = BPE_GET_LSIZE(bp);
fe->fe_physical_start = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be the location of the dnode block itself? Since you are setting FIEMAP_EXTENT_NOT_ALIGNED it could be the actual byte offset of the data, or at least the dnode within the block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. I meant to go back and add this but skipped it initially.

* dirty blocks are assumed to fill holes, and free blocks are assumed
* to overlap with existing free blocks. This is a safe worst case
* estimate which may slightly over report the number of extents for
* a file being actively overwritten.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine, since an actively-overwritten file is going to return slightly-incorrect/stale information to userspace regardless of how hard the kernel tries, so it doesn't make sense to try too hard to get it exactly correct.

* exactly how the block pointer will be merged.
*/
if (fm->fm_extents_max == 0 && fm->fm_flags & FIEMAP_FLAG_NOMERGE &&
fm->fm_start == 0 && fm->fm_length == FIEMAP_MAX_OFFSET) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how the fill count is reported (e.g. on a per-BP basis for every level in the tree), you might still be able to return the extent count for a sub-region of the file (i.e. fm_start != 0 or fm_length != FIEMAP_MAX_OFFSET), but that may not be a common-enough case to optimize.

if (BP_GET_FILL(bp) > 0) {
czb.zb_blkid = i;
error = zfs_fiemap_visit_indirect(spa, dnp, bp,
&czb, zfs_fiemap_cb, (void *)fm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. The FIEMAP ioctl itself is allowed to return sub-regions of the file if needed (e.g. the extent array is not large enough) and is expected to be re-called until FIEMAP_EXTENT_LAST is set, similar to read() or readdir() needing multiple calls.

/* Incompatible ZFS-only flags masked out of compatibility check */
fei->fi_flags &= ~ZFS_FIEMAP_FLAGS_ZFS;

error = fiemap_check_flags(fei, ZFS_FIEMAP_FLAGS_COMPAT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to call fiemap_check_flags() if you don't want. Make a zpl_check_flags() function that checks for all of the flags that the code currently understands. I made sure that this function was not called at the VFS layer, exactly so that we could have different flags on a per-filesystem basis.


while ((c = getopt(argc, argv, "achsvD:E:H:F:V:?")) != -1) {
switch (c) {
case 'a':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how a relates to NOMERGE? Maybe n or m?

Copy link
Contributor Author

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIEMAP_EXTENT_NOT_ALIGNED - Extent is not block aligned

What does that mean? I would think that we are reporting on ZFS blocks, which
are by definition ZFS->block-aligned. Is there some other sort of block at
play here?

@ahrens this gets set for data in embedded blocks pointers, but we don't
currently bother reporting the exact physical offsets. We'll probably also
need this if we added support for reporting xattrs.

Note that *_COPIES, *_NOMERGE, and *_HOLE are currently specific to ZFS but
are applicable to other Linux file systems.

Does that mean that we need to get these flags merged into the linux kernel

Ideally yes. For the flags and the reserved fields.

FIEMAP_FLAG_SYNC - Sync the file first

I assume that means that we would do a txg_wait_synced() if (and only if)
the file is dirty? (looks like > > this is wrong, from reading the code; at
a minimum we should document it more precisely).

Good catch. Will fix.

I wrote a small fuzz tester for the ioctl here. After hundreds of millions
of iterations it never once crashed the kernel!

@tonyhutter excellent thanks!

#endif

/*
* Request that each block be reported and not merged in to an extent.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix


/*
* The number of blocks times the block size may exceed the file size
* when there are pending dirty blocks which have not be written.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it can for this special case where FIEMAP_FLAG_NOMERGE is passed and we return the fill count plus the pending dirty blocks. The dirty blocks might overwrite an existing block so we over report slightly. This might also be possible when the copies property is set larger than 1.

Now that's fine since if the actual extents are requested you'll just allocate a slightly larger buffer than needed. The returned extents will be accurate.

*/
static void
dbuf_add_dirty_map(list_t *list, range_tree_t *dirty_tree,
range_tree_t *free_tree)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

{
dbuf_dirty_record_t *dr;

for (dr = list_head(list); dr != NULL; dr = list_next(list, dr)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

dmu_buf_impl_t *db = dr->dr_dbuf;

if (db->db_buf == NULL)
(void) dbuf_read(db, NULL, DB_RF_MUST_SUCCEED);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, yes I meant to remove this before pushing.

* convenient way to determine the range of TXGs to check.
*/
rl_t *rl = zfs_range_lock(&zp->z_range_lock, 0, UINT64_MAX, RL_READER);
open_txg = txg_hold_open(spa_get_dsl(spa), &th);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time it didn't feel right to create and assign an empty tx. Let me rework this and verifty that's OK.

* to fully complete in order to avoid using stale block pointers.
*/
if (dirty_txg > syncing_txg)
txg_wait_synced(spa_get_dsl(spa), syncing_txg);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add the ASSERT. I think there might be one other place where this is done too.

* to fully complete in order to avoid using stale block pointers.
*/
if (dirty_txg > syncing_txg)
txg_wait_synced(spa_get_dsl(spa), syncing_txg);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll extent the block comment above this function to explain at a high level how this is intended to work and why it's OK.

/*
* When FIEMAP_FLAG_NOMERGE is set and the number of extents for the
* entire file are being requested. Then in this special case walking
* the entire indirect block tree is not required.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

dn->dn_object, dnp->dn_nlevels - 1, 0);

for (int i = 0; i < MIN(dnp->dn_nblkptr, fm->fm_copies); i++) {
blkptr_t *bp = &dnp->dn_blkptr[i];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this is being protected by the dn_mtx which should prevent spa_sync() from changing the dnode_phys. Which is pretty heavy handed and goes to your larger point of this potentially taking a long time. I'll see what can be done tp chunk this up so never end up stalling the entire pool.

@@ -774,6 +813,7 @@ const struct inode_operations zpl_dir_inode_operations = {
.permission = zpl_permission,
#endif /* HAVE_GET_ACL | HAVE_CHECK_ACL | HAVE_PERMISSION */
#endif /* CONFIG_FS_POSIX_ACL */
.fiemap = zpl_fiemap,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provides extent maps for directories, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@behlendorf behlendorf added Status: Inactive Not being actively updated and removed Status: Work in Progress Not yet ready for general review labels Sep 25, 2018
@ahrens ahrens added the Status: Revision Needed Changes are required for the PR to be accepted label Sep 27, 2018
@bgly
Copy link

bgly commented Oct 22, 2018

I built this patch against zfs 0.8.0-rc1 and I am trying to test it, but I am unable to find the fiemap command on centos? Is this a program you wrote or how would I go about getting it? Running filefrag gives me:

[root@bl-vsnap-100 ~]# filefrag -v /vsnap/vpool1/fs1/debug.txt
Filesystem type is: 2fc12fc1
File size of /vsnap/vpool1/fs1/debug.txt is 128000 (1 block of 131072 bytes)
/vsnap/vpool1/fs1/debug.txt: FIBMAP unsupported

The fiemap under tests gives me:
[root@bl-vsnap-100 zfs]# tests/zfs-tests/cmd/fiemap/fiemap -vs /vsnap/vpool1/fs1/debug.txt
Failed to read FIEMAP: 95

More information on the file:
[root@bl-vsnap-100 zfs]# ls -li /vsnap/vpool1/fs1/debug.txt
128 -rw-r--r--. 1 root root 128000 Oct 19 21:39 /vsnap/vpool1/fs1/debug.txt
[root@bl-vsnap-100 zfs]# zdb -ddddddd -U /etc/vsnap/zpool.cache/1 vpool1/fs1 128Dataset vpool1/fs1 [ZPL], ID 135, cr_txg 16339, 10.0G, 15 objects, rootbp DVA[0]=<0:52002d200:200> DVA[1]=<0:540025e00:200> [L0 DMU objset] sha256 lz4 unencrypted LE contiguous unique double size=800L/200P birth=601115L/601115P fill=15 cksum=71659e33a1fbf7cd:259bb7857162d3e5:cfba156463c94dba:f33510d57f28953b

Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
   128    1   128K   125K     1K     512   125K  100.00  ZFS plain file (K=inherit) (Z=inherit)
                                           176   bonus  System attributes
    dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
    dnode maxblkid: 0
    path    /debug.txt
    uid     0
    gid     0
    atime   Fri Oct 19 21:39:14 2018
    mtime   Fri Oct 19 21:39:14 2018
    ctime   Fri Oct 19 21:39:14 2018
    crtime  Fri Oct 19 21:39:14 2018
    gen     554565
    mode    100644
    size    128000
    parent  34
    links   1
    pflags  40800000004
    xattr   129

Indirect blocks:
0 L0 0:52005a000:400 1f400L/400P F=1 B=554565/554565

            segment [0000000000000000, 000000000001f400) size  125K

The reason, I want filefrag is to use it to determine the LBA of a file. Unless you know of a better way to get the LBA of a file? I tried using zdb, but it doesn't give me that information.

@behlendorf

@behlendorf
Copy link
Contributor Author

@bgly you've got the right command, filefrag is the standard system tool you want. It sounds like this this didn't apply cleanly to 0.8.0-rc1. Perhaps you could try building 15f728a and use that for the debugging, until I get a chance to rebase this and address the review feedback.

@bgly
Copy link

bgly commented Oct 22, 2018

@bgly you've got the right command, filefrag is the standard system tool you want. It sounds like this this didn't apply cleanly to 0.8.0-rc1. Perhaps you could try building 15f728a and use that for the debugging, until I get a chance to rebase this and address the review feedback.

@behlendorf It doesn't apply cleanly, but the conflicts were very minimal. It was mostly just moving stuff around and deleting tests cases that no long exists like the removal of grow_pool/grow_replicas and how it became just one grow interface. The only thing that wasn't just deleting lines was changing txg_verify to TXG_VERIFY(dn->dn_objset->os_spa, i);

@adilger
Copy link
Contributor

adilger commented Oct 23, 2018

What version of filefrag Are you using? Newer versions (eg. 1.43 or 1.44) are better than older versions, depending on what distro you have.

What does strace show when running filefrag? It should try FIEMAP first, and only fall back to FIBMAP if there is an error.

@bgly
Copy link

bgly commented Oct 23, 2018

What version of filefrag Are you using? Newer versions (eg. 1.43 or 1.44) are better than older versions, depending on what distro you have.

What does strace show when running filefrag? It should try FIEMAP first, and only fall back to FIBMAP if there is an error.

open("/vsnap/vpool1/fs1/debug.txt", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=128000, ...}) = 0
fstatfs(3, {f_type=0x2fc12fc1, f_bsize=131072, f_blocks=536069, f_bfree=454097, f_bavail=454097, f_files=116249026, f_ffree=116249011, f_fsid={3578489450, 4658512}, f_namelen=255, f_frsize=131072, f_flags=ST_VALID|ST_NOATIME}) = 0
ioctl(3, FIGETBSZ, 0x603130)            = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f081750c000
write(1, "Filesystem type is: 2fc12fc1\n", 29Filesystem type is: 2fc12fc1
) = 29
write(1, "File size of /vsnap/vpool1/fs1/d"..., 77File size of /vsnap/vpool1/fs1/debug.txt is 128000 (1 block of 131072 bytes)
) = 77
ioctl(3, FS_IOC_FIEMAP, {fm_start=0, fm_length=18446744073709551615, fm_flags=0, fm_extent_count=292}) = -1 EOPNOTSUPP (Operation not supported)
ioctl(3, FIBMAP, 0x7ffecebce1ac)        = -1 EINVAL (Invalid argument)
write(2, "/vsnap/vpool1/fs1/debug.txt: FIB"..., 48/vsnap/vpool1/fs1/debug.txt: FIBMAP unsupported
) = 48
close(3)                                = 0
exit_group(22)                          = ?
+++ exited with 22 +++

Yep, it shows that it tries FIEMAP and gets Operation not supported and then fails with the FIBMAP error.

I also updated filefrag to 1.44, which I believe was the latest.

@behlendorf
Copy link
Contributor Author

@bgly this may sound like a silly question, but are you absolutely sure you loading the modified kmods? You can verify by checking /sys/module/zfs/version.

@bgly
Copy link

bgly commented Oct 23, 2018

@bgly this may sound like a silly question, but are you absolutely sure you loading the modified kmods? You can verify by checking /sys/module/zfs/version.

@behlendorf Does the version change when you build your own?
[root@bl-vsnap-100 zfs]# cat /sys/module/zfs/version
0.8.0-rc1

When i do the build the packages are tagged differently, and i run: sudo yum localinstall *.$(uname -p).rpm

[root@bl-vsnap-100 zfs]# ls *.rpm
kmod-zfs-4.18.8-1.el7.elrepo.x86_64-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm libzfs2-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm zfs-debuginfo-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm
kmod-zfs-devel-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm libzfs2-devel-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm zfs-dracut-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm
kmod-zfs-devel-4.18.8-1.el7.elrepo.x86_64-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm libzpool2-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm zfs-kmod-0.8.0-rc1_6_g6336bcb.el7.src.rpm
libnvpair1-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm zfs-0.8.0-rc1_6_g6336bcb.el7.src.rpm zfs-kmod-debuginfo-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm
libuutil1-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm zfs-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm zfs-test-0.8.0-rc1_6_g6336bcb.el7.x86_64.rpm

These are my newly built rpm's

@behlendorf
Copy link
Contributor Author

behlendorf commented Oct 23, 2018

@bgly yes, based on your packages you should see a 0.8.0-rc1_6_g6336bcb module version if you've loaded the modified kmod. You might also check /proc/sys/kernel/spl/gitrev.

@bgly
Copy link

bgly commented Oct 23, 2018

@bgly yes, based on your packages you should see a 0.8.0-rc1_6_g6336bcb module version if you've loaded the modified kmod. You might also check /proc/sys/kernel/spl/gitrev.

@behlendorf Interesting... It's not showing the right version... I dont see how that is possible considering I ran: sudo yum localinstall *.$(uname -p).rpm which completed successfully and did the upgrade... Is there anything special I need to do to reload the modules?

@behlendorf
Copy link
Contributor Author

@bgly the upgrade won't automatically swap out the kmod's since they might be in use. So you're going to want to unload them with modprobe -r zfs, and then re-load them with modprobe zfs.

@bgly
Copy link

bgly commented Oct 23, 2018

@behlendorf

[root@bl-vsnap-100 zfs]# modinfo zfs
filename: /lib/modules/4.18.8-1.el7.elrepo.x86_64/extra/zfs/zfs/zfs.ko
version: 0.8.0-rc1_6_g6336bcb
license: CDDL
author: OpenZFS on Linux
description: ZFS
alias: devname:zfs
alias: char-major-10-249
srcversion: D4918FDB57446ED6BAAEFF2
depends: spl,znvpair,icp,zlua,zunicode,zcommon,zavl
retpoline: Y
name: zfs
vermagic: 4.18.8-1.el7.elrepo.x86_64 SMP mod_unload modversions

[root@bl-vsnap-100 zfs]# modinfo spl
filename: /lib/modules/4.18.8-1.el7.elrepo.x86_64/extra/zfs/spl/spl.ko
version: 0.8.0-rc1_6_g6336bcb
license: GPL
author: OpenZFS on Linux
description: Solaris Porting Layer
srcversion: CE099E383264B0A0E12F332
depends:
retpoline: Y
name: spl
vermagic: 4.18.8-1.el7.elrepo.x86_64 SMP mod_unload modversions

here it shows the right version loaded...?

@behlendorf
Copy link
Contributor Author

here it shows the right version loaded...?

Not quite, modinfo will show you the version installed on the system. This may be different than the version which is running, which is what /sys/module/zfs/version shows. In the same way that after installing a new kernel you need to reboot your system to start using it.

@bgly
Copy link

bgly commented Oct 24, 2018

@behlendorf After rebooting the system, it seems like the zfs modules got replaced, which is good. The fiemap/filefrag commands now work correctly. The only thing that is odd is that /sys/module/zfs/version still shows the wrong version:
[root@bl-vsnap-100 vsnap]# cat /sys/module/zfs/version
0.8.0-rc1

@bgly
Copy link

bgly commented Oct 24, 2018

@behlendorf @adilger

[root@bl-vsnap-100 ~]# zfs/tests/zfs-tests/cmd/fiemap/fiemap -vs /vsnap/vpool1/fs1/testfile.txt
Extents: 1
ID   Logical (Start-End Length)              Physical (Start-End Length)             Dev Flags
0    0x000000000000-0x00000001f3ff    128000 0x00176002cc00-0x00176002cfff      1024 0   last,encoded

0x00176002cc00 =100395043840
0x00176002cfff = 100395044863

[root@bl-vsnap-100 ~]# filefrag -sv /vsnap/vpool1/fs1/testfile.txt
Filesystem type is: 2fc12fc1
File size of /vsnap/vpool1/fs1/testfile.txt is 128000 (1 block of 131072 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       0:     765953..    765953:      0:             last,encoded,eof

Do you guys know why the values differ?

Also, since this is ZFS, is the calculation from Physical offset to Block Number different than it would be for say an ext4?

For example: (Test just sits on a regular ext4 filesystem)

[root@bl-vsnap-100 ~]# filefrag -e test
Filesystem type is: 58465342
File size of test is 3451 (1 block of 4096 bytes)
ext:     logical_offset:        physical_offset: length:   expected: flags:
 0:        0..       0:    5802271..   5802271:      1:             last,eof

Now if you take the physical offset and multiply by 8 since (4096/512) = 8
ending LBA = ( (extent + 1) * 8 ) - 1 and beginning LBA = extent *8
So it would be LBA 46418168-46418175

@behlendorf
Copy link
Contributor Author

@bgly the discrepancy comes from that fact that filefrag reports the offsets in terms of the filesystems block size, and the fiemap test utility reports in bytes. The physical byte offset for your file is 100,395,043,840 bytes / 131,072 block size = 765,953 blocks.

There's also the additional complication with ZFS that the logical extent size is likely not the same as the physical extent size, thanks to compression. Again in your example, the logical extent size is 128k, but the data has been compressed down to a single 1k block. Currently filefrag assumes the logical and physical block sizes must be the same.

I'd suggest using the fiemap utility for now if you can since it was updated to report the additional fields reported by ZFS.

@bgly
Copy link

bgly commented Sep 25, 2019

@behlendorf any reason this didn’t go in?

@behlendorf
Copy link
Contributor Author

@bgly unfortunately, I've been unable to devote the time needed to address the review feedback so this can be wrapped up. This is definitely still functionality I'd like to see included when it's ready.

rohan-puri added a commit to rohan-puri/zfs that referenced this pull request Nov 5, 2019
Now all fiemap zfs-tests pass except 1, fiemap_free,
which is a regression in master.

...
home/rohan/zol-ws/zfs/tests/test-runner/bin/test-runner.py  -c "tests/runfiles/fiemap.run" -T "functional" -i "/home/rohan/zol-ws/zfs/tests/zfs-tests" -I "1"
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/setup (run as root) [00:01] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_sync (run as root) [00:16] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_flags (run as root) [00:10] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_delalloc (run as root) [00:31] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_free (run as root) [00:01] [FAIL]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_copies (run as root) [00:02] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_nomerge (run as root) [00:03] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_holes_sanity (run as root) [01:08] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/cleanup (run as root) [00:00] [PASS]

Results Summary
PASS       8
FAIL       1

Running Time:   00:02:15
Percent passed: 88.9%
...

Signed-off-by: Rohan Puri <rohan.puri15@gmail.com>
rohan-puri added a commit to rohan-puri/zfs that referenced this pull request Nov 9, 2019
Now all fiemap zfs-tests pass except 1, fiemap_free,
which is a regression in master.

...
home/rohan/zol-ws/zfs/tests/test-runner/bin/test-runner.py  -c "tests/runfiles/fiemap.run" -T "functional" -i "/home/rohan/zol-ws/zfs/tests/zfs-tests" -I "1"
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/setup (run as root) [00:01] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_sync (run as root) [00:16] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_flags (run as root) [00:10] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_delalloc (run as root) [00:31] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_free (run as root) [00:01] [FAIL]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_copies (run as root) [00:02] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_nomerge (run as root) [00:03] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/fiemap_holes_sanity (run as root) [01:08] [PASS]
Test: /home/rohan/zol-ws/zfs/tests/zfs-tests/tests/functional/fiemap/cleanup (run as root) [00:00] [PASS]

Results Summary
PASS       8
FAIL       1

Running Time:   00:02:15
Percent passed: 88.9%
...

Signed-off-by: Rohan Puri <rohan.puri15@gmail.com>
@Stoatwblr
Copy link

Brian, what's the current state of this one?

@behlendorf
Copy link
Contributor Author

what's the current state of this one?

There hasn't been any new progress. It needs to be rebased on the latest code and the remaining review feedback needs to be addressed.

SpaceCafe pushed a commit to SpaceCafe/zfs that referenced this pull request Nov 27, 2020
Signed-off-by: Lars Thoms <lars.thoms@spacecafe.org>
@fcladera
Copy link

fcladera commented Jan 4, 2021

A quick note supporting this MR.

This feature is necessary for the correct behavior of Yocto, a build system for embedded Linux. Some recipes require FIEMAP support, as shown by 1 and 2.

Thus, some Yocto builds on ZFS are currently not possible.

@mattmattox
Copy link

This feature would be really helpful to Longhorn

@ahrens
Copy link
Member

ahrens commented Jun 4, 2021

I noticed that this PR hasn’t been updated in some time. We would like to see this feature added, but it seems that it isn’t quite complete, so we’re going to close it for now. When you have time to revisit this work, feel free to reopen this PR or open a new one. Thanks for your contribution to OpenZFS and please continue to open PR’s to address issues that you encounter!

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Inactive Not being actively updated Status: Revision Needed Changes are required for the PR to be accepted Type: Feature Feature request or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants