
SCSI interface support for zvols #4042

Closed
zielony360 opened this issue Nov 24, 2015 · 7 comments
Labels
Type: Feature Feature request or new feature

Comments

@zielony360

Hello,

do you plan to introduce SCSI interface support for zvols? It would be very useful when sharing a zvol through a SCSI target (FC, FCoE, iSCSI).

Looking at this in the context of VAAI: in addition to moving load from the ESXi hosts to the storage server (with ZFS on it), which the target can already simulate, we could also save space through a form of ZFS deduplication. I am talking about the XCOPY and WRITE_SAME SCSI operations. If you use them within one zvol (or maybe even across a pool?), you can clone data by manipulating metadata pointers, as Pure Storage does, similar to snapshot clones. We would also benefit in terms of performance.

I found a recent thread about the lack of a SCSI interface, which can be referenced: #4012

@ryao
Contributor

ryao commented Nov 24, 2015

Thanks for filing this issue.

I agree that the XCOPY and WRITE_SAME SCSI operations are useful, and there might be some overlap with data deduplication when things are aligned. Right now, zvols are implemented as a shim over a ZFS file through the Linux block device API. Prior to 0.6.5.0, this was done using the request_queue interface, as if the zvols were real hardware that needed a top-half/bottom-half handler. That is similar to how SCSI devices are implemented (they are another layer on top of that), but it added unnecessary overhead (spin locks, double dispatch, etc.) that increased latencies and hurt throughput in tests. Reimplementing zvols as SCSI devices would be a step in the opposite direction.
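The contrast between the two dispatch styles can be sketched with a toy model (pure Python, hypothetical, nothing like real kernel code): the request_queue path enqueues each request under a lock and lets a separate "bottom half" thread complete it, adding a scheduling hop per I/O, while the post-0.6.5.0 zvol path services the request directly in the submitter's context.

```python
# Toy model of the two dispatch strategies described above.
# Names and structure are illustrative, not taken from ZFS.
import queue
import threading

def queued_dispatch(requests):
    """request_queue style: top half enqueues, a worker thread completes."""
    q = queue.Queue()          # stands in for the locked request queue
    done = []

    def bottom_half():
        while True:
            req = q.get()
            if req is None:    # sentinel: no more requests
                break
            done.append(req * 2)   # "service" the request

    worker = threading.Thread(target=bottom_half)
    worker.start()
    for r in requests:
        q.put(r)               # top half: queue and return immediately
    q.put(None)
    worker.join()
    return done

def direct_dispatch(requests):
    """Queueless style: service each request in the caller's context."""
    return [r * 2 for r in requests]

reqs = list(range(8))
assert queued_dispatch(reqs) == direct_dispatch(reqs)
```

Both produce the same results, but the queued path pays for a lock, a context switch, and a second dispatch per request, which is the overhead being described.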

That said, it turns out that the Linux block device layer recently gained XCOPY-like support through REQ_COPY, so we can implement something like XCOPY without emulating SCSI devices:

https://git.kernel.org/cgit/linux/kernel/git/mkp/linux.git/commit/?h=xcopy&id=0bdeed274e16b3038a851552188512071974eea8

At a glance, it looks like the functionality that @sempervictus found to be broken in #4012 could start working when REQ_COPY is implemented, but we would need both to implement it and test it to confirm that. None of us were aware that Linux had gained this functionality, so now that we know, it can go on the roadmap. I am not making any promises, but I would be surprised if this were not implemented by Christmas.

@zielony360
Author

@ryao
Sounds gorgeous. :-) Are you going to implement REQ_COPY in ZFS as a real data copy, or as metadata manipulation, which would save space?

I dug a bit, and it turns out that the patch set you linked is not in mainline yet (neither in block nor in target). Should we lobby on the linux-scsi or target-devel mailing list?

@ryao
Contributor

ryao commented Nov 25, 2015

Either will work for data on the wire. A metadata copy would be unlikely for the initial implementation. We should probably be able to test it before lobbying for it.

@sempervictus
Contributor

@ryao: I may have a viable solution requiring no functional changes to the current implementation (aside from possibly addressing the bug mentioned later): using SCST or LIO for their local target functionality by mapping those targets over zvols. I'm not a big LIO fan, but with SCST this provides a very pleasant improvement in write throughput. #4097 has my current blocker: mapping through virtio-scsi kills the process, but locally it makes for very quick zvols (likely via I/O aggregation, as this even plays nice with sync=always, though of course much slower).

@zielony360
Author

@sempervictus
I am already using the LIO qla2xxx target. I think it already has the tcm_loop functionality. Moreover, it still doesn't solve the issue of passing XCOPY commands down to ZFS. LIO will probably do a simple copy rather than pass XCOPY through, since zvols don't have the proper interface for that, so we get neither much of a performance improvement nor deduplication (via metadata copy). Correct me if I am wrong, please.

Which version of ZFS are you using?

@ryao
Contributor

ryao commented Mar 20, 2016

Just as an update: it turned out that the XCOPY code was proposed for the block I/O layer but was never merged. There is no way to do this without being a SCSI device, and being a SCSI device would hurt performance unless tricks like XCOPY are used. Specifically, it would involve bringing back the I/O queue, so latencies would consist of the time to queue plus the time to complete. Being queueless lets us consolidate those into one, reducing total latency by ~20%, obtaining significantly greater throughput (some reported 50% higher; others reported 200%), and lowering CPU utilization (probably another 20%).

I could see someone implementing a voltype property to allow a zvol to be presented as either a regular block device or a SCSI device. Implementing it would require maintaining zvols both as they are now and as SCSI devices. It would also increase our exposure to kernel API changes and probably increase the number of autotools checks by a fair amount.

@behlendorf behlendorf removed Status: Inactive Not being actively updated Type: Question Issue for discussion labels Dec 21, 2020
@behlendorf
Contributor

Closing. This was an interesting idea but not something we're planning on implementing.
