
QAT support for AES-GCM #7282

Merged: merged 1 commit into openzfs:master on Mar 9, 2018

Conversation

@tcaputi (Contributor) commented Mar 7, 2018

This patch adds support for acceleration of AES-GCM encryption
with Intel Quick Assist Technology.

Signed-off-by: Chengfeix Zhu chengfeix.zhu@intel.com
Signed-off-by: Weigang Li weigang.li@intel.com
Signed-off-by: Tom Caputi tcaputi@datto.com

Motivation and Context

This code allows ZFS to utilize QAT for the purpose of accelerating encryption. Unfortunately, QAT only supports AES-CCM with 128-bit keys, so the acceleration applies only to AES-GCM.

How Has This Been Tested?

Tested on a system with a QAT DH895x PCIe card installed and with the card disabled to ensure that the implementations with and without the card are compatible. I will have performance numbers soon.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@behlendorf (Contributor)

@wli5 would you mind reviewing this?

@codecov (bot) commented Mar 8, 2018

Codecov Report

Merging #7282 into master will increase coverage by 0.03%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7282      +/-   ##
==========================================
+ Coverage   76.53%   76.56%   +0.03%     
==========================================
  Files         328      327       -1     
  Lines      104006   103874     -132     
==========================================
- Hits        79600    79536      -64     
+ Misses      24406    24338      -68
Flag Coverage Δ
#kernel 76.27% <100%> (-0.09%) ⬇️
#user 65.87% <100%> (-0.05%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8e5d148...ea1644b.

@wli5 (Contributor) commented Mar 8, 2018

@behlendorf Yes, I will do it, thanks!

@wli5 (Contributor) commented Mar 8, 2018

@tcaputi sorry for the confusion, I just checked the doc and you are right, CCM only supports 128-bit keys:
Cipher key length in bytes. For AES it can be 128 bits (16 bytes), 192 bits (24 bytes) or 256 bits (32 bytes). For the CCM mode of operation, the only supported key length is 128 bits (16 bytes).
https://01.org/sites/default/files/downloads/intelr-quickassist-technology/qacyapiv201public.pdf

@rlaager (Member) commented Mar 8, 2018

This commit seems to mix a lot of what must be general refactoring in with the encryption. Ideally, those should be separate commits, which would make review and troubleshooting (e.g. bisection) easier.

@rlaager (Member) commented Mar 8, 2018

My ability to give a reasonable review on this code is limited, but from a read-through, I don't see anything glaringly wrong.

I see there's a 4kB limit on buffer size. How does this interact with recordsize=1M? Does that get encrypted on the CPU, and is that faster than trying to offload it?

Is the performance break-even point a function of particular hardware (e.g. as CPUs get faster, even 4kB should be run in software and/or as QAT cards get faster, it would make sense to offload 8kB blocks)? If so, should those numbers be conditionalized on the card model? Or should we do something where we benchmark it on module load (which I think the assembly-optimized versions of checksumming and RAIDZ do)? Or should it be user-configurable (e.g. as a module parameter)?

module/zfs/qat.c Outdated

ret = qat_crypt_init();
if (ret != 0)
goto error;

Review comment (Contributor):

I'm just wondering about users who don't want QAT encryption and only have DC instances in the configuration file; in that case qat_init should not return failure. Only when both qat_dc_init and qat_cy_init fail should qat_init return failure. Does that make sense?
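
A minimal sketch of the behavior being proposed, using the qat_dc_init()/qat_cy_init() names from this comment (the actual contents of each init are elided and assumed): qat_init() would only report failure when neither service could be started.

/* Sketch only: report failure only when both services fail to start. */
int
qat_init(void)
{
	int dc_ret = qat_dc_init();	/* compression instances */
	int cy_ret = qat_cy_init();	/* crypto instances */

	/*
	 * Users who only list DC instances in the configuration file
	 * still get compression offload even if crypto init fails.
	 */
	if (dc_ret != 0 && cy_ret != 0)
		return (dc_ret);

	return (0);
}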

@wli5 (Contributor) commented Mar 8, 2018

@rlaager the 4KB buffer size is a DMA restriction. Each offload request can actually be very large, e.g. a full 128KB record, but the 128KB record buffer will be split into 32 pages and sent to the device as one sg-list buffer; this ensures each address is physically contiguous for DMA access.
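
A rough sketch of that splitting, using the QAT CpaBufferList/CpaFlatBuffer types; the function and variable names are illustrative, and the real code also DMA-maps each page rather than passing virtual addresses straight through.

/*
 * Sketch: chop a record buffer into page-sized chunks so every
 * CpaFlatBuffer is physically contiguous, then submit the whole record
 * as one scatter-gather list (a 128KB record becomes 32 entries).
 */
static void
build_sg_list(CpaBufferList *list, CpaFlatBuffer *flat_bufs,
    Cpa8U *data, Cpa32U total_len)
{
	Cpa32U n = 0;

	while (total_len > 0) {
		Cpa32U chunk = (total_len > PAGE_SIZE) ?
		    PAGE_SIZE : total_len;

		flat_bufs[n].pData = data;
		flat_bufs[n].dataLenInBytes = chunk;

		data += chunk;
		total_len -= chunk;
		n++;
	}

	list->pBuffers = flat_bufs;
	list->numBuffers = n;
}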

#define QAT_CRYPT_MAX_INSTANCES 48

#define MAX_PAGE_NUM 1024
#define DIGEST_LENGTH 8

Review comment (Contributor):

Seems these 4 consts are not used.

}
}

static inline struct page *

Review comment (Contributor):

Those 3 inline functions exist twice, once in qat_crypt.c and once in qat_compress.c; can we merge them into qat.h?
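
For reference, a sketch of what one of those shared helpers could look like once it lives only in qat.h; the qat_ prefix follows the later review suggestion and the body is assumed, based on the usual kmalloc/vmalloc address distinction.

/* Sketch: a single shared copy in qat.h instead of one per .c file. */
static inline struct page *
qat_mem_to_page(void *addr)
{
	if (!is_vmalloc_addr(addr))
		return (virt_to_page(addr));

	return (vmalloc_to_page(addr));
}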

status = init_cy_session_ctx(dir, cy_inst_handle, &cy_session_ctx, key,
crypt, aad_len);
if (status != CPA_STATUS_SUCCESS)
goto error;

Review comment (Contributor):

This can return the failure directly (safely, as all memory has already been freed); if it takes the goto error branch, the call to cpaCySymRemoveSession will cause an issue.


Review comment (Contributor):

Yes, return (status) here.

flat_src_buf = flat_src_buf_array;
flat_dst_buf = flat_dst_buf_array;
while (bytes_left > 0) {
in_pages[page_num] = mem_to_page(in);

Review comment (Contributor):

page_num++ was forgotten here; this will cause the pages not to be unmapped.
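
In other words, the increment belongs in the store itself (a one-line sketch using the names from the snippet):

in_pages[page_num++] = mem_to_page(in);	/* count the page so it gets unmapped later */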

@tcaputi (Contributor, Author) commented Mar 8, 2018

I see there's a 4kB limit on buffer size. How does this interact with recordsize=1M? Does that get encrypted on the CPU, and is that faster than trying to offload it?

For clarification, 4kB is the minimum buffer size. The maximum is 128k. As far as how this interacts with recordsize=1M, right now QAT will only accelerate blocks that are compressed below the 128k threshold. I received that upper bound from Intel. @wli5 is there any reason we should limit this functionality to 128k?
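
A sketch of how such a size gate could look; the macro and helper names here are assumptions, only the 4kB/128k thresholds and the zfs_qat_disable tunable come from this PR.

/* Sketch, names assumed: offload only buffers inside the supported window. */
#define	QAT_MIN_BUF_SIZE	(4 * 1024)	/* below this, the CPU is cheaper */
#define	QAT_MAX_BUF_SIZE	(128 * 1024)	/* current software upper bound  */

static inline boolean_t
qat_crypt_use_accel(size_t s_len)
{
	return (!zfs_qat_disable &&
	    s_len >= QAT_MIN_BUF_SIZE && s_len <= QAT_MAX_BUF_SIZE);
}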

@tcaputi (Contributor, Author) commented Mar 8, 2018

@wli5 Thanks for your suggestions; your comments have been addressed. Take a look when you get a chance.

};

static kstat_t *qat_ksp = NULL;
int zfs_qat_disable = 0;

Review comment (Contributor):

You should pull the module_param() for this into qat.c as well.
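
For illustration, a minimal sketch of what moving the tunable into qat.c would look like (the description string is made up):

/* In qat.c: declare the tunable and register it as a module parameter. */
int zfs_qat_disable = 0;

module_param(zfs_qat_disable, int, 0644);
MODULE_PARM_DESC(zfs_qat_disable, "Disable QAT acceleration");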

module/zfs/qat.c Outdated

ret = qat_dc_init();
if (ret != 0)
goto error;

Review comment (Contributor):

nit: please just return (ret) here, and for the qat_crypt_init() failure below call qat_dc_fini() and return. Then we can drop the error label and its contents.
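
Sketched out, the suggested structure would look roughly like this (function names taken from the diff; the remaining init body is elided):

int
qat_init(void)
{
	int ret;

	ret = qat_dc_init();
	if (ret != 0)
		return (ret);

	ret = qat_crypt_init();
	if (ret != 0) {
		qat_dc_fini();
		return (ret);
	}

	/* ... remaining initialization (kstat setup, etc.) ... */
	return (0);
}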

module/zfs/qat.c Outdated
}

qat_dc_fini();
qat_crypt_fini();

Review comment (Contributor):

nit: qat_crypt_fini() before qat_dc_fini().

module/zfs/qat.h Outdated
* Note: when qat fail happens, it doesn't mean a critical hardware
* issue, sometimes it is because the output buffer is not big enough,
* and the compression job will be transfered to gzip software again,
* so the functionality of ZFS is not impacted.

Review comment (Contributor):

The last sentence needs to be updated to refer to the encryption job and not the compression job.

module/zfs/qat.h Outdated
kfree(*pp_mem_addr);
*pp_mem_addr = NULL;
}
}

Review comment (Contributor):

Rather than maintaining mem_alloc_contig() and mem_free_contig() as static inlines, they can be moved to qat.c, converted to normal functions, and renamed qat_mem_alloc_contig() and qat_mem_free_contig(). Let's add the qat_ prefix to mem_to_page as well, but leave it a static inline since this one might actually have an impact on performance.
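
A sketch of the suggested move; the allocation side is assumed (kmalloc-based, matching the kfree visible in the quoted hunk and the physically contiguous requirement), so treat the bodies as illustrative.

/* In qat.c: ordinary functions instead of static inlines in qat.h. */
CpaStatus
qat_mem_alloc_contig(void **pp_mem_addr, Cpa32U size_bytes)
{
	*pp_mem_addr = kmalloc(size_bytes, GFP_KERNEL);
	if (*pp_mem_addr == NULL)
		return (CPA_STATUS_RESOURCE);

	return (CPA_STATUS_SUCCESS);
}

void
qat_mem_free_contig(void **pp_mem_addr)
{
	if (*pp_mem_addr != NULL) {
		kfree(*pp_mem_addr);
		*pp_mem_addr = NULL;
	}
}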

module/zfs/qat.h Outdated
#define PHYS_CONTIG_ALLOC(pp_mem_addr, size_bytes) \
mem_alloc_contig((void *)(pp_mem_addr), (size_bytes))
#define PHYS_CONTIG_FREE(p_mem_addr) \
mem_free_contig((void *)&(p_mem_addr))

Review comment (Contributor):

QAT_PHYS_CONTIG_ALLOC / QAT_PHYS_CONTIG_FREE, to head off any future namespace collision.


for (i = 0; i < num_inst; i++) {
cpaCyStopInstance(cy_inst_handles[i]);
}

Review comment (Contributor):

nit: { } can be dropped, and the Cpa32U i declaration moved into the for initializer.

for (Cpa32U i = 0; i < num_inst; i++)
	cpaCyStopInstance(cy_inst_handles[i]);



return (CPA_STATUS_SUCCESS);

error:

Review comment (Contributor):

Let's model this after qat_compress(), which shares a common cleanup path for success and failure. Lines 351-363 are virtually identical to 370-383.
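
The shape being suggested is roughly the following; this is only a sketch, with variable names taken from context in this PR and the freed fields shown as examples rather than the exact list.

	/* Early failures, before anything is allocated, can still return directly. */
	status = init_cy_session_ctx(dir, cy_inst_handle, &cy_session_ctx,
	    key, crypt, aad_len);
	if (status != CPA_STATUS_SUCCESS)
		return (status);

	/*
	 * ... build the src/dst buffer lists; any failure in here does
	 * "goto fail" instead of duplicating the cleanup below ...
	 */

	status = cpaCySymPerformOp(cy_inst_handle, &complete, &op_data,
	    &src_buffer_list, &dst_buffer_list, NULL);

fail:
	/* One cleanup path shared by success and failure, as in qat_compress(). */
	cpaCySymRemoveSession(cy_inst_handle, cy_session_ctx);
	QAT_PHYS_CONTIG_FREE(src_buffer_list.pPrivateMetaData);
	QAT_PHYS_CONTIG_FREE(dst_buffer_list.pPrivateMetaData);

	return (status);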

This patch adds support for acceleration of AES-GCM encryption
with Intel Quick Assist Technology.

Signed-off-by: Chengfeix Zhu <chengfeix.zhu@intel.com>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
@tcaputi (Contributor, Author) commented Mar 8, 2018

@behlendorf your comments have been addressed. Thank you for the feedback.

@wli5 (Contributor) commented Mar 9, 2018

is there any reason we should limit this functionality to 128k?

@tcaputi for QAT hardware there is no such limitation; we limit it in software for performance reasons. The user can change the 128K to 1M in the code and it should be OK. We could also change it in this PR. One concern is for compression: there is a list of intermediate buffers, each pre-allocated based on the max buffer size. If in most cases the record size is < 1M, allocating so many "big" intermediate buffers is a waste. I don't know how many people use a 1M record size; if you think 1M is common, we can change the value to 1M in this PR.

			status = QAT_PHYS_CONTIG_ALLOC(
			    &buffer_array[i][buff_num]->pBuffers->
			    pData, 2 * QAT_MAX_BUF_SIZE);

@tcaputi (Contributor, Author) commented Mar 9, 2018

These graphs (courtesy of Intel) show the performance improvements granted when using QAT for encryption. I verified these results myself with a spinning disk pool and the results were very similar. As expected, QAT greatly reduces the CPU overhead associated with encryption and provides a marginal bump to throughput (encryption is usually not a bottleneck here).

[image005, image006: performance graphs]

@behlendorf behlendorf merged commit cf63739 into openzfs:master Mar 9, 2018
@adilger (Contributor) commented Mar 26, 2018

for QAT hardware there is no such limitation; we limit it in software for performance reasons. The user can change the 128K to 1M in the code and it should be OK. We could also change it in this PR. One concern is for compression: there is a list of intermediate buffers, each pre-allocated based on the max buffer size. If in most cases the record size is < 1M, allocating so many "big" intermediate buffers is a waste. I don't know how many people use a 1M record size; if you think 1M is common, we can change the value to 1M in this PR.

@wli5 I think that recordsize=1MB will become increasingly common for new datasets, as this improves performance significantly on faster storage devices, and is also the default for Lustre+ZFS data servers. Since recordsize is a dataset parameter, it should be fairly straightforward to determine at runtime whether a 128KB or 1MB buffer is needed for the intermediate compression buffers.

Either the buffers can be allocated based on recordsize, or they could be reallocated at 1MB (or larger, as needed) if larger blocks are compressed/encrypted. It would seem to me that offloading larger blocks to QAT would be more efficient than offloading smaller blocks due to fixed per-block overhead being amortized over more data, so this would be even better for QAT.

@wli5 (Contributor) commented Mar 27, 2018

@adilger thanks for the comments.
Per our discussion with @tcaputi, the record size is dynamic and can change for each operation, while the intermediate buffers are allocated in the QAT initialization phase. Maybe we can think about having different buffer size limits for compression and encryption, since encryption doesn't need intermediate buffers.

@adilger (Contributor) commented Mar 27, 2018

Is it possible to have the QAT intermediate compression buffers be reallocated if a larger request is submitted? At worst, if the larger buffer allocation fails it can return an error and fall back to CPU-based compression algorithm for larger buffers (i.e. what it is doing now), but in most cases the allocation should succeed and allow QAT to compress the larger buffers. That avoids the overhead of allocating too-large buffers if they are not needed, while allowing larger blocks to be compressed in the common cases.

A more complex buffer management system might allow "small" and "large" buffers separately (we do this for Lustre routers), only using "large" buffers if needed, and using "small" buffers for other requests. That would also reduce the amount of RAM pinned by the intermediate buffers to the number of concurrent "large" requests, rather than guessing in advance how many buffers might be needed and forcing all buffers to be the maximum size.
