hqm: add HQM PF device support
This commit fleshes out the HQM driver with full support for the physical
function device, including:
- ioctl interface: through a device file, the driver provides ioctls for
  device configuration and measurement. The ioctl interface is versioned
  and designed for backwards compatibility and extensibility.
- mmap interface: the driver provides an mmap callback that exposes
  a subset of the device directly to user-space, enabling device
  enqueue and dequeue operations without a syscall.
- read interface: applications can read() the device file to receive alerts
  from the device driver.
- sysfs interface: applications can use the sysfs interface to query
  resource availability and further tune the device.
- CQ interrupts: the driver supports interrupt-driven applications in
  addition to polling-driven applications.
- Dynamic power management: the device is put into D3Hot when no
  applications are using it.
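From user-space, these interfaces are reached through the per-device file. A rough, hypothetical sketch (the `hqm_open` helper is not part of this patch; it only follows the `/dev/hqm<N>` naming described in the ABI documentation below, and no real ioctl or mmap details are shown):

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of this patch: build the /dev/hqm<N>
 * path for a device ID and try to open it read-write. Returns the file
 * descriptor, or -1 if the device file does not exist.
 */
static int hqm_open(int dev_id, char *path, size_t len)
{
	snprintf(path, len, "/dev/hqm%d", dev_id);
	return open(path, O_RDWR);
}
```

An application would follow this with the configuration ioctls and an mmap() of its producer port; those calls are device-specific and not sketched here.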

Signed-off-by: Gage Eads <gage.eads@intel.com>
gageeads authored and intel-lab-lkp committed Dec 2, 2019
1 parent f74841d commit 1be1600
Showing 25 changed files with 21,721 additions and 9 deletions.
174 changes: 174 additions & 0 deletions Documentation/ABI/testing/sysfs-driver-hqm
@@ -1,3 +1,142 @@
What: /sys/bus/pci/devices/.../sequence_numbers/group<N>_sns_per_queue
Date: August 3, 2018
KernelVersion: TBD
Contact: gage.eads@intel.com
Description: Interface for configuring HQM load-balanced sequence numbers.

The HQM has a fixed number of sequence numbers used for ordered
scheduling. They are divided among four sequence number groups.
A group can be configured to contain one queue with 1,024
sequence numbers, or two queues with 512 sequence numbers each,
and so on, down to 32 queues with 32 sequence numbers each.
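The group geometry above can be sketched as a small helper (illustrative only; `hqm_group_slots` is not a driver function, and it assumes the 1,024-sequence-number groups described here):

```c
/* Given the sequence numbers requested per queue, return how many
 * ordered-queue slots one group can hold, or 0 if the value is not a
 * supported configuration (powers of two from 32 to 1,024).
 */
static int hqm_group_slots(int sns_per_queue)
{
	if (sns_per_queue < 32 || sns_per_queue > 1024)
		return 0;
	if (sns_per_queue & (sns_per_queue - 1)) /* not a power of two */
		return 0;
	return 1024 / sns_per_queue;
}
```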

When a load-balanced queue is configured with non-zero sequence
numbers, the driver finds a group configured for the same
number of sequence numbers and an available slot. If no such
groups are found, the queue cannot be configured.

Once the first ordered queue is configured, the sequence number
configurations are locked. The driver returns an error on writes
to locked sequence number configurations. When all ordered
queues are unconfigured, the sequence number configurations can
be changed again.

This file is only accessible for physical function HQM devices.

What: /sys/bus/pci/devices/.../total_resources/num_atomic_inflights
What: /sys/bus/pci/devices/.../total_resources/num_dir_credit_pools
What: /sys/bus/pci/devices/.../total_resources/num_dir_credits
What: /sys/bus/pci/devices/.../total_resources/num_dir_ports
What: /sys/bus/pci/devices/.../total_resources/num_hist_list_entries
What: /sys/bus/pci/devices/.../total_resources/num_ldb_credit_pools
What: /sys/bus/pci/devices/.../total_resources/num_ldb_credits
What: /sys/bus/pci/devices/.../total_resources/num_ldb_ports
What: /sys/bus/pci/devices/.../total_resources/num_ldb_queues
What: /sys/bus/pci/devices/.../total_resources/num_sched_domains
Date: August 7, 2019
KernelVersion: TBD
Contact: gage.eads@intel.com
Description:
The total_resources subdirectory contains read-only files that
indicate the total number of resources in the device.

num_atomic_inflights: Total number of atomic inflights in the
device. Atomic inflights refers to the
on-device storage used by the atomic
scheduler.

num_dir_credit_pools: Total number of directed credit pools in
the device.

num_dir_credits: Total number of directed credits in the
device.

num_dir_ports: Total number of directed ports (and
queues) in the device.

num_hist_list_entries: Total number of history list entries in
the device.

num_ldb_credit_pools: Total number of load-balanced credit
pools in the device.

num_ldb_credits: Total number of load-balanced credits in
the device.

num_ldb_ports: Total number of load-balanced ports in
the device.

num_ldb_queues: Total number of load-balanced queues in
the device.

num_sched_domains: Total number of scheduling domains in the
device.

What: /sys/bus/pci/devices/.../avail_resources/num_atomic_inflights
What: /sys/bus/pci/devices/.../avail_resources/num_dir_credit_pools
What: /sys/bus/pci/devices/.../avail_resources/num_dir_credits
What: /sys/bus/pci/devices/.../avail_resources/num_dir_ports
What: /sys/bus/pci/devices/.../avail_resources/num_hist_list_entries
What: /sys/bus/pci/devices/.../avail_resources/num_ldb_credit_pools
What: /sys/bus/pci/devices/.../avail_resources/num_ldb_credits
What: /sys/bus/pci/devices/.../avail_resources/num_ldb_ports
What: /sys/bus/pci/devices/.../avail_resources/num_ldb_queues
What: /sys/bus/pci/devices/.../avail_resources/num_sched_domains
What: /sys/bus/pci/devices/.../avail_resources/max_ctg_atm_inflights
What: /sys/bus/pci/devices/.../avail_resources/max_ctg_hl_entries
Date: August 7, 2019
KernelVersion: TBD
Contact: gage.eads@intel.com
Description:
The avail_resources subdirectory contains read-only files that
indicate the available number of resources in the device.
"Available" here means resources that are not currently in use
by an application or, in the case of a physical function
device, assigned to a virtual function.

num_atomic_inflights: Available number of atomic inflights in
the device.

num_dir_credit_pools: Available number of directed credit
pools in the device.

num_dir_credits: Available number of directed credits in
the device.

num_dir_ports: Available number of directed ports (and
queues) in the device.

num_hist_list_entries: Available number of history list entries
in the device.

num_ldb_credit_pools: Available number of load-balanced credit
pools in the device.

num_ldb_credits: Available number of load-balanced credits
in the device.

num_ldb_ports: Available number of load-balanced ports
in the device.

num_ldb_queues: Available number of load-balanced queues
in the device.

num_sched_domains: Available number of scheduling domains
in the device.

max_ctg_atm_inflights: Maximum contiguous atomic inflights
available in the device.

Each scheduling domain is created with
an allocation of atomic inflights, and
each domain's allocation of inflights
must be contiguous.

max_ctg_hl_entries: Maximum contiguous history list entries
available in the device.

Each scheduling domain is created with
an allocation of history list entries,
and each domain's allocation of entries
must be contiguous.
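The max_ctg_* files exist because a domain's allocation must be one contiguous run, so the plain available counts can overstate what is actually allocatable once the space is fragmented. A minimal sketch of the idea (illustrative only, not driver code):

```c
/* Longest run of free (zero) slots in an occupancy map of n slots;
 * this is the largest contiguous allocation a new scheduling domain
 * could be granted.
 */
static int max_contiguous_free(const unsigned char *used, int n)
{
	int best = 0, run = 0, i;

	for (i = 0; i < n; i++) {
		run = used[i] ? 0 : run + 1;
		if (run > best)
			best = run;
	}
	return best;
}
```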

What: /sys/bus/pci/drivers/hqm/module/parameters/log_level
Date: August 3, 2018
KernelVersion: TBD
@@ -15,3 +154,38 @@ Description:	Interface for setting the driver's log level.
configuration descriptions, and function entry and exit
points. These messages are verbose, but they give a clear
view into the driver's behavior.

What: /sys/bus/pci/drivers/hqm/module/parameters/reset_timeout_s
Date: August 3, 2018
KernelVersion: TBD
Contact: gage.eads@intel.com
Description: Interface for setting the driver's reset timeout.
When a device reset (FLR) is issued, the driver waits for
user-space to stop using the device before allowing the FLR to
proceed, with a timeout. The device is considered in use if
there are any open domain device file descriptors or memory
mapped producer ports. (For PF device resets, this includes all
VF-owned domains and producer ports.)

The amount of time the driver waits for userspace to stop using
the device is controlled by the module parameter
reset_timeout_s, which is in units of seconds and defaults to
5. If reset_timeout_s seconds elapse and any user is still
using the device, the driver zaps those processes' memory
mappings and marks their device file descriptors as invalid.
This is necessary because user processes that do not relinquish
their device mappings can interfere with processes that use the
device after the reset completes. To ensure that user processes
have enough time to clean up, reset_timeout_s can be increased.

What: /sys/bus/pci/devices/.../dev_id
Date: August 6, 2019
KernelVersion: TBD
Contact: gage.eads@intel.com
Description: Device ID used in /dev, i.e. /dev/hqm<device ID>

Each HQM PF and VF device is granted a unique ID by the kernel
driver, and this ID is used to construct the device's /dev
directory: /dev/hqm<device ID>. This sysfs file can be read to
determine a device's ID, which allows the user to map a device
file to a PCI BDF.
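All of the sysfs attributes described above hold a single decimal value, so reading one from a program is straightforward. A hedged sketch (`read_sysfs_long` is hypothetical; substitute a real PCI BDF for the elided `...` path component):

```c
#include <stdio.h>

/* Read a single decimal value from a sysfs attribute file, e.g.
 * /sys/bus/pci/devices/<BDF>/avail_resources/num_ldb_queues or
 * /sys/bus/pci/devices/<BDF>/dev_id. Returns the value, or -1 on
 * open/parse failure.
 */
static long read_sysfs_long(const char *path)
{
	FILE *f = fopen(path, "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```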
5 changes: 3 additions & 2 deletions arch/x86/Makefile
@@ -189,14 +189,15 @@ cfi-sections := $(call as-instr,.cfi_sections .debug_frame,-DCONFIG_AS_CFI_SECTI

# does binutils support specific instructions?
asinstr += $(call as-instr,pshufb %xmm0$(comma)%xmm0,-DCONFIG_AS_SSSE3=1)
sse2_instr := $(call as-instr,movapd %xmm0$(comma)%xmm0,-DCONFIG_AS_SSE2=1)
avx_instr := $(call as-instr,vxorps %ymm0$(comma)%ymm1$(comma)%ymm2,-DCONFIG_AS_AVX=1)
avx2_instr :=$(call as-instr,vpbroadcastb %xmm0$(comma)%ymm1,-DCONFIG_AS_AVX2=1)
avx512_instr :=$(call as-instr,vpmovm2b %k1$(comma)%zmm5,-DCONFIG_AS_AVX512=1)
sha1_ni_instr :=$(call as-instr,sha1msg1 %xmm0$(comma)%xmm1,-DCONFIG_AS_SHA1_NI=1)
sha256_ni_instr :=$(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,-DCONFIG_AS_SHA256_NI=1)

-KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr)
-KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr)
+KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr) $(sse2_instr)
+KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr) $(sse2_instr)

KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)

10 changes: 8 additions & 2 deletions drivers/misc/hqm/Makefile
@@ -4,5 +4,11 @@

obj-$(CONFIG_INTEL_HQM) := hqm.o

-hqm-objs := \
-	hqm_main.o \
+hqm-objs := \
+	hqm_main.o \
+	hqm_intr.o \
+	hqm_ioctl.o \
+	hqm_mem.o \
+	hqm_pf_ops.o \
+	hqm_resource.o \
+	hqm_smon.o \
83 changes: 83 additions & 0 deletions drivers/misc/hqm/hqm_dp_ops.h
@@ -0,0 +1,83 @@
/* SPDX-License-Identifier: GPL-2.0-only
* Copyright(c) 2017-2019 Intel Corporation
*/

#ifndef __HQM_OPS_DP_H
#define __HQM_OPS_DP_H

#include <linux/frame.h>
#include <asm/fpu/api.h>
#include <asm/cpu.h>

/* CPU feature enumeration macros */
#define CPUID_DIRSTR_BIT 27
#define CPUID_DIRSTR64B_BIT 28

static inline bool movdir64b_supported(void)
{
int eax, ebx, ecx, edx;

asm volatile("mov $7, %%eax\n\t"
"mov $0, %%ecx\n\t"
"cpuid\n\t"
: "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx));

return ecx & (1 << CPUID_DIRSTR64B_BIT);
}

/**
* movntdq_asm() - execute a movntdq instruction
* @addr: mapped producer port address
* @data0: least-significant 8B to move
* @data1: most-significant 8B to move
*
* This function executes movntdq, moving @data0 and @data1 into the address
* @addr.
*/
static inline void movntdq_asm(long long __iomem *addr,
long long data0,
long long data1)
{
#ifdef CONFIG_AS_SSE2
__asm__ __volatile__("movq %1, %%xmm0\n"
"movhps %2, %%xmm0\n"
"movntdq %%xmm0, %0"
: "=m" (*addr) : "r" (data0), "m" (data1));
#endif
}

static inline void hqm_movntdq(void *qe4, void __iomem *pp_addr)
{
/* Move entire 64B cache line of QEs, 128 bits (16B) at a time. */
long long *_qe = (long long *)qe4;

kernel_fpu_begin();
movntdq_asm(pp_addr + 0, _qe[0], _qe[1]);
/* (see comment below) */
wmb();
movntdq_asm(pp_addr + 0, _qe[2], _qe[3]);
/* (see comment below) */
wmb();
movntdq_asm(pp_addr + 0, _qe[4], _qe[5]);
/* (see comment below) */
wmb();
movntdq_asm(pp_addr + 0, _qe[6], _qe[7]);
kernel_fpu_end();
/* movntdq requires an sfence between writes to the PP MMIO address */
wmb();
}

static inline void hqm_movdir64b(void *qe4, void __iomem *pp_addr)
{
/* movdir64b (%rdx), %rax: move the 64B QE at qe4 ("d") to the
 * producer port MMIO address in pp_addr ("a") as a single 64-byte
 * write. The raw byte encoding (66 0F 38 F8 /r) is used because
 * older assemblers do not recognize the movdir64b mnemonic.
 */
asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
:
: "a" (pp_addr), "d" (qe4));
}

/* objtool's instruction decoder doesn't recognize the hard-coded machine
* instructions for movdir64b, which causes it to emit "undefined stack state"
* and "falls through" warnings. For now, ignore the functions.
*/
STACK_FRAME_NON_STANDARD(hqm_movdir64b);

#endif /* __HQM_OPS_DP_H */
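The CPUID probe in hqm_dp_ops.h above can be mirrored in user-space with the compiler's cpuid helpers. This sketch (not part of the patch) checks the same leaf-7/subleaf-0 ECX bits used by the driver, where bit 27 is MOVDIRI and bit 28 is MOVDIR64B:

```c
#include <cpuid.h>
#include <stdbool.h>

/* Return true if the CPU advertises movdir64b (CPUID.(7,0):ECX[28]).
 * __get_cpuid_count returns 0 if leaf 7 is not supported at all.
 */
static bool cpu_has_movdir64b(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return (ecx >> 28) & 1;
}
```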
