-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exploit s390's new and enhanced PCI memory I/O (MIO) instructions #1122
Conversation
I honestly don't know what to do this PR. |
It is quite special so that's understandable. Sadly I can't currently refer you to a specification but I'm here to answer First some general s390 PCI and I/O background:
I can also give some more pointers to related code in the Linux Kernel:
|
d3ac128
to
b42f08d
Compare
@jgunthorpe I have pushed an updated version with the following changes:
I did not use the IFUNC mechanism in this version because I think a flag is simpler and allows us to save the function call overhead for the simple load/store case where I'd also expect more gain from inlining especially since the length parameter |
b2d7008
to
212b4f6
Compare
212b4f6
to
87f3009
Compare
I just pushed a new version with the following changes:
|
14096c9
to
298bf24
Compare
I just pushed a v5. It is rebased on current master and removes again the busy wait for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems fine, just a few nitpicky things
util/mmio.c
Outdated
|
||
static __attribute__((constructor)) void check_mio_supported(void) | ||
{ | ||
#ifdef HWCAP_S390_PCI_MIO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather you use the usual shim mechanism for this constant to ensure it is always declared regardless of the libc version. ie have cmake test it and replace sys/auxv.h like we do for other things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks, will look into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In v6 that I just pushed, I added a buildlib/fixup-include/sys-auxv.h
with the HWCAP_S390_PCI_MIO
. We'll likely have backports for glibc 2.34 for this flag in the major distributions with s390 support anyway but in the code the shim method is definitely nicer than the inline #ifdef
.
#define mmio_flush_writes() asm volatile("" ::: "memory") | ||
#elif defined(__loongarch__) | ||
#define mmio_flush_writes() asm volatile("dbar 0" ::: "memory") | ||
#elif defined(__riscv) | ||
#define mmio_flush_writes() asm volatile("fence ow,ow" ::: "memory") | ||
#elif defined(__s390x__) | ||
#include "s390_mmio_insn.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is read and write being defined through udma_barrier.h ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I include the s390_mmio_insn.h
for the s390_is_mio_supported
flag. One way not to include the read/write would be to move the MAKE_READ/MAKE_WRITE
to mmio.h
and keep s390_mmio_insn.h
just for the flag and the instruction wrappers. Would that be acceptable @jgunthorpe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just add an extern for the flag, it doesn't have be in exactly one header. Just be sure to include all the headers that define the extern in the C file that instantiates it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that, sadly it causes a redundant redecleration warning in all files that include both udma_barrier.h
and mmio.h
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now clearly we could work around this with an extra define much like a header guard such that util/udma_barrier.h
or util/s390_mmio_insn.h
depending on which comes first. But that's kind of ugly. No?
Personally I think, moving the MAKE_READ/MAKE_WRITE as well as the 8 bit read/write to util/mmio.h
while including util/s390_mmio_insn.h
is a clean solution. It even improves the match with the file name, keeping s390_mmio_insn.h
strictly about the instructions. I also noticed I had the flag check in both the s390_pciwb()
itself and again in mmio_flush_writes()
with that removed I find it pretty clean. Basically in util/udma_barrier.h
we're just left with #define mmio_flush_writes() s390_pciwb()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a v6 with the above idea for you to take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh blah, yes we have to deal with that extra warning in these headers.
298bf24
to
501dad7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems OK overall, just do the cmake thing please
# glibc before 2.35 does not necesarily define the HWCAP_S390_PCI_MIO hardware | ||
# capability bit constant. Check for it and if necessary shim it in such that | ||
# kernel support for PCI MIO instructions can always be checked. | ||
RDMA_Check_C_Compiles(HAVE_GLIBC_HWCAP_S390_PCI_MIO " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look at the end of this file and print out a message when this has to be activated like all the other cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added one at the bottom, wasn't sure before because there's already a Success/Failed message for the check itself.
In the existing MMIO implementation s390 relies on special syscalls to access PCI memory. This was necessary as s390 originally only had special privileged instructions for accessing PCI memory. With z15 however comes support for 4 new PCI memory I/O (MIO) instructions which operate on virtually mapped PCI memory spaces. While these are still special PCI access instructions instead of real MMIO they behave much more like standard MMIO access. There is a load like instruction pcilgi, a store like instruction pcistgi a block store variant for efficient memcpy pcistbi and a write barrier instruction pciwb. The load and store variants always operate on a 64 bit register but only load/store the right most bytes of the register controlled by a length value in a paired register (even numbered register rN + odd numbered register r(N+1)). As the previous PCI instructions did not operate on virtual memory mappings at all a kernel using them does not setup virtual memory mappings and thus can't support user-space using the new instructions. Also as use of PCI MIO instructions can be disabled via the pci=nomio kernel parameter we can't solely rely on hardware support and kernel version. Instead Linux exposes whether PCI MIO instructions are enabled via an ELF hardware capability. With this patch we check for this capability and if enabled use the newly introduced PCI MIO instructions for MMIO access and barriers. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
501dad7
to
fe43e33
Compare
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Hi All,
In the existing MMIO implementation s390 relies on special syscalls to
access PCI memory. This was necessary as s390 originally only had
special privileged instructions for accessing PCI memory. With z15
however comes support for 4 new PCI memory I/O (MIO) instructions which
operate on virtually mapped PCI memory spaces.
While these are still special PCI access instructions instead of real
MMIO they behave much more like standard MMIO access. There is a load
like instruction pcilgi, a store like instruction pcistgi a block store
variant for efficient memcpy pcistbi and a write barrier instruction
pciwb. The load and store variants always operate on a 64 bit register
but only load/store the right most bytes of the register controlled by
a length value in a paired register (even numbered register rN + odd
numbered register r(N+1)).
As the previous PCI instructions did not operate on virtual memory
mappings at all a kernel using them does not setup virtual memory
mappings and thus can't support user-space using the new instructions.
Also as use of PCI MIO instructions can be disabled via the
pci=nomio
kernel parameter we can't solely rely on hardware support andkernel version. Instead Linux exposes whether PCI MIO instructions are
enabled via an ELF hardware capability. With this patch we check for
this capability and if enabled use the newly introduced PCI MIO
instructions for MMIO access and barriers.
Note to reviewers: One thing I'm not sure about is the handling of the
ELF hardware capability in udma_barrier.h. Because that is used without
mmio.h/mmio.c at least in the compile test for DMA coherency support.
So I didn't want to add a dependency to the
extern s390_mio_supported
variable. Instead I added a separate
getauxval()
call but that doesn'tseem ideal either, I couldn't measure any performance impact even on
a 100 Gbit/s card but still it's kind of unnecessary overhead.
Thanks,
Niklas