needed ? how-to and tested on xyz #46

Closed
tripLr opened this issue Oct 8, 2019 · 2 comments

@tripLr

tripLr commented Oct 8, 2019

In general, what's the method to fix the driver for a newer kernel?

Also, is there a way to add a verified kernel report to the README, so we can look for these devices when they are offered for sale used?

Also, in general, who actually makes the controller chip? Would they have current source code, or a similar chipset, for porting or even creating a kernel patch to be submitted?

@snuf
Collaborator

snuf commented Oct 8, 2019

Hi @tripLr,

These are the steps I follow when attempting a fix. The process can take anywhere from 4 hours to 20+, mostly due to context switching (heh, life, sleep, and doing work that pays for food tend to get in the way):

  1. Build the newer kernel for the header files
  2. Compile the module with the header files
    a. If the compile breaks on the newer kernel, go to 3
    b. If all goes well go to 8
  3. Look through errors generated, and look at where things broke
  4. Look at changes in detection of parameters (kio_config)
  5. Look at changes in the newer kernel (https://elixir.bootlin.com/linux/latest/source/kernel)
  6. Try to figure out if changes are simple, or more complex
    a. when simpler, fix kio_config, put the fix in place, and start again from 1.
    b. when more complex:
      b1. Look at where other drivers used the same construct that is now "obsolete", and trace the changes made.
      b2. Look for newer commercial "open" drivers that have had similar constructs and have now had changes (the NVIDIA driver has helped a bit here and there)
      b3. Sift through LKML, and hope to find something useful on the what, why, where, and when
      b4. Look at the block level and abstraction implementations in the Linux kernel, and figure out what has changed
  7. If the module compiles, run the automated compile for several kernels (ideally the kio_config should be compared to previous versions for consistency, which is not done yet).
  8. Run the Debian packaging and generate packages (check the package size)
  9. Run the integration tests in a PCIe forwarded VM with the specified kernels
    a. compile the module
    b. insert the module
      b1. Kernel panic, return to 6b
    c. run fio tests, and capture output
      c1. Kernel panic, return to 6b
      c2. Fio reports data corruption (should add a good mix of tests here btw, I do just 2 now), return to 6b
      c3. Fio reports dramatically low throughput, return to 6b
  10. If all goes well, push to GitHub, either to a branch or as a merge to master, depending on impact.
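
The fio outcomes in step 9c could be triaged automatically along these lines. This is only a rough sketch, assuming fio is run with `--output-format=json` and that a failed verify shows up as a non-zero per-job `error` value. The function name `triage_fio` and the `min_bw_kib` threshold are made up for illustration and are not part of the project's test scripts. A kernel panic cannot be seen from fio output at all, so that case still has to be caught by watching the VM itself.

```python
import json


def triage_fio(report_text, min_bw_kib=100_000):
    """Classify a fio JSON report as 'io-error', 'low-throughput', or 'ok'.

    min_bw_kib is a hypothetical threshold in KiB/s; pick one that fits
    the card and the workload being tested.
    """
    report = json.loads(report_text)
    jobs = report.get("jobs", [])

    # Assumption: any non-zero per-job error (including a failed verify)
    # counts as a failure that needs investigating.
    for job in jobs:
        if job.get("error", 0) != 0:
            return "io-error"

    # fio reports "bw" in KiB/s per direction; take the busier direction.
    for job in jobs:
        read_bw = job.get("read", {}).get("bw", 0)
        write_bw = job.get("write", {}).get("bw", 0)
        if max(read_bw, write_bw) < min_bw_kib:
            return "low-throughput"

    return "ok"
```

Anything other than `"ok"` would send the run back to the investigation steps, in the same way the manual checks do.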

Most of the above testing, compiling, and verification is done in Ubuntu Docker containers based on an environment file. Testing is done from a VM with PCIe passthrough. Most of the things that can be automated and scripted are scripted. I should probably put all that stuff on GitHub so other people can use it too, but I haven't gotten around to tidying it up in a way I like.
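
For anyone who wants to script the "compile for several kernels" part themselves in the meantime, here is a minimal sketch. It assumes the generic out-of-tree kbuild invocation (`make -C <kernel dir> M=<module dir> modules`); this driver's own Makefile may wrap the build differently, and the function names and paths here are hypothetical, not the scripts mentioned above.

```python
import subprocess


def kbuild_command(kernel_dir, module_dir):
    # Generic out-of-tree module build; module_dir should be an absolute path.
    return ["make", "-C", str(kernel_dir), f"M={module_dir}", "modules"]


def build_matrix(kernel_dirs, module_dir, run=subprocess.run):
    """Build the module against each kernel tree and record the outcome.

    Returns {kernel_dir: (succeeded, stderr_text)}. The `run` parameter
    exists so the function can be exercised without invoking make.
    """
    results = {}
    for kernel_dir in kernel_dirs:
        proc = run(
            kbuild_command(kernel_dir, module_dir),
            capture_output=True,
            text=True,
        )
        results[str(kernel_dir)] = (proc.returncode == 0, proc.stderr)
    return results


def failed(results):
    # Kernel trees whose build broke, i.e. candidates for step 3 above.
    return sorted(k for k, (ok, _) in results.items() if not ok)
```

Calling `failed(build_matrix([...], "/path/to/module"))` with a list of prepared kernel header directories gives the kernels that still need work. The captured stderr for each is the starting point for looking at where things broke.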

Adding verified kernel support to the README would actually be a good idea; however, this driver requires testing and verification. Screwing around with in-kernel parts can cause horrendous data loss!
Device-wise, the devices supported by this driver are here, and their device ID translation can be found here, which also contains some other vendor names that are based on the FusionIO controller. This excludes the "rebranded" cards by HP, Dell, IBM, Cisco, Fujitsu, and Supermicro; more information can be found here. Making a matrix of the vendor IDs listed in the driver should be possible, and is something I'll look into when I have some spare cycles.
As a side note, the IOMemory devices have a different driver. I once did a fix for someone but never checked it in (shame on me). The differences between the driver shim for the IOMemory and the one for this family of devices are not staggering, however. Mind you, this "driver" is actually a shim that sits between the actual compiled device interface and the kernel.

To your last question: this fork started because the company that fabricates these cards was lagging considerably behind on supporting newer kernels. At the time, over six years ago, I was fortunate to get one thrown in my lap. However, when I updated my kernel, the kernel module didn't compile anymore, so I decided to "fix" it myself instead of waiting on the vendor. It wasn't until about five years ago that I put this stuff on GitHub, as I thought it might be of use to other people. It was a project that lost my attention for a while, and I am grateful for the work that @plappermaul has done, and does.

Best,

Funs

@snuf
Collaborator

snuf commented Oct 10, 2019

@tripLr thoughts?

@snuf snuf pinned this issue Dec 11, 2019
@snuf snuf closed this as completed Dec 11, 2019