Can't map PCI memory if kernel driver is active in Linux >= 4.5 #1286

Closed
alexandergall opened this issue Feb 13, 2018 · 0 comments · Fixed by #1436
This test case uses the current master (2ad3ed2) with a trivial graph:

local c = config.new()
config.app(c, "in", require("apps.intel_mp.intel_mp").Intel,
           { pciaddr = "0000:69:00.1" })
config.app(c, "out", require("apps.basic.basic_apps").Sink)
config.link(c, "in.output -> out.input")
engine.configure(c)
engine.main({ duration = 5, report = { showlinks = true } })

When the device is unbound from the kernel, it works as expected:

$ sudo ./snabb pci_bind -u 0000:69:00.1
Unbound 0000:69:00.1, ready for Snabb.
$ sudo ./snabb snsh test.lua
link report:
          18,604,865 sent on in.output -> out.input (loss rate: 0%)

But when the device is still bound to the kernel, mapping fails:

$ sudo bash -c "echo 0000:69:00.1 >/sys/bus/pci/drivers/ixgbe/bind"
$ sudo ./snabb snsh test.lua
core/main.lua:26: Invalid argument

Stack Traceback
===============
(1) Lua function 'handler' at file 'core/main.lua:168' (best guess)
        Local variables:
         reason = string: "core/main.lua:26: Invalid argument"
         (*temporary) = C function: print
(2) global C function 'error'
(3) Lua global 'assert' at file 'core/main.lua:26'
        Local variables:
         v = nil
(4) Lua field 'map_pci_memory_unlocked' at file 'lib/hardware/pci.lua:159'
        Local variables:
         device = string: "0000:69:00.1"
         n = number: 0
         lock = boolean: false
         filepath = string: "/sys/bus/pci/devices/0000:69:00.1/resource0"
         err = nil
(5) Lua method 'new' at file 'apps/intel_mp/intel_mp.lua:358'
        Local variables:
         self = table: 0x41190df0  {rss_tab:function: 0x41192338, transmit:function: 0x41192520, rss_tab_build:function: 0x41192358 (more...)}
         conf = table: 0x41193da8  {mtu:9014, linkup_wait_recheck:0.1, rate_limit:0, linkup_wait:120, master_stats:true (more...)}
         self = table: 0x41196da0  {shm_root:/intel-mp/69:00.1/, pciaddress:0000:69:00.1, rate_limit:0, vmdq:false (more...)}
         vendor = string: "0x8086"
         device = string: "0x10fb"
         byid = table: 0x41191d40  {driver:table: 0x41191ac0, registers:82599ES, max_q:16}
(6) Lua function 'ops' at file 'core/app.lua:342' (best guess)
        Local variables:
         name = string: "in"
         class = table: 0x41190df0  {rss_tab:function: 0x41192338, transmit:function: 0x41192520, rss_tab_build:function: 0x41192358 (more...)}
         arg = table: 0x41193da8  {mtu:9014, linkup_wait_recheck:0.1, rate_limit:0, linkup_wait:120, master_stats:true (more...)}
(7) Lua global 'apply_config_actions' at file 'core/app.lua:369'
        Local variables:
         actions = table: 0x411967e0  {1:table: 0x41196830, 2:table: 0x41196928, 3:table: 0x41196a08, 4:table: 0x41196ad0 (more...)}
         ops = table: 0x41196aa0  {unlink_output:function: 0x41196c60, stop_app:function: 0x41196d38, free_link:function: 0x41196cf8 (more...)}
         remove_link_from_array = Lua function 'remove' (defined at line 289 of chunk core/app.lua)
         (for generator) = C function: builtin#6
         (for state) = table: 0x411967e0  {1:table: 0x41196830, 2:table: 0x41196928, 3:table: 0x41196a08, 4:table: 0x41196ad0 (more...)}
         (for control) = number: 1
         _ = number: 1
         action = table: 0x41196830  {1:start_app, 2:table: 0x41196868}
         name = string: "start_app"
         args = table: 0x41196868  {1:in, 2:table: 0x41190df0, 3:table: 0x41193da8}
(8) Lua field 'configure' at file 'core/app.lua:136'
        Local variables:
         new_config = table: 0x40859098  {links:table: 0x4085ebd8, apps:table: 0x40860288}
         actions = table: 0x411967e0  {1:table: 0x41196830, 2:table: 0x41196928, 3:table: 0x41196a08, 4:table: 0x41196ad0 (more...)}
(9) main chunk of file 'test.lua' at line 7
(10) global C function 'dofile'
(11) Lua global 'run_script' at file 'program/snsh/snsh.lua:87'
        Local variables:
         parameters = table: 0x40052ad8  {}
         command = string: "test.lua"
(12) Lua field 'run' at file 'program/snsh/snsh.lua:71'
        Local variables:
         parameters = table: 0x40052ad8  {}
         profiling = boolean: false
         traceprofiling = boolean: false
         start_repl = boolean: false
         noop = boolean: true
         program = nil
         opt = table: 0x41fafc40  {t:function: 0x405efc90, q:function: 0x405efcb0, P:function: 0x400528b0 (more...)}
(13) Lua function 'main' at file 'core/main.lua:67' (best guess)
        Local variables:
         program = string: "snsh"
         args = table: 0x4049bf80  {1:test.lua}
(14) global C function 'xpcall'
(15) main chunk of file 'core/main.lua' at line 230
(16)  C function 'require'
(17) global C function 'pcall'
(18) main chunk of file 'core/startup.lua' at line 3
(19) global C function 'require'
(20) main chunk of [string "require "core.startup""] at line 1
        nil

This problem does not occur on a 4.4 kernel. A look in the 4.9 kernel source reveals the following. From drivers/pci/pci-sysfs.c:pci_mmap_resource():

        if (res->flags & IORESOURCE_MEM && iomem_is_exclusive(res->start))
                return -EINVAL;

and from kernel/resource.c:iomem_is_exclusive():

                /*
                 * A resource is exclusive if IORESOURCE_EXCLUSIVE is set
                 * or CONFIG_IO_STRICT_DEVMEM is enabled and the
                 * resource is busy.
                 */
                if ((p->flags & IORESOURCE_BUSY) == 0)
                        continue;
                if (IS_ENABLED(CONFIG_IO_STRICT_DEVMEM)
                                || p->flags & IORESOURCE_EXCLUSIVE) {
                        err = 1;
 

See also https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=90a545e981267e917b9d698ce07affd69787db87
This change was introduced in 4.5, and CONFIG_IO_STRICT_DEVMEM is the kernel default. At least on Debian, it is also the distribution default.
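The two kernel excerpts combine into a simple decision, which the following Lua sketch models. It is a simplified illustration, not kernel code: the `res` table and its boolean fields `mem`, `busy` and `exclusive` are hypothetical stand-ins for the resource flags, and the real kernel function walks the whole iomem resource tree rather than inspecting a single entry. The model assumes that a region claimed by a kernel driver such as ixgbe is marked busy.

```lua
-- Simplified model of the checks quoted above (illustration only).
-- `res` is a hypothetical table: { mem = bool, busy = bool, exclusive = bool }.
local function iomem_is_exclusive(res, strict_devmem)
   -- A resource that is not busy is never treated as exclusive.
   if not res.busy then return false end
   -- Busy resources are exclusive if strict mode is on or the flag is set.
   return (strict_devmem or res.exclusive) and true or false
end

-- Models the decision in pci_mmap_resource(): refuse with EINVAL
-- when a memory resource is exclusive, otherwise allow the mapping.
local function sysfs_mmap_result(res, strict_devmem)
   if res.mem and iomem_is_exclusive(res, strict_devmem) then
      return nil, "EINVAL"
   end
   return true
end

-- BAR claimed by a kernel driver (busy), strict mode enabled.
print(sysfs_mmap_result({ mem = true, busy = true, exclusive = false }, true))
-- Same BAR after the driver has been unbound (no longer busy).
print(sysfs_mmap_result({ mem = true, busy = false, exclusive = false }, true))
```

In this model, the only difference between the failing and the working case is whether the region is still busy, which matches the observation that unbinding the device makes the test pass.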

PCI memory is mapped by a family of functions (map_pci_memory, map_pci_memory_locked and map_pci_memory_unlocked), which can also take an optional advisory lock on the file /sys/bus/pci/devices/<pci_address>/resource0. The intel_mp driver uses this lock to determine which process is the master for the device:

   -- Setup device access
   self.base, self.fd = pci.map_pci_memory_unlocked(self.pciaddress, 0)
   self.master = self.fd:flock("ex, nb")

The device is unbound from the kernel driver only after this point (in the init() method). This is why the test case fails on Linux >= 4.5 when CONFIG_IO_STRICT_DEVMEM is enabled.

It seems to me that the proper solution is to map the memory only after the device has been unbound. This would require decoupling the advisory lock from map_pci_memory() and moving the call to unbind_device_from_linux() out of the init() method.
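The proposed ordering could look roughly like the sketch below. It only illustrates the sequence of steps and is not the actual fix: `open_device` and the `unbind`, `map` and `lock` fields are hypothetical placeholders passed in as a table, not the real lib.hardware.pci API, and the stub at the end exists only to make the call order visible.

```lua
-- Sketch of the proposed ordering (illustration only).
-- `pci` is any table providing three hypothetical operations:
--   pci.unbind(addr), pci.map(addr, n) -> base, fd, pci.lock(fd) -> bool
local function open_device(pci, pciaddress)
   -- 1. Detach the kernel driver first, so the region is no longer busy.
   pci.unbind(pciaddress)
   -- 2. Only now map resource 0 of the device.
   local base, fd = pci.map(pciaddress, 0)
   -- 3. Take the advisory lock as a separate step to elect the master.
   local master = pci.lock(fd)
   return { base = base, fd = fd, master = master }
end

-- Stub implementation that records the order of the calls.
local calls = {}
local stub = {
   unbind = function (addr) calls[#calls + 1] = "unbind" end,
   map = function (addr, n) calls[#calls + 1] = "map"; return "base", "fd" end,
   lock = function (fd) calls[#calls + 1] = "lock"; return true end,
}

local dev = open_device(stub, "0000:69:00.1")
print(table.concat(calls, " -> "))
```

Keeping the lock as its own step means the mapping functions no longer need to know about master election, which is the decoupling suggested above.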
