wrong naming of controllers #455

Closed
scheuref opened this issue Jan 31, 2019 · 12 comments

@scheuref

scheuref commented Jan 31, 2019

Dear Maintainers,

We tried to replace the default 8 TB namespace on an INTEL SSDPE2KX080T8 with two 4 TB namespaces.
We ran into several strange issues with recent kernels.

On SLES12 SP3 (4.4.162-94.72-default #1 SMP Mon Nov 12 18:57:45 UTC 2018 (9de753f) x86_64 x86_64 x86_64 GNU/Linux)
with nvme-cli v1.2, we could delete the namespace but could create only one. This seems to be an Intel limitation.
However, recreating the ns with the original size gave a NS_INSUFFICIENT_CAPACITY error,
but trying with a size 64 MiB smaller was successful.

On recent kernels, e.g. grml 4.19.0-1-grml-amd64 #1 SMP Debian 4.19.8-1+grml.1 (2018-12-11) x86_64 GNU/Linux
with nvme-cli v1.6, we had the same lower-capacity issue after deleting a namespace.
We also hit a much more serious bug: nvme-cli was sending commands to the wrong NVMe devices.
For example, a format command on nvme0 wiped all data on nvme3!!! (Fortunately it was a test server...)
We built nvme-cli from the current master branch (v1.7.18.g637b) with the same results.


    # As you can see below, nvme4 is the controller for /dev/nvme1n1
    # and nvme1 is the controller for /dev/nvme4n1.
    # 'nvme format /dev/nvme1' will therefore wipe /dev/nvme4n1 instead of /dev/nvme1n1!
    # It goes without saying that this bug is of UTMOST CRITICALITY, as it wipes the wrong SSD!!
    # I am unsure whether the bug is in the Linux kernel or in the nvme-cli toolset.
    # Mapping controller => namespace:
        CVPF633200E72P0PGN /dev/nvme0 => /dev/nvme3n1 
        CVPF633200E32P0PGN /dev/nvme1 => /dev/nvme4n1 
        CVPF6332005M2P0PGN /dev/nvme2 => /dev/nvme5n1 
        CVPF633200KK2P0PGN /dev/nvme3 => /dev/nvme6n1 
        BTLJ841002H88P0HGN /dev/nvme4 => /dev/nvme1n1 
        BTLJ8410014X8P0HGN /dev/nvme5 => /dev/nvme0n1 
        CVFT71120002400LGN /dev/nvme6 => /dev/nvme2n1 

    root@grml:~# nvme list
        Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
        ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
        /dev/nvme0n1     BTLJ8410014X8P0HGN   INTEL SSDPE2KX080T8                      1           8.00  TB /   8.00  TB    512   B +  0 B   VDV10131
        /dev/nvme1n1     BTLJ841002H88P0HGN   INTEL SSDPE2KX080T8                      1           8.00  TB /   8.00  TB    512   B +  0 B   VDV10131
        /dev/nvme2n1     CVFT71120002400LGN   INTEL SSDPEDMD400G4                      1         400.09  GB / 400.09  GB    512   B +  0 B   8DV10171
        /dev/nvme3n1     CVPF633200E72P0PGN   INTEL SSDPE2MX020T7                      1           2.00  TB /   2.00  TB    512   B +  0 B   MDV10271
        /dev/nvme4n1     CVPF633200E32P0PGN   INTEL SSDPE2MX020T7                      1           2.00  TB /   2.00  TB    512   B +  0 B   MDV10271
        /dev/nvme5n1     CVPF6332005M2P0PGN   INTEL SSDPE2MX020T7                      1           2.00  TB /   2.00  TB    512   B +  0 B   MDV10271
        /dev/nvme6n1     CVPF633200KK2P0PGN   INTEL SSDPE2MX020T7                      1           2.00  TB /   2.00  TB    512   B +  0 B   MDV10271

    root@grml:~# for i in /dev/nvme{0..6}; do echo -n "$i: "; nvme id-ctrl $i --human-readable --vendor-specific | head -6 | paste -s; done | column -t
        /dev/nvme0:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  CVPF633200E72P0PGN  mn  :  INTEL  SSDPE2MX020T7  fr  :  MDV10271
        /dev/nvme1:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  CVPF633200E32P0PGN  mn  :  INTEL  SSDPE2MX020T7  fr  :  MDV10271
        /dev/nvme2:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  CVPF6332005M2P0PGN  mn  :  INTEL  SSDPE2MX020T7  fr  :  MDV10271
        /dev/nvme3:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  CVPF633200KK2P0PGN  mn  :  INTEL  SSDPE2MX020T7  fr  :  MDV10271
        /dev/nvme4:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  BTLJ841002H88P0HGN  mn  :  INTEL  SSDPE2KX080T8  fr  :  VDV10131
        /dev/nvme5:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  BTLJ8410014X8P0HGN  mn  :  INTEL  SSDPE2KX080T8  fr  :  VDV10131
        /dev/nvme6:  NVME  Identify  Controller:  vid  :  0x8086  ssvid  :  0x8086  sn  :  CVFT71120002400LGN  mn  :  INTEL  SSDPEDMD400G4  fr  :  8DV10171

Best Regards
Francois Scheurer

@scheuref
Author

linked to issue 454

@keithbusch
Contributor

As far as the names go, having a different number on the controller, nvmeX, than on its namespaces, nvmeYn1, is an artifact of how the nvme native multipathing works. The namespaces inherit the subsystem identifier and the controller handles use their controller identifier.

@engel75

engel75 commented Jan 31, 2019

@keithbusch
You might have missed something in the output. Shouldn't

nvme id-ctrl /dev/nvme0 | grep ^sn

output:
BTLJ8410014X8P0HGN?

The problem does not occur with kernel 4.4.
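
The serial-number check above can also be scripted, so that a destructive command is only issued after the controller handle has been matched against the expected drive. The following is a minimal sketch, not part of nvme-cli: the helper names `parse_id_ctrl` and `serial_matches` are hypothetical, and the sample text assumes the `key : value` layout seen in the `nvme id-ctrl` output earlier in this thread (exact column spacing is an assumption).

```python
def parse_id_ctrl(text):
    """Parse 'key : value' lines of `nvme id-ctrl` text output into a dict.

    Assumes one field per line, with the key before the first colon.
    Lines without a value (such as the heading) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key and value and key not in fields:
            fields[key] = value
    return fields


def serial_matches(text, expected_serial):
    """Return True only if the controller reports exactly the expected serial."""
    return parse_id_ctrl(text).get("sn") == expected_serial


# Sample modelled on the /dev/nvme0 output shown in this issue.
sample = """NVME Identify Controller:
vid     : 0x8086
ssvid   : 0x8086
sn      : CVPF633200E72P0PGN
mn      : INTEL SSDPE2MX020T7
fr      : MDV10271
"""

if __name__ == "__main__":
    print(parse_id_ctrl(sample)["sn"])
    print(serial_matches(sample, "BTLJ8410014X8P0HGN"))
```

In practice the text would come from running `nvme id-ctrl /dev/nvme0` (for example via `subprocess.run`), and a format would only proceed when `serial_matches` returns `True` for the drive you intend to wipe.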

@keithbusch
Contributor

I think you're misunderstanding what I'm saying. The name "nvme0" has no relation to "nvme0n1".

If you want to preserve that relationship, in your kernel config, make sure CONFIG_NVME_MULTIPATH is not enabled.

@keithbusch
Contributor

Or the kernel parameter nvme_core.multipath=0 should do the same thing.
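
To see which mode a running system is in, the module parameter can be read back from sysfs. This is a hedged sketch: it assumes the parameter is exposed at `/sys/module/nvme_core/parameters/multipath` and reads as `Y` or `N`, which holds on kernels built with CONFIG_NVME_MULTIPATH as far as I know. The function names are hypothetical.

```python
from pathlib import Path


def parse_bool_param(text):
    """Interpret a kernel bool module parameter value ('Y'/'N' or '1'/'0')."""
    value = text.strip()
    if value in ("Y", "y", "1"):
        return True
    if value in ("N", "n", "0"):
        return False
    return None


def native_multipath_enabled(
    param_path="/sys/module/nvme_core/parameters/multipath",  # assumed location
):
    """Return True/False for nvme_core.multipath, or None if it cannot be read.

    None typically means the kernel was built without CONFIG_NVME_MULTIPATH
    or the nvme_core module is not loaded.
    """
    try:
        return parse_bool_param(Path(param_path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("native multipath:", native_multipath_enabled())
```

A result of `False` or `None` suggests that nvmeX and nvmeXn1 refer to the same device, while `True` means the numbers may differ, as described above.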

@engel75

engel75 commented Jan 31, 2019

@keithbusch
I did misunderstand you, excuse me. Thank you for the good hint. We will test the kernel cmdline parameter immediately.
It seems that no Intel Datacenter NVMe SSD supports more than one namespace. Which U.2 SSDs on the market support more than one namespace at all?

@keithbusch
Contributor

I'm a bit disconnected from the group that makes them, so I'm not sure if the firmware that enables multi-namespace is released. I'll ping some people and see which products, if any, support the feature.

@keithbusch
Contributor

The response I received is that the feature is currently under development, but no shipping products support it at this time.

@scheuref
Author

scheuref commented Jan 31, 2019

Hi Keithbusch

> As far as the names go, having a different number on nvme than its namespaces, nvmeYn1, is an artifact of how the nvme native multipathing works. The namespaces inherit the subsystem identifier and the controller handles use their controller identifier.
> The name, "nvme0" has no relation to "nvme0n1".

Very useful info!
Many thanks for the explanation :-) So that's really not a bug but a feature ! :-)
I tested and confirm that booting with nvme_core.multipath=0 preserved that relationship.

Cheers

@keithbusch
Contributor

When nvme multipath is enabled, the relationship among named devices is much easier to see through /sys/class/nvme/ rather than through /dev/.

The nvme multipath is really efficient so it's great to use when you have such capabilities in nvme controllers. We had to change the namespace's parent device to the subsystem rather than the controller for that, so the naming numerals had to be different. This change has caused a bit of confusion, but you can safely disable multipathing if you haven't got dual/multi-port controllers.
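
As an illustration of reading that relationship from `/sys/class/nvme/`, here is a rough sketch that lists, for each controller, its serial number and the block device names of its namespaces. It rests on two assumptions that should be checked on your kernel: each controller directory has a `serial` attribute, and with multipath enabled the per-path namespace entries under a controller are named `nvme<S>c<C>n<N>`, where `<S>` is the subsystem number that also appears in `/dev/nvme<S>n<N>`. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed entry names under a controller directory:
#   multipath on : nvme<S>c<C>n<N>  (block node is /dev/nvme<S>n<N>)
#   multipath off: nvme<C>n<N>      (block node has the same name)
_PATH_NS = re.compile(r"^nvme(\d+)c\d+n(\d+)$")
_PLAIN_NS = re.compile(r"^nvme\d+n\d+$")


def block_node_for(entry_name):
    """Map a sysfs namespace entry name to its /dev block node name, or None."""
    match = _PATH_NS.match(entry_name)
    if match:
        return f"nvme{match.group(1)}n{match.group(2)}"
    if _PLAIN_NS.match(entry_name):
        return entry_name
    return None


def controller_map(sysfs_root="/sys/class/nvme"):
    """Return {controller: (serial or None, [block node names])}."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for ctrl in sorted(root.iterdir()):
        if not ctrl.is_dir():
            continue
        try:
            serial = (ctrl / "serial").read_text().strip()
        except OSError:
            serial = None
        nodes = sorted(
            node
            for node in (block_node_for(p.name) for p in ctrl.iterdir())
            if node
        )
        result[ctrl.name] = (serial, nodes)
    return result


if __name__ == "__main__":
    for name, (serial, nodes) in controller_map().items():
        print(f"{serial} /dev/{name} => {', '.join('/dev/' + n for n in nodes)}")
```

The printed lines follow the same `serial /dev/nvmeX => /dev/nvmeYn1` layout as the mapping in the original report, which makes it easy to compare against `nvme list` before running anything destructive.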

@scheuref
Author

Thx again, very helpful to get your comments!

@engel75

engel75 commented Feb 1, 2019

> The response I received is that the feature is currently under development, but no shipping products support it at this time.

Thank you very much for the quick feedback. In an Intel forum I read that support for the DC P4510 will be provided in a future firmware. I hope it won't take too long.
