Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend hardware RAID for physical disks and controller #148

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
213 changes: 213 additions & 0 deletions design/baremetal-operator/raid-disk-controller-naming.md
@@ -0,0 +1,213 @@
<!--
This work is licensed under a Creative Commons Attribution 3.0
Unported License.

http://creativecommons.org/licenses/by/3.0/legalcode
-->

# raid-extension-for-physical-disks-and-controller

## Status

provisional

## Summary

This document provides the naming convention adopted to express
``physical_disks`` and ``controller`` sections of a RAID configuration of
a baremetal-host. The aim is to provide a uniform way of representation for
hardware from various vendors, in a way that enhances operator experience.
It also proposes an implementation for the same.

## Motivation

Vendors have different ways of defining the names for their physical disks
and RAID controllers within Ironic. This leads to a complication for the
operator where user experience is affected. There needs to be a uniform
naming convention that abstracts this and increases convenience.

### Goals

The primary goal is to reach consensus on one naming convention to depict
disk and RAID controller names in the baremetal-host ``hardware-raid``
section.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is metal3 going to be the only part of the stack that understands the naming abstraction? So it would be our convention?

How does one normally identify the disks? Using values like we have in our rootDeviceHints structure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is metal3 going to be the only part of the stack that understands the naming abstraction? So it would be our convention?

Yes. It will be our convention, that should be user friendly, and depicts vendor agnosticism.

How does one normally identify the disks? Using values like we have in our rootDeviceHints structure?

For RAID configurations, the physical disks are usually identified by various vendors as follows:

The iDRAC driver, with a conjured string like Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1. Here's the reference.

The iRMC driver uses a list of integers for disks. Here's the reference.

Others follow their own conventions. The plan is to abstract these details into user friendly naming convention, since these are difficult and cumbersome for the user. The implementation will only focus on the iDRAC driver, but since all the vendors will implement the same names from a user interface perspective, I proposed a simple convention above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I phrased that question poorly.

What information does Ironic expect as identification for the disks?

Why isn't this abstraction in Ironic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the raid-interface in use, since the API call made to the corresponding BMC for the server has it's own way of accepting parameters. Ironic accepts those, and passes them on. So, for instance, the idrac-wsman interface accepts physical disks in an extended format like Disk.Bay.0:Enclosure.Internal.0-1:RAID.Integrated.1-1 whereas the irmc interface for RAID accepts disks as simple integers like 0, 1, or 2.

As to why isn't the abstraction in Ironic, I'm not entirely sure. Since Ironic is operated by another operator, in a larger ecosystem I'd say the abstraction to use would be up to the operator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to derive those vendor-specific IDs from values we have from inspection (even if it is data not in the host status fields today)? Could we reuse the rootDeviceHints structure to specify a device for consistency within our API then have a set of types that know how to do the conversion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach I had in mind was this: metal3 could find the device names via Ironic

What to do when the device names are not available?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea: cannot we make the drivers in ironic support simply numbers in addition to full names? This would remove the need for new API on the ironic side and will make this feature available to everyone using ironic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach I had in mind was this: metal3 could find the device names via Ironic

What to do when the device names are not available?

The logic inside the BMO will make sure that the names be made available for the supported BMC types( for unsupported ones, we can simply throw an exception). And this will be done by ensuring disk names through ironic (whatever method should we chose to do so)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea: cannot we make the drivers in ironic support simply numbers in addition to full names? This would remove the need for new API on the ironic side and will make this feature available to everyone using ironic.

That's a great idea. We should evaluate this further and break it down into an RFE that is implementable for the ironic conductor. @dtantsur What are you thinking?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I'd merge this design with vendor-specific names (since we cannot guarantee the mapping for all vendors) and figure it out on the ironic side.


Afterwards, implementation of the same will be done within the
``baremetal-operator`` by extending the ``baremetal-host`` specification.

### Non-Goals

This specification does not deal with ``software-raid`` and extension for it.

## Proposal

The crux of the proposal is the naming convention that is to be adopted
for interfacing with the user. Currently, we have the following options

### Controller Naming

For controllers that are installed specifically for the operating system,
for example the [BOSS Card](1) by DellEMC, they are almost always used with
Copy link
Member

@Hellcatlk Hellcatlk Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BOSS Card](1) is the syntax of link in Markdown.
Do you mean [BOSS Card](https://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell-PowerEdge-Boot-Optimized-Storage-Solution.pdf)(1)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just trying to add links as references, without having to mention the URL within the proposal, for cleanliness. Do you think it's not the right way?

Copy link
Member

@Hellcatlk Hellcatlk Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but [BOSS Card](1) will be explained to a link, like BOSS Card. you can use [BOSS Card][1] replace it.

If I misunderstand your intention, please ignore this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I mean it to be a link. No worries :)

two disks/drives in ``RAID-1``, and can be refered to as
``primary-controller``. Similar method can be adopted to address such
Copy link
Member

@Hellcatlk Hellcatlk Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we want to specify the controller of the first hardware RAID volume, it must be primary-controller, right? And what happens if users use both "root device hint" and "primary controller".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary_controller doesn't have any implementation side effects, other than it is a name for a controller that is unlike other RAID controllers in the server. I'm fine with not using this term, and we can refer to them all as raid_controller_1 etc.

And yes, that's another way of looking at the primary_controller. But what I meant is that the BOSS card (which is a RAID controller) is a different hardware component within the server created specifically for root devices. That's why I gave it a distinguished notion of primary_controller. But you're right, that's another way of looking at it.

controllers by other vendors.

Other controllers that are installed in the server can be used in any order,
and thus can be named ``secondary-controller-x`` where x can be an integer
greater than or equal to 1, e-g ``secondary-controller-1``,
``secondary-controller-2`` and so on.

### Physical Disk/Drive Naming

With physical disks, the primary issue is disk enumeration. We can address
it this way:

- Start from the front bay, and count drives in the order that they are
listed by vendors. For instance, vendors would list them as XXXXX-1-XXX,
XXXXX-2-XXX and so on. We can use these 1 and 2 to specify ordering and
provide a simpler naming convention as ``Disk-1``, ``Disk-2`` and so on.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use the number?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with using just the number. Let's see if anybody else has any comments.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the number unless there is some objection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any ordering guarantees across disks at least within a controller?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong @rpioso but the ordering of disks is deterministic for iDRAC BMC type. I can't be sure about other vendors since I don't have availability of that kind of hardware.
Perhaps @Hellcatlk can tell us about RMC type hardware?

Regardless, I'm of the opinion that the numbering should be deterministic and vendors should bear the burden of ensuring this. We can proceed with development and make this determinism a necessary condition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For irmc, the ordering of disks is also fixed.

- The drives inserted in the backplane can be counted _afterwards_ using
the same convention.

This will provide the user with a simpler way of listing controllers
and disks and it should suffice for a significant majority of use-cases.

### User Stories

The following user stories appertain to the proposal in question

#### Story 1

As an operator, I should be able to focus on defining my RAID configurations
without having to deal with intricacies of individual vendor's naming
conventions.

#### Story 2

As an operator, I'd like to be able to specify the physical disks I want
to use for my RAID configuration. Or, in case of multiple controllers, I
should be able to choose from them in defining my ``hardware-raid``
configuration, or both.

## Design Details

- The CRD spec will have to be extended to add fields for ``physical_disks``
and ``controller`` under the ``hardware_raid`` section.
- The provisioner will then be extended to process these fields.
- There needs to be a map to transform user-interfacing names to the
vendor-specific names (like in case of ``firmware`` config).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be a vendor-specific feature. How is it going to be handled?

Why use vendor-specific controller names but try to invent vendor-independent disk names?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the unfortunate part where we don't have a purely vendor-agnostic interface available inside ironic that we can query through a singular format and extract information for each BMC type. What do you suggest otherwise, as a solution?

- The provisioner logic will perform lookups, and make Ironic API calls
accordingly.
- The status field will mirror these fields after successful configuration.
This is to provide the operator with the information regarding current config.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you mind giving a full example of the CRD here please ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let me add those

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maelk I've added the sample CRD into the latest patch. The naming convention is up for debate and I'd really like to have input on that. Thank you for your time!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is hardwareVolumes the correct term to use in the yaml? Perhaps logicalVolumes is a better term to use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's "hardwareVolumes" as in "volumes managed by a hardware RAID controller", as opposed to software RAID. Both hardware and software RAID produce logical volumes so I don't think that name describes what we want, but I agree the current naming is awkward. I don't have any good suggestions though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dah yes you're right. Let's come up with some names. How about hardwareRAIDVolume? I think that does it well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the naming to hardwareRAIDVolume(s).

Note: gophercloud already supports these new fields, no extension is required
there.

In terms of the CRD, the current implementation of the RAID has the following
sections in it, for hardware RAID:

```yaml
spec:
raid:
hardwareRAIDVolumes:
- sizeGibiBytes: 1000
level: 1
name: volume-name
rotational: true
numberOfPhysicalDisks: 2
```

We propose to add fields into the CRD, in the following manner:

```yaml
spec:
raid:
hardwareRAIDVolumes:
- sizeGibiBytes: 1000
level: 1
name: volume-1
rotational: true
numberOfPhysicalDisks: 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably ignore most of the other selection fields if we're given a list of the exact disks to use, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that depends on the precision of the naming convention. If we decide upon simpler names, for user convenience, we will need some of these selection fields, to narrow down the space. On the other hand, if we use precise names, we can ignore the selection fields. Let's see whatever naming convention we agree upon.

physicalDisks:
- diskName: Disk-1
- diskName: Disk-2
controller: primary-controller
- sizeGibiBytes: 2000
level: 0
name: volume-2
rotational: false
numberOfPhysicalDisks: 2
physicalDisks:
- diskName: Disk-3
- diskName: Disk-4
controller: secondary-controller-2
```

### Implementation Details/Notes/Constraints

The CRD will be extended to add fields to ``spec`` and ``status`` sections.
A map will be created in ``raid.go`` which will map user interfacing names
to the vendor-specific names.
The current RAID API call logic will be appended to process these new fields
and populate them in the ``target_configuration`` structure.
Other vendors will have to extend the provisioner logic accordingly.

### Risks and Mitigations

Since we will be specifying physical disks to be used, care needs to be taken
to not accidently erase media with sensitive data on them. It is very easy
to undesirably remove data from disks.

### Work Items

- Extend the CRD with agreed naming convention.
- Implement provisioner logic to process those fields to make Ironic API
calls.
- Testing of the code (see test plan below).

### Dependencies

- gophercloud; the dependency is satisfied. It has the functionality we need.
- ironic; the dependency is satisfied. It has the functionality we need.

### Test Plan

The code will be tested in a development environment with a
stand-alone deployment of the ``baremetal-operator`` and ``ironic``.
A number of deployments will be performed with various combinations of
``physical_disks`` and ``controller`` fields to exercise complete capability.

Testing will only be performed for ``idrac-wsman``, since that is the only
availability at the moment. That is to say, with DellEMC hardware.

Other vendors will have to test the code accordingly.

### Upgrade / Downgrade Strategy

The Ironic API changes that break backwards compatibility are going to require
changes to the provisioner logic to construct the API call accordingly. This
is highly unlikely and therefore upgrading should be fine.
The user interfacing part (the API fields in the BMH) are not changing in any
case. The upgrades to the operator can be performed safely without breaking
any functionality.

### Version Skew Strategy

None.

## Drawbacks

None.

## Alternatives

Rely on the current ``hardware-raid`` configuration which does not allow
for specifying physical disks and RAID controllers, but works well in
use cases where such a functionality is not desired.

## References

[1]: (https://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell-PowerEdge-Boot-Optimized-Storage-Solution.pdf)

[2]: (https://docs.openstack.org/ironic/latest/admin/raid.html)