Expose board-serial and board-product through SMBios #21

Closed
wants to merge 1 commit into from

dalehamel
Contributor

This change exposes the SMBios attributes for the board serial and board product.

This enables unique identification of blade nodes, where the chassis serial is shared by every blade in the chassis.

We currently use these attributes at Shopify to uniquely identify nodes at boot time, where we can't depend on other attributes such as UUID to exist.

It's a relatively minor change, and expands the existing functionality by exposing additional standard SMBios attributes, which are particularly useful on blade nodes / where there are multiple nodes per chassis.

@Vanders

Vanders commented Jun 10, 2014

👍 You are me and I claim $5 (I have an identical-ish patch queued in my own branch, I just hadn't gotten around to submitting a PR for it).

Don't you need a patch for smbios.h to define the type 2 structure & SMBIOS_TYPE_BOARD_INFORMATION?

@vtolstov

Very nice. I already hit this problem when generating a cluster_id in pacemaker based on the UUID.
Are board-serial and board-product always unique?

@dalehamel
Contributor Author

We use a simple algorithm for generating our own SKUs, and so far it hasn't
failed us.

Manufacturer + most specific serial.

  1. Shorten the manufacturer to 3 characters (e.g. "Dell Inc." to DEL and
    "supermicro" to SPM); this is just for convenience.
  2. Append the board serial, if available, else use the chassis serial.
    Board serial will be needed for blade nodes. We haven't seen any case other
    than some rare full chassis nodes not having a board serial. Even if they
    don't, they have the chassis serial.

We haven't encountered a manufacturer where a uniquely identifying serial isn't
encoded in smbios, so it is pretty reliable.
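
The two steps above can be sketched roughly as follows. This is only an illustration, not the code used at Shopify: the abbreviation table, the fallback rule for unknown manufacturers, and the names `manufacturer_code` and `make_sku` are all assumptions made for the example.

```python
# Hypothetical abbreviation table; the thread only gives these two examples.
MANUFACTURER_CODES = {
    "dell inc.": "DEL",
    "supermicro": "SPM",
}


def manufacturer_code(manufacturer):
    """Shorten a manufacturer string to a 3-character code."""
    key = manufacturer.strip().lower()
    if key in MANUFACTURER_CODES:
        return MANUFACTURER_CODES[key]
    # Assumed fallback: first three alphanumeric characters, upper-cased.
    letters = [c for c in manufacturer if c.isalnum()]
    return "".join(letters[:3]).upper()


def make_sku(manufacturer, board_serial, chassis_serial):
    """Prefer the board serial; fall back to the chassis serial."""
    serial = (board_serial or "").strip() or (chassis_serial or "").strip()
    if not serial:
        raise ValueError("no board or chassis serial available")
    return manufacturer_code(manufacturer) + serial
```

With these assumptions, a blade reporting both serials is identified by its board serial, while a node with an empty board serial falls back to the chassis serial.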


@dalehamel
Contributor Author

@Vanders you're absolutely right, I added the missing file.

Must have forgotten it when I rebased my branches to do the PR

@dalehamel
Contributor Author

ping @ipxe-devel for review :)

@robinsmidsrod
Contributor

Just curious, why do you need these variables defined with names instead of using the generic SMBIOS variable expansion, which is like this:

iPXE> show smbios/0.0.3:string
smbios/0.0.3:string = 12/01/2006

@dalehamel
Contributor Author

@robinsmidsrod that could definitely be used, but since we're already exposing chassis smbios attributes for convenience, why not expose board level smbios attributes? The board level attributes are more specific to the actual node being booted, and naming them will reduce confusion and make them easier to access. Not everyone will know how to figure out which smbios string to use, and this isn't really documented anywhere that I can see.

@dalehamel
Contributor Author

The only reference I can find to reading arbitrary attributes from smbios is from gpxe docs:

http://etherboot.org/wiki/commandline

And I only found this because I knew specifically what to look for.

@dalehamel
Contributor Author

@mcb30 thoughts on exposing board level serial / product since we already do for chassis, and board info is more useful on blade nodes?

@ipxe-devel

On 11/06/14 21:30, Dale Hamel wrote:

@mcb30 https://github.com/mcb30 thoughts on exposing board level
serial / product since we already do for chassis, and board info is more
useful on blade nodes?

All of the SMBIOS information is already exposed via constructed
settings, e.g.

${smbios/2.7.0} # this is your "board-serial"
${smbios/2.5.0} # this is your "board-product"
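
For readers unfamiliar with these names, here is a small sketch of how such a constructed setting name breaks down. It assumes the layout `smbios/[<instance>.]<type>.<offset>.<length>`, with a length of zero meaning "the byte at this offset is a string index"; that reading, and the helper name `parse_smbios_setting`, are assumptions for illustration rather than anything stated in this thread. Under that reading, `2.7.0` is the string referenced at offset 7 of the type 2 (baseboard) structure.

```python
from typing import NamedTuple, Optional


class SmbiosField(NamedTuple):
    instance: Optional[int]  # which structure of this type, if given
    type: int                # SMBIOS structure type (2 = baseboard)
    offset: int              # byte offset within the structure
    length: int              # 0 is assumed to mean "string index at offset"


def parse_smbios_setting(name):
    """Split e.g. 'smbios/2.7.0' into its numeric components."""
    prefix = "smbios/"
    if not name.startswith(prefix):
        raise ValueError("not an smbios setting name: %r" % name)
    parts = [int(p) for p in name[len(prefix):].split(".")]
    if len(parts) == 3:
        return SmbiosField(None, *parts)
    if len(parts) == 4:
        return SmbiosField(*parts)
    raise ValueError("expected 3 or 4 dotted fields in %r" % name)
```

For example, `parse_smbios_setting("smbios/2.5.0")` yields type 2, offset 5, length 0.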

Since all of the information can already be accessed, the question
becomes whether or not these settings will be sufficiently widely used
(compared to all of the other information available via SMBIOS) to
justify giving them names.

There is a non-zero cost of naming a setting; each named setting costs
approximately 20+<name_len>+<description_len> bytes: for example,
"board-serial" will cost 20+13+18=51 bytes.
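
The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes that each length counts the string plus one terminating NUL byte, which is what makes "board-serial" (12 characters) come out as 13. The description text used below is a hypothetical 17-character string chosen only to match the quoted 18 bytes.

```python
FIXED_OVERHEAD = 20  # approximate per-setting overhead quoted above


def named_setting_cost(name, description):
    """Approximate bytes consumed by one named setting."""
    # +1 per string for the terminating NUL byte (an assumption).
    return FIXED_OVERHEAD + (len(name) + 1) + (len(description) + 1)


# "Base board serial" is a placeholder description, not taken from the patch.
cost = named_setting_cost("board-serial", "Base board serial")
print(cost)
```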

So, to merge this patch, I need to know that it will be sufficiently
useful to a sufficiently large number of people to justify the code size
cost.

Michael

@dalehamel
Contributor Author

I'm clearly biased, as I proposed this patch.

To reiterate my rationale though: the chassis level product and serial are
already exposed, so this information is deemed to be important enough to
merit the code size cost.

The purpose of reading the smbios info at boot time is probably to identify
a node.

This chassis information is useless for its intended purpose on blade
nodes, where it will only uniquely identify the parent chassis and not the
compute node itself.

Supplying the board level smbios information is more likely to achieve this
purpose, even on non blade nodes. Also, this information is much more
reliable than some other smbios attributes that are already exposed with
variable names, which may be defined and taking up space but have no actual
value.

Perhaps it's a slippery slope allowing any named smbios attributes at all?
What if we simply ifdef each class of smbios data, to make it easier to
override those which are named by default? Or do away with all smbios
named variables and document more clearly how their values can be read?

How can we really demonstrate whether naming a particular attribute
justifies its code size? To me this is really just a guess.


@robinsmidsrod
Contributor

I just did a tiny check with VirtualBox, and both 2.7.0 and 2.5.0 do contain useful values (same as product/serial). Is it possible that we could swap chassis product/serial to board product/serial for the named settings?

Other than that, it might also be useful to include more examples on the smbios wiki page so that information like the one mentioned above is easier to find. It might also be useful to link to the SMBIOS reference manual.

@dalehamel
Contributor Author

With the patch:

ls -l bin/undionly.kpxe
-rw-r--r-- 1 dale.hamel dale.hamel 70461 Jun 12 11:44 bin/undionly.kpxe

Without the patch:

ls -l bin/undionly.kpxe   
-rw-r--r-- 1 dale.hamel dale.hamel 70412 Jun 12 11:50 bin/undionly.kpxe

So we're talking about around 49 bytes total for this patch. I agree that we should document reading the SMBios attrs better.

As a separate suggestion, perhaps moving the ipxe.org to github.io would make it easier to PR against the documentation, so it's not just a few people writing the docs?

@robinsmidsrod
Contributor

The reason the editing capability on the iPXE wiki is limited is to avoid getting documentation written in a bunch of different styles (for the main docs). You can register an account there and write an appnote page detailing the useful SMBIOS values and then ask for it to be linked from one of the main pages.

I do agree that contributing documentation should be easier than it is today.

@ipxe-devel

On 12/06/14 16:42, Robin Smidsrød wrote:

I just did a tiny check with VirtualBox, and both 2.7.0 and 2.5.0 do
contain useful values (same as product/serial). Is it possible that we
could swap chassis product/serial to board product/serial for the named
settings?

That would potentially break backwards compatibility, so I'm reluctant
to do that.

I'm convinced by the argument that ${board-serial} is sufficiently
useful, since ${serial} will apparently not be unique between blades.
I'm less convinced that ${board-product} is useful (not least because
there's no obvious reason to me why ${board-product} is more justifiable
than ${board-manufacturer}, which hasn't been proposed).

Does anyone object to having just ${board-serial} included as a named
setting?

Michael

@ipxe-devel

On 12/06/14 16:53, Dale Hamel wrote:

As a separate suggestion, perhaps moving the ipxe.org to github.io would
make it easier to PR against the documentation, so it's not just a few
people writing the docs?

I've generally found that collaboratively-written documentation tends to
end up being wildly out of date and self-contradictory. This is
certainly what happened with the old Etherboot wiki. A lot of effort
has gone into ensuring that where the iPXE documentation exists, it can
be relied upon to be definitively correct.

Michael

@dalehamel
Contributor Author

@ipxe-devel my only argument for board-product is that it is useful during automated intake, which is what we use iPXE for. However, it is possible to collect this data later in the process, so it's not essential to have at boot time. The board-manufacturer is not likely to vary from chassis manufacturer, just because of how most large companies tend to perform their orders for servers - that's why it was not proposed.

I can see your point when it comes to community docs, but my suggestion to move ipxe.org to github.io would still keep control within the hands of selected gatekeepers, such as yourself, and have the added benefit that changes like this one could be submitted with accompanying documentation, which might help keep the docs complete / up-to-date. But, this isn't the right venue for such a discussion :)

I'm updating the patch to remove board-product, and squash the commits.

@dalehamel
Contributor Author

Size after removing board-product:

wc -c bin/undionly.kpxe
70431 bin/undionly.kpxe

So it adds 19 bytes over upstream.
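
As a quick cross-check of the figures reported in this thread, the deltas can be computed directly from the three measured sizes of `bin/undionly.kpxe`. The variable names and labels are made up for the example; the byte counts are the ones quoted above.

```python
# Sizes of bin/undionly.kpxe in bytes, as reported in this thread.
UPSTREAM = 70412
PATCHED_SIZES = {
    "board-serial + board-product": 70461,
    "board-serial only": 70431,
}


def size_delta(patched, baseline=UPSTREAM):
    """Bytes added relative to the unpatched build."""
    return patched - baseline


deltas = {label: size_delta(size) for label, size in PATCHED_SIZES.items()}
for label, delta in deltas.items():
    print("%s: +%d bytes" % (label, delta))
```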

@ipxe-devel

That is the beauty of the github.io proposal: unlike a wiki, where everyone
edits the same document, you fork, make changes to your local copy, and send
a pull request so that the changes may be reviewed and accepted or
rejected. This has the advantage that you can accept good work from people
you do not trust, and it would help with things like all the penis enlargement
ads I have been seeing in the RSS feed. The disadvantage is that it would
abandon the really cool error number lookup auto-generation.

Ben Hildred
Automation Support Services
303 815 6721

@ipxe-devel

On 12/06/14 17:41, Dale Hamel wrote:

I'm updating the patch to remove board-product, and squash the commits.

Thanks; have applied:

http://git.ipxe.org/ipxe.git/commitdiff/7fe0735

Michael

@ipxe-devel

On 12/06/14 17:50, Ben Hildred wrote:

That is the beauty of the github.io proposal: unlike
a wiki where everyone edits the same document, you fork, make changes to
the local copy and send a pull request so that the changes may be
reviewed and accepted or rejected. This has the advantage that you can
accept good work from people you do not trust.

This is a good idea in theory, but my experience is generally that it
takes me more time to edit the average submission than it would to write
the equivalent documentation myself.

and would help with
things like all the penis enlargement ads I have been seeing in the RSS
feed.

Those should have stopped; a couple of weeks ago I restricted the wiki
permissions so that only real people can edit (i.e. you now have to
manually ask for edit rights). Are you still seeing spam via the RSS feed?

The disadvantage is that it would abandon the really cool
errornumber lookup auto generation.

And there's no way I'm abandoning that feature! :)

Michael

@dalehamel
Contributor Author

@ipxe-devel there is probably still a way to do the errornumber auto generation.

Workflow would go like this:

0.) Move site to static github.io (possibly using something like jekyll to make changes easier)
1.) Someone generates a PR, which is reviewed and deemed good
2.) The branch is merged locally, and you run whatever code you had to do the error number generation as a static page (jekyll could probably help with this)
3.) You update the master branch and close the PR.

Or just merge PRs as normal, then add the updated error code data after merging into master.

Otherwise, keep the old servers around just for error code /linenumber lookup. Either way, hosting off of github.io should reduce your hosting costs and improve reliability.

I can help you with the migration (or even just a mockup) if you like - love the iPXE project and would be happy to contribute in any way that I can.

I don't think that github.io and error number lookup are mutually exclusive.

@ipxe-devel

On 12/06/14 18:08, Dale Hamel wrote:

Workflow would go like this:

0.) Move site to static github.io (possibly using something like jekyll
to make changes easier)
1.) Someone generates a PR, which is reviewed and deemed good
2.) The branch is merged locally, and you run whatever code you had to
do the error number generation as a static page (jekyll could probably
help with this)
3.) You update the master branch and close the PR.

The error pages are not static; they get generated on demand. The list
of possible errors is generated via

make bin/errors bin-x86_64-linux/errors bin-i386-efi/errors ....etc

and then processed into an SQLite database using contrib/errdb/errdb.pl.
When a page within the error namespace is requested, a custom dokuwiki
plugin then uses the database to generate the header information
(including the hyperlinks to the lines of code whence the error might
originate).
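
To make the shape of that pipeline concrete, here is a minimal sketch of the lookup step only. It is not the real errdb.pl output or the dokuwiki plugin: the table name, columns, sample row, and function names are all hypothetical, and an in-memory SQLite database stands in for the generated one.

```python
import sqlite3


def build_error_db(rows):
    """Load (errno, file, line, description) rows into an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE errors ("
        "errno TEXT, file TEXT, line INTEGER, description TEXT)"
    )
    db.executemany("INSERT INTO errors VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return db


def lookup(db, errno):
    """Return every source location that may raise the given error number."""
    cur = db.execute(
        "SELECT file, line, description FROM errors "
        "WHERE errno = ? ORDER BY file, line",
        (errno,),
    )
    return cur.fetchall()


# Made-up sample data; real rows would come from the generated error lists.
db = build_error_db([
    ("1a2b3c4d", "net/example.c", 42, "Example error"),
    ("1a2b3c4d", "core/example.c", 7, "Example error"),
    ("deadbeef", "net/other.c", 99, "Other error"),
])
hits = lookup(db, "1a2b3c4d")
```

A page generator could then render one hyperlink per returned `(file, line)` pair, which is the on-demand behaviour described above.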

It's a very slick process involving absolutely no manual intervention on
my part. Any error added to the codebase will show up in an error page
within a very short time of being pushed to the master branch, and the
pertinent details for the error page (including the full error text) are
pulled directly from the source code.

Or just merge PRs as normal, then add the updated error code data
after merging into master.

Otherwise, keep the old servers around just for error code /linenumber
lookup. Either way, hosting off of github.io should reduce your hosting
costs and improve reliability.

Hosting costs are zero since I already own a physical server in
Telehouse North, and it all just runs from there. I would be concerned
about the loss of control over hosting on any free external service
(particularly the risk that the service might in future decide to insert
advertisements).

I can help you with the migration (or even just a mockup) if you like -
love the iPXE project and would be happy to contribute in any way that I
can.

Thank you for the offer; I do appreciate it. I have to say that I'm
very happy with the current hosting arrangement and don't really want to
change unless there is a demonstrable benefit which won't increase my
workload.

Michael

@dalehamel
Contributor Author

Fair enough, thanks for your hard work et al.

-Dale


6 participants