Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wip/hughsie/nvme #666

Closed
wants to merge 4 commits into from
Closed

Wip/hughsie/nvme #666

wants to merge 4 commits into from

Conversation

hughsie
Copy link
Member

@hughsie hughsie commented Aug 16, 2018

No description provided.

@hughsie
Copy link
Member Author

hughsie commented Aug 16, 2018

UNTESTED!

/* create device */
fu_device_add_flag (dev, FWUPD_DEVICE_FLAG_INTERNAL);
fu_device_add_flag (dev, FWUPD_DEVICE_FLAG_UPDATABLE);
fu_device_add_flag (dev, FWUPD_DEVICE_FLAG_REQUIRE_AC);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This collection of flags I don't think is necessarily true. Since NVMe drives are PCIe device, it's certainly possible to have them in a Thunderbolt device or thunderbolt PCIe enclosure. To really prove they are internal/require-ac you'd have to look at the topology to see if they go through a thunderbolt bridge.

I don't think that's worth it though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we use INTERNAL to decide some security stuff, so maybe it is worth discovering.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any good heuristics for the topology walking? I assume we just need to look at the bridges and root controllers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe @gicmo or @YehezkelShB have some ideas here. My beset guess for a heuristic would be to match the thunderbolt NHI uevent to find it's parent and then see if the parent is in common.

Copy link
Member

@superm1 superm1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that was quick. I guess since you have a json dump in the directory it would be good to read that in and with a self test script (and maybe call /bin/true with it to test "flash process" rather than nvme)

return FALSE;
}
nvme_frmw = json_object_get_int_member (obj, "frmw");
if ((nvme_frmw & 0x08) == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a magic number, maybe a #define ?

if (g_strv_length (nvme_mn_split) == 2) {
if (g_strcmp0 (nvme_mn_split[1], "TOSHIBA") == 0 ||
g_strcmp0 (nvme_mn_split[1], "TOS") == 0 ||
g_strcmp0 (nvme_mn_split[1], "TO") == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sorta feels like it's going to grow into a pretty big list, how about just storing this in quirk data so it's easy to match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really only Toshiba that does it the wrong way around, but I do agree this kind of thing needs to go to the quirk database. Will fix.


/* write blob to temp location */
tmpdir = g_dir_make_tmp ("fwupd-XXXXXX", error);
if (tmpdir == NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me what permissions this temp directory gets created with g_dir_make_tmp. Is it only writable by the user ID of the process that created it? If not, then I think you'll need to adjust that before the file is written.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't actually need a directory, just a file. I suppose we could even write this to TMPDIR directly. It's a shame we can't pass an fd and just back the fd using a blob of memory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe they would take a patch for that though for later and then this plugin can change behavior based on nvme --version

fu_device_set_progress (device, 100);

/* delete temp location */
if (!fu_common_rmtree (tmpdir, error))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll want to do this with a device locker instead to make sure that temp directory gets cleared even if the flash process failed.

@superm1
Copy link
Member

superm1 commented Aug 16, 2018

Assuming this works I guess we'll want to enact getting it installed by more packages automatically too.
I think that can be followed up later though.

@superm1
Copy link
Member

superm1 commented Aug 16, 2018

Have you examined any firmware binaries to check how they're constructed? I inquire because I notice this:

The firmware image may be constructed of multiple pieces that
are individually downloaded with separate Firmware Image Download
commands. Each Firmware Image Download command includes a Dword
Offset and Number of Dwords that specify a Dword range. The host
software shall ensure that firmware pieces do not have Dword ranges
that overlap. Firmware portions may be submitted out of order to the
controller.

If that's a common situation then this might be difficult to represent in the plugin.

@hughsie
Copy link
Member Author

hughsie commented Aug 16, 2018

If that's a common situation then this might be difficult to represent in the plugin.

I've never seen an image with more than one chunk, and I've never seen an image that had an offset that wasn't 0. If we did have to support such a thing I'd urge the vendor to ship a DfuSe binary, that can have multiple images, each with a 32bit offset, and additional metadata. This could then be parsed in the plugin and each image applied.

If you think that's a sensible thing to support out of the box (or if you know of a NVMe binary with more than one chunk) just say and I'll add it to the plugin.

@superm1
Copy link
Member

superm1 commented Aug 16, 2018

If you think that's a sensible thing to support out of the box (or if you know of a NVMe binary with more than one chunk) just say and I'll add it to the plugin.

If it's not common I don't think it's sensible to support. I haven't analyzed any of the binaries we use yet, so I don't know if that's something we have to worry about.

@hughsie
Copy link
Member Author

hughsie commented Aug 20, 2018

It wasn't hard to use the ioctl to read the CNS without using nvme-cli. I think this version is much better. Could you re-review please? Thanks.


#include "dfu-chunked.h"

#include "fu-nvme-device.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to re-order these headers, wasn't the convention we were going for to have "local" first followed by system?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's config.h, system, local.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess that matches what we discussed previously (#568 (review)) I just remembered backwards.


#include <glib/gstdio.h>

#include <linux/nvme_ioctl.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this adds an implicit dependency of kernel 4.4 right? Unless you end up wanting to run this plugin on RHEL 7.x it would be good to check at plugin startup for at least this kernel version and fail otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I should check for this in meson.build rather relying on a specific kernel version -- good catch, thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

fu_nvme_device_init (FuNvmeDevice *self)
{
fu_device_add_flag (FU_DEVICE (self), FWUPD_DEVICE_FLAG_REQUIRE_AC);
fu_device_add_flag (FU_DEVICE (self), FWUPD_DEVICE_FLAG_UPDATABLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weren't we talking about making this updatable flag a whitelist (eg quirked) until OEM vs aftermarket is sorted out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I still think that's best. Having it as a quirk still makes it hard to test tho. Maybe we can use --force if something isn't in the whitelist?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that's good for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done the force thing. Do you think it makes sense to whitelist based on model, vendor or guid? Or leave the whitelist to later?

@superm1
Copy link
Member

superm1 commented Aug 21, 2018

Looks like the remaining CI failures are all real things (small though).

}
if (!fu_nvme_device_parse_cns (self, buf, sizeof(buf), error))
return FALSE;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to add a function that at least dumps the vendor block into debug output until we know what to do with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 1024 bytes of it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep the printing of it behind FU_NVME_VERBOSE or similar. I think it would be good when we have some parsing methods to try to throw at it though to have the plumbing around it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@hughsie hughsie force-pushed the wip/hughsie/nvme branch 6 times, most recently from 45ba5f6 to 705789a Compare August 23, 2018 16:30
g_autofree gchar *guid_id = NULL;

/* add extra component ID if set */
component_id = fu_nvme_device_get_string_safe (buf, 0xc36, 0xc3d);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start offset is right, but this should be 8 long not 7 long.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, it is. It's "inclusive" --hence the + 1 in the fu_nvme_device_get_string_safe() length calculation. I did it that way to match the constants in the NVMe spec. If it's confusing I can add a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, a comment would be good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@superm1
Copy link
Member

superm1 commented Aug 23, 2018

Tested it on a Dell disk that does export both an EFI GUID and Component ID, it did build the GUIDs properly. Haven't tried flashing yet though, waiting for a new binary to do that.

/* check the GUID is plausible */
for (guint i = 0; i < 16; i++)
c_cnt += buf[addr_start + i];
if (c_cnt < 0xff)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really true? all EFI guids add up to something smaller than 255? I'd never heard that, but it's great to know if it is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that's completely a made up rule. I figured statistically the GUID would have an average summed value of 2032, and that GUIDs like 01000200-0000-0300-0010-080020000a00 would be statistically unlikely and it probably indicates it's not actually a GUID.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds pretty good.
What would you think about putting in a warning for this situation that it found some data but that it doesn't look like a GUID? That way at least if we find this happening in the wild we can see warnings in fwupd.log to go with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, as I understand it there are quite a few drives with component IDs and not GUIDs, so wouldn't a lot of drives have warnings? Are the early drives going to have zeros for the GUID? I guess we could avoid a warning for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah they should have 0's.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fixed.

fu_nvme_device_parse_cns_maybe_dell (self, buf);

/* fall back to the device description */
if (fu_device_get_guids (FU_DEVICE (self))->len == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your data set were you able to check if any other vendors 'vs' blobs set off the Dell creation of a Dell GUID? With this approach I would just wonder if another vendor vs section has a set of bytes that could represent an EFI guid at that offset) (but doesn't actually) what happens.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes more than the plausible GUID, it also takes a plausible component ID string too.

Copy link
Member

@superm1 superm1 Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I guess that makes sense then for this. It would only apply to "newer" disks (that only have more than one field), but it's safer.

@hughsie
Copy link
Member Author

hughsie commented Aug 26, 2018

Fixed to actually make updates work, and merged, rebasing against wip/hughsie/gudev. Thanks for all the reviews.

@hughsie hughsie closed this Aug 26, 2018
@hughsie hughsie deleted the wip/hughsie/nvme branch August 26, 2018 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants