SSD Firmware #124

Closed
ehegnes opened this issue Mar 16, 2016 · 54 comments

Comments

@ehegnes
Collaborator

ehegnes commented Mar 16, 2016

Prompted by @sprc's suggestion in issue #122, discussion of the SSD firmware shall be confined to this thread.

The following update was suggested by me:

@remexre, you are brave. There be dragons ahead/use at your own risk, and all that. If I were you (when I'm you in a few days), I'd dd my drive as a backup and run this command from a LiveCD with the device unmounted.

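Firmware Blob: firmware-ssd-02.3.bin (https://gist.github.com/ehegnes/92ed8fe0078294b71ec6/raw/f565b2ad326747665b92ba1325b558de75399735/firmware-ssd-02.3.bin)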
MD5SUM: 5cbc5c2da4bed35ea499d18dbde6d295
Command:

sudo hdparm --fwdownload-mode7 "/path/to/firmware.bin" --yes-i-know-what-i-am-doing --please-destroy-my-drive "/dev/sda"

This should all, presumably, work. I'm really not sure. :)

This didn't work, resulting in:

/dev/sda:
fwdownload: xfer_mode=7 min=1 max=65535 size=461312
FAILED: Input/output error

But this also did not harm the drive.

Mentions: @remexre, @recri, @aeroevan

@stefanwiegmann

Does anybody know which versions of the firmware have this problem? I am on 1.8 and have never had any issues running Arch on it (including LUKS encryption, no swap).

@recri
Contributor

recri commented Mar 16, 2016

The point I was trying to make is that it's not normal usage to rewrite the
firmware on your disk drive. So if you happen to download the new
firmware file on to your disk drive, just like you have always done before,
and then run the software to update the firmware, it's entirely possible
that the firmware rewrite will disable your disk drive before you read the
firmware off of it. Programs are not usually written with the expectation
that the file system will disappear while the program is running. The
disappearance of the file system will probably yield an Input/output error,
the program will crash, no firmware will be rewritten, the system will
crash, and all will be well after a cold start, modulo a few dangling
files. And, no, I haven't done that exactly, but, yes, I've done several
very similar things.

-- rec --


@remexre
Contributor

remexre commented Mar 16, 2016

@recri I got a new flash drive, put Arch Linux 2016.03.01 on it, booted it, rsync'd the firmware over from a separate machine, and then ran the command. /dev/sda was not mounted until after I got the error, and there was no crash; the machine didn't shut down until I ran halt -p.

@ehegnes
Collaborator Author

ehegnes commented Mar 16, 2016

@recri, I did mention that it should probably be done from a LiveCD. Google solves this issue by providing the firmware in a depthcharge payload (I think).

@raphael
Owner

raphael commented Mar 16, 2016

@remexre just curious: is that an actual SSD or a USB flash drive?

@remexre
Contributor

remexre commented Mar 16, 2016

The live disk was from a flash drive from some conference, /dev/sda was the SSD.

@ghost

ghost commented Mar 17, 2016

@stefanwiegmann I can confirm that 1.8 is affected: my SSD died about 5 minutes ago and I was running firmware version 1.8 (Ubuntu, not Arch, but it probably doesn't matter). Luckily I dd'd my drive on Wednesday, so if I can get a replacement I'll be able to re-image it.

@raphael did Google give you a hard time about being in developer mode?

@stefanwiegmann

Thanks for the info, pruddiman. I guess I'll dd and sit down on the weekend to do whatever I have to do.

@ehegnes
Collaborator Author

ehegnes commented Mar 18, 2016

Uh oh. Looks like this is a bigger issue than I originally thought. I suppose I too will be dding and spending the rest of the night trying to flash the new firmware.

In the worst case (most kludgy) scenario, would it be possible to restore ChromeOS with the official recovery images, boot with their depthcharge payload so the update installs, and reinstall our *nix distros?

EDIT: I'll be on IRC if anybody has some more insights. I'm not exactly an expert in the world of firmware.

@raphael
Owner

raphael commented Mar 18, 2016

@pruddiman it took some convincing, had to send videos. If you have fsck logs these help.

@ehegnes I keep 5 GB to dual boot ChromeOS for updates.

@ehegnes
Collaborator Author

ehegnes commented Mar 18, 2016

@raphael, that sounds like a wise idea for whenever I reinstall next.


@stefanwiegmann

As I said, I'll dd and flash the firmware, no problem. But I am still curious: is the firmware really the reason for the SSD failures? Or do we have to be careful with something else as well?

I remember the arch thread https://bbs.archlinux.org/viewtopic.php?pid=1587933#p1587933 had comments about degrading performance for some. I never had these issues. Could it be something like passing all the discards through fstab, lvm and luks or swap space/swappiness?

@pierater

Hey I have been following this repo for a little while and going through the threads. I just got my pixel, but I don't know how to check my SSD firmware. I am running Arch with kernel version 4.4.2-6. Also, @ehegnes what channel are you guys on?

@ghost

ghost commented Mar 18, 2016

@pierater you can check your firmware version by running hdparm -i /dev/sda (or whatever device your SSD is) from a terminal. You'll see FwRev=**** on the first line of the output.
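
To pull just the revision out of that output, something like the following might do. It is a rough sketch: fwrev_from_hdparm is a made-up helper name, and it assumes the field appears as FwRev=<value> in a comma-separated line, as described above.

```shell
#!/bin/sh
# fwrev_from_hdparm: reads `hdparm -i` output on stdin and prints the
# value of the first FwRev= field it finds (hypothetical helper).
fwrev_from_hdparm() {
    # Capture everything after "FwRev=" up to the next comma or end of line.
    sed -n 's/.*FwRev=\([^,]*\).*/\1/p' | head -n 1
}

# Usage (needs root; /dev/sda is the SSD in this thread):
# sudo hdparm -i /dev/sda | fwrev_from_hdparm
```

On the affected drives discussed here, this should print something like the S9FM01.8 revision string reported elsewhere in this thread.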

@raphael
Owner

raphael commented Mar 18, 2016

@stefanwiegmann good question. I had my swap mounted on a USB drive, so we know that's not the problem. I wonder if my usage pattern of 10 Linux kernel compilations a day contributed to the problem, although I'm not sure it's the right order of magnitude (in terms of how many writes one can make before the SSD starts failing). It did degrade progressively, as I had to run fsck a few times before the final failure. I wish I knew the real root cause.

@stefanwiegmann

@raphael so you constantly had errors piling up in fsck runs? I don't think your compiling is enough to cause these issues already if the drive should last years under "normal" usage, with "normal" covering average users writing, reading, and caching movies and music all the time.

@raphael
Owner

raphael commented Mar 18, 2016

Agreed. I had to run fsck a few times over the course of a few weeks before it wouldn't be able to fix all the problems.

@nelsonni
Contributor

I've got Arch running on SSD firmware revision S9FM01.8 without any performance issues. I've got it formatted with Btrfs though, so read/write patterns are probably going to be slightly different.

@pierater

I'm running arch with 1.8, although it's only been just over a week and I haven't noticed anything. How long is it taking for the SSDs to die?

@ethanmad
Contributor

Same situation for me as @nelsonni: Arch, btrfs, firmware 1.8. Good partition scheme with nothing fancy on top. I've had the Pixel set up this way since August.

But I'm scared. I can't have my computer stop working while I'm at school.

@stefanwiegmann

It's good to read that some report no issues on 1.8, but having updated firmware wouldn't be bad either. Did anybody flash the firmware outside of ChromeOS (from another LiveCD) via hdparm?

@ehegnes
Collaborator Author

ehegnes commented Mar 20, 2016

I'm using F2FS on 1.8, and fsck never fixes the errors on boot. But at least the errors aren't increasing in number.

@stefanwiegmann I just tried hdparm via the Gentoo LiveCD, and I had a mite more luck with --fwdownload-mode3, but I still get the inevitable FAILED: Input/output error. Apparently, some drives just aren't compatible with hdparm due to the way they structure their firmware files, so maybe that's the case for us.

If it's any consolation, it seems that 1.7 is the earliest fw, not 1.8.

I'm going to ditch this custom update method for now and try to dual boot as per @raphael's suggestion, updating via the official payload.

@ehegnes
Collaborator Author

ehegnes commented Mar 20, 2016

Although it will be a longer solution, I can do a write-up if it works.

@stefanwiegmann

@ehegnes did you try SSD memory cell clearing (https://wiki.archlinux.org/index.php/SSD_memory_cell_clearing)? I wonder if that would fix it. I did that before my install, just as a precaution. It wipes everything, so you either dd first or take the time to install everything all over again.
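
As far as I recall, that procedure relies on the ATA security feature set and only works while the drive reports "not frozen" in hdparm -I. A small sketch of that pre-check follows. The helper name drive_not_frozen is invented for illustration, the password and device in the comments are placeholders, and the erase commands are left commented out because they destroy all data on the drive:

```shell
#!/bin/sh
# drive_not_frozen: reads `hdparm -I` output on stdin and succeeds only
# if the Security section contains a "not frozen" line (hypothetical helper).
drive_not_frozen() {
    grep -Eq 'not[[:space:]]+frozen'
}

# Usage (needs root; /dev/sdX is a placeholder):
# sudo hdparm -I /dev/sdX | drive_not_frozen || echo "drive is frozen, stop here"
#
# Only after a full backup, and only if the check passes (DESTRUCTIVE):
# sudo hdparm --user-master u --security-set-pass PasSWorD /dev/sdX
# sudo hdparm --user-master u --security-erase PasSWorD /dev/sdX
```

Treat the wiki page as the authority on the exact steps; this only illustrates the frozen-state check.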

@ehegnes
Collaborator Author

ehegnes commented Mar 20, 2016

@stefanwiegmann neat resource! I'll try it before I recover ChromeOS.

@ehegnes
Collaborator Author

ehegnes commented Mar 21, 2016

@stefanwiegmann, full disk erasure did not solve the hdparm issues.

I'm currently in the process of recovering and setting up dual boot.

@stefanwiegmann

@ehegnes "hdparm issues" meaning being able to flash the firmware?

I was more hoping/expecting the fsck errors would go away :-) Good luck!

@ehegnes
Collaborator Author

ehegnes commented Mar 21, 2016

@stefanwiegmann oops! I meant being able to flash the firmware. I suppose I misread your suggestion and just missed my chance to check the errors.

I am, however, on 2.3 now. I used this script, provided by Google, to create a recovery drive. After recovery (hold ESC + F3 and tap Power), on its first boot it warns of a "critical update" and reboots twice, presumably installing firmware update(s), before launching into ChromeOS.

At this point, I could either dd my backed up Gentoo image to the drive, or try to dual boot.

@stefanwiegmann

@ehegnes :-) at least you have newest firmware now.

So, do we know now, if you don't get any errors anymore, what it was? firmware or ssd-reset? Guess it wouldn't matter much to you ;-)

@ehegnes
Collaborator Author

ehegnes commented Mar 21, 2016

I actually don't know if I'm error-free. I don't have a LiveCD with recent enough fsck tools to be able to check. I'll check and report as soon as I restore.

@stefanwiegmann

Okay, I am curious. Thanks for updating!

@ehegnes
Collaborator Author

ehegnes commented Mar 24, 2016

Finally got a dual boot working. Ran a full check on all partitions and everything seems fine with the disk. I suppose we never found a proper solution to updating the firmware, but I'd be glad to do a write-up if there is interest? Some of the intricacies of dual booting with ChromeOS are not trivial.

@colemickens
Contributor

How did you do it? The script in the Arch wiki was painless and nearly fully automated, if I recall correctly (it was many months ago at this point). I had decided to dual boot for this exact reason, firmware updates, though I was more concerned about the typec->dp adapter.

@ehegnes
Collaborator Author

ehegnes commented Mar 24, 2016

@colemickens, after restoring ChromeOS, I used Google's GPT partitioning tool, cgpt, to resize the stateful partition that ChromeOS uses and to resize /dev/sda6 (labeled KERN-C) and /dev/sda7 (labeled ROOT-C) to fill the remaining space. I recommend using a script like the one ChrUbuntu uses for this, as it is much easier than learning cgpt. After a reboot to fix the stateful partition, boot into a LiveCD, restore your backed-up root partition to /dev/sda7 and your boot partition to /dev/sda6, and proceed with the usual steps (grub2-mkconfig -o /boot/grub/grub.cfg and grub2-install /dev/sda --force) to set up the bootloader. Then press Ctrl+L as usual at the boot screen and wait for it to time out to your bootloader. Make sure to change /etc/fstab to reflect the new partition scheme.

That was a mite rushed. I can include resource links and specific commands if it would help.

@ehegnes
Collaborator Author

ehegnes commented Mar 24, 2016

The parts that weren't trivial for me were recognizing that you can only resize partitions, not delete or create them (otherwise, ChromeOS complains that you need to restore the OS again) and recognizing that you need to use /dev/sda6 (or KERN-C) to house your kernels (that last part may not be true, but it's the only way I could get it to boot).

@colemickens
Contributor

I just used this: https://wiki.archlinux.org/index.php/Chrome_OS_devices#Alternative_installation.2C_Install_Arch_Linux_in_addition_to_Chrome_OS

Worked out of the box. Didn't have to do anything special with kernel placement or anything. A completely normal install of Arch worked and I could dual boot afterward.

@ehegnes
Collaborator Author

ehegnes commented Mar 24, 2016

@colemickens Right, that's the kind of script that I would recommend for partitioning. Is your boot partition separate from your root partition, or is everything on one partition?

@colemickens
Contributor

I was lazy, it's just one partition.

@ehegnes
Collaborator Author

ehegnes commented Mar 24, 2016

Then I stand corrected and suppose you don't need to use KERN-C. :) Thanks for the added info.

@stefanwiegmann

Glad you are back in business! I guess for now I am too lazy to change anything. It sounds like everybody had visible errors leading up to this, and you two just proved that it can be fixed at that point by yourself. As long as I don't get fsck issues, I'll keep what I have. Thanks!

@iain-logan

So what's the state of play now? I've been running Slackware current since a little after the launch of the pixel, and as a result haven't received any of Google's firmware updates. Is it critically recommended to get these updates? Seeing this has kinda scared me, I can't have my SSD die during uni.

@ehegnes
Collaborator Author

ehegnes commented Apr 2, 2016

@iain-logan, I would dd the entire disk (/dev/sda, so it includes your partition table and all) to some external storage, restore ChromeOS for the firmware update(s), and flash your backup to the disk, again with dd.
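
Roughly, the backup and restore steps could look like the sketch below. It assumes GNU dd run from a LiveCD with the SSD unmounted; image_copy is a hypothetical wrapper name and the /mnt/external path is a placeholder for wherever your external storage is mounted:

```shell
#!/bin/sh
# image_copy SRC DST
# Hypothetical thin wrapper around dd: copies SRC to DST in 4 MiB blocks
# and flushes the output to disk before returning.
image_copy() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=4M conv=fsync
}

# Backup the whole disk, partition table included (run as root):
# image_copy /dev/sda /mnt/external/samus-backup.img
#
# Later, after ChromeOS has applied the firmware update(s), restore:
# image_copy /mnt/external/samus-backup.img /dev/sda
```

Double-check which argument is the source before running the restore, since swapping them overwrites the backup's target without warning.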

I opted to dual boot, if only because it makes playing DRM content much less painful.

Updating the firmware is probably a good idea, and if you are getting fsck errors that can't be resolved, a full disk erasure before restoring the backup, as @stefanwiegmann suggested, might help.

Would you like a detailed guide? :)

EDIT: Also, Slackware is awesome.

@iain-logan

Thanks for the prompt response!

Cool, I've got dd running making a backup of the disk currently.

Dual boot does sound like an attractive option for getting future updates, but I think I'll need to leave that for when I have more time.

In regards to fsck errors, I don't have a live image to hand, so I can't unmount my main partition to run fsck on it. Again, I think I'll leave that until I have a little more time.

I'm ashamed to admit that I'm no expert when it comes to this kind of thing, so a bit of a guide would be really appreciated. Perhaps this kind of information would be worth adding to the README here?

@stefanwiegmann

if I remember correctly, the situation is this; @ehegnes, @raphael, please correct me, if I'm wrong:

There are Pixels with SSD firmware 1.8 or lower that suffer from degrading performance and visible fsck errors. There are also many Pixels on 1.8 that don't have issues. fsck on boot should be on by default; you would know if you had turned it off. If you have errors, you should see them during boot. See fsck on the ArchWiki: https://wiki.archlinux.org/index.php/Fsck

@raphael didn't do anything about it and it died. He was able to get it replaced, but it didn't sound like the standard-no-questions-asked procedure.
@ehegnes had the errors and did two things: SSD memory cell clearing, and then a reinstall of ChromeOS, which took care of updating the firmware. At this point we know this will solve the issues, but we don't know whether only one of them would have been sufficient. My idea earlier was to back up with dd, reset the SSD without updating the firmware, and then restore via dd. But, hey, there is a Turkish proverb: The shortest way is the way you know.

Once you get fsck errors or have degrading performance, you still have time to do what @ehegnes did. I don't have problems (I am on 1.8) and will wait until I get them or until I want to redo everything anyway.

@SimionKreimer

Dual boot seems like a good way to go for future updates. It would be
really nice to have some step by step directions on how to do all of that.


@ehegnes
Collaborator Author

ehegnes commented Apr 4, 2016

@stefanwiegmann, that all seems correct, except I didn't do the disk erasure in the way that the Arch wiki describes. I just restored ChromeOS, which is (probably?) effectively the same.

@SimionKreimer, I'll do a full write-up tomorrow morning with directions for dual-booting Arch, Ubuntu, and other distros, along with directions for backing up, installing the updates through ChromeOS, and restoring without dual-booting.

@raphael, the README is getting kinda long. Might it make sense to add my write-up to a wiki page instead and add a link to it on the README?

@recri
Contributor

recri commented Apr 4, 2016

Well, I have a dirt simple dual boot which may not be for everyone.

I simply told Ubuntu 15.10 to install on /dev/sda1 as the root and only
partition, without reformatting and without removing anything, so the
entire Ubuntu installation is on a drive which ChromeOS uses in mysterious
ways but has not had a conflict that I have noticed. I also told the
Ubuntu 15.10 installer to make a bootstrap for the partition, which uses a
deprecated block list bootstrap, but also works as far as I've noticed.
Specifying this boot method the first time, manually, was a pain, but the
installer did it quite simply. This setup, in one of two versions, has
been running for months. I only reinstalled because I had a replacement
Pixel.

It's extremely unhygienic, but there it is.

-- rec --


@ehegnes
Collaborator Author

ehegnes commented Apr 4, 2016

Actually, I didn't realize that there wasn't an install script for Arch, so that won't be in the write-up.

@recri that may work, although I wouldn't know how. I was thinking of summarizing the instructions here and providing a script so that people can understand how their system is being partitioned while also easily making room for another distribution.

Then they can simply use an official installer or install normally, taking care to select a certain partition as their root partition.

@vadixidav

Is there any way to update the firmware after the SSD stops working? Right now using the recovery media doesn't help either and responds with "an unexpected error has occurred." SeaBIOS stopped working a while ago as well, which meant I couldn't boot to external media either. Now I don't see how it could possibly boot. I might try to see if they will replace my Chromebook, but I imagine they won't.

Edit: Also looks like I've gone over the 1 year warranty as well.

@cowlicks
Contributor

Hey y'all, just to confirm. The SSD issues were definitely due to firmware?

I was having SSD problems and just assumed it was a hardware failure, so I reinstalled ChromeOS so that I could get this thing warrantied. But upon installing ChromeOS, the failures stopped, and I'm not sure how to detect the errors with ChromeOS. The "crosh" storage_test_1/2 tests don't find anything. Also, I'm not even sure how to check my SSD firmware version from ChromeOS (I guess no one has written a browser extension for that, hehe).

@stefanwiegmann

:-) You basically did what @ehegnes did: restored ChromeOS, which in the process updated your firmware. It also wiped your SSD completely before doing that. This is "a" way to fix it. Good to know it works this way.

The other way I was curious about is doing the wipe yourself, keep the old firmware and install from scratch (Not ChromeOS). But this is hypothetical, if anybody tries this, please keep me posted.

@cowlicks
Contributor

@ehegnes I'd appreciate a write-up. I'd like to dual boot Arch and ChromeOS so I can still get firmware updates. Also, I've never installed Arch; I was on Debian before.

@ehegnes
Collaborator Author

ehegnes commented Apr 18, 2016

@cowlicks, I have been meaning to do a write up for a while. I'll make that a priority for the wiki. @colemickens pointed out above that he used this Arch wiki section to guide him in installing Arch alongside Chrome OS. That guide, however, is a mite vague, and I aim to provide something that is much more detailed.

@vadixidav

Good news, they replaced my Pixel! I tried to make crouton work, but it just doesn't work well with Arch Linux. Hopefully there won't be another firmware bug like the previous one...

Just to be clear, ChromeOS permanently updates the firmware, right?

raphael closed this as completed Aug 24, 2016