Digital Ocean API is not told to scrub (securely delete) VM on destroy #2525

Closed
sneak opened this Issue · 53 comments
@sneak

Major security issue: the Digital Ocean API has a parameter on the destroy call to securely scrub the root blockdev on VM destroy, preventing future customers from reading the data left on disk by your VM.

This is surely a digitalocean security issue, but they're passing it on to users by making it a parameter - rather shitty of them. This is documented in their API at https://cloud.digitalocean.com/api_access - see "scrub_data".

Fog does not pass this parameter, leaving Fog-destroyed VMs vulnerable to later customers stealing the data contained on them.
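
For reference, a scrubbed destroy against their v1 API looks roughly like this. This is a minimal sketch assuming the endpoint shape documented under "scrub_data"; the droplet id and the DO_CLIENT_ID/DO_API_KEY environment variables are placeholders.

```ruby
# Hedged sketch of a scrubbed destroy call against DigitalOcean's v1 API.
# Endpoint shape and parameter name are taken from the "scrub_data" docs
# linked above; credentials and droplet id are placeholders.
require 'net/http'
require 'uri'

droplet_id = 12345  # hypothetical droplet
uri = URI("https://api.digitalocean.com/droplets/#{droplet_id}/destroy/")
uri.query = URI.encode_www_form(
  client_id:  ENV.fetch('DO_CLIENT_ID'),
  api_key:    ENV.fetch('DO_API_KEY'),
  scrub_data: true  # omit this and the root blockdev is NOT wiped
)
puts Net::HTTP.get(uri)
```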

@sneak

Confirmed - I am able to recover previous customer data off of DigitalOcean blockdevs (dd if=/dev/vda bs=1M | strings -n 100 > out.txt), so any DigitalOcean VM ('droplet') that is created, has sensitive data copied to it, and is then destroyed via test-kitchen will leave that data on the SSD for the next customer to read.

@sneak

[Screenshot, 2013-12-30 5:30 AM: strings recovered from the root blockdev]

These are some of the strings I pulled off the root blockdev - I have no such /var/deploy/chegou, or any of these files. I was able to recover someone else's webserver logs from yesterday, as well.

@sneak

The DO api documentation not behind login is here: https://developers.digitalocean.com - search for "scrub_data"

@jgod jgod referenced this issue in smdahlen/vagrant-digitalocean
Open

Scrub data when deleting #83

@icco
Collaborator

Kinda seems like we should set scrubbing the data to be on by default. Shouldn't be hard. I don't have a DO account to test with, but I'll get one and try to submit a patch.

In related news, this was posted to hn: https://news.ycombinator.com/item?id=6983097

@icco
Collaborator

So the request we need to update is /droplets/[droplet_id]/destroy.

@sneak

AFAIK it should be as straightforward as adding "?scrub_data=true" to

https://github.com/fog/fog/blob/master/lib/fog/digitalocean/requests/compute/destroy_server.rb#L13

Per Will at DigitalOcean support, "true" is the value to pass for a parameter documented as boolean in their API docs.
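
Roughly this shape, for illustration (a sketch, not the exact upstream code; the :query plumbing through fog's request helper is an assumption):

```ruby
# Rough sketch of the suggested change to destroy_server.rb: ask DigitalOcean
# to scrub the blockdev when the droplet is destroyed. Not the exact upstream
# code; the :query option is assumed to be passed through to the HTTP request.
def destroy_server(server_id)
  request(
    :expects => [200],
    :method  => 'GET',
    :path    => "droplets/#{server_id}/destroy",
    :query   => { 'scrub_data' => 'true' }
  )
end
```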

@icco
Collaborator

Yeah, got the commit ready, just wanna test it. Sadly, whoever wrote the DigitalOcean bootstrap method didn't set sane defaults. Or any, for that matter...

@agh

I'm guessing the DigitalOcean guys would prefer it if you didn't scrub your volumes, given they're on solid-state drives and almost certainly aren't using SLC NAND so they (probably) have lifetime write endurance issues to ponder.

This preference is likely expressed through scrub_data not being true by default.

Fewer writes. Fewer drives going pop. More money to be made, since drives which fizzle out once their allotted ~60TB (or whatever) has been written to them are generally not replaced under warranty.

Of course unless you (as the customer) know there's nothing sensitive on there, better safe than sorry. :smile:

@johnedgar

There are a couple of things I wanted to say, and I can speak with some authority on the subject as I speak on behalf of DigitalOcean.

This was mentioned to me on Twitter hours ago, prior to this post. The first thing I said is that most people these days understand the importance of responsible disclosure, and that we take all security issues very seriously. Not following responsible disclosure with a company such as DigitalOcean is extremely irresponsible, and I would be remiss not to point out that if anyone ever did find a software vulnerability, filing it and waiting 24 hours for the appropriate response is preferred. (EDIT: I wanted to edit this and note that this paragraph was written when the original understanding was that the flag was being passed but not being respected)

As far as I can tell here, there is no unexpected behavior that isn't documented or stressed. In both our API documentation and our control panel, we note that you must either pass a flag or check the box to securely delete the data. As far as I can tell, the flag is currently functioning correctly.

Is the complaint that customer data is being leaked from VMs? That the flag being passed via our API/Dash isn't actually working? Or, that our policy on not doing a secure delete by default isn't something you agree with?

j.

@ahknight

@johnedgar The hubbub is most likely that the default behavior is to leak data and that users of software that consume the DO API are several layers removed from the flag that would protect them, so DO being proactive and defaulting it to True would be a Good Thing™.

@agh

@johnedgar Are you implying (or stating) that it is your opinion, or that of your employer, that this issue being opened constitutes irresponsible disclosure of a potential security issue involving the DigitalOcean platform?

It certainly feels like you're :point_right: that "extremely irresponsible" finger at someone and it wasn't just an idle comment.

@icco
Collaborator

@johnedgar, I believe @sneak's concern was that Fog wasn't surfacing this scrub data option, and also that it was opt-in instead of opt-out. Beyond those concerns, everything is working properly.

(Although, I personally would love a web UI for the /events API to verify that the code in Fog is working properly with your API.)

@johnedgar

@agh I'm saying that if there was a security issue we would hope that it would be done via our responsible disclosure policy as we do take security seriously and do address all concerns.

@icco Roger. If you email me with a suggested implementation I'll happily pass it along to our api guy.

@ahknight We're extremely responsive to our users, I'm going to mention it to our product team tomorrow, but also http://digitalocean.uservoice.com/ is very well monitored.

@sneak

To frame full disclosure as irresponsible is unprofessional. There is nothing irresponsible about full immediate disclosure.

It's also a red herring. You yourself said that this is a design decision, working as intended, and that I should tell people about it to raise understanding and awareness.

https://mobile.twitter.com/jedgar/status/417515507018129408?screen_name=jedgar

This is simply a case of Digital Ocean using dark patterns and user-hostile defaults. No other major cloud provider charges their customers extra to not give their data to the next user.

You don't take security seriously, as this design and your tweets show.

@sneak

The API call is called destroy. When called, it doesn't destroy the data without a special, poorly documented flag. Your customers are unaware of this. Libraries don't know this. Tools that rely on those libraries don't support this. The default behavior of "destroy" is "don't destroy".

I was using DO via fog and kitchen-digitalocean today. test-kitchen tore down a vm with a bunch of sensitive data (chef cookbooks with attribute files containing credentials) on it during normal operation. You happily make that data available to third parties by default, a fact I verified before freaking out. This will now cost me multiple thousands of dollars in customer credits for days or a week I must now spend auditing and invalidating those cookbook credentials. Thank heavens I had stopped it before kitchen synced our webserver private keys to the machine or this would be an even bigger nightmare.

Your API docs don't even say what value needs to be set for the "scrub_data" flag. That I had to get via a customer support rep.

@ahoward

whatever DO says next is going to weigh heavily on their future - let's hope someone with business sense has a GH account....

@johnedgar

@sneak I appreciate your product feedback.

I think it's valuable, and I'll be sharing it with the rest of our company, including our whole C-team, tomorrow morning at our weekly kick-off meeting. I'll report back with the result of that. Like I've said 100000 times, and can't be any clearer about: we do take security seriously.

If you'd like, I'll happily give you a call tomorrow.

@sneak

Show me any other vps provider that silently provides access to customer A's data to customer B after receiving commands from customer A to destroy their instance and then I'll believe you guys aren't at the very bottom of the "takes security seriously" list.

A single counterexample would convince me you're not totally insane to think this a reasonable default.

@johnedgar

@sneak To be perfectly honest, I have no idea, and it's 4am on Sunday morning and I've spent the last 4 hours on a Hacker News story, so I'm going to go to bed rather than research it. However, I don't want it to look like I'm leaving this hanging; I'll answer all these questions, and more, tomorrow. :)

@lorenzogatti

"I think it's valuable, and I'll be sharing it with the rest of our company, including our whole C-team, tomorrow morning at our weekly kick-off meeting."
Tomorrow morning you'll be dead meat. This sort of problem can and should be fixed by drastic emergency measures.

@mbrownnycnyc

1) I don't think there's any reason DigitalOcean needs to offer this for free. As agh states, there is a real cost involved.

2) There is no reason you can't, and shouldn't, be using shred to delete sensitive data BEFORE you use some "magic/opaque" call in a web UI (for all intents and purposes) to "delete" an instance (see the sketch below). This sort of event should have been caught in a security audit done by the end user on their hosting provider.

The solution here would be to have DigitalOcean warn you to shred sensitive data, or it might be nice if they coded that in: "We believe that you have MariaDB data files on your instance. Would you like us to shred those for you?"
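
As a rough sketch of that do-it-yourself approach (the file paths are hypothetical examples, and it assumes shred(1) is available on the droplet):

```ruby
# Hedged sketch of the "scrub it yourself before destroy" idea: overwrite
# known sensitive files on the droplet itself before calling the destroy API.
# Paths are hypothetical examples, not a real inventory.
sensitive = ['/etc/chef/client.pem', '/var/deploy/secrets.yml']
sensitive.each do |path|
  # shred overwrites the file; -u unlinks it afterwards, -z ends with zeros
  system('shred', '-u', '-z', path) if File.exist?(path)
end
```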

@petems petems referenced this issue in pearkes/tugboat
Closed

Destroy data by default... #81

@jedsmith

> Show me any other vps provider that silently provides access to customer A's data to customer B after receiving commands from customer A to destroy their instance and then I'll believe you guys aren't at the very bottom of the "takes security seriously" list.

Every hosting provider goes through learning the same things. First it's shredding volumes before reusing them, then ARP attacks against gateways on the subnet, then ...

If you delete a Linode, volumes, snapshots, anything, the data is shredded unconditionally. You don't get a choice. This also happens at Rackspace, Joyent, you name it. It's such a common vector that when a new hosting provider comes to town, it's one of the first things people try. That a high-level employee of DO came out swinging on behalf of the company when shown this pull request intrigues me. A lot.

I also noticed that only one API method has the shred parameter. I don't see a shred parameter for destroying volumes, such as snapshots. That concerns me too.

> The solution here would be to have DigitalOcean warn you to shred sensitive data, or it might be nice if they coded that in: "We believe that you have MariaDB data files on your instance. Would you like us to shred those for you?"

My disks at a hosting provider are a black box. Get the hell out of them. Your job is to back them up at the block level if I request and plug them into my box. If you are mounting the filesystem, we have problems.

Regardless, this is not a fog issue in the broad sense, and this probably isn't the appropriate forum.

@tobz

Why is data lifecycle management so inextricably linked to the hosting provider? You're paying them for an unmanaged instance, where you hold the keys to the castle and handle all aspects of the guest OS. For all intents and purposes, this feels like it should be like any other decommissioning situation. You need to make sure you're scrubbing / shredding where possible when releasing resources back to the wild. Can't physically shred a drive? Better zero it out.

Granted, there are a lot of people simply seeing the low cost per month and saying "hey, I should have a server!". They might not know anything about compliance or shredding drives, and probably treat their instances as if DO is building them a new one, and then deleting and shredding it for them, like a physical server. That's obviously not the case for any cloud provider, though, and they all differ by their varying levels of automatic wiping, etc.

Maybe we should also try and educate users before lampooning DO over this? Also, for what it's worth, while I can find references to wiping EBS volumes before reuse in EC2, I can't find any reference to ephemeral drives being wiped as well. Anyone looked into this?

@jedsmith

> That's obviously not the case for any cloud provider, though, and they all differ by their varying levels of automatic wiping, etc.

@tobz Nope. Every major provider, Amazon included (for both EBS and ephemeral volumes), scrubs every inch of user data before reusing the underlying media. It's one of the unspoken industry requirements to graduate from "Joe's VPS shack" to "reputable provider."

Taking another opportunity to say that even though this pull request was linked from outside, the fog developers don't care about this debate and it's not relevant here. Suggest taking it back to HN.

@tobz
@jedsmith

I'm ex-industry and can speak authoritatively on one provider, but I don't like to invoke that I used to work there publicly out of respect for them, since I'm just some dude. I know several people at the other names I mentioned, including Amazon, and can speak with authority on Amazon since I brought it up with them at a prior employer as a concern.

@tobz
@jedsmith

I have no incentive to lie to you, particularly in a public forum, and since you're having a hard time grasping "this isn't the place, dude," I'm not going to continue this discussion. My apologies to the fog people for dragging it as far as it is.

And people wonder why I avoid commentary...

@tszming

@tobz, I think you can refer to p.18 for ec2 ephemeral storage

> Customer instances have no access to raw disk devices, but instead are presented with virtualized disks. The AWS proprietary disk virtualization layer automatically resets every block of storage used by the customer, so that one customer's data are never unintentionally exposed to another. AWS recommends customers further protect their data using appropriate means. One common solution is to run an encrypted file system on top of the virtualized disk device.

@FlorianHeigl

Easy solution:
Do not wipe the disks when a customer reinstalls, etc., since inadvertently reinstalling without being aware of what this does to the data is quite common(*).

Whenever a VM (disk) is no longer used by the same unique customer, wipe it. <- there is not a single reason not to do it.
If your system doesn't support it, this should be worked on.

(*)As are, apparently, people who don't delete their disks before passing back a server. Did you wonder about the difference between an admin and someone who can use an API? This thread just found it. So sad.

@markprzepiora

Can someone confirm that @johnedgar really represents DigitalOcean? Because what a friggin joke. When DigitalOcean first appeared as a half-price alternative to Linode, I was skeptical. I decided to wait a year or two before switching over my accounts to them because I was worried it would turn out to be an amateur-hour gong show.

Now I know...

@yanatan16 yanatan16 referenced this issue from a commit in yanatan16/python-digitalocean
@yanatan16 yanatan16 Add scrub_data option to destroy.
See fog/fog#2525 for priority.
2a5ec09
@yanatan16 yanatan16 referenced this issue in koalalorenzo/python-digitalocean
Merged

Add scrub_data option to destroy. #20

@iangcarroll

@markprzepiora He has a company email on his profile, so I assume you could email that.

@foolano19

:+1: to scrubbing data by default, given the side effects of not doing so.

I'm responsible for the initial implementation of the provider and, to be honest, I don't remember if that API parameter was there or not. Maybe I just missed it or I didn't fully understand the implications and skipped it, since it was optional.

FWIW, if you're a DigitalOcean customer and want them to change their policy, the best thing would be to get in contact with them and/or open an issue there. I think this issue should be dedicated to discussing whether it's a good idea to scrub data by default or not.

@rubiojr
Collaborator

Sorry, :point_up: that was me using the wrong browser tab and account...

@rubiojr
Collaborator

way to go @nphase, thanks.

@geemus geemus closed this in #2526
@geemus
Owner

Wow. Maybe the most controversial/discussed fog issue to date. Thanks everyone for remaining largely civil. I believe the fog side of this issue has now been merged in (fog will now default to scrubbing). I'd encourage further thoughts and discussion to move to DigitalOcean's UserVoice (as I would expect it to get more attention/response there). See: http://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/5280685-scrub-vms-on-delete-by-default
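
For anyone updating: a minimal usage sketch, assuming the merged default from #2526. The credential option names follow fog's existing DigitalOcean provider, and the droplet id is a placeholder.

```ruby
# Minimal usage sketch after the fix: destroying a droplet through fog should
# now send scrub_data by default. Credentials and droplet id are placeholders.
require 'fog'

compute = Fog::Compute.new(
  :provider               => 'DigitalOcean',
  :digitalocean_api_key   => ENV['DO_API_KEY'],
  :digitalocean_client_id => ENV['DO_CLIENT_ID']
)

server = compute.servers.get(12345)  # hypothetical droplet id
server.destroy  # now asks DigitalOcean to scrub the underlying blockdev
```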

@johnedgar

Just a follow up, we'll be issuing a blog post shortly (we're just finishing it now).

@mmalecki mmalecki referenced this issue in pkgcloud/pkgcloud
Closed

Tell DigitalOcean to scrub disk data #215

@contra

@sneak I just have to say that posting on HN about this 2 hours after you opened the issue (before anyone even replied) is extremely unprofessional

@mikesun

Anybody know if the scrub on SSDs actually works correctly? I'm not familiar with how DO maps its virtual block devices onto the actual SSDs, but if they are file-backed, it's been shown that securely erasing files on SSDs is a difficult problem --- see https://www.usenix.org/legacy/events/fast11/tech/full_papers/Wei.pdf

@sneak

@Contra

1) full immediate disclosure of security bugs and/or issues is both responsible and professional.

2) this behavior is BY DESIGN on the part of digital ocean.

3) their very own "Chief Technology Evangelist" John Edgar agreed with me when I suggested that this behavior be more widely publicized for their users' sake:

https://twitter.com/jedgar/status/417515507018129408

@johnedgar

Hey guys, just wanted to note we've updated our blog: https://digitalocean.com/blog

@jedsmith

I know, I've been the guy saying to move this discussion elsewhere, but I'm pretty amused by the gap between @sneak's screenshot and experiences with compromising his keys, and the blog post outright lying and saying "there was no user data compromise! all is well!" That blog post was clownshoes, and I hope you know that, @johnedgar, which makes it even funnier because "transparency" is in the title.

This is the turning point for me on taking DO seriously. I used to give you guys a lot of slack because I worked at a competitor -- one you misrepresent a lot, for what it's worth -- and was cognizant of my bias, but hoo boy. That blog post did the trick.

@kenperkins

Thanks for the heads up on this. @mmalecki and I have published pkgcloud v0.8.17 (the node equivalent of fog) with scrub_data defaulting to true for the DO destroyServer call. pkgcloud/pkgcloud#218

@jamiesonbecker

Couple of points about 'other' providers, specifically AWS. I'm sure Linode does securely scrub data, but keep in mind that (IMO) Linode's higher level of service and security does come with a substantially larger pricetag.

Amazon used to refer to securely scrubbing disk between users in prior PCI compliance whitepapers (I don't have them now). They don't anymore. In fact, on page 65 of their Risk and Compliance Whitepaper (aws.amazon.com/compliance/) they say "Refer to AWS Overview of Security Processes Whitepaper".

In THAT whitepaper, found here: http://aws.amazon.com/security/, they just repeat the same statement (page 8) that they securely destroy data as part of HARDWARE decommissioning.

AWS apparently does not wipe ephemeral storage at all, and in the event of an instance being killed off (it is 'ephemeral', after all), you have no chance to securely wipe that data before it is re-allocated. However, AWS DOES "wipe" (no details as to how) data on EBS volumes (page 20 of that same whitepaper) and suggests that you perform your own wipe prior to deleting the volume. (Of course, EBS volumes can and will move from disk to disk and machine to machine, so this implies that securely wiping before deletion will not have any impact on the previous disks that the data may have lived on.)

IN FACT, that 60 page whitepaper DOES NOT EVEN CONTAIN THE WORD EPHEMERAL. They ONLY wipe EBS. THE IMPLICATION: DON'T STORE ANY PRIVATE DATA OF ANY SORT ON EPHEMERAL DISKS AT AMAZON!

IMO, Digital Ocean's processes thus far have not been shown to be worse than Amazon's except that Amazon at least makes some sort of effort by default to apply some sort of data wipe before reusing virtual disk (only via EBS! not ephemeral!!!) and DO does not. However, apples to apples, Amazon does not appear to even offer the OPTION of wiping ephemeral disk as DO does! I'm going to have to go start sifting ephemeral stores at amazon for real data to see if this is true.

@grayj grayj referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@grayj grayj referenced this issue from a commit in grayj/salt
@grayj grayj Made destroy always scrub_data.
In response to security issue where DigitalOcean does not clean storage media between users.

fog/fog#2525

saltstack#9488
db9b2aa
@grahamc

@jamiesonbecker My theory RE AWS is they don't store data directly on disk, but instead have something like CephFS or GlusterFS behind it. Since this method of recovering data is specific to the hardware behind it, it may not apply. Just a theory.

@jamiesonbecker

@grahamc Even if that's the case (which is possible but I view it as unlikely: I believe ephemeral disk is simply local, unreliable disk from the Xen build), data fragments (such as AES keys) would probably still be recoverable. The lack of transparency and Amazon's famous secrecy does not do them any favors.. if they ever do have a serious breach, they'll have to open the door up that much wider to restore public trust.
