Feature request: ransomware protection using S3-compatible Object Locking (immutability) #1067
Related: #1090
S3-compatible object locking relies on 2 main features to provide "immutability" and protect objects from deletion or corruption: (1) object retention times; and (2) object versioning. When an object-locking bucket is created, object versioning is enforced and cannot be disabled.
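For reference, a minimal sketch of how such a bucket can be set up with the AWS CLI; the bucket name, region and retention period are placeholders, and other S3-compatible providers expose the same settings through their own consoles or CLIs:

```
# create the bucket with object lock (and therefore versioning) enabled from the start;
# outside us-east-1, also pass --create-bucket-configuration LocationConstraint=<region>
aws s3api create-bucket --bucket my-kopia-bucket --object-lock-enabled-for-bucket

# apply a default retention period so every new object version is locked automatically
aws s3api put-object-lock-configuration --bucket my-kopia-bucket \
  --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'
```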
Restoring data from a kopia snapshot may require reading from the bucket the specific object versions that correspond to a particular "point in time". #1259 adds support for accessing a kopia repository at a specified point in time, assuming that the corresponding object versions are still present in the store and accessible. This feature works with S3-compatible stores, both (a) buckets with object locking and (b) buckets with only object versioning enabled. Again, the requirement is that ALL object versions corresponding to the specific point in time are still present in the store. For this reason, it is necessary to enable object locking and have a retention policy for the objects in order to be able to recover the state of the repository at the desired point in time.
@drsound IIUC B2 offers an S3-compatible interface. Presumably, the S3 implementation should work. It would require accessing the repo as if it were an S3 store. I don't have access to Backblaze, so I have no way of testing it.
Thanks @julio-lopez, I already thought of using B2 as S3-compatible storage: I'll do that and report whether object locking works fine or not.
Good to know, definitely helpful for testing.
According to the B2 documentation, it seems that B2's S3 API should provide all the functionality that kopia needs from the store via the S3 API: https://www.backblaze.com/b2/docs/s3_compatible_api.html. Also, other folks have had success using kopia + B2 for immutability and point-in-time restore. This was all done via the S3-compatible API.
@julio-lopez Is there documentation about how to use the features introduced in 0.9.0? Also, is there more to this issue such that it is not closed yet? Thanks for all of your work - looking forward to some ransomware protection with kopia :-)
No documentation yet, PRs are welcome. Here are the high-level steps off the top of my head. I hope this helps.
Repo Setup
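A minimal sketch of what the setup can look like, assuming a bucket that was created with versioning (or object lock) already enabled; the bucket name, prefix, endpoint and credentials below are placeholders, not the original poster's steps:

```
# create the repository in the versioned / object-locked bucket
kopia repository create s3 \
    --bucket=my-versioned-bucket --prefix=kopia-repo/ \
    --access-key=KEY_ID --secret-access-key=SECRET \
    --endpoint=s3.us-west-002.backblazeb2.com   # endpoint only needed for non-AWS providers

# back up as usual; object versions accumulate in the bucket
kopia snapshot create /path/to/data
```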
Going Back in Time
To restore a snapshot from a previous version of the repository, assuming the client is not connected:
```
kopia repo connect s3 --bucket=my-versioned-bucket --prefix=kopia-repo/ \
    --point-in-time=2021-11-29T01:10:00.000Z # ...
kopia snapshot list
kopia snapshot restore ...
```
Super helpful, thank you! I will consider a PR once I get a chance to sit down and test this myself. Isn't it possible that, if some data is kept for a long period of time, having a retention policy on the bucket would still fudge the backup? For example, if a blob is added on day 1 with a 90-day retention period, then on day 91, if that data is still present on the source (and thus would be included in new backups), wouldn't it get tagged for deletion?
Yeah, I think that you would not ever want a lifecycle policy defined on the bucket. You should let Kopia do the deletes itself, which would be soft deletes until the bucket retention expires. What I have been trying for the last week (not long enough to have fully tested) is backups every day at midnight with a Kopia policy of 90 daily backups, and bucket retention set to 30 days. So I have 30 days to notice a ransomware attack and do a point-in-time recovery as an emergency. Otherwise, normal backups continue longer than that window, up to 90 days, and rely on Kopia to delete the files older than that.
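A sketch of the kopia side of that setup; the 90-day value mirrors the description above and is illustrative, not the poster's exact configuration:

```
# keep 90 daily snapshots; older ones become eligible for deletion by kopia
kopia policy set --global --keep-daily=90
```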
Lifecycle policies in S3 are what define when deleted objects actually get removed in versioned buckets. Retention just governs when a file can be modified or deleted, even if it's the current version. I have been testing this over the last 24 hours or so, and it seems to be working as intended, and point-in-time restores are going okay. If you have a blob created on day 1, it would be deletable on day 30 in your case, even if the file(s) represented by that blob were still present on your machine. I think lifecycle policies with versioned buckets are much safer if your goal is to prevent e.g. client-side malware/takeover from deleting things you actually want or need to keep. Edit: I am considering a PR for some docs around this, so I really want to understand your perspective and make sure I'm covering as many plausible setups as I can.
My idea was that you would let kopia delete things whenever it wants - even on day 1 - but don't give the IAM user that runs kopia the ability to delete non-current versions of objects; then use a NoncurrentVersionExpiration rule that extends a day or two past when you tell kopia to prune snapshots.
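A sketch of what such a restricted IAM policy could look like, using the AWS CLI; the user, policy and bucket names are placeholders. The key point is granting s3:DeleteObject (which only creates delete markers on a versioned bucket) while withholding s3:DeleteObjectVersion:

```
# attach an inline policy to the user kopia runs as; it can write and "delete"
# (i.e. create delete markers) but cannot remove specific object versions
aws iam put-user-policy --user-name kopia-backup --policy-name kopia-no-version-delete \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {"Effect": "Allow",
       "Action": ["s3:ListBucket"],
       "Resource": "arn:aws:s3:::my-kopia-bucket"},
      {"Effect": "Allow",
       "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
       "Resource": "arn:aws:s3:::my-kopia-bucket/*"}
    ]
  }'
```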
@bashirsouid Mainly curious, is this in a bucket with object locking enabled?
In AWS S3, it is possible to define a lifecycle policy that deletes the non-current blob version X days after the blob was deleted. Some details:
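A sketch of such a lifecycle rule via the AWS CLI; the bucket name, rule ID and day count are illustrative, not the details referred to above. It permanently removes non-current versions a number of days after they became non-current, and also cleans up expired delete markers:

```
aws s3api put-bucket-lifecycle-configuration --bucket my-kopia-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-noncurrent-versions",
      "Status": "Enabled",
      "Filter": {},
      "NoncurrentVersionExpiration": {"NoncurrentDays": 100},
      "Expiration": {"ExpiredObjectDeleteMarker": true}
    }]
  }'
```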
Yeah, actually I am using Backblaze via the S3 API specifically because of this feature (thank you!) I turned on the object lock in Backblaze and set it to 30 days. Nothing else configured really.
Want to help document this feature with regard to Backblaze? :-)
Yeah, I would be happy to. I was mostly waiting to get to my vacation time so I would have some time to test a simulated malicious attack on my bucket and recovery. I can take notes when I do it so I can document it. I actually just found Kopia a week ago. I like it a lot! It has some advantages over my prior setup with Borg backup (e.g. sync to another repository). I also need free time to port my Ansible machine setup scripts to set up Kopia. Not sure how useful those would be to other people though, because I mostly just write them for myself (e.g. all my machines are Ubuntu) instead of making them general purpose.
Hi bashirsouid, I'd be interested in the outcome of your attack simulation and the effectiveness of the ransomware protection! Also, Ansible scripts for such setups are most welcome.
Oh yeah. I wrote the Ansible scripts a while ago. Not sure how to share them, but maybe I'll just upload them to an empty GitHub repo later today. I wrote them for my own use case (which is how I usually treat Ansible, instead of building for the universe and supporting 10 operating systems, etc.). But they've been perfect for me because they back up to both cloud storage (off-site backups) and removable drives (onsite backups) in the same job, and the job doesn't fail if the removable drives are not plugged in, because I don't want to always have them attached when running Ansible (I'm lazy). It also sets up a Kopia user with read-only access to the backed-up locations and runs backups periodically, so even if I don't run my Ansible setup job for a few days I still get backups. But it's less obvious that the backup succeeded when it runs as a different user, so that's one of the reasons why it also backs up when the Ansible setup job itself runs interactively. And I have it push a health-check notification to the healthchecks site, so if it doesn't back up for a few days I get a page via PagerDuty. So yeah, that's a long explanation... but that's kinda why I write Ansible scripts myself: I like to customize them for my use case with all the behavior I need. Sometimes I copy something from Ansible Galaxy as a starting point, but then I always customize it for my use case again. I guess the simulated ransomware attack was more of a test of Backblaze than Kopia really. But yeah, Backblaze did the right thing and didn't actually remove any files until the expiration time, so Kopia had all the files it needed to restore.
As for the Ansible scripts, perhaps you also have input on backup reporting. I've started a forum post to gather use cases / ideas. Indeed, active health checks are a good point as well; I had not thought about outsourcing the health checks to a third party - very pragmatic. Please have a look here; looking forward to your input (and the scripts once you place them online): https://kopia.discourse.group/t/the-daily-mail-what-are-your-ideas-on-backup-reporting/437
Hmm interesting thread. I think that you may have some different requirements than me for reporting though. I just need to know if the backup operation failed or didn't run for 24h (ex: computer off, no network, etc.). So I don't need to email myself the actual logs. I just need to know something is up and then I will go to the machine and investigate. Also, because the backup actually runs when I run my daily Ansible maintenance scripts and I already check the final output of that after execution, I do have some extra visual notice that something is wrong when it happens because I will see it in my Ansible job output. You can POST a decent amount of text to a healthchecks ping so you could probably put the output there if you really needed to and connect it to email, PagerDuty, etc. But IMHO it doesn't seem necessary to me. Just get notified of failures or lack of running to email, SMS, or paging app (or all three) and jump on the box when it fails.
Here is my Ansible role to set up Kopia: https://github.com/bashirsouid/AnsibleKopia
That is very usable, thank you for sharing!
Hey, just discovered kopia, and as a B2 + object lock user I'm keen to learn if this all Just Works(tm) yet. My naive understanding is that it should be sufficient to set the B2 object lock for N days (with no lifecycle rules, leaving cleanup to kopia) and then carry on - normal kopia operation will continue to upload snapshots and request that old ones be deleted (which B2 will refuse until the object lock expires). Is that about correct?
I see this in the release notes for version 0.8: "S3 - support for buckets configured with object locking". Is there a setting or anything else that should be used here?
When I try to set the retention period on Backblaze B2 I get the following error:
I run the following command directly after creating the repo:
Did you create the bucket with lock already on? You have to enable that option in Backblaze at creation time; it can't be modified later.
@bashirsouid Yes, I created the bucket with object lock already on. I haven't set a default retention time yet.
Ok, I see that the B2 repository type doesn't support Object Lock. After connecting to Backblaze B2 via the S3 API, setting the retention parameters worked.
I've been testing kopia with object-lock on Backblaze (using the S3 protocol). I think kopia needs to update the retention time periodically for it to actually provide sufficient security from a ransomware attack.
My understanding is that deletes are not affected by lifecycle rules, so if I were to execute a 'delete' operation on B2 I could immediately delete this file and could not restore it.
FYI, my understanding of Backblaze lifecycles seems to be different from @jkowalski's above. Lifecycle policies can be used to move files from hidden to deleted (or from regular to hidden), but do not seem to affect deletion itself, as far as I can tell on B2.
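For illustration, a B2 lifecycle rule in that style might look like the following (a sketch using the b2 command-line tool; the flag spelling has changed between CLI versions, and the bucket name and day count are placeholders):

```
# permanently delete file versions 30 days after they were hidden;
# daysFromUploadingToHiding is left null so nothing is hidden automatically
b2 update-bucket --lifecycleRules \
  '[{"fileNamePrefix": "", "daysFromHidingToDeleting": 30, "daysFromUploadingToHiding": null}]' \
  my-kopia-bucket allPrivate
```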
ATM, extending the retention period (updating the retain-until date) needs to be done outside kopia.
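With the S3 API this can be done per object version, for example with the AWS CLI (a sketch; the bucket, key, version ID and date are placeholders, and in practice you would loop over the versions returned by list-object-versions):

```
# extend the lock on one object version; COMPLIANCE retention can only be lengthened
aws s3api put-object-retention --bucket my-kopia-bucket \
  --key kopia-repo/p0123456789abcdef --version-id VERSION_ID \
  --retention '{"Mode": "COMPLIANCE", "RetainUntilDate": "2022-06-01T00:00:00Z"}'
```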
Thanks. I am using
Not sure why the retention period would not be set on those. Are you using a default bucket policy to set the default retention period, or setting it some other way? The safest is to set the retention period for all the blobs (files) in the repository, i.e. all the blobs that have the same prefix as the repository prefix specified during repository creation.
Thanks for the help. I think I did not understand how the S3 API was mapped to Backblaze B2. From the documentation (and having done some tests with the MinIO API), I found that:
The above should have been obvious in retrospect, but I didn't fully understand it. Looking into it further, my understanding is:
In all cases, I need to rely on a hide-to-delete lifecycle policy to prevent my bucket from filling with unused data over time. Would you be open to a patch that extends the retention time of locked objects either during maintenance or on each snapshot create? Assuming the above is correct, does it make sense to document it in Kopia somewhere? I can try to write a patch for that too if it is useful.
Great @PhracturedBlue! Note that I'll gladly volunteer to review or write the official documentation for getting ransomware protection with kopia and S3-compatible API providers over the summer. Where should we document its use - perhaps a 'Ransomware protection' section below https://kopia.io/docs/getting-started/#setting-up-kopia? What do you think @jkowalski?
I'd be happy to review it.
I will not claim to be a great documentation writer, but I took an initial stab at a document here: I decided to put it in the Advanced topics section as I think it needs more context than would fit in the Getting Started section. I also absolutely need help filling out the AWS sections since I am only familiar with Backblaze B2.
You are being humble @PhracturedBlue! Great start!
While not specifically about S3 storage, I was reading about Google Cloud Storage 'Retention Policies', and it seems like they are not particularly compatible with Kopia. This is primarily because you cannot use GCS Retention Policies and Object Versioning concurrently and GCS automatically applies the bucket's Retention Time to each item added to the bucket. This means that there is no way to replace a file in a bucket with Retention Policies enabled (before the retention time). You could use Object Locking alongside a Retention Policy to provide ransomware protection, but you would need to ensure that every created blob has a unique filename, and I don't believe that is currently the case in kopia. Specifically, the 'kopia' blobs and 's' blobs seem to be rewritten. I am unsure if compaction affects rewriting any other blobs. So supporting ransomware protection on GCS equivalent to what can be done in S3 would require either:
That said, I believe that using Object Versioning with GCS, along with a key without delete permission (and a suitable lifecycle policy), would be equivalent to using S3 in a similar manner, and I will try to document this in my documentation PR.
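For reference, a sketch of that GCS setup using gsutil; the bucket name and day count are placeholders, and the lifecycle condition names come from GCS's lifecycle configuration format:

```
# enable object versioning on the bucket
gsutil versioning set on gs://my-kopia-bucket

# lifecycle.json: permanently delete non-current versions 30 days after they are superseded
cat > lifecycle.json <<'EOF'
{"rule": [{"action": {"type": "Delete"},
           "condition": {"isLive": false, "daysSinceNoncurrentTime": 30}}]}
EOF
gsutil lifecycle set lifecycle.json gs://my-kopia-bucket
```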
Kia ora folks, Kopia is a brilliant work of software art, and this is a very useful thread. I've had a crack at documenting my experience of setting up Kopia with S3 Object Lock here: However, one outstanding question I have is: if I set a 90-day retention period, does this really protect my backups? Don't all snapshots stack on top of each other? Therefore, the last 90 days of snapshots depend on the prior 900 days of snapshots? If my assumption is correct, then it seems that plain S3 versioning (without object lock), together with splitting of credentials and tasks, would be a better approach. I.e. the host running the backup (probably a web server with a large attack surface) only has credentials to write to the bucket, not to delete versions; auto maintenance is disabled, and a separate management host with different credentials does the maintenance work on a scheduled basis. Perhaps then Object Lock is only of interest for compliance reasons where storage cost/efficiency is not a priority? Ngā mihi
Thank you for the write-up @RhysGoodwin!
@RhysGoodwin Your understanding is correct. Object lock is not very useful in its current form without the patches I've provided in #2179.
Thanks @PhracturedBlue. Understood. I'll look forward to this being merged. With regard to ransomware protection with plain S3 versioning (without Object Lock), it occurs to me that my comment above about running maintenance tasks under a separate credential is incorrect and not necessary. For a versioned bucket, as long as Kopia only has Get/Put/Delete permission, it can do everything as usual, including maintenance and "deleting" obsolete data. In fact, it has no idea the bucket is versioned at all; it carries on as usual and its deletes are just DeleteMarkers. If Kopia's credential is compromised, an attacker doesn't have permission to delete previous versions. As long as the S3 lifecycle rule which cleans up non-current versions gives ample time for an attack to be detected and remediated, then happy days. Granted, this is not the same thing as Object Lock for compliance, but it will provide sufficient protection for many use cases. Apologies if I'm just stating the obvious here; it took a bit for me to get my head around it, and hopefully these comments will help other newbies like me looking for ransomware protection for Kopia repos.
I'm not sure what 'not planned' means as a close reason, but for anyone wondering, this feature is now integrated in the master branch (at least for S3 repositories), as PR #2179 was merged.
"I'm more concerned about something crawling into the machine and finding the credentials to the repository somewhere in Kopia's config and using that to mass delete files." - @Neurrone
A shared document has been created to facilitate a transparent discussion: logging design choices, adapting to changes in Kopia, and drafting documentation. Feel free to support and comment. If you want editing rights to this document, please send me your Gmail address on Slack.
This issue is based on a conversation that took place on Slack from May 8th 2021 onwards.
Related to #711, #743 and #913
User story (B2-specific, but please read the question as S3-generic):
As a Kopia user wary of ransomware attacks on backups,
I would like to use S3-compatible object locking
so that files cannot be deleted by ransomware or other malicious actors (that do not have access to my (MFA-protected) S3-storage account).
How would or how could Kopia best relate to a) default B2 bucket retention policies (that lock a file for a set number of days) and b) to lifecycle rules (that allow for deletion after a set number of days to save money on the storage bill)?
The challenge I imagine is in cleaning up unreferenced data at some point. Perhaps a maintenance flag could be set with the number of days after which the S3-storage provider will delete the file. Repackaging content that is still referenced in to-be-deleted files would be trivial, I imagine.
## Without using Object Locking / immutability the following should be possible
Note that it should already be possible to use one pair of application keys for backing up and restoring (without delete capability) and another for maintenance (that has delete abilities).
Question by Dickson Tan 7:04 AM May 9 2021
Thanks for confirming my hypothesis about permissions.
I see the documentation mentions that maintenance is done occasionally. Does that mean that if it's been more than an hour since Kopia was last invoked, it would try to run quick maintenance the next time I invoke Kopia?
How often should I run quick + full maintenance? Would say once a week or month suffice?
Is it possible for me to pass in a different API key manually for a one-time invocation of the cli to run maintenance tasks? Or would I have to create a separate B2 repository which is identical to my usual one, just differing by the key?
When does the snapshot retention policy get evaluated and old snapshots removed? I'd imagine if it finds snapshots to delete, it would need to modify / delete files and would fail without keys with sufficient permissions. If I were to guess, this happens during maintenance?
Answer by Jarek Kowalski 11:54 PM May 9 2021
@dickson Tan some answers:
The quick maintenance needs to run reasonably frequently to ensure good performance - without it, Kopia will become slower and slower as the number of index files grows. How frequently depends on your usage.
The full maintenance is only needed to free up disk space when data from old snapshots is no longer needed. If you don't care about that, you can completely disable it (kopia maintenance set --enable-full false) and/or run it manually whenever you feel like it (kopia maintenance run --full).
You can prevent maintenance from running as part of all the other commands by passing the undocumented --no-auto-maintenance parameter to every command.
To use a different set of access keys for maintenance, it's easiest to create a separate config file (kopia repo connect --config-file /path/to/some-other-file.config b2 --access-key=XX ...) and run maintenance manually with that config: kopia --config-file=/path/to/some-other-file.config maintenance run
Snapshot retention is evaluated after each snapshot, and this is when old snapshots are marked for deletion (this does not delete actual files but only writes 'tombstone' markers for existing snapshots so they don't show up in the list). Only full maintenance ultimately gets rid of the contents.
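Putting the separate-credentials point above into concrete commands, a sketch (paths, bucket and key values are placeholders; the b2 flags shown are assumptions about the b2 backend's options, so adjust for your storage backend):

```
# one-time: connect with the privileged (delete-capable) key into a separate config file
kopia repo connect --config-file=/path/to/maintenance.config b2 \
    --bucket=my-kopia-bucket --key-id=MAINT_KEY_ID --key=MAINT_APP_KEY

# day-to-day snapshots keep using the default config and the restricted key
kopia snapshot create /path/to/data

# run maintenance explicitly with the privileged credentials
kopia --config-file=/path/to/maintenance.config maintenance run --full
```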