
Cloud Sync to Backblaze B2 #288

Closed
thibaultmol opened this issue Jul 2, 2016 · 53 comments

@thibaultmol

Hi

Backblaze just launched B2 v1.0.
Backblaze is a company that hosts data in the cloud at a REALLY low price, and with B2 you can use it like AWS storage.
It'd be great if you could use B2 to sync your Nextcloud server to Backblaze. (B2 isn't used the same way as AWS, but it would be nice to just log in and have it work in the web interface. I know I can set up the command-line tool, but it'd be great if I only had to enter a username and password (and two-step auth), like on Synology NASes.)
https://www.backblaze.com/b2/partner-synology.html

Thanks


@GrahamJenkins

I second this request. I am considering a switch from ownCloud to Nextcloud and would love for this to be a viable option. My existing server does not have expandable storage and the pricing of S3 is too high to be practical for personal use.

How difficult would this be? I'm considering looking into the code for the S3 external storage; I imagine B2 would be very similar. If somebody would be willing to give some pointers, I could try my hand at this.

Ref: https://www.backblaze.com/b2/docs/

@EnMod

EnMod commented Jul 21, 2016

+1, I too would be ecstatic to have this as an option. Thanks for linking those docs.

@GrahamJenkins

Does anybody know how much work this would involve or how long it might take a competent developer?

@jb510

jb510 commented Sep 19, 2016

First, I am brand new to Nextcloud and have no idea how it implements external storage, so I am a long way from jumping in and starting work on this.

Some useful background, however:
As far as I know, Backblaze doesn't provide an official B2 API client in any language.

There are a few independently developed PHP wrappers/classes/SDKs/clients for B2, though none of them seem fully mature or widely used yet; most are labeled alpha/beta.

  1. The most recent/active/popular is: https://github.com/cwhite92/b2-sdk-php
  2. This one is even more popular but appears stalled: https://github.com/kamilz/b2backblaze

Other notes:
There is also a FUSE filesystem written in Python that seems mature: https://github.com/sondree/b2_fuse
If someone wanted, they could probably use it to mount B2 storage directly on the server running Nextcloud and use that mount directly (again, I'm too new to Nextcloud to know for sure that's viable, but it seems like it might be).

Finally, for others interested in building integrations, here are the requirements for being listed on the integrations page: https://www.backblaze.com/b2/docs/integrations.html

@8BitAce

8BitAce commented Oct 22, 2016

I'd definitely like to see this as well. I've been waiting since they first announced the service last year, but as of yet neither ownCloud, Pydio, nor Nextcloud seems to be attempting it. (I've taken a look at implementing it myself, but my lack of knowledge of PHP and ownCloud's plugin system seemed insurmountable.)

Some additional points though:

  • No, I don't believe there are any official clients, but the API is a rather simple REST-based one with similarities to S3.
  • The FUSE implementation is less than ideal as the developer notes it is not production-ready and includes numerous limitations (concurrency issues, slow performance with > 1000 files)

I've been surprised there hasn't been more interest in adding this to one of the open source cloud solutions. Is there perhaps some guidance for how someone like myself could implement this? I imagine it would largely be the same as the S3 implementation.

Edit: I'm taking a look and I think I can take a stab at this.
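
To illustrate how simple the native REST API is, here is a minimal, hypothetical PHP sketch of the b2_authorize_account call (API v1, as it existed at the time); the credentials are placeholders and error handling is omitted:

<?php
// Hypothetical sketch only: the account ID and application key are placeholders.
$accountId = 'YOUR_ACCOUNT_ID';
$applicationKey = 'YOUR_APPLICATION_KEY';

$ch = curl_init('https://api.backblazeb2.com/b2api/v1/b2_authorize_account');
curl_setopt($ch, CURLOPT_USERPWD, $accountId . ':' . $applicationKey);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = json_decode(curl_exec($ch), true);
curl_close($ch);

// The response carries the apiUrl, downloadUrl and authorizationToken that
// every subsequent call (b2_list_buckets, b2_upload_file, ...) needs.
echo $response['apiUrl'] . PHP_EOL;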

@WebSpider

👍 this would be yet another compelling reason for me to finally make that switch from oc to nc


@jakimfett

I'd like to add my +1 to this, and request that two factor auth be supported.

@MariusBluem
Member

MariusBluem commented Dec 15, 2016

Please stop 👍-ing such feature requests ... instead, you may want to develop it yourself or find somebody who wants to make this real 😜 ... you may also want to consider posting a bug bounty on it:

http://bountysource.com/teams/nextcloud/issues

In this issue, we should only discuss this feature technically and should not tell each other how much we want it 😅 THX. However: Thanks for your interest in this topic. If you want to react to specific issues or pull requests on GitHub, you can also use reactions (https://github.com/blog/2119-add-reactions-to-pull-requests-issues-and-comments) 😉

@despens

despens commented Jan 2, 2017

https://www.bountysource.com/issues/35689072-cloud-sync-to-backblaze-b2

@jasonehines

I just ran across https://github.com/ncw/rclone/ and it has a mount feature I plan on trying with my Nextcloud.

@dv-anomaly

dv-anomaly commented Jul 4, 2017

It would be nice to see full support for B2 as a regular backend (external storages & primary storage). I've tried using s3proxy, but Nextcloud fails to create files in the bucket. I assume it's probably related to some of the limitations listed below; B2 works a bit differently if you take a look at the protocol.

  • bucket names must be between 6 and 50 characters and consist of only letters, numbers, and hyphens
  • does not return ETag as an MD5 hash, instead returning SHA1 hash
  • does not support blob-level access control
  • does not support conditional GET
  • does not support copying objects
  • does not support Content-Disposition, Content-Encoding, Content-Language, Content-MD5, or Expiry headers
  • multipart upload requires at least 2 parts

For those that just need a basic implementation, this fork of s3ql does seem to work well. It can be a bit of a chore to get it compiled, and it makes it almost impossible to scale horizontally, but for deployments that don't need to load-balance across a handful of front ends, s3ql will get the job done. It is better than most FUSE-based S3 file systems as it will cache to disk, or in my case an SSD array.

s3ql/s3ql#8
https://github.com/sylvainlehmann/s3ql

@bitshark

I'm really excited about this s3ql backblaze fork!

@benginoe

Would be a nice feature in nc13. ;)

@dv-anomaly

dv-anomaly commented Oct 4, 2017

Minio just added experimental B2 support. It looks like Minio can be used as a gateway to native B2 buckets, providing a way to transparently use an S3-compatible API.

I'll be investigating this further, but I thought I would share the link. I would still like to see official support for B2 in Nextcloud, but this might be a good workaround.

https://blog.minio.io/experimental-amazon-s3-api-support-for-backblaze-b2-cloud-storage-service-685e0f35a6d7

@MorrisJobke
Member

B2 external storage will not be implemented in the server code itself, but it can be implemented as an app that brings it into files_external, like the files_external_dropbox app does for Dropbox, for example: https://github.com/icewind1991/files_external_dropbox

Nevertheless I will not close this ticket here, because there is a bounty on it.
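
For anyone who wants to attempt the files_external app route, a rough skeleton of the storage class might look like the following; the namespace, class name and B2 calls are hypothetical placeholders (no such app exists yet), and a real app would also need the backend/auth registration that files_external_dropbox demonstrates:

namespace OCA\FilesExternalB2\Storage;

use OC\Files\Storage\Common;

// Hypothetical skeleton only; no such app or class exists at the time of writing.
class B2 extends Common {
    /** @var string */
    private $bucket;

    public function __construct($parameters) {
        $this->bucket = $parameters['bucket'];
        // e.g. call b2_authorize_account here with the configured account ID and key
    }

    public function getId() {
        return 'b2::' . $this->bucket;
    }

    public function file_exists($path) {
        // e.g. b2_list_file_names with $path as prefix
        return false; // stub
    }

    public function fopen($path, $mode) {
        // e.g. stream b2_download_file_by_name / b2_upload_file
        return false; // stub
    }

    // mkdir(), rmdir(), opendir(), stat(), filetype(), unlink(), touch(), ...
    // would also need implementations to satisfy the storage interface.
}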

@tilllt

tilllt commented Apr 7, 2018

I'd also be interested in B2 support but can't help for lack of PHP fu. There are these, though:
https://github.com/cwhite92/b2-sdk-php
https://github.com/gliterd/backblaze-b2

@codyhazelwood

For those looking, I tried using Minio's experimental B2 support in the edge channel and mounting it to NextCloud as an external S3 storage source. There are unfortunately quite a few issues, due to the unimplemented CopyObject API and some odd behavior with creating folders. I didn't have time to dig into it, but it's definitely not going to work out of the box.

@mnajamudinridha

Hi @codyhazelwood, how about the Minio B2 gateway as primary storage? I have the same issue with Google Cloud Storage as external S3 storage, but it works perfectly as primary storage.

@codyhazelwood

Good call @najcardboyz. I tried playing with that for a little while, and it works with no errors, but it is unusably slow: it takes 20-40 seconds to change pages in the Nextcloud UI. Uploads and downloads are still speedy; it's just slow navigating through the UI.

@mnajamudinridha

Yes, I've tried it too and it's slow, maybe because of latency from the data center VPS to Backblaze B2, or something else. I will try Wasabi later; have you tried it?

@nextcloud-bot nextcloud-bot added the stale Ticket or PR with no recent activity label Jun 20, 2018
@Vaelatern

Can I ask why "B2 external storage will not be implemented in the server code itself"? If I were to develop it, and try to get it merged, would it be rejected? Is it a thing where "We've accepted Swift, S3, Azure, and are just done with all these protocols"?

@MariusBluem
Member

Hey @Vaelatern ... Great to hear you want to implement this feature 🙈 We recommend implementing it as a separate app extending the files_external app instead of adding B2-specific code into this repository.

You can find an example storage provider as an app over here: https://github.com/icewind1991/files_external_dropbox

We are of course here to assist you if you have any questions 😇

@nextcloud-bot nextcloud-bot removed the stale Ticket or PR with no recent activity label Feb 5, 2019
@Vaelatern

Vaelatern commented Feb 5, 2019

Can I put primary storage on a files_external app, like I can for S3 at present? And thank you; I'm hoping my employer wants this feature enough to pay for it too.

@mnajamudinridha

External storage can use S3, and S3 can also be used as primary storage, but external storage and primary storage are different functions.

@much-doge

https://www.backblaze.com/blog/backblaze-b2-s3-compatible-api/ I think this should help 🙂

It does indeed... now I only need to migrate my data from the old bucket to the new S3-compatible bucket.
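
For reference, using the S3-compatible endpoint as primary storage is a config.php entry along these lines; the bucket, hostname/region and credentials below are placeholders (the hostname is just an example region), and primary object storage should be configured before installation, since switching later means migrating data:

'objectstore' => [
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => [
        'bucket' => 'my-nextcloud-bucket',
        'hostname' => 's3.us-west-002.backblazeb2.com', // example; use your bucket's endpoint
        'region' => 'us-west-002',
        'key' => 'KEY_ID',
        'secret' => 'APPLICATION_KEY',
        'use_ssl' => true,
        'use_path_style' => true,
        'autocreate' => false,
    ],
],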

@tsgoff

tsgoff commented May 5, 2020

The Class C transactions cap reached 75% after 1 day of being configured in Nextcloud.

Setting "Check for changes" to "Never" fixed it.

@blueish4

blueish4 commented May 6, 2020

I've noticed that Backblaze seems to be storing 2 identical versions of each file Nextcloud uploads when connected with the new S3 compatible API (one of which is immediately hidden but costs towards the data stored)- can anyone else observe this behaviour? It happens with the mobile app and desktop sync, but the gvfs connector on GNOME doesn't seem to trigger the issue, which is strange as they both send a single PUT command to the server. Server version is 18.0.4.

@much-doge

I've noticed that Backblaze seems to be storing 2 identical versions of each file Nextcloud uploads when connected with the new S3 compatible API (one of which is immediately hidden but costs towards the data stored)- can anyone else observe this behaviour? It happens with the mobile app and desktop sync, but the gvfs connector on GNOME doesn't seem to trigger the issue, which is strange as they both send a single PUT command to the server. Server version is 18.0.4.

I also experienced the same issue. Copying files from my local storage to external B2 storage from within the web interface creates duplicate files in B2.

@joyov

joyov commented May 26, 2020

I believe this is because Nextcloud creates a .part file and then renames the file once the upload is complete. Rename in S3 means PutObjectCopy+DeleteObject, which is how you end up with one live and one hidden object. My solution for now is to use the "Keep only the last version of the file" lifecycle setting, which means B2 will delete the hidden copy after a day.

I also had to disable file locking as it was causing problems with file deletion. It's probably not needed for S3 anyway (writes to S3 objects are atomic):

'filelocking.enabled' => false,

This should already be disabled (by default):

'filesystem_check_changes' => 0,

@kesselb
Contributor

kesselb commented Jul 22, 2020

To anyone who applied the patch I posted: Revert it.

I spent some time today polishing it further so I could submit a pull request for Nextcloud, but ran into various issues. The storage implementation assumes that the number of bytes read from a stream equals the number of bytes written to the remote storage. But the ObjectUploader always reads the first 5 MB of a stream to check whether MultipartUpload or PutObject should be used, so the counter for read bytes is always (at least) 5 MB too high. This is somehow related to those "Argument 1 passed to OC\Files\Cache\CacheQueryBuilder::whereFileId() must be of the type int, null given" errors. The default encryption module works around this problem because the size of the encrypted file is always larger than the actual content.

I still strongly advise using it.
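
For context, this is roughly the SDK call in question; $s3Client, $bucket, $key and $stream are placeholders:

use Aws\S3\ObjectUploader;

// Sketch only. ObjectUploader buffers the beginning of the stream (the 5 MB
// described above) to decide between PutObject and a multipart upload, so a
// byte counter wrapped around $stream already sees those buffered bytes even
// when the object itself is small.
$uploader = new ObjectUploader($s3Client, $bucket, $key, $stream);
$result = $uploader->upload();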

@kalatabe

The Class C transactions cap reached 75% after 1 day of being configured in Nextcloud.

Setting "Check for changes" to "Never" fixed it.

I'm facing the same problem with a B2 external storage, but setting "Check for changes" to "Never" doesn't help. Today, a single Nextcloud desktop client generated about 250 Class C transactions in about 30 minutes.
Is there anything else I could try?

@skjnldsv skjnldsv added 0. Needs triage Pending check for reproducibility or if it fits our roadmap 1. to develop Accepted and waiting to be taken care of and removed 0. Needs triage Pending check for reproducibility or if it fits our roadmap labels Aug 20, 2020
@solracsf
Member

This does not affect only B2 but also other S3-compatible filesystems that deal (badly) with multipart uploads of small files.
This leads to a lot of fclose, CacheQueryBuilder and other "Expected filesize of" errors while using multipart uploads.

@joshtrichards
Member

Is this issue still needed? Backblaze has since added native S3 support (May 2020) and treats it as a first-class access option. That seems to have made the need for separate, dedicated support for B2's original API moot. This issue basically went quiet shortly after Backblaze launched S3 support.

https://www.backblaze.com/docs/cloud-storage-s3-compatible-api

It seems to work fine for me (or it did the last time I checked a few months ago).

@thibaultmol You're the original requestor of this enhancement - what's your assessment of things today? If you feel the original request is settled, can you close this Issue out? Thanks!

@detly

detly commented Jul 31, 2023

I'm not the opener, but as someone who was interested in this because I was using Backblaze, I can confirm that the S3 integration works perfectly (for me, anyway) with Nextcloud. You do, of course, have to be a bit careful with things like preview generator or indexer plugins running on transaction-charged storage, but that's neither a surprise nor a Backblaze-specific concern.
