
Terabyte file support #2791
Closed · wants to merge 2 commits

Conversation

@letiemble (Contributor)

Hi,

This PR is about the file size limit in Syncthing.

From what I understand of the specification and the code, the current size limit is 1000000 blocks of 131072 bytes, which means that any file larger than about 130 GB cannot be replicated. Am I right?

I dived into the code and patched it to support files up to 1 terabyte. My tests with 300 GB and 450 GB files were OK and they were replicated successfully.
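For reference, here is a minimal sketch of the arithmetic behind that limit. The constant names are illustrative only and do not correspond to Syncthing's actual identifiers.

```go
package main

import "fmt"

func main() {
	// Values taken from the discussion above; the names are made up for illustration.
	const blockSize = 131072         // 128 KiB per block
	const maxBlocksPerFile = 1000000 // per-file block limit discussed here

	maxBytes := int64(blockSize) * int64(maxBlocksPerFile)
	fmt.Printf("implied limit: %d bytes (~%.0f GB)\n", maxBytes, float64(maxBytes)/1e9)

	// Roughly how many 128 KiB blocks a 1 TB file would need:
	blocksForTB := (int64(1e12) + blockSize - 1) / blockSize
	fmt.Printf("blocks needed for 1 TB: %d\n", blocksForTB)
}
```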

Here are some points left:

  • The mix of go generate (for the model part) and hardcoded values (for the DB/protocol part) makes it difficult to know what the actual limits are. For example, the number of blocks in a file has a direct influence on message size during index exchange, which is not obvious when reading the code. Is it planned to have a specific doc on the subject? A plus would be to have this information available on the command line via a specific switch.
  • Is it planned to log/store the cases when limits are crossed, so that feedback can be provided to the user? When I tried to replicate a 300 GB file, I only knew that something went wrong by using STTRACE.
  • Are there any side effects to increasing the values to support very large files? If not, what is the rationale behind the current values?

Regards.

@AudriusButkevicius (Member)

I think it should error out if the file is bigger. I'll leave this for calmh to merge, as he was the one who imposed limits in the first place.

@calmh (Member) commented Feb 23, 2016

A limit of some sort needs to be in place to prevent us from attempting to allocate a gazillion bytes of RAM in the face of corruption or protocol changes. The long-term solution to this issue is to use a variable block size, which is sort of planned. In the meantime we should probably increase the size. The limits are supposed to all be handled by the go-generated parts, but the "truncated" code is manually updated, as we want to avoid the cost of loading the (potentially huge, now) list of blocks when we don't need that info.

I haven't looked at the code yet (limited device, will later), but as a first guess this is probably fine and should be merged if there's nothing technically wrong with it.
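(As an illustration of the kind of guard described above, a length-prefixed decoder typically caps the announced element count before allocating. The sketch below is generic Go and does not mirror Syncthing's real XDR code; the names and the limit value are assumptions.)

```go
// Generic sketch of a length guard when decoding a length-prefixed list.
package xdrsketch

import (
	"encoding/binary"
	"errors"
	"io"
)

// maxElements is a hypothetical cap; without it, a corrupted or malicious
// length prefix could cause an arbitrarily large allocation.
const maxElements = 10_000_000

// readElementCount reads a 32-bit big-endian count and rejects absurd values
// instead of blindly allocating that many elements.
func readElementCount(r io.Reader) (uint32, error) {
	var n uint32
	if err := binary.Read(r, binary.BigEndian, &n); err != nil {
		return 0, err
	}
	if n > maxElements {
		return 0, errors.New("announced element count exceeds limit; refusing to allocate")
	}
	return n, nil
}
```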

@calmh (Member) commented Feb 23, 2016

Code looks good to me, except it should not update the man page (that's auto-generated from the actual protocol spec), it should be squashed to a single commit, and there should be a companion PR to increase the recommended limits in the actual spec.

Commits:
  • Increase limit when unmarshaling XDR.
  • Increase the size of message.
@letiemble (Contributor, Author)

I have squashed the commits (minus the man page one). Where should I submit the companion PR for the limit in the actual spec?

@calmh (Member) commented Feb 27, 2016

LGTM, just waiting for the build server to come online again for verification.

https://github.com/syncthing/docs/blob/master/specs/bep-v1.rst for the spec, one of the final sections, "message limits".

@letiemble (Contributor, Author)

Companion PR created (syncthing/docs#127).

calmh added a commit to syncthing/docs that referenced this pull request Feb 28, 2016
@ashleycawley

From a user perspective: I have been trying to sync a 1.7 TB directory where a few files are 488 GB and some are around 100-200 GB. All I experienced was a stuck initial scan, with no error messages or evidence of problems in the log as far as I could spot. A bit frustrating, but thanks to the prompt and helpful Syncthing community I was directed here to see this thread. I would dearly have loved to see an error message, or just a warning/notice, if it noticed files over 130 GB.

Now that I know about this limitation I can easily split the files down to be smaller, so that is not a problem for my use case.

I would just like to say a huge thank you to the developers who spend their time on this very worthwhile project. I love the independence and flexibility that the settings provide the user. I will continue to recommend Syncthing to others wherever I can. Thank you for your time and efforts.

@calmh (Member) commented Mar 2, 2016

LGTM, I'm just going to get back from vacation and fix the build server so we can run some builds and tests on this before merging.

calmh self-assigned this Mar 2, 2016
@calmh (Member) commented Mar 4, 2016

@st-jenkins retest this please

@calmh (Member) commented Mar 4, 2016

Merged as c8b6e6f, thanks.

calmh closed this Mar 4, 2016
@romprod commented Feb 24, 2017

Hi,

Can this limit be increased any further? Or is it possible to set it manually by editing something?

I have a 1.3TB file which I guess will hit this limit.

@AudriusButkevicius (Member)

I think this is no longer relevant; I don't think we have limits anymore, but let us know if you hit issues.

@calmh (Member) commented Feb 24, 2017 via email

@romprod commented Feb 24, 2017

Are there better alternatives for large files above 1 TB? The files never change once they've been created.

@Ferroin commented Feb 24, 2017

If they never change once created, you're much better off using some generic file transfer tool to throw them over the network than using a tool like Syncthing, and that actually applies to files of any size, not just TB+ ones.

@romprod commented Feb 24, 2017

Well, they need to be sent over a 100 Mbit line across the internet. I was hoping to use Syncthing to split the files up into blocks and send them that way.

@a8ksh4 commented Feb 24, 2017 via email

@AudriusButkevicius (Member)

Why would you want to use Syncthing for static files if rsync/scp does it better and faster?

@a8ksh4 commented Feb 24, 2017

Because with rsync/scp, I need other automation and supporting configuration to make them work: DNS names registered to find the hosts I want to sync data to, and automation to call rsync/scp and handle failures. With Syncthing, I can just drop a file into the shared area (be it a static file or a dynamic, changing one) and it magically appears on all of my systems. :)

@Ferroin commented Feb 24, 2017

OK, let's frame this a different way:
Using Syncthing for stuff like this is trading a pretty significant amount of efficiency for some convenience.

As a point of comparison, over a direct gigabit link, copying data between two reasonably high-end systems, I get about 20% better throughput using rsync than I do with Syncthing, roughly another 5% using SCP, and another 4-7% on top of that if I just use netcat. Note that most of the difference between Syncthing, rsync, and SCP is processing overhead, while the difference for netcat is protocol overhead: netcat uses nothing on top of TCP (or UDP, SCTP, or DCCP, depending on what switches you pass), so there's zero protocol overhead compared to the others.

Now, I do get the DNS issue, but that's not hard to handle sanely as long as you have some system somewhere that has a fixed IP.

@calmh (Member) commented Feb 25, 2017 via email

st-review added the frozen-due-to-age label (Issues closed and untouched for a long time, together with being locked for discussion) Aug 24, 2017
syncthing locked and limited the conversation to collaborators Aug 24, 2017