Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for S3 daily buckets #1455

Merged
merged 9 commits into from
Apr 28, 2023
Merged

Conversation

juhassi
Copy link
Contributor

@juhassi juhassi commented Apr 27, 2023

This is a works for me version of S3 daily bucket support for Filesender. I'll continue testing this a bit and if I don't find any real issues, we might be inviting some of our real users to test-drive this during next few months.

When storage_type is set to CloudS3, daily buckets can be enabled with $config['cloud_s3_use_daily_bucket'] = true;
Bucket naming prefix can be set with cloud_s3_bucket_prefix. Bucket name will be prefix + YYYY-MM-DD, for example:

$config['storage_type'] = 'CloudS3';
$config['cloud_s3_use_daily_bucket'] = true;
$config['cloud_s3_bucket_prefix'] = "Demo-";

Would today use bucket Demo-2023-04-27. Buckets are created (and removed when no longer needed) by cron.php, there is also scripts/task/S3bucketmaintenance.php that can be run to do only daily bucket maintenance when needed (mainly if you've just turned this on and need to have the daily buckets created asap).

Whenever maintenance script is run, list of existing buckets is fetched from S3. If they don't already exist, new daily buckets will be generated for today + 14 next days. Then all old buckets will be checked for contents and if they are empty, they'll be removed.

If today's bucket does not exist, all transfers will fail with a generic exception. For this reason, buckets are generated a bit in advance to allow administrators some breathing room in situations when new buckets can't be created (too many buckets already created error etc) -- I hope everybody is looking at their cron job's error emails :-)

Also, as a general note, performance with S3 backend can be greatly increased by bumping up upload_chunk_size and related values. My test system is performing adequately with:

$config['upload_chunk_size'] = 1024 * 1024 * 40;
$config['upload_crypted_chunk_size'] = $config['upload_chunk_size'] + 32;
$config['download_chunk_size'] = $config['upload_chunk_size'];

Increasing upload_chunk_size also requires you to verify PHP's memory limit is equal or bigger than about 5x upload_chunk_size.

juhassi added 2 commits April 27, 2023 10:47
transfers_table.php should always hide transfer option STORAGE_CLOUD_S3_BUCKET from the user. User does not need to see the name of our internal S3 bucket.
Support for Daily Buckets in CloudS3 storage class.

Daily buckets are enabled when storage_type == CloudS3 and cloud_s3_use_daily_bucket is set to true.
Prefix for daily bucket names is configured with option cloud_s3_bucket_prefix
Bucket names are cloud_s3_bucket_prefix + "YYYY-MM-DD"

Buckets need to be created by cron.php (or by running S3bucketmaintenance.php) before they can be used.
Same scripts also remove old buckets when they no longer have any objects in them.
@github-actions
Copy link

If there are selenium UI results for this code they will be at filesenderuici@791f15d

juhassi and others added 2 commits April 27, 2023 13:02
If we've already received enough data (= as much as there should be in this file),
set gameover=true to fail future read requests for more data.
Check if we are at the end of stream and set gameover=true
@github-actions
Copy link

If there are selenium UI results for this code they will be at filesenderuici@5a5898d

@github-actions
Copy link

If there are selenium UI results for this code they will be at filesenderuici@5e46900

@juhassi
Copy link
Contributor Author

juhassi commented Apr 28, 2023

Also added some documentation updates to this PR.

@monkeyiq
Copy link
Contributor

Looks good. Thanks!

I have a few little things which I might update next.

@monkeyiq monkeyiq merged commit 73cad53 into filesender:development Apr 28, 2023
6 checks passed
This was referenced Apr 28, 2023
@monkeyiq
Copy link
Contributor

I have put your performance information into #1459 and merged it to prevent it being lost / buried.

@juhassi juhassi deleted the PullReq-20230427 branch May 2, 2023 08:43
@juhassi
Copy link
Contributor Author

juhassi commented May 2, 2023

Changing chunk_size invalidates all already uploaded files when using chunked storage methods (as then filesender doesn't know what used to be the chunk_size for that particular download -> it can't request proper objects from backend storage).

I'm thinking if it would be a good idea to include current chunk_size as option when creating a Transfer and use that instead of currently configured chunk_size when accessing files in that particular Transfer.

@monkeyiq
Copy link
Contributor

monkeyiq commented May 3, 2023

I like this idea. I think the current chunk_size is referenced in crypto_app.js a bit and would need to be refactored to use either transfer->chunk_size or file->chunk_size. The classes/data/File class already stores the storage_class_name used when a file was created so it seems a logical place to put the chunk_size too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants