Skip to content

Set checksum if not seeked #40513

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 23, 2022
Merged

Set checksum if not seeked #40513

merged 5 commits into from
Dec 23, 2022

Conversation

jvillafanez
Copy link
Member

@jvillafanez jvillafanez commented Nov 23, 2022

Description

Range downloads might set up a wrong checksum if there isn't a checksum set. This causes problems later because a full download will send the wrong checksum and clients might flag the download even though the file was downloaded ok.

Note that range downloads might still get the checksum of the whole file, not just the downloaded piece.

In addition, appending content won't calculate a checksum.

There are some additional cases that might still need to be covered

Related Issue

https://github.com/owncloud/enterprise/issues/5457

Motivation and Context

How Has This Been Tested?

  • test environment:
  • test case 1:
  • test case 2:
  • ...

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Database schema changes (next release will require increase of minor version instead of patch)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

  • Code changes
  • Unit tests added
  • Acceptance tests added
  • Documentation ticket raised:
  • Changelog item, see TEMPLATE

Notes

  • On average, range download don't generate a checksum because not the whole file is read. However, it really depends on how PHP reads the stream internally. PHP usually read blocks of 8KB and keep them in a buffer even though only a few bytes are requested and returned. This means that, for small files (less than 8KB), there is a chance that the whole file is buffered, so in this case a checksum could be stored even though, technically, not the whole file is returned.
  • It's expected that, regardless of the outcome, the checksum stored in the DB is always correct. This is important taking into account the behavior of the previous point.
  • While the expected actions on the data stream are expected to be easy coming from the outside (either read the whole o a part of the file, or write contents) additional actions might happen, such as overwrite a part of the stream while leaving the rest intact. In this case, a conservative approach is taken, so if there are doubts on whether we've read the whole file or the checksum could be compromised (by moving the stream pointer to a previous position, for example), no checksum will be stored because it's very likely it will be wrong.
  • Partial data won't be saved among requests. This could be relevant for apps that download the file by chunks. The app could request the first megabyte, then the second, then the third... In this case, even though the file ends up being downloaded, no checksum will be stored.

@update-docs
Copy link

update-docs bot commented Nov 23, 2022

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@ownclouders
Copy link
Contributor

💥 Acceptance tests pipeline apiProxySmoke-8-2-mariadb10.2-php7.4 failed. The build has been cancelled.

https://drone.owncloud.com/owncloud/core/37334/168

@jvillafanez jvillafanez marked this pull request as ready for review November 24, 2022 13:06
@jvillafanez
Copy link
Member Author

@phil-davis is it possible to add some acceptance tests for this scenario? Basically, if the download has a request range, no checksum should be generated. If there is a checksum already, it won't be changed.
A relatively big file (1-5MB) should be enough. A small file might get hit by the first point in the notes, so a checksum might be generated even though a couple of bytes have been requested.

Copy link
Contributor

@IljaN IljaN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! LGTM 💯

@jnweiger
Copy link
Contributor

Merging. We have only the one expected fsweb error in the ci, that is unrelated to this.

@jnweiger jnweiger merged commit 4626cd3 into master Dec 23, 2022
@delete-merged-branch delete-merged-branch bot deleted the set_checksum_if_not_seeked branch December 23, 2022 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants