Multiple versions of file shown on website after single upload #2

Closed
larkinox opened this issue May 23, 2018 · 20 comments
Labels
Bug (Something isn't working), Fixed, OneDrive API Blocker (An API issue prevents further work at this point in time), OneDrive Business, Workaround Available (A work around for the specific issue is available)

Comments

@larkinox

Hi,

I'm not sure if this is a bug or me being an idiot. With that out of the way, the problem is that the OneDrive web interface shows multiple versions for every file that I upload, meaning that each file consumes multiples of its own size from my storage allocation.

Some background:
onedrive --version: onedrive v1.1.2-1-ge8d3a26
OS: Debian 9.4 (Stretch)
For testing, I deleted all files from my account on the OneDrive servers, emptied the recycle bin and the second recycle bin. Storage Metrics page shows 0 bytes usage. I deleted my local items.sqlite3* files.

Command run: onedrive --verbose --local-first --monitor --synchronize

Sequence of events:

  1. A file is recognised as absent from the OneDrive account and is uploaded.
  2. The storage metrics page shows the file size for this file steadily increasing during the upload (to 100MB, the actual file size). The file has a "~tmp" prefix as expected.
  3. During the upload, the Version History for the file shows one version with a timestamp of the time of the upload.
  4. The file finishes uploading, the filename changes to remove the "~tmp" prefix.
  5. The file storage metrics summary size continues to increase, eventually stopping at 200MB.
  6. The Version History for the file now shows two versions, one with a timestamp of upload and one with a timestamp when the file was first created.

The upshot of this is that even if my files total, say 1GB, my storage allocation is reduced by 2GB. If I manually download and check each "version", they are identical (md5sum). I suppose the intended outcome is that there is one version, which has a modification date equal to when the file was last modified, not when it was uploaded.
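The identical-content check described above can be reproduced with a short script. This is a minimal sketch, assuming both versions have already been downloaded by hand from the Version History page; the function names are hypothetical and nothing here talks to OneDrive.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def versions_identical(paths):
    """True if every downloaded version has the same MD5 digest."""
    return len({md5_of(p) for p in paths}) == 1


def storage_consumed(paths):
    """Total bytes consumed if each stored version counts against the quota."""
    return sum(Path(p).stat().st_size for p in paths)
```

If `versions_identical` returns True for the two downloads while `storage_consumed` reports twice the file size, that matches the symptom reported here: the same content stored, and billed, twice.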

A short section of the log file is shown here:
Onedrive_log.txt

I have also screen-grabbed a short video showing the problem. Keep looking at the file 'duplicati-b167be4f012754240b8679b9b81a9f6fa.dblock.zip.aes' which I highlight at various points.
video_example.zip

Many thanks in advance,
James

@abraunegg
Owner

Thanks for the great detail behind this issue.

I have looked at the issue and tested against OneDrive Business & OneDrive Personal. It appears that this issue only affects OneDrive Business accounts. The issue itself is certainly related to ensuring that the metadata of an object is updated.

@larkinox
Author

Thanks for the update. I'm very happy to test if you think you have found the cause.

@abraunegg
Owner

abraunegg commented May 24, 2018

This issue is described in further detail here: OneDrive/onedrive-api-docs#778

Essentially, when a new file is created on OneDrive, it sets the file creation time to the date / time the file was created on the OneDrive platform. The 'onedrive' client currently then performs a 'patch' to reset the last modified date / time value to what the local file details are. SharePoint then sees this as a change and versions the file.

The resolution appears to look at the session upload capability and combine what the 'onedrive' client is attempting to do in a single step.

This will take a little bit of unraveling - but it is a starting point & potential path to resolution is clear.
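As a rough illustration of the behaviour described above, the following toy model treats every write to a versioned library, whether a content upload or a metadata-only patch, as creating a new version. The class and method names are hypothetical, and the model is an assumption based on this thread rather than a description of how SharePoint is implemented.

```python
class VersionedLibrary:
    """Toy model of a document library that versions a file on every write."""

    def __init__(self):
        self.versions = {}  # file name -> list of (size_bytes, modified_time)

    def upload(self, name, size_bytes, modified_time):
        # A content upload always records a new version.
        self.versions.setdefault(name, []).append((size_bytes, modified_time))

    def patch_modified_time(self, name, modified_time):
        # Assumption from the thread: a metadata-only patch is treated as a
        # change, so the current content is recorded again as a new version.
        size_bytes, _ = self.versions[name][-1]
        self.versions[name].append((size_bytes, modified_time))

    def storage_used(self, name):
        return sum(size for size, _ in self.versions[name])


MB = 1024 * 1024

# Two-step flow: upload, then patch the timestamp.
two_step = VersionedLibrary()
two_step.upload("file.bin", 100 * MB, "upload-time")
two_step.patch_modified_time("file.bin", "local-mtime")

# Single-step flow: the timestamp travels with the upload itself.
single_step = VersionedLibrary()
single_step.upload("file.bin", 100 * MB, "local-mtime")

print(len(two_step.versions["file.bin"]), two_step.storage_used("file.bin") // MB)
print(len(single_step.versions["file.bin"]), single_step.storage_used("file.bin") // MB)
```

Run as a script, this prints `2 200` for the two-step flow and `1 100` for the single-step flow, which mirrors the growth from 100MB to 200MB reported in the original post.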

@larkinox
Author

I've checked the behaviour with versioning disabled in OneDrive (suggested as a workaround in OneDrive/onedrive-api-docs#778).

The result is that a single upload uses the correct amount of space (i.e. not double) and only a single version is created. However, the version history and file properties page show different dates (the date the file was created and the date the file was uploaded respectively).

Version history:
version_history

File properties:
file_properties

@abraunegg added the Bug label (Something isn't working) May 24, 2018
@abraunegg removed their assignment May 28, 2018
@abraunegg added the In Progress label (Currently being worked on) May 28, 2018
@abraunegg
Owner

I have potentially run into a OneDrive API bug: OneDrive/onedrive-api-docs#870

@abraunegg added the OneDrive API Blocker (An API issue prevents further work at this point in time) and Workaround Available (A work around for the specific issue is available) labels and removed the Investigating label Jun 5, 2018
@larkinox
Author

larkinox commented Jun 6, 2018

Thanks for the work you've done on this. I've read the OneDrive API bug you created but I'm not sure how you think a workaround would work. Could you elaborate a little?

@larkinox
Author

larkinox commented Jun 6, 2018

I've just realised that you mean the workaround I described above ;-) I'll push ahead with that for now and keep an eye here in future.

@abraunegg
Owner

abraunegg commented Jun 8, 2018

The OneDrive API team have confirmed that what I came across is a bug (OneDrive/onedrive-api-docs#870). They have provided a potential workaround, which I need to test in order to progress the resolution of this bug.

abraunegg added a commit that referenced this issue Jun 9, 2018
* Resolve #11 where shared folders were unable to be sync'd due to fileSystemInfo data being within the remoteItem object
* Initial work on resolving #2, but fix not validated or complete
@abraunegg
Owner

Just to update - a fix is progressing nicely. Should be able to finalise this in a few days.

@larkinox
Author

Thanks for the update. I'm very happy to test when you're ready.

@abraunegg removed the OneDrive API Blocker label (An API issue prevents further work at this point in time) Jun 14, 2018
abraunegg added a commit that referenced this issue Jun 14, 2018
* Update session upload to include a 'Request Body' that includes the 'localFileLastModifiedTime' to avoid making a patch call which 'versions' the uploaded file on OneDrive Business
* Resolve Multiple versions of file shown on website after single upload (OneDrive Business) (#2)
@abraunegg
Owner

@jlarkin-oxford
Can you help validate https://github.com/abraunegg/onedrive/tree/Issue-%232 ?

This code switches the upload of new files to session objects & as part of the session includes the 'timeLastModified' fields so that it negates the need for the patch call for new files. I have verified locally that when uploading new files, only a single file is uploaded and has the correct timestamps.
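To make the single-step idea concrete, here is a minimal sketch of how a request body carrying the local modification time might be built for an upload session. It assumes the `fileSystemInfo` facet and its `lastModifiedDateTime` property as documented for drive items; the helper names are hypothetical, and the exact fields the 'onedrive' client sends may differ from this.

```python
import json
import os
from datetime import datetime, timezone


def iso8601_utc(epoch_seconds):
    """Format a POSIX timestamp as an ISO 8601 UTC string, e.g. 2018-05-23T10:15:30Z."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def upload_session_body(local_path):
    """Build a JSON body for an upload session request that carries the local mtime.

    Assumption: the service accepts a fileSystemInfo facet under "item", so no
    follow-up patch call is needed to correct the timestamp.
    """
    mtime = os.stat(local_path).st_mtime
    return json.dumps({
        "item": {
            "fileSystemInfo": {"lastModifiedDateTime": iso8601_utc(mtime)},
        }
    })
```

Sending the timestamp with the session request means the file arrives with its final metadata, so there is no second write for SharePoint to treat as a change.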

@larkinox
Author

This has definitely helped in that I don't consume twice my storage allocation. There may still be a timestamp issue, but I'm not certain of the intended outcome. Here is my test procedure and results:

onedrive --version: onedrive v1.1.2-14-g491507a
delete ~/.config/onedrive/items.sqlite3*
Delete all files from online OneDrive account, empty recycle bin and second stage recycle bin.
Test case 1: Library versioning disabled as described here (This is the workaround mentioned in OneDrive/onedrive-api-docs#778)
Test case 2: Library versioning enabled with "Create major versions" selected and a maximum number of versions set to 10. I believe this was the default state from when my account was new and I first experienced this issue.

Command run: onedrive --verbose --local-first --monitor --synchronize

In both cases, the outcome was the same:

  • An upload of a 100MB file consumed 100MB of my storage allocation (correct)
  • The timestamp of the file in the normal OneDrive web interface, once it finished uploading, shows the date the file was created in the "Modified" column (correct).
  • Syncing the files onto a Windows PC shows that they have the timestamps for when the files were created (correct)
  • In the Storage Metrics page online, the version history shows one version with the date and time the file was created (correct).
  • On the main Storage Metrics page, before you click the link for a specific file's version history, the "Last Modified" column shows the date and time the file was uploaded (incorrect?).
  • The parent folders for each file have a "Modified" date and time of when they were uploaded, in the web interface, storage metrics page and the sync'd Windows machine (incorrect?)

In any case, the fix has definitely solved the "double storage allocation" problem so I'm happy to close the issue if you want.

Thanks for all your help.

@abraunegg
Owner

abraunegg commented Jun 14, 2018

> On the main Storage Metrics page, before you click the link for a specific file's version history, the "Last Modified" column shows the date and time the file was uploaded

That is potentially a bug in the UI. When a file is uploaded via the UI, or by the client when it is specifically set not to modify the timestamps, the last modified time is never updated, so the file creation time and modified time are the same.

> The parent folders for each file have a "Modified" date and time of when they were uploaded, in the web interface, storage metrics page and the sync'd Windows machine (incorrect?)

Potentially the same issue as above.

I have found one more part to fix with this issue: when a file is modified, 2 versions still get recorded, so this fix is incomplete.

@larkinox
Author

Is that not the intended design though? If I modify a file, and have specified that up to 10 versions should be kept, I would expect a second version to be stored. If I had specified that only a single version should be kept and two versions were being stored, that would be the problem.

@abraunegg
Owner

abraunegg commented Jun 14, 2018

Not quite. What I am seeing is:

Modify file -> new file upload v.1
Update timestamp -> file now gets a v.2

So now you have 3 versions of the file for 2 activities - initial upload and simply updating the file once.

@larkinox
Author

Ah, I see. Yes, not quite as intended.

@abraunegg
Owner

It looks like another OneDrive API issue.

If the file being uploaded as a "modification" already exists on OneDrive and "replace" is used for @name.conflictBehavior, 2 versions of the file are created, when only one is expected.

Will have to get the OneDrive API team to investigate this.

@abraunegg abraunegg added the OneDrive API Blocker An API issue prevents further work at this point in time label Jun 16, 2018
@abraunegg
Owner

abraunegg commented Jun 18, 2018

Opened a new case for this additional issue discovered:

OneDrive/onedrive-api-docs#877

Edit: Confirmed as a OneDrive API Bug

@abraunegg
Owner

Releasing this fix into 'master' as code changes were previously validated. Two file versions for OneDrive for Business will still get created when a modified file is uploaded, but this is due to OneDrive/onedrive-api-docs#877 and not this code now.

Additionally, once #23 is resolved, the checks that this application uses to verify that a file has indeed changed locally before uploading to OneDrive Business will lessen the impact of the OneDrive API issue above.
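The kind of "has this file really changed?" check mentioned above could look something like the sketch below. This is purely illustrative: the function names, the use of SHA-1, and the idea of comparing against a stored hash are assumptions, not a description of what the application or #23 actually does.

```python
import hashlib


def content_hash(data):
    """SHA-1 hex digest of the given bytes (the choice of hash is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def needs_upload(local_bytes, last_synced_hash):
    """Upload only when the content differs from what was last synced."""
    if last_synced_hash is None:
        return True  # never synced, so this is a new file
    return content_hash(local_bytes) != last_synced_hash
```

Skipping uploads for unchanged content would not fix the API issue, but it would reduce how often the extra version gets created.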

abraunegg added a commit that referenced this issue Jul 2, 2018
…Issue #2) (#40)

* Resolve multiple versions of file shown on website after single upload - however Issue #23 & OneDrive API bug (OneDrive/onedrive-api-docs#877) will still create 2 file versions on OneDrive Business. The work around is to disable file versions until #877 is resolved.
@lock

lock bot commented Jan 6, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Jan 6, 2019