Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File Versioning #121

Closed
luke- opened this issue Jul 30, 2021 · 47 comments
Closed

File Versioning #121

luke- opened this issue Jul 30, 2021 · 47 comments
Assignees

Comments

@luke-
Copy link
Contributor

luke- commented Jul 30, 2021

Currently, when a file with the same file name is uploaded, it is overwritten or a new file is created depending on the configuration.
An edited file (e.g. via OnlyOffice) is always overwritten.

In future, a new version of the file should always be created in these cases.

In the context menu, there should be a link "Versions", which opens a modal that allows access and restore of older versions.


A CFile (module) consequently has several File (file) models.


Also related to: #122 (comment)

@luke- luke- mentioned this issue Jul 30, 2021
@luke-
Copy link
Contributor Author

luke- commented Jul 30, 2021

#113

@yurabakhtin
Copy link
Contributor

@luke-

Currently, when a file with the same file name is uploaded, it is overwritten or a new file is created depending on the configuration.

Sorry, I cannot find such configuration, please tell me.

In the context menu, there should be a link "Versions", which opens a modal that allows access and restore of older versions.

Do you tell about files which are stored on disc and not in DB, if yes then where do you plan to store the old versions, also on the disc, i.e. after each update we should create an additional file with some suffix of the modification date?

@luke-
Copy link
Contributor Author

luke- commented Sep 8, 2021

@yurabakhtin Here the configuration part:
image

No, the Idea is this:
Currently: Each file entry in the CFiles module humhub\modules\cfiles\models\File is currently associated with exactly one humhub\modules\cfiles\models\File.

With versioning, each humhub\modules\cfiles\models\file should be able to have multiple humhub\modules\cfiles\models\File records. (For each file version one.)

@yurabakhtin
Copy link
Contributor

@luke- Ok, thanks for the info, however in your examples the same class so I think you mean this:

  • Currently: humhub\modules\cfiles\models\File is associated with single core humhub\modules\file\models\File
  • Idea: humhub\modules\cfiles\models\File will be associated with multiple core humhub\modules\file\models\File

Right?

@luke-
Copy link
Contributor Author

luke- commented Sep 9, 2021

@yurabakhtin Yes!

@yurabakhtin
Copy link
Contributor

@luke-

In the context menu, there should be a link "Versions", which opens a modal that allows access and restore of older versions.

When do you plan to merge the PR #131? Because if I will add the new context menu item "Versions" in old version then it will be not easy to merge.
It looks you forgot to merge because I see you merge the core changes but not for the cfiles module from the issue #127.

@yurabakhtin
Copy link
Contributor

@luke- I have started the logic in the commit b5e5929:

file_versioning

Next step is to implement UI for switching, I will start on old context menu while new is not merged yet.

@luke-
Copy link
Contributor Author

luke- commented Sep 10, 2021

@yurabakhtin Ok great looks good so far!

@yurabakhtin
Copy link
Contributor

yurabakhtin commented Sep 10, 2021

@luke- Commit cea87c1:

file_versions

What I should still do:

  • Don't show the context menu item "Versions" if the file has only single version,
  • Add tests,
  • Probably change the UI from radio list to table view, @luke- please let me know if UI should be changed there somehow, or just add a small preview for image files.

@luke-
Copy link
Contributor Author

luke- commented Sep 13, 2021

@yurabakhtin Looks already pretty good!

Probably a table in the version dialog would be nicer.

Columns could be:
Time, Author, Size, Options -> Revert to this version, Download

It would be important that other modules like OnlyOffice or other file editors could work with versioning. Maybe we should provide here some API?

@yurabakhtin
Copy link
Contributor

  • Don't show the context menu item "Versions" if the file has only single version,

Commit 01dc7f9

  • Add tests,

Commit 21b948a

@yurabakhtin
Copy link
Contributor

@luke-

It would be important that other modules like OnlyOffice or other file editors could work with versioning. Maybe we should provide here some API?

Currently the code model humhub\modules\file\models\File doesn't restrict to have multiple records for same object. I.e. each model from other modules can has several records in the table like we implemented for the model cfiles\models\File.

If some another model from external module will want to have same multiple File records then code should be changed from the external module side, if we tell about OnlyOffice then I guess here https://github.com/ONLYOFFICE/onlyoffice-humhub/blob/master/models/CreateDocument.php#L91 we should allow to find existing File instead of current $file = new File();.

Then I don't understand why we need to implement some API if each external module has an access to core DB tables. I.e. if we need some UI to work with file versions we can get the data directly from DB without API requests.
Also when OpenOffice is created as attachment to a post on stream then we can find the file on the cfiles browser in the folder "Files from the stream" where we may open context menu "Versions".

@luke-
Copy link
Contributor Author

luke- commented Sep 14, 2021

@yurabakhtin "Don't show the context menu item "Versions" if the file has only single version," , sorry for the late feedback, but if you use a table view on this lightbox, we should always show this menu entry. Otherwise it may confuses users.

@luke-
Copy link
Contributor Author

luke- commented Sep 14, 2021

@yurabakhtin Regarding the API: Here is the code line where in the OO Module a change of the document is saved.
https://github.com/ONLYOFFICE/onlyoffice-humhub/blob/master/controllers/BackendController.php#L123

How would the code have to look like to support versioning in CFiles? Maybe a extra API is too much here, but the code should be as simple as possible.

Ok, interesting point that versioning could theoretically be possible for files from normal posts as well. However, this is currently not absolutely necessary.

@luke-
Copy link
Contributor Author

luke- commented Sep 14, 2021

@yurabakhtin Here a quick idea/thought:

The HumHub Core File Model has the attributes object_model/id which refer to the related CFile Model.
Would it theoretically also be possible to leave this link object_model/id only set for latest file Model record?
Older core File model versions could have a link to the Core File Model itself?

Example of core file table:

Current situation:

id object_model object_id ... Note
1 CFile 1
2 CFile 1
3 CFile 1 Latest file marked in cfiles table

Idea:

id object_model object_id ... Note
1 File 3
2 File 3
3 CFile 1 Latest Version

What do you think?

@yurabakhtin
Copy link
Contributor

@luke-

Probably a table in the version dialog would be nicer.
Columns could be:
Time, Author, Size, Options -> Revert to this version, Download

Commit 401391b:

table-view

@yurabakhtin
Copy link
Contributor

@luke-

"Don't show the context menu item "Versions" if the file has only single version," , sorry for the late feedback, but if you use a table view on this lightbox, we should always show this menu entry. Otherwise it may confuses users.

Sorry, not clear your idea why we should display the "Versions" table when CFile has only single version, please see how it will looks:

single-revision

I.e. there is no the action link to switch to revision because no reason to switch to the same version, so do you still think we should display the context menu item "Versions" and then the table with single row like on my screenshot above?

@luke-
Copy link
Contributor Author

luke- commented Sep 14, 2021

@yurabakhtin Thank you for the screenshot. Yes, I would nevertheless show it, even if only one version is available. That way the user sees that there is a versioning at all.

@luke-
Copy link
Contributor Author

luke- commented Sep 14, 2021

@luke-

Probably a table in the version dialog would be nicer.
Columns could be:
Time, Author, Size, Options -> Revert to this version, Download

Commit 401391b:

table-view

Looks good. But I would prefer that the latest version always appears at the top. If, for example, there are 5 pages with 10 versions each, and you roll back to page 3 of one version, it becomes confusing.

@yurabakhtin
Copy link
Contributor

@luke-

Yes, I would nevertheless show it, even if only one version is available. That way the user sees that there is a versioning at all.

Ok, commit 5351fb4.

@yurabakhtin
Copy link
Contributor

@luke-

Looks good. But I would prefer that the latest version always appears at the top. If, for example, there are 5 pages with 10 versions each, and you roll back to page 3 of one version, it becomes confusing.

Commit 047caa6:

top_ordered

You tell about pages, should I implement the pagination for the table?

@yurabakhtin
Copy link
Contributor

@luke-

#121 (comment)

Would it theoretically also be possible to leave this link object_model/id only set for latest file Model record?
Older core File model versions could have a link to the Core File Model itself?

What do you think?

From the "Current situation:" table we can know what base/core Files are linked with CFile model, but how can we know this from the "Idea:" table? Currently we select versions by object_model and object_id. Sorry, but now the idea is not clear for me, also why do you need the linking core File model to itself?

@luke-
Copy link
Contributor Author

luke- commented Sep 14, 2021

@yurabakhtin The main point of my idea is that, theoretically, with the object_model/id mapping, a regular humhub\modules\file\models\File can also have multiple humhub\modules\file\models\File versions.

Example:

  • A CFile has exactly one File, which refers to latest version.
  • A File also have multiple associated File records, which represents are the alternative versions.

Advantages:

  • Only one file is ever assigned to the content model (e.g. CFile). This is always the last and active version.
  • This means that no additional field such as file_id would be necessary in the cfiles or other modules tables.
  • We could implement the versioning and, if necessary, helper methods centralised in the core. I could imagine, for example, that in the long term we need a system to combine many changes within a short period of time from several versions into one version in order to save memory.

Disadvantages:

  • A little more complex.
  • It is a little more complicated from the lookup

@luke-
Copy link
Contributor Author

luke- commented Sep 15, 2021

Here is an example code of how the code on the core side might look like:
humhub/humhub@develop...enh/file-versioning

Example how to rollback to a version:

/** @var humhub\modules\file\models\File $file */
/** @var humhub\modules\file\models\File $previousFileVersion */

$previousFileVersion = $file->getFileVersionsQuery()->limit(1)->one();
$file->replaceWithFile($previousFileVersion);

@yurabakhtin
Copy link
Contributor

@luke- Thank you for the details, yes now I got it, so if we have these data:

id object_model object_id ... Note
1 File 3
2 File 3
3 CFile 1 Latest Version

after uploading new CFile the data will be updated to this:

id object_model object_id ... Note
1 File 4
2 File 4
3 File 4
4 CFile 1 Latest Version

on switching the CFile#1 to the version File#2 the data will be like this:

id object_model object_id ... Note
1 File 2
2 CFile 1 Latest Version
3 File 2
4 File 2

I will implement core helpers to work with such versions algorithm.

@luke-
Copy link
Contributor Author

luke- commented Sep 16, 2021

@yurabakhtin Yes, exactly!

@yurabakhtin
Copy link
Contributor

@luke- I have implemented the file versioning from core side in PR humhub/humhub#5293.
The related changes for the cfiles module in the commit 0dc6c26.


You didn't answer yet on the question:

You tell about pages, should I implement the pagination for the table?

but I think we need this because it looks like this:

need_pagination

@yurabakhtin
Copy link
Contributor

@luke- Commit e1881ca - Pagination for file versions table:

versions-pagination

Single page:

single_page

@yurabakhtin
Copy link
Contributor

@luke- Please note the PR has errors in tests https://github.com/humhub/cfiles/runs/3634226999?check_suite_focus=true#step:25:31 because the branch develop need new changes from the core PR humhub/humhub#5293.

@luke-
Copy link
Contributor Author

luke- commented Sep 20, 2021

@yurabakhtin Awesome & Thanks, please see my PR reviews. Looks quite nice thanks! Can you please also prepare a draft PR which provides versioning here. https://github.com/ONLYOFFICE/onlyoffice-humhub

@yurabakhtin
Copy link
Contributor

@luke- Could you please provide me with some test server so I will check how the module store files?
Currently I can test this only without editing files, but please note it is possible to create several files with same name for a post:

test_same_file_names

I guess on edit a file it is stored into the same File, i.e. new record it not created in the table file, but you want to create a new record instead of previous, and previous record should be linked to the current with object_model = humhub\modules\file\models\File, right?

Also currently we don't have a context menu like on cfiles browser, should we add a button to view versions here? :
versions

@luke-
Copy link
Contributor Author

luke- commented Sep 21, 2021

In my example PR I would also assume that each file modification (CFiles ReUpload, OnlyOffice Edit, Text Editor Edit),
first creates a new File record and then calls this ReplaceWithFile() method.
See: https://github.com/humhub/humhub/blob/enh/file-versioning/protected/humhub/modules/file/models/File.php#L284

In your core PR/apprach (if I understood it correctly) the "replaceWithFile" step happens automatically when
creating/attaching a new File using the event ActiveRecord::EVENT_AFTER_INSERT.
Does this also work with content where multiple files are added? If yes, we can stick to your approach.


An important question to clarify would be whether we want to add Versioning support for all Contents (Posts, Tasks, Wiki)
or only for the CFiles module (at least for the beginning).

Here I assumed the second, that we offer Versioning Support optionally for modules and an interface class must be implemented first to enable it. In this case, the CFiles could use versioning and the default Post module could just replace the file.
So depending on where the File (CFile or Post) is atteched you could either have versioning or not. https://github.com/humhub/humhub/blob/enh/file-versioning/protected/humhub/modules/file/models/File.php#L294-L302

Alternatively, we can enable versioning for all contents (not only CFiles) directly.

I think it would be good to maybe start with just the CFile module for now and have only the most necessary changes in the core. If the versioning works out, we can move the classes like the VersionController later into the Core.

What do you think?

@yurabakhtin
Copy link
Contributor

@luke-

In your core PR/apprach (if I understood it correctly) the "replaceWithFile" step happens automatically when
creating/attaching a new File using the event ActiveRecord::EVENT_AFTER_INSERT.
Does this also work with content where multiple files are added? If yes, we can stick to your approach.

In my core PR https://github.com/humhub/humhub/pull/5293/files the "replaceWithFile" step happens automatically on the event ActiveRecord::EVENT_AFTER_INSERT, it works like this sample:

If before new uploading a file with same name we have a DB like this:

id object_model object_id ... Note
1 File 3
2 File 3
3 CFile 1 Latest Version

then after uploading new CFile the data will be updated to this:

id object_model object_id ... Note
1 File 4
2 File 4
3 File 4
4 CFile 1 Latest Version

I.e. firstly we find ID of the previous latest version, in our case ID=3, then update the record (id=3) object_model: CFile => File and object_id: 3 => 4, also we should find all other old versions i.e. we should do object_id: 3 => 4 for records with id=1 and id=2. All these actions are executed by single SQL query here https://github.com/humhub/humhub/pull/5293/files#diff-1d69c3220adcc4f2d087e12f111719fb7a477227e335a8640ae4bc26b4f01435R85-R102.
Please note in your example version https://github.com/humhub/humhub/blob/enh/file-versioning/protected/humhub/modules/file/models/File.php#L284-L303 I see only updating the previous version, but there is no updating of all other more older versions.

But I am bit confused here myself because by this logic on attaching a File to a Post all previous attached Files of the Post should become old versions of the latest uploaded, that is wrong indeed because Post should have multiple attached files, and we should not replace previous attached Files. However currently my code doesn't overwrite previous attached files, i.e. on each attaching it works like before:

attach_to_post

After debug the code I found answer why all previous attached Files stay as separate file of a Post instead of expecting converting to old versions as it works with CFiles:

debug_post_attaching

I.e. my current code doesn't break files of the Post model, but of course we cannot be sure that it will works with all other model, because if some model will set object_model before insert then all previous files of the same model will be converted to old version that will be wrong of course, so I agree we should implement some interface like you suggested AttachedFileVersioningSupport and we will run my behaviour Versions::refreshVersions only for models with such implemented interface, do you agree this?


I think it would be good to maybe start with just the CFile module for now and have only the most necessary changes in the core. If the versioning works out, we can move the classes like the VersionController later into the Core.

What do you think?

As I wrote above we should implement some interface as you suggested AttachedFileVersioningSupport and on first step use this only for CFiles and then we will may apply for other models, but currently we don't have a core UI to work with versions, we have this only from cfiles module side, i.e. probably in future it will be some core UI to review old versions and switching to another version.

@luke-
Copy link
Contributor Author

luke- commented Sep 22, 2021

@yurabakhtin Ok, then let's proceed with interface and "replaceWithFile" method. But we should always (even for modules with non versioning support) use the "replaceWithFile" method to update file contents.

I like the approach that files can't be changed anymore. This opens us good chances in things like caching, CDN, Trashcan...

@luke-
Copy link
Contributor Author

luke- commented Sep 22, 2021

The Versioning UI stuff can be moved from CFiles to Core, with a later version e.g. HumHub v1.11 or v1.12

@yurabakhtin
Copy link
Contributor

@luke- Please review new changes in core PR - humhub/humhub@194a6c6, and in cfiles module - 8126753.

@yurabakhtin
Copy link
Contributor

@luke- I don't find any new comments in the PRs what should be changed there.

@luke-
Copy link
Contributor Author

luke- commented Sep 27, 2021

@yurabakhtin For the core PR there is a comment here: humhub/humhub#5293 (review)

We may also need a strategy to clean up older versions. I think for example the OnlyOffice module does an AutoSave every X minutes.

Here some ideas for options:

  • Define max. versions per file (e.g. 50)
  • Define max. versions per day per file (for versions older than x days)
  • Auto Combine combine versions created by a short interval by the same users

We could implement a version cleanup like that later, but we should think about it.

@yurabakhtin
Copy link
Contributor

@luke-

For the core PR there is a comment here: humhub/humhub#5293 (review)

I answered humhub/humhub#5293 (comment).

@luke-
Copy link
Contributor Author

luke- commented Sep 27, 2021

@yurabakhtin Looks and works great! Can you implement that someone with ManageFiles permission can also delete old versions as well?

@yurabakhtin
Copy link
Contributor

yurabakhtin commented Sep 28, 2021

@luke-

Can you implement that someone with ManageFiles permission can also delete old versions as well?

Commit c8312c5, please note the core fix humhub/humhub#5311 is required to make the deleting of versions working.

@luke-
Copy link
Contributor Author

luke- commented Sep 28, 2021

@yurabakhtin Ok great, is there anything else to do with the issue? I have written a comment at the PR. But otherwise we can merge it? Or is something still missing?

@luke- luke- closed this as completed Oct 16, 2021
@yurabakhtin
Copy link
Contributor

@luke- I found only the permissions don't work related to your written comment, please read my answer #138 (comment).

@yurabakhtin yurabakhtin reopened this Oct 18, 2021
@luke-
Copy link
Contributor Author

luke- commented Oct 18, 2021

@yurabakhtin Yes, please rewrite that, so $cfile->content->canEdit() can "View older versions" and restore.

@yurabakhtin
Copy link
Contributor

@luke- Done in commit 54a7640.

@luke-
Copy link
Contributor Author

luke- commented Oct 18, 2021

@yurabakhtin Thanks! Then the PR should be ready for the time being, right?

@yurabakhtin
Copy link
Contributor

@luke- Yes, right, we can merge it.

@luke- luke- closed this as completed Nov 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants