Skip to content

Revision UI removed & activity stream improved#3972

Merged
amercader merged 118 commits into
masterfrom
3484_revision_ui_removal2
May 3, 2019
Merged

Revision UI removed & activity stream improved#3972
amercader merged 118 commits into
masterfrom
3484_revision_ui_removal2

Conversation

@davidread
Copy link
Copy Markdown

@davidread davidread commented Jan 2, 2018

Fixes #3484

  • Revision UI removed (as warned in CKAN 2.7.0 changelog):

    • /revision/list - List of which datasets changed. Not easy to page through 200 at a time. Not used or useful as far as we can tell.
    • /revision/{id} - Detail of a revision - displays timestamp and the dataset - no use. Problematic when there are billions of revisions you have billions of these pages that get indexed.
    • /revision/edit/{id} - to undelete a revision. No-one uses this method. Better ways to achieve this.
    • /revision/diff/{id} - diffs different versions of a dataset. Replaced by /dataset/changes/
  • History page removed - /dataset/{id}/history now redirects to Activity Stream. The History page was not supported or linked to from anywhere (on default templates) but it was useful for admins. The functionality of seeing the past versions is fully replaced by the Activity Stream now. (The Atom feed at the same URL is also removed: /dataset/{id}/history?format=atom, again deprecated. But there is a feed of datasets ordered by metadata_modified at /feed/dataset.atom, and the changes are in the package_activity_list API which could be made an Atom feed if there is interest.)

  • Activity Stream saves the full dictized dataset, rather than just the package table. Migration script: migrate_package_activity.py (discussed below)

  • Activity Stream for a dataset now has 2 new links against each activity item: "View this version" to view the dataset as it was and "Changes" - the diff since the previous version:
    screen shot 2018-05-12 at 10 32 47
    The changes (diff) page looks like this:
    screenshot 2019-02-15 at 17 32 05
    ...
    screenshot 2019-02-15 at 17 32 23
    This isn't the clearest - changes are not highlighted - you have to look in the margin for a letter. But it's a pretty good start and can be improved.

    • NB The "View this version" and "Changes" pages can viewed by admins and optionally others too - depending on new option ckan.auth.public_activity_stream_detail. The default is False, so that sites upgrading don't suddenly splurge this detail without taking the conscious act of changing it, although we encourage it in the changelog. New sites will have it set to True by default, because it is added that way in deployment.ini_templ.
  • Viewing old versions of datasets is done with a different parameter name. Uses ?activity_id=<id> rather than @<revision_id> or @<date>. (This breaking change is ok because the only links to these were from the history page which was not linked to itself.)

  • Activity Stream model is simplified - we use just Activity and remove ActivityDetail. Previously when you update a resource, it'd create an Activity('changed package') and also a connected ActivityDetail('resource changed'). This just seems too complex in the model. Now if you change just an extra field or a resource field, it'll simple say 'X updated the dataset Y', rather than 'X added the resource Z to the dataset Y'.
    Before:
    screenshot 2019-02-08 at 11 58 43
    After:
    screenshot 2019-02-08 at 11 58 53
    Since we store the pkg_dict on every commit, we could in future write logic that could analyze the diff to describe the change in more detail, like before. But it's worth it for the simpler model.

    • ActivityDetail is no longer used and is deprecated. It remains in the model for now, in case someone has some custom details they want to keep. But the user is asked to agree to deletion and then the table is emptied by when they run migrate_package_activity.py.
    • activity_detail_list logic function is removed
  • New action functions for viewing an activity: activity_show and very similar: activity_data_show making command-line diffs possible e.g.:

diff <(ckanapi action activity_data_show id=484f01d7-d9c8-4a4d-bb42-00da6b8525f7 object_type=package -c $CKAN_INI) <(ckanapi action activity_data_show id=293a87ed-7ff9-4191-9d92-cb523763b8dd object_type=package -c $CKAN_INI)
15c15
<   "metadata_modified": "2018-01-02T16:29:44.571621",
---
>   "metadata_modified": "2018-01-02T16:29:44.834676",
80c80
<       "revision_id": "1e6a3a25-2099-4318-891c-a2dd78b173de",
---
>       "revision_id": "9e779436-fe43-400d-92ad-9f1583a60a60",
91,92c91,92
<   "revision_id": "9eddc9f4-a29b-4feb-aff6-bfcea87f2139",
<   "state": "draft",
---
>   "revision_id": "9e779436-fe43-400d-92ad-9f1583a60a60",
>   "state": "active",
  • Diffing between package versions is also possible with a new action function activity_diff.
  • migrate_package_activity.py - migrates activity stream to store the full package dict, using the package revision tables.
    • Package revision tables (package_revision/package_extra_revision/package_tag_revision/resource_revision) are not removed and are still being populated, but are not used for anything. (We keep them just in case: There is a chance that there are packages have been create/changed/deleted but there are not corresponding activity stream objects (due to disabling ckan.activity_streams_enabled, or for changes while running ckan before 1.6, or due to a bug certain changes are not recorded as activity). Regenerating activity stream objects for these would probably be possible from the revisions, so why not leave them.)
    • This does not migrate old versions of groups/organizations/member/package_relationship/system_info so those revision tables remain, to be tackled in future. Eventually we'll get rid of vdm.
    • This migration takes a while to run (could be hours for the biggest sites), and the effect is only noticeable to admins by default, so rather than include it in the normal paster db upgrade migrations, it is a standalone script. Ideally you download it and run it against any version of CKAN (I've tested it 2.3-2.8) before you upgrade to this version of CKAN. But if you don't, that is fine - you can upgrade to this or a future CKAN version, and run this migration if you want - the only downside of not migrating is that the "View this version" and "Changes" links will 404 until you do run the migration (and by default only admins can see Changes page anyway). (An earlier version of this PR prevented you upgrading until the migratino was done, but is no longer the case.)

I think we are agreed that it is ok to remove Revision UI and History page without prior deprecation because they are long unlinked and unsupported.

This PR removes a chunk of code to do with 'revisions'. However migrate_package_activity.py requires the revisions still in the model. A future PR (looking like https://github.com/ckan/ckan/pull/4664/files#diff-67c809480fe3487c995fb33d7d134d84) can remove the rest:

  • tables: group_revision/group_extra_revision/member_revision/package_relationship_revision/system_info_revision - maybe we need a similar way to view changes to groups at least? maybe the rest can go.
  • model code
  • vdm

This PR follows on from #4627: Activity Stream is displayed with standard templates, rather than via action functions that returned HTML.

This PR took over from the original work on this issue that was #3485

Todo:

  • Go through review comments Completely remove the revision controller. #3485
  • diffing - see if it is poss using the api and diff.
  • Check: _filter_activity_by_user commenting out
  • Remove ActivityDetail in a migration script
  • write test for migrate_package_activity.py (ensure it doesn't break in future)
  • move the revision stuff out of model_dictize code into migrate_revisions.py (including calling package_dictize with the (context['revision_id'] option)
  • config option for everyone to see diffs
  • test against old/real activity streams
  • Really old versions of CKAN included activity event types that no longer exist, and trying to render those activity events fails. We need to add a fallback template to render these without exception. This will be more important once this PR goes in since it will be much easier to add arbitrary event types. (TkTech: One important test we should have is to ensure activity doesn't fail to render when the database includes an activity type or objects no longer exist. There have been 4-5 issues about this on the mailing list from what I've seen.)
  • Rendering org/group events when the org/group has been deleted cause crashes on some templates.
  • check templates-bs2 are in sync

Features:

  • includes tests covering changes
  • includes updated documentation
  • includes user-visible changes
  • includes API changes
  • includes bugfix for possible backport

TkTech and others added 3 commits April 27, 2017 14:51
Remove the legacy revision controller 'tests'

Activities are now just templates. Removes all _html 'helpers'.

Remove leftover mentions of activity_streams.py and missed _html helpers from actions.

Switch activity action based on group type.

Fix 'View this version' links.

Redirect legacy history page to activity feed (this page is not referenced by anything in core ckan)

Revert a small change that breaks old fanstatic-based CKAN.

Remove mentions to user_activity_list_html and _html test checks.

Store complete package dict in activity. Store user name at time of change. Show historic package versions.

Legacy tests do not include valid users when creating test packages.

Ignore authentication when creating the package dict for the activity record.
@davidread davidread changed the title [WIP] Revision UI removal [WIP] Revision UI removal etc Jan 2, 2018
David Read added 3 commits January 2, 2018 16:05
…arly the last person to use less used a more recent version (than 2.5.2) because the color values in main.css are no longer long-hand (e.g. main.css has #ffffff rather than the #fff short-hand found in the .less file)
@davidread davidread changed the title [WIP] Revision UI removal etc [WIP] Revision UI removed & activity stream improved Jan 2, 2018
@amercader
Copy link
Copy Markdown
Member

@davidread this is great!, are you still working on it?

@amercader amercader added the WIP label Jan 9, 2018
@davidread
Copy link
Copy Markdown
Author

@amercader Yes :)

David Read added 2 commits February 5, 2018 21:40
…functions: activity_show (which takes activity IDs) and activity_list_show (which takes object IDs).
@davidread
Copy link
Copy Markdown
Author

davidread commented Mar 16, 2018

In this PR only admins see the old versions of datasets (i.e. not everyone). The concern I have is with existing CKAN sites, where I don't think we should suddenly make the old versions so visible. I'd expect the vast majority of admins had previously assumed that old versions were not released. They wouldn't wish to show old versions which may have incorrect metadata, or multiple draft versions, which may be embarrassing.

The previous versions of datasets (excluding resources) have been available only in obscure places:

  • an API call
  • History page, which is not linked from anywhere

A site owner can allow anyone to see the old versions, by simply overriding the auth function activity_list_show, if they choose.

@wardi and anyone else - it would be good to hear if you think this is right.

@davidread davidread force-pushed the 3484_revision_ui_removal2 branch 2 times, most recently from 41203fa to 6c89fa0 Compare April 12, 2019 16:44
* Use PackageDictizeMonkeyPatch to patch package_dictize, because a context manager like this cleans up after itself effectively, so any tests that run after test_migrate_package_activity.py and test_revision_legacy_code.py will use the standard package_show now.
* Fix test_revision_legacy_code.py now package_extra_revision table is not being populated at the moment.
* Copied in create_object_version() from vdm, so we can use PackageExtraRevision without its .continuity being mapped to the PackageExtra.
@davidread davidread force-pushed the 3484_revision_ui_removal2 branch from 6c89fa0 to 15507df Compare April 12, 2019 16:48
@davidread
Copy link
Copy Markdown
Author

@wardi I've now removed the restrictions about having to do the migration before you upgrade beyond this version. I think this is good to merge, so please do take a look!

@davidread
Copy link
Copy Markdown
Author

I was hoping this would reduce CKAN's code size, but the extra is due to the new diff functionality, and ckan/migration/revision_legacy_code.py which is starting to include hacked bits of vdm, so it's not really in the CKAN codebase proper to concern devs in the future.

@hardingalexh
Copy link
Copy Markdown
Contributor

Anything else to contribute for this MR? My team has been evaluating implementing a similar feature and it'd be great to get this merged so we can use it :)

@wardi
Copy link
Copy Markdown
Contributor

wardi commented Apr 18, 2019

looking great, just had one minor comment about an API parameter above

@davidread davidread force-pushed the 3484_revision_ui_removal2 branch from 75592d9 to 9bc6e76 Compare April 19, 2019 18:19
@davidread davidread force-pushed the 3484_revision_ui_removal2 branch from 9bc6e76 to 6b5c6d1 Compare April 19, 2019 18:26
@davidread
Copy link
Copy Markdown
Author

@wardi said:

#3972 is fine with me, just need the ok from @amercader and/or @smotornyuk since I'm not assigned

@amercader @smotornyuk can either of you merge this PR now, please? It's been a long-time fork and I'm really keen to move on and get the rest of vdm removed.

@amercader amercader merged commit 6b5c6d1 into master May 3, 2019
@amercader
Copy link
Copy Markdown
Member

Thanks for your work on this @davidread (and @TkTech for starting it)! Really thorough and well-thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Removal of deprecated revision controller.

7 participants