Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

export/import group data via json file #4632

Merged
merged 42 commits into from Oct 24, 2018
Merged
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
42 commits
Select commit Hold shift + click to select a range
b65a7b9
first pass at export group
robguthrie Apr 25, 2018
31fa3dd
add migrate group service to export and import groups from json file
robguthrie Apr 25, 2018
78a3985
add group export service to move data between loomio installs
robguthrie Apr 25, 2018
972b946
tidy up group exporter
robguthrie Apr 25, 2018
3d1b8e1
ignore existing records
robguthrie Apr 25, 2018
894e2ed
dont need that one
robguthrie Apr 25, 2018
82b2060
nor these really
robguthrie Apr 25, 2018
37ff9ac
Add tests + better performance for MigrateGroupService
gdpelican Apr 25, 2018
b865763
Ensure json export
gdpelican Apr 30, 2018
9f48008
merge master
robguthrie May 4, 2018
c77cf87
hack GroupExportService to recover memberships
robguthrie May 4, 2018
ccec44d
Merge branch 'master' into copy-data
robguthrie May 24, 2018
89c16a0
Merge branch 'master' into copy-data
robguthrie May 29, 2018
ed13291
do not baloon memory on export
robguthrie May 30, 2018
e349ca9
need to defuck discussion_readers.volume method
robguthrie May 31, 2018
2880358
dont overload table_name methods with db calling methods
robguthrie May 31, 2018
3cf0db0
dont need it
robguthrie May 31, 2018
6944102
fast enough export and import
robguthrie May 31, 2018
c240a7f
remove user ip columns. exclude user confidental columns
robguthrie Jun 2, 2018
fc68d90
getting there
robguthrie Jun 10, 2018
9f7e1c2
Merge branch 'master' into copy-data
robguthrie Jun 10, 2018
194c646
self serve group data export
robguthrie Jun 10, 2018
6c3091d
remove debug
robguthrie Jun 10, 2018
8c4cb79
implement review feedback
robguthrie Jun 12, 2018
696393c
Merge branch 'master' into copy-data
robguthrie Aug 4, 2018
d7b16aa
feat(email): fixed subject line for group export ready email
gregorykan Oct 19, 2018
a00c30e
chore(email): added link to email
gregorykan Oct 19, 2018
c69a580
feat(email): specify in email that link should only be available for …
gregorykan Oct 19, 2018
4acbfad
chore(style): added default layout to group_export_ready
gregorykan Oct 19, 2018
6ec1d00
chore(copy): case stuff
gregorykan Oct 19, 2018
c2fd158
feat(polls): only surface non-anonymous polls and related data
gregorykan Oct 19, 2018
e3027c5
feat(polls): surface only exportable poll (i.e. non-anonymous) data
gregorykan Oct 19, 2018
d89bdc6
feat(jobs): enqueue a delayed job for document (attachment) deletion …
gregorykan Oct 19, 2018
9785df1
chore(tests): added test to make sure 403 is returned if user does no…
gregorykan Oct 22, 2018
132aad6
chore(tests): added test to check that email is sent
gregorykan Oct 22, 2018
e14a40f
Merge branch 'master' into copy-data
gregorykan Oct 22, 2018
0f4bce3
fix(document): document url uses computed url
gregorykan Oct 23, 2018
e26f879
fix(fs): specify /tmp folder for temp file
gregorykan Oct 23, 2018
792c435
fix(document): fixed condition
gregorykan Oct 23, 2018
69443f0
fix(document): reference the right attribute in view file
gregorykan Oct 23, 2018
d028687
fix(document): new, open, close before save
gregorykan Oct 23, 2018
9c710c9
fix(document): using after_create callback to ensure url is set only …
gregorykan Oct 23, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
5 changes: 5 additions & 0 deletions app/controllers/api/groups_controller.rb
Expand Up @@ -37,6 +37,11 @@ def upload_photo
respond_with_resource
end

def export
service.export(group: load_and_authorize(:group), actor: current_user)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic, but you should be able to do

service.export(group: load_resource, actor: current_user)

because it should be the service's job to authorize whether you can export or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of related. One thing that might be relevant is anonymous polls.
A data export would allow people to see who voted what.

If it were not for anonymous polls then I'd say that anyone who is a member should be able to export the group data.

If there are anonymous polls, I don't know if anyone should be able to download the data. It's a weird situation. What are your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a poll's anonymous, we should anonymize the export data.

Here's where I kinda wish we had serializers for these things, because it would mean we could make tweaks like this a bit more easily, rather than throwing scopes on the exportable_relations.

Instead of that though (I reckon it's a PITA), I think I'd prefer filtering out anonymous polls over to including them

# exportable_relations.rb
has_many :exportable_polls, -> { where(anonymous: false) }, source: :polls
# group_export_service
RELATIONS = [
  ...
  'exportable_polls' # (and not 'polls')
]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(NB that we don't use the 'discussion_polls' relation at the moment, because we're ensuring that a poll has the group_id set correctly if the discussion_id is set. A bit stateful-y, but it's worked so far.

render json: { success: :ok }
end

private
def track_visit
VisitService.record(group: resource, visit: current_visit, user: current_user)
Expand Down
11 changes: 11 additions & 0 deletions app/jobs/group_export_job.rb
@@ -0,0 +1,11 @@
class GroupExportJob < ActiveJob::Base
queue_as :low_priority

def perform(group, actor)
groups = actor.groups.where(id: group.all_groups)
filename = GroupExportService.export_filename_for(group)
GroupExportService.export(groups, filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected

filename = GroupExportService.export(group)

and then call group.all_groups from within the service

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan was that anyone in a group can export the data. Meaning they should only be able to export data for groups they belong to. So I want to pass in only groups the user belongs to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, then better to pass in the actor here and do that line within the service. I think the filename method is out of place here, there's no reason to have two static methods on GroupExportService like this, since they're so intertwined.

filename = GroupExportService.export(group, actor)
document = Document.create author: actor, file: File.open(filename, 'r'), title: filename
UserMailer.group_export_ready(actor, group, document).deliver

I also wonder about creating an event for this so that we can easily track it and send it to other places in the future if we want (like, text me when it's done, or send me a push notification with a link to the thing, etc.)

GroupExported.publish!(group, actor, document)

This also maintains our current distance from our ideal of 'events are the only way to send emails within the app'

document = Document.create(author: actor, file: File.open(filename, 'r'), title: filename)
UserMailer.group_export_ready(actor, group, document).deliver
end
end
9 changes: 9 additions & 0 deletions app/mailers/user_mailer.rb
Expand Up @@ -61,6 +61,15 @@ def user_added_to_group(recipient, event)
locale: [@user.locale, @inviter.locale]
end

def group_export_ready(recipient, group, document)
@user = recipient
@document = document
send_single_mail to: @user.email,
subject_key: "user_mailer.group_export.subject",
subject_params: {group_name: group.name},
locale: @user.locale
end

def login(user:, token:)
@user = user
@token = token
Expand Down
2 changes: 1 addition & 1 deletion app/models/ability/group.rb
Expand Up @@ -2,7 +2,7 @@ module Ability::Group
def initialize(user)
super(user)

can :show, ::Group do |group|
can [:show], ::Group do |group|
if user.is_admin?
true
elsif group.archived_at || group.is_guest_group?
Expand Down
1 change: 1 addition & 0 deletions app/models/comment.rb
Expand Up @@ -5,6 +5,7 @@ class Comment < ApplicationRecord
include HasMentions
include HasDrafts
include HasCreatedEvent
include HasEvents

has_paper_trail only: [:body]

Expand Down
115 changes: 115 additions & 0 deletions app/models/concerns/group_export_relations.rb
@@ -0,0 +1,115 @@
module GroupExportRelations
extend ActiveSupport::Concern

included do
# polls
has_many :discussion_polls, through: :discussions
has_many :poll_options, through: :polls
has_many :poll_unsubscriptions, through: :polls
has_many :poll_did_not_votes, through: :polls
has_many :outcomes, through: :polls
has_many :stances, through: :polls
has_many :stance_choices, through: :stances

# documents
has_many :discussion_documents, through: :discussions, source: :documents
has_many :poll_documents, through: :polls, source: :documents
has_many :comment_documents, through: :comments, source: :documents
has_many :public_discussion_documents, through: :public_discussions, source: :documents
has_many :public_poll_documents, through: :public_polls, source: :documents
has_many :public_comment_documents, through: :public_comments, source: :documents

# reactions
has_many :discussion_reactions, -> { joins(:user) }, through: :discussions, source: :reactions
has_many :poll_reactions, -> { joins(:user) }, through: :polls, source: :reactions
has_many :stance_reactions, -> { joins(:user) }, through: :stances, source: :reactions
has_many :comment_reactions, -> { joins(:user) }, through: :comments, source: :reactions
has_many :outcome_reactions, -> { joins(:user) }, through: :outcomes, source: :reactions

# readers
has_many :discussion_readers, through: :discussions

# guest groups
has_many :discussion_guest_groups, through: :discussions, source: :guest_group
has_many :poll_guest_groups, through: :polls, source: :guest_group

# users
has_many :discussion_authors, through: :discussions, source: :author
# has_many :discussion_reader_users, through: :discussion_readers, source: :user
has_many :comment_authors, through: :comments, source: :user
has_many :poll_authors, through: :polls, source: :author
has_many :outcome_authors, through: :outcomes, source: :author
has_many :stance_authors, through: :stances, source: :participant
has_many :reader_users, through: :discussion_readers, source: :user
has_many :non_voters, through: :poll_did_not_votes, source: :user

# events
has_many :membership_events, through: :memberships, source: :events
has_many :discussion_events, through: :discussions, source: :events
has_many :comment_events, through: :comments, source: :events
has_many :poll_events, through: :polls, source: :events
has_many :outcome_events, through: :outcomes, source: :events
has_many :stance_events, through: :stances, source: :events
end

def all_groups
Queries::UnionQuery.for(:groups, [
Group.where(id: self.id),
self.subgroups,
self.discussion_guest_groups,
self.poll_guest_groups
])
end

def all_users
Queries::UnionQuery.for(:users, [
self.members,
self.discussion_authors,
self.comment_authors,
self.poll_authors,
self.outcome_authors,
self.stance_authors,
self.reaction_users,
self.reader_users,
self.non_voters
])
end

def all_events
Queries::UnionQuery.for(:events, [
self.membership_events,
self.discussion_events,
self.comment_events,
self.poll_events,
self.outcome_events,
self.stance_events
])
end

def all_notifications
Notification.where(event_id: all_events.pluck(:id))
end

def all_documents
Queries::UnionQuery.for(:documents, [
self.documents,
self.discussion_documents,
self.poll_documents,
self.comment_documents
])
end

def all_reactions
Queries::UnionQuery.for(:reactions, [
self.discussion_reactions,
self.poll_reactions,
self.stance_reactions,
self.comment_reactions,
self.outcome_reactions
])
end

def reaction_users
User.where(id: all_reactions.pluck(:user_id))
end
end
1 change: 1 addition & 0 deletions app/models/discussion.rb
Expand Up @@ -51,6 +51,7 @@ def self.always_versioned_fields
has_many :polls, dependent: :destroy
has_many :active_polls, -> { where(closed_at: nil) }, class_name: "Poll"
has_one :search_vector

has_many :comments, dependent: :destroy
has_many :commenters, -> { uniq }, through: :comments, source: :user
has_many :documents, as: :model, dependent: :destroy
Expand Down
4 changes: 2 additions & 2 deletions app/models/discussion_reader.rb
Expand Up @@ -61,9 +61,9 @@ def recall!(persist: true)
save if persist
end

def volume
def computed_volume
if persisted?
super || membership&.volume || 'normal'
volume || membership&.volume || 'normal'
else
membership.volume
end
Expand Down
2 changes: 1 addition & 1 deletion app/models/formal_group.rb
Expand Up @@ -48,7 +48,7 @@ class FormalGroup < Group
belongs_to :default_group_cover

has_many :subgroups,
-> { where(archived_at: nil).order(:name) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this, but wonder if we need to check the /g/:key/subgroups query to make sure it's still returning the same thing?

-> { where(archived_at: nil) },
class_name: 'Group',
foreign_key: 'parent_id'
has_many :all_subgroups, class_name: 'Group', foreign_key: :parent_id
Expand Down
15 changes: 11 additions & 4 deletions app/models/group.rb
Expand Up @@ -11,22 +11,29 @@ class Group < ApplicationRecord

belongs_to :creator, class_name: 'User'
belongs_to :parent, class_name: 'Group'

has_many :discussions, foreign_key: :group_id, dependent: :destroy
has_many :public_discussions, -> { visible_to_public }, foreign_key: :group_id, dependent: :destroy, class_name: 'Discussion'
has_many :comments, through: :discussions

has_many :all_memberships, dependent: :destroy, class_name: 'Membership'
has_many :memberships, -> { where is_suspended: false, archived_at: nil }
has_many :members, through: :memberships, source: :user

has_many :accepted_memberships, -> { accepted }, class_name: 'Membership'
has_many :accepted_members, through: :accepted_memberships, source: :user

has_many :admin_memberships, -> { where admin: true, archived_at: nil }, class_name: 'Membership'
has_many :admins, through: :admin_memberships, source: :user

has_many :membership_requests, dependent: :destroy
has_many :pending_membership_requests, -> { where response: nil }, class_name: 'MembershipRequest'
has_many :members, through: :memberships, source: :user
has_many :accepted_members, through: :accepted_memberships, source: :user

has_many :discussions, foreign_key: :group_id, dependent: :destroy
has_many :public_discussions, -> { visible_to_public }, foreign_key: :group_id, dependent: :destroy, class_name: 'Discussion'
has_many :polls, foreign_key: :group_id, dependent: :destroy
has_many :public_polls, through: :public_discussions, dependent: :destroy, source: :polls

include GroupExportRelations

scope :archived, -> { where('archived_at IS NOT NULL') }
scope :published, -> { where(archived_at: nil) }

Expand Down
8 changes: 8 additions & 0 deletions app/models/guest_group.rb
Expand Up @@ -5,6 +5,14 @@ def id_and_subgroup_ids
Array(id)
end

def subgroups
Group.none
end

def documents
Document.none
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm I wonder if we want to move the has_many :documents line from FormalGroup to Group instead of doing this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

end

def group_privacy=(term)
raise 'guest groups cant be open' if term == 'open'
super
Expand Down
2 changes: 1 addition & 1 deletion app/models/membership.rb
Expand Up @@ -89,6 +89,6 @@ def discussion_readers
private

def set_volume
self.volume = user.default_membership_volume if group.is_formal_group?
self.volume = user.default_membership_volume if id.nil? && group.is_formal_group?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh hmm what is this change needed for?

I think more idiomatic is self.new_record?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've working to remove all unnecessary N+1 queries. Reading volume triggers a query in a couple of places (checking default membership volume and group volume) and it also pollutes the export because it returns a value that isn't actually what the column actually contains.

So that's why I've moved to using a method that clearly says it's giving a computed value for volume rather than overloading the simple accessor. I like it more and it hasn't really been a problem to change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I was referring to the id.nil? addition.

end
end
1 change: 1 addition & 0 deletions app/models/outcome.rb
Expand Up @@ -5,6 +5,7 @@ class Outcome < ApplicationRecord
include Reactable
include Translatable
include HasCreatedEvent
include HasEvents

set_custom_fields :calendar_invite, :event_summary, :event_description, :event_location

Expand Down
2 changes: 1 addition & 1 deletion app/models/user.rb
Expand Up @@ -21,7 +21,7 @@ class User < ApplicationRecord
demo_bot: ENV['DEMO_BOT_EMAIL'] || 'contact+demo@loomio.org'
}.freeze

devise :database_authenticatable, :recoverable, :registerable, :rememberable, :trackable
devise :database_authenticatable, :recoverable, :registerable, :rememberable
attr_accessor :recaptcha
attr_accessor :restricted
attr_accessor :token
Expand Down
4 changes: 0 additions & 4 deletions app/serializers/discussion_reader_serializer.rb
Expand Up @@ -33,10 +33,6 @@ def discussion_reader_id
object.id
end

def discussion_reader_volume
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets used in discussion_model.coffee, I wonder if we need to account for that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, wonder if volume changes are not strictly necessary for group export

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A flag that this needs to be addressed in some way before merge; we can't be referencing discussionReaderVolume anywhere on the client if this method is taken out

object.volume
end

def seen_by_count
object.discussion.seen_by_count
end
Expand Down
68 changes: 68 additions & 0 deletions app/services/group_export_service.rb
@@ -0,0 +1,68 @@
class GroupExportService
RELATIONS = %w[
all_users
all_events
all_notifications
all_documents
all_reactions
memberships
membership_requests
discussions
polls
poll_options
poll_did_not_votes
poll_unsubscriptions
outcomes
stances
stance_choices
discussion_readers
comments
]

JSON_PARAMS = { groups: {methods: [:type]},
users: {except: [:encrypted_password,
:reset_password_token,
:email_api_key,
:reset_password_token,
:unsubscribe_token] }}.with_indifferent_access.freeze

def self.export(groups, filename)
ids = Hash.new { |hash, key| hash[key] = [] }
File.open(filename, 'w') do |file|
groups.each do |group|
puts_record(group, file, ids)
RELATIONS.each do |relation|
puts "Exporting: #{relation}"
group.send(relation).find_each(batch_size: 20000) do |record|
puts_record(record, file, ids)
end
end
end
end
end

def self.export_filename_for(group)
"tmp/#{DateTime.now.strftime("%Y-%m-%d_%H-%M-%S")}_#{group.name.parameterize}.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note that the group.name.parameterize will fail for guest groups, which we should be able to export just fine.

end

def self.puts_record(record, file, ids)
table = record.class.table_name
return if ids[table].include?(record.id)
ids[table] << record.id
file.puts({table: table, record: record.as_json(JSON_PARAMS[table])}.to_json)
end

def self.import(filename)
tables = File.open(filename, 'r').map { |line| JSON.parse(line)['table'] }.uniq
Copy link
Contributor

@gdpelican gdpelican Jun 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some stuff that smells a bit here, but I haven't got a better solution off the top of my head given that (I'm assuming) we don't want to port all of the file into memory. I think json parsing each line of the file N+1 times is going to hurt us, and wonder if there's a way to avoid it.

I wouldn't be opposed to, say, iterating through each line of the file, parsing it, and then,
if the table matches the current table, klass.new and append it to an array of records
if the table doesn't match, klass.import, flush the existing array, and start a new one with the new table

Then we only iterate through the file once, run JSON.parse once per line, and still maintain support for weirdo input like a stray group in a middle of a big run of discussions or whatnot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS I believe import will silently fail imports with ids that exist already, which would be the same behaviour as this would exhibit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import totally fails if some of the ids already exist. so the check is necessary.

Thanks for helping to try and improve, but i think I'd like to call it as good enough for now. It's the best solution (fastest export for large groups by miles) after quite a few attempts and I'd like to move on.

tables.each do |table|
klass = table.classify.constantize
existing_ids = klass.pluck(:id)
new_records = File.open(filename, 'r').map do |line|
data = JSON.parse(line)
next unless (data['table'] == table && !existing_ids.include?(data['record']['id']))
klass.new(data['record'])
end.compact!
klass.import(new_records, validate: false)
end
end
end
5 changes: 5 additions & 0 deletions app/services/group_service.rb
Expand Up @@ -44,6 +44,11 @@ def self.move(group:, parent:, actor:)
EventBus.broadcast('group_move', group, parent, actor)
end

def self.export(group: , actor: )
actor.ability.authorize! :show, group
GroupExportJob.perform_later(group, actor)
end

def self.merge(source:, target:, actor:)
actor.ability.authorize! :merge, source
actor.ability.authorize! :merge, target
Expand Down
2 changes: 2 additions & 0 deletions app/services/user_service.rb
Expand Up @@ -28,6 +28,8 @@ def self.delete_many_spam(name_fragment)

def self.delete_spam(user)
# destroyed (cascade delete)
raise "no deletey admin plezse" if user.is_admin?

Group.where(creator_id: user.id).destroy_all
Poll.where(author_id: user.id).destroy_all
Discussion.where(author_id: user.id).destroy_all
Expand Down
1 change: 1 addition & 0 deletions app/views/user_mailer/group_export_ready.html.haml
@@ -0,0 +1 @@
%p= t :'user_mailer.group_export_ready.body_html', url: @document.file.url
Expand Up @@ -11,6 +11,18 @@ angular.module('loomioApp').directive 'groupActionsDropdown', ->
replace: true
controller: ['$scope', ($scope) ->

$scope.canExportData = ->
Session.user().isMemberOf($scope.group)

$scope.openGroupExportModal = ->
ModalService.open 'ConfirmModal', confirm: ->
submit: -> Records.groups.export($scope.group.id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tend towards actions on the model, so

# group_model.coffee
export: =>
  @remote.post(@id, 'export')
submit: $scope.group.export

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks

text:
title: 'group_export_modal.title'
helptext: 'group_export_modal.body'
submit: 'group_export_modal.submit'
flash: 'group_export_modal.flash'

$scope.canAdministerGroup = ->
AbilityService.canAdministerGroup($scope.group)

Expand Down