New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maintenance feature to remove old users, scheduled in settings #5288

Merged
merged 1 commit into from Oct 16, 2014

Conversation

Projects
None yet
5 participants
@jaywink
Copy link
Contributor

jaywink commented Oct 5, 2014

Submitting for initial review even though tests are not finished yet.

This is a maintenance feature, hopefully one of future many, to remove old user accounts. See discussion here: https://www.loomio.org/d/IgihNCe4/expiring-old-accounts . The aim is to help podmins maintain their pod size and thus guarantee resources for users who actually are active.

Basically:

  • Podmin enables feature and changes possible defaults (how many days inactivity, how many days of warning to user, how many to process max per day)
  • Cron job runs every day to check for any inactive users and sends warning emails to them + tags them with a timestamp when it is ok to remove said account + triggers a future job to Sidekiq for the actual removal (+1 day after remove_after timestamp)
  • User login will remove any "remove_after" timestamp
  • Sidekiq processes job in the future, and if user still has a remove_after timestamp, will close the account (I'm counting this means remove too, please correct if wrong)

Any jobs will not do any removals if podmin decides to turn the feature off after generating lots of removal warnings - the setting is checked before removal.

I'll add more tests tomorrow for the few methods used, just ran out of time today.


# deliver to be closed emails to account holders
# and queue accounts for closing to sidekiq
users.each do |user|

This comment has been minimized.

@jhass

jhass Oct 5, 2014

Member

users.find_each for improved memory usage ;)

users.each do |user|
user.flag_for_removal!
Maintenance.account_removal_warning(user).deliver
Workers::RemoveOldUser.perform_at((AppConfig.settings.maintenance.remove_old_users.warn_days.to_i+1).days, user.id)

This comment has been minimized.

@jhass

jhass Oct 5, 2014

Member

perform_in(seconds) or perform_at(datetime), you're doing perform_at(delay) there.

@jhass

This comment has been minimized.

Copy link
Member

jhass commented Oct 5, 2014

I think I would prefer a solution like Sidetiq instead of whenever since that requires no additional setup by the pod maintainer and should seamlessly work on PaaS stuff.

@Flaburgan

This comment has been minimized.

Copy link
Member

Flaburgan commented Oct 5, 2014

<3 <3 <3 you can't know how much I love you to work on this!

@jaywink jaywink force-pushed the jaywink:remove-old-users branch from 4175c91 to 914fd47 Oct 6, 2014

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 6, 2014

Two changes by @jhass done and also replaced Scheduler thingy with Sidetiq - yeah it's much more elegant. I didn't configure it any way special, any opinions on that? Also there is a web UI but don't really see what is the point in enabling it - would it hurt any memory etc?

Now tests..

@jhass

This comment has been minimized.

Copy link
Member

jhass commented Oct 6, 2014

I'd say we can stick with the default config until it becomes an issue. I think it'd be nice to load the web extension. It hooks into Sidekiq's webinterface which we already mount so it's just that require, can't do any harm. We deploy a forked application server by default, memory shouldn't be an issue ;) Not that I expect that it adds any relevant amounts.


it '#removes user whose remove_after timestamp has passed' do
@user = FactoryGirl.create(:user, :remove_after => Time.now-1.days)
expect(@user).to receive(:close_account!)

This comment has been minimized.

@jaywink

jaywink Oct 8, 2014

Contributor

Any idea how do I actually get this to work? :) Rspec is like black magic to me

This comment has been minimized.

@jhass

jhass Oct 8, 2014

Member

You can't rely on User.find(@user.id) to return the same object. One possibility is to use expect_any_instance(User) but a bit cleaner is to just stub User.find:

user = double(id: 1, remove_after: 1.day.ago) # no need for the instance variable btw
allow(User).to receive(:find).with(user.id).and_return(user)
expect(user).to receive(:close_account!)
Workers::RemoveOldUser.new.perform(user.id)

This technique should make the test a whole lot faster too.

This comment has been minimized.

@jaywink

jaywink Oct 9, 2014

Contributor

That totally works - thanks!

it '#doesnt remove user whose remove_after timestamp hasnt passed' do
@user = FactoryGirl.create(:user, :remove_after => Time.now+1.days)
expect(@user).not_to receive(:close_account!)
Workers::RemoveOldUser.new.perform(@user.id)

This comment has been minimized.

@jhass

jhass Oct 8, 2014

Member
user = double(id: 1, remove_after: 1.day.from_now)
allow(User).to receive(:find).with(user.id).and_return(user)
expect(user).to_not receive(:close_account!)
Workers::RemoveOldUser.new.perform(user.id)
enable: false
after_days: 730
warn_days: 30
limit_removals_to_per_day: 100

This comment has been minimized.

@jhass

jhass Oct 8, 2014

Member

Update the example file too please.

This comment has been minimized.

@jaywink

jaywink Oct 9, 2014

Contributor

Done

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 9, 2014

OK I think test coverage is probably high enough - comments, anything I missed? Wasn't sure how to call User.after_database_authentication by signing in user, so I called it directly. Devise claims to always call it on database authentication, but I think I'll have a look into that just to make sure.

Also diaspora.yml needs the new settings. Will need to work now, continue later in the evening.

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 9, 2014

Oh and also:

Rewrite user removal logic to also include users who have never signed in, but for these we don't send them a warning and queue removal in 1 day

Made sense to me at least.

end

it '#does not queue user that is not inactive' do
@user = FactoryGirl.create(:user, :last_seen => Time.now-728.days, :sign_in_count => 5)

This comment has been minimized.

@jhass

jhass Oct 9, 2014

Member

No need for the instance (@) variable here either.

it '#does not queue user that is not inactive' do
@user = FactoryGirl.create(:user, :last_seen => Time.now-728.days, :sign_in_count => 5)
Workers::QueueUsersForRemoval.new.perform
user = User.find(@user.id)

This comment has been minimized.

@jhass

jhass Oct 9, 2014

Member

I guess you mean user.reload

This comment has been minimized.

@jaywink

jaywink Oct 9, 2014

Contributor

Of course I meant that, how silly of me to forget.. krhm ;)

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 9, 2014

OK two things left at least

  • Make sure User.remove_after IS cleared when user returns. I'm guessing at the moment only when user authenticates. But does this matter? I guess it could if users are set to remember for longer than the delay for removal is (in theory). Will look into it anyway tomorrow.
  • Enable webview for sidetiq
@maxwell

This comment has been minimized.

Copy link
Member

maxwell commented Oct 11, 2014

Sorry for the delay in the review... I REALLY appreciate you working on this.

I think this is basically what we need.. there is a couple of things that could go a long way here.

  1. we should improve/verify the tests (if they are not already there) that close_account actually removes all of the contacts, likes, participations share_visibilities, aspect visibilities, and any other new models that have been recently added.

  2. I know when this gets scheduled on JD.com, its going to barf hardcore just because its going to be so much.... It might be better to just build out a rake task that people can CRON or whatever themselves... that way they can control the pace.

For example, I'd love to set the task to just delete 1000 expired users once an hour, (or something like this) that way I can churn through it slowly but surely without bringing everything to its knees. It will take awhile to clean things out, but it will be super helpful in spreading out the deletes so users don't get affected.

For small pods, this 1k limit might be something they never even notice.

Does that make sense? Thank you so much!

users = User.where("last_seen < ? and locked_at is null and remove_after is null",
Time.now - (AppConfig.settings.maintenance.remove_old_users.after_days.to_i).days)
.order(:last_seen)
.limit(AppConfig.settings.maintenance.remove_old_users.limit_removals_to_per_day)

This comment has been minimized.

@jhass

jhass Oct 11, 2014

Member

Hey @maxwell thanks for your comments, it would be great if you at least glance at the code and previous discussion first. I'm commenting on the line which implements your second point. This line limits the number of account deletions per day to a configurable maximum.

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 11, 2014

  1. we should improve/verify the tests (if they are not already there) that close_account actually removes all of the contacts, likes, participations share_visibilities, aspect visibilities, and any other new models that have been recently added.

Yeah have no idea really what exactly the close_account does except AFAIK it does trigger account deletion, so those things should be in scope and if they are not, then something has probably gone wrong somewhere when new stuff has been added :) But I think this is kinda out of scope of this PR.

  1. I know when this gets scheduled on JD.com, its going to barf hardcore just because its going to be so much.... It might be better to just build out a rake task that people can CRON or whatever themselves... that way they can control the pace.

What about a rake task in addition to the sidetiq scheduling - so in that case those who prefer could set up manual cronning and those who want to follow the sidetiq scheduling can use that. I'm guessing I can easily make a rake task to just call the worker directly, I'll try that tomorrow, busy today :(

I'm not sure if sidetiq schedule could be a configuration item, doesn't look to me so though I might be wrong. Will have a look at that too.

Thanks @maxwell for commenting - hoping that this would help somehow in cleaning up the big pods.

@jaywink jaywink force-pushed the jaywink:remove-old-users branch 2 times, most recently from df79090 to 816d514 Oct 14, 2014

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 14, 2014

Mmmmm so much complete, much can merge? :P

Well, two things;

  • I added a rake task now so that with this setting enabled, one can set it to run whenever one wants. It'll still run via sidetiq too, but the rake task could be used to better balance the load, for example as @maxwell said, 1000 users per hour. What would happen in reality would be that once a day an extra 1000 would be run by sidetiq. I assume this is reasonable?
  • Another thing, I checked out the way devise works and AFAIK the User.remove_after timestamp will not be removed on login if the user is authenticated with the rememberable strategy (https://github.com/plataformatec/devise/blob/master/lib/devise/strategies/database_authenticatable.rb#L13). What this means is that in_theory_, if a pod had a remember me thingy set to a loooong time and the remove after inactivity set to a shorter time, the user might be logged with an old remember me token and still get squashed by the removal. Of course the user would get a warning email, but that wouldn't help since the remove_after would not be removed at all.

Opinions? It's a theoritical risk if one does really crazy configuration really.

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 14, 2014

I suppose if wanted, I could make it also check last_seen for extra safety just before close_account here https://github.com/diaspora/diaspora/pull/5288/files#diff-fd589d584d3081f0f3b8565675d715c7R14

Something like and user.last_seen < Time.now-(AppConfig.settings.maintenance.remove_old_users.after_days.to_i).days-(AppConfig.settings.maintenance.remove_old_users.warning_days.to_i).days 💃

@Flaburgan

This comment has been minimized.

Copy link
Member

Flaburgan commented Oct 14, 2014

I could make it also check last_seen for extra safety just before close_account

Yeah that sounds like the "bug" we had for the stats, when someone logged without using the login page (he was remembered by the site). So, same problem, same solution, last seen seems nice :)

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 15, 2014

Extra safety added! So safe now!

@jhass

This comment has been minimized.

Copy link
Member

jhass commented Oct 15, 2014

Looking at Travis you want to add some Timecop.freeze magic to your specs.

@jhass

This comment has been minimized.

Copy link
Member

jhass commented Oct 16, 2014

I've been thinking, what do we want to do if email is not enabled?

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 16, 2014

I've been thinking, what do we want to do if email is not enabled?

Send an SMS? :) Really, there isn't much that can be done, except to maybe wait longer, but that doesn't seem justifiable.

Btw, when someone tries to log in with a closed account - is there a message shown or just login error?

@jhass

This comment has been minimized.

Copy link
Member

jhass commented Oct 16, 2014

iirc we lock the account, so you should get the account locked message.

What about printing a boot warning when the feature is enabled but mail isn't?

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 16, 2014

What about printing a boot warning when the feature is enabled but mail isn't?

Ah you meant POD email, of course :) I'm feeling a little slow today.. I thought you meant if the user email is a dummy (since we don't verify them).

Yes that would make sense I guess. I'm not sure how common it is for email not to be enabled for pods? But printing a warning would be a little something at least.

Would be nice to get warnings to the admin panel, so they would be seen when the podmin looks in there, every time.

@Flaburgan

This comment has been minimized.

Copy link
Member

Flaburgan commented Oct 16, 2014

Would be nice to get warnings to the admin panel, so they would be seen when the podmin looks in there, every time.

Yeah that's one of the item listed in #4800 : to be able to alert the podmin with a message where some conditions are present, like this setting without email enabled, or the version of the pod outdated compared to the version on available on github...

Maintenance feature to remove old users
Add Sidetiq webview to the Sidekiq monitoring panel

Add rake task maintenance:queue_users_for_removal
This basically just triggers an immediate run of the normal maintenance remove old users functionality that is normally (if enabled) scheduled to run once a day via sidetiq

Add extra safety when checking for user removal due to inactivity.
Now also user.last_seen will also be checked to make sure a user will not be removed in the event that the Devise rememember me login functionality has stopped the users remove_after timestamp from being removed.

Add initializer for maintenance job.
Add warning about mail being disabled if remove_old_users maintenance is enabled.

@jaywink jaywink force-pushed the jaywink:remove-old-users branch from 2320b0c to 69c3566 Oct 16, 2014

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Oct 16, 2014

Bang. Warning added as an initializer, timecop in business freezing the times. Your move travis.

@jhass jhass added this to the next-major milestone Oct 16, 2014

@jhass

This comment has been minimized.

Copy link
Member

jhass commented Oct 16, 2014

Alright, lgtm. Thank you!

jhass added a commit that referenced this pull request Oct 16, 2014

Merge pull request #5288 from jaywink/remove-old-users
Maintenance feature to remove old users, scheduled in settings

@jhass jhass merged commit f361a0f into diaspora:develop Oct 16, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details
@Flaburgan

This comment has been minimized.

Copy link
Member

Flaburgan commented Oct 17, 2014

Yay! Awesome!

@goobertron

This comment has been minimized.

Copy link

goobertron commented Oct 17, 2014

Thanks so much for doing this.

For some reason my line note has disappeared. I was asking whether user.last_seen < self.safe_remove_after in https://github.com/diaspora/diaspora/pull/5288/files#diff-fd589d584d3081f0f3b8565675d715c7R21 should be user.last_seen > self.safe_remove_after - in other words, the time elapsed since last account activity should be greater than (not less than) the lower limit for 'safe removal'. Of course if user.last_seen and self.safe_remove_after are fixed dates rather than periods, it is as it should be.

I've only just noticed the text of the email alert. I'd like to amend this if that's OK. It reads a bit brutally at the moment, I'd like to change it to something like 'it looks as though you don't want to use your account any more, so we'd like to remove it for you. If you want to keep it, all you have to do is to log in before %{remove_after}'. I'll make a separate PR for this.

@goobertron

This comment has been minimized.

Copy link

goobertron commented Dec 11, 2014

@maxwell, did you pull and try this yet? It would be great to know how it's working.

@jaywink jaywink deleted the jaywink:remove-old-users branch Dec 11, 2014

@jaywink

This comment has been minimized.

Copy link
Contributor

jaywink commented Dec 11, 2014

@goobertron @maxwell at least it works up until the flagging and sending warnings. My pod has had so far one user (test@fb.com ;)) who has been flagged and a notification sent. I think that might be a test user created for facebook app review back in the days.

Anyway, in 21 days the user will actually be deleted.

Btw, @maxwell, please if you need help with upkeeping jd, just ask. It's sad so see it running an old version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment