Skip to content

Implement Service's Users removal#38

Merged
bors[bot] merged 4 commits intomasterfrom
change/implement-all-service-users-removal
Feb 6, 2019
Merged

Implement Service's Users removal#38
bors[bot] merged 4 commits intomasterfrom
change/implement-all-service-users-removal

Conversation

@miguelsorianod
Copy link
Copy Markdown
Contributor

Provide a method to delete all Users of a given Service.

Also provide the HTTP method 'DELETE /services/:service_id/users/all' via Internal API in order to provide this to apisonator administrators.

The decision to NOT delete the Users of the Service when removing a Service is intentional. This has been done to follow coherence like with other entities that are accessible and manageable via Internal API, like with Application, Metrics, Application Keys, etc...

The removal of the Users of the Service has been done taking into account that the number of users of a Service can be big and a batching strategy has been followed.


def delete_attributes
keys = ATTRIBUTES.map { |attr| storage_key(attr) }
keys << storage_key(:user_set)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The responsability of removing the members of the Service's Users set has been intentionally moved to User

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this key ever removed when deleting all users from a service?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. When all elements of a set are removed then the set is automatically removed in Redis.

Comment thread lib/3scale/backend/user.rb Outdated
while !finished
(current_scan_cursor_position, current_usernames) =
storage.sscan(users_set_key, current_scan_cursor_position,
{ count: PIPELINED_SLICE_SIZE })
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I use the global constant named PIPELINED_SLICE_SIZE, that was already defined, but the name might be a little misleading because we are not really doing pipelining. Do you think we should create a different constant in the user.rb file to not cause confusion?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not create it in user.rb, since it has nothing to do with it - create a new one with comments on why is it needed in the storage module

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment thread app/api/internal/users.rb Outdated
delete '/all' do |service_id|
begin
User.delete_all(service_id)
{ status: :deleted}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.to_json ??

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

storage.sscan(users_set_key, current_scan_cursor_position,
{ count: PIPELINED_SLICE_SIZE })
delete_users_in_batch(service_id, current_usernames)
finished = current_scan_cursor_position == "0"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you assure that finished will be eventually true , no matter what? on redis connection error, exception will rise, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a Redis connection an Exception will rise. However, regarding your question, Redis scan cannot always guarantee the command execution termination as can be seen in the documentation https://redis.io/commands/scan:

The SCAN algorithm is guaranteed to terminate only if the size of the iterated collection remains bounded to a given maximum size, otherwise iterating a collection that always grows may result into SCAN to never terminate a full iteration.

This is easy to see intuitively: if the collection grows there is more and more work to do in order to visit all the possible elements, and the ability to terminate the iteration depends on the number of calls to SCAN and its COUNT option value compared with the rate at which the collection grows.

So yes, there's the danger of that, but it's inherent to the data (the collection could be growing very fast, etc...)

Memoizer.clear key
end

def self.delete_users_in_batch(service_id, usernames)
Copy link
Copy Markdown
Member

@eguzki eguzki Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use class method User.delete!(service_id, username) to implement this method?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a good idea 👍 I'm going to implement it

@miguelsorianod miguelsorianod force-pushed the change/implement-all-service-users-removal branch from 240ed8d to 22c588a Compare June 26, 2018 13:35
@miguelsorianod
Copy link
Copy Markdown
Contributor Author

I've checked all your commits and answered them, implementing some of the requested changes. Could you review them?

Thank you.

Comment thread test/unit/user_test.rb Outdated
assert_equal User.exists?(7003, "username#{i}"), false
end
assert_equal User.exists?(7004, "username_differentservice"), true
User.delete!(7004, "username_differentservice")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clean up should not be part of the test

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup is there because I did not want to leave the user in the DB when the test ends

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a setup method above that cleans the DB.
That method runs before each test, so there's no need to clean up on every individual one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@eguzki
Copy link
Copy Markdown
Member

eguzki commented Jun 26, 2018

I wanted to rise a discussion about the design decision made here that might not be right.

This implementation is deleting potentially big sets of data synchronously from the API endpoint call?

Delete job could be better option? In addition, retry and recover from errors can be done from offline tasks.

@Martouta
Copy link
Copy Markdown

Martouta commented Jul 2, 2018

Please also remember that when this is done, the Pisoni Gem should have this endpoint too 😄

users_set_key = service_users_set_key(service_id)
while !finished
(current_scan_cursor_position, current_usernames) =
storage.sscan(users_set_key, current_scan_cursor_position,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using scard and spop operations to delete all elements of the set? At least seems less risky than using cursors on potentially big data sets.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this was implemented, we were using Redis server 2.8 in SaaS. The spop command does not support specifying the number of elements to delete until Redis 3.2 (https://redis.io/commands/spop) so we could not use this command and that was the reason we opted to use cursor-based deletion.
However, recently we have upgraded our Redis server version so we could replace the cursor to use scard and spop.
If we do this we would need to update the apisonator development Docker images to use a newer version of Redis too because right now it is still using 2.8 and would stop working.
Right now we have the following Redis versions:
SaaS: Redis 4.X
On-Prem: Redis 3.2

If we upgrade Redis server version in the development images I'd say to use Redis 3.2 (the lower version). This would mean that we would not be using in dev the same version that in SaaS but we would be testing a version where all commands should be compatible in both versions. What do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miguelsorianod I think we should update the development images as you suggested so we can change this to use spop 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 . I've done this in #69

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I upgraded the dev images and we are now using Redis 3.2.X for our dev images. However, the Redis ruby client version that we use in some of our services is 3.2.2, but the support for spop with the number of elements to pop was added to 3.3.2. At this moment we cannot upgrade to that version in those services due to a bug we found with reconnections redis/redis-rb#668. We'll continue investigating on that and for the moment to be able to proceed we'll leave this with scan.

@eguzki
Copy link
Copy Markdown
Member

eguzki commented Jan 23, 2019

This feature should be merge.

Anything else to be added/updated?

Looks quite good to me

Memoizer.clear key
end

def self.delete_users_in_batch(service_id, usernames)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Methods defined under private with self.f are not private in Ruby. Just one of those things that might be confusing at first :)

Private class methods need to be defined under private inside class << self or using https://ruby-doc.org/core-2.5.3/Module.html#method-i-private_class_method

No need to change this now, as I see that there are other methods in this class that are supposed to be private but are not.

Comment thread test/unit/user_test.rb Outdated
test '.delete_all removes all the Users of a Service' do
Service.save!(id: 7003, provider_key: 'test_provkey', default_service: false)
Service.save!(id: 7004, provider_key: 'test_provkey', default_service: false)
num_users = 600
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

600 looks like an arbitrary number. I guess you decided this to be larger than the number of users to be deleted on each batch? If so, I think it would be better for readability to relate this variable with that constant, something like num_users = BATCH_SIZE + 1.
What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 . It was indeed to be larger than the number of users to be deleted on each batch

Comment thread test/unit/user_test.rb
assert_equal metrics, user.load_metric_names
end

test '.delete_all removes all the Users of a Service' do
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also test the case of deleting the users of a service that does not have any. Just to make sure that nothing crashes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment thread test/unit/user_test.rb Outdated

User.delete_all(7003)

for i in 1..num_users
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: In Ruby (1..num_users).each { |n| ... } is more idiomatic.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe 1.upto(num_users).each { |n| ... } ?? just another option, valid as well.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If say that the only allowed option here is #times 😆

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but #times is 0-based. If index is needed and 1-based as its first implementation, I'd say it is better

1.upto(num_users).each { |n| ... }

end
end

delete '/services/:service_id/users/all' do
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no need for the /all part.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

# is desired/needed. Batching is performed when a lot
# of storage operations are need to be performed and we
# want to minimize database blocking of other clients
BATCH_SIZE = 400
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a constant related to this one in a different module:

PIPELINED_SLICE_SIZE = 400

I think BATCH_SIZE should be where you placed it because it's clearly related to Storage. I think that's better than having a global constants module. But I also think that it would be clearer if we moved the other constant here. That could be done on a different PR.

example_request 'Deleting all Users of a Service' do
expect(status).to eq 200
expect(response_json['status']).to eq 'deleted'
expect(ThreeScale::Backend::User.delete_all(service_id)).to be nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this expect assumes things that are not documented nor tested.

In the unit tests of User you did not check the return value of delete_all, you just verified that after calling it, users no longer exists. I think that's fine, but the here we should just do the same.

The alternative would be to define a return value for delete_all in docs/tests, but I don't think it's needed in this case.
What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now check if the user has been deleted or not instead of the return value of the delete_all method

Comment thread app/api/internal/users.rb Outdated
end
end

delete '/all' do |service_id|
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as @davidor suggested, maybe

DELETE /internal/services/:service_id/users would be more RESTful

can be implemented by

delete '' do |service_id|

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I have a following URL constructing code:

http.delete("/internal/services/#{service_id}/users/#{username}")

and username is nil? So the resulting URL is /internal/services/:service_id/users/ ? Would it delete all users?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be a problem.

If you call /internal/services/:service_id/users/ then it would return a 404 not found because that URL has not been defined in the internal API.

For example, if I change the current internal api test code to execute this code (notice the slash at the end):

  delete '/services/:service_id/users/' do
    parameter :service_id, 'Service ID', required: true

    example_request 'Deleting all Users of a Service' do
      expect(status).to eq 200
      expect(response_json['status']).to eq 'deleted'
      expect(ThreeScale::Backend::User.exists?(service_id, username)).to be false
    end
  end

and I keep the internal api code as:

        delete '' do |service_id|
          begin
            User.delete_all(service_id)
            { status: :deleted}.to_json
          rescue => e
            [400, headers, { status: :error, error: e.message }.to_json]
          end
        end

When I execute the test I get the following:

Failures:

  1) Users (prefix: /services/:service_id/users) DELETE /services/:service_id/users/ Deleting all Users of a Service
     Failure/Error: expect(status).to eq 200
     
       expected: 200
            got: 404
     
       (compared using ==)
     # ./spec/acceptance/api/internal/users_api_spec.rb:151:in `block (3 levels) in <top (required)>'

Finished in 0.90085 seconds (files took 0.68985 seconds to load)
144 examples, 1 failure

Failed examples:

rspec ./spec/acceptance/api/internal/users_api_spec.rb:150 # Users (prefix: /services/:service_id/users) DELETE /services/:service_id/users/ Deleting all Users of a Service

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@miguelsorianod miguelsorianod force-pushed the change/implement-all-service-users-removal branch from 22c588a to 3fb5af8 Compare January 24, 2019 18:15
@miguelsorianod
Copy link
Copy Markdown
Contributor Author

miguelsorianod commented Jan 25, 2019

I made some changes and I think it could be merged.
How do you see it?

Comment thread lib/3scale/backend/user.rb Outdated
def self.delete_users_in_batch(service_id, usernames)
if usernames.size != 0
service_usernames_keys = usernames.map { |username| self.key(service_id, username) }
storage.del(service_usernames_keys)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these 2 should be grouped in a pipeline:

          storage.pipelined do
            storage.del(service_usernames_keys)
            storage.srem(self.service_users_set_key(service_id), usernames)
          end

Sorry, I didn't see this in my first review. All the rest looks good to me 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Add the BATCH_SIZE constant in the Storage module.
This constant is to be used when batching of operation
into the storage is desired/needed. Batching is
performed when a lot of storage operations are need
to be performed and we want to minimize database
blocking of other clients.
This method has been implemented in order to allow the clients
of the User class to delete all Users of a Service.
The implementation has been done taking into account that
the number of Users of a Service could be big and the processing is
done in batches
@miguelsorianod miguelsorianod force-pushed the change/implement-all-service-users-removal branch from 3fb5af8 to ebfaf37 Compare February 5, 2019 15:58
@eguzki
Copy link
Copy Markdown
Member

eguzki commented Feb 6, 2019

bors r=@davidor,@eguzki

bors Bot added a commit that referenced this pull request Feb 6, 2019
38: Implement Service's Users removal r=davidor,eguzki a=miguelsorianod

Provide a method to delete all Users of a given Service.

Also provide the HTTP method 'DELETE /services/:service_id/users/all' via Internal API in order to provide this to apisonator administrators.

The decision to NOT delete the Users of the Service when removing a Service is intentional. This has been done to follow coherence like with other entities that are accessible and manageable via Internal API, like with Application, Metrics, Application Keys, etc...

The removal of the Users of the Service has been done taking into account that the number of users of a Service can be big and a batching strategy has been followed.

Co-authored-by: Miguel Soriano <msoriano@redhat.com>
@bors
Copy link
Copy Markdown
Contributor

bors Bot commented Feb 6, 2019

Build succeeded

@bors bors Bot merged commit ebfaf37 into master Feb 6, 2019
@bors bors Bot deleted the change/implement-all-service-users-removal branch February 6, 2019 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants