Conversation
def delete_attributes
  keys = ATTRIBUTES.map { |attr| storage_key(attr) }
  keys << storage_key(:user_set)
The responsibility of removing the members of the Service's Users set has been intentionally moved to User.
Is this key ever removed when deleting all users from a service?
Yes. When all elements of a set are removed then the set is automatically removed in Redis.
while !finished
  (current_scan_cursor_position, current_usernames) =
    storage.sscan(users_set_key, current_scan_cursor_position,
                  { count: PIPELINED_SLICE_SIZE })
Here I use the global constant PIPELINED_SLICE_SIZE, which was already defined, but the name might be a little misleading because we are not really doing pipelining here. Do you think we should create a different constant in the user.rb file to avoid confusion?
Do not create it in user.rb, since it has nothing to do with it. Create a new one in the storage module, with comments on why it is needed.
delete '/all' do |service_id|
  begin
    User.delete_all(service_id)
    { status: :deleted }
storage.sscan(users_set_key, current_scan_cursor_position,
              { count: PIPELINED_SLICE_SIZE })
delete_users_in_batch(service_id, current_usernames)
finished = current_scan_cursor_position == "0"
Can you assure that finished will eventually be true, no matter what? On a Redis connection error an exception will be raised, right?
On a Redis connection error an exception will be raised. However, regarding your question, Redis SCAN cannot always guarantee termination, as can be seen in the documentation https://redis.io/commands/scan:
The SCAN algorithm is guaranteed to terminate only if the size of the iterated collection remains bounded to a given maximum size, otherwise iterating a collection that always grows may result into SCAN to never terminate a full iteration.
This is easy to see intuitively: if the collection grows there is more and more work to do in order to visit all the possible elements, and the ability to terminate the iteration depends on the number of calls to SCAN and its COUNT option value compared with the rate at which the collection grows.
So yes, there's the danger of that, but it's inherent to the data (the collection could be growing very fast, etc...)
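The termination condition under discussion can be sketched with a minimal in-memory stand-in for `sscan` (the `FakeStorage` class and `scan_all` helper below are illustrative assumptions, not the apisonator code): the loop only stops when the returned cursor comes back as `"0"`.

```ruby
# Minimal in-memory stand-in for Redis SSCAN semantics: each call returns
# [next_cursor, slice]; a returned cursor of "0" signals a full iteration.
class FakeStorage
  def initialize(members, slice_size)
    @members = members.to_a
    @slice_size = slice_size
  end

  def sscan(_key, cursor, _opts = {})
    start = cursor.to_i
    slice = @members[start, @slice_size] || []
    next_cursor = start + @slice_size >= @members.size ? "0" : (start + @slice_size).to_s
    [next_cursor, slice]
  end
end

# The loop shape from the patch: keep scanning until the cursor comes back "0".
def scan_all(storage, key, batch_size)
  cursor = "0"
  seen = []
  finished = false
  until finished
    cursor, usernames = storage.sscan(key, cursor, count: batch_size)
    seen.concat(usernames) # the patch calls delete_users_in_batch(...) at this point
    finished = cursor == "0"
  end
  seen
end
```

If the underlying set keeps growing faster than the batches drain it, the real Redis cursor may never come back `"0"`, which is exactly the non-termination risk the documentation describes.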
  Memoizer.clear key
end

def self.delete_users_in_batch(service_id, usernames)
Why not use class method User.delete!(service_id, username) to implement this method?
Seems a good idea 👍 I'm going to implement it
Force-pushed 240ed8d to 22c588a
I've checked all your comments and answered them, implementing some of the requested changes. Could you review them? Thank you.
  assert_equal User.exists?(7003, "username#{i}"), false
end
assert_equal User.exists?(7004, "username_differentservice"), true
User.delete!(7004, "username_differentservice")
Clean-up should not be part of the test.
The cleanup is there because I did not want to leave the user in the DB when the test ends
There's a setup method above that cleans the DB.
That method runs before each test, so there's no need to clean up on every individual one.
I wanted to raise a discussion about a design decision made here that might not be right. This implementation deletes potentially big sets of data synchronously from the API endpoint call. Wouldn't a delete job be a better option? In addition, retries and error recovery can be done from offline tasks.
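For illustration only, the "delete job" alternative could look like this minimal sketch: the endpoint enqueues the service id and returns at once, while a worker performs the actual deletion. The `DeleteAllUsersJob` name and the in-process `Queue` are assumptions; apisonator's real background-job machinery would replace them.

```ruby
require "thread"

# Sketch of offloading the deletion: the API handler only enqueues,
# a worker drains the queue and can wrap the work in retry/backoff logic.
class DeleteAllUsersJob
  QUEUE = Queue.new

  def self.enqueue(service_id)
    QUEUE << service_id
  end

  # Runs the given deletion callable for every queued service id.
  # In a real system this loop would live in a background worker process.
  def self.drain(&delete_all)
    delete_all.call(QUEUE.pop) until QUEUE.empty?
  end
end
```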
Please also remember that when this is done, the Pisoni Gem should have this endpoint too 😄
users_set_key = service_users_set_key(service_id)
while !finished
  (current_scan_cursor_position, current_usernames) =
    storage.sscan(users_set_key, current_scan_cursor_position,
Have you considered using scard and spop operations to delete all elements of the set? At least it seems less risky than using cursors on potentially big data sets.
When this was implemented, we were using Redis server 2.8 in SaaS. The spop command did not support specifying the number of elements to delete until Redis 3.2 (https://redis.io/commands/spop), so we could not use this command, and that was the reason we opted for cursor-based deletion.
However, recently we have upgraded our Redis server version, so we could replace the cursor-based approach with scard and spop.
If we do this we would need to update the apisonator development Docker images to use a newer version of Redis too, because right now they still use 2.8 and would stop working.
Right now we have the following Redis versions:
SaaS: Redis 4.X
On-Prem: Redis 3.2
If we upgrade the Redis server version in the development images, I'd say to use Redis 3.2 (the lower version). This would mean that we would not be using in dev the same version as in SaaS, but we would be testing a version where all commands should be compatible in both. What do you think?
@miguelsorianod I think we should update the development images as you suggested so we can change this to use spop 👍
I upgraded the dev images and we are now using Redis 3.2.X for them. However, the Redis ruby client version that we use in some of our services is 3.2.2, and support for spop with a number of elements to pop was added in 3.3.2. At this moment we cannot upgrade to that version in those services due to a bug we found with reconnections (redis/redis-rb#668). We'll continue investigating that, and for the moment, to be able to proceed, we'll leave this with scan.
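For reference, the scard/spop variant discussed above could look like the sketch below, runnable here against an in-memory fake. The `FakeSetStore` class and key names are illustrative assumptions; SPOP with a count argument requires Redis >= 3.2 and a client that supports it (redis-rb >= 3.3.2, as noted).

```ruby
require "set"

# In-memory fake covering just the two commands the sketch needs.
class FakeSetStore
  def initialize(members)
    @set = Set.new(members)
  end

  def scard(_key)
    @set.size
  end

  # Redis >= 3.2: SPOP key [count] removes and returns up to `count` members.
  def spop(_key, count)
    popped = @set.take(count)
    popped.each { |m| @set.delete(m) }
    popped
  end
end

# scard/spop-based batched deletion: no cursor bookkeeping, and the loop
# ends exactly when the set is empty.
def delete_all_members(storage, key, batch_size)
  removed = []
  removed.concat(storage.spop(key, batch_size)) while storage.scard(key) > 0
  removed
end
```

Unlike SCAN, this variant shrinks the set on every round, so it terminates even if elements are being added concurrently, as long as removals outpace insertions.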
This feature should be merged. Anything else to be added/updated? Looks quite good to me.
  Memoizer.clear key
end

def self.delete_users_in_batch(service_id, usernames)
Methods defined under private with self.f are not private in Ruby. Just one of those things that might be confusing at first :)
Private class methods need to be defined under private inside class << self or using https://ruby-doc.org/core-2.5.3/Module.html#method-i-private_class_method
No need to change this now, as I see that there are other methods in this class that are supposed to be private but are not.
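A self-contained illustration of the pitfall and both fixes (the `Example` class is hypothetical):

```ruby
class Example
  private # affects instance methods only, NOT `def self.` methods

  # Pitfall: despite sitting under `private`, this stays public.
  def self.still_public
    :reachable
  end

  # Fix 1: mark it explicitly with private_class_method.
  def self.hidden_one
    :hidden
  end
  private_class_method :hidden_one

  # Fix 2: define it under `private` inside `class << self`.
  class << self
    private

    def hidden_two
      :hidden
    end
  end
end
```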
test '.delete_all removes all the Users of a Service' do
  Service.save!(id: 7003, provider_key: 'test_provkey', default_service: false)
  Service.save!(id: 7004, provider_key: 'test_provkey', default_service: false)
  num_users = 600
600 looks like an arbitrary number. I guess you decided this to be larger than the number of users to be deleted on each batch? If so, I think it would be better for readability to relate this variable with that constant, something like num_users = BATCH_SIZE + 1.
What do you think?
👍 . It was indeed to be larger than the number of users to be deleted on each batch
  assert_equal metrics, user.load_metric_names
end

test '.delete_all removes all the Users of a Service' do
I think we should also test the case of deleting the users of a service that does not have any. Just to make sure that nothing crashes.
User.delete_all(7003)

for i in 1..num_users
Nitpick: In Ruby (1..num_users).each { |n| ... } is more idiomatic.
Maybe 1.upto(num_users).each { |n| ... }? Just another option, valid as well.
I'd say that the only allowed option here is #times 😆
But #times is 0-based. If the index is needed and 1-based, as in the first implementation, I'd say 1.upto(num_users).each { |n| ... } is better.
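Side by side, the options discussed (all build the same 1-based sequence; `3` stands in for `num_users`):

```ruby
nums_for = []
for i in 1..3
  nums_for << i # `for` also leaks `i` into the enclosing scope afterwards
end

nums_each  = (1..3).map { |n| n }      # (1..num_users).each-style, 1-based
nums_upto  = 1.upto(3).map { |n| n }   # 1.upto(num_users), also 1-based
nums_times = 3.times.map { |n| n + 1 } # #times is 0-based, hence the + 1
```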
  end
end

delete '/services/:service_id/users/all' do
I think there's no need for the /all part.
# is desired/needed. Batching is performed when a lot
# of storage operations need to be performed and we
# want to minimize database blocking of other clients
BATCH_SIZE = 400
We already have a constant related to this one in a different module:
I think BATCH_SIZE should be where you placed it because it's clearly related to Storage. I think that's better than having a global constants module. But I also think that it would be clearer if we moved the other constant here. That could be done on a different PR.
example_request 'Deleting all Users of a Service' do
  expect(status).to eq 200
  expect(response_json['status']).to eq 'deleted'
  expect(ThreeScale::Backend::User.delete_all(service_id)).to be nil
I think that this expect assumes things that are not documented nor tested.
In the unit tests of User you did not check the return value of delete_all; you just verified that after calling it, the users no longer exist. I think that's fine, but here we should just do the same.
The alternative would be to define a return value for delete_all in docs/tests, but I don't think it's needed in this case.
What do you think?
I now check if the user has been deleted or not instead of the return value of the delete_all method
  end
end

delete '/all' do |service_id|
As @davidor suggested, maybe DELETE /internal/services/:service_id/users would be more RESTful. It can be implemented with:
delete '' do |service_id|
What happens if I have the following URL-constructing code:
http.delete("/internal/services/#{service_id}/users/#{username}") and username is nil? The resulting URL is /internal/services/:service_id/users/. Would it delete all users?
It shouldn't be a problem.
If you call /internal/services/:service_id/users/ then it would return a 404 not found because that URL has not been defined in the internal API.
For example, if I change the current internal api test code to execute this code (notice the slash at the end):
delete '/services/:service_id/users/' do
parameter :service_id, 'Service ID', required: true
example_request 'Deleting all Users of a Service' do
expect(status).to eq 200
expect(response_json['status']).to eq 'deleted'
expect(ThreeScale::Backend::User.exists?(service_id, username)).to be false
end
end
and I keep the internal api code as:
delete '' do |service_id|
begin
User.delete_all(service_id)
{ status: :deleted}.to_json
rescue => e
[400, headers, { status: :error, error: e.message }.to_json]
end
end
When I execute the test I get the following:
Failures:
1) Users (prefix: /services/:service_id/users) DELETE /services/:service_id/users/ Deleting all Users of a Service
Failure/Error: expect(status).to eq 200
expected: 200
got: 404
(compared using ==)
# ./spec/acceptance/api/internal/users_api_spec.rb:151:in `block (3 levels) in <top (required)>'
Finished in 0.90085 seconds (files took 0.68985 seconds to load)
144 examples, 1 failure
Failed examples:
rspec ./spec/acceptance/api/internal/users_api_spec.rb:150 # Users (prefix: /services/:service_id/users) DELETE /services/:service_id/users/ Deleting all Users of a Service
Force-pushed 22c588a to 3fb5af8
I made some changes and I think it could be merged.
def self.delete_users_in_batch(service_id, usernames)
  if usernames.size != 0
    service_usernames_keys = usernames.map { |username| self.key(service_id, username) }
    storage.del(service_usernames_keys)
I think these 2 should be grouped in a pipeline:

storage.pipelined do
  storage.del(service_usernames_keys)
  storage.srem(self.service_users_set_key(service_id), usernames)
end

Sorry, I didn't see this in my first review. All the rest looks good to me 👍
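The grouped version is runnable against a minimal fake whose `pipelined` simply yields (a real redis-rb client buffers both commands into a single round trip). The `FakePipelineStore` class and the key formats below are illustrative assumptions, not the apisonator key schema.

```ruby
require "set"

# Fake storage mimicking the three calls used here; `pipelined` just yields,
# whereas a real client would buffer both commands into one round trip.
class FakePipelineStore
  attr_reader :hashes, :sets

  def initialize(hashes, sets)
    @hashes = hashes # { user_key => attributes }
    @sets = sets     # { set_key  => Set of usernames }
  end

  def pipelined
    yield
  end

  def del(keys)
    Array(keys).each { |k| @hashes.delete(k) }
  end

  def srem(key, members)
    members.each { |m| @sets[key].delete(m) }
  end
end

# The grouped version from the review comment: delete the per-user keys and
# remove the usernames from the service's set inside one pipeline.
def delete_users_in_batch(storage, service_id, usernames)
  return if usernames.empty?
  user_keys = usernames.map { |u| "user/#{service_id}/#{u}" } # illustrative key format
  storage.pipelined do
    storage.del(user_keys)
    storage.srem("service_users/#{service_id}", usernames)
  end
end
```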
Add the BATCH_SIZE constant in the Storage module. This constant is to be used when batching of operations into the storage is desired/needed. Batching is performed when a lot of storage operations need to be performed and we want to minimize database blocking of other clients.
This method has been implemented in order to allow the clients of the User class to delete all Users of a Service. The implementation takes into account that the number of Users of a Service could be big, so the processing is done in batches.
Force-pushed 3fb5af8 to ebfaf37
38: Implement Service's Users removal r=davidor,eguzki a=miguelsorianod

Provide a method to delete all Users of a given Service. Also provide the HTTP method 'DELETE /services/:service_id/users/all' via the Internal API in order to expose this to apisonator administrators.

The decision NOT to delete the Users of a Service when removing the Service is intentional. This keeps coherence with other entities that are accessible and manageable via the Internal API, such as Application, Metrics, Application Keys, etc.

The removal takes into account that the number of Users of a Service can be big, so a batching strategy has been followed.

Co-authored-by: Miguel Soriano <msoriano@redhat.com>
Build succeeded