-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-1927: [Plasma] Add delete function #1427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cpp/src/plasma/store.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment should be indented as much as the following line
cpp/src/plasma/client.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In client.h we should probably clarify the behavior of this method. In particular, if a client calls Delete, that does not guarantee that the object will be deleted. In particular, if the object is in use by another client, then the Delete call is a no-op, right? Can you mention this in the documentation for this method in client.h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't guarantee the object will be deleted for some reasons.
cpp/src/plasma/store.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we already have a delete_objects method, does it make sense to also have a delete_object method? Is the behavior of this method different from calling delete_objects on a vector of one object ID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, they are similar. But in delete_object will issue the error status to client if delete object failed. As for delete_objects used by evict objects function, I don't want to impact it too much.
cpp/src/plasma/store.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment indentation
cpp/src/plasma/store.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment indentation
cpp/src/plasma/client.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we create a custom error for this so it can be caught in Python?
cc @wesm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, this looks good, thanks for the contribution! Can you please also create a unit test that tests the following two scenarios:
- make sure an error is raised if the object is in use
- test the "normal" case where the object is not in use and make sure it was actually deleted (maybe by trying to create an object with the same object id)
- test the case of an object getting deleted that doesn't exist yet.
|
Yes, I will try to create the unit test cases in next commitment.
… On 17 Dec 2017, at 11:40 AM, Philipp Moritz ***@***.***> wrote:
@pcmoritz commented on this pull request.
Hey, this looks good, thanks for the contribution! Can you please also create a unit test that tests the following two scenarios:
make sure an error is raised if the object is in use
test the "normal" case where the object is not in use and make sure it was actually deleted (maybe by trying to create an object with the same object id)
maybe test what happens if an object is deleted that doesn't exist yet.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub <#1427 (review)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/Afm26VC0v8ihrTN7qCYl8g60TEQ7M519ks5tBI0hgaJpZM4REiBf>.
|
robertnishihara
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! It could be done in a follow up PR, but we should expose this to the Python API and also add a test in https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_plasma.py
cpp/src/plasma/client.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error should probably just be "Object is in use."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed
cpp/src/plasma/client.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about not having the plasma store reply to the client? The client could just continue on. If it really wants to know whether the object was deleted or not, it can call Contains to check.
Are there use cases where you really want to know if the object was deleted or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Providing a blocked API is the most common thinking. If delete operation failed, client also need to know why the delete operation failed and do the next step according to the reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe the use case?
If you view the command as more of a "hint" that doesn't necessarily do anything but simply gives the plasma store more information that it can make use of, then a non-blocking API is natural.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, just update operation. And it can be implemented as: check contains, delete, check contains again to make it happen.
Currently, only two reasons: not sealed object and object is in use will make the DELETE operation failed. We can make it as non-blocking API. But when I look at other APIs such as CREATE, they also will provide the reply to the client. I think DELETE operation should follow the same way as CREATE operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's try out this approach and see if it's useful, though I think we should consider switching to "hint" semantics at some point.
cpp/src/plasma/eviction_policy.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to update the memory_used_ field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed.
4f146d8 to
c6df5be
Compare
|
+1 This looks good. The test failure looks unrelated (the node.js tests are failing a lot) |
|
We should also expose this through Python and add Python tests. |
|
Yeah, let's do this as a followup PR |
|
@JinHai-CN could you let me know your JIRA id on the ASF JIRA so I can assign this issue to you? |
|
@wesm ID: jinhai |
Hi, I just add the delete function for Plasma and tested. JIRA ticked:
https://issues.apache.org/jira/browse/ARROW-1927