Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-2558: [Plasma] avoid walk through all the objects when a client disconnects #2015

Closed
wants to merge 3 commits into from

Conversation

@zhijunfu
Copy link
Contributor

commented May 9, 2018

Currently plasma stores list-of-clients in ObjectTableEntry, which is used to track which clients are using a given object, this serves for two purposes:

  • If an object is in use.
  • If the client trying to abort an object is the one who created it.

A problem with list-of-clients approach is that when a client disconnects, we need to walk through all the objects and remove the client pointer from the list for each object.

Instead, we could add a reference count in ObjectTableEntry, and store list-of-object-ids in client structure. This could both goals that the original approach is targeting, while when a client disconnects, it just walk through its object-ids and dereference each ObjectTableEntry, there's no need to walk through all objects.

zhijunfu added 2 commits May 9, 2018
@@ -134,8 +134,9 @@ struct ObjectTableEntry {
/// IPC GPU handle to share with clients.
std::shared_ptr<CudaIpcMemHandle> ipc_handle;
#endif
/// Set of clients currently using this object.
std::unordered_set<Client*> clients;
/// Reference count.

This comment has been minimized.

Copy link
@pcmoritz

pcmoritz May 9, 2018

Contributor

Better documentation is: Number of clients currently using this object.

This comment has been minimized.

Copy link
@zhijunfu

zhijunfu May 10, 2018

Author Contributor

Yes this is indeed better:)

/// Set of clients currently using this object.
std::unordered_set<Client*> clients;
/// Reference count.
int refcnt;

This comment has been minimized.

Copy link
@pcmoritz

pcmoritz May 9, 2018

Contributor

Let's rename this to ref_count, it's good to be not too cryptic ;)

This comment has been minimized.

Copy link
@zhijunfu

zhijunfu May 10, 2018

Author Contributor

sure.

// Tell the eviction policy that this object is being used.
std::vector<ObjectID> objects_to_evict;
eviction_policy_.begin_object_access(entry->object_id, &objects_to_evict);
delete_objects(objects_to_evict);
}
// Add the client pointer to the list of clients using this object.
entry->clients.insert(client);
// Increase refcnt.

This comment has been minimized.

Copy link
@pcmoritz

pcmoritz May 9, 2018

Contributor

// Increase reference count.

auto it = client->object_ids.find(entry->object_id);
if (it != client->object_ids.end()) {
client->object_ids.erase(it);
// Decrease refcnt.

This comment has been minimized.

Copy link
@pcmoritz

pcmoritz May 9, 2018

Contributor

// Decrease reference count.

@@ -533,19 +538,25 @@ void PlasmaStore::disconnect_client(int client_fd) {
// lists.
// TODO(swang): Avoid iteration through the object table.

This comment has been minimized.

Copy link
@pcmoritz

pcmoritz May 9, 2018

Contributor

can remove the TODO(swang) now

This comment has been minimized.

Copy link
@zhijunfu

zhijunfu May 10, 2018

Author Contributor

sure:)

@zhijunfu

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2018

Comments addressed. Thanks

@pcmoritz

This comment has been minimized.

Copy link
Contributor

commented May 15, 2018

Performance impact:

Without this PR:

[ 78.57%] ··· Running plasma.SimplePlasmaLatency.time_plasma_put                                                                                                             498.27ms
[ 85.71%] ··· Running plasma.SimplePlasmaLatency.time_plasma_putget                                                                                                          694.21ms
[ 92.86%] ··· Running plasma.SimplePlasmaThroughput.time_plasma_put_data                                                                                                           ok
[ 92.86%] ···· 
               ========== ==========
                 param1             
               ---------- ----------
                  1000     552.60μs 
                 100000    768.20μs 
                10000000    5.61ms  
               ========== ==========

With this PR:

[ 78.57%] ··· Running plasma.SimplePlasmaLatency.time_plasma_put                                                                                                             500.77ms
[ 85.71%] ··· Running plasma.SimplePlasmaLatency.time_plasma_putget                                                                                                          697.12ms
[ 92.86%] ··· Running plasma.SimplePlasmaThroughput.time_plasma_put_data                                                                                                           ok
[ 92.86%] ···· 
               ========== ==========
                 param1             
               ---------- ----------
                  1000     557.60μs 
                 100000    741.20μs 
                10000000    4.89ms  
               ========== ==========

So looks like the performance is approximately the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.