
Core Arbiter Notes 2.0


There are two sides to the core arbiter: a client and a server. The server runs in an infinite loop, accepting messages from clients. There is one client per application communicating with the server.

Public APIs

Client

  • CoreArbiterClient(serverSocketPath): The constructor establishes a connection with the server listening at serverSocketPath.
  • void setNumCores(requestedNumCores): Asks the server for requestedNumCores exclusive cores. If cores are available, the server will wake up blocked threads and place each of them on its own exclusive core.
  • void blockUntilCoreAvailable(): Called by a thread that wants access to an exclusive core. This call blocks and only returns when the server has placed the thread on a core.
  • bool shouldReleaseCore(): Returns true if the application needs to release a core. It is okay if this returns true to multiple threads, because blockUntilCoreAvailable() checks coreReleaseCount against coreReleaseRequestCount and increments it in a critical section before actually blocking.
  • int getOwnedCoreCount(): Returns the number of cores the application currently owns. Note that this call is inherently racy, as described in the problems section below. We expect that the application will use hysteresis to account for the delayed effect of changing the number of cores.
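
A minimal sketch of how an application might drive this API. The header name and socket path are assumptions; only the method names come from this page.

```cpp
#include "CoreArbiterClient.h"   // assumed header name

// Worker thread: parks on blockUntilCoreAvailable() and yields its core
// back when the server asks for it via shouldReleaseCore().
void workerThread(CoreArbiterClient* client) {
    client->blockUntilCoreAvailable();   // returns once we own an exclusive core
    while (true) {
        // ... do application work on the exclusive core ...

        // Periodically check whether the server wants a core back; if so,
        // block again so the server can reclaim this one.
        if (client->shouldReleaseCore()) {
            client->blockUntilCoreAvailable();
        }
    }
}

int main() {
    CoreArbiterClient client("/tmp/coreArbiterSocket");  // hypothetical socket path
    client.setNumCores(4);   // ask the server for 4 exclusive cores
    // ... spawn threads running workerThread(&client), using
    //     getOwnedCoreCount() with hysteresis to decide when to resize ...
}
```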

Server

  • CoreArbiterServer(socketPath, sharedMemPath, exclusiveCores): The constructor sets up all the state necessary to put threads in dedicated cpusets and to accept client connections. If anything in the setup fails, the server exits with an error message.
  • startArbitration(): Handles client requests. This method does not return.
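
A sketch of server startup using the two calls above; the concrete paths, the core list, and its std::vector<int> type are assumptions.

```cpp
#include "CoreArbiterServer.h"   // assumed header name
#include <vector>

int main() {
    // Cores the server may hand out exclusively; core 0 is left as the
    // shared ("unworthy") core in this example.
    std::vector<int> exclusiveCores = {1, 2, 3, 4, 5, 6, 7};

    CoreArbiterServer server("/tmp/coreArbiterSocket",     // socket clients connect to
                             "/tmp/coreArbiterSharedMem",  // shared memory advertised to clients
                             exclusiveCores);
    server.startArbitration();   // never returns; serves client requests forever
}
```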

Communication Between Client and Server

In general, the client communicates with the server via Unix domain sockets, and the server communicates with the client via shared memory.

Initial handshake

  • Client: connect(serverSocket), send("new process", processId)
  • Server: accept(), send(sharedMemPath)
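
A rough sketch of the client's side of this handshake over a Unix domain socket. Only the sequence of steps comes from these notes; the wire format (a tag byte followed by the pid) and the omitted error handling are assumptions.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>
#include <string>

// Connect to the server, announce this process, and receive the path of
// the shared memory region. Error checking omitted for brevity.
std::string initialHandshake(const char* serverSocketPath, pid_t processId) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    sockaddr_un addr{};
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, serverSocketPath, sizeof(addr.sun_path) - 1);
    connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));

    uint8_t newProcessTag = 0;                        // hypothetical "new process" tag
    send(fd, &newProcessTag, sizeof(newProcessTag), 0);
    send(fd, &processId, sizeof(processId), 0);

    char sharedMemPath[256] = {};                     // server replies with sharedMemPath
    recv(fd, sharedMemPath, sizeof(sharedMemPath) - 1, 0);
    return std::string(sharedMemPath);
}
```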

Block Thread

If this thread has never communicated with the server before, it first needs to set up its own connection with the server:

  • Client: connect(serverSocket), send("new thread", processId, threadId)
  • Server: accept()

The thread can then inform the server that it's blocking and wait to receive a message from the server.

  • Client: send("blocking")
  • (Later, when the server wants to give the client a core) Server: send("wakeup")
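
The thread-side exchange might look like the following sketch, which reuses the illustrative tag-based encoding from the handshake sketch above; only the ordering of the messages comes from these notes.

```cpp
#include <sys/socket.h>
#include <cstdint>

// Tell the server this thread is blocking, then wait for "wakeup".
void blockOnThreadSocket(int threadSocketFd) {
    uint8_t blockingTag = 2;   // hypothetical "blocking" tag
    send(threadSocketFd, &blockingTag, sizeof(blockingTag), 0);

    uint8_t reply = 0;
    recv(threadSocketFd, &reply, sizeof(reply), 0);
    // Once the "wakeup" reply arrives, the server has moved this thread
    // into an exclusive core's cpuset and it may start running there.
}
```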

Request Cores

  • Client: send("requesting cores", numCores)
  • Server (to the threads that it wants to wake up): send("wakeup")

Taking Cores Back from Client

  • Server increments the coreReleaseRequestCount in shared memory by however many cores it wants back from the client.
  • The application is responsible for periodically checking the value returned by getCoreReleaseRequestCount() and for calling blockUntilCoreAvailable() on that many threads. Internally, the client compares coreReleaseRequestCount (in shared memory) to its own counter coreReleaseCount, the total number of cores it has given back to the server over its lifetime, and returns coreReleaseRequestCount - coreReleaseCount to the application. coreReleaseCount is incremented by a call to blockUntilCoreAvailable() iff coreReleaseRequestCount - coreReleaseCount > 0.
  • After incrementing coreReleaseRequestCount, the server sets a timer. If the client has not released all of its requested cores when the timer goes off, the server demotes the client's threads.
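
A sketch of this bookkeeping. The counter names come from these notes, while the use of std::atomic for the shared-memory counter and a mutex for the critical section are assumptions.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

struct ReleaseState {
    std::atomic<uint64_t>* coreReleaseRequestCount;  // in shared memory, incremented by the server
    uint64_t coreReleaseCount = 0;                   // cores this client has given back, ever
    std::mutex lock;                                 // guards coreReleaseCount
};

// shouldReleaseCore(): true if the server has asked for more cores back
// than this client has released so far. May report true to many threads.
bool shouldReleaseCore(ReleaseState& s) {
    return s.coreReleaseRequestCount->load() > s.coreReleaseCount;
}

// The check-and-increment done inside blockUntilCoreAvailable() before a
// thread actually blocks: the release only counts if the server is still
// owed a core at that moment.
bool countReleaseIfRequested(ReleaseState& s) {
    std::lock_guard<std::mutex> guard(s.lock);
    if (s.coreReleaseRequestCount->load() > s.coreReleaseCount) {
        s.coreReleaseCount++;
        return true;
    }
    return false;
}
```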

Problems

  • What if server is down?
    • First thought: make every client operation a no-op. This isn't a good idea because you wouldn't be able to ramp down; blockUntilCoreAvailable wouldn't actually block any threads.
    • Could have the client fork and start up a server if it can't find one. If this fails, the client can throw an exception every time you try to do something.
    • John prediction: "If there's no arbiter and there's no contention on cores (that we know of), why do we care?"
  • The application needs some sort of getNumCores() method to know how many cores it wants to ramp up/down to. Getting this number is inherently racy, but that wouldn't matter if the ramp-up detection method were reliable and efficient. Threads could just check, after acquiring a lock, how many cores they have and whether they still need to ramp up/down. The issue with the current approach is that a thread may decide that it still needs to ramp up and therefore ask for more cores than it actually needs.
    • Possible solution: Prevent the client from calling setNumCores too often. This could be timing based (keep track of the last time the number of cores changed). Is there another heuristic we could use?
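
A sketch of that timing-based heuristic; the interval and the wrapper class are illustrative only.

```cpp
#include <chrono>

// Rate-limits how often the application is allowed to call setNumCores().
class CoreCountThrottle {
  public:
    // True if enough time has passed since the last core-count change.
    bool mayChangeCoreCount() {
        auto now = std::chrono::steady_clock::now();
        if (now - lastChange < minInterval) {
            return false;    // changed too recently; skip this adjustment
        }
        lastChange = now;
        return true;
    }

  private:
    std::chrono::steady_clock::time_point lastChange{};
    std::chrono::milliseconds minInterval{50};   // arbitrary example interval
};
```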

Questions

  • Should we try to recover if cpusets are unavailable?
  • Is it reasonable for the client to throw a general exception whenever something in its communication with the server goes wrong?
  • There needs to be some sort of contract between the client and server for how often the client checks whether it needs to release cores. What is the relationship between the server's timeout and the client's polling interval?
  • What should we do if any of the messages from the client to the server fail?

Arbitration Policy

There are essentially two policy axes:

  1. How to divide exclusive cores (evenly or early bird gets the worm)
  2. Whether other threads run on the unworthy core or do not run at all

Must have, same priority

  • Early bird gets the worm; everyone else runs on unworthy core
  • Round robin gang scheduling
  • Make cores unexclusive
  • Even division of cores (others could either block or run on unworthy core)

Must have, different priorities

  • Run as many of the higher priority's threads as possible on exclusive cores, but put lower priorities on the unworthy core

Nice to have, same priority

  • Even division of cores; other threads don't unblock on unworthy core

Nice to have, different priorities

  • Starve lower priorities. If an application asks only for nice-to-have threads, this means it might not run at all.
  • (Using the same policy as must have here would lead to contention on the unworthy core)

Related notes/questions

  • If a thread is running on the unworthy core we should communicate this to the application somehow. The easiest thing to do would be to have a special sentinel core ID value.
  • Should the arbiter communicate that there is contention?
  • What does the client return for the number of cores a process owns if it has threads running on the unworthy core?