Skip to content

Shared memory in HOMI#35

Closed
NaddiNadja wants to merge 9 commits into
mainfrom
homi-shm
Closed

Shared memory in HOMI#35
NaddiNadja wants to merge 9 commits into
mainfrom
homi-shm

Conversation

@NaddiNadja
Copy link
Copy Markdown
Contributor

This PR changes the setup from the current IPC setup for HOMI

  • From: using only sockets, both for connection/disconnection and for runtime requests and responses between the clients and the daemon. The main thread listens to requests and spawns worker threads for each new request. Threads were used to not block requests from other processes from completing, and as each client had its own socket_fd, there was no data races when completing requests concurrently.

  • To: using a socket to establish a connection from a client to the daemon. The daemon then maps out a piece of shared memory for that specific client, and starts a worker thread, which listens to and handles requests in that piece of shared memory. There is a segment of shared memory for each client, such that we can still use threads to complete requests from different processes concurrently without running into data races.

This PR does not change the client interface, and the any program written with the HOMI client before this commit (e.g. the test program from #27 ) still works.

@NaddiNadja NaddiNadja requested a review from safl March 19, 2026 13:58
Copy link
Copy Markdown
Member

@safl safl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially suggested using the uPCIe hugepage helpers on Discord, but looking at the code more closely they're not a good fit here. That said, consider POSIX shared memory (shm_open/mmap) over SysV (shmget/shmat). It's consistent with the POSIX semaphores already used and segments are visible in /dev/shm/ for debugging.
That said, this is probably mostly my bias after fiddling with mmap for hugepages in uPCIe. Are there advantages to the SysV approach here that I'm missing?

Comment thread homi/src/homi_proto.c Outdated

*hdr = shm->hdr;

if (hdr->payload_len > 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just return a pointer to shm->payload instead of allocating payload buffers? I am thinking that a benefit of shared-memory, is exactly that, the memory is shared, so as long as you are only reading the memory then you do not need to "copy it out of shared memory"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the client side, I will need to copy it out of shared memory if I want to be able to be able to use the information after sending another request, right? I can however change it, such that it's only in the client code that the data is copied out, since the daemon does not need to persist the data read.

Copy link
Copy Markdown
Contributor Author

@NaddiNadja NaddiNadja Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my current implementation, I have a small section of shared memory for each of the clients, and when they query something from the daemon, it copies the information into shared memory - e.g. if a client wanted extent info, the daemon would find it with xallib and copy it into shared memory. This was my "direct" translation of socket->shm.
However, yesterday I realised that I might be able to put the whole xal tree in shared memory, since shmat supports giving it an address, so maybe I could just give it the pointer to the xallib pool of extents? That way, xallib is responsible for creating the whole tree etc., and HOMI then just makes the memory pool shared. Do you think that would work? Because then I would not have to worry about copying to/from shared memory, I could just return the pointer to where the memory is (but I would maybe still have to copy the pointer into shared memory, so that the client knows where to read from)

Okay I don't think that approach would work. Would require changes to xallib directly

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I was maybe on the right track anyways, and I understand now why you want to switch to the POSIX-style of shared memory: for example to be compatible with xallib. The pools are already mmapped, so I could make this shared, and in that way make it accessible to the client processes. I will fiddle around with this a bit more

Comment thread homi/src/homid_ipc.c
goto failed;
}

wargs->shm->done = 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potential-race-on-done -- make it atomic?

I mention this since the semaphore guarantees the semaphore value, not the done value. Thus, I suspect that there might be a race.. thus to avoid it, possibly drop in an atomic store here atomic_store(&wargs->shm->done, 0);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would of course also require defining done as an atomic.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. If the semaphore does not guarantee that the rest of the shared memory is ready to be read, how can I ever make sure that the data I read from it is accurate? The done field is not updated across multiple threads in the daemon, it's only set to 1 by the client, when it exits - it could be called something like client_exited to make this more clear.

Comment thread homi/src/homid_ipc.c
if (!request) {
homid_log(LOG_ERR, "Error: Payload required for HELLOWORLD request");
goto exit;
if (wargs->shm->done) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potential-race-on-done -- make it atomic?

I mention this since the semaphore guarantees the semaphore value, not the done value. Thus, I suspect that there might be a race.. thus to avoid it, possibly drop in an atomic load here

if (atomic_load(&wargs->shm->done)) {
  break;
}

Copy link
Copy Markdown
Contributor

@karlowich karlowich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @safl's comments. Otherwise it looks clean to me.

@NaddiNadja
Copy link
Copy Markdown
Contributor Author

Are there advantages to the SysV approach here that I'm missing?

The only one I can think of is that I generate a shared memory segment for each client with a unique ID and by using IPC_PRIVATE, I get a kernel-assigned unique ID, so I don't have to worry about generating it myself. It does not seem that there is an equivalent for the POSIX approach, so moving to that, I would have to do some book keeping on this. I find it difficult to estimate whether that "inconvenience" is outweighed by the benefits of switching to POSIX?

Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
@NaddiNadja
Copy link
Copy Markdown
Contributor Author

Hold off from reviewing this, I have to try some things out now that I have rebased onto Siu's commit with the xallib caching :)

@NaddiNadja NaddiNadja marked this pull request as draft March 24, 2026 09:39
Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
There are informations on the homid struct that can be necessary to know
from the threaded workers that handle the requests from clients. The
homid_ipc_accept() function's argument is changed to be the whole homid
struct instead of just the connection struct, such that this can be
passed on to the worker arguments.

Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
A reader and writer function for shared memory, as using this for IPC
is more performant than using sockets.

Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
This commit changes the current from the current IPC setup for HOMI

- From: using only sockets, both for connection/disconnection and for
  runtime requests and responses between the clients and the daemon. The
  main thread listens to requests and spawns worker threads for each new
  request. Threads were used to not block requests from other processes
  from completing, and as each client had its own socket_fd, there was
  no data races when completing requests concurrently.

- To: using a socket to establish a connection from a client to the
  daemon. The daemon then maps out a piece of shared memory for that
  specific client, and starts a worker thread, which listens to and
  handles requests in that piece of shared memory. There is a segment of
  shared memory for each client, such that we can still use threads to
  complete requests from different processes concurrently without
  running into data races.

This commit does not change the client interface, and the any program
written with the HOMI client before this commit still works.

Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
With the move to shared memory, the read and write function for the
socket is no longer used or necessary in the shared protocol between the
HOMI client and daemon.

Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
Signed-off-by: Nadja Brix Koch <n.koch@samsung.com>
@safl safl force-pushed the main branch 2 times, most recently from 549d5c4 to 1f12ab7 Compare April 27, 2026 23:10
@safl safl mentioned this pull request May 12, 2026
@NaddiNadja
Copy link
Copy Markdown
Contributor Author

Replaced by #65

@NaddiNadja NaddiNadja closed this May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants