vchiq_arm.c: When registering the calling process use namespace pid #1279
Signed-off-by: Andrei Gherzan <andrei@resin.io>
|
We have a simple EGL & OMX application which runs fine on the host (rpi2). When running the same binary in a container, the application behaves strangely (it lacks transitions and other animation components) while "vcdbg log msg" complains with:
025819.942: *** No KHAN handle found for pid 24
025836.615: *** No KHAN handle found for pid 24
025853.285: *** No KHAN handle found for pid 24
When the container runs without a new process namespace, everything gets back to normal. Although I don't fully understand the stack, and I failed to see how this pid is passed or used in userland and/or vchiq, I pushed the patch as it is now. If someone can give me a short overview of this graphics architecture and how the pid member is used, it would be great. |
|
When a process dies it is vital that any resources it used are released. VCHIQ is the comms interface between the application and the graphics APIs, and it tells the GPU the PID associated with each client to enable the multiple connections from each process to be grouped as a bundle. I'm new to PID namespaces, but I can see what the aim is. However, I'm confused by your use of task_pid_vnr which returns the "virtual" (namespaced) PID - I would have expected you to use task_pid_nr to get the global pid, unless the problem is that some other part of the GL client APIs is using the virtual PID and failing to get a match. If so, I would have thought that changing that API/driver would be better. I'm concerned that by switching to the virtual PIDs we may accidentally get a collision between processes in different namespaces; explain to me why that isn't the case. |
|
Hi @pelwell. We co-discovered this issue with @agherzan.
Using
Indeed, we also think that this is the case. I tried to find where the PID is passed from the EGL libraries to the driver but I couldn't find where this happens as it's a big and unfamiliar codebase for me. On the EGL side however, the process will only be able to see its virtual PID. You can't get IDs of a parent namespace from inside a child namespace. This means that the driver will either have to do accounting based on a tuple
You're right and it is the case that there will be collisions. This patch is mostly provided to demonstrate the problem. |
|
Thanks for that clarification. I'm going to close this PR in case somebody gets tempted to merge it, but this is an issue we should look at. |
|
Would a minimal |
|
Of course - anything that helps us to focus on the problem is helpful. |
|
Hey @pelwell. Here is an easy way of reproducing this problem. First, compile the hello_triangle.bin example. If you run it by itself you should see a cube with some textures rotating on the screen. Next, use this little launcher to run the same binary in a new PID namespace:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FILENAME "./hello_triangle.bin"
#define STACK_SIZE (1024 * 1024)

static char child_stack[STACK_SIZE]; /* Space for child's stack */

static int childFunc(void *arg) {
    (void) arg;
    printf("childFunc(): PID = %ld\n", (long) getpid());
    printf("childFunc(): PPID = %ld\n", (long) getppid());
    execlp(FILENAME, FILENAME, (char *) NULL);
    perror("execlp"); /* Only reached if exec fails */
    return 1;
}

int main(void) {
    /* CLONE_NEWPID requires CAP_SYS_ADMIN, so run this as root */
    pid_t child_pid = clone(childFunc, child_stack + STACK_SIZE, CLONE_NEWPID | SIGCHLD, NULL);
    if (child_pid == -1) {
        perror("clone");
        return 1;
    }
    printf("PID returned by clone(): %ld\n", (long) child_pid);
    waitpid(child_pid, NULL, 0);
    return 0;
}
|