Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Execve #38

Closed
rennergade opened this issue Nov 19, 2019 · 12 comments
Closed

Execve #38

rennergade opened this issue Nov 19, 2019 · 12 comments
Assignees
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Milestone

Comments

@rennergade
Copy link
Contributor

Execve doesn't seem to work properly, as it does not execute the prompted binary.

Note: Execv here seems to also be broken, as the default environment its being sent is set as NULL, but should be an array of NULL. Should be a quick fix.

@rennergade rennergade self-assigned this Nov 19, 2019
@rennergade rennergade added the bug Something isn't working label Nov 19, 2019
@rennergade rennergade added this to the Bash/LAMP milestone Nov 19, 2019
@rennergade
Copy link
Contributor Author

rennergade commented Nov 22, 2019

What I'm noticing here is that although the fork/exec program is not faulting, and reaches succesful termination. The progress of the exec process is non-deterministic via the logs. A few examples (going to keep these to last 10 lines of logs or so)

[1554,3426412288:08:21:55.239362] Native Client module will be loaded at base address 0x000030ef00000000
[1554,3426412288:08:21:55.239368] madvising 0x30ef10050000, 0xeefb0000, MADV_DONTNEED
[1554,3426412288:08:21:55.239378] mprotecting 0x30ef10050000, 0xeefb0000, PROT_NONE
[1554,3426412288:08:21:55.239388] NaClPerfCounterInterval(NaClAppLoadFile PreAllocAddrSpace:*AllocAddrSpace): 122 microsecs
[1554,3426412288:08:21:55.239394] Loading into memory
[1554,3426412288:08:21:55.239401] loading segment 0
[1554,3426412288:08:21:55.239408] NaClElfImageLoad: checking descriptor mmap safety
[1554,3426412288:08:21:55.239413] PReading 196608 (0x30000) bytes to address 0x30ef00020000, position 65536 (0x10000)
[1554,3426412288:08:21:55.240679] Entered ParseResponse
[1554,3426412288:08:21:55.240693] ParseResponse isError=0, code=0, len=8
[1554,3426412288:08:21:55.240856] Entered ParseResponse
[1554,3426412288:08:21:55.240865] ParseResponse isError=0, code=0, len=8
Done persisting metadata.
Terminated
[1588,2041202432:08:24:32.500791] Elf Program headers
[1588,2041202432:08:24:32.500808] ==================================================
[1588,2041202432:08:24:32.500819] p_type: 1
[1588,2041202432:08:24:32.500833] p_offset: 10000
[1588,2041202432:08:24:32.500844] p_vaddr: 20000
[1588,2041202432:08:24:32.500854] p_paddr: 20000
[1588,2041202432:08:24:32.500866] p_filesz: 30000
[1588,2041202432:08:24:32.500876] p_memsz: 30000
[1588,2041202432:08:24:32.500885] p_flags: 5
[1588,2041202432:08:24:32.500897]  (PF_R  PF_X)
[1588,2041202432:08:24:32.500909] p_align: 10000
[1588,2041202432:08:24:32.500922] 
[1588,2041202432:08:24:32.500932] p_type: 1
[1588,2041202432:08:24:32.500942] p_offset: 40000
[1588,2041202432:08:24:32.500953] p_vaddr: 10020000
[1588,2041202432:08:24:32.500965] p_paddr: 10020000
[1588,2041202432:08:24:32.500976] p_filesz: a568
[1588,2041202432:08:24:32.500986] p_memsz: a568
[1588,2041202432:08:24:32.500998] p_flags: 4
[1588,2041202432:08:24:32.501008]  (PF_R  )
[1588,2041202432:08:24:32.501019] p_align: 10000
[1588,2041202432:08:24:32.501029] 
[1588,2041202432:08:24:32.501040] p_type: 1
[1588,2041202432:08:24:32.501050] p_offset: 4f9c0
[1588,2041202432:08:24:32.501061] p_vaddr: 1003f9c0
[1588,2041202432:08:24:32.501073] p_paddr: 1003f9c0
[1588,2041202432:08:24:32.501083] p_filesz: e40
[1588,2041202432:08:24:32.501095] p_memsz: 1090
Done persisting metadata.[1588,2041202432:08:24:32.501105] p_flags: 6
[1588,2041202432:08:24:32.501109]  (PF_R PF_W )
[1588,2041202432:08:24:32.501115] p_align: 10000
[1588,2041202432:08:24:32.501124] 
[1588,2041202432:08:24:32.501134] Looking at segment 0, type 0x1, p_flags 0x5
[1588,2041202432:08:24:32.501145] Matched nacl_phdr_check_data[1]
Terminated
[1656,2923636480:08:25:19.780201] NaClVmmapAdd(0x7f8090004530, 0xff000, 0x1000, 0x3, 0x2, 0x0, 0x0)
[1656,2923636480:08:25:19.780207] NaClVmmapEntryMake(0xff000,0x1000,0x3,0x2,0x0,0x0)
[1656,2923636480:08:25:19.780212] entry: 0x7f8090008fc0
[1656,2923636480:08:25:19.780216] NaClAppLoadFile done; [1656,2923636480:08:25:19.780219] NaClApp addr space layout:
[1656,2923636480:08:25:19.780226] nap->static_text_end    = 0x0000000000050000
[1656,2923636480:08:25:19.780230] nap->dynamic_text_start = 0x0000000000050000
[1656,2923636480:08:25:19.780234] nap->dynamic_text_end   = 0x0000000010020000
[1656,2923636480:08:25:19.780238] nap->rodata_start       = 0x0000000010020000
Done persisting metadata.
[1656,2923636480:08:25:19.780242] nap->data_start         = 0x000000001003f9c0
[1656,2923636480:08:25:19.780246] nap->data_end           = 0x0000000010040a50
[1656,2923636480:08:25:19.780250] nap->break_addr         = 0x0000000010040a50
[1656,2923636480:08:25:19.780254] nap->initial_entry_pt   = 0x00000000000212e0
[1656,2923636480:08:25:19.780258] nap->user_entry_pt      = 0x0000000000000000
[1656,2923636480:08:25:19.780262] nap->bundle_size        = 0x20
[1656,2923636480:08:25:19.780266] NaClPerfCounterInterval(NaClAppLoadFile __start__:EndLoadFile): 74534 microsecs
[1656,2923636480:08:25:19.780271] NaClRefCountUnref(0x7f8090006b40).
[1656,2923636480:08:25:19.780275] NaClDescIoDescDtor(0x7f8090006b40).
Terminated

In fact, the progress of the exec'ed cage seems different every time.

My hypothesis right now is that something is causing Lind to exit before the exec'd cage has a chance to execute, as we see the "Persisting..." and " Terminated" messages in each of these (Which signify Lind exiting).

@rennergade
Copy link
Contributor Author

The above problem was solved by adding a wait(NULL). The program was in fact terminating before the exec occurred. However, as discussed in today's meeting, this shouldn't cause the termination of the child process. I need to go over this a bit and talk to Yiwen about his initial implementation but I believe there is something about the initial cage setup that causes this to happen when cage 1 finishes first.

@rennergade
Copy link
Contributor Author

I realized that I characterized another error in today's meeting. I believed for a bit that exec'ing the initial cage causes Lind to boot out of itself, but that's not happening. In fact I had a working exec with a hello program here.

I'm almost positive the errors I was getting at that point are the same that I'm getting now with fork/exec'ing an ls program.

@rennergade
Copy link
Contributor Author

rennergade commented Nov 27, 2019

The fork/exec'd ls program is terminating at this line:

int c = getopt_long (argc, argv,
                           "abcdfghiklmnopqrstuvw:xABCDFGHI:LNQRST:UXZ1",
                           long_options, &oi);

It's a bit hard to tell what is causing this internally because since I don't have an easy way to debug into libc. Some of the source references I've looked at for getopt_long use malloc calls and others do not, so I'm still not certain that this is the same problem referenced in the Bash issue.

@rennergade
Copy link
Contributor Author

I tracked the line in getopt where it's faulting using the program counter (w/o the base address) and then used addr2line. It led me to a while loop dealing with optind and arguments.

My fork/exec/ls test was using ls with the argument -a and specifying the root directory / I recompiled a version where it ran ls with no args, and that worked!

This lets us know two things.

  1. This bug is separate than whatever is going on in Bash.
  2. Something is wrong with the argument loading in exec.

Hopefully this is easier to track down now.

@rennergade
Copy link
Contributor Author

rennergade commented Dec 5, 2019

Diving deeper it seems the args are in fact being copied incorrectly. There's a pretty wonky reallocation method and it seems like NaClUserToSysAddr may be being used incorrectly.

I think when copying argv and envp in we have to use either NaClUserToSysAddrRange or possibly NaClCopyOutToUser.

@rennergade
Copy link
Contributor Author

The error is definitely from address translation for argv (this also similarly effects envp). The issue is that there isn't a NaCl tool that adequately translates a double pointer (at least I haven't found one).
By the time we get to NaCl we don't explicitly know the lengths of argv and envp which makes translation hard.

The best solution I can think of is patching into glibc so we can submit the amount of args and environment variables.

@rennergade
Copy link
Contributor Author

The above is all fixed with the recent PR. All it took was some proper pointer manipulation to do the trick, with no glibc edits necessary.

I'm going to keep this open for a little. I don't think environment variables are being passed correctly, but I need to figure out more about how they exist in Lind in general. Running a simple program that prints all the environment variables has the same behavior when ran through lind solo, as well as with a fork/execve where we set an environment variables. Neither crash, but they also don't give any output.

@rennergade
Copy link
Contributor Author

This is part of the current refactor, documented here.

I still haven't examined environment variables much as I mentioned above, so that needs to be done. @jesings perhaps you can run a few tests exec'ing with ENV variables?

@rennergade rennergade added good first issue Good for newcomers help wanted Extra attention is needed labels Mar 25, 2020
@rennergade
Copy link
Contributor Author

I still haven't closed this one, since we haven't inspected ENV variables too deeply. I think this is a good first project for one of the new interns when the wiki/testing stuff is done.

@rennergade
Copy link
Contributor Author

This is an old issue that I didn't fully close because the Environment variable aspect of execve wasn't fully tested.

@mzhang15 It would be good to investigate this with a test case in the suite while you're putting that together.

@rennergade rennergade assigned kliu513 and unassigned mzhang15 Aug 4, 2020
@rennergade
Copy link
Contributor Author

I'm actually going to close this and open a new issue specifically on environment variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants