Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function undump locations recovery #36

Closed
wants to merge 2 commits into from

Conversation

Pavel-Durov
Copy link
Contributor

@Pavel-Durov Pavel-Durov commented Aug 10, 2023

Solves #32

  • Added function prototype location reconstruction as part of function
    undump process.
  • Moved yk functionality to lyk.h and lyk.c.

Note:

We still get errors such as:

yk-fork/ykllvm/llvm/lib/IR/Value.cpp:1101: void llvm::ValueHandleBase::RemoveFromUseList(): Assertion `*PrevPtr == this && "List invariant broken"' failed
...
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:1061: void llvm::ValueHandleBase::AddToUseList(): Assertion `Entry && "Value doesn't have any handles?"' failed
...
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:285: llvm::ValueName* llvm::Value::getValueName() const: Assertion `I != Ctx.pImpl->ValueNames.end() && "No name entry found!"' failed.

when we run ../src/lua -e"_U=true" all.lua.

@Pavel-Durov Pavel-Durov marked this pull request as ready for review August 10, 2023 19:46
@@ -267,6 +270,9 @@ static void loadFunction (LoadState *S, Proto *f, TString *psource) {
loadUpvalues(S, f);
loadProtos(S, f);
loadDebug(S, f);
#ifdef USE_YK
yk_set_locations(f);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main fix - set yk locations when loadFunction is called.
Prior to that change, when functions were loaded from a dump, yk locations were not initialised.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@@ -9,6 +9,7 @@

#ifdef USE_YK
#define _DEFAULT_SOURCE /* for reallocarray() */
#import "lyk.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still use rallocarray? If not, we can kill the _DEFAULT_SOURCE line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see we do, but in another file (lyk.c). Perhaps we should move that #define into lyk.h then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why, but it looks like reallocarray call works without defining _DEFAULT_SOURCE in lyk.h

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably another header is pulling it in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to lyk.c

src/lyk.c Outdated
#ifdef USE_YK
// YKOPT: Reallocating for every instruction is inefficient.
f->yklocs = reallocarray(f->yklocs, pc, sizeof(YkLocation));
assert(("Expected yklocs to be defined!", f->yklocs != NULL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen assert used like this :P

So with the comma, each expression is checked individually like && (so a string is always true)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, where is the initial allocation of f->yklocs? I don't see it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never used assert in C before 😶, I think its the same as assert(condition && "message");
But maybe we should use lua_assert here instead as that's the main assertion used in the codebase

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah. If lua has it's own assertions, probably best to use those.

Copy link
Contributor Author

@Pavel-Durov Pavel-Durov Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change to lua_assert but its just an alias:

#define lua_assert(c)           assert(c)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👉 b7fa8d2

src/lyk.c Outdated
* Other loops like while and repeat-until are harder to identify since they are based on OP_JMP instruction.
*/
int is_loop_start(Instruction i)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the bracing style like the rest of the Lua code? Looks like they put the open brace on the same line as the parent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👉 9d00f0c

Copy link
Contributor Author

@Pavel-Durov Pavel-Durov Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, I wasn't pushing to upstream.
the right commit is here 👉 bf85892

src/lyk.h Outdated
@@ -0,0 +1,19 @@
#include "lobject.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header should probably have a guard to ensure the header can be included more than once (and the second inclusion is a no-op).

The idiom usually looks like:

#ifndef __LYK_H
#define __LYK_H

... contents of header here ...

#endif // __LYK_H

Copy link
Contributor Author

@Pavel-Durov Pavel-Durov Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👉 bf85892

src/lyk.h Outdated

#include "lopcodes.h"

#ifdef USE_YK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these USE_YK checks required? And if so, should we just guard to whole file in this way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good idea I will guard to whole file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👉 bf85892

@vext01
Copy link
Contributor

vext01 commented Aug 11, 2023

A few minor bits, but on the whole, really good.

I think we should write a comment somewhere (maybe at the top of lyk.c?) that explains how lua allocates its bytecode and what the implications are for yk. Maybe a paragraph or two.

@Pavel-Durov
Copy link
Contributor Author

Added documentation about function prototype loading in lua 👉 7af1eb6

@Pavel-Durov
Copy link
Contributor Author

Ready for review @vext01

@Pavel-Durov
Copy link
Contributor Author

I think we can add more tests here https://github.com/ykjit/yklua/blob/main/.buildbot.sh#L52
We can't fun all.lua but we can run the test files individually.

@ltratt
Copy link
Contributor

ltratt commented Aug 12, 2023

I think we can add more tests here https://github.com/ykjit/yklua/blob/main/.buildbot.sh#L52
We can't fun all.lua but we can run the test files individually.

The more we can add in, the merrier! So if this PR does make more tests work, let's add 'em in.

@Pavel-Durov
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 13, 2023
@bors
Copy link
Contributor

bors bot commented Aug 13, 2023

try

Build failed:

@Pavel-Durov
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 13, 2023
@bors
Copy link
Contributor

bors bot commented Aug 13, 2023

try

Build failed:

@Pavel-Durov
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 13, 2023
@bors
Copy link
Contributor

bors bot commented Aug 13, 2023

try

Build failed:

src/lyk.c Outdated
@@ -0,0 +1,77 @@
#if USE_UK
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

USE_UK 🤦

@Pavel-Durov
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 13, 2023
@bors
Copy link
Contributor

bors bot commented Aug 13, 2023

try

Build failed:

@Pavel-Durov
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 13, 2023
@bors
Copy link
Contributor

bors bot commented Aug 13, 2023

try

Build failed:

@Pavel-Durov
Copy link
Contributor Author

bors try

@Pavel-Durov
Copy link
Contributor Author

squashed 👉 9facb37

@vext01
Copy link
Contributor

vext01 commented Aug 14, 2023

bors r+

bors bot added a commit that referenced this pull request Aug 14, 2023
36: Add function undump locations recovery r=vext01 a=Pavel-Durov

Solves #32

+ Added function prototype location reconstruction as part of function
undump process.
+ Moved yk functionality to lyk.h and lyk.c.

### Note:
We still get errors such as: 

```shell
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:1101: void llvm::ValueHandleBase::RemoveFromUseList(): Assertion `*PrevPtr == this && "List invariant broken"' failed
...
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:1061: void llvm::ValueHandleBase::AddToUseList(): Assertion `Entry && "Value doesn't have any handles?"' failed
...
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:285: llvm::ValueName* llvm::Value::getValueName() const: Assertion `I != Ctx.pImpl->ValueNames.end() && "No name entry found!"' failed.
```
when we run `../src/lua -e"_U=true" all.lua`.


Co-authored-by: Pavel Durov <p3ld3v@pm.me>
@bors
Copy link
Contributor

bors bot commented Aug 14, 2023

Build failed:

@Pavel-Durov
Copy link
Contributor Author

I guess errors is non-deterministic after all 👀

@Pavel-Durov
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 14, 2023
@bors
Copy link
Contributor

bors bot commented Aug 14, 2023

try

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@Pavel-Durov
Copy link
Contributor Author

squashed again 👉 0d13753

@ltratt
Copy link
Contributor

ltratt commented Aug 14, 2023

@Pavel-Durov You'lln eed to unsquash back to 579d728 please: the new commit(s) or edits you did need to be reviewed. Put another way: each squash requires permission.

@Pavel-Durov
Copy link
Contributor Author

ok

@Pavel-Durov
Copy link
Contributor Author

ready for another review

@ltratt
Copy link
Contributor

ltratt commented Aug 14, 2023

Please squash.

@Pavel-Durov
Copy link
Contributor Author

squashed 👉 dcfccca

@ltratt
Copy link
Contributor

ltratt commented Aug 14, 2023

bors r+

36: Add function undump locations recovery r=ltratt a=Pavel-Durov

Solves #32

+ Added function prototype location reconstruction as part of function
undump process.
+ Moved yk functionality to lyk.h and lyk.c.

### Note:
We still get errors such as: 

```shell
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:1101: void llvm::ValueHandleBase::RemoveFromUseList(): Assertion `*PrevPtr == this && "List invariant broken"' failed
...
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:1061: void llvm::ValueHandleBase::AddToUseList(): Assertion `Entry && "Value doesn't have any handles?"' failed
...
yk-fork/ykllvm/llvm/lib/IR/Value.cpp:285: llvm::ValueName* llvm::Value::getValueName() const: Assertion `I != Ctx.pImpl->ValueNames.end() && "No name entry found!"' failed.
```
when we run `../src/lua -e"_U=true" all.lua`.


Co-authored-by: Pavel Durov <p3ld3v@pm.me>
@bors
Copy link
Contributor

bors bot commented Aug 14, 2023

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@Pavel-Durov
Copy link
Contributor Author

Not sure what happened here.
I've updated the branch from the upstream/main cause it had conflicts and then it got Closed instead of Merged.
Is it some kind of witchcraft done by bors?

@ltratt
Copy link
Contributor

ltratt commented Aug 14, 2023

It was merged OK https://github.com/ykjit/yklua/commits/main (github has a narrower view of conflicts than bors).

bors bot added a commit that referenced this pull request Aug 25, 2023
64: Track initialised yk locations r=ltratt a=Pavel-Durov

Partially fixes (only for serial compilation)  - #62

# Changes

- Remove reallocarray in favour of calloc
- Change YkLocations to be stored as pointers
- Change lyk interface to reflect its "hooks" functionality
- Move tests to a separate test script and enable serialised all.lua test suite
- Added logging in lyk module for ease of debugging
- Updated readme

`test.sh` was tested for resiliency multiple times:
```shell
$ try_repeat -v 100 sh ./test.sh
``` 
# Issues

## Uninitialised locations memory 
`reallocarray` does not initialise memory with default zero bytes locations as `calloc`. 
Moving to `calloc` and using pointers allowed to check for NULL with default values set at initialisation time.

## Tests

When lua tests are executed one by one as single files it produces different results from when it's executed via `all.lua` test suite.

Example:

```shell

$ YKD_SERIALISE_COMPILATION=1 ../src/lua -e"_U=true" ./main.lua 
...
../src/lua: ./main.lua:343: assertion failed!
stack traceback:
        [C]: in function 'assert'
        ./main.lua:343: in main chunk
        [C]: in ?
```
When run through `all.lua` it passes:
```shell
$ YKD_SERIALISE_COMPILATION=1 ../src/lua -e"_U=true" ./all.lua 
...
***** FILE 'main.lua'*****
...
***** FILE 'gc.lua'*****
...
```
Since `all.lua` is what we base the stability of yklua, I think we should include it in the tests as is, currently only serialised compilation works.

Running `all.lua` prior to these changes results in error (`main/56c5787799b876f36babbae24e9afc025806b831`):
```
$ YKD_SERIALISE_COMPILATION=1 gdb -ex 'r' -ex 'bt' --args ../src/lua -e"_U=true" ./all.lua 
...
***** FILE 'main.lua'*****

Program received signal SIGSEGV, Segmentation fault.
core::sync::atomic::AtomicUsize::fetch_sub (self=<optimised out>) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/sync/atomic.rs:2575
2575    /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/sync/atomic.rs: No such file or directory.
#0  core::sync::atomic::AtomicUsize::fetch_sub (self=<optimised out>) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/sync/atomic.rs:2575
#1  alloc::sync::{impl#33}::drop<lock_api::mutex::Mutex<parking_lot::raw_mutex::RawMutex, ykrt::location::HotLocation>, alloc::alloc::Global> (self=0x7fffffffb410)
    at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/alloc/src/sync.rs:2370
#2  0x00007ffff7b0d4ab in core::ptr::drop_in_place<alloc::sync::Arc<lock_api::mutex::Mutex<parking_lot::raw_mutex::RawMutex, ykrt::location::HotLocation>, alloc::alloc::Global>> () at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/ptr/mod.rs:497
#3  0x00007ffff7b0067e in core::mem::drop<alloc::sync::Arc<lock_api::mutex::Mutex<parking_lot::raw_mutex::RawMutex, ykrt::location::HotLocation>, alloc::alloc::Global>> (_x=...) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/mem/mod.rs:987
#4  0x00007ffff7b121f0 in ykrt::location::{impl#1}::drop (self=0x7fffffffb468) at ykrt/src/location.rs:197
#5  0x00007ffff7affa9b in core::ptr::drop_in_place<ykrt::location::Location> () at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/ptr/mod.rs:497
#6  0x00007ffff7aff6bd in core::mem::drop<ykrt::location::Location> (_x=...) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/mem/mod.rs:987
#7  0x00007ffff7b004bd in ykcapi::yk_location_drop (loc=...) at ykcapi/src/lib.rs:90
#8  0x000000000088aa0e in free_loc (f=<optimised out>, i=<optimised out>, idx=<optimised out>) at lyk.c:66
#9  0x000000000088aba5 in yk_free_locactions (f=0x928cc0) at lyk.c:76
--Type <RET> for more, q to quit, c to continue without paging--
#10 0x0000000000807738 in luaF_freeproto (L=0x916e38, f=0x928cc0) at lfunc.c:276
#11 0x0000000000809fde in freeobj (L=0x916e38, o=0x928cc0) at lgc.c:767
#12 0x0000000000817145 in sweepgen (L=0x916e38, g=<optimised out>, p=<optimised out>, limit=<optimised out>, pfirstold1=<optimised out>) at lgc.c:1106
#13 0x0000000000816713 in youngcollection (L=0x916e38, g=0x916f00) at lgc.c:1239
#14 0x000000000081557d in genstep (L=0x916e38, g=0x916f00) at lgc.c:1434
#15 0x0000000000815044 in luaC_step (L=0x916e38) at lgc.c:1686
#16 0x00000000007e4447 in lua_pushstring (L=0x916e38, s=<optimised out>) at lapi.c:553
#17 0x000000000089e60c in findloader (L=0x916e38, name=0x928df8 "tracegc") at loadlib.c:641
#18 0x000000000089de0a in ll_require (L=0x916e38) at loadlib.c:666
#19 0x00000000007fda75 in precallC (L=0x916e38, func=<optimised out>, nresults=<optimised out>, f=0x89db80 <ll_require>) at ldo.c:506
#20 0x00000000007fe016 in luaD_precall (L=0x916e38, func=0x923780, nresults=1) at ldo.c:569
#21 0x0000000000883f68 in luaV_execute (L=0x916e38, ci=<optimised out>) at lvm.c:1655
#22 0x00000000007feb3b in ccall (L=0x916e38, func=<optimised out>, nResults=<optimised out>, inc=<optimised out>) at ldo.c:609
#23 0x00000000007fec61 in luaD_callnoyield (L=0x916e38, func=0x917730, nResults=-1) at ldo.c:627
#24 0x00000000007eaf03 in f_call (L=0x916e38, ud=<optimised out>) at lapi.c:1041
#25 0x00000000007f8be7 in luaD_rawrunprotected (L=0x916e38, f=0x7eae40 <f_call>, ud=0x7ffff2487308) at ldo.c:144
#26 0x0000000000801406 in luaD_pcall (L=0x916e38, func=0x7eae40 <f_call>, u=0x7ffff2487308, old_top=<optimised out>, ef=<optimised out>) at ldo.c:926
#27 0x00000000007ea9ec in lua_pcallk (L=0x916e38, nargs=<optimised out>, nresults=<optimised out>, errfunc=<optimised out>, ctx=<optimised out>, k=<optimised out>)
    at lapi.c:1067
#28 0x00000000007dc623 in docall (L=0x916e38, narg=0, nres=-1) at lua.c:160
#29 0x00000000007dbd24 in handle_script (L=0x916e38, argv=<optimised out>) at lua.c:255
#30 0x00000000007d9fe3 in pmain (L=0x916e38) at lua.c:634
#31 0x00000000007fda75 in precallC (L=0x916e38, func=<optimised out>, nresults=<optimised out>, f=0x7d97f0 <pmain>) at ldo.c:506
#32 0x00000000007fe0c8 in luaD_precall (L=0x916e38, func=0x9176f0, nresults=1) at ldo.c:572
#33 0x00000000007fea7f in ccall (L=0x916e38, func=0x9176f0, nResults=1, inc=<optimised out>) at ldo.c:607
#34 0x00000000007fec61 in luaD_callnoyield (L=0x916e38, func=0x9176f0, nResults=1) at ldo.c:627
#35 0x00000000007eaf03 in f_call (L=0x916e38, ud=<optimised out>) at lapi.c:1041
#36 0x00000000007f8be7 in luaD_rawrunprotected (L=0x916e38, f=0x7eae40 <f_call>, ud=0x7ffff2487058) at ldo.c:144
#37 0x0000000000801406 in luaD_pcall (L=0x916e38, func=0x7eae40 <f_call>, u=0x7ffff2487058, old_top=<optimised out>, ef=<optimised out>) at ldo.c:926
#38 0x00000000007ea9ec in lua_pcallk (L=0x916e38, nargs=<optimised out>, nresults=<optimised out>, errfunc=<optimised out>, ctx=<optimised out>, k=<optimised out>)
    at lapi.c:1067
#39 0x00000000007d94b0 in main (argc=<optimised out>, argv=<optimised out>) at lua.c:660
```

That's cause we're freeing uninitialised yk locations.


Co-authored-by: Pavel Durov <p3ld3v@pm.me>
@ltratt ltratt deleted the yk-location-allocation branch June 19, 2024 07:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants