Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async/closure environment does not align local variables #22419

Closed
tersec opened this issue Aug 8, 2023 · 5 comments · Fixed by #22425
Closed

async/closure environment does not align local variables #22419

tersec opened this issue Aug 8, 2023 · 5 comments · Fixed by #22425
Assignees
Labels

Comments

@tersec
Copy link
Contributor

tersec commented Aug 8, 2023

Description

import std/asyncdispatch

type
  ValidatorPubKey* = object
    blob*: array[96, byte]

proc f() {.async.} =
  var x {.align: 16}: uint8
  var y {.align: 16}: ValidatorPubKey
  let value = cast[uint64](unsafeAddr y)
  echo "f() = ", cast[uint64](unsafeAddr y), " mod(8) = ", value mod 8

proc e() =
  var x: uint8
  var y: ValidatorPubKey
  let value = cast[uint64](unsafeAddr y)
  echo "e() = ", cast[uint64](unsafeAddr y), " mod(8) = ", value mod 8

waitFor f()
e()

Nim Version

Nim Compiler Version 1.6.15 [Linux: amd64]
Compiled at 2023-08-08
Copyright (c) 2006-2023 by Andreas Rumpf

git hash: 1cd48e4b2abd867adf1ab52a70a1d8f1c6ac77c4
active boot switches: -d:release
Nim Compiler Version 2.0.0 [Linux: amd64]
Compiled at 2023-08-08
Copyright (c) 2006-2023 by Andreas Rumpf

git hash: a488067a4130f029000be4550a0fb1b39e0e9e7c
active boot switches: -d:release
Nim Compiler Version 2.1.1 [Linux: amd64]
Compiled at 2023-08-08
Copyright (c) 2006-2023 by Andreas Rumpf

git hash: 73e661d01bdde815a5f388b960cd4d0593f7accc
active boot switches: -d:release

Current Output

$ # Test all of Nim 1.6, 2.0, and devel (therefore, both refc and ORC)
$ Nim16/bin/nim c -r --hints:off r
f() = 139823877881961 mod(8) = 1
e() = 140720357015456 mod(8) = 0
$ Nim20/bin/nim c -r --hints:off r
f() = 140174585233530 mod(8) = 2
e() = 140724735024336 mod(8) = 0
$ NimDevel/bin/nim c -r --hints:off r
f() = 140395756572794 mod(8) = 2
e() = 140735821040544 mod(8) = 0

Expected Output

Aligned local variables in closure environment (with or without align pragmas, but here even align doesn't work around the issue).

Possible Solution

No response

Additional Information

No response

@tersec tersec changed the title align pragma on variables in async/closure environment does not align async/closure environment does not align local variables Aug 8, 2023
@Varriount
Copy link
Contributor

What does the C output for this look like? Does C normally allow aligning variables on the stack?

@ringabout ringabout self-assigned this Aug 9, 2023
@tersec
Copy link
Contributor Author

tersec commented Aug 9, 2023

@Varriount first, a more minimal example,

type
  ValidatorPubKey = object
    #blob {.align: 16}: array[96, byte]
    blob: array[96, byte]

iterator count0(): int {.closure.} =
  var x {.align: 16}: uint8
  var y {.align: 16}: ValidatorPubKey
  let value = cast[uint64](unsafeAddr y)
  echo "f() = ", cast[uint64](unsafeAddr y), " mod(8) = ", value mod 8
  yield 0

for _ in count0():
  discard

Which gets compiled into

N_LIB_PRIVATE N_CLOSURE(NI, count0__a_u3)(void* ClE_0) {
	NI result;
	tyObject_Env_adotnim_count0___c3bxH9barzvxNMwrfFOIflA* colonenvP_;
NIM_BOOL* nimErr_;
{nimErr_ = nimErrorFlag();
	result = (NI)0;
	colonenvP_ = (tyObject_Env_adotnim_count0___c3bxH9barzvxNMwrfFOIflA*) ClE_0;
	while (1) {
		if (!1) goto LA1;
		{
			NimStringV2 colontmpD_;
			NimStringV2 colontmpD__2;
			tyArray__sMpvt1sOxOJ3LFGulnbeMQ T4_;
			colontmpD_.len = 0; colontmpD_.p = NIM_NIL;
			colontmpD__2.len = 0; colontmpD__2.p = NIM_NIL;
			switch ((*colonenvP_).colonstate_) {
			case -1:
			if (colontmpD__2.p && !(colontmpD__2.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD__2.p);
}
			if (colontmpD_.p && !(colontmpD_.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD_.p);
}
			 goto BeforeRet_;
			case 0: goto STATE0;
			case 1: goto STATE1;
			}
			STATE0: ;
			(*colonenvP_).x1 = (NU8)0;
			nimZeroMem((void*)(&(*colonenvP_).y2), sizeof(tyObject_ValidatorPubKey__bQ4vtAU3wDlHhbTUukGMVg));
			(*colonenvP_).value3 = ((NU64) (ptrdiff_t) ((&(*colonenvP_).y2)));
			T4_[0] = TM__R8RUzYq41iOx0I9bZH5Nyrw_4;
			colontmpD_ = dollar___systemZdollars_u14(((NU64) (ptrdiff_t) ((&(*colonenvP_).y2))));
			if (NIM_UNLIKELY(*nimErr_)) goto LA3_;
			T4_[1] = colontmpD_;
			T4_[2] = TM__R8RUzYq41iOx0I9bZH5Nyrw_6;
			colontmpD__2 = dollar___systemZdollars_u14((NU64)((NU64)((*colonenvP_).value3) % (NU64)(8ULL)));
			if (NIM_UNLIKELY(*nimErr_)) goto LA3_;
			T4_[3] = colontmpD__2;
			echoBinSafe(T4_, 4);
			(*colonenvP_).colonstate_ = ((NI)1);
			result = ((NI)0);
			if (colontmpD__2.p && !(colontmpD__2.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD__2.p);
}
			if (colontmpD_.p && !(colontmpD_.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD_.p);
}
			goto BeforeRet_;
			STATE1: ;
			(*colonenvP_).colonstate_ = ((NI)-1);
			if (colontmpD__2.p && !(colontmpD__2.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD__2.p);
}
			if (colontmpD_.p && !(colontmpD_.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD_.p);
}
			goto LA2;
			{
				LA3_:;
			}
			{
				if (colontmpD__2.p && !(colontmpD__2.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD__2.p);
}
				if (colontmpD_.p && !(colontmpD_.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmpD_.p);
}
			}
			if (NIM_UNLIKELY(*nimErr_)) goto BeforeRet_;
		} LA2: ;
	} LA1: ;
	}BeforeRet_: ;
	popFrame();
	return result;
}

The salient difference is this

colonenvP_ = (tyObject_Env_adotnim_count0___c3bxH9barzvxNMwrfFOIflA*) ClE_0;

through which x and y are accessed.

For example:

colonenvP_ = (tyObject_Env_adotnim_count0___c3bxH9barzvxNMwrfFOIflA*) ClE_0;

Where

struct tyObject_Env_adotnim_count0___c3bxH9barzvxNMwrfFOIflA {
  RootObj Sup;
        NI colonstate_;
        NU8 x1;
        tyObject_ValidatorPubKey__bQ4vtAU3wDlHhbTUukGMVg y2;
        NU64 value3;
};

is the closure iterator's environment.

That is, it's using C's struct alignment (in the heap, in this case) by default, not the stack alignment in the non-closure iterator case.

It is, however, capable of doing the alignment, because if one instead writes:

type
  ValidatorPubKey = object
    blob {.align: 16}: array[96, byte]

iterator count0(): int {.closure.} =
  var x: uint8
  var y: ValidatorPubKey
  let value = cast[uint64](unsafeAddr y)
  echo "f() = ", cast[uint64](unsafeAddr y), " mod(8) = ", value mod 8
  yield 0

for _ in count0():
  discard

and removes the align pragma from count0 and adds them to ValidatorPubKey.blob's definition, then it aligns as expected, even as part of the closure iterator.

The only difference adding the align: 16 to the ValidatorPubKey definition is

@@ -55,7 +55,7 @@
 };
 typedef NU8 tyArray__26aQaWOzPONEXg2JYXgXsA[96];
 struct tyObject_ValidatorPubKey__bQ4vtAU3wDlHhbTUukGMVg {
-	tyArray__26aQaWOzPONEXg2JYXgXsA blob;
+NIM_ALIGN(16) 	tyArray__26aQaWOzPONEXg2JYXgXsA blob;
 };
 struct tyObject_Env_adotnim_count0___c3bxH9barzvxNMwrfFOIflA {
   RootObj Sup;

that is, it uses the NIM_ALIGN (C proprocessor) macro as part of the tyObject_ValidatorPubKey__bQ4vtAU3wDlHhbTUukGMVg struct definition, whereas it doesn't use NIM_ALIGN anywhere in the initial example.

@ringabout
Copy link
Member

It is supposed to be divided by 8 and leaves a remainder of 0, right? just like

f() = 139669055053968 mod(8) = 0
e() = 140730649107728 mod(8) = 0

@tersec
Copy link
Contributor Author

tersec commented Aug 9, 2023

It is supposed to be divided by 8 and leaves a remainder of 0, right? just like

f() = 139669055053968 mod(8) = 0
e() = 140730649107728 mod(8) = 0

That's the intent, yes.

Ideally even without needing the {.align: ...}, but at least processing that to allow a workaround if it's needed. The issue here is twofold:

  • (a) there's an inconsistency where the difference in underlying implementation is leaking through so that closure iterators and non-(closure iterators) align supposedly "local variables" differently because one is, in fact, placing them into a struct which is separately allocated; and
  • (b) it's not possible to work around this by manually annotating this with the desired alignment, which is clearly possible (per the second example in my comment, the one where the align pragma is on the type rather than the variable declaration).

I'd consider (a) to be a probably-bug in itself, but even if one disagrees, (b) definitely is.

@arnetheduck
Copy link
Contributor

Ideally even without needing the {.align: ...}

without explicit align, the alignment of ValidatorPubKey is 1, no matter if it's a local variable or part of a structure, so I don't see how it could end up on an 8 border.

also, with align: 16, we evenly divide by 16 - ie that's slightly overaligned for the purpose .

Araq pushed a commit that referenced this issue Aug 9, 2023
#22425)

* fixes #22419; async/closure environment does not align local variables

* Apply suggestions from code review

* Update tests/align/talign.nim

Co-authored-by: Jacek Sieka <arnetheduck@gmail.com>

* apply code review

* update tests

---------

Co-authored-by: Jacek Sieka <arnetheduck@gmail.com>
narimiran pushed a commit that referenced this issue Aug 11, 2023
#22425)

* fixes #22419; async/closure environment does not align local variables

* Apply suggestions from code review

* Update tests/align/talign.nim

Co-authored-by: Jacek Sieka <arnetheduck@gmail.com>

* apply code review

* update tests

---------

Co-authored-by: Jacek Sieka <arnetheduck@gmail.com>
(cherry picked from commit 5334dc9)
narimiran pushed a commit that referenced this issue Aug 11, 2023
#22425)

* fixes #22419; async/closure environment does not align local variables

* Apply suggestions from code review

* Update tests/align/talign.nim

Co-authored-by: Jacek Sieka <arnetheduck@gmail.com>

* apply code review

* update tests

---------

Co-authored-by: Jacek Sieka <arnetheduck@gmail.com>
(cherry picked from commit 5334dc9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants