Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experiment: precise heap tags #4544

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Conversation

crusso
Copy link
Contributor

@crusso crusso commented May 17, 2024

Implement precise tagging of heap allocated objects.

Splits

  • TAG_ARRAY into consecutive TAG_ARRAY_I, TAG_ARRAY_M and TAG_ARRAY_T (immutable array, array, tuple).
  • TAG_BLOB into consecutive TAG_BLOB_B, TAB_BLOB_T and TAG_BLOB_P (blob, text, principal).
  • TAG_BITS_32 into consecutive TAG_BITS32_U, TAG_BITS32_S and TAG_BITS32_F ((boxed) Nat32/Int32, Float32 (reserved for future use).
  • TAG_BITS_64 into consecutive TAG_BITS64_U, TAG_BITS64_S and TAG_BITS64_F ((boxed) Nat64/Int32, Float64).

Modifies array slicing logic to preserve the base tag of the underlying array, by stealing the top 2 bits of the slice start field, relying on the fact that array indices are < 2^30. For clarity, the tag in decompressed on each iteration. For efficiency, it could actually be maintained during slicing compressed form as a sort ( see 448962f).

TODO:
[ ] print refined tags in debug output?

TBD:
[ ] should we also reserve TAG_BLOB_M for (future) mutable blobs and maybe TAB_BLOB_X for internal blobs?
[ ] should we also reserve TAG_ARRAY_X for things like roots?
[ ] I currently just added a tag argument to rts.alloc_blob and rts.alloc_array. Would it be preferable to have six allocation functions?

@crusso crusso marked this pull request as draft May 18, 2024 22:14
Copy link

github-actions bot commented May 31, 2024

Comparing from c899462 to 8dbe885:
In terms of gas, 4 tests regressed, 1 tests improved and the mean change is -3.7%.
In terms of size, 5 tests regressed and the mean change is +0.2%.

@crusso crusso marked this pull request as ready for review May 31, 2024 09:32
pub const TAG_NULL: Tag = 29;
pub const TAG_ONE_WORD_FILLER: Tag = 31;
pub const TAG_FREE_SPACE: Tag = 33;
pub const TAG_ARRAY_I: Tag = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Only a very little stylistic thought: Maybe we could name the tags with a unabbreviated suffix like TAG_ARRAY_IMMUTABLE etc. or just have a short comment here at the declaration.)

pub fn slice_tag(array_tag: Tag, slice_start: u32) -> Tag {
debug_assert!(array_tag == TAG_ARRAY_I || array_tag == TAG_ARRAY_M || array_tag == TAG_ARRAY_T);
debug_assert!(slice_start >= TAG_ARRAY_SLICE_MIN && slice_start < (1 << 30));
(((array_tag - TAG_ARRAY_I) / 2) << 30) | slice_start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, maybe, a debug assertion could be helpful, to ensure that array_tag - TAG_ARRAY is even and has a distance of maximum 3 (to be encoded in two bits).
The 30 bits is also related to MAX_ARRAY_SIZE in constant.rs that it must cover the maximum size of the array, i.e. MAX_ARRAY_SIZE < (1 << 30). Maybe, subsummarizing it as a local constant could be helpful (just that I catch all occurrences for the 64-bit porting, where maximum array size is larger).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: Maybe the odd numbering is no longer needed in future if we focus on incremental GC. In this case the division and multiplication by 2 would no longer be necessary...

|| tag >= TAG_ARRAY_SLICE_MIN
);
if tag >= TAG_ARRAY_SLICE_MIN {
(TAG_ARRAY_I + (tag >> 30) * 2, tag << 2 >> 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, the the second and third occurrence of 2 could be expressed as usize::BITS - 30 (where 30 is the constant of the shift to encode the tag in the slicing info)

|| tag >= TAG_ARRAY_SLICE_MIN
);
if tag >= TAG_ARRAY_SLICE_MIN {
TAG_ARRAY_I + (tag >> 30) * 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a possible sanity check: The result would be tag == TAG_ARRAY_I || tag == TAG_ARRAY_M || tag == TAG_ARRAY_T (based on the assumption that the three array tags are consecutively numbered and odd).

@@ -506,7 +558,9 @@ impl Obj {
}

pub unsafe fn as_blob(self: *mut Self) -> *mut Blob {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether it could make sense to use dedicated as_blob functions, depending on the context, e.g. for texts etc.

}

pub unsafe fn restore_tag(self: *mut Self, array_tag: Tag) -> () {
(*self).header.tag = array_tag;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the restored tag would fulfil array_tag == TAG_ARRAY_I || arraty_tag == TAG_ARRAY_M || array_tag == TAG_ARRAY_T (no slicing any longer)? Not sure, if a debug assertion would make sense.

@@ -389,7 +399,7 @@ impl Value {
/// In debug mode panics if the value is not a pointer or the
/// pointed object is not a `Blob`.
pub unsafe fn as_stream(self) -> *mut Stream {
debug_assert_eq!(self.tag(), TAG_BLOB);
debug_assert!(self.is_blob());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the stream would always be a TAG_BLOB_B? Maybe, a stricter assertion could be applied.

@@ -373,14 +383,14 @@ impl Value {
/// Get the pointer as `Blob` using forwarding. In debug mode panics if the value is not a pointer or the
/// pointed object is not a `Blob`.
pub unsafe fn as_blob(self) -> *const Blob {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea: We could expose distinct access function, on for pure blob and one for text (maybe also for principal) with corresponding specific tag checks, and then call the corresponding specific function depending of the context. This would only serve for making sanity checks stricter.


use motoko_rts_macros::ic_mem_fn;

// CRC32 for blobs. Loosely based on https://rosettacode.org/wiki/CRC-32#Implementation_2

#[no_mangle]
pub unsafe extern "C" fn compute_crc32(blob: Value) -> u32 {
if blob.tag() != TAG_BLOB {
if !blob.is_blob() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be a case where the blob is a principal? The check could then be constrained to TAG_BLOB_P.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am probably wrong. Below it seems to be a TAG_BLOB_B that is passed to this function. It is hard to follow...

let r = alloc_blob(mem, Bytes((n.as_u32() + 4 + 4) / 5 * 8)); // contains padding
let r = alloc_blob(
mem,
TAG_BLOB_T, /*?*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question. The base32 would be a textual encoding?

@@ -225,7 +229,7 @@ unsafe fn base32_to_principal<M: Memory>(mem: &mut M, b: Value) -> Value {
let mut data = blob.payload_const();

// Every group of 5 characters will yield 6 bytes (due to the hypen)
let r = alloc_blob(mem, Bytes(((n.as_u32() + 4) / 5) * 6));
let r = alloc_blob(mem, TAG_BLOB_T, Bytes(((n.as_u32() + 4) / 5) * 6));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure: Would the result maybe be a TAG_BLOB_P? (I am just trying to infer it by the function name.)

@@ -172,7 +174,7 @@ unsafe extern "C" fn text_to_buf(mut s: Value, mut buf: *mut u8) {
unsafe extern "C" fn stream_write_text(stream: *mut Stream, mut s: Value) {
loop {
let s_ptr = s.as_obj();
if s_ptr.tag() == TAG_BLOB {
if s_ptr.tag() == TAG_BLOB_B || s_ptr.tag() == TAG_BLOB_T {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: The TAG_BLOB_B would occur for the Candid streaming. And the TAG_BLOB_T because of the principal to text conversions?

@@ -570,7 +570,7 @@ fn create_static_heap(
// root.
let array_addr = u32::try_from(heap.as_ptr() as usize).unwrap();
let mut offset = 0;
write_word(heap, offset, TAG_ARRAY);
write_word(heap, offset, TAG_ARRAY_M); // TODO: TBR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could also use TAG_ARRAY_I for the static roots. But it is quite uncritical as it is not scanned by graph stabilization.


let text = text_of_str(&mut heap, "bfozs-kwa73-7nadi");
assert_eq!(
text_compare(
blob_compare(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not mistaken, this compares a TAG_BLOB_B with TAG_BLOB_T. Perhaps, the text_of_str() is here more like a blob encoder in this specific test...

@@ -1996,18 +1999,22 @@ module Tagged = struct
so update all!
*)

type bits_sort = U | S | F
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just cosmetics: Maybe a brief comment could be helpful saying what the abbreviations stand for.)

@@ -2197,7 +2212,7 @@ module Tagged = struct
let (set_o, get_o) = new_local env "o" in
let prep (t, code) = (t, get_o ^^ code)
in set_o ^^ get_o ^^ branch_default env retty def (List.map prep cases)

(*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good that you noticed. Maybe we can delete this dead code...

@@ -2571,7 +2587,11 @@ module BoxedWord64 = struct

let payload_field = Tagged.header_size

let heap_tag env pty = Tagged.Bits64 (* TODO *)
let heap_tag env pty =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice: The TODO is now addressed.

G.if1 I32Type (Opt.inject_simple env get_blob) (Opt.null_lit env)
G.if1 I32Type
(get_blob ^^ Blob.as_ptr_len env ^^
of_ptr_size env ^^ (* creates text blob *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good that you covered this case: We indeed need to create a copy of the blob, as it has a different purpose (and tag)...

let alloc env array_sort len =
compile_unboxed_const Tagged.(int_of_tag (Array array_sort)) ^^
len ^^
E.call_import env "rts" "alloc_array" (* TODO *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious about the TODO: Do you have a specific open aspect in mind?

@@ -7519,11 +7583,11 @@ module MakeSerialization (Strm : Stream) = struct
*)
get_thing ^^ set_result ^^
get_memo ^^ get_result ^^ store_unskewed_ptr ^^
get_memo ^^ compile_add_const 4l ^^ Blob.lit env (typ_hash t) ^^ store_unskewed_ptr
get_memo ^^ compile_add_const 4l ^^ Blob.lit env Tagged.T (typ_hash t) ^^ store_unskewed_ptr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could also consider the hash fingerprint (to double check the aliasing deserialization mechanism) as pure blob?

@@ -8639,7 +8703,7 @@ module GCRoots = struct
})

let store_static_roots env =
Arr.vanilla_lit env (E.get_static_roots env)
Arr.vanilla_lit env Tagged.M (E.get_static_roots env)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether the static roots could be tagged as immutable, as they reside in static space and should not change at runtime (references to static MutBoxes that are not moved by the GC). This would be different with EOP.

@@ -8747,7 +8812,10 @@ module StackRep = struct
| Const.Unit -> Tuple.unit_vanilla_lit env
| Const.Array cs ->
let ptrs = List.map (materialize_const_t env) cs in
Arr.vanilla_lit env ptrs
Arr.vanilla_lit env Tagged.I ptrs
| Const.Tuple cs ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, as constant tuples are now generated (I noticed it below).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants