Compute and cache the value representation of a type when it becomes complete. #3271

zygoloid · 2023-10-06T00:28:59Z

Using the computed value representation, fix lowering of struct and tuple values to use the value representation rather than the object representation. Fixes an issue found in the review of #3257.

This currently causes us to compute value representations of all types as they are created, which generates substantially more SemIR to represent types. We can get some of that back by deferring computation of the value representation until the type is required to be complete, but some of the additional cost here will persist with this approach.

I also considered making the computation of the value representation type be something that lives entirely within the lowering phase, but I think that's not the right approach in the longer term, because the value representation will be semantically visible and relevant once we start allowing it to be customized.

We should consider moving the nodes that exist to compute canonical non-local types, including value representations, out into a separate global block. That will clean up the SemIR representation substantially, and make the SemIR produced for a function not depend on which types we happen to have encountered beforehand. But that's not being done in this PR.

the object representation.

toolchain/lower/testdata/let/tuple.carbon

Use the value representation type where appropriate in lowering.

chandlerc

Really nice. I think there may still be some things left to do here, but I think this is a good increment and it seems better to fix forward at this stage of things.

See comment inline for some of the things still surprising me here. But to be clear, none of these are blockers IMO.

toolchain/check/context.cpp

chandlerc · 2023-10-13T03:30:13Z

toolchain/check/testdata/array/fail_type_mismatch.carbon

 // CHECK:STDOUT:   %.loc10_39.1: type = tuple_type (i32, String, String)
-// CHECK:STDOUT:   %.loc10_39.2: (i32, String, String) = tuple_literal (%.loc10_20, %.loc10_23, %.loc10_32)
-// CHECK:STDOUT:   %.loc10_39.3: i32 = int_literal 0
-// CHECK:STDOUT:   %.loc10_39.4: ref i32 = array_index %a, %.loc10_39.3
-// CHECK:STDOUT:   %.loc10_39.5: init i32 = initialize_from %.loc10_20 to %.loc10_39.4
+// CHECK:STDOUT:   %.loc10_39.2: type = tuple_type (i32, String*, String*)
+// CHECK:STDOUT:   %.loc10_39.3: type = ptr_type (i32, String*, String*)


Not sure it makes sense to address in this PR, but I think this shows an issue with the current value representation approach for tuples.

If we take the value representation of each member of the tuple, and one of them uses a pointer value rep, and then the tuple itself uses a pointer value rep, we don't want to end up with a pointer-to-a-pointer.

When we're selecting a value rep as a pointer, the pointer is to the object, and so we shouldn't have the field value reps here, as we'll go all the way back to the tuple object with object representations for all its fields. And that's what we want so we don't end up with double-indirection.

We should discuss this when you get back from vacation. There's an interesting design tradeoff here that gets into how we want our value semantics to work. As an example to mull over, consider:

fn F(strings: (String, String)); fn G(s: String) { F((s, s)); }

Here, assume that the object representation of String owns a heap buffer, and the value representation is just a pointer and size. And that the tuple value representation is large enough that we want to pass it indirectly. (Eg, add tuple elements until we reach that point.)

Either:

The value representation of (String, String) is a pointer to storage containing two String value representations, meaning that we have a case where the value rep is a pointer, but it's not a pointer to an object rep of the same type. Or...

The value representation of (String, String) is a pointer to storage containing two String object representations, meaning that we need to perform heap allocations and copies in order to call F from G, where we fundamentally had no need to create any new String objects. Or...

Something else!

To me, option (1) seems like a better choice than option (2). There's then another choice:

Either:

1a) The value representation of a tuple contains the value representations of its elements, always, leading to the pointers-to-pointers behavior you're observing here. Or...
1b) If an element's value representation is a pointer, then the tuple's value representation stores the pointee directly. That'd mean that we avoid the double indirection, at the cost of making additional copies. For very large, heavily nested tuples, this can result in the creation of a lot more stack objects and copying between them, but for cases like this one, may be a substantial improvement. Or...
1c) We heuristically pick between 1a and 1b based on what we think is the right tradeoff between extra indirections and extra copies. Or...
1d) Something else!

My current thinking is that either (1b) or (1c) is probably a good direction. But I think there's a lot to explore here.

I have opened issue #3297 to track this question.

chandlerc · 2023-10-13T03:38:43Z

toolchain/lower/testdata/basics/type_values.carbon

+// CHECK:STDOUT: %type = type {}
+// CHECK:STDOUT:
+// CHECK:STDOUT: define %type @I32() {
+// CHECK:STDOUT:   ret %type zeroinitializer


This seems a bit surprising, if likely harmless. Mostly flagging in case you weren't expecting it or want to leave a TODO.

It's not ideal, but it is expected: we're switching from the value representation of type being empty to it being type-by-copy. I think that's right at the SemIR level: for compile-time constant evaluation, the result really is not "none". But the toolchain doesn't have any separate layout optimizations for struct beyond the ones that are done in SemIR, and currently does a very naive conversion from SemIR types to LLVM types, so changing the SemIR side of things loses the lowering optimization. (This also comes up with things like (i32, (), i32), which is lowered to {i32, {}, i32} -- both before and after this PR -- but I think should really be just {i32, i32}.)

(And it turns out to be ... really annoying ... to preserve the current SemIR value representation due to the very special treatment of TypeType as a builtin that has no entry in our types table. I looked into removing that special treatment, but it's going to take more changes that I was comfortable putting into this PR.)

Added a TODO to improve the lowering.

Co-authored-by: Chandler Carruth <chandlerc@gmail.com>

Fix lowering of tuple values to use the value representation rather than

27df53f

the object representation.

zygoloid requested a review from josh11b October 6, 2023 00:28

github-actions bot requested a review from chandlerc October 6, 2023 00:29

github-actions bot added the toolchain label Oct 6, 2023

chandlerc reviewed Oct 7, 2023

View reviewed changes

toolchain/lower/testdata/let/tuple.carbon Outdated Show resolved Hide resolved

zygoloid added 7 commits October 11, 2023 14:14

Park

4e9162b

Merge branch 'trunk' into toolchain-tuple-value-type

d26138b

Compute a value representation type and store it on the SemIR::File.

49de992

Use the value representation type where appropriate in lowering.

Merge branch 'trunk' into toolchain-tuple-value-type

0d13567

Make treatment of builtins consistent. Autoupdate.

e4a9bdf

Bug fixes.

89088fc

Fix YAML test.

b54ee10

zygoloid changed the title ~~Fix lowering of tuple values to use the value representation rather than the object representation.~~ Compute and cache the value representation of a type when it becomes complete. Oct 12, 2023

zygoloid requested a review from chandlerc October 12, 2023 21:11

zygoloid added 2 commits October 12, 2023 15:28

Merge branch 'trunk' into toolchain-tuple-value-type

9a4089a

Autoupdate

13f34a1

chandlerc approved these changes Oct 13, 2023

View reviewed changes

Add TODO comments.

b9e100c

Co-authored-by: Chandler Carruth <chandlerc@gmail.com>

zygoloid mentioned this pull request Oct 13, 2023

value representation for structs and tuples #3297

Open

Add a test TODO.

efb54b4

zygoloid enabled auto-merge October 13, 2023 21:05

zygoloid added this pull request to the merge queue Oct 13, 2023

github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Oct 13, 2023

zygoloid added 2 commits October 13, 2023 14:44

Merge branch 'trunk' into toolchain-tuple-value-type

58e9196

Post-merge fixes.

f05a01b

zygoloid enabled auto-merge October 13, 2023 21:46

zygoloid added this pull request to the merge queue Oct 13, 2023

github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Oct 13, 2023

Merge branch 'trunk' into toolchain-tuple-value-type

67916c6

zygoloid enabled auto-merge October 13, 2023 22:05

Merge branch 'trunk' into toolchain-tuple-value-type

a2fcf03

zygoloid added this pull request to the merge queue Oct 13, 2023

Merged via the queue into carbon-language:trunk with commit e4caf7d Oct 13, 2023
6 checks passed

zygoloid deleted the toolchain-tuple-value-type branch October 13, 2023 22:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compute and cache the value representation of a type when it becomes complete. #3271

Compute and cache the value representation of a type when it becomes complete. #3271

zygoloid commented Oct 6, 2023 •

edited

chandlerc left a comment

chandlerc Oct 13, 2023

zygoloid Oct 13, 2023 •

edited

chandlerc Oct 13, 2023

zygoloid Oct 13, 2023 •

edited

Compute and cache the value representation of a type when it becomes complete. #3271

Compute and cache the value representation of a type when it becomes complete. #3271

Conversation

zygoloid commented Oct 6, 2023 • edited

chandlerc left a comment

Choose a reason for hiding this comment

chandlerc Oct 13, 2023

Choose a reason for hiding this comment

zygoloid Oct 13, 2023 • edited

Choose a reason for hiding this comment

chandlerc Oct 13, 2023

Choose a reason for hiding this comment

zygoloid Oct 13, 2023 • edited

Choose a reason for hiding this comment

zygoloid commented Oct 6, 2023 •

edited

zygoloid Oct 13, 2023 •

edited

zygoloid Oct 13, 2023 •

edited