Description
Summary of issue:
For C++ interoperability, Carbon must map C++ built-in integer types like long and long long.
The C++ type system treats these as nominally distinct types for purposes like function overloading and template specialization, even on platforms where their underlying representations (size, signedness) are identical.
A simple alias to canonical Carbon types (e.g., mapping both long and long long to i64 on an LP64 system) would lose this semantic distinction and break C++ interop. This requires us to introduce distinct Carbon types (e.g., Cpp.long, Cpp.long_long), which raises two critical design questions:
- How do we provide arithmetic and other operations for these new types without duplicating all the `impls` that already exist for `i64`?
- How do we define conversions and the result type of mixed-type operations (e.g., `i64 + Cpp.long_long`)?
Details:
The Core Problem: Nominal Distinction
The C++ standard defines its fundamental integer types as nominally distinct. This is crucial for overload resolution and template specialization.
- On 64-bit Windows (LLP64), `int` and `long` are both 32-bit, but they are distinct types.
- On 64-bit Linux (LP64), `long` and `long long` are both 64-bit, but they are distinct types.
- C++ explicitly defines `char`, `signed char`, and `unsigned char` as three distinct types, even though `char` must have an identical representation to one of the other two.
A naive mapping (e.g., Cpp.long as an alias for i32 on Windows) would fail to import C++ code that legally overloads fn Foo(i: int) and fn Foo(l: long), because both overloads would collapse to the same Carbon signature.
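The nominal distinction can be checked directly on the C++ side. The following is a minimal sketch, assuming a C++17 compiler; the `Foo` overloads mirror the example above and are otherwise hypothetical.

```cpp
#include <type_traits>

// Overloads that C++ keeps separate even when the types share a representation.
constexpr int Foo(int) { return 1; }
constexpr int Foo(long) { return 2; }
constexpr int Foo(long long) { return 3; }

// Distinct types regardless of the data model (LLP64 or LP64).
static_assert(!std::is_same_v<int, long>);
static_assert(!std::is_same_v<long, long long>);

// char is distinct from both signed char and unsigned char.
static_assert(!std::is_same_v<char, signed char>);
static_assert(!std::is_same_v<char, unsigned char>);

// Overload resolution selects by nominal type, not by size.
static_assert(Foo(0) == 1);
static_assert(Foo(0L) == 2);
static_assert(Foo(0LL) == 3);
```

Any Carbon mapping that merges two of these types would have to pick one of the overloads arbitrarily or reject the call.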
Therefore, Carbon must introduce distinct types, such as Cpp.long and Cpp.long_long, that are imported via the Cpp library. A draft PR has begun this work (see Draft PR #6250), but these new types currently have no operations or conversions. This leads to the next challenges.
Challenge 1: How to Provide Operations (impls)?
We must provide arithmetic, bitwise, and comparison impls for these types. Manually copy-pasting the impls from i32 and i64 is unscalable and a maintenance burden. Recent discussions have identified a few potential paths:
Option A: Parameterized Core Integers (Core.GenericInt(n, a))
- Proposal: Refactor Carbon's core integer types into a generic form. `i64` might become `Core.GenericInt(64, i64_tag)` and `Cpp.long_long` would be `Core.GenericInt(64, Cpp.long_long_tag)`. The arithmetic `impls` would be written once for the generic type.
- Trade-off: This is architecturally invasive to Carbon's core types and adds complexity for all Carbon users, not just those using C++ interop.
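As a rough analogy only, the tagged-generic idea can be sketched in C++ with a class template. The names `GenericInt`, `I64Tag`, and `CppLongLongTag` are hypothetical stand-ins for the names in the proposal, not an existing API, and the sketch ignores overflow and widths other than 64.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical tag types; they carry nominal identity and nothing else.
struct I64Tag {};
struct CppLongLongTag {};

// One generic integer, parameterized by bit width and a nominal tag.
template <int Bits, typename Tag>
struct GenericInt {
  static_assert(Bits == 64, "this sketch only models 64-bit storage");
  std::int64_t value;
};

// Written once; applies to every (Bits, Tag) instantiation and preserves the tag.
template <int Bits, typename Tag>
constexpr GenericInt<Bits, Tag> operator+(GenericInt<Bits, Tag> a,
                                          GenericInt<Bits, Tag> b) {
  return {a.value + b.value};
}

using I64 = GenericInt<64, I64Tag>;
using CppLongLong = GenericInt<64, CppLongLongTag>;

// Same representation, different types, one shared operator definition.
static_assert(!std::is_same_v<I64, CppLongLong>);
static_assert(std::is_same_v<decltype(CppLongLong{1} + CppLongLong{2}), CppLongLong>);
static_assert((CppLongLong{1} + CppLongLong{2}).value == 3);
```

In this sketch a mixed expression such as `I64{1} + CppLongLong{2}` does not compile, because the tag cannot be deduced consistently, so mixed-type arithmetic would still need separate rules.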
Option B: Adapters with impl Inheritance
- Proposal: Define `Cpp.long_long` as an `adapter` that wraps `i64`. We would then need a language mechanism for adapters to automatically "inherit" `impls` from the type they wrap.
- Trade-off: This "impl inheritance" feature does not currently exist. It would be a major new language feature. It also raises semantic questions: if `impl` inheritance works by unwrapping, would `Cpp.long_long + Cpp.long_long` (which unwraps to `i64 + i64`) return an `i64`? This would violate C++'s type preservation rules.
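The unwrapping concern has a close C++ analogue: a wrapper that exposes operations only through an implicit conversion to the wrapped type. This is a sketch under that assumption, with `CppLongLongAdapter` as a hypothetical stand-in for `Cpp.long_long` and `std::int64_t` standing in for `i64`; it is not how Carbon adapters are specified.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical adapter around a 64-bit integer.
struct CppLongLongAdapter {
  std::int64_t value;
  // "Inheritance by unwrapping": operators are found by converting to the wrapped type.
  constexpr operator std::int64_t() const { return value; }
};

constexpr CppLongLongAdapter a{40};
constexpr CppLongLongAdapter b{2};

// The sum is computed on the unwrapped values, so the wrapper type is lost.
static_assert(std::is_same_v<decltype(a + b), std::int64_t>);
static_assert(!std::is_same_v<decltype(a + b), CppLongLongAdapter>);
static_assert(a + b == 42);
```

An inheritance mechanism that preserves the adapter type would need to rewrap results, which is part of what makes the feature non-trivial.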
Option C: impl Synthesis from C++ Builtins
- Proposal: Don't provide any Carbon-defined `impls` for these types. Instead, leverage the general C++ operator interoperability mechanism (proposed in Issue #6166, "C++ Interop: How should C++ operators interact with Carbon operator interfaces?"). When Carbon sees `a + b` (where `a` is `Cpp.long_long`), it would ask Clang to perform C++ overload resolution for `long long + long long`. Clang would find the C++ builtin operator, and Carbon would then synthesize a temporary `impl` for the call.
- Trade-off: This reuses the same scalable mechanism needed for all C++ operator interop (e.g., for `Cpp.Widget`). However, it relies on that (complex) feature being fully implemented.
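The query Carbon would delegate to Clang resembles what C++ code can already ask at compile time: does `A + B` resolve, and to what type? The sketch below illustrates that with standard type traits, assuming C++17. `PlusResult`, `HasPlus`, `Widget`, and `NoOps` are hypothetical names for illustration, and this is not the interop mechanism itself.

```cpp
#include <type_traits>
#include <utility>

// Result type of `A + B` as chosen by C++ overload resolution.
template <typename A, typename B>
using PlusResult = decltype(std::declval<A>() + std::declval<B>());

// Detects whether `A + B` resolves at all.
template <typename A, typename B, typename = void>
struct HasPlus : std::false_type {};
template <typename A, typename B>
struct HasPlus<A, B, std::void_t<PlusResult<A, B>>> : std::true_type {};

// Hypothetical user types: one with an operator, one without.
struct Widget {
  int id;
  friend constexpr Widget operator+(Widget l, Widget r) {
    return Widget{l.id + r.id};
  }
};
struct NoOps {};

// The builtin operator is found for long long and keeps the type.
static_assert(std::is_same_v<PlusResult<long long, long long>, long long>);
// The same query covers a user-defined operator.
static_assert(std::is_same_v<PlusResult<Widget, Widget>, Widget>);
// No operator, no result.
static_assert(!HasPlus<NoOps, NoOps>::value);
```

The point of the analogy is that builtin and user-defined operators go through one lookup path, which is why Option C scales to types like `Cpp.Widget`.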
Challenge 2: Mixed-Type Arithmetic and Conversions
We must define the behavior of expressions mixing canonical Carbon types and these new C++ types, such as my_i64 + my_cpp_long_long.
Result Type: What type should this expression produce?
- Produce `Cpp.long_long`: This would align with C++'s "usual arithmetic conversions," where operands are promoted to the type with the highest "rank". This seems best for an "unsurprising mapping".
- Produce `i64`: This would produce a more common, canonical Carbon type, avoiding pulling C++'s complex integer promotion rules into Carbon. However, this is a "surprising" mapping for C++ interop.
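For reference, this is how the rank rule plays out between C++'s own builtin types; a short sketch, assuming C++17. Note that C++ has no separate `i64`, so the mixed Carbon/C++ case has no direct C++ precedent.

```cpp
#include <type_traits>

// Mixed operands convert to the operand type of greater rank,
// even when both types have the same size on the target platform.
static_assert(std::is_same_v<decltype(0 + 0L), long>);
static_assert(std::is_same_v<decltype(0L + 0LL), long long>);
static_assert(std::is_same_v<decltype(0LL + 0L), long long>);

// Types narrower than int are promoted to int before the operation.
static_assert(std::is_same_v<decltype(short{1} + short{1}), int>);
```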
Conversion Rules: The choice of result type dictates the conversion rules.
- To make `i64 + Cpp.long_long` result in `Cpp.long_long`, the compiler must use an `impl Cpp.long_long as AddWith(Cpp.long_long)`. This requires a "safe" implicit conversion from `i64` to `Cpp.long_long`.
- This aligns with Carbon's proposal for safe implicit conversions, which allows value-preserving conversions (like `int` to `long`).
- To prevent ambiguity, the reverse conversion (`Cpp.long_long` to `i64`) would then need to be explicit (e.g., `as i64`), as this conversion "loses" the nominal C++ type identity.
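The asymmetric model can be approximated in C++ with an implicit converting constructor in one direction and an `explicit` conversion operator in the other. This is a sketch only: `CppLongLong` is a hypothetical stand-in for `Cpp.long_long`, `std::int64_t` stands in for `i64`, and the variable names follow the `my_i64 + my_cpp_long_long` example above.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for Cpp.long_long.
struct CppLongLong {
  std::int64_t value;
  // Implicit: i64 -> Cpp.long_long is value-preserving.
  constexpr CppLongLong(std::int64_t v) : value(v) {}
  // Explicit: converting back drops the nominal C++ identity, so it needs a cast.
  constexpr explicit operator std::int64_t() const { return value; }
  // The only addition defined is CppLongLong + CppLongLong.
  friend constexpr CppLongLong operator+(CppLongLong l, CppLongLong r) {
    return CppLongLong(l.value + r.value);
  }
};

constexpr std::int64_t my_i64 = 40;
constexpr CppLongLong my_cpp_long_long(2);

// The i64 operand converts implicitly, so the result keeps the C++ type.
static_assert(std::is_same_v<decltype(my_i64 + my_cpp_long_long), CppLongLong>);
static_assert(static_cast<std::int64_t>(my_i64 + my_cpp_long_long) == 42);

// Implicit one way, explicit the other.
static_assert(std::is_convertible_v<std::int64_t, CppLongLong>);
static_assert(!std::is_convertible_v<CppLongLong, std::int64_t>);
```

Because only one direction is implicit, the mixed expression has exactly one viable operator, which is the ambiguity-avoidance argument made above.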
Challenge 3: Location of Definitions
If we choose any path other than impl synthesis (Option 1C), where do these impls and conversions live?
- The standard `prelude` cannot have a dependency on the `Cpp` library.
- Are these definitions compiler built-ins that are "activated" when a user writes `import Cpp`?
- Should we introduce a `cpp_prelude` that is automatically imported with `import Cpp`?
Questions for Leads:
1. `impl` Provisioning: What is our strategy for providing operations for distinct C++ integer types?
   - A) `Core.GenericInt`: Do we pursue the invasive refactor of `Core.Int`?
   - B) `adapter`: Do we commit to designing and building "adapter `impl` inheritance"?
   - C) `impl` Synthesis: Should we rely entirely on the C++ operator interop mechanism (Issue #6166, "C++ Interop: How should C++ operators interact with Carbon operator interfaces?") to use C++'s builtin operators?
2. `long` Ambiguity: How should we model `Cpp.long`? Is it a single distinct type that is conditionally mapped to `i32` or `i64` behavior based on the target C++ data model (LLP64 vs. LP64)?
3. Mixed-Type Arithmetic:
   - 3a. Result Type: What type should `i64 + Cpp.long_long` produce? Should we prioritize C++ compatibility (result: `Cpp.long_long`) or a Carbon-canonical result (result: `i64`)?
   - 3b. Conversions: Based on the answer to (3a), do we approve an asymmetric conversion model: implicit `i64 -> Cpp.long_long` but explicit `Cpp.long_long -> i64`?
4. Definition Location: If we require pre-defined conversions (per Q3b) or an `impl` strategy other than (1C), where are these definitions located? Are they compiler built-ins activated by `import Cpp`?