C++ Interop: How to model distinct C++ integer types (long, long long) and their operations? #6275

@bricknerb

Summary of issue:

For C++ interoperability, Carbon must map C++ built-in integer types like long and long long.
The C++ type system treats these as nominally distinct types for purposes like function overloading and template specialization, even on platforms where their underlying representations (size, signedness) are identical.
A simple alias to canonical Carbon types (e.g., mapping both long and long long to i64 on an LP64 system) would lose this semantic distinction and break C++ interop. This requires us to introduce distinct Carbon types (e.g., Cpp.long, Cpp.long_long), which raises two critical design questions:

  1. How do we provide arithmetic and other operations for these new types without duplicating all the impls that already exist for i64?
  2. How do we define conversions and the result type of mixed-type operations (e.g., i64 + Cpp.long_long)?

Details:

The Core Problem: Nominal Distinction

The C++ standard defines its fundamental integer types as nominally distinct. This is crucial for overload resolution and template specialization.

A naive mapping (e.g., making Cpp.long an alias for i32 on Windows, where int and long are both 32 bits wide) would break on C++ code that legally overloads void Foo(int) and void Foo(long), because Carbon would see both overloads as taking i32.
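
The nominal distinction can be checked directly in C++. The following is a minimal sketch; the function name Foo is taken from the example above, and the return values 1 and 2 are arbitrary markers chosen for illustration.

```cpp
#include <type_traits>

// Two overloads that differ only in the nominal type of their parameter.
// This is legal C++ on every platform, including LP64 targets where long
// and long long have the same size.
constexpr int Foo(long) { return 1; }
constexpr int Foo(long long) { return 2; }

// The types are never the same type, whatever their sizes are.
static_assert(!std::is_same<long, long long>::value,
              "long and long long are distinct types");
static_assert(!std::is_same<int, long>::value,
              "int and long are distinct types");

// Overload resolution selects by nominal type: 0L is long, 0LL is long long.
static_assert(Foo(0L) == 1, "long overload selected");
static_assert(Foo(0LL) == 2, "long long overload selected");
```

If both parameter types were collapsed into one alias, the second declaration would be a redefinition rather than an overload.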

Therefore, Carbon must introduce distinct types, such as Cpp.long and Cpp.long_long, that are imported via the Cpp library. Draft PR #6250 has begun this work, but the new types currently have no operations or conversions. This leads to the next challenges.


Challenge 1: How to Provide Operations (impls)?

We must provide arithmetic, bitwise, and comparison impls for these types. Manually copy-pasting the impls from i32 and i64 is unscalable and a maintenance burden. Recent discussions have identified a few potential paths:

Option A: Parameterized Core Integers (Core.GenericInt(n, a))

  • Proposal: Refactor Carbon's core integer types into a generic form. i64 might become Core.GenericInt(64, i64_tag) and Cpp.long_long would be Core.GenericInt(64, Cpp.long_long_tag). The arithmetic impls would be written once for the generic type.
  • Trade-off: This is architecturally invasive to Carbon's core types and adds complexity for all Carbon users, not just those using C++ interop.
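
As a rough analogy for Option A, the same idea can be sketched with a C++ template. This is not Carbon code and not a proposed API: GenericInt, I64Tag, and CppLongLongTag are hypothetical names that mirror the issue text, and only the 64-bit case is modeled.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical tag types; the names mirror the issue text, not a real API.
struct I64Tag {};
struct CppLongLongTag {};

// One generic integer type, parameterized by bit width and a nominal tag.
// Only the 64-bit case is modeled here to keep the sketch short.
template <int Bits, typename Tag>
struct GenericInt {
  static_assert(Bits == 64, "sketch only models 64-bit integers");
  std::int64_t value;
};

// Addition is written once and preserves the tag, so the result type
// always matches the operand type.
template <int Bits, typename Tag>
constexpr GenericInt<Bits, Tag> operator+(GenericInt<Bits, Tag> a,
                                          GenericInt<Bits, Tag> b) {
  return GenericInt<Bits, Tag>{a.value + b.value};
}

using I64 = GenericInt<64, I64Tag>;
using CppLongLong = GenericInt<64, CppLongLongTag>;
```

In this sketch I64 and CppLongLong share one operator definition but remain distinct types, and mixing them in a single addition does not compile without an extra conversion.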

Option B: Adapters with impl Inheritance

  • Proposal: Define Cpp.long_long as an adapter that wraps i64. We would then need a language mechanism for adapters to automatically "inherit" impls from the type they wrap.
  • Trade-off: This "impl inheritance" feature does not currently exist. It would be a major new language feature. It also raises semantic questions: if impl inheritance works by unwrapping, would Cpp.long_long + Cpp.long_long (which unwraps to i64 + i64) return an i64? This would violate C++'s type preservation rules.
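
The type-preservation concern has a close C++ analogue. In the sketch below, CppLongLongAdapter is a hypothetical wrapper that gets its arithmetic by implicitly unwrapping to the wrapped type; the assertions show that the sum then has the wrapped type rather than the adapter type.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical adapter: a distinct type that wraps a 64-bit integer and
// "inherits" operations by implicitly unwrapping to the wrapped type.
struct CppLongLongAdapter {
  std::int64_t value;
  constexpr operator std::int64_t() const { return value; }
};

// No operator+ is declared for the adapter, so a + b unwraps both operands
// and uses the built-in addition on std::int64_t.
using SumType = decltype(CppLongLongAdapter{1} + CppLongLongAdapter{2});

static_assert(std::is_same<SumType, std::int64_t>::value,
              "the sum has the wrapped type, not the adapter type");
static_assert(!std::is_same<SumType, CppLongLongAdapter>::value,
              "the nominal adapter type is lost");
```

An impl inheritance mechanism would need to re-wrap results to avoid this outcome.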

Option C: impl Synthesis from C++ Builtins

  • Proposal: Don't provide any Carbon-defined impls for these types. Instead, leverage the general C++ operator interoperability mechanism proposed in issue #6166 ("C++ Interop: How should C++ operators interact with Carbon operator interfaces?"). When Carbon sees a + b (where a is Cpp.long_long), it would ask Clang to perform C++ overload resolution for long long + long long. Clang would find the C++ builtin operator, and Carbon would then synthesize a temporary impl for the call.
  • Trade-off: This reuses the same scalable mechanism needed for all C++ operator interop (e.g., for Cpp.Widget). However, it relies on that (complex) feature being fully implemented.
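
A loose C++ analogy for Option C is a forwarding function whose result type is whatever C++ itself assigns to the expression. SynthesizedAdd is a hypothetical name standing in for the synthesized impl; the sketch only illustrates that delegating to the C++ builtin operator preserves the C++ result type, not how Carbon or Clang would implement the lookup.

```cpp
#include <type_traits>

// Stand-in for a synthesized impl: it defines no arithmetic of its own and
// simply forwards to whichever operator+ C++ overload resolution selects.
template <typename A, typename B>
constexpr auto SynthesizedAdd(A a, B b) -> decltype(a + b) {
  return a + b;
}

// For builtin operands of the same type, the result keeps that type.
static_assert(
    std::is_same<decltype(SynthesizedAdd(1LL, 2LL)), long long>::value,
    "long long + long long yields long long");
static_assert(std::is_same<decltype(SynthesizedAdd(1L, 2L)), long>::value,
              "long + long yields long");
```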

Challenge 2: Mixed-Type Arithmetic and Conversions

We must define the behavior of expressions mixing canonical Carbon types and these new C++ types, such as my_i64 + my_cpp_long_long.

Result Type: What type should this expression produce?

  • Produce Cpp.long_long: This would align with C++'s "usual arithmetic conversions," where both operands are converted to the operand type with the higher conversion "rank". This seems best for an "unsurprising mapping".
  • Produce i64: This would produce a more common, canonical Carbon type, avoiding pulling C++'s complex integer promotion rules into Carbon. However, this is a "surprising" mapping for C++ interop.
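
For reference, the result types that C++ itself assigns under the usual arithmetic conversions can be checked at compile time. The alias name CommonArith is hypothetical; the assertions use only signed types, where the outcome does not depend on the platform's data model.

```cpp
#include <type_traits>
#include <utility>

// The type C++ assigns to a + b for operands of types A and B.
template <typename A, typename B>
using CommonArith = decltype(std::declval<A>() + std::declval<B>());

// Same signedness: the lower-rank operand is converted to the higher-rank
// type, even when both types have the same size on the target.
static_assert(std::is_same<CommonArith<int, long>, long>::value,
              "int + long yields long");
static_assert(std::is_same<CommonArith<long, long long>, long long>::value,
              "long + long long yields long long");
static_assert(std::is_same<CommonArith<long long, long>, long long>::value,
              "operand order does not matter");

// Integral promotion applies first: types narrower than int become int.
static_assert(std::is_same<CommonArith<short, short>, int>::value,
              "short + short yields int");
```

The last assertion is an example of the promotion rules that the i64 option would avoid importing into Carbon.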

Conversion Rules: The choice of result type dictates the conversion rules.

  • To make i64 + Cpp.long_long result in Cpp.long_long, the compiler must use an impl Cpp.long_long as AddWith(Cpp.long_long). This requires a "safe" implicit conversion from i64 to Cpp.long_long.
  • This aligns with Carbon's proposal for safe implicit conversions, which allows value-preserving conversions (like int to long).
  • To prevent ambiguity, the reverse conversion (Cpp.long_long to i64) would then need to be explicit (e.g., as i64), as this conversion "loses" the nominal C++ type identity.
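
The asymmetric model described above can be sketched in C++ with an implicit constructor and an explicit conversion operator. CppLongLong here is a hypothetical wrapper standing in for Cpp.long_long, and std::int64_t stands in for i64; this is an analogy for the conversion shape, not a statement about how Carbon would implement it.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for Cpp.long_long as a distinct wrapper type.
struct CppLongLong {
  std::int64_t value;

  // Implicit: converting in from the canonical type is value-preserving.
  constexpr CppLongLong(std::int64_t v) : value(v) {}

  // Explicit: converting out drops the nominal C++ type identity, so the
  // caller has to ask for it.
  constexpr explicit operator std::int64_t() const { return value; }
};

// The only addition provided takes two CppLongLong operands, so a mixed
// expression converts the canonical operand and yields CppLongLong.
constexpr CppLongLong operator+(CppLongLong a, CppLongLong b) {
  return CppLongLong(a.value + b.value);
}

static_assert(std::is_convertible<std::int64_t, CppLongLong>::value,
              "canonical to C++ type is implicit");
static_assert(!std::is_convertible<CppLongLong, std::int64_t>::value,
              "C++ type to canonical is not implicit");
```

Because the reverse conversion is explicit, the builtin integer addition is not a candidate for the mixed expression, which is how the sketch avoids the ambiguity mentioned above.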

Challenge 3: Location of Definitions

If we choose any path other than impl synthesis (Option C under Challenge 1), where do these impls and conversions live?

  • The standard prelude cannot have a dependency on the Cpp library.
  • Are these definitions compiler built-ins that are "activated" when a user writes import Cpp?
  • Should we introduce a cpp_prelude that is automatically imported with import Cpp?

Questions for Leads:

  1. impl Provisioning: What is our strategy for providing operations for distinct C++ integer types?

  2. long Ambiguity: How should we model Cpp.long? Is it a single distinct type that is conditionally mapped to i32 or i64 behavior based on the target C++ data model (LLP64 vs. LP64)?

  3. Mixed-Type Arithmetic:

    • 3a. Result Type: What type should i64 + Cpp.long_long produce? Should we prioritize C++ compatibility (result: Cpp.long_long) or a Carbon-canonical result (result: i64)?
    • 3b. Conversions: Based on the answer to (3a), do we approve an asymmetric conversion model: implicit i64 -> Cpp.long_long but explicit Cpp.long_long -> i64?
  4. Definition Location: If we require pre-defined conversions (per Q3b) or an impl strategy other than impl synthesis (Option C under Challenge 1), where are these definitions located? Are they compiler built-ins activated by import Cpp?
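
To make the data-model dependence in question 2 concrete, the following sketch selects a fixed-width type from the width of long on the current target. The names kLongBits and LongBehavior are hypothetical and only illustrate the "conditionally mapped behavior" idea; the sketch assumes long is either 32 or 64 bits wide.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Width of C++ long on the current target: 32 under LLP64 (64-bit Windows),
// 64 under LP64 (64-bit Linux and macOS).
constexpr int kLongBits = sizeof(long) * CHAR_BIT;

// Hypothetical: the fixed-width type whose behavior Cpp.long would borrow.
using LongBehavior =
    std::conditional<kLongBits == 64, std::int64_t, std::int32_t>::type;

static_assert(sizeof(LongBehavior) == sizeof(long),
              "the selected type matches the width of long");

// long itself stays a distinct type from int and long long in either model.
static_assert(!std::is_same<long, int>::value &&
                  !std::is_same<long, long long>::value,
              "long is nominally distinct in every data model");
```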

Any other information that you want to share?

Carbon <-> C++ Interop: Primitive Types proposal

Labels: leads question (A question for the leads team)
