How should interfaces for core functionality be named? #1058

Closed
zygoloid opened this issue Jan 31, 2022 · 15 comments
Labels: leads question (a question for the leads team)

Comments

@zygoloid
Contributor

We have various choices for how to name interfaces that correspond to core functionality, such as overloaded operators, implicit conversions, etc. Some possible considerations:

  • We might want these names to be as terse as is feasible. For example, Rust uses Mul, Deref, Index, and Haskell uses Eq, Ord, Num.
  • We might want some kind of systematic decoration for these names. For example, Python uses __double_underscore__ names such as __mul__.
  • We might want these names to be fully descriptive and unabbreviated, with nothing in particular setting them apart from user-defined interfaces, but still as short as possible subject to those constraints. For example, previous discussion in Carbon has used as placeholders names like AddableWith, and we have accepted proposals using CommonTypeWith, As, and ImplicitAs.
  • We might want these names to be generated directly from the corresponding operators. For example, C++ uses operator*, operator[]. This would deviate from Carbon's normal naming rules, but this is perhaps a place where we could afford such deviation.
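For reference, the terse Rust-style option can be sketched with Rust's real `std::ops::Mul` and `std::ops::Index` traits. The `Vec2` type and its fields are made up for this illustration; this is Rust, not Carbon, and only shows how short interface names read at the implementation site and at the use site.

```rust
use std::ops::{Index, Mul};

/// A 2-D vector used only for illustration (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

// `Mul<f64>` is the interface behind the expression `v * s`.
impl Mul<f64> for Vec2 {
    type Output = Vec2;

    fn mul(self, scale: f64) -> Vec2 {
        Vec2 { x: self.x * scale, y: self.y * scale }
    }
}

// `Index<usize>` is the interface behind the expression `v[i]`.
impl Index<usize> for Vec2 {
    type Output = f64;

    fn index(&self, i: usize) -> &f64 {
        match i {
            0 => &self.x,
            1 => &self.y,
            _ => panic!("index out of range for Vec2: {}", i),
        }
    }
}

fn main() {
    let v = Vec2 { x: 1.5, y: -2.0 } * 2.0;
    assert_eq!(v, Vec2 { x: 3.0, y: -4.0 });
    assert_eq!(v[1], -4.0);
}
```

Callers only ever write `v * 2.0` or `v[1]`; the interface names appear solely where the type opts in.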

Relative to Carbon's goals, the most important aspects are likely:

  • Language evolution: changes to which interfaces are core versus which are merely built on top of core interfaces should ideally avoid significant ripple effects in how code is written, suggesting that we don't use a special naming convention for these operations.
  • Code that is easy to read, understand, and write, and in particular:
    • Syntax should be easily parsed and scanned by any human in any development environment, not just a machine or a human aided by semantic hints from an IDE.
    • Explicitness must be balanced against conciseness, as verbosity and ceremony add cognitive overhead for the reader, while explicitness reduces the amount of outside context the reader must have or assume.

Probably the two most plausible options are more-abbreviated Rust-like naming and more-verbose naming following our current proposals.

@zygoloid
Contributor Author

One observation about the Rust-like names is that they tend to read as actions ("add" or "multiply" or "index"), whereas names in current Carbon read as properties ("addable" or "multipliable"). This isn't necessarily fundamental to the approach, however: the Haskell names read as properties too ("numeric" or "ordered"), and some Carbon names that have been discussed aren't properties ("PartialOrder").

This naming question has some influence on other parts of the language design. In particular, the is operator makes most sense for interfaces that are named as properties, but it's probably not a large cognitive cost for people to get used to utterances like T is Add.

We could perhaps accept that impl T as I can mean any of "T is I" (eg, Hashable), or "T is an I" (eg, Widget), or "T can I" (eg, Add), or even "T has an I" (eg, Order), and that people will get used to that, but we'll be setting an important example with how we name the core language interfaces.

Another concern is that many of these interfaces will have a single method, and we need to pick a name for that too. In Rust, Add.add works out well. With Carbon naming and name shadowing rules, Add.Add isn't going to work well, but Addable.Add avoids the problem.
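To make the `Add.add` point concrete, here is a small Rust sketch using the real `std::ops::Add` trait (the `Meters` type is hypothetical). By Rust convention the trait is written `Add` and the method `add`, so the two never share a spelling; that difference in casing is what is not available under the Carbon naming rules mentioned above.

```rust
use std::ops::Add;

/// Hypothetical newtype used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

impl Add for Meters {
    type Output = Meters;

    // The single method of the interface: same word as the trait, different case.
    fn add(self, other: Meters) -> Meters {
        Meters(self.0 + other.0)
    }
}

fn main() {
    let (a, b) = (Meters(2), Meters(3));
    // Operator syntax and the explicit trait call reach the same function.
    assert_eq!(a + b, Meters(5));
    assert_eq!(Add::add(a, b), Meters(5));
}
```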

@josh11b
Contributor

josh11b commented Jan 31, 2022

Another option is to give these interfaces short names, but put them in some namespace that clarifies these are ops, like Op or Carbon.Op.

@josh11b josh11b added this to Questions in Issues for leads via automation Feb 2, 2022
@zygoloid
Contributor Author

zygoloid commented Feb 3, 2022

Recent discussions of this issue:

@jonmeow
Contributor

jonmeow commented Feb 7, 2022

Are these intended to be pulled in by the prelude, or would they require an explicit import? I think it could function either way, but it affects name collisions. i.e., Carbon.Index might be fine if it needs to be imported explicitly, but if it were in the prelude, the resulting Index name would more likely run into conflicts with user code. (a namespace per josh11b's suggestion would disambiguate).
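A rough Rust sketch of the collision concern and the namespace workaround follows. The `op` module, its `Index` trait, and the user-defined `Index` struct are all hypothetical; the module only stands in for something like `Carbon.Op`, and nothing here reflects how Carbon's prelude actually behaves.

```rust
// Hypothetical `op` module standing in for a namespace such as `Carbon.Op`.
mod op {
    pub trait Index<I> {
        type Output;
        fn op(&self, i: I) -> Self::Output;
    }
}

/// User code remains free to use the name `Index` for something unrelated.
struct Index {
    entries: Vec<String>,
}

// The qualified name `op::Index` keeps the interface and the user type apart.
impl op::Index<usize> for Index {
    type Output = String;

    fn op(&self, i: usize) -> String {
        self.entries[i].clone()
    }
}

fn main() {
    let index = Index {
        entries: vec!["alpha".to_string(), "beta".to_string()],
    };
    assert_eq!(op::Index::op(&index, 1usize), "beta");
}
```

If the trait were brought into scope unqualified, as a prelude would do, it would compete with the user's `Index` for the same short name.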

@zygoloid
Contributor Author

Some operators work particularly badly under the FooableWith approach, mostly because they try to read well as plain English, but the plain English rules are not uniform or simple:

  • SubtractableWith sounds awkward. SubtractableFrom might be more natural in English, but would reverse the argument order (RHS is SubtractableFrom(LHS)) and introduce nonuniformity.
  • Is it MultiplyableWith.Multiply (uniform spelling but incorrect in English) or MultipliableWith.Multiply (correct English spelling but non-uniform)?
  • DividableWith is also awkward, both because Divideable would be the result of uniformly adding -able and because you divide by, not with, the RHS.
  • It's not clear what name to give the interface for the % operator.
  • << and >> would presumably be ShiftableLeft and ShiftableRight, which is a little non-uniform.

These problems don't occur for Rust- or C++-like naming.

@chandlerc
Contributor

FWIW, I agree about FooableWith or Fooable generally being difficult. I will note that I think << and >> aren't that bad: LeftShiftable and RightShiftable seem fine, and generally using noun-shaped "left shift" and "right shift" are likely fine terms for here or documentation where needed.

But the other points are all pretty strong IMO.

Two things I'd like to understand:

  • Are we really confident that we will want the interfaces in this space to follow the single-function pattern?
  • Do we expect anything other than a return type associated type to be part of the interface?

@chandlerc
Contributor

Are these intended to be pulled in by the prelude, or would they require an explicit import?

I would expect us to want these in the prelude given that they directly interact with the core language.

I think it could function either way, but it affects name collisions. i.e., Carbon.Index might be fine if it needs to be imported explicitly, but if it were in the prelude, the resulting Index name would more likely run into conflicts with user code. (a namespace per josh11b's suggestion would disambiguate).

I agree that collisions are something to think about. Personally, I still think that the prelude should only alias a (very) small subset of names as being available without the Carbon. prefix, but insisting on a dedicated import for everything that isn't aliased won't, IMO, make a lot of sense for constructs exactly like this.

@zygoloid
Contributor Author

Are we really confident that we will want the interfaces in this space to follow the single-function pattern?

I'm not entirely confident of that. I could imagine finding we want variants of the operator as additional methods:

interface AddableWith(T:! Type) {
  let Result:! Type;
  fn Add[me: Self](other: T) -> Result;
  default fn TryAdd[me: Self](other: T) -> Optional(Result) {
    return me.Add(other);
  }
}
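The same shape can be tried out in Rust as a rough analogue. The trait `AddableWith`, its `try_add` default, and the `Small` and `Scalar` types are all hypothetical; in particular, wrapping the result in `Some` is an assumption about what the implicit conversion to `Optional(Result)` would do.

```rust
/// Hypothetical Rust analogue of the `AddableWith(T)` interface sketched above.
trait AddableWith<T> {
    type Result;

    fn add(self, other: T) -> Self::Result;

    /// Default variant: assume addition cannot fail and wrap the result.
    fn try_add(self, other: T) -> Option<Self::Result>
    where
        Self: Sized,
    {
        Some(self.add(other))
    }
}

/// Uses the default `try_add`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Scalar(f64);

impl AddableWith<Scalar> for Scalar {
    type Result = Scalar;

    fn add(self, other: Scalar) -> Scalar {
        Scalar(self.0 + other.0)
    }
}

/// Overrides `try_add` to report overflow instead of wrapping.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Small(u8);

impl AddableWith<Small> for Small {
    type Result = Small;

    fn add(self, other: Small) -> Small {
        Small(self.0.wrapping_add(other.0))
    }

    fn try_add(self, other: Small) -> Option<Small> {
        self.0.checked_add(other.0).map(Small)
    }
}

fn main() {
    assert_eq!(Scalar(1.5).try_add(Scalar(2.0)), Some(Scalar(3.5)));
    assert_eq!(Small(1).try_add(Small(2)), Some(Small(3)));
    assert_eq!(Small(200).try_add(Small(100)), None);
}
```

Once an interface carries more than one function like this, a single generic member name no longer covers it, which is why the single-function question matters for naming.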

@chandlerc
Contributor

We've discussed this several times, and I wanted to write up a rough summary of where I think we're starting to converge at least initially. There are a few patterns that end up guiding what naming scheme to use.

(1) Interfaces that implement a specific language expression syntax

One category of interfaces here are the ones with a specific 1:1 correspondence with an expression syntax, mostly operators. Where we have this really strong correspondence, I think there is also a growing interest in the following pattern:

  • Name the interface with a very short name strongly associated with the syntax itself:
    • Binary operators: Add, Sub, Mul, Div, ...
    • Unary operators: Increment, Decrement, Deref, Negate
    • ...
  • Name a single function in all of these interfaces with a consistent, generic, short name. The leading candidate is Op.
  • Anchor on the idea that the canonical syntax for calling the Op function is the builtin language syntax (a + b), and so these interfaces should be consistently implemented externally. This, in turn, makes it fine for them to all have the same function name.
    • We should consider adding the ability for an interface to make itself (or one of its functions) always external. But that can be a separate step.
  • If this pattern works well but the extra function name Op becomes a frustrating source of boilerplate, we can revisit adding dedicated syntax for an interface that has a single primary function.
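The pattern in this list can be approximated in Rust as a sketch. The `core_ops` module, its `Add` and `Sub` traits, the shared member name `op`, and the `Money` type are all hypothetical stand-ins, not Rust's `std::ops` and not Carbon's actual library. Rust cannot route `a + b` to a user-defined trait, so the fully qualified call plays the role that operator syntax would play in the proposed design.

```rust
// Hypothetical stand-ins for the proposed core interfaces; not Rust's std::ops.
mod core_ops {
    pub trait Add<Rhs = Self> {
        type Result;
        fn op(self, rhs: Rhs) -> Self::Result;
    }

    pub trait Sub<Rhs = Self> {
        type Result;
        fn op(self, rhs: Rhs) -> Self::Result;
    }
}

/// Hypothetical type implementing both interfaces.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Money(i64);

impl core_ops::Add for Money {
    type Result = Money;

    fn op(self, rhs: Money) -> Money {
        Money(self.0 + rhs.0)
    }
}

impl core_ops::Sub for Money {
    type Result = Money;

    fn op(self, rhs: Money) -> Money {
        Money(self.0 - rhs.0)
    }
}

fn main() {
    let (a, b) = (Money(10), Money(4));
    // Both interfaces name their function `op`, so each call names its interface.
    assert_eq!(<Money as core_ops::Add>::op(a, b), Money(14));
    assert_eq!(<Money as core_ops::Sub>::op(a, b), Money(6));
}
```

Because neither `op` is reachable as a plain method on `Money` here, sharing the name across interfaces causes no clash, which mirrors the argument for external implementation.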

One big question I don't have a good answer for is how to choose the short names for the interfaces. When should they be abbreviated? Above, I used the names that seem most natural to me, but they are a complete mixture. Options:

  • (1a) Always abbreviate as much as we can (min of 3 letters likely). From the above: Add, Sub, Mul, Div, Inc, Dec, Deref, Neg.
    • For me, Dec and Neg are not trivially recognizable when abbreviated this way. But I could get used to it.
  • (1b) Never abbreviate, and work to find reasonably short and easily spelled names likely by using simpler verb forms instead of words like Multiplication. From the above: Add, Subtract, Multiply, Divide, Increment, Decrement, Dereference, Negate.
    • I'm OK with this, but worried some may be annoyed by the verbosity, for example with names like Dereference.
    • I find I mildly prefer the abbreviations for some of these because I more strongly associate those names with programming while the words I associate with math. Super subjective, but there it is.
  • (1c) Use abbreviations when "sufficiently" widely used and recognizable in languages and programming contexts. Will be a judgement call (maybe for the painter?) on each name.
    • This matches the above names for me personally. The reason for Increment is because I'd like it to match Decrement and Dec isn't as easily recognizable. My personal bar for this would be quite high, but the 5 abbreviations above would be fine with me.
    • Purely subjectively, I find this most aesthetically pleasing. But it's not very principled.

(2) Interfaces that contribute usefully to creating consistent APIs on types

For some interfaces, it is useful for the function name to be reasonable when embedded into a type's API with an internal implementation to create consistent and idiomatic type APIs. While we may want to introduce "spaceship" style syntax (<=>) for calling a three-way comparison function, let's assume we don't for the purpose of an example. It might then be nice to structure the interface with a function name Compare so that types can internally implement it and get an idiomatic method name as part of their API. And yet these might still be used as part of core functionality in the language.

We still need to name the interface, and there were still some concerns about cases where that name might overlap with the useful method names. We could repeat the names (Compare.Compare), but there were some concerns there, including some amount of confusion.

One candidate guide for naming the interface that seemed interesting was this: if the desired name of the interface might collide with one of its member names, either exactly or as two parts of speech like Comparable and Compare, then rather than relying on sometimes confusing, inconsistent, or complex English rules for separating two parts of speech, use the simpler or more obvious word (even if a verb or noun) and add a suffix like Interface or Constraint, for interfaces or named constraints respectively: CompareInterface or CompareConstraint.

  • This is a bit verbose, but if we find it a problem in practice we can shorten. Some hesitation about the Iface abbreviation, and sadly FooI is a pattern with a different meaning in some existing C++ codebases.
  • Suggestion was to start with the unambiguous but verbose option and see how bad it is before trying to optimize.
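As a rough Rust illustration of category (2), the sketch below uses a hypothetical `CompareInterface` trait whose member `compare` becomes an ordinary-looking method on the implementing type. The `Version` type is made up, and Rust's trait methods are only an approximation of Carbon's internal implementations.

```rust
use std::cmp::Ordering;

/// Hypothetical interface: the suffix keeps its name distinct from its member.
trait CompareInterface {
    fn compare(&self, other: &Self) -> Ordering;
}

/// Hypothetical type that wants `compare` as part of its own API.
struct Version {
    major: u32,
    minor: u32,
}

impl CompareInterface for Version {
    fn compare(&self, other: &Self) -> Ordering {
        // Order by major version first, then break ties on minor version.
        self.major
            .cmp(&other.major)
            .then(self.minor.cmp(&other.minor))
    }
}

fn main() {
    let older = Version { major: 1, minor: 9 };
    let newer = Version { major: 2, minor: 0 };
    assert_eq!(older.compare(&newer), Ordering::Less);
    assert_eq!(newer.compare(&older), Ordering::Greater);
    assert_eq!(newer.compare(&newer), Ordering::Equal);
}
```

Here the method name reads naturally at the call site, so unlike the `Op` pattern the member name is part of what users see.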

However, one unanswered question is what to do when the desired name is already an adjective and completely distinct from any likely member name. Some candidates might be Abstract or Concrete as type-of-types (not interfaces). Other interesting cases here are As and ImplicitAs where we reference the syntax construct's name and have an unsurprising but distinct verb (Convert) to use as the member. There seem to be two choices:

  • (2a) No need for a suffix here.
    • We would only add the suffix overhead when there is some "verb"-y or "noun"-y name that would work well for the interface name to avoid doing a complex English transformation into an adjective.
    • Less verbose, but means the suffix isn't really a consistent pattern, just used to clarify otherwise complex/confusing cases.
  • (2b) Consistently use suffixes everywhere.
    • This would make many more interfaces more verbose.
    • Unclear we really want to strive for consistency here... For example (1) above wouldn't be consistent anyways.

My current leaning

I'm really happy with any of the (1) variations, with a slight preference for (1c), then (1b), and (1a) least.

I'm less confident but still pretty happy with (2a). I don't think I'd like (2b) -- I'd like to see if we end up having any real difficulty deciding when to use a suffix before trying to add more rigid rules around it.

@josh11b
Contributor

josh11b commented Apr 2, 2022

I would like there to be a policy around constraints whose primary role is to express a property, rather than expose methods. Examples: concrete, sized, trivially destructible, has unformed state, is a data class, and so on.

@josh11b
Contributor

josh11b commented Apr 4, 2022

To clarify, since I don't think I expressed that well: I think there is a third category of constraints that are not really in category (1) since they are not about a specific operation and are more like an adjective than a verb, and not really in category (2) since they don't have members you would want added to your type by implementing them internally. You talk about them as part of category (2), but I don't think they fit there.

Also, what category would you put things like CommonType in?
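For comparison, Rust has constraints of this third kind: `Copy` and `Sized` are real marker traits that state a property and expose no methods of their own. The sketch below pairs `Copy` with a hypothetical marker trait `DataClass`; the trait, the `Point` type, and the `duplicate` function are invented for illustration and are not proposed Carbon names.

```rust
/// Hypothetical property-style constraint: no members, only a promise about the type.
trait DataClass {}

/// Hypothetical type that opts in to the property.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl DataClass for Point {}

/// Generic code names the properties as bounds; neither bound adds a method to call.
fn duplicate<T: DataClass + Copy>(value: T) -> (T, T) {
    (value, value)
}

fn main() {
    let p = Point { x: 1, y: 2 };
    assert_eq!(duplicate(p), (p, p));
}
```

Since such constraints have no member that could collide with the interface name, an adjective-like name needs no suffix under option (2a).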

@chandlerc chandlerc mentioned this issue Apr 6, 2022
@chandlerc
Contributor

Just wanted to update this to call out that in open discussion there seemed to be:

  • Pretty good consensus around (1) generally.
  • Not much support for (1b) due to verbosity.
  • Some concerns around (1a) due to confusing abbreviations like Dec.
  • Not too much concern about the ad-hoc nature of (1c) -- the expectation being that this is vocabulary that will be learned, including which spelling.
  • Lots of concerns with (2b), including the exact concerns raised by @josh11b in the previous two comments.
  • Decent amount of happiness with (2a) -- basically minimally using suffixes of Interface or Constraint when the alternative would be a hard-to-remember or confusing transformation of a noun or verb into an adjective (like "divisible"). But emphasis on minimal, and instead using direct names when available.
    • This means CommonType, if desired, would fall under (2a) but not need any suffix.

Not sure this is fully consensus yet, but wanted to track the discussion.

@chandlerc
Contributor

I think the leads are good w/ (1c) and (2a) from my earlier comment: #1058 (comment)

This includes the clarification of (2a) to mean -- minimally use suffixes like Interface or Constraint when there would otherwise be a confusing transformation of a noun or verb into an adjective.

And we'll start with the initial list of abbreviations in (1c) alone: Sub, Mul, Div, Deref. If other good candidates come up, they'll be considered as they arise.

Issues for leads automation moved this from Questions to Resolved Apr 9, 2022
@chandlerc chandlerc moved this from Resolved to Needs proposal in Issues for leads Apr 9, 2022
zygoloid added a commit that referenced this issue May 3, 2022
Add concrete design for interfaces for comparison.

Rename interfaces for arithmetic following current thinking in #1058.

Update rules for mixed-type comparisons for data classes following #710.

Co-authored-by: Chandler Carruth <chandlerc@gmail.com>
chandlerc added a commit that referenced this issue Jun 28, 2022
Add concrete design for interfaces for comparison.

Rename interfaces for arithmetic following current thinking in #1058.

Update rules for mixed-type comparisons for data classes following #710.

Co-authored-by: Chandler Carruth <chandlerc@gmail.com>
@jonmeow jonmeow added the leads question A question for the leads team label Aug 10, 2022
@jonmeow
Contributor

jonmeow commented Aug 11, 2022

Does this still need a proposal given #1178 and related work?

@chandlerc
Contributor

I think this is done now.

geoffromer added a commit to geoffromer/carbon-lang that referenced this issue Sep 15, 2022
Notable changes from p0157:
- Syntax and semantics are updated to reflect subsequent design work, especially on variable declarations, generics, and classes.
- `Matchable.Match` is now `Match.Op`, following the resolution of carbon-language#1058.