Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: always_specify_ambiguous_types #4853

Open
Hixie opened this issue Jan 24, 2024 · 19 comments
Open

proposal: always_specify_ambiguous_types #4853

Hixie opened this issue Jan 24, 2024 · 19 comments
Labels
lint-proposal P3 A lower priority bug or feature request status-pending type-enhancement A request for a change that isn't a bug

Comments

@Hixie
Copy link

Hixie commented Jan 24, 2024

always_specify_ambiguous_types

Description

When an unobvious type is omitted: "Specify type annotations when the type is not unambiguously obvious."

When a redundant type is given: "Omit type annotations when the type is unambiguously obvious."

Details

This lint would be like always_specify_types, except that in the following cases it would require that the type be omitted:

  • Types of variables when the fully qualified name of the inferred type of the variable is identical to the variable name itself, other than for the case of the first character of the name.

  • Types of collection literals when they are being assigned to a field or variable of the same type as the collection type that would be inferred.

Kind

Style advice.

Bad Examples

// BAD
final Foo foo = activate();
final List<Widget> children = <Widget>[Placeholder()];
final count = 2;
measure((baz) { baz.energize(); });

Good Examples

// GOOD
final foo = activate();
final List<Widget> children = [Placeholder()];
final int count = 2;
measure((Baz baz) { baz.energize(); });

Discussion

This lint is intended as a potential way to reduce verbosity without losing the benefits of clarity that come from explicit types.
It's related to always_specify_types, avoid_types_on_closure_parameters, and omit_local_variable_types. It came from discussion of the Flutter style guide.

@johnmccutchan
Copy link

I would also add the following to Good Examples:

  1. Constructor calls.
// rvalue is constructor call which fully specifies the type.
final x = Foo();
  1. For loop iterators
// map already specifies the types.
for (final x in map) 
// list already specifies the type.
for (final x in list) 

@Hixie
Copy link
Author

Hixie commented Jan 24, 2024

Constructors don't guarantee the type because of factory constructors, but that one seems probably ok, yeah.

For the maps and lists one I can't tell what the type of x is so I'm not sure that's in the spirit of this lint.

@srawlins srawlins added the P3 A lower priority bug or feature request label Jan 24, 2024
@jacob314
Copy link
Member

The one case that surprises me is that
final int count = 2;
is considered good. This seems like the constructor call case where the type is unambiguous.

@johnmccutchan
Copy link

Constructors don't guarantee the type because of factory constructors, but that one seems probably ok, yeah.

For the maps and lists one I can't tell what the type of x is so I'm not sure that's in the spirit of this lint.

My suggestions are from the Plaque project's restrictions on use of the C++ "auto" keyword:

  • Constructor calls because usually the type is right there on the line
  • Iterators because of the ridiculous amount of typing necessary and you can easily figure out the type of the container.

I'm not surprised you find the iterator case controversial :)

@goderbauer
Copy link
Contributor

There have been past discussions of similar lints to omit "obvious" types, see #3480 and dart-lang/lints#24. I believe, there are more issues on file around this, but I couldn't find them just now.

@pq
Copy link
Member

pq commented Jan 24, 2024

Chatting a bit with @bwilkerson, my sense is that from an implementation perspective, @Hixie's initial proposal isn't tricky. That said, I do think we want to spend some time up front trying to seek some some kind of consensus on what exactly we want to implement with buy in from the Dart/Flutter community. Ideally whatever we implement would be something we'd go so far as recommending in the core or recommended (or at least flutter) lint sets.

📟 @goderbauer @lrhn @munificent @jakemac53 @natebosch

@Hixie
Copy link
Author

Hixie commented Jan 24, 2024

Iterators because of the ridiculous amount of typing necessary and you can easily figure out the type of the container.

What is the type of x in the examples you gave? I don't know and have no way to guess. :-) The point of this lint proposal is to try to be as conservative as possible in terms of what types we remove. It's trying to optimize for clarity in the code while giving slightly more friendliness to developers in those cases where we can do so with zero loss of clarity.

The one case that surprises me is that final int count = 2; is considered good. This seems like the constructor call case where the type is unambiguous.

I can see why it would be controversial. My concern with 2 in particular is that I wouldn't want code to end up looking like this:

final double x = 0;
final double y = 0;
final count = 0;

I think an interesting question is whether String and bool should be omitted for variables set to string and bool literals. I would lean towards including the types even though it's unambiguous because otherwise the type doesn't appear anywhere and it becomes a bit less clear what the intent was (plus it would be more consistent with double/int), but I can certainly see the opposite argument is strong.

I do wonder about constructors in general. It's weird to me that static (de facto factory) methods would be different than factory constructors:

   final Foo x = Foo.createFoo(); // variable doesn't match type name, callee is a static method => need a type
   final x = Foo.thingy(); // callee is a constructor => no type
   final foo = Foo.createFoo(); // callee matches type name => no type

Ugh, this just makes me think the whole idea is bad. :-)

@lrhn
Copy link
Member

lrhn commented Jan 24, 2024

Erik's lists in the linked issues are what I'd go for (expressions with trivial types are non-connection literals, collection literals with type arguments, collection literals containing only expressions with trivial types and the same type, constructor invocations, with type arguments if generic). The one thing I might add is expression as T.

Some maybes are:

  • arithmetic expressions on nums, at least if containing at least one literal, and at least one double literal or division of the result is a double.
  • conditional expressions where both branches have the same type, at least one of them being trivial.

I wouldn't consider a variable having the same name as a type as sufficient, because that requires me to know that the type exists.

And I wouldn't want this in recommended lints. Too controversial and divisive, with no real benefit. Effectively forcing people to remove information that they want to have there, whether one agrees with that or not, is just aggravating.
People who really want the lint can choose it, but forcing it, without any convincing reason, isn't helpful.

@natebosch
Copy link
Member

Ideally whatever we implement would be something we'd go so far as recommending in the core or recommended (or at least flutter) lint sets.

This is not a lint I'd want in the core or recommended set. It's also not a lint I would want enabled for Dart team projects either.

I'd like to see omit_local_variable_types adopted across Dart team projects. I think the ecosystem has also adopted it widely enough that we could consider it for the recommended set, but that isn't something I'd spend effort pushing on right now.

It might be OK to loosen omit_local_variable_types or to add a new version of it with more heuristics that allow authors to re-specify inferred types when they might be ambiguous and a human author or review thinks the specific redundant annotation is useful. I am opposed to the inclusion of a lint that forces specifying redundant types in general.

@jacob314
Copy link
Member

omit_local_variable_types has not been very widely adopted by the ecosystem and there is actually a fairly even split between users using it and always_specify_types .
Only a bit over 10% of users have bothered to use either lint and for users that have bothered, 60% use omit_local_variable_types and 40% use always_specify_types.

By comparison, the most popular lint not in a recommended set is cancel_subscriptions used by 20.8% of users. I think this is evidence that something a bit more pragmatic that only warned about providing type annotations for cases where the types are blindly obvious could get more adoption than prohibiting all types or requiring types everywhere.

@srawlins
Copy link
Member

omit_local_variable_types is enforced across the board in Google-internal code though.

@bwilkerson
Copy link
Member

FWIW, DCM implements prefer-type-over-var, which appears to be a little less complete than always_specify_types, but not the opposite. I don't have any data about how many users enable that lint.

@munificent
Copy link
Member

Only a bit over 10% of users have bothered to use either lint

Yeah, it's hard to get good data from that. I scraped a big corpus of pub packages and open source widgets (17,941,439 lines in 90,019 files). First, looking at local variable declarations individually:

-- All locals (410848 total) --
 230982 ( 56.221%): Untyped        ==========================
 161098 ( 39.211%): Typed          ==================
  18768 (  4.568%): Uninitialized  ===

Note that I call out uninitialized variables separately since those aren't candidates for inference, so we should expect them to be typed even by users who like inference.

Looking at individual variables doesn't give us good data about stylistic preference. If it turns out that people who like types also create more local variables (or vice versa), this will be skewed. (Also, sometimes huge generated files can skew data like this.)

I think a decent approximation of style preference is by looking at each file in aggregate (package would be better, but is a little harder for me to do):

-- Locals in file (41329 total) --
  19390 ( 46.916%): All inferred   ======================
  13231 ( 32.014%): All typed      ===============
   5844 ( 14.140%): Mixed          =======
   1964 (  4.752%): >80% inferred  ===
    900 (  2.178%): >80% typed     =

Here, each data point is a single file and we count the number of annotated and unnannotated local variables (ignoring uninitialized ones completely). If every variable in the file is typed, then it's a point for "All typed". Likewise, if no variable is typed, it's a point for "All inferred".

Otherwise, it's some mixture of some typed and some untyped. Those could be users who prefer typing "non-obvious" variables but inferring others. Or places where a type is needed to upcast or affect downwards inference but otherwise the user likes inference.

Here's a histogram of the actual percent of locals in a file that are annotated:

-- Percent inferred in file (41329 total) --
  13556 ( 32.800%): 0    ===================
    575 (  1.391%): 10   =
    789 (  1.909%): 20   ==
    807 (  1.953%): 30   ==
    422 (  1.021%): 40   =
   1594 (  3.857%): 50   ===
   1084 (  2.623%): 60   ==
    867 (  2.098%): 70   ==
   1383 (  3.346%): 80   ==
    862 (  2.086%): 90   ==
  19390 ( 46.916%): 100  ==========================
Sum 2395270, average 57.956, median 80

It looks like overall, users prefer inference, but not by a huge amount.

We could get a little more data by also looking at parameter types for function expressions, but I didn't do that here.

@munificent
Copy link
Member

Many files contain only a couple of local variables (often just one), so that can make the data appear a little more "all or nothing" since the denominator is so small. Here's the results filtered to only look at files that contain at least 10 local variables:

-- >= 10 locals in file (9023 total) --
   2978 ( 33.005%): All inferred   ===============
   2158 ( 23.917%): All typed      ===========
   1819 ( 20.160%): Mixed          ==========
   1420 ( 15.738%): >80% inferred  ========
    648 (  7.182%): >80% typed     ====

-- Percent inferred in file (>= 10 locals) (9023 total) --
   2483 ( 27.519%): 0    ================
    323 (  3.580%): 10   ==
    278 (  3.081%): 20   ==
    255 (  2.826%): 30   ==
    215 (  2.383%): 40   ==
    317 (  3.513%): 50   ==
    309 (  3.425%): 60   ==
    360 (  3.990%): 70   ===
    643 (  7.126%): 80   ====
    862 (  9.553%): 90   ======
   2978 ( 33.005%): 100  ===================
Sum 511450, average 56.683, median 70

As expected, there are more mixed results. But otherwise, it doesn't seem to skew much.

Aggregating by package might be a better way to glean the style preference of each team or user. I hacked together some code to do that:

-- Locals in package (1950 total) --
    793 ( 40.667%): Mixed          ===================
    496 ( 25.436%): >80% inferred  ============
    292 ( 14.974%): All inferred   =======
    228 ( 11.692%): >80% typed     ======
    141 (  7.231%): All typed      ====

-- Percent inferred in package (2553 total) --
    399 ( 15.629%): 0    =========
    134 (  5.249%): 10   ===
    177 (  6.933%): 20   ====
    182 (  7.129%): 30   ====
    131 (  5.131%): 40   ===
    173 (  6.776%): 50   ====
    152 (  5.954%): 60   ====
    147 (  5.758%): 70   ====
    196 (  7.677%): 80   =====
    337 ( 13.200%): 90   ========
    525 ( 20.564%): 100  ============
Sum 142150, average 55.680, median 60

There are enough mixed results this way that it's a bit hard to read, so just splitting down the middle:

-- Locals in package (1950 total) --
   1165 ( 59.744%): >= 50% inferred  ==========================
    785 ( 40.256%): < 50% inferred   ==================

When doing this, I noticed that my itsallwidgets.com corpus includes Flutter itself and the Flutter samples. If we're trying to figure out what the external ecosystem does, then it makes sense to skip those. Removing them from the dataset gives:

-- Percent inferred in file (39805 total) --
  12222 ( 30.705%): 0    =================
    574 (  1.442%): 10   =
    789 (  1.982%): 20   ==
    805 (  2.022%): 30   ==
    420 (  1.055%): 40   =
   1591 (  3.997%): 50   ===
   1082 (  2.718%): 60   ==
    866 (  2.176%): 70   ==
   1355 (  3.404%): 80   ==
    861 (  2.163%): 90   ==
  19240 ( 48.336%): 100  ===========================
Sum 2377450, average 59.727, median 90

-- Percent inferred in file (>= 10 locals) (8444 total) --
   1925 ( 22.797%): 0    =============
    323 (  3.825%): 10   ===
    278 (  3.292%): 20   ==
    255 (  3.020%): 30   ==
    213 (  2.523%): 40   ==
    316 (  3.742%): 50   ===
    308 (  3.648%): 60   ===
    360 (  4.263%): 70   ===
    642 (  7.603%): 80   =====
    861 ( 10.197%): 90   ======
   2963 ( 35.090%): 100  ====================
Sum 509590, average 60.349, median 80

-- All locals (384538 total) --
 228188 ( 59.341%): Untyped        ===========================
 139180 ( 36.194%): Typed          =================
  17170 (  4.465%): Uninitialized  ===

-- Any locals in file (39805 total) --
  19240 ( 48.336%): All inferred   ======================
  11899 ( 29.893%): All typed      ==============
   5833 ( 14.654%): Mixed          =======
   1936 (  4.864%): >80% inferred  ===
    897 (  2.253%): >80% typed     ==

-- >= 10 locals in file (8444 total) --
   2963 ( 35.090%): All inferred   ================
   1815 ( 21.495%): Mixed          ==========
   1602 ( 18.972%): All typed      =========
   1418 ( 16.793%): >80% inferred  ========
    646 (  7.650%): >80% typed     ====

@jacob314
Copy link
Member

Thanks for putting together all of this data. I find the numbers for ">=10 locals in a file the most interesting as those best approximate user behavior on a package level. Seems like may have a bit of a lint promotion problem as 35.090% of users have All inferred but a much smaller percentage of users have realized they should enable omit_local_variable_types to enforce that preference. Filed dart-lang/sdk#54763 to track.
The >=10 locals in a file data also gives me hope that there could be a useful middle ground always_specify_ambiguous_types lint that could be embraced by the 65% of users that include at least some types for locals in their code.

@FMorschel
Copy link
Contributor

omit_local_variable_types has not been very widely adopted by the ecosystem and there is actually a fairly even split between users using it and always_specify_types.
Only a bit over 10% of users have bothered to use either lint and for users that have bothered, 60% use omit_local_variable_types and 40% use always_specify_types.

Just a little insight from my team and I. We have prefer_final_in_for_each, prefer_final_locals, prefer_const_constructors and avoid_types_on_closure_parameters.

Between omit_local_variable_types and always_specify_types we also prefer the former. We simply don't use it because we usually have final/const variables or when we do have non-final local variables, we usually have their types explicit because of this part of the lint description:

Sometimes the inferred type is not the type you want the variable to have. For example, you may intend to assign values of other types later. In that case, annotate the variable with the type you want.

GOOD:

Widget build(BuildContext context) {
  Widget result = Text('You won!');
  if (applyPadding) {
    result = Padding(padding: EdgeInsets.all(8.0), child: result);
  }
  return result;
}

So basically, in our case, we would like a lint that asks to omit final/const types and always_specify_types for actual local "variables" (that change value).

@natebosch
Copy link
Member

, we usually have their types explicit because of this part of the lint description:

The omit_local_variable_types lint should not stop you from writing Widget result = Text('');. When the local variable type annotation is a supertype of the inferred type the lint will not emit a diagnostic. If you find any case where adding a local variable type impacts the semantics/behavior of the code like this and the lint still emits a diagnostic you can file an issue for us to investigate.

@FMorschel
Copy link
Contributor

FMorschel commented Jan 30, 2024

Sorry if I was not very clear.

We are aware of that, we simply do not use omit_local_variable_types because we do choose to always_specify_types for local non-final variables.

Edit:

We wanted to use both lints but for different cases, that's why. Maybe some other teams/individuals out there have some similar cases. That would explain why they are mostly avoiding activating them.

@srawlins srawlins added the type-enhancement A request for a change that isn't a bug label Mar 27, 2024
@Hixie
Copy link
Author

Hixie commented Jun 7, 2024

flutter/engine now uses a similar style (without any lint guardrails).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lint-proposal P3 A lower priority bug or feature request status-pending type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests