There are various places where Sorbet will take a bare generic class (one that hasn't been applied to any type arguments) and apply it to "sensible" type arguments.
Consider code like this:
# typed: true
extend T::Sig
class Box
extend T::Generic
Elem = type_member
sig {returns(T.nilable(Elem))}
attr_accessor :val
end
sig {params(box: Box).void}
def example(box)
val = box.val
T.reveal_type(val) # ... ?
end
The val
method mentions the Elem
type member on Box
. When we call this
method on a type like Box[Integer]
, Sorbet effectively
substitutes1 Integer
for Elem
in the signature for Box#val
.
If we allowed calling box.val
when box
had type ClassType { klass = ::Box }
, how would that substitution work? It wouldn't, because there are no type
arguments to apply.
You would think it nonsensical to try to treat this program as well-formed:
def requires_one_arg(x)
puts(x)
end
requires_one_arg() # 💥
And essentially, generic classes are type-level functions that take required, type-level arguments.
As such, Sorbet tries as hard as possible to require type arguments for generic classes. But the real world is messy.
Having defaults for bare generic classes is mostly a hack.
In almost all cases, we would probably prefer the user to give us arguments. We
don't want to see Array
, we want T::Array[SomeType]
. But there are some
complications we have to accommodate.
-
Pre-Sorbet type annotations.
The runtime type systems in use at Stripe that predated Sorbet allowed writing
Array
,Hash
, etc. as valid type annotations. We didn't want to codemod all these when we rolled out Sorbet to minimize disruption.This persists to this day: generic classes defined in the standard library are allowed to be bare in
# typed: true
. Only in# typed: strict
does writingArray
in type syntax produce an error. -
Types for class literals (or more generally, any class with type templates).
All class singleton classes are generic classes (generic in the
<AttachedClass>
type template that Sorbet implicitly declares). Without defaulting rules, the type of an expression likeMyClass
would beT.class_of(MyClass)
. This is a bare generic, because type templates are just type members scoped to a singleton class.No matter how hard we try, it would be impossible to ask users to replace something like
MyClass.foo
with something likeT.class_of(MyClass)[MyClass].foo
(even assuming this syntax were valid). Indeed, most of our choices around type member defaulting rules were invented to accommodate<AttachedClass>
. But the rules were chosen from first principles, not implementing something ad hoc for<AttachedClass>
specifically. -
Instantiating a generic class and forgetting to provide type arguments.
This is somewhat a repeat/special case of the previous, but worth mentioning anyways. Consider code like this:
Array.new Box.new
Both of these are problematic. We'd prefer that users write something like
Array[Integer].new
orBox[String].new
. But because of how Sorbet builds the CFG, both these look like<tmp>$1 = Array <tmp>$1.new() <tmp>$2 = Box <tmp>$2.new()
Sorbet has to make a choice for the type of
<tmp>$1
and<tmp>$2
when they're assigned, which makes it hard for Sorbet to reject the call to.new
saying something like "you have to apply this class to a type when instantiating it."Unlike the previous case,
Box[Integer].new
is at least syntax we support, because it's applying the arguments to the instance class. And even then, we only support this syntax for the.new
method, not for other singleton class methods:Box[Integer].make
isn't valid Sorbet syntax, because Sorbet special-cases constructor methods.2 -
Types for the
.class
method.Consider code like this
class MyModel extend T::Generic MutatorType = type_template { {upper: MyMutator} } def foo klass = self.class end end
What should the type of
klass
be in the snippet above? The type can't simply beT.class_of(MyModel)
, because that would be a bare generic. This is a very similar problem to the class literal problem, where it would be prohibitively annoying to ask users to rewrite code likeself.class
, and we don't even have a suitable replacement if we wanted to enforce something.Even more, there's a similar problem to the previous point:
my_box.class.new('')
If
my_box
isBox[Integer]
, there's no reason whymy_box.class.new
has to create anotherBox[Integer]
--making aBox[String]
is fine!But in Sorbet's CFG, that snippet looks like this:
<tmp>$1 = my_box.class <tmp>$2 = <tmp>$1.new('')
and Sorbet has no way of knowing whether
<tmp>$1
was a constant literal or an expression. (Wildly, Sorbet allowsmy_box.class[String].new('')
to make a well-typedBox[String]
instance).
TL;DR: It's unavoidable for Sorbet to have to allow bare generics in places.
However, completely bare generics are bad, because then we have no types to substitute for generic types when used in method signatures, etc.
So Sorbet has default rules to silently convert bare generic classes to applied generic classes in places where it's unavoidable.
So we've established that bare generic classes are bad, and that bare generic classes are also unavoidable. Defaulting rules are how we make the best of a bad situation.
The defaulting rules are implemented by ClassOrModule::externalType()
, which
is computed once per Symbol during resolver and cached, so the actual
computation happens in unsafeComputeExternalType
. (It's not actually unsafe
anymore because of a change jvilk made in this commit. It used to do
lurky things with atomics and racy mutable access).
The rules here might be out of date w.r.t. the implementation, and if they are please edit this doc. But at a high level:
-
If a type_member is a "legacy" generic, which is basically
Array
and a few other classes (but importantly, not every stdlib generic class), it unsafely defaults toT.untyped
no matter what. (This was just a hack to get things to type check when rolling out Sorbet.) -
If a type member is
fixed
, default it to whatever the type member is fixed to, regardless of variance. -
Otherwise, we're going to have to use the variance:
- Invariant? ->
T.untyped
- Covariant? -> the
upper
bound (the default upper bound isT.anything
) - Contravariant? -> the
lower
bound (the default lower bound isT.noreturn
)
- Invariant? ->
The most common question is then: why are invariant type members defaulted to
T.untyped
? It's because we can't get away with using upper
nor lower
.
Consider this setup:
class Parent; end
class Child < Parent; end
In this setup, T.class_of(Child) <: T.class_of(Parent)
, which is what users
expect. Now consider if we made Parent
and Child
generic:
class Parent
extend T::Generic
Elem = type_template { {upper: Numeric} }
end
class Child < Parent
extend T::Generic
Elem = type_template { {upper: Integer} }
end
Suddenly, Parent
and Child
need to have defaults (because of all the
situations we had before, like if they appear as class literals in a method
body, or as the result type of a call to self.class
). We might want to pick
defaults such that we maintain the relationship that held previously, namely that
T.class_of(Child) <: T.class_of(Parent)
. (Those types are meaningless now that
those are bare generic classes, but the intuition of wanting it to still hold
remains). The problem is that both upper
and lower
are hard to stomach as
defaults.
Consider if we pick upper
. Then we'd have that:
Parent
has typeT.class_of(Parent)[Numeric]
andChild
has typeT.class_of(Child)[Integer]
T.class_of(Child)[Integer]
is not a subtype of T.class_of(Parent)[Numeric]
because Elem
is invariant! As unintuitive as this sounds, this subtyping
outcome is actually the desirable, sound outcome: if we both pick the upper
bound as the default and treat Child
as a valid value of type
T.class_of(Parent)
, it's possible to draw a contradiction:
# typed: strong
class Module; include T::Sig; end
class Parent
extend T::Generic
Elem = type_template { {upper: Numeric} }
sig {overridable.params(x: Elem).void}
def self.example(x); end
end
class Child < Parent
extend T::Generic
Elem = type_template { {upper: Integer} }
sig {params(x: Integer).void}
def self.takes_integer(x); end
sig {override.params(x: Elem).void}
def self.example(x)
takes_integer(x)
# ^ our program passes a float to takes_integer without using T.unsafe,
# and with no errors
end
end
sig {params(parent_class: T.class_of(Parent)).void}
# ^ defaults to T.class_of(Parent)[Numeric]
def main(parent_class)
parent_class.example(0.0)
# ^^^ a valid numeric, but not an Integer!
end
main(Child)
So maybe upper
was the wrong thing to pick? But if we pick lower
, two things
happen:
T.class_of(Parent)
defaults toT.class_of(Parent)[T.noreturn]
, which means we can't even callparent_class.example
- We could make a similar contradiction by rewriting the example to use
Elem
inreturns
instead ofparams
.
So maybe we pick upper
when defaulting in params
, and lower
when
defaulting in returns
? Implementation complexity aside (we'd have to thread
input/output position (polarity) through all calls to externalType
, which would
mean we couldn't cache it as easily as now), we'd still have the problem that
most users expect Child
to be a valid T.class_of(Parent)
.
So Sorbet gives up and accepts the unsoundness by simply defaulting to
T.untyped
.
The takeaway? Type members in well-typed code should always either:
- be fixed, and accept the limitations that come with that (basically: makes the class final in practice3), or
- be either co- or contravariant
- be carefully reviewed to ensure that
T.untyped
isn't sneaking in (highlighting untyped code is a good idea, maybe even using# typed: strong
if possible).
Code making heavy use of invariant, non-fixed generics will be the most prone to
having T.untyped
sneak in, due to a combination of unavoidable and intentional
choices in the type system. Mitigating T.untyped
means looking at the list of
cases where Sorbet defaults type arguments, and doing the opposite.
-
Use
# typed: strict
, so Sorbet will treat bare stdlib generics as errors.For generic singleton classes, note that
T.class_of(...)[...]
is valid syntax for applying type arguments to a generic singleton class. (Sorbet does not yet require this syntax in# typed: strict
files, because the syntax is so new.) -
In a class with invariant type templates, avoid calling singleton class methods on class literals. Instead, only call them inside variables on a generic class:
class AbstractClass MyTemplate = type_template sig {abstract.returns(MyTemplate)} def self.thing; end def self.example self.thing # OK end end sig do type_parameters(:U) .params(abstract_class: T.class_of(AbstractClass)[AbstractClass, T.type_parameter(:U)]) .returns(T.type_parameter(:U)) end def example(abstract_class) abstract_class.thing # OK end AbstractClass.thing # 💥 bad, untyped
-
Always provide types when instantiating a generic class.
Box.new(0) # 💥 bad, untyped Box[Integer].new(0) # OK
Ideally Sorbet could catch this in the future.
-
Understand that using the
.class
method will introduce untyped the same way as mentioning a class literal.Ideally Sorbet could catch this in the future.
Footnotes
-
Sorbet doesn't actually use the term substitute here, it uses
alignBaseTypeArgs
, because OOP and inheritance are messy. But if your intuition is substitution, that's close enough for now. ↩ -
This is another weird case. Most other languages would treat the constructor as a generic method, where the type is something like [T](x: T) -> Box[T], and would then have syntax for applying a type argument to generic methods, not the class, like
Box.new[Integer](0)
orBox.make[Integer](0)
. ↩ -
I say "final in practice" here because while you'll still be able to subclass this class, those subclasses will be forced to repeat the exact same
fixed
annotation as the parent. This means that if, for example, you're using a type member to track an associatedResult
type of a class that implements an AbstractRPCCommand (like the example in our generics docs), you'll find that child classes must have the sameResult
type as the parent. In rare cases that's fine, but usually when you're making a subclass, you want to use set the type member to something else entirely, which is no longer possible after a bound isfixed
. ↩