Nominal `unique type` brands #33038

Open
wants to merge 1 commit into
base: master
from

Conversation

@weswigham
Member

commented Aug 23, 2019

Fixes #202
Fixes #4895

We've talked about this on and off for the last three years, and it was a major reason we chose to use unique symbol for the individual-symbol-type, since we wanted to reuse the operator for a nominal tag later. What this PR allows:

type NormalizedPath = unique string;
type AbsolutePath = unique string;
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

declare function isNormalizedPath(x: string): x is NormalizedPath;
declare function isAbsolutePath(x: string): x is AbsolutePath;
declare function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): void;


const p = "/a/b/c";
if (isNormalizedPath(p)) {
    if (isAbsolutePath(p)) {
        consumeNormalizedAbsolutePath(p);
    }
}

unique T (where T is any type) is allowed in any position a type is allowed, and nominally tags T with a marker, so that only a T produced by that same declaration is assignable to the resulting type.

This is done by adding a new NominalBrand type flag, which is a type with no structure which is unique to each symbol it is manufactured from. This is then mixed into the argument type to unique type via intersection, which is what produces all useful relationships. (The brand can have an alias if it is directly constructed via type MyBrand = unique unknown)
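For readers without the PR build, the "brand mixed in via intersection" idea can be approximated in today's TypeScript. The sketch below is not the PR's implementation: the brand is emulated with a `unique symbol`-keyed property, and the guard bodies (no `//` or `..` segments, leading `/`) plus the helper `tryConsume` are hypothetical, chosen only so the example runs.

```typescript
// Emulation only: each `declare const` yields a distinct `unique symbol` type,
// standing in for the per-declaration brand that `unique string` would create.
declare const normalizedBrand: unique symbol;
declare const absoluteBrand: unique symbol;

type NormalizedPath = string & { readonly [normalizedBrand]: void };
type AbsolutePath = string & { readonly [absoluteBrand]: void };
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

// Hypothetical invariants, invented for illustration.
function isNormalizedPath(x: string): x is NormalizedPath {
    return x.indexOf("//") === -1 && x.split("/").indexOf("..") === -1;
}
function isAbsolutePath(x: string): x is AbsolutePath {
    return x.charAt(0) === "/";
}

function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): number {
    return x.length; // a branded string is still usable as a plain string
}

// Hypothetical helper: only reaches the consumer once both guards have passed.
function tryConsume(p: string): number | undefined {
    if (isNormalizedPath(p) && isAbsolutePath(p)) {
        return consumeNormalizedAbsolutePath(p); // both brands are present here
    }
    return undefined;
}
```

Each guard adds one brand to the narrowed type, so the consumer only accepts a value that has passed both.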

This does so much with so little - this reduces the jankiness written into types to enable nominalness with unique symbols or enums, while adding zero new assignability rules.

So, why bring this up now? I was thinking about how "brands" work today, with something like type MyBrand<T> = T & {[myuniquesym]: void} where T could then become a literal type like "a". We've wanted, for a while, to be able to more eagerly reduce an intersection of an object literal and a primitive to never (to make subtype reduction and intersection reduction produce less jank and recognize more types as mutually exclusive), but these "brand" patterns keep stopping us. (Heck, we use em internally.) Well, if we ever want to change object types to actually mean object, then we're going to need to provide an alternative for the brand pattern, and ideally that alternative needs to be available for a while. So looking on the horizon to breaks we could take into 4.0 in 9 months, this simplification of branding would be up there, provided we've had the migration path available for a while. So I'm trying to get the conversation started on this before we're too close to that deadline to plan something like that. Plus #202 is up there on our list of all-time most requested issues, and while we've always been open to it, we've never put forward a proposal of our own - well, here one is.

On unique symbol

unique symbol's current behavior takes priority for syntactically exactly unique symbol, however if a nominal subclass of symbols is actually desired, one can write unique (symbol) to get a nominally branded symbol type instead (or symbol & unique unknown - it's exactly the same thing). The way a unique symbol behaves is like a symbol that is locked to a specific declaration, and has special abilities when it comes to narrowing and control flow because of that. Nominally branded types are more flexible in their usage but do not have the strong control-flow linkage with a host statement that unique symbols do (in fact, they don't necessarily assume a value exists at all), so there is very much reason for them to coexist. They're similar enough, that I'm pretty comfortable sharing syntax between the two.

Alternative considerations

While I've used the unique type operator here, like we've oft spoken of, on implementation, it's become plain to me that I don't need to specify an argument to unique. We could just expose unique as a unique type factory on its own, and dispense with the indirection. The "uniqueness" we apply internally isn't actually tied to the input type argument through anything more than an intersection type, so simply shortening unique unknown to unique and reserving the argument form for just unique symbols may be preferable. All the same patterns would be possible, one would just need to write string & unique instead of unique string, thus dispensing with the sugar. It depends on the perceived complexity, I think. However, despite being exactly the same, string & unique is somehow uglier and harder to look at than unique string, which is why I've kept it around for now. It's probably worth discussing, though.

What this draft would still need to be completed:

  • New error messages for any errors involving brands (One or more unique brands is missing from type A with related information pointing at the brand location, rather than the current error involving unique unknown)
  • More tests exercising indexed access and exploring how indexed accesses on branded types are constructed (specifically, what should be done if you (attempt to) index nothing but a union of unique unknown brands)
  • Tests for unique (symbol) declaration emit (to ensure it's not rewritten as unique symbol)
  • Discussion on if keyof <unique brand type> should be never (as it is now), since the brand is top-ish (and contains no structure information itself), or if it should be preserved as an abstract keyof <unique brand> type, so that brand can apply keys-of-branded-types constraints in constraint positions
  • 🚲 🏠
@fatcerberus


commented Aug 23, 2019

While I've used the unique type operator here, like we've oft spoken of, on implementation, it's become plain to me that I don't need to specify an argument to unique.

I don't understand this part. If these are meant to replace branded types, then:

type UString = unique string;
type BString = string & { __brand: true };

declare let ustr: UString;
declare let bstr: BString;
let str1: string = bstr;  // ok because branded string is a subtype of string (this is generally desirable).
let str2: string = ustr;  // could be ok because unique string is also a string.
let num: number = ustr;  // never ok, should be error because unique string is NOT a number!

But if we just had unique without the argument, there would be no way to make this distinction. unique would just end up being a nominal unknown (effectively an opaque Skolem constant) which doesn't sound that useful to me?

@weswigham

Member Author

commented Aug 23, 2019

But if we just had unique without the argument, there would be no way to make this distinction.

It's useful because you can then intersect it with something. Which is all unique string is doing under the hood right now - making that intersection for you.
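The point about a bare brand only becoming useful once intersected can be tried in today's TypeScript. This is a rough emulation, not the PR's syntax: `Brand` stands in for a standalone `unique`, and `asUString` is a hypothetical helper that performs the cast.

```typescript
// Stand-in for a bare `unique`: a type with no useful structure of its own.
declare const brand: unique symbol;
type Brand = { readonly [brand]: void };

// Stand-in for `unique string`: the intersection is what supplies "string-ness".
type UString = string & Brand;

// Hypothetical helper; brands can only be introduced by a cast or a type guard.
function asUString(s: string): UString {
    return s as UString;
}

const ustr = asUString("hello");
const plain: string = ustr;         // ok: UString is a subtype of string
// const num: number = ustr;        // error: UString is not assignable to number
// const back: UString = "hello";   // error: a plain string lacks the brand
```

The distinction raised above (assignable to string, never to number) comes entirely from the `string` half of the intersection.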

@fatcerberus


commented Aug 23, 2019

Yeah, I need to read more closely. I just noticed the string & unique bit before you posted. But I don't like this because unique by itself isn't a type. It would be a special marker that doesn't work by itself as a type (or at least, doesn't make sense as one - what would it mean to take an argument of form arg: unique, e.g.), which I find very weird to then use in a position that nominally (no pun intended!) accepts only type operands. I guess there's precedent with ThisType<T>, but... I don't know, it rubs me the wrong way.

It might be represented as an intersection under the hood, but that strikes me as unnecessarily exposing implementation details.

@weswigham

Member Author

commented Aug 23, 2019

unique is essentially just shorthand for unique unknown under the current model, and it does function fully as a standalone type.

@cevek


commented Aug 23, 2019

Does this proposal allow assigning literals to a variable/param that has a branded type?

type UserId = unique string
function foo(param: UserId) {}

foo("foo") // ok
var x: UserId = "foo"; // ok

let s = "str";
var y: UserId = s; // not ok
foo(s) // not ok
@goodmind


commented Aug 23, 2019

How would you make nominal classes with this?

@weswigham

Member Author

commented Aug 23, 2019

Not easily. You'd need to mix a distinct nominal brand into every class declaration, like via a property you don't use of type unique unknown or similar, and then ensure that your inheritance trees always override such a field (so a parent and child don't appear nominally identical).

I'd avoid it, if possible, tbh. Nominal classes sound like a pain :P

@weswigham

Member Author

commented Aug 23, 2019

Does this proposal allow assigning literals to a variable/param that has a branded type?

Branded types can only be "made" either via cast or typeguard, as I said in the OP, so no. This is because the typesystem doesn't know what invariants a given brand is meant to maintain, and can't implicitly know if some literal satisfies them.
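The "only via cast or type guard" rule can be sketched with today's brand pattern. Everything here is an emulation under assumptions: the `unique symbol` property stands in for `unique string`, and the invariant (non-empty, no whitespace) and the names `isUserId`, `toUserId`, and `greet` are hypothetical.

```typescript
declare const userIdBrand: unique symbol;
type UserId = string & { readonly [userIdBrand]: void };

// Hypothetical invariant: the type system cannot know it, so a guard encodes it.
function isUserId(s: string): s is UserId {
    return s.length > 0 && !/\s/.test(s);
}

// Checked construction: narrows via the guard or throws.
function toUserId(s: string): UserId {
    if (!isUserId(s)) {
        throw new Error("invalid user id: " + JSON.stringify(s));
    }
    return s;
}

function greet(id: UserId): string {
    return "hello " + id;
}

// greet("foo");                          // error: a literal carries no brand
const viaGuard = greet(toUserId("foo"));  // brand obtained through the guard
const viaCast = greet("bar" as UserId);   // cast: the caller vouches for the invariant
```

The cast skips the check entirely, which is why the guard route is the safer default.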

@jack-williams

Collaborator

commented Aug 23, 2019

Would something like the following ever be meaningful? Probably not right..

interface Parent extends unique unknown { }
interface ChildA extends (Parent & unique unknown) { }
interface ChildB extends (Parent & unique unknown) { }
@AnyhowStep

Contributor

commented Aug 23, 2019

This is probably obvious to everyone but unique T shouldn't replace branding.
There are scenarios where branding is the only viable solution (at the moment).

For example,

//Can be replaced with `unique number`
type LessThan256 = number & { __rangeLt : 256 }
//Cannot be replaced with `unique number`
type LessThan<N extends number> = number & { __rangeLt : N }

More complicated example here,
#15480 (comment)


There are a few reasons why I'm generally against the idea of unique T.
(And nominal typing, and instanceof, and using symbols)

Cross-library interop

Library A may have type Radian = unique number.
Library B may have type Radian = unique number.

Both types will be considered different, even though they have the same name and declaration.

If libraries start using this unique T all over the place, you'll start needing unsafe casts (as libA.Radian) more often. This means you can accidentally write myDegree as libA.Radian and cast incorrectly. Whoops!

So, one starts thinking that a no-op casting function would be safer,

function libARadianToLibBRadian (rad : libA.Radian) : libB.Radian {
  return rad as libB.Radian;
}

This is safer because you won't accidentally convert myDegree:libA.Degree to libB.Radian

But if you have N libraries with their own Radian type, you may end up needing up to N^2 functions to convert between each of the Radian types from each library.


Cross-version interop

It's happened to me a bunch where I've had the same package, but at different versions, within a single project.

So,

v1.0.0's type Radian = unique number would be considered different from,
v1.1.0's type Radian = unique number

Now you need a casting function... Even though it's the same package.


With brands, if two libraries use the same brands, even if they're different types, they'll still be assignable to each other. (As long as they don't use unique symbol)

Library A may have type Rad = number & { __isRadian : void }.
Library B may have type Radian = number & { __isRadian : void }.

Even though they're different types, they're assignable to each other. No casting needed.
v1.0.0 and v1.1.0 of type Rad and type Radian will work with each other fine.
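A minimal sketch of that structural-brand interop, runnable in today's TypeScript. The two "libraries" are simulated in one file with prefixed names (`LibARad`, `LibBRadian`), and the functions `libARad` and `libBSin` are hypothetical.

```typescript
// "Library A": its own alias name, same structural brand.
type LibARad = number & { __isRadian: void };
function libARad(n: number): LibARad {
    return n as LibARad;
}

// "Library B": declared independently, with a different alias name.
type LibBRadian = number & { __isRadian: void };
function libBSin(r: LibBRadian): number {
    return Math.sin(r);
}

// No cast or conversion function needed: the brands are structurally identical.
const sinOfZero = libBSin(libARad(0));

// libBSin(0); // error: a plain number lacks the __isRadian brand
```

With a per-declaration nominal brand, the same call would need a cast or a dedicated conversion function for each pair of libraries, which is the N^2 concern raised above.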


As an aside, I vaguely remember something from many, many years ago. I can't find it through Google anymore, though.

There was discussion about adding syntax to C++ to make typedef create a new type (rather than just functioning as a type alias),

typedef double Radian;

And this was rejected outright because of the issues I listed above.

Two libraries with their own typedef double Radian; type would be incompatible, the N^2 problem, etc.

@fatcerberus


commented Aug 23, 2019

Re: nominal classes - classes are already nominal if they contain any private members. Just throwing that out there. 😃
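A small illustration of that behavior in current TypeScript; the class names `Meters` and `Feet` and the function `addMeters` are made up for the example.

```typescript
class Meters {
    private readonly unit = "m";
    constructor(readonly value: number) {}
    toString(): string {
        return String(this.value) + this.unit;
    }
}

class Feet {
    private readonly unit = "ft";
    constructor(readonly value: number) {}
    toString(): string {
        return String(this.value) + this.unit;
    }
}

function addMeters(a: Meters, b: Meters): Meters {
    return new Meters(a.value + b.value);
}

// addMeters(new Meters(1), new Feet(3));
// error: types have separate declarations of a private property 'unit'

const total = addMeters(new Meters(1), new Meters(2));
```

The two classes have the same shape, yet the private member ties each type to its own declaration.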

@be5invis


commented Aug 24, 2019

So we can finally have things like this? @weswigham

// Low-end refinement type :)
type NonEmptyArray<A> = unique ReadonlyArray<A>
function isNonEmpty<A>(a: ReadonlyArray<A>): a is NonEmptyArray<A> {
    return a.length > 0
}

// INTEGERS (sort of)
type integer = unique number
function isInteger(a: number) { return a === (a | 0) }
@AnyhowStep

Contributor

commented Aug 24, 2019

@fatcerberus It's also why I avoid classes entirely and avoid private members if I do have them =P

@AnyhowStep

Contributor

commented Aug 25, 2019

Hmm...

const normalizedPathBrand = Symbol();
const absolutePathBrand = Symbol();

type NormalizedPath = string & typeof normalizedPathBrand;
type AbsolutePath = string & typeof absolutePathBrand;
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

declare function isNormalizedPath(x: string): x is NormalizedPath;
declare function isAbsolutePath(x: string): x is AbsolutePath;
declare function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): void;


const p = "/a/b/c";
consumeNormalizedAbsolutePath(p); //Error
if (isNormalizedPath(p)) {
    consumeNormalizedAbsolutePath(p); //Error
    if (isAbsolutePath(p)) {
        consumeNormalizedAbsolutePath(p); //OK
    }
}

Playground


I guess the downside to this is that it's not actually a symbol.

But Symbol doesn't have very many methods one would accidentally use.

@sindresorhus sindresorhus referenced this pull request Aug 26, 2019
@jack-williams

Collaborator

commented Aug 26, 2019

If cross version compatibility really is an issue then an alternate solution would be to make naming explicit:

Let the type unknown "name" denote the set of all values with label name. This would make the intersection type approach mandatory so a nominal path must now be written:

type NormalizedPathOne = string & unique "NormalizedPath";

where an unlabelled unique denotes a generative type as defined in the OP.

type NormalizedPathTwo = string & unique // some label that we don't care about that is auto-generated.

so while two declarations of NormalizedPathTwo produce distinct types, two declarations of NormalizedPathOne produce identical types.
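The labelled variant can be loosely imitated with an ordinary structural tag. This is only an emulation: `Label<Name>` is a hypothetical helper type, not the proposed `unique "name"` syntax, and `normalizeV1` / `lengthV2` are invented to stand for two versions of one package.

```typescript
// Hypothetical helper: the label is stored as a key so that several labels can be intersected.
type Label<Name extends string> = {
    readonly __labels: { readonly [K in Name]: true };
};

// Two independent declarations that share the label "NormalizedPath".
type NormalizedPathV1 = string & Label<"NormalizedPath">;
type NormalizedPathV2 = string & Label<"NormalizedPath">;
type AbsolutePath = string & Label<"AbsolutePath">;

function normalizeV1(p: string): NormalizedPathV1 {
    return p.replace(/\/+/g, "/") as NormalizedPathV1; // collapse repeated slashes
}

function lengthV2(p: NormalizedPathV2): number {
    return p.length;
}

// ok: same label, therefore the same type, even across declarations
const normalizedLength = lengthV2(normalizeV1("/a//b"));

// lengthV2("/a/b" as AbsolutePath); // error: the "NormalizedPath" label is missing
```

Identity follows the label text rather than the declaration site, which is the cross-version property being asked for.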

FWIW I have no real preference---for me the big win of this feature is being able reduce empty intersections more aggressively.

Discussion on if keyof unique brand type ....

IMO, for all brand oblivious operations a unique type should be equivalent to unknown (which is what is proposed AFAIK).

@resynth1943


commented Aug 27, 2019

I've implemented Opaque types like so:

type Opaque<V> = V & { readonly __opq__: unique symbol };

type AccountNumber = Opaque<number>;
type AccountBalance = Opaque<number>;

function createAccountNumber (): AccountNumber {
    return 2 as AccountNumber;
}

function getMoneyForAccount (accountNumber: AccountNumber): AccountBalance {
    return 4 as AccountBalance;
}

getMoneyForAccount(100); // -> error
@AnyhowStep

Contributor

commented Aug 27, 2019

@resynth1943

Your version breaks given the following,

type Opaque<V> = V & { readonly __opq__: unique symbol };

type NormalizedPath = Opaque<string>;
type AbsolutePath = Opaque<string>;
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

declare function isNormalizedPath(x: string): x is NormalizedPath;
declare function isAbsolutePath(x: string): x is AbsolutePath;
declare function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): void;


const p = "/a/b/c";
consumeNormalizedAbsolutePath(p); //Error
if (isNormalizedPath(p)) {
    consumeNormalizedAbsolutePath(p); //Expected Error, Actual OK
    if (isAbsolutePath(p)) {
        consumeNormalizedAbsolutePath(p); //OK
    }
}

Playground

Contrast with,
#33038 (comment)

@resynth1943


commented Aug 31, 2019

@AnyhowStep I know, but that's how I'm currently creating opaque types. I hope this Pull Request will incorporate this into the language, and make it even better than my implementation.

@mohsen1


commented Sep 3, 2019

Nominal types are pretty useful. The example I often use is the APIs that take latitude/longitude and bugs that are result of mixing up latitude with longitude which are both numbers. By making those unique types we can avoid that class of bugs.

However, unique types can cause so much pain when you have to keep importing those types to simply use an API. So I'm hoping that at least primitive types are assignable to unique primitives where I can still call my functions like this:

// lib.ts
export type Lat = unique number;
export type Lng = unique number;
export function distance(lat: Lat, lng: Lng): number;

// usage.ts
import { distance } from 'lib.ts';
distance(1234, 5678); // no need to assert types

As @AnyhowStep mentioned cross-lib and cross-version conflicting unique types can also be a source of pain. Can we limit uniqueness scope somehow? Would that be a viable solution?
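Since plain literals are not assignable to a brand under this proposal, one possible way to soften the import burden is for the library to export small checked constructors next to its API. The sketch below uses today's brand pattern as a stand-in for `unique number`; the range checks and the names `lat`, `lng`, and `describePoint` are assumptions for illustration.

```typescript
declare const latBrand: unique symbol;
declare const lngBrand: unique symbol;
type Lat = number & { readonly [latBrand]: void };
type Lng = number & { readonly [lngBrand]: void };

// Hypothetical checked constructors; NaN fails both range tests.
function lat(n: number): Lat {
    if (!(n >= -90 && n <= 90)) throw new RangeError("latitude out of range: " + n);
    return n as Lat;
}
function lng(n: number): Lng {
    if (!(n >= -180 && n <= 180)) throw new RangeError("longitude out of range: " + n);
    return n as Lng;
}

function describePoint(la: Lat, lo: Lng): string {
    return la + "," + lo;
}

const point = describePoint(lat(51.5), lng(-0.12));

// describePoint(lng(-0.12), lat(51.5)); // error: arguments swapped
// describePoint(51.5, -0.12);           // error: plain numbers carry no brand
```

Callers import functions rather than types, and the swapped-argument bug class is still caught.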

@weswigham weswigham added the Experiment label Sep 6, 2019

@weswigham weswigham marked this pull request as ready for review Sep 6, 2019

@weswigham

Member Author

commented Sep 6, 2019

#33290 is now open as well, so we can have a real conversation on what the non-nominal explicit tag would look like, and if we'd prefer it.

@weswigham

Member Author

commented Sep 7, 2019

BTW, this is now a dueling features type situation (though I've authored both) - we won't accept both a #33290 style brand and this PR's style brand, only one of the two (and we're leaning towards #33290 on initial discussion). We'll get to debating it within the team hopefully during our next design meeting (next Friday), but y'all should express ideas, preferences, and reasoning therefore within both PRs.

@weswigham

Member Author

commented Sep 9, 2019

Only repo contributors have permission to request that @typescript-bot pack this

@typescript-bot

Collaborator

commented Sep 9, 2019

Heya @weswigham, I've started to run the tarball bundle task on this PR at 13968b0. You can monitor the build here. It should now contribute to this PR's status checks.

@typescript-bot

Collaborator

commented Sep 9, 2019

Hey @weswigham, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/43239/artifacts?artifactName=tgz&fileId=CC1A42393FB48F1DFF21222516C1DB28571C1E5134B16E3B234461EABAECAE1D02&fileName=/typescript-3.7.0-insiders.20190909.tgz"
    }
}

and then running npm install.

@xiaoxiangmoe


commented Sep 9, 2019

type integer = unique number
function isInteger(a: number): a is integer { return a === (a | 0) }

function Integer(a: number) {
    if (!isInteger(a)) throw new Error("not an integer");
    return a;
}
@raveclassic


commented Sep 9, 2019

@xiaoxiangmoe I think you should add a type guard:

function isInteger(a: number): a is integer { return a === (a | 0) }
@ProdigySim


commented Sep 9, 2019

Borrowing from an example from the other PR...

Let's say I want to have a unique UserId type that is also tagged with details about its composition. What would be the correct way to combine these types into a final unique version?

type NonEmptyString = unique string;
type Uuid = unique string;

// Oops, these two are probably assignable to each other!
type GroupId = NonEmptyString & Uuid;
type UserId = NonEmptyString & Uuid;

I downloaded the tgz build and tried these out, which both appear to "work" on a surface level...

type GroupId = unique (NonEmptyString & Uuid);
type UserId = unique (NonEmptyString & Uuid);


type GroupId = unique string & NonEmptyString & Uuid;
type UserId = unique string & NonEmptyString & Uuid;
@weswigham

Member Author

commented Sep 9, 2019

@ProdigySim yep, the way you'd accomplish that is with extra unique tags for each unique thing you want to track.
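The "one extra tag per thing you want to track" approach, emulated with today's brand pattern (a `unique symbol` property standing in for each `unique` tag). The UUID regex and the helpers `isUuid`, `toUserId`, `uuidOf`, and `loadUser` are hypothetical.

```typescript
declare const nonEmptyBrand: unique symbol;
declare const uuidBrand: unique symbol;
declare const groupIdBrand: unique symbol;
declare const userIdBrand: unique symbol;

type NonEmptyString = string & { readonly [nonEmptyBrand]: void };
type Uuid = string & { readonly [uuidBrand]: void };

// Shared composition tags plus one extra tag that tells the two id kinds apart.
type GroupId = NonEmptyString & Uuid & { readonly [groupIdBrand]: void };
type UserId = NonEmptyString & Uuid & { readonly [userIdBrand]: void };

function isUuid(s: string): s is Uuid {
    return /^[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}$/i.test(s);
}

function toUserId(s: string): UserId {
    if (!isUuid(s)) throw new Error("not a UUID");
    return s as UserId; // a matching UUID is necessarily non-empty
}

function uuidOf(id: Uuid): string {
    return id.toLowerCase();
}

function loadUser(id: UserId): string {
    return "user:" + uuidOf(id); // a UserId is still accepted wherever a Uuid is
}

const loaded = loadUser(toUserId("123E4567-E89B-12D3-A456-426614174000"));

// declare const gid: GroupId;
// loadUser(gid); // error: GroupId lacks the userIdBrand tag
```

Both id types keep their composition details, and the extra tag keeps them from being interchangeable.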

@joshburgess


commented Sep 9, 2019

How would one go about using phantom type params to get unique nominal types with this PR? Is there a way?

For example:

type User = ...
type Organization = ...

type Key<A> = unique string

type UserId = Key<User>
type OrganizationId = Key<Organization>

The goal of the above would be to make sure UserId and OrganizationId aren't interchangeable, but the generic type param does not influence the structure of the underlying type or runtime value at all. Would the above just work? Or would you need to do something like:

type Key<A> = unique string & A

or something else?

I think I would probably want/expect syntax like this:

unique type Key<A> = string

or maybe:

unique type Key<A> = unique string

In Haskell & PureScript, the syntax to do this looks like this:

newtype Key a = Key String

Thoughts?

@weswigham

Member Author

commented Sep 10, 2019

How would one go about using phantom type params to get unique nominal types with this PR? Is there a way?

As is, nope. A unique tag is unique per declaration site, and every instantiation gets the same unique brand, irrespective of type parameters. You'd need to do something like type Key<A> = unique string & A, except you probably want to stash A somewhere its value can't be easily observed (like in a property), which is pretty much where we are today. The phantom type use-case of the structural brands we have today isn't handled well by this proposal.
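"Stashing A in a property" looks roughly like this with today's structural brands. The `User` and `Organization` shapes and the helpers `userKey`, `orgKey`, and `findUser` are placeholders invented for the sketch.

```typescript
declare const keyBrand: unique symbol;

// The phantom parameter lives in the type of a property that never exists at runtime.
type Key<A> = string & { readonly [keyBrand]: A };

// Placeholder shapes; they must differ structurally for the keys to differ.
interface User { userName: string; }
interface Organization { memberCount: number; }

type UserId = Key<User>;
type OrganizationId = Key<Organization>;

function userKey(s: string): UserId {
    return s as UserId;
}
function orgKey(s: string): OrganizationId {
    return s as OrganizationId;
}

function findUser(id: UserId): string {
    return "user/" + id;
}

const found = findUser(userKey("42"));

// findUser(orgKey("42")); // error: Organization is not assignable to User
```

One caveat of this emulation: if `User` and `Organization` were structurally compatible, the two key types would become assignable as well, which is the limitation a truly nominal phantom parameter would avoid.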

@joshburgess


commented Sep 10, 2019

@weswigham I just saw your other PR using tag, which does support phantom types. That looks interesting.

Just out of curiosity, would it be possible to add a keyword solely for this purpose to be used in tandem with this implementation?

Maybe, something like:

type Key<A> = unique string & phantom A

or

type Key<A> = unique string & tagged A

or something like that, where phantom, or tagged, essentially behaves similarly to unique, uniquely modifying the nominal type, but without any false/ghost properties added to the structural type.

Alternatively, maybe using unique as a prefix to a type alias would implicitly do something just like the above? like:

// the leading `unique` automatically causes all unique combinations of generic type
// param inputs to uniquely create a nominal type under the hood
unique type Key<A> = string

@weswigham weswigham requested review from ahejlsberg and RyanCavanaugh Sep 10, 2019

@resynth1943


commented Sep 10, 2019

I'm not familiar with all this rhetoric, but isn't having phantom types the goal, as they're actually unique types? They're without any __mysterious_props__ (and weird compiler errors).

@hardfist hardfist referenced this pull request Sep 10, 2019
@mheiber

Contributor

commented Sep 11, 2019

Can we make unique work for class declarations, so we can write unique class A {}?

I checked out the branch for this PR, and the following produces an invalid character syntax error:

export unique class A {}

I think this matters because it could help us make good declaration file emit for private-identified fields: #30829

With currentish TS, the best I can think of for declaration file emit for:

// example.ts
export class A {
     #foo: number;
}

is:

// example.d.ts
export declare class A {
    #foo;
}

But this violates the spirit of hash private fields: users shouldn't have to scroll past implementation details in order to see the fields that are relevant to them.

If we can emit unique, then that would enable us to hide the implementation better whilst preserving nominality:

// example.d.ts
export declare unique class A {}
mheiber added a commit to bloomberg/TypeScript that referenced this pull request Sep 11, 2019
no type emit for private-identified field in .d.ts
Do not emit types for private-identified fields
when generating declaration files.

// example.ts
export class A {
    #foo: number;
}

// example.d.ts

export declare class A {
    #foo;
}

**This is not ideal!**

The following would be better:

// example.d.ts

export declare unique class A {
    #foo;
}

See discussion:

microsoft#33038 (comment)
mheiber added a commit to bloomberg/TypeScript that referenced this pull request Sep 11, 2019
dts: don't emit type for private-identified field