Enable Constraint to have a custom name and/or a custom static type #78

Merged: 1 commit merged on May 31, 2019


Contributor

@justingrant justingrant commented Mar 1, 2019

EDIT: this PR changed a lot, and got more general, so this initial post is not where the solution is heading. See discussion below.

This PR adds an optional parameter customInstanceOf?: (o: V) => boolean to the InstanceOf constructor. This parameter lets clients optionally use a custom function instead of the default instanceof keyword to decide if an object is an instance of the specified class.

Justification

This is helpful in cases where multiple library versions might be running side-by-side in the same process. This is unfortunately common in a node.js environment, e.g. you depend on library A and library B which each depend on different versions of library C.

Usually libraries which want to run side-by-side will have a static isXXX method as a version-safe alternative to instanceof. Examples:

These isXXX functions are usually implemented via checking a special property (e.g. _bsontype for some classes in the MongoDB client library) that's used for the purpose of identifying instances of the same class across versions. So even if a library doesn't offer an isXXX function, the client can easily build one and pass it to the customInstanceOf parameter in this PR.
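To illustrate the point, here is a minimal sketch of a property-based check. Everything in it is hypothetical: the two classes stand in for copies of the same class loaded from different library versions, and `_myLibType` and `isWidget` are made-up names, not part of any real library.

```typescript
// Two hypothetical copies of "the same" class, as if loaded from two
// different installed versions of one library.
class WidgetV1 {
  readonly _myLibType = 'Widget'; // hypothetical marker property
  constructor(public id: number) {}
}
class WidgetV2 {
  readonly _myLibType = 'Widget';
  constructor(public id: number) {}
}

// Version-safe check: inspects the marker property, not the prototype chain.
function isWidget(o: unknown): o is WidgetV1 | WidgetV2 {
  return (
    typeof o === 'object' &&
    o !== null &&
    (o as { _myLibType?: unknown })._myLibType === 'Widget'
  );
}

const fromOtherVersion: unknown = new WidgetV2(1);
// instanceof compares prototypes, so an instance of the other copy fails.
const viaInstanceof = fromOtherVersion instanceof WidgetV1;
// The marker check accepts instances of either copy.
const viaMarker = isWidget(fromOtherVersion);
```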

Tests

I added tests for this PR that:

  1. Create two different versions of the same class and verify that a custom isXXX function will cause check() to return true when comparing instances of different versions. Existing tests with the default instanceof behavior still pass.
  2. Add InstanceOf tests for Date and RegExp objects. This is unrelated to this PR but those two built-in classes are (in my experience) frequent sources of class-related validation bugs so it's good to have them tested!

BTW, this library seems great. I'm in the middle of converting my app over to use it, and so far the only serious gap I've found is here in this PR. If you want me to change anything about this PR, I'm happy to do it-- just let me know. Thanks!

@coveralls
Collaborator

coveralls commented Mar 1, 2019


Coverage increased (+0.03%) to 99.149% when pulling 2e4ec41 on justingrant:version-safe-instanceof-2 into 18d8a1d on pelotom:master.

@pelotom
Collaborator

pelotom commented Mar 2, 2019

Hi, thanks for the PR. I think rather than muddy the semantics of InstanceOf it would be better to make a new runtype which captures arbitrary type guards. Something like this:

import { Runtype, create } from '../runtype';
import { ValidationError } from '../errors';

export interface Guard<V> extends Runtype<V> {
  tag: 'guard';
  name: string;
}

export function Guard<V>(name: string, guard: (x: any) => x is V) {
  return create<Guard<V>>(
    x => {
      if (!guard(x)) throw new ValidationError(`Failed to pass type guard for ${name}`);
      return x;
    },
    { tag: 'guard', name },
  );
}

then you could use it like e.g.

const RtBuffer = Guard('Buffer', Buffer.isBuffer);

Note that this works because Buffer.isBuffer is already defined as a type guard:

isBuffer(obj: any): obj is Buffer;

For your own types you could make up arbitrary type guards, like

const RtSomeClass = Guard(
  'SomeClass',
  (o): o is SomeClass => o._someClassTag === SOMECLASS_TAG
);

This approach is more flexible than tacking something on to InstanceOf because it can be used for any kind of type, not just classes. What do you think?

@justingrant
Contributor Author

Hi @pelotom - I really like your idea of making a more generic solution as opposed to overloading only InstanceOf.

It'd also address the case where types and constraints are intertwined, e.g. number | string where if a number it must be a positive integer and if a string must be all lower case. Guard<T> would probably make it clearer to implement that case because it'd be possible to combine type checking and value checking in the same guard function instead of having to have a separate withConstraint() which necessarily must include the same type-checking logic that check() normally includes.
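As a rough illustration of the intertwined case, a single guard function can do the type check and the value check together. The function name is hypothetical, and this sketch is independent of the library; it only shows the shape of predicate that a Guard-style runtype could wrap.

```typescript
// One guard covering both the type check and the per-type value check.
// `isPositiveIntOrLowercase` is a hypothetical name used for illustration.
function isPositiveIntOrLowercase(x: unknown): x is number | string {
  if (typeof x === 'number') {
    // Numbers must be finite, whole, and greater than zero.
    return isFinite(x) && Math.floor(x) === x && x > 0;
  }
  if (typeof x === 'string') {
    // Strings must contain no upper-case characters.
    return x === x.toLowerCase();
  }
  return false; // any other type is rejected outright
}
```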

Your idea might also help with cases of non-public/undocumented properties or behavior, where the author deliberately wants to allow the runtime type to diverge from the TS type, e.g. to support legacy code using undocumented or deprecated features where the author doesn't want to expose them in a .d.ts but also doesn't want to fail validation if those undocumented things are used at runtime. I ran into this case recently with a MongoDB library's test code.

That said, I've got a few concerns/questions. My main concern is that Guard<T> and withConstraint() may be similar enough to be confusing but dissimilar enough that you can't use one interchangeably with the other. For example, if my custom type-check doesn't require a custom error message, then I can use Guard<T> alone. But as soon as I need a custom error message, then I need to either split logic into a guard function and a constraint function, or turn the guard function into a wrapper around the constraint function.

Seems like it'd be clearer to allow a single function to be used for both type-checking and custom-message-allowed constraint checking. Calling only one function also might yield a (minor) performance advantage for constraints called repeatedly in loops or when type-checking huge arrays.

Instead of (or in addition to?) adding a new top-level runtype, what do you think about a withGuardConstraint() that would work identically to withConstraint() except the implementation wouldn't call check() before calling the constraint function?

@pelotom
Collaborator

pelotom commented Mar 3, 2019

That said, I've got a few concerns/questions. My main concern is that Guard<T> and withConstraint() may be similar enough to be confusing but dissimilar enough that you can't use one interchangeably with the other. For example, if my custom type-check doesn't require a custom error message, then I can use Guard<T> alone. But as soon as I need a custom error message, then I need to either split logic into a guard function and a constraint function, or turn the guard function into a wrapper around the constraint function.

Seems like it'd be clearer to allow a single function to be used for both type-checking and custom-message-allowed constraint checking. Calling only one function also might yield a (minor) performance advantage for constraints called repeatedly in loops or when type-checking huge arrays.

Instead of (or in addition to?) adding a new top-level runtype, what do you think about a withGuardConstraint() that would work identically to withConstraint() except the implementation wouldn't call check() before calling the constraint function?

withConstraint is just a convenience method for obtaining a Constraint runtype, so a withGuardConstraint (or, I would call it, withGuard) would just be a convenience method for obtaining a Guard runtype. While superficially similar, they seem like fairly different use cases; Constraint lets you add a runtime check to an existing runtype without modifying the static type; Number.withConstraint(x => x > 0) is still just a Runtype<number>. But Guard is about asserting some new static type that isn't necessarily represented as a runtype at all. That said, I'm open to unifying the two if it's possible. I agree that in any case custom error messages would be nice to have for guards as well.
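The distinction drawn here can be sketched without the library. In the sketch below, `MiniRuntype`, `NumberRt`, `withConstraintSketch`, and `withGuardSketch` are all hypothetical stand-ins (not the runtypes API), and a plain Error replaces ValidationError. The point is only that a constraint keeps the static type while a guard narrows it.

```typescript
// Minimal stand-in for a runtype: a `check` that returns the value typed as T.
interface MiniRuntype<T> {
  check(x: unknown): T;
}

const NumberRt: MiniRuntype<number> = {
  check(x) {
    if (typeof x !== 'number') throw new Error('Expected number');
    return x;
  },
};

// Constraint-style: adds a runtime check, the static type stays T.
function withConstraintSketch<T>(rt: MiniRuntype<T>, ok: (x: T) => boolean): MiniRuntype<T> {
  return {
    check(x) {
      const v = rt.check(x);
      if (!ok(v)) throw new Error('Failed constraint check');
      return v;
    },
  };
}

// Guard-style: the type predicate also narrows the static type from T to U.
function withGuardSketch<T, U extends T>(rt: MiniRuntype<T>, guard: (x: T) => x is U): MiniRuntype<U> {
  return {
    check(x) {
      const v = rt.check(x);
      if (!guard(v)) throw new Error('Failed type guard');
      return v;
    },
  };
}

type Bit = 0 | 1;
const Positive = withConstraintSketch(NumberRt, n => n > 0);
const BitRt = withGuardSketch(NumberRt, (n): n is Bit => n === 0 || n === 1);

const positiveValue: number = Positive.check(5); // statically still just `number`
const bitValue: Bit = BitRt.check(1); // statically narrowed to `0 | 1`
```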

@justingrant
Contributor Author

I agree that in any case custom error messages would also be nice to have for guards as well.

Any idea how that could work? TS type guard functions can't return strings and shouldn't throw if the type doesn't match. I guess check() could supply a second parameter, e.g.
guard: (x: any, errorReporter: (message: string) => void) => x is V

What do you think? Is there a better way to do it?

@justingrant
Contributor Author

Hi @pelotom - Here's what I've got so far for adding custom error messaging to your Guard<T> idea. It works for traditional single-parameter type guards as well as "custom message" type guards that accept a second errorReporter callback parameter.

If you like this direction I can PR it with tests, docs, reflection, etc. Let me know what you think.

export function Guard<V>(
  name: string,
  guard: (x: any, errorReporter?: (message: string) => void) => x is V,
) {
  return create<Guard<V>>(
    x => {
      let errMsg: string | undefined;
      const errorReporter = (message: string) => {
        if (String.guard(errMsg))
          throw new Error('Cannot report more than one error from a type guard');
        errMsg = message;
      };
      if (guard(x, errorReporter)) {
        if (String.guard(errMsg))
          throw new Error('Type guard must return false after reporting an error');
        return x;
      } else {
        if (String.guard(errMsg)) throw new ValidationError(errMsg);
        else throw new ValidationError(`Failed to pass type guard for ${name}`);
      }
    },
    { tag: 'guard', name },
  );
}

Here's some draft documentation with example usage:

Custom type guards

If an existing Runtype doesn't fit the type-checking behavior you want, you can build your own using a type guard function and the Guard<T> runtype.

For example, many JavaScript classes use a static method for type checking because instanceof (and therefore the InstanceOf runtype) won't match different versions of the same class running side-by-side in a node.js environment. If the class already provides a type guard, like Buffer.isBuffer, then it's easy to wrap it with a Runtype:

// const badDontUse = InstanceOf(Buffer); // breaks if different Buffer versions are used
const RtBuffer = Guard('Buffer', Buffer.isBuffer);
RtBuffer.check('not a buffer'); // Throws error: Failed to pass type guard for Buffer

It's also easy to use a custom type guard function:

const myClassGuard = (o: any): o is MyClass => o._myClassId === 'M Y C L A S S';
const RtMyClass = Guard('MyClass', myClassGuard);
RtMyClass.check('not a MyClass'); // Throws error: Failed to pass type guard for MyClass

Validation messages can optionally be customized using type guards:

const myClassGuard = (o: any, errorReporter?: (message: string) => void): o is MyClass => {
  const id = o._myClassId;
  if (!id) {
    errorReporter && errorReporter(`MyClass id not found`);
    return false;
  } else if (id !== 'M Y C L A S S') {
    errorReporter && errorReporter(`Invalid MyClass id: ${id}`);
    return false;
  }
  return true;
}
const RtMyClass = Guard('MyClass', myClassGuard);
RtMyClass.check( {_myClassId: 'fake'} ); // Throws error: Invalid MyClass id: fake

You can think of the Guard<T> runtype as the opposite of the Runtype<T>.guard method. Guard<T> transforms a type guard function for type T into a Runtype for type T. Runtype<T>.guard does the reverse: it takes a Runtype for type T and produces a type guard function for T.

@pelotom
Collaborator

pelotom commented Mar 9, 2019

After thinking about it some more, what do you think of this approach:

export interface Guard<A extends Runtype, B extends Static<A>> extends Runtype<B> {
  tag: 'guard';
  underlying: A;
  guard: (x: Static<A>) => x is B;
}

export function Guard<A extends Runtype, B extends Static<A>>(
  underlying: A,
  guard: (x: Static<A>) => x is B
) {
  return create<Guard<A, B>>(
    x => {
      const typed = underlying.check(x);
      if (!guard(typed)) throw new ValidationError(`Failed to pass type guard on ${show(underlying as any)}`);
      return x as any;
    },
    { tag: 'guard', underlying, guard },
  );
}

Some observations:

  • Like Constraint, Guard is a refinement relative to starting type A. This is useful because sometimes you want to e.g. refine from a union to some subunion of its variants, and it's nice to have some type safety in your guard function. To this end, maybe we should call it Refinement instead of Guard, and the corresponding method on Runtype would be refine? 🤔
  • Custom error messages can be achieved by simply throwing a ValidationError with a custom message inside the guard function (instead of merely returning false). Your suggested errorReporter API is reasonable but it allows some odd semantic edge cases that I'd rather make impossible: mainly that after calling errorReporter the user still needs to return something, and they could choose to return true, which is just strange.
  • Guard is then strictly more powerful than Constraint because it has the ability to refine not only the runtime type but the static type. But the API of Constraint is cleaner if you don't need to refine the static type. So Constraint would stay, but it could be reimplemented using Guard. (I made a pass at trying to unify them with overloads so you could continue using the same API for Constraint, but if you made the constraint function's return type a type predicate it would become a guard... but no dice, it seems you can't use overloads to make a function possibly a type guard, possibly not.)
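The first two observations can be sketched together: a guard that refines a union to one of its variants and reports a custom message by throwing. The `Shape` type and the function names are hypothetical, and a plain Error stands in for the library's ValidationError so that the sketch is self-contained.

```typescript
// Stand-in for throwing the library's ValidationError.
function fail(message: string): never {
  throw new Error(message);
}

type Shape = { kind: 'circle'; r: number } | { kind: 'square'; side: number };
type Circle = Extract<Shape, { kind: 'circle' }>;

// The guard receives an already-typed Shape (the "underlying" type), so the
// compiler checks property access inside it. It throws its own message
// instead of returning false.
function isCircleOrThrow(s: Shape): s is Circle {
  if (s.kind !== 'circle') fail(`Expected a circle but got a ${s.kind}`);
  return true;
}

// Refinement-style check: narrow Shape to Circle, with a default message
// for guards that simply return false.
function checkCircle(s: Shape): Circle {
  if (!isCircleOrThrow(s)) throw new Error('Failed to pass type guard on Shape');
  return s;
}
```

Note that `isCircleOrThrow` no longer behaves like an ordinary type guard, since it throws where a regular guard would return false.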

@justingrant
Contributor Author

Like Constraint, Guard is a refinement relative to starting type A.

I really like this idea, especially because it should make it easier to delegate things like reflect/show. Speaking of show(), what were you thinking should be shown for a refined Runtype? Should the refinement implementation have the ability to customize what's emitted by show()? Should it have the ability to customize reflection?

One use case I was thinking about is where the implementer is OK with type-checking behavior of a runtype but wants more control over validation error messages, e.g. for localization or to provide clearer messages. How do you see this case working? Should the implementer select a lowest-common-denominator underlying type like unknown, and then call check() of the underlying type inside the predicate function?

Custom error messages can be achieved by simply throwing a ValidationError with a custom message inside the guard function (instead of merely returning false). Your suggested errorReporter API is reasonable but it allows some odd semantic edge cases...

Yeah, I didn't like that callback API either. ;-) I just couldn't figure out any other way to make the guard functions used for Guard<T> also safely usable as regular TS type guards. Having isXXX functions around that behave differently from regular type guards (e.g. throw exceptions instead of returning false) might be a problem. It seems tempting for another developer to use those functions as type guards without knowing that they're not real type guards.

I think a better approach might be to, instead of requiring a type predicate, just accept any function that returns boolean. Because the return type won't be a type guard, then devs won't expect them to act like type guards, so it's OK to throw exceptions for error messages. It'd be trivial for the Runtype code to turn any boolean-returning function into a real type guard via guard().

Conveniently, a function of type (o: any) => o is XXX can be used in place of a (o: any) => boolean. So if we accept a boolean-returning function, then real type predicate functions can also be used without having to wrap them in a cast to boolean return type.
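A small sketch of that assignability rule, using hypothetical names: TypeScript treats a type-predicate return as a boolean, so a predicate can be passed where a boolean-returning function is expected.

```typescript
// A real type predicate...
const isStringPredicate = (o: unknown): o is string => typeof o === 'string';

// ...is accepted wherever a plain boolean-returning function is expected,
// with no cast needed.
function acceptsValidator(validator: (o: unknown) => boolean, value: unknown): boolean {
  return validator(value);
}

const okForString = acceptsValidator(isStringPredicate, 'hello');
const okForNumber = acceptsValidator(isStringPredicate, 42);
```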

Another option I thought of was to borrow the => boolean | string signature of Constraint functions. This might make it riskier to pass in a type predicate function if the predicate doesn't actually return boolean but instead returns truthy/falsy, e.g. o => o.someString where if the string property is present it'd be a truthy type guard function return value, but an error if returned from a constraint function. So might be safer to only accept boolean.
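The risk described here can be made concrete with a sketch. `interpretConstraintResult` is a hypothetical helper that mimics the boolean | string convention (a string means failure with that message); it is not the library's code.

```typescript
// Constraint-style interpretation: true passes, false fails with a default
// message, and any string is treated as a custom failure message.
function interpretConstraintResult(result: boolean | string): string | undefined {
  if (typeof result === 'string') return result; // failure with custom message
  return result ? undefined : 'Failed constraint check';
}

type HasLabel = { label?: string };

// A sloppy "guard" that returns a truthy/falsy value, not a real boolean.
const sloppyCheck = (o: HasLabel) => o.label;

const labeled: HasLabel = { label: 'present' };

// Read as a truthy test, the object passes.
const asTruthy = Boolean(sloppyCheck(labeled));
// Read with constraint semantics, the same return value is an error message.
const asConstraint = interpretConstraintResult(sloppyCheck(labeled) || false);
```

Accepting only boolean avoids this ambiguity, at the cost of needing another channel (such as throwing) for custom messages.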

Anyway, how about something like this?

export function Guard<A extends Runtype, B extends Static<A>>(
  underlying: A,
  validator: (x: Static<A>) => boolean
) {
  const guard = ((x: Static<A>): x is B => validator(x))
  return create<Guard<A, B>>(
    x => {
      const typed = underlying.check(x);
      if (!guard(typed)) throw new ValidationError(`Failed to pass type guard on ${show(underlying as any)}`);
      return x as any;
    },
    { tag: 'guard', underlying, guard },
  );
}

To this end, maybe we should call it Refinement instead of Guard, and the corresponding method on Runtype would be refine? 🤔

Yeah, naming is tough-- if we're not actually giving a real guard function, then calling it Guard may not be helpful. Refinement is OK. Other options could be Custom or Transform or ChangeType.
I don't feel strongly about this.

@pelotom
Collaborator

pelotom commented Mar 12, 2019

Speaking of show(), what were you thinking should be shown for a refined Runtype? Should the refinement implementation have the ability to customize what's emitted by show()? Should it have the ability to customize reflection?

I would probably just do something similar to what's done for Constraint currently, e.g. Refine<${show(underlying)}>, unless you have a better idea.

One use case I was thinking about is where the implementer is OK with type-checking behavior of a runtype but wants more control over validation error messages, e.g. for localization or to provide clearer messages. How do you see this case working? Should the implementer select a lowest-common-denominator underlying type like unknown, and then call check() of the underlying type inside the predicate function?

Custom error messages can be achieved by simply throwing a ValidationError with a custom message inside the guard function (instead of merely returning false). Your suggested errorReporter API is reasonable but it allows some odd semantic edge cases...

I think a better approach might be to, instead of requiring a type predicate, just accept any function that returns boolean. Because the return type won't be a type guard, then devs won't expect them to act like type guards, so it's OK to throw exceptions for error messages. It'd be trivial for the Runtype code to turn any boolean-returning function into a real type guard via guard().

Conveniently, a function of type (o: any) => o is XXX can be used in place of a (o: any) => boolean. So if we accept a boolean-returning function, then real type predicate functions can also be used without having to wrap them in a cast to boolean return type.

Another option I thought of was to borrow the => boolean | string signature of Constraint functions. This might make it riskier to pass in a type predicate function if the predicate doesn't actually return boolean but instead returns truthy/falsy, e.g. o => o.someString where if the string property is present it'd be a truthy type guard function return value, but an error if returned from a constraint function. So might be safer to only accept boolean.

Anyway, how about something like this?

export function Guard<A extends Runtype, B extends Static<A>>(
  underlying: A,
  validator: (x: Static<A>) => boolean
) {
  const guard = ((x: Static<A>): x is B => validator(x))
  return create<Guard<A, B>>(
    x => {
      const typed = underlying.check(x);
      if (!guard(typed)) throw new ValidationError(`Failed to pass type guard on ${show(underlying as any)}`);
      return x as any;
    },
    { tag: 'guard', underlying, guard },
  );
}

So if I understand what you're proposing correctly, this would make Guard act in every way just like Constraint, except with an additional type parameter to define what it should cast to. If so, and I think that's reasonable, why not just add the extra optional type parameter to Constraint itself?

Another possibility would be to decouple customizing error messages from these particular runtypes and instead allow customizing any runtype's error message:

Number.Or(String).withCustomErrorMessage((x, defaultErrorMessage) => `${x} is neither a number nor a string`);

Then a Guard runtype wouldn't have to worry about that concern and could focus on what it's mainly about, using a type guard to refine the type:

const BufferRuntype = Unknown
  .withGuard(Buffer.isBuffer)
  .withCustomErrorMessage(x => `${x} ain't a buffer yo!`);

Something like withCustomErrorMessage seems like a more direct way to tackle this use case you were talking about:

One use case I was thinking about is where the implementer is OK with type-checking behavior of a runtype but wants more control over validation error messages, e.g. for localization or to provide clearer messages. How do you see this case working? Should the implementer select a lowest-common-denominator underlying type like unknown, and then call check() of the underlying type inside the predicate function?
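One way the withCustomErrorMessage idea could work is as a wrapper that catches the underlying failure and rethrows with a new message. This is only a sketch under assumptions: `MiniRt`, the free-function form of `withCustomErrorMessage`, and `NumberOrString` are hypothetical, and a plain Error stands in for ValidationError.

```typescript
// Minimal stand-in for a runtype.
interface MiniRt<T> {
  check(x: unknown): T;
}

// Hypothetical helper: wraps any runtype-like object and replaces its
// failure message, passing the default message along for reuse.
function withCustomErrorMessage<T>(
  rt: MiniRt<T>,
  makeMessage: (x: unknown, defaultMessage: string) => string,
): MiniRt<T> {
  return {
    check(x) {
      try {
        return rt.check(x);
      } catch (e) {
        const defaultMessage = e instanceof Error ? e.message : 'Validation failed';
        throw new Error(makeMessage(x, defaultMessage));
      }
    },
  };
}

const NumberOrString: MiniRt<number | string> = {
  check(x) {
    if (typeof x === 'number' || typeof x === 'string') return x;
    throw new Error('Expected number | string');
  },
};

const Friendly = withCustomErrorMessage(
  NumberOrString,
  x => `${x} is neither a number nor a string`,
);
```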

@justingrant
Contributor Author

Hi Tom - sorry for slow reply, I got buried last week and just digging out now.

So if I understand what you're proposing correctly, this would make Guard act in every way just like Constraint, except with an additional type parameter to define what it should cast to. If so, and I think that's reasonable, why not just add the extra optional type parameter to Constraint itself?

I think that would work too. I'll try to prototype it and work with it for a few days and let you know how it goes.

One question: if Constraint could cast to a new type, then what would be the difference between Guard and Constraint? Would Guard offer a subset of Constraint's functionality, but with an easier dev experience if you already have an existing guard function?

I would probably just do something similar to what's done for Constraint currently, e.g. Refine<${show(underlying)}>, unless you have a better idea.

If we go with that extra type parameter to Constraint, then should we show the underlying type, the desired type, or both? I'm inclined to think that the desired type is the one to show to users, because the whole point of casting is to show a new face to the world.

BTW, I have an embarrassing admission: I have no idea how the show/reflect parts of the library work. Is there an easy way to explain the parts about the recursive use of Reflect?

Something like withCustomErrorMessage seems like a more direct way

I definitely like the idea of being able to customize messages of existing runtypes, and your withCustomErrorMessage idea seems like a fine approach. It won't work for more advanced cases where there are multiple possible messages (e.g. 'illegal character in email address' vs. 'email address is too long'), but Constraint should cover those cases, right?
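For the multiple-message case, a constraint-style function can return a different string per failure. The sketch below is library-independent; `emailConstraint`, `checkEmail`, the 254-character limit, and the set of rejected characters are assumptions made for illustration.

```typescript
// Constraint-style function: true means valid, a string is the failure message.
function emailConstraint(s: string): true | string {
  if (s.length > 254) return 'email address is too long'; // assumed limit
  if (/[\s<>]/.test(s)) return 'illegal character in email address';
  return true;
}

// Mimics a constrained runtype: underlying type check first, then the
// constraint, surfacing whichever message the constraint produced.
function checkEmail(x: unknown): string {
  if (typeof x !== 'string') throw new Error('Expected string');
  const result = emailConstraint(x);
  if (typeof result === 'string') throw new Error(result);
  return x;
}
```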

@pelotom
Collaborator

pelotom commented Mar 18, 2019

Hi Tom - sorry for slow reply, I got buried last week and just digging out now.

No worries, it often takes me a while to respond to PRs for the same reason.

One question: if Constraint could cast to a new type, then what would be the difference between Guard and Constraint? Would Guard offer a subset of Constraint's functionality, but with an easier dev experience if you already have an existing guard function?

Yes, I think if we go ahead with augmenting Constraint in this way we could just make withGuard a convenience method for constructing a constraint (we wouldn't even need a new Runtype for it).

I would probably just do something similar to what's done for Constraint currently, e.g. Refine<${show(underlying)}>, unless you have a better idea.

If we go with that extra type parameter to Constraint, then should we show the underlying type, the desired type, or both? I'm inclined to think that the desired type is the one to show to users, because the whole point of casting is to show a new face to the world.

This is an interesting point; if you're changing the type then you probably want to show that somehow, so maybe we need a name parameter. And if you're using Constraint in the traditional way without changing the type you still might want to name it somehow, e.g.

const PositiveNumber = Number.withConstraint(x => x > 0 || `${x} is not positive`, 'PositiveNumber');

BTW, I have an embarrassing admission: I have no idea how the show/reflect parts of the library work. Is there an easy way to explain the parts about the recursive use of Reflect?

The idea behind Reflect is that it allows writing external functions/libraries that can inspect the internal structure of a runtype. show (which is basically like toString()) could've been defined as a method on Runtype but I made it use Reflect as kind of a proof of concept to show how third parties could integrate. In fact Runtype.toString() just delegates to the show function. To write your own "method" which operates on a runtype, you just need to write a function which switches on the tag of the Reflect union and handles all the possible cases, recursing wherever a runtype contains sub-runtypes. The primitive types are the base cases. Hope that helps, but let me know if not.
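The pattern described above (switch on the tag, recurse into sub-runtypes, treat primitives as base cases) might look roughly like this. `MiniReflect` is a hypothetical four-variant stand-in for the library's much larger Reflect union, and `showSketch` is not the library's show.

```typescript
// A tiny stand-in for the Reflect union; only four tags are modeled here.
type MiniReflect =
  | { tag: 'number' }
  | { tag: 'string' }
  | { tag: 'array'; element: MiniReflect }
  | { tag: 'union'; alternatives: MiniReflect[] };

// An external "method" on runtypes: switch on the tag and recurse wherever
// a runtype contains sub-runtypes.
function showSketch(r: MiniReflect): string {
  switch (r.tag) {
    case 'number':
    case 'string':
      return r.tag; // primitives are the base cases
    case 'array': {
      const inner = showSketch(r.element);
      // Parenthesize unions so the array suffix applies to the whole union.
      return r.element.tag === 'union' ? `(${inner})[]` : `${inner}[]`;
    }
    case 'union':
      return r.alternatives.map(alt => showSketch(alt)).join(' | ');
  }
}

const shown = showSketch({
  tag: 'array',
  element: { tag: 'union', alternatives: [{ tag: 'number' }, { tag: 'string' }] },
}); // "(number | string)[]"
```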

I definitely like the idea of being able to customize messages of existing runtypes, and your withCustomErrorMessage idea seems like a fine approach. It won't work for more advanced cases where there are multiple possible messages (e.g. 'illegal character in email address' vs. 'email address is too long'), but Constraint should cover those cases, right?

I would think so.

@justingrant
Contributor Author

Hi @pelotom - finally getting back to this after kids' spring break and a weeklong adventure learning about how programmatically sending email that avoids Gmail spam folders is harder than it looks!

I'm now trying to figure out what should be the signature for a withConstraint() overload that allows callers to change the resulting type.

I see that withConstraint() takes an optional args parameter. At first I assumed this would be passed to the constraint function but it doesn't seem to be referenced anywhere. What is args used for? I'm asking because it's squatting on some real estate that otherwise could be a good place for an extra "change type to" parameter. ;-)

@pelotom
Collaborator

pelotom commented Apr 8, 2019

I'm now trying to figure out what should be the signature for a withConstraint() overload that allows callers to change the resulting type.

I see that withConstraint() takes an optional args parameter. At first I assumed this would be passed to the constraint function but it doesn't seem to be referenced anywhere. What is args used for? I'm asking because it's squatting on some real estate that otherwise could be a good place for an extra "change type to" parameter. ;-)

The args parameter was added in #16 / #17 to support a third party integration. TBH I've never totally understood the need for it, maybe @typeetfunc can elaborate.

That extra argument / type argument definitely complicates the types... for simplicity's sake let's imagine what it would look like if we didn't have to worry about args. I'm thinking something like this:

export interface Constraint<A extends Runtype, T extends Static<A> = Static<A>> extends Runtype<T> {
  tag: 'constraint';
  underlying: A;
  constraint(x: Static<A>): boolean | string;
}

export function Constraint<A extends Runtype, T extends Static<A>>(
  underlying: A,
  constraint: (x: Static<A>) => boolean | string,
): Constraint<A, T>;

// in Runtype:

withConstraint<T extends Static<this>>(
  constraint: (x: Static<this>) => boolean | string,
): Constraint<this, T>;

@justingrant justingrant changed the title from "InstanceOf constructor optional param for static isXXX methods" to "Allow constraint functions to optionally check a different static type" on Apr 8, 2019
@justingrant
Contributor Author

justingrant commented Apr 8, 2019

Given the existing 3rd-party dependency, would it be better to add a new withXXX function instead of trying to overload withConstraint? We'd already discussed adding a withGuard(), so we could just add one more, e.g. withRefinement or withTransform or withNarrowing or something like that? A revised Constraint runtype could service all these cases (including the args case that I don't yet understand). For example:

export interface Constraint<A extends Runtype, K, T extends Static<A> = Static<A>>
  extends Runtype<Static<A>> {
  tag: 'constraint';
  underlying: A;
  // See: https://github.com/Microsoft/TypeScript/issues/19746 for why this isn't just
  // `constraint: ConstraintCheck<A>`
  constraint(x: Static<A>): boolean | string;
  args?: K;
  desired?: T;
  name?: string;
}

export function Constraint<A extends Runtype, K, T extends Static<A> = Static<A>>(
  underlying: A,
  constraint: ConstraintCheck<A>,
  args?: K,
  name?: string,
): Constraint<A, K, T> {
  return create<Constraint<A, K, T>>(
    x => {
      const typed = underlying.check(x);
      const result = constraint(typed);
      if (String.guard(result)) throw new ValidationError(result);
      else if (!result) throw new ValidationError(`Failed ${name || 'constraint'} check`);
      return (typed as unknown) as T;
    },
    { tag: 'constraint', underlying, args, constraint, name },
  );
}

// in Runtype:

function withConstraint<K>(constraint: ConstraintCheck<A>, args?: K): Constraint<A, K> {
  return Constraint(A, constraint, args);
}

function withNamedConstraint<K>(
  constraint: ConstraintCheck<A>,
  name: string,
  args?: K,
): Constraint<A, K> {
  return Constraint(A, constraint, args, name);
}

function withRefinement<T extends Static<A>, K>(
  constraint: (x: Static<A>) => boolean | string,
  name?: string,
  args?: K,
): Constraint<A, K, T> {
  return Constraint(A, constraint, args, name);
}

function withGuard<T extends Static<A>, K>(
  guard: (x: Static<A>) => x is T,
  name?: string,
  args?: K,
): Constraint<A, K, T> {
  return Constraint(A, guard, args, name);
}

What do you think of this kind of approach?

If I'm on the right track, here's a bunch of open issues. Let me know if you have advice on these questions:

  • Should Static<> be typed as A or T? If the latter, what's the cleanest way to do it?
  • Is it good to have a desired: T property on this runtype, to facilitate analysis/reflection later? If yes, then:
    • Is it OK if it's never initialized, like _falseWitness? Or is desired unnecessary because solving the Static<> issue above requires setting the type of _falseWitness to T?
    • What should be its default if T is not specified?
  • What should show() return for a refined runtype?
  • If T is not specified (default behavior), should name be undefined, or should it get some default value? I assume the former, but if the latter: why?
  • With the POC code above, T disappears at runtime. There's no way at runtime to know that T !== A. Is this OK? If not, how should refined vs. not be exposed at runtime?
  • Would it be better to use an options dictionary instead of positional arguments and different-named functions, to allow for easier extensibility later?

@pelotom
Collaborator

pelotom commented Apr 8, 2019

What do you think of this kind of approach?

Yep, that makes sense to me.

If I'm on the right track, here's a bunch of open issues. Let me know if you have advice on these questions:

  • Should Static<> be typed as A or T? If the latter, what's the cleanest way to do it?

I'm not sure I follow. Are you asking, should Static<Constraint<A, K, T>> be equal to Static<A> or T? The answer to that is T, because the whole purpose of this is to modify the static type associated with the runtype, and T represents what the static type should be.

  • Is it good to have a desired: T property on this runtype, to facilitate analysis/reflection later? If yes, then:

    • Is it OK if it's never initialized, like _falseWitness? Or is desired unnecessary because solving the Static<> issue above requires setting the type of _falseWitness to T?
    • What should be its default if T is not specified?

I don't see a need for that, since we can recover T by using Static<>. Actually the whole _falseWitness thing is a relic of a time before we had infer, so that shouldn't even be necessary any more...
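To illustrate the point about infer, here is a minimal sketch of how a static type can be recovered from a runtype without a phantom property. MiniRuntype, MiniConstraint, StaticOf, and PositiveNumber are hypothetical stand-ins invented for this example, not the library's actual definitions.

```typescript
// Hypothetical, simplified stand-ins for the library's types (not the real definitions).
interface MiniRuntype<T = unknown> {
  check(x: unknown): T;
}

// Recover the static type with `infer`; no phantom witness property is needed.
type StaticOf<R> = R extends MiniRuntype<infer T> ? T : never;

// A constraint whose static type T may be narrower than that of its underlying runtype A.
interface MiniConstraint<A extends MiniRuntype, T extends StaticOf<A> = StaticOf<A>>
  extends MiniRuntype<T> {
  underlying: A;
}

// A branded number type, used only to make the refinement visible to the compiler.
type PositiveNumber = number & { readonly __brand: 'positive' };

const numberType: MiniRuntype<number> = {
  check(x) {
    if (typeof x !== 'number') throw new Error('Expected number');
    return x;
  },
};

const positive: MiniConstraint<typeof numberType, PositiveNumber> = {
  underlying: numberType,
  check(x) {
    const n = numberType.check(x);
    if (n <= 0) throw new Error(`${n} is not positive`);
    return n as PositiveNumber;
  },
};

// StaticOf<typeof positive> resolves to PositiveNumber rather than number.
const five: StaticOf<typeof positive> = positive.check(5);
```

In this sketch the refined type lives only in the type parameter, which matches the idea that T does not need a runtime representation.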

  • What should show() return for a refined runtype?

If we add the name parameter it can use that, otherwise I'd leave it as what show(constraint) is doing.

  • If T is not specified (default behavior), should name be undefined, or should it get some default value? I assume the former, but if the latter: why?

It seems like T and name are independent; you could want a named constraint without changing its static type, e.g. Number.withConstraint(x => x > 0, '${x} is not positive', { name: 'PositiveNumber' }) (or whatever). Or you could want a different static type without specifying a name. Or both, or neither.

  • With the POC code above, T disappears at runtime. There's no way at runtime to know that T !== A. Is this OK? If not, how should refined vs. not be exposed at runtime?

I think that's fine. The nature of this feature is about refining purely static types, so when static types get erased so does the distinction between a constraint and its base runtype.

  • Would it be better to use an options dictionary instead of positional arguments and different-named functions, to allow for easier extensibility later?

I think so... I wonder if maybe the args thing should get chucked in there too, although it would require revving a major version.

@justingrant
Contributor Author

justingrant commented Apr 11, 2019

OK, here's what I've got so far with an options object, optional static type, optional name, and withGuard helper function. Let me know what you think.

In particular:

  • I think the right way to adjust the static type is to change extends Runtype<Static<A>> to extends Runtype<T> in the Constraint<A, K, T> interface definition, but wasn't 100% sure. Is it correct?
  • If there's a custom constraint function, should it also check the underlying type first? Or should it delegate all checking to the constraint function? Or should this be an option which is false by default for constraints but true by default for guards, because guards by definition should be fully validating the desired type?
  • Did I get the show() implementation right?
  • I tried to fill in JSDoc comments for the template and function params for the withGuard/withConstraint functions, but not sure I got the JSDoc syntax correct, especially for type parameters.
  • If we're doing breaking changes anyways, should it be Constraint<A, K, T> or Constraint<A, T, K>?

Runtype Declarations

  /**
   * Use an arbitrary constraint function to validate a runtype, and optionally
   * to change its name and/or its static type.
   *
   * @template T - Optionally override the static type of the resulting runtype
   * @param {(x: Static<this>) => boolean | string} constraint - Custom function
   * that returns `true` if the constraint is satisfied, `false` or a custom
   * error message if not.
   * @param {ConstraintOptions} [options]
   * @param {string} [options.name] - allows setting the name of this
   * constrained runtype, which is helpful in reflection or diagnostic
   * use-cases.
   */
  withConstraint<T extends Static<this>, K>(
    constraint: ConstraintCheck<this>,
    options?: ConstraintOptions<K>,
  ): Constraint<this, K, T>;

  /**
   * Helper function to convert an underlying Runtype into another static type
   * via a type guard function.  The static type of the runtype is inferred from
   * the type of the guard function.
   *
   * @template T - Typically inferred from the return type of the type guard
   * function, so usually not needed to specify manually.
   * @param {(x: Static<this>) => x is T} guard - Type guard function (see
   * https://www.typescriptlang.org/docs/handbook/advanced-types.html#user-defined-type-guards)
   *
   * @param {ConstraintOptions} [options]
   * @param {string} [options.name] - allows setting the name of this
   * constrained runtype, which is helpful in reflection or diagnostic
   * use-cases.
   */
  withGuard<T extends Static<this>, K>(
    guard: (x: Static<this>) => x is T,
    options?: ConstraintOptions<K>,
  ): Constraint<this, K, T>;

Runtype Implementation

  function withConstraint<T extends Static<A>, K>(
    constraint: ConstraintCheck<A>,
    options: ConstraintOptions<K>,
  ): Constraint<A, K, T> {
    return Constraint<A, K, T>(A, constraint, options);
  }

  function withGuard<T extends Static<A>, K>(
    guard: (x: Static<A>) => x is T,
    options: ConstraintOptions<K>,
  ): Constraint<A, K, T> {
    return Constraint<A, K, T>(A, guard, options);
  }

Constraint Implementation

export interface ConstraintOptions<K> {
  name?: string;
  args?: K;
}

export interface Constraint<A extends Runtype, K, T extends Static<A> = Static<A>>
  extends Runtype<T> {
  tag: 'constraint';
  underlying: A;
  // See: https://github.com/Microsoft/TypeScript/issues/19746 for why this isn't just
  // `constraint: ConstraintCheck<A>`
  constraint(x: Static<A>): boolean | string;
  args?: K;
  name?: string;
}

export function Constraint<A extends Runtype, K, T extends Static<A> = Static<A>>(
  underlying: A,
  constraint: ConstraintCheck<A>,
  options?: ConstraintOptions<K>,
): Constraint<A, K, T> {
  // `options` is optional in the declarations, so read its fields defensively
  const name = options && options.name;
  const args = options && options.args;
  return create<Constraint<A, K, T>>(
    x => {
      // Validate against the underlying runtype first, then run the constraint
      const typed = underlying.check(x);
      const result = constraint(typed);
      if (String.guard(result)) throw new ValidationError(result);
      else if (!result) throw new ValidationError(`Failed ${name || 'constraint'} check`);
      return (typed as unknown) as T;
    },
    { tag: 'constraint', underlying, args, constraint, name },
  );
}

Reflection

  | {
      tag: 'constraint';
      underlying: Reflect;
      constraint: ConstraintCheck<Runtype<never>>;
      args?: any;
      name?: string;
    } & Runtype

Show

    case 'constraint':
      return refl.name || show(needsParens)(refl.underlying);
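As a rough illustration of the fallback behavior discussed for show(), the following sketch uses a trimmed-down reflection type. MiniReflect and showSketch are hypothetical names made up for this example; the real show() handles many more tags and the needsParens flag.

```typescript
// Hypothetical, trimmed-down reflection shape for illustration only.
type MiniReflect =
  | { tag: 'number' }
  | { tag: 'string' }
  | { tag: 'constraint'; underlying: MiniReflect; name?: string };

function showSketch(refl: MiniReflect): string {
  switch (refl.tag) {
    case 'number':
      return 'number';
    case 'string':
      return 'string';
    case 'constraint':
      // Prefer the custom name; otherwise fall back to describing the underlying runtype.
      return refl.name || showSketch(refl.underlying);
  }
}

const unnamed: MiniReflect = { tag: 'constraint', underlying: { tag: 'number' } };
const named: MiniReflect = {
  tag: 'constraint',
  underlying: { tag: 'number' },
  name: 'PositiveNumber',
};
```

Here showSketch(unnamed) falls through to the underlying description, while showSketch(named) uses the custom name.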

@pelotom
Collaborator

pelotom commented Apr 13, 2019

  • I think the right way to adjust the static type is to change extends Runtype<Static<A>> to extends Runtype<T> in the Constraint<A, K, T> interface definition, but wasn't 100% sure. Is it correct?

Yep, that's what I had in my proposed types a few posts back.

  • If there's a custom constraint function, should it also check the underlying type first? Or should it delegate all checking to the constraint function? Or should this be an option which is false by default for constraints but true by default for guards, because guards by definition should be fully validating the desired type?

I'm not sure I follow... why would anything change with regards to the runtime checks being done? If you want no runtime checks apart from those provided by the type guard, you would just use Unknown as the underlying type.

  • Did I get the show() implementation right?

Looks right to me.

  • I tried to fill in JSDoc comments for the template and function params for the withGuard/withConstraint functions, but not sure I got the JSDoc syntax correct, especially for type parameters.

I'm really the wrong person to ask about jsdoc syntax 😆

  • If we're doing breaking changes anyways, should it be Constraint<A, K, T> or Constraint<A, T, K>?

Yeah, I'd say so.

@justingrant
Contributor Author

Hmm, didn't mean to close this... was resetting to master before pushing latest code. Hopefully will get re-opened once I push!

@justingrant
Contributor Author

Nope, it didn't re-open after I pushed new code. ;-( Perhaps it's a blessing in disguise given how far this PR has morphed since the beginning. I'll open a new PR and will link back to the discussion in this one.

@pelotom pelotom reopened this Apr 30, 2019
@pelotom
Collaborator

pelotom commented Apr 30, 2019

Reopened.

@justingrant
Contributor Author

OK, after a long and winding path to this PR, here's the latest. I think that all open issues from our last go-round have been resolved. Let me know what you think. I can also commit documentation changes, but didn't want to do this until we were sure that the API and behavior were going to stick. ;-)

This PR adds two features to the Constraint runtype:

  • The ability to specify a custom name (e.g. 'PositiveNumber') that will be used whenever the underlying name or the word "constraint" was previously used in user-visible output, like error messages or show()
  • The ability to change the static type associated with a Constraint. This is a helpful escape hatch for cases where there's no existing Runtype which will do what you want. For example, if you're depending on libraries that use different versions of Node's Buffer class, you can't use the InstanceOf runtype to validate across versions, but now you can use a custom Constraint instead that will have a static type of Buffer.

To set the name, use the new, optional options parameter:

const C = Number.withConstraint(n => n > 0, {name: 'PositiveNumber'});

To change the type, there are two ways to do it: passing a type guard function to a new Runtype.withGuard() method, or using the familiar Runtype.withConstraint() method. (Both methods also accept the new options parameter noted above to optionally set the name.)

Using a type guard function is the easiest option to change the static type, because TS will infer the desired type from the return type of the guard function.

// use Buffer.isBuffer, which is typed as: isBuffer(obj: any): obj is Buffer;
const B = Unknown.withGuard(Buffer.isBuffer);
type T = Static<typeof B>; // T is Buffer

However, if you want to return a custom error message from your constraint function, you can't do this with a type guard because these functions can only return boolean values. Instead, you can roll your own constraint function and use the withConstraint<T>() method. Remember to specify the type parameter for the Constraint because it can't be inferred from your check function!

const check = (o: any) => Buffer.isBuffer(o) || 'Dude, not a Buffer!';
const B = Unknown.withConstraint<Buffer>(check);
type T = Static<typeof B>; // T is Buffer

One important choice when changing Constraint static types is choosing the correct underlying type. The implementation of Constraint will validate the underlying type before running your constraint function. So it's important to use a lowest-common-denominator type that will pass validation for all expected inputs of your constraint function or type guard. If there's no obvious lowest-common-denominator type, you can always use Unknown as the underlying type, as shown in the Buffer examples above.
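The ordering described above (underlying check first, then the constraint or guard) can be sketched with a small self-contained class. MiniType and everything defined below are hypothetical illustrations written for this comment thread, not the library's implementation or API.

```typescript
// Hypothetical mini-implementation for illustration; not the library's API.
type CheckResult = boolean | string;

class MiniType<T> {
  constructor(readonly check: (x: unknown) => T, readonly label: string) {}

  // Runs the underlying check first, then the constraint function.
  withConstraint<U extends T = T>(
    constraint: (x: T) => CheckResult,
    options?: { name?: string },
  ): MiniType<U> {
    const customName = options && options.name;
    return new MiniType<U>(x => {
      const typed = this.check(x); // underlying validation happens before the constraint
      const outcome = constraint(typed);
      if (typeof outcome === 'string') throw new Error(outcome); // custom error message
      if (!outcome) throw new Error(`Failed ${customName || 'constraint'} check`);
      return (typed as unknown) as U;
    }, customName || this.label);
  }

  // The static type U is inferred from the type guard's return type.
  withGuard<U extends T>(guard: (x: T) => x is U, options?: { name?: string }): MiniType<U> {
    return this.withConstraint<U>(guard, options);
  }
}

// A base that accepts everything, playing the role of Unknown.
const unknownType = new MiniType<unknown>(x => x, 'unknown');

const stringType = new MiniType<string>(x => {
  if (typeof x !== 'string') throw new Error('Expected string');
  return x;
}, 'string');

// With a string base, non-strings are rejected before the constraint ever runs.
const nonEmpty = stringType.withConstraint(s => s.length > 0 || 'Empty string', {
  name: 'NonEmpty',
});

// With an accept-everything base, the guard is the only runtime check.
const isDate = (x: unknown): x is Date => x instanceof Date;
const dateType = unknownType.withGuard(isDate, { name: 'Date' });
```

In this sketch nonEmpty.check(42) fails with the underlying type's error, while nonEmpty.check('') fails with the constraint's own message, which is why the choice of base type matters.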

This new Constraint implementation is backwards-compatible for existing users, except for the small minority of users who were using the undocumented args parameter as part of their automatic test-case generators. For these users, this PR will be a breaking change:

  • args? moves from an optional second parameter on withConstraint() to an optional property on the options object
  • The type of the args parameter is now the third type parameter on the Constraint generic type (K in Constraint<A, T, K>), where T (the static type of the constraint) is new. Before this PR, the args type (K) was the second type in Constraint<A, K>.

Tip: don't reset your forked branch to master and force-push. GitHub will close your PR! Oops.
Instead, I think the GitHub-friendly workflow is to reset locally and commit new code locally before pushing new code.

@justingrant justingrant changed the title from "Allow constraint functions to optionally check a different static type" to "Enable Constraint to have a custom name and/or a custom static type" Apr 30, 2019
@pelotom
Collaborator

pelotom commented May 4, 2019

Sorry for the delay in getting back to this. It looks really great! If you could add a section to the README about this feature that'd be awesome. Otherwise I think it's ready to go 🎉

@pelotom pelotom mentioned this pull request May 4, 2019
@pelotom
Collaborator

pelotom commented May 4, 2019

One other thing that occurred to me: it seems likely that Unknown.withGuard(isX) will be a very common pattern, so maybe we should export a convenience runtype from constraint.ts:

export const Guard = <T, K = unknown>(
  guard: (x: unknown) => x is T,
  options?: { name?: string; args?: K },
) => Unknown.withGuard(guard, options);

which would allow you to just write Guard(isX).

@justingrant
Contributor Author

Great, sounds like we're in the home stretch. Yep, was planning to slightly adapt the text in my last post into README content once we finalized the API, which it sounds like we just did!

Your Guard type idea is a good one. I'll add it to the next commit. Is there a need to expose it as a separate type in reflection/show? Are there any other gotchas to think about when introducing a new runtype?

BTW one interesting issue I ran into when using the new constraint type was that you can't use a function like a => a.foo when the underlying type is unknown, because the compiler complains that a is unknown. You need to do this: (a: any) => a.foo. I think this is OK, but it's admittedly a slightly odd developer experience compared to what I expected. Holler if you a) think this is bad and b) have an idea to fix it. Otherwise I think it's probably OK as-is.

@pelotom
Collaborator

pelotom commented May 5, 2019

Your Guard type idea is a good one. I'll add it to the next commit. Is there a need to expose it as a separate type in reflection/show? Are there any other gotchas to think about when introducing a new runtype?

Nope, it’s just an alias for a particular configuration of Constraint, not a new atomic run type. Nothing fancy needed.

BTW one interesting issue I ran into when using the new constraint type was that you can't use a function like a => a.foo when the underlying type is unknown, because the compiler complains that a is unknown. You need to do this: (a: any) => a.foo. I think this is OK, but it's admittedly a slightly odd developer experience compared to what I expected. Holler if you a) think this is bad and b) have an idea to fix it. Otherwise I think it's probably OK as-is.

This is an interesting point. unknown is technically the correct type, because the type guard knows nothing about its input and needs to use some sort of reflection on it with no prior assumptions. However, most preexisting type guards provided by libraries tend to use any because it's more convenient to write, even though in spirit they are accepting an unknown as input. I think this comes down to a deficiency of TypeScript, because ideally you should be able to do all necessary refinement starting from unknown, e.g.

function (x: unknown): x is { foo: boolean } {
  return typeof x === 'object'
    && x !== null
    && 'foo' in x
    && typeof x.foo === 'boolean';
}

This is all perfectly logical, but unfortunately the type system gets stuck on the last step, saying that Property 'foo' does not exist on type 'object', even though it should by that point be convinced that x: { foo: unknown }.
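One possible workaround, sketched here as an assumption rather than anything proposed in this PR, is to view the value as a string-keyed record once it is known to be a non-null object. The name hasBooleanFoo is made up for this example.

```typescript
// Workaround sketch: once x is known to be a non-null object, treat it as a
// string-keyed record so that property access type-checks without using `any`.
function hasBooleanFoo(x: unknown): x is { foo: boolean } {
  if (typeof x !== 'object' || x === null) return false;
  const rec = x as Record<string, unknown>;
  // If `foo` is missing, rec['foo'] is undefined and the typeof test fails.
  return typeof rec['foo'] === 'boolean';
}
```

The cast is still an escape hatch, but it is narrower than annotating the whole parameter as any, because every property read comes back as unknown and has to be checked.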

Anyway, I agree that it's more convenient to define type guards taking any as input, but we can't use a Runtype<any> as the base of the constraint, or else it ruins the resulting static type (any & T is any; we rely on the fact that unknown & T is T for all T). But we could make it a further convenience afforded by the Guard alias:

export const Guard = <T, K = unknown>(
  guard: (x: any) => x is T,
  options?: { name?: string; args?: K },
) => Unknown.withGuard(guard, options);

This seems like it's probably worth doing, even though it's a little bit less type-safe. What do you think?

@pelotom
Collaborator

pelotom commented May 5, 2019

On the other hand, this would mean if you had a type guard which did make some assumptions about its input, e.g. (x: {}) => x is { foo: boolean }, you could pass it to Guard with no type error, which would be unsafe. So maybe it’s better to just advise people to annotate their type guard input with any explicitly if they want to take responsibility for it.
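Both halves of this trade-off can be sketched in a few lines. The Equal helper is a common type-equality idiom, and Foo, assumesShape, and acceptsAny are hypothetical names introduced only for this illustration.

```typescript
// Type-level equality helper (a common idiom, introduced here for illustration).
type Equal<X, Y> =
  (<G>() => G extends X ? 1 : 2) extends (<G>() => G extends Y ? 1 : 2) ? true : false;

type Foo = { foo: boolean };

// `unknown & T` collapses to T, so an Unknown base preserves the guarded type...
const unknownKeepsT: Equal<unknown & Foo, Foo> = true;
// ...whereas `any & T` collapses to any, discarding it.
const anyAbsorbsT: Equal<any & Foo, any> = true;

// A guard that assumes its input is already an object; it would throw on null.
const assumesShape = (x: { foo?: unknown }): x is Foo => typeof x.foo === 'boolean';

// A hypothetical helper whose parameter is typed like the `any` variant of Guard.
const acceptsAny = <T>(guard: (x: any) => x is T) => guard;

// This compiles, because `any` is assignable to the guard's parameter type. With an
// `unknown` parameter and strictFunctionTypes enabled, the same call would be rejected.
const looseGuard = acceptsAny(assumesShape);
```

Calling looseGuard(null) type-checks but throws at runtime, which is the kind of unsafety the unknown signature rules out.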

@justingrant
Contributor Author

This was exactly the thought process I went through. My best idea was to make the T template parameter a conditional type like T extends unknown ? any : T but that didn't work as I expected. Sadly my TypeScript skills weren't good enough to figure out how to build the right syntax. (If indeed it's actually possible to create a conditional type that changes unknown to any but otherwise leaves the type alone. It might not be possible.)
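For what it's worth, a conditional type of that shape does seem expressible if the check is flipped around. The sketch below is only a type-level illustration; IsSame, UnknownToAny, and AlwaysAny are hypothetical names, and it does not establish that this would have made the Guard signature safe or practical.

```typescript
// Type-level equality helper (a common idiom, introduced here for illustration).
type IsSame<X, Y> =
  (<G>() => G extends X ? 1 : 2) extends (<G>() => G extends Y ? 1 : 2) ? true : false;

// `T extends unknown` holds for every T, so this form maps everything to any.
type AlwaysAny<T> = T extends unknown ? any : T;
const stringBecomesAny: IsSame<AlwaysAny<string>, any> = true;

// Flipping the check: `unknown extends T` only holds when T is unknown (or any).
type UnknownToAny<T> = unknown extends T ? any : T;
const mapsUnknown: IsSame<UnknownToAny<unknown>, any> = true;
const keepsString: IsSame<UnknownToAny<string>, string> = true;
const keepsObject: IsSame<UnknownToAny<{ foo: boolean }>, { foo: boolean }> = true;
```

Each constant only compiles if the stated type equality holds, so the assignments double as compile-time checks.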

I did manage, while trying out various syntax options, to trigger some amazing TypeScript perf issues where IntelliSense would hang for minutes at a time with my MacBook fan running at full blast, so I guess I did achieve something at least. ;-)

@pelotom
Collaborator

pelotom commented May 5, 2019

I think I’m in favor of keeping it simple and leaving it as unknown.

@justingrant
Contributor Author

Yep agreed

@justingrant
Contributor Author

OK Tom, I think it's (finally) ready! Here's what's different from last time:

  • Added Guard convenience type
  • Added tests for Guard
  • Added docs in the readme

Sorry it's taken so long to finish this, and thanks for your feedback... I learned a lot!

Holler if you see any problems.

Collaborator

@pelotom pelotom left a comment


Looks great! Thanks for all your hard work!

@pelotom pelotom merged commit d5bf991 into runtypes:master May 31, 2019
@justingrant
Contributor Author

Cool. Unfortunately I can't use it on TS 3.5 because of #89. Any idea how to fix that issue?

@pelotom
Collaborator

pelotom commented May 31, 2019

Nope, haven't had time to look at it yet.

@justingrant
Contributor Author

BTW I figured out a fix for the TypeScript 3.5 problems and submitted a PR to fix them.
