Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default validation transformers are too loose #54

Closed
joshuajabbour opened this issue Jun 9, 2016 · 15 comments
Closed

Default validation transformers are too loose #54

joshuajabbour opened this issue Jun 9, 2016 · 15 comments

Comments

@joshuajabbour
Copy link

joshuajabbour commented Jun 9, 2016

Currently yup's validation and cast behavior for booleans seems way too greedy: https://tonicdev.com/joshuajabbour/yup-greedy-cast Even with the strict flag, the results are counter-intuitive. Ultimately, I think most people will need to overwrite the default transformer.

Personally, I think the boolean transformer should be more restrictive, and only accept true, false, 'true', 'false', 1, 0, '1', '0'. At the very least, the docs need to detail that 'whatever1' === true, etc. (All the types need transformer documentation actually.)

@joshuajabbour
Copy link
Author

Or more closely follow the logic of Joi: https://github.com/hapijs/joi/blob/master/lib/boolean.js

@jquense
Copy link
Owner

jquense commented Jun 9, 2016

hi, not quite sure what you mean by greedy. the logic coerced any value to a boolean. and with strict true, it will not do any coercion

that a trailing 1 is coerced to true is a bug, regex should be ^(true|1)$

@joshuajabbour
Copy link
Author

joshuajabbour commented Jun 9, 2016

Well, the trailing 1 is exactly one of the things I meant by greedy. But the other problem is a value is always coerced by .cast(). false is an explicit value (not necessarily the absence of a value), and anything except true, 'true', 1 will be coerced into false (once the bug is fixed). That's problematic in my opinion. What I'm saying is that coercion and casting should be way more particular, and only coerce actual boolean-equivalent values (like Joi does... they count true/false/yes/no/on/off/1/0 only as boolean equivalent.

One might argue the way to solve this dilemma is by making that field .nullable() and passing null when you don't want a value. Which works when done intentionally. But it still doesn't catch errors (e.g. passing truee silently sets the value to false instead of an error).

@joshuajabbour
Copy link
Author

joshuajabbour commented Jun 9, 2016

Here's another explanation, using number, which also has this problem (which is even more clear).

  • In strict mode, the valid values are 1 and Number(1) (but also Number('whatever')?).
  • In non-strict mode, '1', Date() and true/false also become valid "numbers".
  • But there's no way to get '1' as valid, without also getting those other, less obvious values to be considered valid.

I guess the problem is that what is considered "valid" is all over the place (and not documented at all). I'm trying to use yup to validate REST API input, so everything comes in as strings. I'd love yup to smartly coerce my values, but it seems like I'll need to reimplement every transformer to get that to happen.

@jquense
Copy link
Owner

jquense commented Jun 10, 2016

I hear you, and sympathize. The problem is there there is just no "right" way to coerce values that works for everyone and every use case, or even most use cases.

The casting mechanisms of yup aren't intending to be "smart" about coercion; its merely type coercion. The main use case is parsing json responses from a server where all the primitive values are already the correct types (generally). There are a few cases where it does some smarts but it's intentionally minimal, and in keeping with the language. The goal of cast is to take a value and output the right type out the other end. For values that have JS "invalid" representations (NaN, invalid date, null for objects/arrays) we use them otherwise we coerce the value to the type.

I'm not see the weirdness you are seeing with strict mode, if you run it with strict: false the coercions don't run at all, so if your validations are giving unexpected responses it's usually because Javascript is weird on the edges (or a bug with the type checking), and despite precautions the language will still just coerce stuff even if you don't want it to ('2' >= 1 for instance) to which there isn't much we can do about that.

all that to say: there is no such thing as the "right way" or even a consistent way to coerce values between types. Why coearce "yes" but not "si" or "no" and not "nein", "1" but not " 1 "? we try to have the least amount of opinions about it, and make it really easy to customize to your specific use-case, instead of picking one opinion and forcing you to have it.

@jquense
Copy link
Owner

jquense commented Jun 10, 2016

btw I do appreciate these issues, choices have been made with certain tradeoffs, use-cases, and legacy in mind, but it's always helpful to occasionally revisit those choices to see if they still make sense, and if the original reasons or issues are still applicable.

@joshuajabbour
Copy link
Author

Thanks, and I appreciate you taking the time for this conversation. I will make a final argument that I believe Yup is as opinionated as Joi; the opinions are just non-obvious. Anyway I had a much longer response written, but it sounds like you've made the decisions intentionally (which is totally fine; it's your tool), so I'll leave it at that. Ultimately I need a validation tool, not a coercion tool, so I'll keep looking. Thanks again for your work here; it is interesting (and I hope Joi picks up a few of your ideas).

@jquense
Copy link
Owner

jquense commented Jun 10, 2016

I would just add before you head off, yup is a validation tool however we keep the validation and coercion bits independent, which is very different from Joi. They are often used used together (such as parsing and validating a response), but also separately. It seems like you generally just need to validation bits, why not use it with strict false, that seems more mostly meet your needs (admittedly you did find two bugs there).

I am open to reworking the base coercion bits somewhat; In earlier versions of yup "failing" during a cast was not acceptable or helpful, which is partly why its so aggressive about giving you the type back. It's still unclear what the right way to fail is for both calling cast independent of validate (do you just throw a type error?) and used in the context of validate (still throw? Turn it into a validation error?)

@joshuajabbour
Copy link
Author

joshuajabbour commented Jun 10, 2016

Ultimately it comes down to the fact that I wouldn't call the current behavior "minimal" or "consistent". Nor would I say Yup "output[s] the right type out the other end". I would actually argue the opposite; the current behavior is confusing, undocumented, very opinionated, and (IMO) incorrect.

Because coercion isn't really independent; by default Yup coerces all input (and as I've demonstrated, in a lot of non-obvious ways) due to strict: false. Meaning a user installing this library and following the docs would not really have a validator (at least not in the sense most would expect, I'd argue). (Plus .cast() behaves differently than internal coercion; it always greedily coerces even in strict mode.)

While Yup does separate the coercion and validation internally, that's an implementation detail which is not clear without digging. I'm not sure most users would realize that strict dramatically changes how Yup works (for the better). But then once you enable strict mode, you're unable to transform yourself, correct? So there's no clean way for me to add '1' === 1 to the strict validation logic.

In my opinion, only strict mode should exist and it shouldn't even be possible to disable it. It should just be how Yup works, but should still run all added transforms. That way if a user wants to alter how the default (strict) validation works, they'd add their own transforms (if they want new Date() === number or new Number('abc') === number or 'false' === true, all of which pass validation now).

I get that you're mostly replicating Javascript behavior here, which is why some of these weird edge cases exist. But I think that's incorrect; if I wanted that I don't need a library, I'll just use vanilla Javascript. I want my validation library to standardize and "fix" the issues with JS and give me sane defaults.

@joshuajabbour
Copy link
Author

joshuajabbour commented Jun 10, 2016

And to be clear, I need both validation and coercion. But I want very limited set of coercions applied and then validated; I only want values like '1' => 1 and 'true' => true, but not new Date() => 129849... or 'false' => true. And as far as I can tell, there's no easy way to get that behavior. (I realize I can probably use non-strict mode and override every internal transform, but that's the whole point of this issue.)

@jquense
Copy link
Owner

jquense commented Jun 10, 2016

Ok while I appreciate feedback, I don't need you to make sweeping assumptions about what "all" or "most" users of the library expect or want, its not helpful and you literally have no information to make those assumptions whereas I have used this library in production in many different places, with very different use-cases and have maintained, answered and supported plenty of users over the years so please don't talk down to me about what is obviously wrong, or unintuitive.

Instead why not talk about what you'd like to see. I already posed two questions about how the library should handle cases where it cannot "reasonably" coerce to a type, throw? fail quietly? let validation catch it? what about if they aren't validating?

by default Yup coerces all input (and as I've demonstrated, in a lot of non-obvious ways)

those 2 in strict mode, I granted are bugs (one of which one with boxed primitives which exactly 3 ppl use :P), and appreciate that you pointed them out.

While Yup does separate the coercion and validation internally, that's an implementation detail which is not clear without digging.

It is not an implementation detail cast is one of the 3 public methods, and the distinction is explicitly noted in the documentation as an differentiation from Joi

Plus .cast() behaves differently than internal coercion; it always greedily coerces even in strict mode.

Not sure what you mean with this one. they are one and the same logic. There is zero coercion in strict mode. literarily none, yup doesn't run any coercion logic. What you arguing for is that the language throw when you try and donull > 2 instead of returning false (which yup protects you from if its not nullable).

In my opinion, only strict mode should exist

This is another way of saying "Only encode my opinions"; you aren't asking for no strict mode, you are saying "always coerce values according to a set of rules and don't allow them to be configurable", which isn't the same thing as disabling coercion. There are specific use-cases that are served by both modes, of which i didn't invent as Joi does the same thing.

but not new Date() => 129849... or 'false' => true

I would disagree on the Date one, dates to numbers is a common coercion (but again this is all just opinions in the same way "Which is more like an apple? A banana, or tangerine?" is). Not sure what you mean by the second, 'false' is coerced to false

@joshuajabbour
Copy link
Author

joshuajabbour commented Jun 10, 2016

so please don't talk down to me about what is obviously wrong, or unintuitive.

I'm sorry, I did not mean to talk down to you. That was not what I was trying to do. Rather I'm just saying that I believe most users would not expect this default behavior from a library that does validation per the examples provided.

Instead why not talk about what you'd like to see.

I have, repeatedly in this thread, as well as in that most recent comment. Here's the simplest example, which I will repeat.

// also: https://tonicdev.com/joshuajabbour/yup-date-accepted-as-number
var yup = require('yup')
var schema = yup.object().shape({
  age: yup.number().required().positive().integer()
})
var strictSchema = yup.object().shape({
  age: yup.number().strict().required().positive().integer()
})

schema.isValid({ age: 24 }) // true
schema.isValid({ age: '2016-06-09T20:46:30.562Z' }) // true?
strictSchema.isValid({ age: '2016-06-09T20:46:30.562Z' }) // false

// Arguable, but a fine _opinion_, however this is different from the logic used during validation:
schema.cast({ age: '2016-06-09T20:46:30.562Z' }) // `2016`
strictSchema.cast({ age: '2016-06-09T20:46:30.562Z' }) // still `2016` what?

This is such a huge vector for bugs and attacks, and it's completely hidden from the user. Maybe "most" people don't expect their validator to conform to their defined rules, but then I'm not really sure I understand what they are using a validator for.

Plus .cast() behaves differently than internal coercion; it always greedily coerces even in strict mode.

See my example. .cast() does not use the transforms, or honor strict mode. It does its own thing, often casting to different values than what was used in the validation.

you are saying "always coerce values according to a set of rules and don't allow them to be configurable"

No I am not. You completely ignored the next sentence, which refutes your point: "but should still run all added transforms". Strict mode isn't an opinion, it's following the desired rules to the letter. If I ask to validate something as a positive integer, then only positive integers would be accepted. If I wanted to also accept dates, or string numbers, or booleans, then I could add a transform. Unless I'm missing something, this is not possible now.

I would disagree on the Date one, dates to numbers is a common coercion.

Sure, it is. But not being opt-in? And assuming I want something I call a "positive integer" to accept a date? That is a giant opinionated choice. Again, see the example I included above. If I want to accept dates for a number field, I'll add a date validator to it. Same with accepting booleans for a number.

Not sure what you mean by the second, 'false' is coerced to false.

You are correct. Sorry, I misread my notes. I meant to say 'asdf' => false.

Ultimately what I want is to be able to define a bunch of validation rules and have confidence they are being honored. I can't do that in non-strict mode, and in strict mode I lose the flexibility of Yup (transforms/cast). And again, that's fine. We obviously have different ideas about what a validation library should do, and how opinionated it should be (I'd prefer much less, but if I have to accept opinion, then Joi's opinions are clear to me).

@joshuajabbour joshuajabbour changed the title Boolean transformer is greedy Default validation transformers are too loose Jun 10, 2016
@joshuajabbour
Copy link
Author

I already posed two questions about how the library should handle cases where it cannot "reasonably" coerce to a type, throw? fail quietly? let validation catch it?

Yes, validation should catch it and fail. This is my entire point.

what about if they aren't validating?

Then don't coerce that value.

Again, the end user has little awareness of, or control over, what logic is used for coercion, validation, and casting.

@jquense
Copy link
Owner

jquense commented Jun 10, 2016

ok your concerns seem to reflect a misunderstanding of how this library works and many of the examples you are giving look weird because you are using it wrong.

There are two pipelines. transformation, and validation.

If you run validate with strict false only the the validation pipeline is run, and you validate exactly what you put in. when it's true it runs the transform pipeline, then passes the output to the validation pipeline

Strict doesn't affect cast because its not an option of cast, a "strict" cast would be a complete noop, it would just be value => value. It makes no sense to "strictly" cast something, because strict means do not cast, so yeah if you cast a value against a schema and pass "strict" to it will we be ignored. The type coercion logic is never going to be decoupled from the rest of the transforms, thats intentional and has drawbacks but it also enables a lot of things that other libraries like Joi can't do.

See my example. .cast() does not use the transforms,

This is incorrect, that is the only thing cast does. you can open up the code and see for your self.

  • .cast(value) run all transforms against value

  • .isValid(value, { strict: true }) Run all validations against value

  • .isValid(value, { strict: false }) Run all transforms and validations against value

    And per the documentation for .strict():

Sets the strict option to true. Strict schemas skip coercion and transformation attempts, validating the value "as is".

You keep conflating the two concepts, which are distinct and intentional separate. I appreciate that it doesn't make sense for those things to be distinct for you, and thats fine, lots of ppl do, and much of the useful features of yup fall out of that design choice.

It seems like the only real issue you have is that you disagree with the default coearcions for boolean and number, thats fine, plenty to disagree with, The likelihood that we will ever agree is far fetched, not just because we differ but because its arbitrary and use case-specific. SO what I gather from this not meeting your case is that it is hard to change the default type transforms.As I care about being flexible and configurable it seems like that is worth fixing.

second is the question is whether "don't fail a cast" is still a reasonable approach, your position on that is clear. There were specific reasons why that choice was made, and I need to look back and see if those things are still relevant.

@joshuajabbour
Copy link
Author

joshuajabbour commented Jun 10, 2016

If you run validate with strict false only the the validation pipeline is run, and you validate exactly what you put in. when it's true it runs the transform pipeline, then passes the output to the validation pipeline

Pretty sure you got that backwards.

Strict doesn't affect cast because its not an option of cast, a "strict" cast would be a complete noop, it would just be value => value. It makes no sense to "strictly" cast something.

Sure not for a strict value, but not everything has to be strict. Look at this example, where I have the age field set to strict. Otherwise, what is the point of the .strict() method? If I had the following, I'd expect a .cast() to ignore age, but coerce date, just like .validate() does the right thing here:

yup.object().shape({
  age: yup.number().strict().required().positive().integer(),
  date: yup.date()
})

Now maybe this is a problem with the documentation, where this isn't explained. So maybe I'm making some assumptions here. But what I describe is sure what I'd like to happen.

This is incorrect, that is the only thing cast does.

Ok, this is due to me misreading/misunderstanding the default transforms (again). The date string gets converted to 2016 because it's a string, a Date object gets converted to a timestamp.

I appreciate that it doesn't make sense for those things to be distinct for you.

It makes total sense they are separate, I have no problem with that. But per above, it just shouldn't be all or nothing. The way it works now is both too opinionated (because the default transforms), and not flexible enough (due to all or nothing casts).

The likelihood that we will ever agree is far fetched, not just because we differ but because its arbitrary and use case-specific.

Exactly what I've tried to say many times. Too much opinion, too hard to override.

it is hard to change the default type transforms. As I care about being flexible and configurable it seems like that is worth fixing.

Yes this is the key (plus the fact that it's not documented).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants