Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

custom transformers / validators for input data #496

Closed
2 of 4 tasks
simllll opened this issue Oct 3, 2019 · 23 comments
Closed
2 of 4 tasks

custom transformers / validators for input data #496

simllll opened this issue Oct 3, 2019 · 23 comments

Comments

@simllll
Copy link
Contributor

simllll commented Oct 3, 2019

Sorting

  • I'm submitting a ...

    • bug report
    • feature request
    • support request
  • I confirm that I

    • used the search to make sure that a similar issue hasn't already been submit

Issues

Right now it's not possible to add a custom validator / transform for the input that comes to a controller method.
E.g. if you have a custom type (like we were discussing in #483) it's not possible to validate this input data. Or even if you have just a string that you want to validate against something. E.g. if it is a valid ID or something else (more complex) that can not be covered by typescript definitions.

Possible Solution

Allow adding custom validators/transforms for input data. E.g. like strings are converted to numbers right now, I would like the ability to add a custom tranformation for a specifiy key, but this transformation can also fail and will be handled like the current validation erros

I'm thinking about a second parameter to the @ Body or @ Query annotation where you can pass a custom function. This function will override any other default validator/transformator for this parameter, but the function get's a reference to the "default" validator so it can also be used in the custom transform/validator function.

This is just a draft of the idea. All comments welcomed.

Breaking change?

no breaking change

@simllll simllll changed the title custom transforms / validators custom transformers / validators Oct 3, 2019
@simllll simllll changed the title custom transformers / validators custom transformers / validators for input data Oct 3, 2019
@WoH
Copy link
Collaborator

WoH commented Oct 3, 2019

I feel like this is something that should be done in userland.

I don't (yet) see the value of not casting in the controller, but adding a new mechanism and related complexity to the framework?

@simllll
Copy link
Contributor Author

simllll commented Oct 3, 2019

For my understanding I though the validators are acutally doing exactly this, but right now only for "pre defined" types.

E.g. If you look into the validatorservice

const numberValue = validator.toInt(String(value), 10);

here the input (which is by nature a string in the case of a query param) is transformed to an integer [transformer] (if it can be transformed to an integer [validation]).
After the validationservice has passed all attributes are validated and returned / transformed to their corresponding type.

The custom validator would just extend this functionallity to do more flexible validations / transformations.

Another example:
We have one endpoint which gets multipart/form-data as input, to parse this kind of data we use the multer.none() method. Unfortnautnely this method returns a different object than a plain js object. Therefore all validation fails afterwards, a simple transformation like this: req.body = { ...req.body }; solves the problem already. But right now we have insterted a custom route entry that is run before registerRoute is called which 1. calls multer (this will be solved by the middleware annotiation) 2. does the custom tranformation.

This is pretty ugly in many ways, as it takes place on a completely different file, and is hard to understand.

e.g.

app.post('/some-multipart-endpoint', multer().none(), (req, res, next) => {
	req.body = { ...req.body };
	next();
});

RegisterRoutes(app);

@dgreene1
Copy link
Collaborator

dgreene1 commented Oct 3, 2019

@simllll how would a custom transformer avoid multer? Like if someone uses bodyparser as a middleware in koa, then every incoming request will be parsed as JSON.

Is my question clear? (I'm struggling to phrase this)

I guess to put it another way, aren't most middleware functions applied across all routes?

So even if we provided some hook that allowed consumers to serialize the raw http string into a new object, wouldn't we still get hit by the middleware that the person includes first?

@simllll
Copy link
Contributor Author

simllll commented Oct 3, 2019

Yeah sorry that I mixed that in my example, I thought it was necessary to see the whole problem and what can be solved with this idea.

Multer can not be avoided with this approach at all, there is probably always some kind of middleware. Multer has the only difference to any average middleware function that it is only applied to certain routes with custom parameter (e.g. what's the name of the field for the file upload). Multer also throws some errors if the input is not as expected and therefore can not be applied as regular middleware function for all routes.
Anyway, this will be not solved by this, absolutely right, but it would be solved with #62 .

The problem that is still left, is that some transformations need to be done to get the output of the middleware to be compatible with tsoa. You are right in this case you could also just add a custom middleware function that processes multer and does the transformation after it immediately. But this is just one case, there are more cases where something is in the body that needs to be transformed and validated before tsoa or the controller afterwards can work with it. This is already happening e.g. for integer parsing, custom objects etc. But not possible for e.g. custom types or "complex" type aliases because the validation is only happening on build time within typescript (and therefore not even possible for tsoa to do an "automatic" input validation on runtime).

Do you see what I try to explain here? ;)

@WoH
Copy link
Collaborator

WoH commented Oct 3, 2019

For my understanding I though the validators are actually doing exactly this, right now only for "pre-defined" types.

These are specified by JSON Schema and can be expressed in Swagger/OpenAPI documents.
Right now I see the job of the tsoa validation layer to check if the input matches the spec. If it does, pass it to the controller. Basically, a user of your API should know why it did error based on the swagger spec.

The tuple pr is a great example here: We can't express the order of the types in the API spec, therefore, we could throw a validation error to satisfy the Controller's (Typescript) requirements, but you would not know if should error from only looking at the API Spec.

I'm not saying this is a bad idea, but I feel like supporting that could be a lot of work.

@github-actions
Copy link

github-actions bot commented Nov 3, 2019

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions
Copy link

github-actions bot commented Dec 4, 2019

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the Stale label Dec 4, 2019
@WoH WoH removed the Stale label Dec 4, 2019
@Eoksni
Copy link

Eoksni commented Dec 6, 2019

I don't relate to the multer example, but my thoughts on the topic is that doing such transformations and validations in userland has a drawback: it can't affect generated spec. I'd like to specify some custom format, for example parenthesized strings:

interface RequestBodyDto {
  /**
   * @format parenthesized
   */
  someKey: string;
}

So I would accept { someKey: "(hello)" } and reject { someKey: "hello" }. Then lets say I hate my customers and I dont actually need those parentheses and I want to remove them as transformation step while still forcing consumers to send those parentheses. For whatever reason. So my controller will expect { someKey: "(hello)" } to be transformed to { someKey: "hello" }.

A very nice way to achieve all that would be somehow telling tsoa about implementation of such custom format by providing something like this:

// parenthesized-formatter.ts

@Formatter('parenthesized') // not-yet-existing decorator
export class ParenthesizedFormatter {
  validate(input: unknown): unknown {
    // check if valid
    if (isString(input) && input.startsWith('(') && input.endsWith(')')) {
      // and apply some transformation
      return input.substring(1, input.length - 2);
    } else {
      throw ValidateError(...);
    }
  }
}

And pointing to such files in tsoa.json:

{
  // ...
  "customFormatterGlobs": ["*-formatter.ts"]
}

Then tsoa would apply such validation/transformation and our controllers would get clean data, AND also output a format property in the spec (it allows for custom formats, at least openapi 3).

This would conform to the tsoa philosophy of "specify in one place, reuse it everywhere" (at least thats how I understand tsoa philosophy).

Doing all that in userland means we still have to specify jsdoc @format ... in order for it to get passed to spec file, but we also need to remember that we specified that format and apply our validation. And everything that requires "remembering" can always get out of sync, which is not great.

I believe it shouldn't be very hard to implement, I can do it myself if you are willing to accept such PR.

P. S.
Even better if such custom formats would accept parameters, like * @format parenthesized 2 and it would enforce 2 parentheses like "((hello))". Such parameters could be passed to the constructor of the custom formatter class.

@dgreene1
Copy link
Collaborator

dgreene1 commented Dec 6, 2019

@Eoksni sounds interesting but it doesn’t seem like a real scenario. If anyone else can provide a realistic scenario that would be helpful.

For instance, I would like my routes to expect a JsJoda LocalDate. In order to do that I would need either need:

  • Option A: a custom formatter (like what you have above)
  • Option B: simply covert the string query param to a LocalDate object inside the route method (I.e. “in user land” as @WoH says)

For now I find Option B to be easy enough and also more transparent. And thanks to tsoa’s existing isDay formatter I can still specify the type of the string in Swagger.

In other words, I haven’t needed Option A yet.

@Eoksni
Copy link

Eoksni commented Dec 7, 2019

@dgreene1 I partly agree, conversion isn't that big of a deal. On other hand, custom validation is probably a big deal - tsoa provides a validation layer (only for built-in types/formats), but have you ever seen a validation library without support for custom validation?

Most ORMs have support for custom validation and it would be great if we can "lift" those validations from model-level to tsoa/DTO-level so it can fail as early as possible if the request is not in a correct format, while also providing additional and actual documentation information about that format to the consumer.

As for more concrete realistic scenarios

  • We have formats for date, for date-time, but what about time for time portion or for timespans? For example we need to specify just "05:12:30" instead of full date, or something like "2 days 3 hours" for a timespan.

  • At the very least, custom formatter can be used to name a complex regexp pattern to improve readability. Have you seen how crazy regexps for email validation can be?

  • As for more examples, here quote from the openapi spec https://swagger.io/docs/specification/data-models/data-types/#string

However, format is an open value, so you can use any formats, even not those defined by the OpenAPI Specification, such as:
email
uuid
uri
hostname
ipv4
ipv6
and others

Dunno, maybe a better long-term solution is to delegate validation capabilities to some external popular validation library? Just as a suggestion.

@dgreene1
Copy link
Collaborator

dgreene1 commented Dec 7, 2019

I hear you but the challenge is how would the formatter/validation describe the inputs versus outputs. And I’m particular concerned with this for query parameters.

I have to think on this. Let’s let it sit to see if others would also like this. It bloats the library. I’m not against maintaining this addition but I’d like to see what the community desires since there is a very clear workaround that meets your needs of being able to communicate the format of the string via OpenAPI.

Again, I’m not saying no, I’m just saying let’s wait a bit. For anyone reading this please feel free to comment below with your interest or concerns.

@WoH
Copy link
Collaborator

WoH commented Dec 7, 2019

validations in userland has a drawback: it can't affect generated spec.

That's precisely the point why it would be bad to add it.
I'll give a brief example. Given a controller method

  @Get('/blog')
  public async getBlog(
    @Header() h1: string,
    @Header() h2: string,

we need to be able to look at this and expect to get h1 and h2 from the request (and therefore inject and document h1 and h2).
If we had an @Middleware(myMiddleware), that was applied, it may be that middleware that sets those headers and therefore should not be part of the request spec. I don't see any way we could predict that.

So what you can do, basically the escape mechanism, is to grab the request object, which usually is a bad option (when you can do the validation inside the controller method) after applying the middleware and accessing the properties the middleware sets on the request via request.propertyFromMiddleware.

I feel like the primary thing tsoa should do is enforcing OpenAPI on top of TypeScript. See below:

So I would accept { someKey: "(hello)" } and reject { someKey: "hello" }. Then lets say I hate my customers and I dont actually need those parentheses and I want to remove them as transformation step while still forcing consumers to send those parentheses. For whatever reason. So my controller will expect { someKey: "(hello)" } to be transformed to { someKey: "hello" }.

Perfect case of a simple reusable function that takes one parameter and returns a boolean.
You could even wrap that in your own decorator for function decoration, I guess my issue really is that I don't see the benefit of moving this into the framework?
In case you can't use @format to document a custom format, that should be a PR.

but have you ever seen a validation library without support for custom validation?

Dunno, maybe a better long-term solution is to delegate validation capabilities to some external popular validation library? Just as a suggestion.

Tsoa uses a validation lib (https://github.com/validatorjs/validator.js) for most of the basic building blocks. It provides simple functions for validating (string) inputs. There is no "extension" mechanism since it's not required and probably would not make sense.

However, I certainly see the point of moving validation into a json-schema validation lib, I have entertained the idea of rewriting the validation layer as a wrapper around using AJV at various points in time (esp. since our swagger probably 95% compatible with json-schema and therefore AJV). I think it makes sense at that point to discuss providing hooks into the custom keyword functionality ajv offers.

@dgreene1
Copy link
Collaborator

dgreene1 commented Dec 7, 2019

We already decided against using ajv or joi internally when we discussed issue #416.

However, nothing precludes a tsoa User from using a validation library inside of the route method.

Also, I feel like tsoa is flexible enough to allow a developer to document any and all of the string formats. I guess we just haven’t documented that well.

@WoH
Copy link
Collaborator

WoH commented Dec 7, 2019

We already decided against using ajv or joi internally when we discussed issue #416.

I initially closed this because luke was talking about an unreleased version of this which I wanted to take a look at first.

However, nothing precludes a tsoa User from using a validation library inside of the route method.

Or replacing the generated output/validation through different templates and sharing these via npm.

@Eoksni
Copy link

Eoksni commented Dec 7, 2019

Yeah, I believe using a custom template is the simplest solution for me at this point.

@WoH Still, I think there is a bit of misunderstanding.

Perfect case of a simple reusable function that takes one parameter and returns a boolean.
You could even wrap that in your own decorator for function decoration, I guess my issue really is > that I don't see the benefit of moving this into the framework?

So I tried to explain it earlier, but I guess I wasn't successful at it. So I take another try :)
Once again, my suggestion is that specifying custom @format should enable custom validation at runtime. I don't talk about whole middleware that can inject additional properties into headers or whatever. No, my point is about this statement about tsoa philosophy from the readme:

Runtime validation of tsoa should behave as closely as possible to the specifications that the generated Swagger/OpenAPI schema describes.

Currently, if I specify that some property in the request body is string, tsoa ensures that it is string. If I specify that some property is number, tsoa ensures that it is number. And so on. That is AWESOME! But why not bring some customization into it? Openapi spec allows us to specify custom format, so why can't we specify custom validation according to such custom format?

Adding "userland" validation is pretty much the same as not using tsoa at all. I might as well write spec myself from scratch, specify there my custom format and then add my custom validation decorator to my controller. But it is errorprone, I can forget to update my docs very easily when I change my runtime validation. Awesomeness of tsoa is that it ensures the generated spec to be up-to-date because of the single source of truth - I specify that my model expects someKey property of type string and BOOM! Runtime validation is in place. That is great, and doing same thing for custom formats/types would be even greater thing.

@WoH
Copy link
Collaborator

WoH commented Dec 7, 2019

So I tried to explain it earlier, but I guess I wasn't successful at it. So I take another try :)
Once again, my suggestion is that specifying custom @format should enable custom validation at runtime. I don't talk about whole middleware that can inject additional properties into headers or whatever.

I somewhat agree here and agree that it wasn't smart to refer to middleware and input validation in the same paragraph in my comment, which happened due to the context of the conversation with how broad the initial issue discussed in here was. I might have generalized a bit too much. However, I referred to your custom format explicitly:

In case you can't use @format to document a custom format, that should be a PR.

So let's first of all agree @format myformat should work.
The only thing beyond that I could see is something you mentioned:

Openapi spec allows us to specify custom format, so why can't we specify custom validation according to such custom format?

I would agree that a hook for format validation (as a specific case of custom validation that's supported by the spec) might be useful.

@Eoksni
Copy link

Eoksni commented Dec 7, 2019

I believe I can use @format myformat to have it show up in the generated spec. But specifying it in the spec doesn't mean it introduces any actual validation. So if I could specify a custom validation for my custom format, I'd be the happiest person on Earth :)

I'll start by implementing my custom template to fiddle with the validation process, and if I see something good coming out of it, I can share and maybe we can come up with something great eventually. Thanks for the discussion and the responses, its been a pleasure.

@dgreene1
Copy link
Collaborator

dgreene1 commented Dec 7, 2019

@Eoksni instead of a custom template can you create a PR that enhances format to accept a regex?

I’ve made a separate issue (#543) where we can discuss this new feature as opposed to this current thread which should continue to discuss the potential need for a middleware decorator.

@WoH
Copy link
Collaborator

WoH commented Dec 8, 2019

Pattern and format are different things. Format applies to every data type, pattern is restricted to strings.

@github-actions
Copy link

github-actions bot commented Jan 8, 2020

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the Stale label Jan 8, 2020
@github-actions
Copy link

github-actions bot commented Feb 8, 2020

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the Stale label Feb 8, 2020
@WoH WoH removed the Stale label Mar 15, 2020
@Jofemago
Copy link

Jofemago commented Mar 29, 2023

i have a question here, if i want to validate a paramater of the body, i can define that this parameter is string, but i want validate the format of the sting to be an uuid v4 or to be a string whitout spaces, now i can't do this whit the @Format, Tsoa can't do these types of validations ?

@WoH
Copy link
Collaborator

WoH commented Mar 29, 2023

Yes, using a @pattern + the regex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants