Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow regex specified custom string validation #56

Closed
strazto opened this issue Apr 11, 2022 · 5 comments
Closed

Allow regex specified custom string validation #56

strazto opened this issue Apr 11, 2022 · 5 comments

Comments

@strazto
Copy link
Contributor

strazto commented Apr 11, 2022

From #42 (comment)

Is your feature request related to a problem? Please describe.

We want to avoid allowing the very_dangerous_raw_string input type, and impose restrictions on allowed characters and patterns, however the input constraints change from application to application, for example, URLs have different requirements to other strings.

Though having some built-in defaults to cover common use-cases makes a lot of sense (int, uint, ascii, ascii_sentence, url, etc) there may be more complex patterns that require custom logic, eg, dttm: YY-MM-DD hh:mm , etc.

A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like

Allowing users to specify regex validation patterns in their configuration, (and maybe also a message format hint) might be a good way to accomodate this, for example:

#42 (comment)

It would be great to have an option to set a regex for string validation, like:

        arguments:
          - name: message
            type: regex
            regex: [a-z0-9:/.-]+

Describe alternatives you've considered

There might be other mechanisms to configure safe, customizable string escaping + validation, but I do think this is probably the most flexible + intuitive way to allow this.

Additional context

I also think there should be an optional config option for input format hints, which are displayed on the front-end, both for custom types and for the built-in types, eg:

Taken from regextester

        arguments:
          - name: Datetime
            type: regex
            regex: '^\d\d\d\d-(0?[1-9]|1[0-2])-(0?[1-9]|[12][0-9]|3[01]) (00|[0-9]|1[0-9]|2[0-3]):([0-9]|[0-5][0-9]):([0-9]|[0-5][0-9])$'
            hint: 'YYYY-MM-DD HH:MM:SS'

The equivalent for a builtin, say, int:

        arguments:
          - name: Some int
            type: regex
            regex: '^[0-9]+$'
            hint: 'A positive integer'

This might render as:

<p>
    <label for="some-int">Some int:</label>
    <input name="some-int" aria-describedby="some-int-desc">
    <span id="some-int-desc">A positive integer</span>
</p>

https://www.w3.org/WAI/tutorials/forms/instructions/

https://medium.com/nextux/form-design-best-practices-9525c321d759

Or just put the hint in the placeholder

@strazto
Copy link
Contributor Author

strazto commented Apr 11, 2022

var (
typecheckRegex = map[string]string{
"very_dangerous_raw_string": "",
"int": "^[\\d]+$",
"ascii": "^[a-zA-Z0-9]+$",
"ascii_identifier": "^[a-zA-Z0-9\\-\\.\\_]+$",
"ascii_sentence": "^[a-zA-Z0-9 \\,\\.]+$",
}
)

looks like these are the regex patterns for existing validations

// ActionArgument objects appear on Actions.
type ActionArgument struct {
Name string
Title string
Type string
Default string
Choices []ActionArgumentChoice
}

The config def for arguments for an action.

// TypeSafetyCheck checks argument values match a specific type. The types are
// defined in typecheckRegex, and, you guessed it, uses regex to check for allowed
// characters.
func TypeSafetyCheck(name string, value string, typ string) error {
pattern, found := typecheckRegex[typ]
if !found {
return errors.New("argument type not implemented " + typ)
}
matches, _ := regexp.MatchString(pattern, value)
if !matches {
log.WithFields(log.Fields{
"name": name,
"type": typ,
"value": value,
}).Warn("Arg type check safety failure")
return errors.New("invalid argument, doesn't match " + typ)
}
return nil
}

The implementation of actual regex type checking.

The most logical thing would be to refactor TypeSafetyCheck into a general regex check, which is called by the builtin typed versions, as well as by the regex type'd version.

@jamesread
Copy link
Collaborator

jamesread commented Apr 13, 2022

Hey @matthewstrasiotto , thanks for taking a look at this one. I agree that having the option to add custom regex will help a lot of users where they want to be more specific about input type.

Implementation guidance;

  • I think adding InputRegex on ActionArgument is the cleanest, most compatible way to handle this (not called Regex).
  • TypeSafetyCheck is called from typecheckActionArgument (here). I think a simple if statement there will do - if InputRegex's length is > 0, then use the regex, otherwise use the Type field.
  • Also, when the argument values change in the web form, the javascript calls /ValidateArumentType/, this is so that the user has a chance to fix the argument values if they're invalid, before they are really validated again before starting execution. However, /ValidateArgumentType/ is dumb, and relies on the browser to tell it which type to validate against. I don't think this is a security issue (if the browser only validates against a string, for example, instead of an int), because the executor validates against Action.Type (ie, the browser does not tell the executor the argument type). However, this creates a problem for the custom regex (we don't want to send the regex to the browser, and then back to the /ValidateArgumentType), so this feature is an opportunity to clean up the /ValidateArgumentType/ API call. It is implemented here;
    func (api *oliveTinAPI) ValidateArgumentType(ctx ctx.Context, req *pb.ValidateArgumentTypeRequest) (*pb.ValidateArgumentTypeResponse, error) {
    . The API call is defined in gRPC protobuf, though (here), so you'll see we need to update ValidateArgumentTypeRequest. It's new definition should probably be something like this;
message ValidateArgumentTypeRequest {
	string value = 1;
	string type = 2 [deprecated = true];
	string actionId = 3;
	string argumentName = 4;
}
  • Note that you'll need to run buf gen (see the Makefile) once the .proto file is updated.

Functional requirements;

  • If a InputRegex has been specified, then Type should be ignored.
  • If the user's custom regex does not compile, we should always fail the typecheck for safety, and emit an error log (but not quit the app).
  • Documentation covering this InputRegex usage on docs.olivetin.app.

Thanks again for taking a look at this, and I really appreciate your time in considering to make a contribution to OliveTin! I've got an open mind if you think I may have missed something in the guidance/requirements above, or you'd like to discuss a different approach.

@jamesread
Copy link
Collaborator

Hey @matthewstrasiotto , how are you coming with this? Is there any additional help I can offer?

@jamesread
Copy link
Collaborator

@strazto ping :-) ^^

@jamesread
Copy link
Collaborator

I decided just to implement this myself in the end.

I changed my mind, and just used a type: regex:^...$ syntax as it's should be fine, and saves yet another configuration fileld.

https://docs.olivetin.app/args-custom-regex.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants