Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nicer errors when user entered an invalid regex #4680

Open
macobo opened this issue Jun 10, 2021 · 27 comments
Open

Nicer errors when user entered an invalid regex #4680

macobo opened this issue Jun 10, 2021 · 27 comments
Labels
bug Something isn't working right enhancement New feature or request team/product-analytics
Projects

Comments

@macobo
Copy link
Contributor

macobo commented Jun 10, 2021

Is your feature request related to a problem?

Writing a correct regex is hard. Currently when user manages to write an invalid regex, they get an 5xx error back from our API.

Describe the solution you'd like

Let's instead show them a warning on the frontend?

Describe alternatives you've considered

Additional context

Thank you for your feature request – we love each and every one!

@macobo macobo added bug Something isn't working right enhancement New feature or request analytics-filters labels Jun 10, 2021
@sentry-io
Copy link

sentry-io bot commented Jun 10, 2021

Sentry issue: POSTHOG-2X8

@sentry-io
Copy link

sentry-io bot commented Jun 10, 2021

Sentry issue: POSTHOG-2X7

@sentry-io
Copy link

sentry-io bot commented Jun 10, 2021

Sentry issue: POSTHOG-2V7

@Twixes
Copy link
Collaborator

Twixes commented Jun 10, 2021

ClickHouse and the plugin server use Google's RE2 for regex matching. It does have a Node bindings package, but it doesn't work in the browser, as in the end it's a C++ binary. We could use JS standard RegExp client-side instead which would largely be OK, though it's actually somewhat more lenient than RE2, so some 500s would still slip in.
Alternatively this can be checked server-side when setting a regex field in Python, using google-re2, in which case validation would require a request, but would be fully reliable.
Postgres is a slightly different story – its regex syntax (POSIX) differs in a few places, though not greatly. In its case only a Postgres query would really reliably verify a regex. However I'd argue we should stick to RE2 validation in this case too.

@macobo
Copy link
Contributor Author

macobo commented Jun 11, 2021

I think (at least) starting with some js-RegExp validation is a valid start, always can add complexity later if a lot of errors still.

@clarkus
Copy link
Contributor

clarkus commented Jul 27, 2021

Related #4326

@paolodamico
Copy link
Contributor

paolodamico commented Jul 29, 2021

As this seems to have caused confusion a bit of times and affecting funnel usage, adding to the priorities list for this sprint. We should consider the extra context from #4326 (i.e. let's add detailed instructions about the kind of operators that are supported).

@clarkus
Copy link
Contributor

clarkus commented Jul 29, 2021

Let's focus this issue on addressing clearer instructions around regex. Secondary to that, we could use a simple error pattern for input fields to highlight errors and communicate how to recover from them. I filed #5380 to track this work.

@clarkus
Copy link
Contributor

clarkus commented Aug 4, 2021

I am working on this today and tomorrow. Do you have any guidance on what makes for a great regex authoring experience? So far I am tracking simple validation based on syntax. Would we also need some way to see how the expression evaluates to ensure that it's accurate AND well-formed? Secondary to this, I was planning on adding a simple syntax key or cheatsheat to show what properties we support and possibly even provide shortcuts for applying them into the input field that contains the regex. Any thoughts on what else we can address here? Thanks!

@paolodamico
Copy link
Contributor

I sometimes use https://regex101.com/ because it visually shows the match groups and makes it easy to build and parse a regex string.

Here's the syntax we use in CH & plugin server. I would keep the cheatsheet to the most basic expressions and then link their docs.

Whenever we evaluate an expression I'm not sure how detailed errors we'd be able to provide on malformed expressions, but I would just show an error string.

Tagging @Twixes for any additional input

@clarkus
Copy link
Contributor

clarkus commented Aug 4, 2021

Here are some ideas that are mostly intended to facilitate discussion, not so much prescribe a solution. Here's my thinking so far (please correct me if I'm wrong):

  • Writing a valid regex isn't easy. A lot of people need help or at least need a more approachable error state to help them recover
  • Depending on the string being matched, the regex can become very verbose and contain lots of characters
  • There are secondary parameters like delimiters and modifiers we could support
  • There are varying flavors of regex and our specific syntax support matters, so we should link to the corresponding documentation

My biggest question is how likely are users to author complex regex patterns in the app? It seems like there are a lot of popular single serving sites that are really good at this now. We could try to approach that level of experience with our regex editor, or we could just make it easy to copy and paste text into those fields, and alert users of malformed values. Secondary to that, are there any common regex patterns we think users will want to use? Are there any kind of utility patterns that we could make available in the UI? The idea here is to make writing a regex easier by avoiding writing the regex entirely.

Some ideas for inline options. Validation would occur on blur / de-focus, or when the user clicks "reevaluate". It was about this time I realized a textarea might be a better experience.

Screen Shot 2021-08-04 at 3 49 27 PM

Our ability to summarize a complex regex is also a bit of a challenge. A summary view and a modular editing view might be worth considering as well. In this case, once a user selects the matches regex operator, when the value field is focused, we could trigger a modal containing a large-form editor. This would allow for both simple and complex patterns.

Screen Shot 2021-08-04 at 3 46 16 PM

@Twixes
Copy link
Collaborator

Twixes commented Aug 4, 2021

Syntax highlighting would be great, and we even already use a library in the product that can do this (react-syntax-highlighter which uses Prism.js under the hood). BUT unfortunately it doesn't work with inputs or textareas. They don't allow such styling so we'd have to engineer our own component that would work exactly like input – only by displaying text using raw HTML (a problem I also recently faced in desktop/desktop#12465 🙈 ). I'd skip that for now as that would be a lot of effort for one component.

As for error messages, I definitely think they could be more useful by being specific. In fact I just made them such in #5456, which I think looks pretty fine.

I think validation should be automatic (and non-obtrusive, deblur/debounce should be good), and your mocks look really nice @clarkus. Much more coherent than the current error state message thrown in quickly.
I would actually argue the option to edit a filter value in a textarea modal could be generalized to all operators in a useful way (though a bit concerned about this UI for operators where a filter can have multiple values).

Personally for the really complex expressions I use https://regexr.com/ and they have a nice cheatsheet. We definitely can't match the power of that app… but we could put a similar cheatsheet in our one.

Regexxxx

@clarkus
Copy link
Contributor

clarkus commented Aug 4, 2021

Awesome feedback. Thank you, @Twixes.

Could we do some input fakery with a content-editable attribute on a div or some other element? That would probably get us closer to allowing structure inside that container element and hopefully then some syntax highlighting. As long as we could pull the string out of the element, it should work?

@Twixes
Copy link
Collaborator

Twixes commented Aug 5, 2021

We could do some fakery but then the input experience would be impacted (i.e. cursor moving, selection, etc.). That's why I'd rather go for the less tricky improvements first.

@clarkus
Copy link
Contributor

clarkus commented Aug 5, 2021

I did some quick searching and found what might be a viable option that uses multiple elements to emulate an editor with syntax highlighting. https://css-tricks.com/creating-an-editable-textarea-that-supports-syntax-highlighted-code/

I'm not going to pursue that right now, but wanted to share it before I forgot. Maybe we can aim for something like this in the future. I'm going to look into incorporating modular editors for other operators as well. Thanks again for the input!

@clarkus
Copy link
Contributor

clarkus commented Aug 5, 2021

I thought of another question How critical is it to have a sample string or representation of the property value being affected by the regular expression? Seems like having some sample might aid in writing a regex. I'm just now sure how representative the sample could be in this context.

@clarkus
Copy link
Contributor

clarkus commented Aug 5, 2021

Couple more iterations that focus on the modular editor. I've moved validation to a standard position and made the reevaluate action consistent. The syntax reference is also placed opposite the label so that it doesn't conflict with a more verbose validation message. I've also used some subtle code formatting within the validation string to highlight the portion that represents the invalid characters.

Available in figma at https://www.figma.com/file/9yWtngNb1AIuf6KmXaEPJA/App-doodles?node-id=923%3A134375

Screen Shot 2021-08-05 at 1 17 06 PM

@paolodamico
Copy link
Contributor

That last design looks amazing! @Twixes correct me if I'm wrong but I think we can evaluate on the go? Also by syntax highlighting you mean highlighting each match group so you can better understand how the Regexp is constructed?

@Twixes
Copy link
Collaborator

Twixes commented Aug 5, 2021

We can definitely evaluate on the go, though without some debouncing it would be annoying.
Yup, syntax highlighting would help see match groups, but also even more importantly I think, special characters.

@clarkus
Copy link
Contributor

clarkus commented Aug 5, 2021

I don't think we should validate on every key press. That could be really distracting when you're just trying to type in the expression you know you want.

@macobo's idea from #5380 (comment) could be worth considering?

User has not typed for the past N ms

@Twixes
Copy link
Collaborator

Twixes commented Aug 5, 2021

Yup, that's debouncing, 500 ms is my go-to value in such cases (but can always be tweaked).

@clarkus
Copy link
Contributor

clarkus commented Aug 5, 2021

Thanks, @Twixes. You taught me something new.

@clarkus clarkus added the UI/UX label Aug 17, 2021
@clarkus clarkus moved this from Backlog to In Progress in Design 🎨 Aug 17, 2021
@clarkus
Copy link
Contributor

clarkus commented Aug 17, 2021

Let's move ahead with the modular regex editor. We should trigger the modal when the user selects one of the regex operators from the list, or if the modal is closed and a regex operator is already selected, when the value field is selected. Once the regex is validated, on submit we will insert that regex value into the value field in the filter row. Said another way, the modal is replacing any kind of dropdown or other interstitial selection component.

Screen Shot 2021-08-17 at 12 27 27 PM

Screen Shot 2021-08-17 at 12 34 54 PM

Screen Shot 2021-08-17 at 12 35 31 PM

@clarkus clarkus moved this from In Progress to Awaiting Implementation in Design 🎨 Aug 17, 2021
@paolodamico
Copy link
Contributor

  1. Made a similar comment on Improve UX for formulas in trends #3375. Think it's worth having the option of users typing their regex expression directly in the main input (particularly thinking when you're using it in funnels) for when you want something simple. But then the user can have the option to prop open this modal to have a clearer workspace to create something complex.
  2. Should we get rid of the reevaluate button then?
  3. For the supported properties I think we could have an in-product popover with the list of expressions and then maybe an alternate option to go to the full docs. Wdyt?

@clarkus
Copy link
Contributor

clarkus commented Aug 18, 2021

I think conditional editing modes is an interesting idea. I can work on some iterations for that. The reevaluate button was really meant to let a user test their regex without having to go through the whole debounce / blur to trigger validation. Some manual means of triggering that check could help users trust that they're entering a meaningful value here. An in-product popover could be helpful. Do we have that content or can we produce it to test this idea? It's really a question of how complex that cheat sheet could be.

The core change here is validation messaging. If we can ship that, then we could watch how users react and adjust our next iterations based on that. I'm not sure the other pieces are necessary to really improve the experience... I mean they would, but if users are authoring regex on external tools and copy/pasting into the control, then the cheat sheet and manual validation controls aren't as necessary.

@clarkus
Copy link
Contributor

clarkus commented Aug 18, 2021

Here's an inline version of the regex editor. This removes the validation button. I am still working through how we might trigger a modular version of the editor.

Regex - inline

https://www.figma.com/file/9yWtngNb1AIuf6KmXaEPJA/App-doodles?node-id=1076%3A135967

@Twixes
Copy link
Collaborator

Twixes commented Jun 29, 2023

+1 based on ZEN-3909 (in particular we only return a server error when you try to use lookahead or lookbehind, which isn't supported by re2)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working right enhancement New feature or request team/product-analytics
Projects
No open projects
Design 🎨
Awaiting Implementation
Development

No branches or pull requests

5 participants