Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex exception when route template placeholder contains invalid character ')' #2116

Open
wenneberg6 opened this issue Jul 5, 2024 · 10 comments
Assignees
Labels
bug Identified as a potential bug medium effort Likely a few days of development effort Routing Ocelot feature: Routing small effort Likely less than a day of development effort.

Comments

@wenneberg6
Copy link

Expected Behavior

Given a route parameter including a ) character, then a 4** status code is expected.

Earlier versions before 19.0.4 returned status code 404.

Actual Behavior

Exception which causes a response with status code 500.

StackTrace:
Exception caught in global error handler, exception message: Invalid pattern '\b*path=debug)\b' at offset 14. Too many )'s., exception stack: at System.Text.RegularExpressions.RegexParser.ScanRegex() at System.Text.RegularExpressions.RegexParser.Parse(String pattern, RegexOptions options, CultureInfo culture) at System.Text.RegularExpressions.Regex..ctor(String pattern, CultureInfo culture) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.RemoveQueryStringParametersThatHaveBeenUsedInTemplate(DownstreamRequest downstreamRequest, List1 templatePlaceholderNameAndValues)
at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.Invoke(HttpContext httpContext)
at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext)
at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext)
at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext)
at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext)
at Ocelot.RateLimit.Middleware.ClientRateLimitMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Multiplexer.MultiplexingMiddleware.Fire(HttpContext httpContext, RequestDelegate next)
at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext)
at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext) RequestId: 0HN4SP4CA0UMM:00000001, exception: System.Text.RegularExpressions.RegexParseException: Invalid pattern '\b*path=debug)\b' at offset 14. Too many )'s.
at System.Text.RegularExpressions.RegexParser.ScanRegex()
at System.Text.RegularExpressions.RegexParser.Parse(String pattern, RegexOptions options, CultureInfo culture)
at System.Text.RegularExpressions.Regex..ctor(String pattern, CultureInfo culture)
at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.RemoveQueryStringParametersThatHaveBeenUsedInTemplate(DownstreamRequest downstreamRequest, List1 templatePlaceholderNameAndValues) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.Invoke(HttpContext httpContext) at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext) at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext) at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext) at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext) at Ocelot.RateLimit.Middleware.ClientRateLimitMiddleware.Invoke(HttpContext httpContext) at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext) at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext) at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext) at Ocelot.Multiplexer.MultiplexingMiddleware.Fire(HttpContext httpContext, RequestDelegate next) at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext) at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext) at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext) System.Text.RegularExpressions.RegexParseException: Invalid pattern '\b*path=debug)\b' at offset 14. Too many )'s. at System.Text.RegularExpressions.RegexParser.ScanRegex() at System.Text.RegularExpressions.RegexParser.Parse(String pattern, RegexOptions options, CultureInfo culture) at System.Text.RegularExpressions.Regex..ctor(String pattern, CultureInfo culture) at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.RemoveQueryStringParametersThatHaveBeenUsedInTemplate(DownstreamRequest downstreamRequest, List1 templatePlaceholderNameAndValues)
at Ocelot.DownstreamUrlCreator.Middleware.DownstreamUrlCreatorMiddleware.Invoke(HttpContext httpContext)
at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext)
at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext)
at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext)
at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext)
at Ocelot.RateLimit.Middleware.ClientRateLimitMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Multiplexer.MultiplexingMiddleware.Fire(HttpContext httpContext, RequestDelegate next)
at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext)
at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext)
at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext)
`

Steps to Reproduce the Problem

  1. Simple Route configuration
    {
    "DownstreamPathTemplate": "/api/{*path}",
    "DownstreamScheme": "http",
    "DownstreamHostAndPorts": [{...}],
    "UpstreamPathTemplate": "/api/{*path}",
    "UpstreamHttpMethod": [ "Get" ]
    }
  2. GET /api/debug)
  3. Or with urlencoded parameter: GET /api/debug%29

Specifications

  • Version: 23.2.2
  • Platform: Dotnet 8
@raman-m
Copy link
Member

raman-m commented Jul 8, 2024

Hello Jimmi!

"UpstreamPathTemplate": "/api/{*path}",
DownstreamPathTemplate": "/api/{*path}",

The templates are invalid due to the inclusion of the * character in the placeholder name. What is the reason for adding an asterisk before the placeholder name?

GET /api/debug)

Indeed, using an invalid character in the URL string is not advisable. It's hard to see its purpose. Perhaps you're a QA engineer who enjoys testing software? 😉

Or with url encoded parameter: GET /api/debug%29

This URL is valid; however, due to special regex characters, the System.Text.RegularExpressions.RegexParser.ScanRegex() identifies it during the route matching phase, and the internally constructed regex string likely has unbalanced parentheses. It appears that any special RegEx characters in a URL could potentially cause this exception.

exception message: Invalid pattern '\b*path=debug)\b' at offset 14. Too many )'s.

Firstly, please remove the * from the placeholder name and test the URL again. Secondly, the rationale behind the upstream /api/{path} to downstream /api/{path} conversion is unclear. Could you explain why this is necessary?

I'm requesting a test of this route definition with a single Catch All route:

{
  "DownstreamPathTemplate": "/{path}",
  "UpstreamPathTemplate": "/{path}"
}

This route template should handle all paths, whether they include special characters encoded or not.
Please provide feedback.

@raman-m raman-m added needs validation Issue has not been replicated or verified yet needs feedback Issue is waiting on feedback before acceptance waiting Waiting for answer to question or feedback from issue raiser labels Jul 8, 2024
@raman-m
Copy link
Member

raman-m commented Jul 8, 2024

Expected Behavior

Given a route parameter including a ) character, then a 4** status code is expected.

Earlier versions before 19.0.4 returned status code 404.

Actual Behavior

Exception which causes a response with status code 500.

I'm curious... How does this affect your client? Both statuses indicate failure.
What is the reaction of your downstream service to this URL /api/debug%29, and what status does it return?

@wenneberg6
Copy link
Author

Thanks for replying.
Nope, not QA :) But since our latest update from v.18.0.0 to v.23.2.2 we noticed these Regex exceptions in our logs.
Ocelot has changed behavior, it used to accept routes with unknown characters. I tested the latest versions and found that in v.19.0.4 Ocelot started to throw an exception.

I have removed the * and tested again, no difference, an exception is still thrown.

The Downstream service is responsible for returning content and providing the correct status code when nothing is found(404). In our case, an invalid route should return a 404, which is not considered a failed response.

Ocelot prevents this from happening, as the Regex exception causes a 500 server error to be returned upstream.
Adding a parenthesis to the url is not intended by our users, or bots, but it is a public endpoint, stuff happens.

This problem can be "handled" using other techniques. But it is a change in behavior, that i was not aware of reading the change logs, and therefor must consider a bug.

@raman-m
Copy link
Member

raman-m commented Jul 9, 2024

@wenneberg6 commented on July 9

But since our latest update from v.18.0.0 to v.23.2.2 we noticed these Regex exceptions in our logs.
Ocelot has changed behavior, it used to accept routes with unknown characters. I tested the latest versions and found that in v.19.0.4 Ocelot started to throw an exception.

Upon reviewing the code, I found that the Regex causing the exception was introduced in version 8.0.1, as seen at line 82 in the commit 8f4ae03. Migrating from version 18.0.0 to 23.2.2 carries significant risks due to the extensive changes. Although we strive for backward compatibility, behavior may vary between major versions.

I tested the latest versions and found that in v.19.0.4 Ocelot started to throw an exception.

How did you determine there was an issue with version 19.0.4? As mentioned, the RegEx was introduced in version 8.0.1.
How were you able to test version 19.0.4 if you had migrated from 18.x to 23.x?

The Downstream service is responsible for returning content and providing the correct status code when nothing is found(404). In our case, an invalid route should return a 404, which is not considered a failed response.
Ocelot prevents this from happening, as the Regex exception causes a 500 server error to be returned upstream.

The downstream service returned a 404 status, indicating that Ocelot successfully forwarded the request without encountering internal 500 errors. We will address the issue.

This problem can be "handled" using other techniques. But it is a change in behavior, that i was not aware of reading the change logs, and therefor must consider a bug.

We are taken aback, yet we accept it. This is the bug.

@raman-m raman-m changed the title Regex exception when Route Parameter contains invalid character ')' Regex exception when route template placeholder contains invalid character ')' Jul 9, 2024
@raman-m raman-m added bug Identified as a potential bug medium effort Likely a few days of development effort small effort Likely less than a day of development effort. Routing Ocelot feature: Routing and removed needs validation Issue has not been replicated or verified yet needs feedback Issue is waiting on feedback before acceptance waiting Waiting for answer to question or feedback from issue raiser labels Jul 9, 2024
@wenneberg6
Copy link
Author

wenneberg6 commented Jul 9, 2024

Thank you.
Yes updating was a risk, but reading through the release notes, we found no issues for our configuration.

How were you able to test version 19.0.4 if you had migrated from 18.x to 23.x?

I rolled back from 23.3.2 to 1.8.0.
I confirmed that in v.18 a status code 404 was returned for my use case.
I updated the Ocelot Nuget package one version at a time, upon reaching v.19.0.4 the status code changed to 500.

Perhaps my testing is flawed, but i found it to fail from v.19.0.4.

@raman-m
Copy link
Member

raman-m commented Jul 9, 2024

Jimmi,
I apologize for the confusion, but there is no 19.0.4 release. Our releases include 19.0.3 and the subsequent 20.0.0, which you can view on page 2.
However, there is a 19.0.4 package available. It was issued as a technical release. Perhaps you are referring to version 20.0.0, correct?
And release 20.0.0 was pretty big.
Changes: 19.0.3...20.0.0
Possible root cause: ec85b13ec85b13#diff-f5a2c3a941a7f201a4cead9eb4f489229f0740cf4e1166d0b8ff1670e2ce30a5R100-R117
More details in bug fixing PR #1335

@raman-m raman-m added the Sep'24 September 2024 release label Jul 10, 2024
@raman-m raman-m added this to the v23.3.x Hotfixes milestone Jul 10, 2024
@raman-m
Copy link
Member

raman-m commented Jul 18, 2024

Hello @wenneberg6,
Do you have any suggestions for resolving this issue? Are you a C# or Java developer?
I propose eliminating the creation of the Regex object entirely and reverting to a string manipulation design.
Redesigning this method would...

  • slightly enhance the performance of the RemoveQueryStringParametersThatHaveBeenUsedInTemplate method
  • address the original bug caused by an invalid pattern containing reserved characters of regular expressions

What are your thoughts on this? I would appreciate your input.

@wenneberg6
Copy link
Author

wenneberg6 commented Aug 5, 2024

Hello @wenneberg6, Do you have any suggestions for resolving this issue? Are you a C# or Java developer? I propose eliminating the creation of the Regex object entirely and reverting to a string manipulation design. Redesigning this method would...

  • slightly enhance the performance of the RemoveQueryStringParametersThatHaveBeenUsedInTemplate method
  • address the original bug caused by an invalid pattern containing reserved characters of regular expressions

What are your thoughts on this? I would appreciate your input.

Well, without more context about the change to a regular expression in the RemoveQueryStringParametersThatHaveBeenUsedInTemplate method, and without a clear understanding of the method's overall purpose, it's difficult to suggest the best approach.

The primary issue is the unhandled exception, both options can potentially throw.
Maybe simply escaping the regex input string is enough here → Regex.Escape(String) Method

@ggnaegi ggnaegi self-assigned this Aug 12, 2024
@ggnaegi
Copy link
Member

ggnaegi commented Aug 12, 2024

@raman-m @wenneberg6 I will check this

@ggnaegi ggnaegi assigned raman-m and ggnaegi and unassigned ggnaegi and raman-m Aug 12, 2024
@raman-m
Copy link
Member

raman-m commented Aug 12, 2024

🙃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Identified as a potential bug medium effort Likely a few days of development effort Routing Ocelot feature: Routing small effort Likely less than a day of development effort.
Projects
None yet
Development

No branches or pull requests

3 participants