Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support colons in path for server-side #5676

Merged
merged 21 commits into from
Jun 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@

import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableSet;
Expand All @@ -31,6 +32,7 @@

final class ExactPathMapping extends AbstractPathMapping {

private static final Pattern ESCAPE_COLON = Pattern.compile("/\\\\:");
private final String prefix;
private final String exactPath;
private final List<String> paths;
Expand All @@ -45,7 +47,9 @@ private ExactPathMapping(String prefix, String exactPath) {
checkArgument(exactPath.indexOf(';') < 0, "exactPath: %s (expected not to have a ';')", exactPath);
}
this.prefix = prefix;
this.exactPath = ensureAbsolutePath(exactPath, "exactPath");
ensureAbsolutePath(exactPath, "exactPath");
exactPath = ESCAPE_COLON.matcher(exactPath).replaceAll("/:");
this.exactPath = exactPath;
paths = ImmutableList.of(exactPath, exactPath);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,14 @@
*/
final class ParameterizedPathMapping extends AbstractPathMapping {

private static final Pattern VALID_PATTERN = Pattern.compile("(/[^/{}:]+|/:[^/{}]+|/\\{[^/{}]+})+/?");
private static final Pattern VALID_PATTERN = Pattern.compile(
'(' +
// If the segment doesn't start with ':' or '{', the behavior should be the same as ExactPathMapping
"/[^:{][^/]*|" +
"/:[^/{}]+|" +
Copy link
Contributor

@ikhoon ikhoon May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question)

  1. Asumming that a path pattern is /a/:b:c/:d, b:c is used as a path variable, e.g. ctx.pathParam("b:c"). Is that an expected behavior?
  2. /brace/{b}:def is an invalid pattern. /colon/:a:verb is an valid pattern and a path segment is captured with a:verb at which I was a bit suprised.

If we don't support /:param:verb style in the ParameterizedPathMapping, I prefer to raise an exception when a path variable includes :. What do you think?

Copy link
Contributor Author

@jrhee17 jrhee17 May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also found that the handling of {} and : is different which confused me at first.
Having said this, I decided to just follow the previous regex pattern since I didn't want to introduce inadvertent breaking changes.

Asumming that a path pattern is /a/:b:c/:d, b:c is used as a path variable, e.g. ctx.pathParam("b:c"). Is that an expected behavior?
/brace/{b}:def is an invalid pattern. /colon/:a:verb is an valid pattern and a path segment is captured with a:verb at which I was a bit suprised.

Double checked that this behavior is consistent with the main branch.

If we don't support /:param:verb style in the ParameterizedPathMapping, I prefer to raise an exception when a path variable includes :. What do you think?

I wanted to avoid introducing more breaking changes since this PR is already introducing certain breaking changes. I also agree that the assymmetry is confusing and it is easier to reason about if : and {} have the same treatment.

I guess this comes down to "does this issue annoy maintainers enough to make a breaking change (I'm not sure if users are actually using colon in names though)". I think we can discuss this further at the next stand up meeting.

Copy link
Contributor

@ikhoon ikhoon May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double checked that this behavior is consistent with the main branch.

I was confused. /brace/{b}:def only works on the main branch but didn't work with the previous release version, 1.28.4. As it is consistent with the previous version, it is okay.

This PR is good as is. However, ultimately, it would be good to support a path pattern that allows the path variable can be located in a path segment. Some URL patterns seem useful:

  • /{project}/{repoName}.git
  • /download/{filename}.pdf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, ultimately, it would be good to support a path pattern that allows the path variable can be located in a path segment

I agree this could be useful. I think it's probably doable if we start using {} instead of \0 to internally representing the parameterized variables. Additionally, we probably need to modify the route parsing logic at Routers as well.

I don't have immediate plans to handle this since RegexPathMapping can kind of handle this case - I'd be happy to review if you would like to work on it though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To support complex path patterns to build finer grain routes, I am considering dropping : from the path variable patterns in Armeria 2.0. We don't debate about that though.

I don't see it was a good choice that we used : as a path variable. : seems simple but it can be also used as a literal. If a path variable is mixed with a literal :, it is ambiguous and difficult to know the exact name of the variable.

I don't have immediate plans to handle this since RegexPathMapping can kind of handle this case - I'd be happy to review if you would like to work on it though.

I'll try it next time. 😃

"/\\{[^/{}]+}" +
")+/?"
);

private static final Pattern CAPTURE_REST_PATTERN = Pattern.compile("/\\{\\*([^/{}]*)}|/:\\*([^/{}]*)");

Expand Down Expand Up @@ -76,7 +83,7 @@ final class ParameterizedPathMapping extends AbstractPathMapping {
* Skeletal form of given path, which is used for duplicated routing rule detection.
* For example, "/{a}/{b}" and "/{c}/{d}" has same skeletal form and regarded as duplicated.
*
* <p>e.g. "/{x}/{y}/{z}" -> "/:/:/:"</p>
* <p>e.g. "/{x}/{y}/{z}" -> "/\0/\0/\0"</p>
* <p>Set a skeletal form with the patterns described in {@link Route#paths()}.</p>
*/
private final String skeleton;
Expand Down Expand Up @@ -136,21 +143,23 @@ private ParameterizedPathMapping(String prefix, String pathPattern) {
for (String token : PATH_SPLITTER.split(pathPattern)) {
final String paramName = paramName(token);
if (paramName == null) {
// If the given token is a constant, do not manipulate it.
// If the token escapes the first colon, then clean it. We don't need to handle '{'
// since it's not an allowed path character per rfc3986.
if (token.startsWith("\\:")) {
token = token.substring(1);
}

patternJoiner.add(token);
normalizedPatternJoiner.add(token);
skeletonJoiner.add(token);
continue;
}
// The paramName should not include colons.
// TODO: Implement param options expressed like `{bar:type=int,range=[0,10]}`.
final String colonRemovedParamName = removeColonAndFollowing(paramName);
final boolean captureRestPathMatching = isCaptureRestPathMatching(token);
final int paramNameIdx = paramNames.indexOf(colonRemovedParamName);
final int paramNameIdx = paramNames.indexOf(paramName);
if (paramNameIdx < 0) {
// If the given token appeared first time, add it to the set and
// replace it with a capturing group expression in regex.
paramNames.add(colonRemovedParamName);
paramNames.add(paramName);
if (captureRestPathMatching) {
patternJoiner.add("(.*)");
} else {
Expand All @@ -162,8 +171,8 @@ private ParameterizedPathMapping(String prefix, String pathPattern) {
patternJoiner.add("\\" + (paramNameIdx + 1));
}

normalizedPatternJoiner.add((captureRestPathMatching ? ":*" : ':') + colonRemovedParamName);
skeletonJoiner.add(captureRestPathMatching ? "*" : ":");
normalizedPatternJoiner.add((captureRestPathMatching ? ":*" : ':') + paramName);
skeletonJoiner.add(captureRestPathMatching ? "*" : "\0");
jrhee17 marked this conversation as resolved.
Show resolved Hide resolved
}

this.pathPattern = pathPattern;
Expand Down Expand Up @@ -201,17 +210,6 @@ private static String paramName(String token) {
return null;
}

/**
* Removes the portion of the specified string following a colon (:).
*/
private static String removeColonAndFollowing(String paramName) {
final int index = paramName.indexOf(':');
if (index == -1) {
return paramName;
}
return paramName.substring(0, index);
}

/**
* Return true if path parameter contains capture the rest path pattern
* ({@code "{*foo}"}" or {@code ":*foo"}).
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,7 @@ default RoutingResult apply(RoutingContext routingCtx) {
* <ul>
* <li>EXACT: {@code [ "/foo", "/foo" ]} (The trie path is the same.)</li>
* <li>PREFIX: {@code [ "/foo/", "/foo/*" ]}</li>
* <li>PARAMETERIZED: {@code [ "/foo/:", "/foo/:" ]} (The trie path is the same.)</li>
* <li>PARAMETERIZED: {@code [ "/foo/\0", "/foo/\0" ]} (The trie path is the same.)</li>
* </ul>
*
* <p>{@link RoutePathType#REGEX} may have one or two paths. If the {@link Route} was created from a glob
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.regex.Pattern;

Expand Down Expand Up @@ -96,7 +95,7 @@ public final class RouteBuilder {
* @throws IllegalArgumentException if the specified path pattern is invalid
*/
public RouteBuilder path(String pathPattern) {
return pathMapping(getPathMapping(pathPattern, RouteBuilder::getPathMapping));
return pathMapping(getPathMapping(pathPattern));
}

/**
Expand Down Expand Up @@ -218,8 +217,7 @@ public RouteBuilder glob(String glob) {

private RouteBuilder glob(String glob, int numGroupsToSkip) {
requireNonNull(glob, "glob");
return pathMapping(
getPathMapping(glob, pathPattern -> globPathMapping(pathPattern, numGroupsToSkip)));
return pathMapping(globPathMapping(glob, numGroupsToSkip));
}

/**
Expand Down Expand Up @@ -543,32 +541,12 @@ private static PathMapping getPathMapping(String pathPattern) {
"pathPattern: " + pathPattern +
" (not an absolute path starting with '/' or a unknown pattern type)");
}
if (!pathPattern.contains("{") && !pathPattern.contains(":")) {
if (!pathPattern.contains("/{") && !pathPattern.contains("/:")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to this change, /path/infix{foo} becomes a valid pattern. Previously, an exception was raised.

{ is not an allowed character so it is percent-encoded by Armeria client or server. It means a request whose path is /path/infix{foo} fails to match the service for /path/infix{foo} and returns 404.

We may reject the pattern as before or treat this as a known problem if the percent-encoding issue is not relevant to this PR. I also don't see this mismatch as a critical problem.

Copy link
Contributor Author

@jrhee17 jrhee17 May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{ is not an allowed character so it is percent-encoded by Armeria client or server. It means a request whose path is /path/infix{foo} fails to match the service for /path/infix{foo} and returns 404.

I agree with this sentiment. However, I also thought that not only {} but also other characters are valid path patterns but have the same issue. e.g. /$@ will have a similar issue.

Because of this, I thought 1) this was easy enough for maintainers to reason about 2) users won't be using this pattern anyways 3) If we want to limit these type of characters, it shouldn't be limited to only {} but also other characters that should be precent-encoded.

I do think such strict validation will be nice to have for those who aren't familiar with percent-encoding rules though (I also have to refer to the rfc every time if I'm being honest). I think it might be nice to refer to other frameworks to see how strictly they are handling validation rules as well.

Copy link
Contributor

@ikhoon ikhoon Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/path/infix{foo} is not a common pattern for REST API. Let's resolve the problem when we supporting the use of a path variable with literal characters like https://github.com/spring-projects/spring-framework/blob/main/spring-core/src/test/java/org/springframework/util/AntPathMatcherTests.java#L659-L661

return new ExactPathMapping(pathPattern);
}
return new ParameterizedPathMapping(pathPattern);
}

private static PathMapping getPathMapping(String pathPattern,
Function<String, PathMapping> basePathMappingMapper) {
requireNonNull(pathPattern, "pathPattern");
if (pathPattern.startsWith(EXACT) ||
pathPattern.startsWith(PREFIX) ||
pathPattern.startsWith(REGEX) ||
pathPattern.startsWith(GLOB)) {
return basePathMappingMapper.apply(pathPattern);
}

// Parameterized, glob or no prefix.
final String verb = VerbSuffixPathMapping.findVerb(pathPattern);
if (verb == null) {
return basePathMappingMapper.apply(pathPattern);
}
final String basePathPattern = pathPattern.substring(0, pathPattern.length() - verb.length() - 1);
final PathMapping basePathMapping = basePathMappingMapper.apply(basePathPattern);
return new VerbSuffixPathMapping(basePathMapping, verb);
}

static PathMapping prefixPathMapping(String prefix, boolean stripPrefix) {
if ("/".equals(prefix)) {
// Every path starts with '/'.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -178,18 +178,11 @@ private NodeBuilder<V> insertChild(@Nullable NodeBuilder<V> node,
StringBuilder exactPath = null;

while (pathReader.hasNext()) {
// All escaped characters will be treated as an exact path.
if (pathReader.isEscaped()) {
if (exactPath == null) {
exactPath = new StringBuilder();
}
exactPath.append(pathReader.read());
continue;
}

final char c = pathReader.read();
// Find the prefix until the first wildcard (':' or '*')
if (c != '*' && c != ':') {
// Find the prefix until the first wildcard ('\0' or '*')
jrhee17 marked this conversation as resolved.
Show resolved Hide resolved
// '\0' in this context signifies that the current path segment is a parameter.
// See ParameterizedPathMapping for details on how '\0' is determined.
if (c != '*' && c != '\0') {
if (exactPath == null) {
exactPath = new StringBuilder();
}
Expand All @@ -209,7 +202,7 @@ private NodeBuilder<V> insertChild(@Nullable NodeBuilder<V> node,
if (c == '*') {
node = asChild(new NodeBuilder<>(NodeType.CATCH_ALL, node, "*"));
} else {
node = asChild(new NodeBuilder<>(NodeType.PARAMETER, node, ":"));
node = asChild(new NodeBuilder<>(NodeType.PARAMETER, node, "\0"));
}
}

Expand Down Expand Up @@ -440,11 +433,8 @@ private static class PathReader {

char peekAsKey() {
final char c = path.charAt(index);
if (isEscapeChar(c)) {
return path.charAt(index + 1);
}
switch (c) {
case ':':
case '\0':
return KEY_PARAMETER;
case '*':
return KEY_CATCH_ALL;
Expand All @@ -454,12 +444,7 @@ char peekAsKey() {
}

char read() {
final char c = path.charAt(index++);
if (isEscapeChar(c)) {
return path.charAt(index++);
} else {
return c;
}
return path.charAt(index++);
}

boolean isSameChar(char c) {
Expand All @@ -474,16 +459,8 @@ boolean hasNext() {
return index < length;
}

boolean isEscaped() {
return isEscapeChar(path.charAt(index));
}

String remaining() {
return path.substring(index);
}

private static boolean isEscapeChar(char ch) {
return ch == '\\';
}
}
}

This file was deleted.

Loading
Loading