add minimal regex matcher #37

Merged
merged 9 commits into from
Jan 12, 2024

Conversation

DmitriyMusatkin
Contributor

Issue #, if available:
Add support for regex matching in endpoint resolution.

Description of changes:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


codecov-commenter commented Jan 8, 2024

Codecov Report

Attention: 23 lines in your changes are missing coverage. Please review.

Comparison is base (fd8c0ba) 67.94% compared to head (ad32847) 69.30%.

| Files | Patch % | Lines |
|---|---|---|
| source/endpoints_regex.c | 91.19% | 17 Missing ⚠️ |
| source/partitions.c | 62.50% | 6 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
+ Coverage   67.94%   69.30%   +1.35%     
==========================================
  Files           9       10       +1     
  Lines        2611     2795     +184     
==========================================
+ Hits         1774     1937     +163     
- Misses        837      858      +21     

@DmitriyMusatkin DmitriyMusatkin marked this pull request as ready for review January 8, 2024 22:16
struct aws_byte_cursor alternation = aws_byte_cursor_from_string(symbol->info.alternation);
size_t chars_in_match = 0;
while (aws_byte_cursor_next_split(&alternation, '|', &variant)) {
    if (aws_byte_cursor_starts_with(&text, &variant)) {

Mention in the rules that it just matches the first alternation and will fail for something like regex = `(ab|abc)d` and text = `abcd`.

Mentioning this in the rules in a .c file seems insufficient. If we're not going to handle certain patterns, we should detect and error out in regex_new()

But we can easily handle this. Simply scan over every variant, and use whichever one was longest.
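
To make the difference between the two strategies concrete, here is a minimal sketch that contrasts first-match with longest-match on the `(ab|abc)d` / `abcd` case. It is not the code from this PR: it uses plain C strings instead of `aws_byte_cursor`, and the names `match_first_variant`, `match_longest_variant`, and `matches_with` are hypothetical helpers invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of the FIRST '|'-separated variant
 * that is a prefix of text, or 0 if no variant matches. */
static size_t match_first_variant(const char *alternation, const char *text) {
    assert(alternation != NULL && text != NULL);
    size_t text_len = strlen(text);
    const char *variant = alternation;
    for (;;) {
        size_t len = strcspn(variant, "|"); /* length of the current variant */
        if (len > 0 && len <= text_len && memcmp(text, variant, len) == 0) {
            return len; /* stop at the first hit, even if a longer one follows */
        }
        if (variant[len] == '\0') {
            return 0;
        }
        variant += len + 1; /* skip past the '|' separator */
    }
}

/* Hypothetical helper: scans EVERY variant and returns the length of the
 * longest one that is a prefix of text, or 0 if no variant matches. */
static size_t match_longest_variant(const char *alternation, const char *text) {
    assert(alternation != NULL && text != NULL);
    size_t text_len = strlen(text);
    size_t best = 0;
    const char *variant = alternation;
    for (;;) {
        size_t len = strcspn(variant, "|");
        if (len > best && len <= text_len && memcmp(text, variant, len) == 0) {
            best = len; /* remember the longest matching variant so far */
        }
        if (variant[len] == '\0') {
            return best;
        }
        variant += len + 1;
    }
}

/* Hypothetical helper: matches the pattern "(alternation)suffix" against the
 * whole text, using the given strategy for the alternation part. */
static int matches_with(
    size_t (*match_variant)(const char *, const char *),
    const char *alternation,
    const char *suffix,
    const char *text) {
    size_t consumed = match_variant(alternation, text);
    return consumed > 0 && strcmp(text + consumed, suffix) == 0;
}
```

With these helpers, `matches_with(match_first_variant, "ab|abc", "d", "abcd")` fails because `ab` is consumed and the remaining `cd` does not equal `d`, while the `match_longest_variant` strategy consumes `abc` and succeeds. Note that longest-match is still not full backtracking: a pattern such as `(abc|ab)c` against `abc` would consume `abc` and then fail on the trailing `c`.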

@DmitriyMusatkin DmitriyMusatkin merged commit 6c7764e into main Jan 12, 2024
31 checks passed
@DmitriyMusatkin DmitriyMusatkin deleted the regex branch January 12, 2024 23:43
4 participants