-
Notifications
You must be signed in to change notification settings - Fork 1
Add Regexp Targeting #115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add Regexp Targeting #115
Conversation
b607ce0
to
402ccfc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great passion project!
Check out the power of let
in my test suggestions. It's the most amazing part of rspec
. It's so frustrating to switch over to the web
repo and not have access to let
in minitest!
README.md
Outdated
This will be applied in any process that has (`service_name == "frontend"` OR `service_name == "auth"`) AND `datacenter == "AWS-US-EAST-1"`. | ||
|
||
### Matching Values | ||
By adding a backslash `/` at the front and end of a value, context values will be applied if the string between the backslashes is a substring of the value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ That's actually a "slash". \
is a backslash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I was glad to see in the tests and implementation that slashes are required at both the front and back of the string, so a path prefix like /tmp
won't trigger it. At first I missed it in your documentation here, because it says "a backslash" (singular). How about:
By adding `/` at the front and end of a value, the pattern given between the slashes will be treated as a regular expression to match with the context.
^ Note it's more than just a substring--it's a regexp! Can you change the example to use some regexp special values, like maybe /frontend-\d{3}/
for 3 digits after the hyphen?
README.md
Outdated
target: | ||
service_name: /frontend/ | ||
``` | ||
This will be applied in any process that has `service_name =~ "frontend"`. As an example this will match `"frontend-1"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use slashes here in the example since that's how Ruby would do it after =~
:
that has `service_name =~ /frontend/
lib/process_settings/target.rb
Outdated
when true, false | ||
target_value | ||
when String | ||
if target_value =~ /\/.+\// # Any string that starts and ends with backslashes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you need to anchor that regexp! As you have it written, a string like "network/advertiser/affiliate" would match! Can you add a test that fails on that?
Also, sadly, =~
is slightly non-preferred because it's slow. .match?
is preferred when you don't need the MatchData back.
if target_value.match?(/\A\/.+\/\z/)
An alternative would be:
if target_value.start_with?('/') && target_value.end_with?('/') && target_value.size > 2
But...there's also this cool Regexp extraction notation (parens go around the group(s) you want to extract and then , 1
means "get me the first match group"). Note that I used /x
in the Regexp so that whitespace can be added for readability.
if (pattern = target_value[/\A \/ (.+) \/ \z/x, 1])
context_hash.match?(pattern)
process_target = described_class.new(target_hash) | ||
expect(process_target.target_key_matches?(context_hash)).to be_falsey | ||
expect(process_target.target_key_matches?({})).to be_falsey | ||
expect(process_target.target_key_matches?({ 'service' => '/telecom' })).to be_truthy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subject
and is_expected.to
can make this sort of test read great. It's a bit more verbose, so feel free to skip it. But I really like how it reads. (I learned this from @jebentier BTW. It's now pretty common to see in Hydra tests.)
context 'when target value has a leading slash only' do
let(:target_hash) { { 'service' => '/telecom' } }
let(:process_target) { described_class.new(target_hash) }
subject { process_target.target_key_matches?(context_hash) }
describe 'when context_hash has the string but not the slash' do
let(:context_hash) {
{
'service' => 'telecom',
'region' => 'east',
'cdr' => { 'caller' => '+18056807000' }
}
}
it { is_expected.to be_falsey }
end
describe 'when context_hash is empty' do
let(:context_hash) { {} }
it { is_expected.to be_falsey }
end
describe 'when context_hash has an exact string match' do
let(:context_hash) { { 'service' => '/telecom' } }
it { is_expected.to be_truthy }
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the failing test I mentioned above:
context 'when target value has embedded slashes (not at the front and back)' do
let(:service) { 'network/advertiser|affiliate/path' } # | won't match if this is treated as a Regexp
let(:target_hash) { { 'service' => service }
it { is_expected.to be_truthy }
end
end | ||
|
||
context "for substring matching" do | ||
it "should match on values when using backslash delimiters in target hash" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can DRY up the tests below by putting some common let
s up here:
let(:context_hash) do
{
'service' => service,
'region' => 'east',
'cdr' => { 'caller' => '+18056807000' }
}
end
The individual contexts just need to do let(:service) { ... }
.
README.md
Outdated
|
||
### Matching Values | ||
By adding a backslash `/` at the front and end of a value, context values will be applied if the string between the backslashes is a substring of the value. | ||
By adding a slash `/` at the front and end of a value, context values will be applied if the string between the slashes is a substring of the value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The singular is still a bit confusing. And the substring part isn't right--it'll be a regexp match!
How about this wording:
To provide a regular expression for matching, use a leading and trailing slash `/` around your expression. For example:
And then I'd make sure that the example is more than just a substring. That's why I suggested
service_name: /frontend-\d/
and then maybe you can give an example that matches because of a digit and also one that fails because it's a non-digit there.
CHANGELOG.md
Outdated
|
||
## [0.20.0] - 2022-03-04 | ||
### Added | ||
- Added substring matching for targeting. See [README.md](README.md#Matching-Values) for usage examples. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the PR title to "Add Regexp Targeting". Can you change it here in the CHANGELOG and anywhere else in the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed c2e5de2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a general question about the implementation. I found out today through a little research that Ruby's YAML parser has built in support for declaring a Regexp
. How would we feel about allowing the use of this built in so that we appropraitely load a Regex into the target_value and don't need to do these string comparisons and extractions? The YAML syntax isn't the prettiest, but it's declarative
key: !ruby/regexp /frontend-\d/
This would reduce the necessary changes in the gem as well as speed up the comparison by not having to check every string value of the target every time.
CHANGELOG.md
Outdated
|
||
Note: this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). | ||
|
||
## [0.20.0] - 2022-03-04 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put Unreleased
for the release date until it's released?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice twist there to use actual regexps.
``` | ||
target: | ||
service_name: /frontend-\d/ | ||
service_name: !ruby/regexp /frontend-\d/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa! This is really cool to not preclude slashes around strings. In the past we've been nervous about using this YAML ruby class support because it has a much wider surface area for security vulnerabilities (since this Ruby class will be invoked). That doesn't seem like a big deal here, but we might want to get an opinion from someone with a security background.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the source for how Pysch
parses it this value:
https://github.com/ruby/psych/blob/master/lib/psych/visitors/to_ruby.rb#L96-L110
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ColinDKelley @jebentier It looks like we technically are already open to this potential vulnerability! I can switch how we are parsing yaml to use Pysch.safe_load_file
and specify allowing only Regexp
as a permitted class besides the default permitted classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
json_doc = Psych.load_file(file_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point--we are already vulnerable! And because the process_settings are set by developers, I think we'll be fine? Worth mentioning that to in the request for a security sign-off. (It's the cases where raw YAML is parsed from the outside world that have caused serious security problems. And that's why safe_load_file
was invented.)
And if it doesn't add much scope, I really like your suggestion of switching over to Pysch.safe_load_file
here with only Regexp
permitted. But if that takes more than a couple minutes, feel free to create a ticket for Octothorpe instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added: c6d10fe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes in general look good to me, but Colin brings up a good point about pulling in someone from Security to make sure we're not exposing ourselves to a regex vulnerability.
4e56522
to
c6d10fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work!
@ttstarck @jbuehring was hoping to use this feature. Anything blocking it from getting merged? |
Add Regexp matching to Targeting
The goal of these changes is to allow being able to dynamically target similar services that are only unique by a single context value. For example, if I had the following frontend services with names:
If I wanted to only target services with names that only match
frontend-primary
, that would not be possible with current targeting features. Of course, one might argue there should be some other context likeregion
ordeployGroup
to designate these but there are cases where this might not be possible, extremely difficult to fully setup, or not worth the time for a setting that may only be temporary.Summary of Changes
Target.target_key_matches?
will now check if thetarget_value
is aRegexp
object and run.matches
on the context hash.Example targeting:
This target should only match
frontend-primary-1
andfrontend-primary-2
and not matchfrontend-secondary-1
andfrontend-secondary-2
from the above example.