Add Regexp Targeting #115

ttstarck · 2022-03-04T19:32:17Z

Add Regexp matching to Targeting

The goal of these changes is to allow dynamically targeting similar services that differ by only a single context value. For example, suppose I have the following frontend services:

frontend-primary-1
frontend-primary-2
frontend-secondary-1
frontend-secondary-2

If I wanted to target only the services whose names match frontend-primary, that would not be possible with the current targeting features. Of course, one might argue there should be some other context like region or deployGroup to designate these, but there are cases where this might not be possible, might be extremely difficult to fully set up, or might not be worth the time for a setting that may only be temporary.

Summary of Changes

Target.target_key_matches? will now check whether the target_value is a Regexp object and, if so, run .match? against the corresponding value in the context hash.

Example targeting:

target:
  service_name: !ruby/regexp /frontend-primary/

This target should only match frontend-primary-1 and frontend-primary-2 and not match frontend-secondary-1 and frontend-secondary-2 from the above example.
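A minimal sketch of the matching rule described above. The helper name `value_matches?` is hypothetical and is not the gem's actual method; it only illustrates branching on whether the target value is a Regexp.

```ruby
# Hypothetical helper (an assumption for illustration, not the gem's code):
# compares one target value against one context value.
def value_matches?(target_value, context_value)
  case target_value
  when Regexp
    # A pattern can only be tested against a String context value.
    context_value.is_a?(String) && target_value.match?(context_value)
  else
    # Everything else falls back to plain equality.
    target_value == context_value
  end
end

services = %w[frontend-primary-1 frontend-primary-2 frontend-secondary-1 frontend-secondary-2]
matched = services.select { |name| value_matches?(/frontend-primary/, name) }
puts matched.inspect
```

With the four service names from the example, only the two frontend-primary names are selected.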

ColinDKelley

Great passion project!

Check out the power of let in my test suggestions. It's the most amazing part of rspec. It's so frustrating to switch over to the web repo and not have access to let in minitest!

ColinDKelley · 2022-03-04T22:23:42Z

README.md

 This will be applied in any process that has (`service_name == "frontend"` OR `service_name == "auth"`) AND `datacenter == "AWS-US-EAST-1"`.

+### Matching Values
+By adding a backslash `/` at the front and end of a value, context values will be applied if the string between the backslashes is a substring of the value.


⛏️ That's actually a "slash". \ is a backslash.

And I was glad to see in the tests and implementation that slashes are required at both the front and back of the string, so a path prefix like /tmp won't trigger it. At first I missed it in your documentation here, because it says "a backslash" (singular). How about:

By adding `/` at the front and end of a value, the pattern given between the slashes will be treated as a regular expression to match with the context.

^ Note it's more than just a substring--it's a regexp! Can you change the example to use some regexp special values, like maybe /frontend-\d{3}/ for 3 digits after the hyphen?

ColinDKelley · 2022-03-04T22:25:05Z

README.md

+target:
+  service_name: /frontend/
+```
+This will be applied in any process that has `service_name =~ "frontend"`. As an example this will match `"frontend-1"`


I'd use slashes here in the example since that's how Ruby would do it after =~:

that has `service_name =~ /frontend/`

ColinDKelley · 2022-03-04T22:40:59Z

lib/process_settings/target.rb

        when true, false
          target_value
+        when String
+          if target_value =~ /\/.+\// # Any string that starts and ends with backslashes.


Oh, you need to anchor that regexp! As you have it written, a string like "network/advertiser/affiliate" would match! Can you add a test that fails on that?

Also, sadly, =~ is slightly non-preferred because it's slow. .match? is preferred when you don't need the MatchData back.

if target_value.match?(/\A\/.+\/\z/)

An alternative would be:

if target_value.start_with?('/') && target_value.end_with?('/') && target_value.size > 2

But...there's also this cool Regexp extraction notation (parens go around the group(s) you want to extract and then , 1 means "get me the first match group"). Note that I used /x in the Regexp so that whitespace can be added for readability.

if (pattern = target_value[/\A \/ (.+) \/ \z/x, 1])
  context_hash.match?(pattern)
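To make the anchoring point concrete, here is a small standalone sketch comparing the unanchored and anchored patterns from this comment, plus the `String#[]` extraction form. The constant and variable names are chosen only for this illustration.

```ruby
UNANCHORED = /\/.+\//       # matches a slash-delimited run anywhere in the string
ANCHORED   = /\A\/.+\/\z/   # requires the slashes at the very start and end

path    = "network/advertiser/affiliate"
pattern = '/frontend-\d/'

puts UNANCHORED.match?(path)    # true, because of the embedded "/advertiser/"
puts ANCHORED.match?(path)      # false
puts ANCHORED.match?(pattern)   # true
puts ANCHORED.match?("/tmp")    # false, no trailing slash

# String#[] with a Regexp and a group index returns that capture, or nil.
extracted = pattern[/\A \/ (.+) \/ \z/x, 1]
not_a_pattern = path[/\A \/ (.+) \/ \z/x, 1]
puts extracted.inspect
puts not_a_pattern.inspect
```

The unanchored form is the one that would wrongly treat an ordinary path-like value as a pattern.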

ColinDKelley · 2022-03-04T22:56:37Z

spec/lib/process_settings/target_spec.rb

+        process_target = described_class.new(target_hash)
+        expect(process_target.target_key_matches?(context_hash)).to be_falsey
+        expect(process_target.target_key_matches?({})).to be_falsey
+        expect(process_target.target_key_matches?({ 'service' => '/telecom' })).to be_truthy


subject and is_expected.to can make this sort of test read great. It's a bit more verbose, so feel free to skip it. But I really like how it reads. (I learned this from @jebentier BTW. It's now pretty common to see in Hydra tests.)

context 'when target value has a leading slash only' do
  let(:target_hash) { { 'service' => '/telecom' } }
  let(:process_target) { described_class.new(target_hash) }
  subject { process_target.target_key_matches?(context_hash) }
  describe 'when context_hash has the string but not the slash' do
    let(:context_hash) { { 'service' => 'telecom', 'region' => 'east', 'cdr' => { 'caller' => '+18056807000' } } }
    it { is_expected.to be_falsey }
  end
  describe 'when context_hash is empty' do
    let(:context_hash) { {} }
    it { is_expected.to be_falsey }
  end
  describe 'when context_hash has an exact string match' do
    let(:context_hash) { { 'service' => '/telecom' } }
    it { is_expected.to be_truthy }
  end
end

And the failing test I mentioned above:

context 'when target value has embedded slashes (not at the front and back)' do
  let(:service) { 'network/advertiser|affiliate/path' } # | won't match if this is treated as a Regexp
  let(:target_hash) { { 'service' => service } }
  it { is_expected.to be_truthy }
end

ColinDKelley · 2022-03-04T23:00:04Z

spec/lib/process_settings/target_spec.rb

    end
+
+    context "for substring matching" do
+      it "should match on values when using backslash delimiters in target hash" do


You can DRY up the tests below by putting some common lets up here:

let(:context_hash) do
  {
    'service' => service,
    'region' => 'east',
    'cdr' => { 'caller' => '+18056807000' }
  }
end

The individual contexts just need to do let(:service) { ... }.

ColinDKelley · 2022-03-05T00:04:10Z

README.md


 ### Matching Values
-By adding a backslash `/` at the front and end of a value, context values will be applied if the string between the backslashes is a substring of the value.
+By adding a slash `/` at the front and end of a value, context values will be applied if the string between the slashes is a substring of the value.


The singular is still a bit confusing. And the substring part isn't right--it'll be a regexp match!
How about this wording:

To provide a regular expression for matching, use a leading and trailing slash `/` around your expression. For example:

And then I'd make sure that the example is more than just a substring. That's why I suggested

service_name: /frontend-\d/

and then maybe you can give an example that matches because of a digit and also one that fails because it's a non-digit there.
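As a rough illustration of the kind of example being asked for, the sketch below contrasts the regexp with a plain substring check. The service names are made up for this illustration.

```ruby
pattern = /frontend-\d/
names   = ["frontend-1", "frontend-x", "my-frontend-22-blue"]

# Regexp match: requires a digit right after the hyphen.
regexp_results = names.to_h { |name| [name, pattern.match?(name)] }

# Plain substring check: cannot tell a digit from a letter.
substring_results = names.to_h { |name| [name, name.include?("frontend-")] }

puts regexp_results.inspect
puts substring_results.inspect

# The pattern is unanchored, so it also matches in the middle of a name.
# Anchoring is up to whoever writes the pattern:
anchored = /\Afrontend-\d+\z/
puts anchored.match?("frontend-1")
puts anchored.match?("my-frontend-22-blue")
```

Here "frontend-1" matches because of the digit, "frontend-x" fails because of the non-digit, and the substring check accepts all three names.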

ColinDKelley · 2022-03-07T20:18:59Z

CHANGELOG.md


+## [0.20.0] - 2022-03-04
+### Added
+- Added substring matching for targeting. See [README.md](README.md#Matching-Values) for usage examples.


I changed the PR title to "Add Regexp Targeting". Can you change it here in the CHANGELOG and anywhere else in the PR?

Fixed c2e5de2

jebentier

I have a general question about the implementation. I found out today through a little research that Ruby's YAML parser has built-in support for declaring a Regexp. How would we feel about using this built-in so that we load a Regexp into the target_value directly and don't need to do these string comparisons and extractions? The YAML syntax isn't the prettiest, but it's declarative:

key: !ruby/regexp /frontend-\d/

This would reduce the necessary changes in the gem as well as speed up the comparison by not having to check every string value of the target every time.
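A small sketch of what the tag changes at load time, assuming a Psych version that accepts the `permitted_classes:` keyword. The YAML documents here are illustrative and are not taken from the gem's settings files.

```ruby
require "yaml"

tagged = <<~'YAML'
  target:
    service_name: !ruby/regexp /frontend-\d/
YAML

untagged = <<~'YAML'
  target:
    service_name: /frontend-\d/
YAML

tagged_value   = YAML.safe_load(tagged, permitted_classes: [Regexp])["target"]["service_name"]
untagged_value = YAML.safe_load(untagged)["target"]["service_name"]

puts tagged_value.class     # Regexp
puts tagged_value.source    # frontend-\d
puts untagged_value.class   # String

# The loaded Regexp can be used directly, with no per-comparison string parsing.
puts tagged_value.match?("frontend-7")
```

Without the tag, a value wrapped in slashes stays an ordinary String, so strings that happen to start and end with `/` keep their literal meaning.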

jebentier · 2022-03-08T14:43:25Z

CHANGELOG.md


 Note: this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [0.20.0] - 2022-03-04


Can we put Unreleased for the release date until it's released?

ColinDKelley

Nice twist there to use actual regexps.

ColinDKelley · 2022-03-25T18:50:07Z

README.md

 ```
 target:
-  service_name: /frontend-\d/
+  service_name: !ruby/regexp /frontend-\d/


Whoa! This is really cool to not preclude slashes around strings. In the past we've been nervous about using this YAML ruby class support because it has a much wider surface area for security vulnerabilities (since this Ruby class will be invoked). That doesn't seem like a big deal here, but we might want to get an opinion from someone with a security background.

Here is the source for how Psych parses this value:
https://github.com/ruby/psych/blob/master/lib/psych/visitors/to_ruby.rb#L96-L110

@ColinDKelley @jebentier It looks like we technically are already open to this potential vulnerability! I can switch how we are parsing YAML to use Psych.safe_load_file and allow only Regexp as a permitted class in addition to the default permitted classes.

See

process_settings/lib/process_settings/targeted_settings.rb

Line 76 in 6ff07b4

json_doc = Psych.load_file(file_path)

Good point--we are already vulnerable! And because the process_settings are set by developers, I think we'll be fine? Worth mentioning that in the request for a security sign-off. (It's the cases where raw YAML is parsed from the outside world that have caused serious security problems. And that's why safe_load_file was invented.)

And if it doesn't add much scope, I really like your suggestion of switching over to Psych.safe_load_file here with only Regexp permitted. But if that takes more than a couple minutes, feel free to create a ticket for Octothorpe instead.
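A hedged sketch of the safe-loading behavior being discussed. It uses `YAML.safe_load` on strings so that it is self-contained; `safe_load_file` is assumed to take the same `permitted_classes:` option. The YAML content is illustrative only.

```ruby
require "yaml"

SETTINGS_YAML = <<~'YAML'
  target:
    service_name: !ruby/regexp /frontend-\d/
YAML

# With the default permitted classes, the Regexp tag is refused.
begin
  YAML.safe_load(SETTINGS_YAML)
  regexp_rejected_by_default = false
rescue Psych::DisallowedClass => e
  regexp_rejected_by_default = true
  puts "rejected: #{e.message}"
end

# Permitting only Regexp lets the settings load.
settings = YAML.safe_load(SETTINGS_YAML, permitted_classes: [Regexp])
puts settings["target"]["service_name"].inspect

# Other Ruby object tags stay blocked even with Regexp permitted.
begin
  YAML.safe_load("--- !ruby/object:Object {}\n", permitted_classes: [Regexp])
  object_rejected = false
rescue Psych::DisallowedClass
  object_rejected = true
end
puts object_rejected
```

Permitting a single class keeps the surface area small: only the Regexp constructor can be reached from the YAML, and arbitrary object tags still raise.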

Added: c6d10fe

jebentier

Changes in general look good to me, but Colin brings up a good point about pulling in someone from Security to make sure we're not exposing ourselves to a regex vulnerability.

ColinDKelley

Awesome work!

ColinDKelley · 2023-06-06T01:45:43Z

@ttstarck @jbuehring was hoping to use this feature. Anything blocking it from getting merged?

Add substring targeting

402ccfc

ttstarck force-pushed the PASSION_add_regex_matching_for_targeting branch from b607ce0 to 402ccfc Compare March 4, 2022 19:56

ttstarck requested review from ColinDKelley and jebentier March 4, 2022 20:06

ColinDKelley requested changes Mar 4, 2022

fix pattern matching to not match on internal embedded slashes

3af2fba

ColinDKelley reviewed Mar 5, 2022

use regex matching example in readme

7a82f9b

ttstarck requested a review from ColinDKelley March 7, 2022 19:21

ColinDKelley changed the title ~~Add Substring Targeting~~ Add Regexp Targeting Mar 7, 2022

ColinDKelley reviewed Mar 7, 2022

swap wording to regex instead of substring

c2e5de2

jebentier reviewed Mar 8, 2022

utilize ruby yaml parser's regex keyword

8730e8e

ttstarck requested review from ColinDKelley and jebentier March 25, 2022 18:28

ColinDKelley reviewed Mar 25, 2022

jebentier approved these changes Mar 25, 2022

use safe_load_file with only Regexp as permitted class

c6d10fe

ttstarck force-pushed the PASSION_add_regex_matching_for_targeting branch from 4e56522 to c6d10fe Compare March 25, 2022 20:44

ColinDKelley approved these changes Mar 25, 2022

update static_context_hash in target to use regex as well

9809f61

