Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Domains in matching context can only trigger with domains but they must not trigger with those characters contained in another domain #433

Closed
MaartenLMEM opened this issue Aug 4, 2021 · 6 comments
Assignees

Comments

@MaartenLMEM
Copy link
Contributor

MaartenLMEM commented Aug 4, 2021

For example
If domain is jeuxvideo.com
and regex .*forum
It must not match with
https://www.lemonde.fr/pixels/article/2017/11/16/jeuxvideo.com-les-moderateurs-racontent-les-coulisses-du-forum-18-25_5215777_4408996.html

Solution discussed :

  • Add an business rule that forbids the bubble domain to match with a string of characters that is after slash "/"
@MaartenLMEM
Copy link
Contributor Author

related to #429

@lutangar
Copy link
Member

lutangar commented Sep 6, 2021

I don't get it or maybe it's just a bad example:

If domain is jeuxvideo.com

It won't match lemonde.fr

@JalilArfaoui
Copy link
Member

@lutangar I’m not sure about that :

For example, notice 373 has matching context 1635 with regex (adobe\.com).*photoshop-lightroom

It seems to match with https://lemonde.fr/adobe.com/is-photoshop-lightroom-good

image

We do not have a specific domain check that I know of, since the backend only serve a big regex … We talked about having a list of domains for each MC served in https://notices.bulles.fr/api/v3/matching-contexts, I don’t know if you remember ? 
Or for the short term, we could also check the beginning of the string with ^ as suggested by @zhinu

Be currently we do not :

image

@lutangar lutangar self-assigned this Sep 6, 2021
@lutangar
Copy link
Member

lutangar commented Sep 7, 2021

@JalilArfaoui I know which issue we're trying resolve here, @MaartenLMEM example was wrong but he edited since.

About the resolution...

  1. I was thinking about something like the following:
    https://regex101.com/r/EWAQxE/1/

as suggested by @zhinu in #429 (comment)

This would preserve the kind of "subdomain wildcard" behavior we had so far.
I'm pretty sure it could be tricked but this is much stronger...

We do not have a specific domain check that I know of

  1. OR we could leave this to the clients and add this proper check... it's easy and natural for any of our clients to know on which domain we are, and the backend would have to expose the domain part in a dedicated field.

@JalilArfaoui
Copy link
Member

I vote for solution 2, as it solves other issues (#405 as you mentionned, but also allow things like dis-moi/extension#277) … and it will also work well with the «XPath replacement» we talked about this morning (no issue created yet ?)

@lutangar
Copy link
Member

Alright closing this in favor of #446 then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants