Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I2I: amp-link-rewriter #20653

Closed
parsingeye opened this issue Feb 4, 2019 · 23 comments
Closed

I2I: amp-link-rewriter #20653

parsingeye opened this issue Feb 4, 2019 · 23 comments
Assignees
Labels
INTENT TO IMPLEMENT Proposes implementation of a significant new feature. https://bit.ly/amp-contribute-code

Comments

@parsingeye
Copy link
Contributor

parsingeye commented Feb 4, 2019

Hi AMP community,

I'm Dragan Stefanov, CTO at digidip GmbH.

Digidip is the premium content monetization platform. It provides instant access to over 40,000 merchant affiliate programs without the hassle of network sign-ups, approvals or creating affiliate links.

Design document

https://docs.google.com/document/d/1slfb3znYHg4wdLwsvBqTFgB2K4esiltLAY1xq7g1ZpI

Motivation

Our main goal is to provide publishers with ways to increase their revenues without relying on ads and without damaging reader's experience.

Since AMP gained huge popularity among publishers, constant request is to provide digidip AMP extension.

With this extension they would easily accelerate their monetization efforts.

  • For some time we have followed the development of the extension "amp-rewriter-links" but dismissed it as it does not fit our technical requirements.
  • The extension "amp-skimlinks" was developed for a vendor with a similar business to digidip’s but it is highly coupled to the way skimlinks system works. For instance, tracking URL is hard-coded and their extension sends the links to an API to be changed to a tracking link.

Requirements

Core functionality to be built as an AMP extension is pretty simple:

  • The core functionality is monetizing retailer links on the publisher website by converting them to the "digidip links" on "click" event.
  • We have a few options to set up include / exclusion rules to enable our publishers to exclude / include certain sections / domains from monetization

Potential functionality for future AMP development:

  • Our "worddip" product is identifying product references in articles and converts them to links to the relevant product page on the retailer's side
  • Tracking page views and referrals
  • Adding health check of our servers to disable link-shifting in case of a problem

Apart from the above requirements, ease of implementation is critical for our publishers, so we propose a single tag with only one mandatory attribute (publisherID).

    <amp-digidip
        layout="nodisplay"
        publisher-id="<<publisher id>>"
        hosts-ignore="<<host domains list>>"
        element-clickhandler-attribute="<<html attribute>>"
        element-clickhandler="<<html attribute value>>"
        element-ignore-attribute="<<html element to be ignored when it has a specific value>>"
        element-ignore-pattern="<<html element value to be ignored >>"
    >
    </amp-digidip>

/cc @dvoytenko @aghassemi @lannka

@parsingeye parsingeye added the INTENT TO IMPLEMENT Proposes implementation of a significant new feature. https://bit.ly/amp-contribute-code label Feb 4, 2019
@torch2424
Copy link
Contributor

Hello, so I was discussing with @lannka , and we noticed that we are getting a lot more interesting in affiliate linking within AMP. And in fact, we had this proposal in the past, but got de-prioritized: #9276

Could you look through that, and see if it would work for your use case? We were thinking it may be better to make a more general solution on our end, that platforms can integrate with 😄

Thank you!

@parsingeye
Copy link
Contributor Author

We have looked at existing proposals and implementations, but they just didn't fit our use case.

In the case of amp-link-rewriter, sending CORS request won't work for us as we don't have the endpoint or any infrastructure for it. We cannot change anything on the server side in the foreseeable future due to resource limits.

Also, one of the essential things is the ability to include/exclude certain portions of the page. We also allow our publishers to use custom attributes, they define, placed on anchors to exclude them from rewriting (e.g., rel="pass"). It is crucial for publishers to be able to control this easily through extension.

We could make our extension more general and call it amp-local-link-rewriter or amp-static-link-rewriter if others have an interest in using it similarly.

@torch2424
Copy link
Contributor

@parsingeye Ah, makes sense. Thank you for the consideration! Was mostly asking for @lannka 's reference. Thank you! 😄

@parsingeye
Copy link
Contributor Author

How should we proceed?

@torch2424
Copy link
Contributor

cc @lannka for the next steps.

But I think next step (if I understand correctly), is to present at a design review, that way the team in general can give feedback on the design and things. You can sign up for a design review, by commenting on one of the upcoming design review issues. And then present the design doc at that meeting 😄

@lannka @zhouyx Please correct me if I am wrong 👍

@lannka
Copy link
Contributor

lannka commented Feb 12, 2019

@parsingeye thanks for your patience. I took a look at your design and I do feel there is good chance this can be made generic so your contribution can benefit many others.

I left a couple of comments in the doc, we can try to get some questions answered there.
Looking forward to the design review.

@ampprojectbot ampprojectbot added this to the Pending Triage milestone Feb 13, 2019
@dvoytenko
Copy link
Contributor

/cc @jpettitt

@lannka
Copy link
Contributor

lannka commented Feb 26, 2019

@parsingeye any progress on the spec design?

@parsingeye
Copy link
Contributor Author

@lannka

JSON config spec proposal added to the end of the document https://docs.google.com/document/d/1slfb3znYHg4wdLwsvBqTFgB2K4esiltLAY1xq7g1ZpI

An example (in a more readable format)

<amp-digidip>
<script type="application/json">
{
  "rewritePattern": "https://visit.digidip.net?pid=110&url={{href}}",
  "ignoreHosts": ["youtube.com", "mobile.vodafone.de"],
  "include": [
    {
      "attribute": "id",
      "value": "comments"
    },
    {
      "attribute": "class",
      "value": "sidebar"
    }
  ],
  "exclude": {
    "attribute": "rel",
    "value": "skip",
    "scanParents": false
  }
}
</script>
</amp-digidip>

@lannka
Copy link
Contributor

lannka commented Feb 28, 2019

@parsingeye I'm thinking of a more concise and flexible spec:

{
  "output": "https://visit.digidip.net?pid=110&url=${href}&cid=${customerId}",
  "section": [
    "#product-listing-1",
    "#product-listing-2",
  ],
  "attribute": {
     "href": "((?!(https:\/\/youtube\.com)|(https:\/\/mobile\.vodafone\.de)).)*",
     "id": "comments",
     "class": "sidebar",
     "rel": "(?!(skip))*",
  },
  "vars": {
    "customerId": "12345"
  }
}

output is the new URL. (naming TBD)
section specifies CSS selectors to select the sub DOM tree for URL decoration.
attribute is regex based value filters. note the "ignoreHost" is implemented as a regex expression here. and we do and between the rules.
vars is optional client side data to pass through

Regex is less readable, but handles more diverse future requirements. It also makes the extension logic much simpler (smaller binary size). As you mentioned in the design review, the config is to be machine generated right?

@parsingeye
Copy link
Contributor Author

Although it is less readable, we like this approach and agree that will make the ext. logic somewhat simpler.

If no one else has any objections to this we would like to start implementing proposed extension config.

@parsingeye
Copy link
Contributor Author

One additional thing we would like to add that is not related to configuration, but not mentioned before is how to make anchor attributes available for "output".

If you have anchor like this:

<a href="https://ma.rs" rev="pub5" data-val="merchant5">Launch!</a>

extension would make data available in for of the anchorAttributes object

{
  "rev": "pub5",
  "data-val": "merchant5"
}

... and it can be used in configuration like this

{
  "output": "https://visit.digidip.net?url=${href}&referral=${anchorAttributes.rev}"
} 

@lannka
Copy link
Contributor

lannka commented Feb 28, 2019

@parsingeye glad to know it works for your use cases.
Regarding to the attribute data, I'd suggest we leverage the browser's dataset API (like what amp-analytics is already doing). That would make all the data- attributes available to your extension.

I would like to have whitelist for other non 'data-' attributes for privacy reasons, so that the pub has control over what to share with vendors. The initial whitelist would have id, href, rel, 'rev'. Please suggest if you need more.


@PhilWinchester do you see the above proposed spec could meet needs from Smartlinks?

@parsingeye
Copy link
Contributor Author

@lannka are you suggesting that "non 'data-' attributes" whitelist should be configurable or just initially hard-coded to "id, href, rel, rev"?


Regarding "data-" attributes, we noticed that amp-analytics is using data-vars-.

Example:

<span id="test1" class="box" data-vars-event-id="22"> Click here to generate an event </span>

And in the output url the token would be of the format ${eventId} (follows camelcase).

Would you suggest using the same pattern?

@lannka
Copy link
Contributor

lannka commented Mar 1, 2019

"non 'data-' attributes" whitelist should be configurable or just initially hard-coded to "id, href, rel, rev"

I'd suggest we start with a fixed set.

Regarding "data-" attributes, we noticed that amp-analytics is using data-vars-.

Yep, I was thinking about the data-vars-. So the same data attributes are accessible from multiple extensions.

@PhilWinchester
Copy link
Contributor

@lannka Those rules make sense for us. I'm assuming we'd still have access to the AMP element itself and could access certain elements (ie title, location, user agent, etc). The only other attributes that would be nice is class. Also, would our extension be able to write to these values or only read?

The use case I'm thinking of is passing back certain information relevant to the new link that the publisher would then use to display on their site.

@lannka
Copy link
Contributor

lannka commented Mar 1, 2019

we'd still have access to the AMP element itself and could access certain elements (ie title, location, user agent, etc)

@PhilWinchester we can make some of the AMP URL MACROs still available.

The only other attributes that would be nice is class

What do you use class for? AMP runtimes inserts many noise to the class attribute.

would our extension be able to write to these values or only read?

I would avoid any DOM mutation at this point for security reasons.

@parsingeye
Copy link
Contributor Author

Based on our prev. discussion, this is the list of variables available for usage inside output pattern.

value source example in output output example
location page https://publisher.com location original_url=${location}
referrer page https://google.com referrer referrer_url=${referrer}
id anchor <a href="..." id="link" /> id someid=${id}
rel anchor <a href="..." rel="pass" /> rel relation=${rel}
href anchor <a href="https://vendor.com" /> href url=${href}
rev anchor <a href="..." rev="author" /> rev someid=${rev}
data-vars-* anchor <a href="..." data-vars-merchant-id="123" /> data.merchantId merchant=${data.merchantId}
vars.* config { "vars": { "publisherId": "123" } } data.publisherId publisher=${data.publisherId}

Example of this in action would be:

Given the anchor tag:

<a href="https:/ma.rs" rev="RE334" data-vars-merchant-id="432">Lift off!</a>

... and config

"vars": {
  "customerId": "12345"
}

output attribute can look like this

"output": "https://vendor.com?url=${href}&user=${rev}&merchant=${data.merchantId}&customer=${data.customerId}"

... and the result of the rewrite for that perticular anchor with that configuration would be:

https://vendor.com?url=https://ma.rs&user=RE334&merchant=432&customer=12345


Let me know what you think so we can start finalizing our implementation.

@parsingeye
Copy link
Contributor Author

We have PR that closes this issue #21285

@lannka @PhilWinchester

@lannka lannka changed the title I2I: amp-digidip I2I: amp-link-rewriter Mar 12, 2019
@lannka
Copy link
Contributor

lannka commented Apr 30, 2019

@parsingeye thanks for contributing! since the feature has been there for a while, want to follow up to see if it is launched at your side and how things are going.

@parsingeye
Copy link
Contributor Author

@lannka Yes, amp-link-rewriter is used in production. Everything works well. Only complaint on a publisher side is slightly more complicated configuration (regex). Apart from that, all good so far!

Thanks for all your help and support!

@lannka
Copy link
Contributor

lannka commented May 16, 2019

@parsingeye good to hear that!
I thought the plan is to auto generate the config on your side?

@parsingeye
Copy link
Contributor Author

@lannka Yes, that is the plan. That will make config much easier for publishers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
INTENT TO IMPLEMENT Proposes implementation of a significant new feature. https://bit.ly/amp-contribute-code
Projects
None yet
Development

No branches or pull requests

6 participants