ads.txt #201

torgo · 2017-09-26T11:42:08Z

Hello TAG!

I'm requesting a TAG review of:

Name: ads.txt
Specification URL: https://iabtechlab.com/wp-content/uploads/2016/07/IABOpenRTBAds.txtSpecification_Version1_Final.pdf
Explainer, Requirements Doc, or Example code: https://digiday.com/marketing/wtf-ads-txt/

Further details (optional):

This issue has been opened up by the TAG

torgo · 2017-09-26T11:43:29Z

Some of the issues we have identified in our discussion at our Nice f2f:

This spec defines a well known, hard coded URL. There is now a standard for placing these paths within a .well-known prefix, see https://tools.ietf.org/html/rfc5785
The spec does not define the format using a formal syntax grammar, eg. ABNF, making it very hard to understand what would be valid examples of this format. For example, there is no specification for which whitespace characters are acceptable as separators. For examples of good grammar specifications, see https://www.w3.org/TR/tabular-data-model/
The spec requires that the ads.txt file is published on a 'root domain'. There is no technical definition of 'root domain' in web architecture, and sites with authority and control over an origin may reasonably not have control over the parent origin.
It appears possible that this document is allowing for parseable content to follow on from a comment on the same line as the comment text. This would be so unusual that we suspect that this is not actually the intent of the authors.
The document specifies that ads.txt should be available on HTTP and HTTPS. This is enormously concerning, especially since some sites are moving away from listening for HTTP traffic at all, and requiring the use of HTTP for any web specification should be considered contrary to the very principles of good web architecture and detrimental to the future development of the web. See the TAG finding on securing the web
The document contains a normative reference to w3schools regarding URL encoding, which is a site generally regarded as a poor source of information about the web, and certainly not a primary source on any subject. On this point, https://tools.ietf.org/html/rfc3986 would be the correct normative reference.
Google has a system called App links and we are wondering why a mechanism like that is not appropriate for this use case.
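To illustrate the grammar and comment concerns above, here is a minimal Python sketch of one possible strict reading of a single line. It is not taken from the spec. It assumes that '#' starts a comment running to the end of the line, that only space and tab may pad a field, and that the line terminator has already been removed. The function name and sample values are hypothetical.

```python
from typing import List, Optional

# Assumption: only space (0x20) and tab (0x09) count as padding around a field.
FIELD_WHITESPACE = " \t"


def parse_line(line: str) -> Optional[List[str]]:
    """Return the trimmed fields of one line, or None for a blank or comment-only line.

    The caller is expected to have removed the line terminator already.
    """
    # Assumption: everything from the first '#' to the end of the line is a comment,
    # so nothing after a comment on the same line is ever parsed.
    content = line.split("#", 1)[0].strip(FIELD_WHITESPACE)
    if not content:
        return None
    return [field.strip(FIELD_WHITESPACE) for field in content.split(",")]
```

A formal grammar would let every implementer answer these questions the same way, instead of each parser encoding its own guesses like the ones marked above.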

We are happy to engage with the authors, and we appreciate the importance of the problem that this is trying to solve. Making this more compatible with web architecture would be appreciated and will help the authors get better buy in from the web community.

(most of the words in this comment by @triblondon)

slightlyoff · 2017-09-26T13:32:42Z

The author of the document responded to a private ping, noting there's an updated version of the document here.

The 1.0.1 update indicates that crawlers should follow redirects within the same CNAME entry (although the language is woolly regarding "root domain"); e.g. it allows redirects between https://example.com and http://example.com, enabling downgrade of connection security.

There appear to be additions for "SUBDOMAIN" which is a redirect type. It does not appear to be well-specified and it's unclear why redirects with an eTLD+1 policy aren't being used instead.
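The downgrade concern can be made concrete with a small check a crawler might run over the URLs it visited while following redirects. This is a sketch under assumptions: the function name is hypothetical, the redirect chain is supplied by the caller as a list of absolute URLs, and the policy shown (flag any https to http hop) is one possible rule, not something the ads.txt document specifies.

```python
from typing import List
from urllib.parse import urlsplit


def first_downgrade(chain: List[str]) -> int:
    """Return the index of the first URL reached by an https -> http hop, or -1 if none."""
    for i in range(1, len(chain)):
        previous_scheme = urlsplit(chain[i - 1]).scheme
        current_scheme = urlsplit(chain[i]).scheme
        if previous_scheme == "https" and current_scheme == "http":
            return i
    return -1
```

A crawler applying such a rule would stop at the flagged hop rather than trust content fetched over the insecure connection.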

tantek · 2017-10-17T14:32:54Z

@torgo re: "On this point, https://tools.ietf.org/html/rfc3986 would be the correct normative reference." why not https://url.spec.whatwg.org/ instead which I believe more and more W3C RECs are citing. E.g. https://www.w3.org/TR/webmention/#normative-references and https://www.w3.org/TR/websub/#normative-references (the latter a PR hopefully soon to be REC)

slightlyoff · 2018-02-01T14:33:21Z

We've re-visited this at the London F2F meeting. Most of the issues remain. I'm pinging the authors via private mail.

triblondon · 2018-02-01T14:37:59Z

Up to date list of concerns, referencing the 1.0.1 version of the doc:

This spec defines a well known, hard coded URL. There is now a standard for placing these paths within a .well-known prefix, see https://tools.ietf.org/html/rfc5785
The spec does not define the format using a formal syntax grammar, eg. ABNF, making it very hard to understand what would be valid examples of this format. For example, there is no specification for which whitespace characters are acceptable as separators. For examples of good grammar specifications, see https://www.w3.org/TR/tabular-data-model/
The spec requires that the ads.txt file is published on a 'root domain'. There is no technical definition of 'root domain' in web architecture, and sites with authority and control over an origin may reasonably not have control over the parent origin.
The document specifies that ads.txt should be available on "HTTP and/or HTTPS". This is enormously concerning, especially since some sites are moving away from listening for HTTP traffic at all, and suggesting the use of HTTP for any web specification should be considered contrary to the very principles of good web architecture and detrimental to the future development of the web. See the TAG finding on securing the web
The document contains a normative reference to w3schools regarding URL encoding. W3Schools is a site which has been widely regarded as a poor source of information about the web, and certainly not a primary source on any subject. On this point, https://tools.ietf.org/html/rfc3986 or https://url.spec.whatwg.org/ would be the correct normative reference.
The doc indicates that crawlers should follow redirects within the same CNAME entry (although the language is woolly regarding "root domain"); e.g. it allows redirects between https://example.com and http://example.com, enabling downgrade of connection security.
There appear to be additions for "SUBDOMAIN" which is a redirect type. It does not appear to be well-specified and it's unclear why redirects with an eTLD+1 policy aren't being used instead.
Google has a system called App links and we are wondering why a mechanism like that is not appropriate for this use case.
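As a sketch of what the first, third and fourth points could look like together, the following Python function derives a lookup URL from the origin of a page rather than from a "root domain", and refuses non-HTTPS origins. The path /.well-known/ads.txt is purely hypothetical here: no such well-known URI is defined by the ads.txt document, and the function name is likewise an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical path, used only for illustration; not defined by the ads.txt document.
WELL_KNOWN_PATH = "/.well-known/ads.txt"


def well_known_ads_txt_url(page_url: str) -> str:
    """Build an HTTPS-only lookup URL scoped to the origin of page_url."""
    parts = urlsplit(page_url)
    if parts.scheme != "https":
        raise ValueError("only https origins are accepted in this sketch")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    if ":" in host:  # IPv6 literal: urlsplit drops the brackets, so restore them
        host = f"[{host}]"
    # Omit the port when it is the https default so equal origins map to one URL.
    authority = host if parts.port in (None, 443) else f"{host}:{parts.port}"
    return f"https://{authority}{WELL_KNOWN_PATH}"
```

Scoping the lookup to the origin sidesteps the need to define "root domain" at all, at the cost of requiring each origin to publish its own file.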

triblondon · 2018-02-01T14:45:47Z

Alex and I have pinged IAB people and we'll follow up on a telcon

slightlyoff · 2018-10-30T12:12:07Z

Met with George several times in February, debriefed in Tokyo. Just pinged again to understand if they plan to publish a new version which will address our concerns.

wseltzer · 2019-02-05T22:04:05Z

A venue for further discussion could be the Improving Web Advertising BG which has active participation from IAB TechLab.

torgo · 2019-03-01T10:14:51Z

@wseltzer just following up on this. Does the Web Advertising BG hold regular calls? Can we potentially tee up this discussion point and maybe members of the TAG could join for that session?

wseltzer · 2019-03-06T17:37:47Z

@torgo yes, the group meets every 2 weeks, with upcoming calls planned for March 14 and March 28.

ylafon · 2019-05-21T14:36:14Z

Note that the current version is https://iabtechlab.com/wp-content/uploads/2019/03/IAB-OpenRTB-Ads.txt-Public-Spec-1.0.2.pdf

ylafon · 2019-05-21T15:45:55Z

As of version 1.0.2, we notice that most comments have not been addressed yet, apart from a clarification in the redirect section. In this section, codes other than 302 are allowed, but 308 is missing from the updated list. Section 5.3 would greatly benefit from a clarification of the parsing model, whitespace definition, etc.

We are still concerned about the possible "downgrade redirect" issue, as the current specification still allows redirects from https to http. In general the specification should mandate the use of https only (and MAY default to http if not available, with the trust issues associated with its use).

Also, as the document defines a document format, it would be better for it to have a proper media type definition rather than using text/plain; at worst, using the generic text/csv would be better.
Note that the RFC defining the text/csv media type also defines its grammar (see comment on section 5.3): https://tools.ietf.org/html/rfc4180
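To show what leaning on an existing CSV grammar could buy, here is a rough Python sketch that reads records with the standard library csv module. It is an illustration, not the spec's parsing model. It assumes whole-line '#' comments only (a trailing comment on a record line is not handled), the function name is hypothetical, and the sample values in the usage note are made up.

```python
import csv
import io
from typing import List


def read_records(text: str) -> List[List[str]]:
    """Parse comma-separated records, skipping blank lines and whole-line '#' comments."""
    rows: List[List[str]] = []
    # newline="" lets the csv module apply its own line-ending handling.
    reader = csv.reader(io.StringIO(text, newline=""), skipinitialspace=True)
    for row in reader:
        if not any(field.strip() for field in row):
            continue  # blank or whitespace-only line
        if row[0].lstrip().startswith("#"):
            continue  # assumption: a line whose first field starts with '#' is a comment
        rows.append([field.strip() for field in row])
    return rows
```

For example, read_records("example.com, 1234, DIRECT\r\n# note\r\n") yields a single record with the three trimmed fields. Quoting and line-ending behaviour then come from the CSV reader rather than from rules each implementer has to guess.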

torgo · 2019-05-21T16:07:16Z

@wseltzer we are thinking since we haven't made enough progress on this issue that it should be migrated over to the advertising BG. Would the BG be a good forum for discussing ads.txt and feeding back on its design? Let us know and maybe we can migrate the issue over this week.

cmlight · 2020-01-29T16:47:43Z

Hi, ads.txt working group member here. Yes, it would be great to get these concerns addressed in the next ads.txt (and related specs) version update. Items I had previously written down that I'm hoping to make more technically precise include:

Character encoding: we see files published in various character encodings which may not be properly interpreted by all platforms. We should specify a character encoding such as UTF-8 for the file content so that validators can consistently flag issues
Byte-order mark headers: we see files that have non-visible byte order marks (https://en.wikipedia.org/wiki/Byte_order_mark) which can trip up parsing if not interpreted properly. We should include specifics in the spec about whether these are allowed or not
Line endings: the spec does not specify which byte sequences are considered line endings. We've encountered files encoded using atypical (or containing a mix of) line ending types which could trip up parsers. We should update the spec to include specifics of what byte sequences (0a; 0d0a; 0d; etc) are considered valid, parseable line endings.
Public suffix list specificity: the publicsuffix.org list contains two sections: an ICANN section and a private section. The ads.txt spec doesn't specify whether the private section is valid for use.
SUBDOMAIN= directive specificity and limitations: I'd like to make the spec provide more detail and examples about how SUBDOMAIN= directives behave and interact with each other, along with potentially defining a limit to the number of levels.
Security: I'd like to see if we can be more precise in the standard about how to treat HTTPS URLs, when it is permissible to fall back to HTTP, what validations the crawler should perform (e.g. SSL certificate validation), and the valid transport security protocols accepted. We should consider security risks that should be mitigated with precise rules.
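The encoding, byte order mark and line-ending items above could be pinned down along the following lines. This is only a sketch of one possible set of rules (UTF-8 only, a leading BOM tolerated and stripped, 0d0a / 0d / 0a all accepted as line endings); the current spec does not state these rules, and the function name is hypothetical.

```python
import re
from typing import List

# Accepts 0d0a, 0d and 0a; the two-byte form is listed first so it is not split twice.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def decode_lines(raw: bytes) -> List[str]:
    """Decode a fetched file as UTF-8 and split it into lines.

    The "utf-8-sig" codec strips one leading UTF-8 byte order mark if present.
    Invalid byte sequences raise UnicodeDecodeError so a validator can flag the file.
    """
    text = raw.decode("utf-8-sig")
    return LINE_ENDING.split(text)
```

Note that a file ending with a line terminator produces a final empty string, which a parser would then treat as a blank line.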

I will work with @slightlyoff on this.

Stepping back from the specific recommendations in this thread, I was wondering if you have any pointers to documents that explain how to write a good spec, if such a thing exists? Also, I would like to somehow put together a compatibility testing suite that participants can use to confirm that their crawlers and parsers were implemented correctly. If you have any tips on this or examples of well-written solutions that do this, that would be great to learn from.

torgo · 2020-03-04T21:45:16Z

Hi @cmlight -

First of all, thanks for the visibility on some of the issues you are tackling. It seems like there is active work happening on a new spec. I think what needs to happen is that when a new spec is ready for review, someone files a new design review issue here with us.

Regarding how to write a good spec, we can provide feedback but we are not really equipped to help write the spec itself and some of the answer to that question is venue specific. One approach might be to bring this work to a venue where you might have greater opportunity to bring in expertise in spec development and expertise in related web technologies. For example, a w3c community group could be a good low-friction venue. In general, successful web specifications tend to be developed in an open environment and according to a transparent process.

torgo assigned slightlyoff Sep 26, 2017

torgo added the "Progress: pending external feedback" label Sep 26, 2017

plinss added this to the tag-telcon-2018-02-20 milestone Feb 1, 2018

torgo modified the milestones: 2018-02-21-telcon, 2018-03-20-telcon Feb 21, 2018

plinss modified the milestones: 2018-03-20-telcon, 2018-04-05-f2f-tokyo Mar 27, 2018

torgo modified the milestones: 2018-04-05-f2f-tokyo, 2019-02-05-f2f Oct 30, 2018

torgo added the extra time label Feb 5, 2019

torgo assigned torgo and ylafon Feb 5, 2019

cynthia removed the extra time label Feb 7, 2019

torgo modified the milestones: 2019-02-05-f2f, 2019-03-05-telcon Mar 1, 2019

plinss modified the milestones: 2019-03-05-telcon, 2019-03-26-telcon Mar 13, 2019

plinss modified the milestones: 2019-03-26-telcon, 2019-04-02-telcon Mar 26, 2019

plinss removed this from the 2019-04-03-telcon milestone Apr 8, 2019

plinss added this to the 2019-04-17-telcon milestone Apr 8, 2019

plinss modified the milestones: 2019-04-17-telcon, 2019-05-08-telcon Apr 22, 2019

torgo assigned alice and unassigned slightlyoff May 20, 2019

plinss modified the milestones: 2019-05-08-telcon, 2019-06-12-telecon Jun 11, 2019

plinss modified the milestones: 2019-06-12-telecon, 2020-01-13-week, 2020-01-20-week Jan 13, 2020

torgo added the "Progress: propose closing" and "Progress: stalled" labels and removed the "Progress: pending external feedback" label Jan 21, 2020

plinss removed this from the 2020-01-20-week milestone Jan 27, 2020

torgo closed this as completed Mar 4, 2020

torgo removed the "Progress: propose closing" and "Progress: stalled" labels Mar 4, 2020
