Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ReportingObserver #47

Merged
merged 16 commits into from
Apr 9, 2018
Merged

Add ReportingObserver #47

merged 16 commits into from
Apr 9, 2018

Conversation

RByers
Copy link
Collaborator

@RByers RByers commented Sep 1, 2017

Adds initial definitions and hooks for ReportingObserver for #29.


Preview | Diff

Fixes a number of definitions so that bikeshed can separate "Report" the
IDL interface from "report" the abstract definition.
This resolves all remaining Bikeshed warnings
@RByers
Copy link
Collaborator Author

RByers commented Sep 1, 2017

@mikewest @patrickkettner @paulmeyer90 what do you think of this? I hook into the queuing logic here and invoke this observer-specific machinery. All changes elsewhere in the spec should be unobservable I believe - they're just to resolve definition issues and make bikeshed happy.

I'm starting to work on some web platform tests - there are a surprising number of observable edge conditions to validate. Next major step is to define deprecation reports, but I wanted to see if the basic infrastructure looked OK to you first.

index.src.html Outdated
A {{ReportingObserver}} object can be used to observe some types of
<a>reports</a> from JavaScript when they are created.

Issue: How do we specify which reports are observable? Network-layer reports
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, that should be specified when the report is queued, as a default "off" parameter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think network-only reports should be the exception. So how about a default-off "network-only" parameter?

index.src.html Outdated
1. If present, remove the <a>context object</a> from the list of
<a>registered reporting observers</a> for the <a>environment settings
object</a> associated with the <a>context object</a>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on experience with PerformanceObserver (see w3c/performance-timeline#74), we should also support takeRecords().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing to consider: what about reports queued during "boot time" of the page, before observer can be registered? E.g. it's best practice to remove analytics and other non-critical work from the critical path, but that means that I may miss a whole lot of reports.

For PerfObserver, we're adding support for buffering events up to a certain point: w3c/performance-timeline#81

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
Discussed takeRecords as part of https://github.com/WICG/reporting/issues/58.
Filed https://github.com/WICG/reporting/issues/59 to discuss boot-time.

index.src.html Outdated
[=report/url=], and {{Report/body}} initialized to |report|'s
[=report/body=].

Issue: how to polymorphically initialize body?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't body a JSON payload?

Copy link
Collaborator Author

@RByers RByers Nov 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's just an object (i.e. you can do "report.body.message"). For the HTTP path, this object gets serialized to JSON. That said @patrickkettner suggests that the text I have here is fine. We'll have, eg. a DeprecationReport interface which is an instance of ReportBody. It's fine to have text elsewhere that creates a DeprecationReport and then manipulate it here as ReportBody. WDYT?

index.src.html Outdated

3. Add |r| to the end of |observer|'s <a>report list</a>.

2. <a>Queue a task</a> to [[#invoke-observers]] with a copy of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be queued as a low priority task? For PerfObserver we define own task source and indicate that it should be a low priority queue: https://w3c.github.io/performance-timeline/#queue-a-performanceentry

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, filed https://github.com/WICG/reporting/issues/58 to discuss further.

@paulmeyer90
Copy link
Contributor

@igrigorik I've made some changes to ReportingObserver, specifically around:

  1. How observers are registered.
  2. How reports are buffered before being sent to observers, and then queued in the observers before being reported.
  3. The interaction between the |futureOnly| options and the report buffer.
  4. Moving the setting of options from inside the observe() function to directly in the construction of the observer. Note that this prevents a number of edge cases that could arise from changing options after reports have already been observed or reported. Also, I don't think setting |futureOnly| actually makes sense after an observer is already observing, and likewise I don't see the need to change the set of report types to be observed after an observer has already been observing other types.
  5. Nailed down a number of the algorithms to ensure that everything is behaving consistently, especially with respect to the lifetime of reports, and how they are buffered/queued, depending on the options.

Any thoughts?

Copy link
Member

@igrigorik igrigorik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of formatting updates here. Could we split those out into a separate PR? Also, we have https://github.com/WICG/reporting/pull/67 in flight which will cause merge conflicts. That aside...

If I'm reading this correctly, the proposal is to have a global buffer of up to 100 reports, which can be delivered to new observers if they pass in an optional flag in the constructor?

index.src.html Outdated
{ "<a>url</a>": "https://backup.com/reports",
"<a>group</a>": "endpoint-1",
"<a>max-age</a>": 10886400 }
<a>Report-To</a>: { "<a for="ReportTo">url</a>": "https://example.com/reports",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a heads up, @dcreager is working on some updates that will break this.

index.src.html Outdated
JavaScript, and is represented in JavaScript by the {{ReportingObserver}}
object.

Each <a>environment settings object</a> has a <dfn>registered reporting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PO we use "Each ECMAScript global environment has": https://w3c.github.io/performance-timeline/#performance-timeline

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll use "ECMAScript global environment" instead.

index.src.html Outdated
Each <a>environment settings object</a> has a <dfn>report buffer</dfn>, which
is an ordered list of <a>reports</a> that have been generated in
that <a>environment settings object</a>. This list is initially empty, and the
reports are stored in the same order in which they are generated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PO we queue the pending entries / reports into each observer, instead of keeping a global buffer:
https://w3c.github.io/performance-timeline/#queue-a-performanceentry

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This buffer is just used for the purpose of passing past reports into new observers. Pending reports are still queued on observers (in their "report list").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to handle a race condition, where an observer might be registered after some of the events that it needs to monitor? If so I suggest some explanatory text somewhere; the name buffered implied to me something about queueing up reports between when Reporting receives from the downstream spec and before they're handed over to the observer.

index.src.html Outdated
<h3 id=interface-reporting-observer>Interface {{ReportingObserver}}</h3>

<pre class="idl">
[NoInterfaceObject]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The body is a JS object, right? Should this really be NoInterfaceObject?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll remove those tags.

[Constructor(ReportingObserverCallback callback, optional ReportingObserverOptions options)]
interface ReportingObserver {
void observe();
void disconnect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also define takeRecords for consistency — you'll need it. See w3c/performance-timeline@aecfdf4.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, added.

index.src.html Outdated

dictionary ReportingObserverOptions {
sequence&lt;DOMString> types;
boolean futureOnly = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@paulmeyer90 paulmeyer90 Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change it to |buffered|, but what do you think about the default value. Should reports really be buffered by default in this case? I'm leaning toward changing this to default not buffered both for consistency and because I feel that would likely be the assumption made when omitting the option entirely.

index.src.html Outdated
};
</pre>

Issue: How do we specify which reports are observable? Network-layer reports
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that each report type should be explicitly stating if it's meant to be JS-observable. We can make this a flag on the ~queue the report step, with default value set to "false".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've added a step to check for this. Each specific report type will need to explicitly state that it is observable from JavaScript in order to be observed by a ReportingObserver.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter would need to be per-report, not per-type. Only some Network Error Logging reports should be visible to observers, for instance. In particular, a NEL report should only be visible to the same origin that the report is about. So for instance, if a page at foo.com contains a link to bar.com, bar.com can request NEL reports, but we won't send those reports about bar.com to any foo.com Reporting collectors. And so any Reporting observers registered for foo.com shouldn't be able to see those bar.com NEL reports, either.

I think we can still get that with the parameter that the downstream spec has to set, indicating whether individual reports are visible to observers or not. NEL should have all the info needed to fill that in — if the environment settings object's origin is different than the origin of the request that the report describes, set it to no; otherwise set it to yes. I'd appreciate someone checking my work on that, though. 😉

So in addition to the parameter being present, I'd suggest some clarifying text. ("Downstream specs should make sure to set this parameter so that observers are only able to see reports that would be sent to collectors for that origin" or something like that)

index.src.html Outdated

4. Set <a>context object</a>'s {{futureOnly}} <a>option</a> to true.

5. For each |report| in the <a>report buffer</a> associated with |settings|,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that we must queue all reports indefinitely for the lifetime of the page. Is that the behavior we want, and/or reason for concern? It's not crazy to imagine hundreds of reports sitting around indefinitely..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that observers can be made to observe reports that were generated before they could be registered, and this is the behavior we want. We don't expect that the number of reports will be that large, but there is also the buffer limit of 100 in place, so there can't be hundreds sitting around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a section further down about garbage collection, which gives the user agent the ability to throw away reports before they're delivered if they've been sitting around too long. I suggest updating that text to say that if any reports are ever GCed from the reporting cache, they must also be removed from any observer's report list that they appear in.


4. Return |observer|.

The <dfn method for=ReportingObserver><code>observe()</code></dfn>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition doesn't cover the case where I call observe() on an existing observer with different options.

See https://w3c.github.io/performance-timeline/#the-performanceobserver-interface for an example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That issue is not present here because this spec no longer allows options to be passed into observe() for ReportingObserver. The options are passed in on construction of the observer, and so cannot be changed with multiple calls to observe(), for example.

index.src.html Outdated

2. Add |report| to the <a>report buffer</a> associated with |settings|.

3. If the <a>report buffer</a> now contains more than 100 reports, remove the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies there is a global 100 reports limit?

What is the "beginning"? We should spell it out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is a 100 report buffer limit.

The "beginning" of the list is used here: https://infra.spec.whatwg.org/#lists, so I think it is unambiguous. If it does need specifying though, then it should probably be specified in infra rather than in this spec.

@igrigorik
Copy link
Member

Paging @dcreager for review :)

@paulmeyer90
Copy link
Contributor

The formatting changes were actually necessary in order to allow bikeshed to properly link everything with the new changes, and so cannot be separated out from this CL.

Copy link
Member

@dcreager dcreager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took so long to get to this! Broadly speaking I like this; just have several suggestions that all boil down to "needs more intro / explanatory text"

index.src.html Outdated

4. Set <a>context object</a>'s {{futureOnly}} <a>option</a> to true.

5. For each |report| in the <a>report buffer</a> associated with |settings|,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a section further down about garbage collection, which gives the user agent the ability to throw away reports before they're delivered if they've been sitting around too long. I suggest updating that text to say that if any reports are ever GCed from the reporting cache, they must also be removed from any observer's report list that they appear in.

index.src.html Outdated
Each <a>environment settings object</a> has a <dfn>report buffer</dfn>, which
is an ordered list of <a>reports</a> that have been generated in
that <a>environment settings object</a>. This list is initially empty, and the
reports are stored in the same order in which they are generated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to handle a race condition, where an observer might be registered after some of the events that it needs to monitor? If so I suggest some explanatory text somewhere; the name buffered implied to me something about queueing up reports between when Reporting receives from the downstream spec and before they're handed over to the observer.

index.src.html Outdated
};
</pre>

Issue: How do we specify which reports are observable? Network-layer reports
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter would need to be per-report, not per-type. Only some Network Error Logging reports should be visible to observers, for instance. In particular, a NEL report should only be visible to the same origin that the report is about. So for instance, if a page at foo.com contains a link to bar.com, bar.com can request NEL reports, but we won't send those reports about bar.com to any foo.com Reporting collectors. And so any Reporting observers registered for foo.com shouldn't be able to see those bar.com NEL reports, either.

I think we can still get that with the parameter that the downstream spec has to set, indicating whether individual reports are visible to observers or not. NEL should have all the info needed to fill that in — if the environment settings object's origin is different than the origin of the request that the report describes, set it to no; otherwise set it to yes. I'd appreciate someone checking my work on that, though. 😉

So in addition to the parameter being present, I'd suggest some clarifying text. ("Downstream specs should make sure to set this parameter so that observers are only able to see reports that would be sent to collectors for that origin" or something like that)

@paulmeyer90
Copy link
Contributor

@dcreager: Not sure why I can't reply to your comments directly. Anyways:

  1. I have added a note to the garbage collection section as you described.

  2. I have added a note to describe the purpose of the |report buffer| and |buffered| option.

  3. Currently, the spec is set up to not allow any reports to be observable cross-origin; they are only sent to observers in the same environment settings object. I think that this property, in addition to specifying which report types are observable, should be sufficient. Are there any other problem cases you can think of that would require observability to be specified per-report rather than per-type? I have also added a note to clarify this in the spec.

Copy link
Member

@dcreager dcreager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New text looks great.

Is it worth calling out that the observer part is only relevant for a user agent that has a JavaScript engine? I'm implementing a Reporting client over in nel-reporter-java, for instance, that will be used in native apps that are just using HTTP as a transport mechanism, and don't have any of the other Web Platform bits available. @igrigorik, is there already boilerplate text about that that we can steal from other specs? Or is it implicit and not worth mentioning?


dictionary ReportingObserverOptions {
sequence&lt;DOMString> types;
boolean buffered = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to processAlreadyBufferedReports (or something in that vein)? Unless there's a general preference in these specs for shorter names, I like the more descriptive style, especially for boolean options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer current name because it allows us to keep consistency between *Observer interfaces — e.g. PerformanceObserver, ReportingObserver, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@igrigorik
Copy link
Member

@igrigorik, is there already boilerplate text about that that we can steal from other specs? Or is it implicit and not worth mentioning?

I'm not aware of boilerplate, but you may want to add a note (non-normative, probably) that outlines how non UA clients should behave.

Copy link
Member

@dcreager dcreager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending some quick text about non-UA clients


dictionary ReportingObserverOptions {
sequence&lt;DOMString> types;
boolean buffered = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants