Define identity of a web app. #272

Closed
marcoscaceres opened this issue Nov 10, 2014 · 78 comments

@marcoscaceres
Member

  • what identifies an app? An origin?
  • how does one update the app?
  • what happens if the scope changes?
@marcoscaceres
Member Author

//cc @sicking

@mounirlamouri do you have any thoughts on the above?

@benfrancis
Member

The obvious answer is the manifest URL. Are there any other suggestions?

@sicking

sicking commented Dec 17, 2014

Given the lack of other ideas, I think we should simply go with the manifest URL, yes.

@marcoscaceres
Member Author

I wouldn't say there is a "lack of ideas" - we just haven't gotten around to this yet.

@sicking

sicking commented Dec 18, 2014

I think this needs to be a very high priority as it's likely to affect a lot of other features. For example the ServiceWorker registration API might affect what we do here.

@marcoscaceres
Member Author

Prioritized. Will work on this next.

@marcoscaceres marcoscaceres changed the title from "What is the identity of an app?" to "Define identity of a web app." on Dec 20, 2014
@opoto

opoto commented Dec 30, 2014

Shouldn't the spec state that the identifier is the manifest's canonicalized URL, so that:
HTTP://my.domain.com:80/app/manifest.json
and
http://my.domain.com/app/images/../manifest.json
are considered as same app identifier?

@sicking

sicking commented Dec 30, 2014

Yeah. That sounds right. Is there a difference between canonical URL and resolved URL? I.e. do you need to do anything extra to get the canonical URL after you resolve a URL against its base-URL?

@marcoscaceres
Member Author

To clarify, URLs are not their serialized string representations - they are objects. Once a string is parsed to a URL, paths are normalized. Hence, there is no such thing as canonical URLs or resolved URLs. There are just URLs.

The spec always treats URLs as being in their object form (being parsed from string input) - so adding clarifications like the above wouldn't really help too much.
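
The point above can be checked with the WHATWG URL parser that browsers and Node.js expose as `URL`. This is only a minimal sketch: the helper name `sameManifestIdentity` is illustrative and does not come from the spec.

```typescript
// Hypothetical helper: compares two manifest URL strings after parsing.
function sameManifestIdentity(a: string, b: string): boolean {
  // Parsing lowercases the scheme and host, drops the default port,
  // and resolves "." and ".." path segments.
  return new URL(a).href === new URL(b).href;
}

const firstSpelling = "HTTP://my.domain.com:80/app/manifest.json";
const secondSpelling = "http://my.domain.com/app/images/../manifest.json";

console.log(new URL(firstSpelling).href); // http://my.domain.com/app/manifest.json
console.log(sameManifestIdentity(firstSpelling, secondSpelling)); // true
```

Both strings from the earlier comment serialize to the same URL once parsed, so no separate canonicalization step is needed.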

@opoto

opoto commented Dec 31, 2014

This makes sense.
Thanks for the clarification. Maybe it should be in the spec? Or maybe this is obvious to anyone else but me... BTW, I checked the "URL" dfn link, but it leads to https://url.spec.whatwg.org/#url-parsing, whereas I guess it should be https://url.spec.whatwg.org/#concept-url.

marcoscaceres added a commit that referenced this issue Jan 1, 2015
@marcoscaceres
Member Author

@opoto thanks! fixed the busted link. There is some ongoing work to fix cross-document references in the tool I'm using to generate the spec (Respec). That should allow for more seamless jumping between concepts and their definitions. I'm a bit hesitant to add (non-normative) clarifications for concepts defined in other specs, as every time I've done that it's not ended well: either the other document changes and there is a slight mismatch in definition (leading to confusion) - or the Editor of the other spec gets upset because I'm redefining their stuff.

marcoscaceres added a commit that referenced this issue Jan 9, 2015
* 'identity' of github.com:w3c/manifest:
  Define identity of a web app (closes #272)

Conflicts:
	index.html
marcoscaceres added a commit that referenced this issue Jan 9, 2015
# The first commit's message is:
Define identity of a web app (closes #272)

# The 2nd commit message will be skipped:

#	Fixup

# The 3rd commit message will be skipped:

#	Fixup
marcoscaceres added a commit that referenced this issue Jan 9, 2015
* 'identity' of github.com:w3c/manifest:
  # This is a combination of 3 commits. # The first commit's message is: Define identity of a web app (closes #272)
marcoscaceres pushed a commit that referenced this issue Jan 11, 2015
Define identity of a web app (closes #272)
@marcoscaceres marcoscaceres reopened this Jan 13, 2015
@marcoscaceres
Member Author

Elsewhere, I proposed that identity be handled by the OS.

screenshot 2015-01-13 16 35 27

@jmajnert
Contributor

@marcosc - In a2e8c31#commitcomment-9254539 you wrote:

With this current proposal, the identity of the app will change if the start URL changes (e.g. from index.php to index.html). If an already installed app had its start URL changed in the manifest, it would stop getting updates and be orphaned on the device.

Its identity would simply be updated to reflect the new start URL, and the manifest URL would remain the same. The application is updated from the manifest and hence, so long as the manifest can be accessed, an update can take place.

I'm afraid that I don't fully understand your notion of identity. Why do we even need to have application identity in the manifest spec? At one point the manifest was regarded as just another resource with additional info about the app.
Application identity is hard to do right even in closed ecosystems like Android or iOS (author certs, setting up dev accounts, hosting apps in app-stores etc). Aren't we overreaching a bit when trying to define app identity for the whole web?

@jmajnert
Contributor

I agree with @marcosc - "authoring requirements" and "best practices" have no meaning for implementations and thus will be ignored in real world.

@benfrancis
Member

OK. How about if the "name" field was compulsory, and the start_url was resolved against the manifest URL (can use absolute URL if needs to be cross-origin)? These are things I'd quite like to see anyway and would make sharing a manifest between apps quite impractical.

@jmajnert
Contributor

If we're starting to think of manifests as standalone resources, we need to make start_url obligatory (and maybe not "purely advisory" as well).

@benfrancis
Member

Yes, or make start_url have a default of "/" or the directory of the manifest.
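
One way to picture the proposal (resolve start_url against the manifest URL, with the manifest's directory as the default) is the sketch below. It is an illustration only: `resolveStartUrl` and the example.com URLs are made up for this note, and the default shown is one of the two options suggested here, not settled spec behavior.

```typescript
// Hypothetical helper: resolves start_url against the manifest URL,
// falling back to the manifest's directory when the member is missing.
function resolveStartUrl(manifestUrl: string, startUrl?: string): string {
  const base = new URL(manifestUrl).href;
  // "." resolves to the directory that contains the manifest.
  const reference = startUrl === undefined ? "." : startUrl;
  return new URL(reference, base).href;
}

const exampleManifestUrl = "https://example.com/clockapp/manifest.json";

console.log(resolveStartUrl(exampleManifestUrl));
// https://example.com/clockapp/
console.log(resolveStartUrl(exampleManifestUrl, "index.html"));
// https://example.com/clockapp/index.html
console.log(resolveStartUrl(exampleManifestUrl, "https://other.example/app/"));
// https://other.example/app/
```

An absolute start_url is returned unchanged, which covers the cross-origin case mentioned earlier.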

@jmajnert
Contributor

Yes. Exactly. It has to point somewhere.

@marcoscaceres
Member Author

For the record....

Different understandings of the role of the web manifest

This document discusses pros and cons that can arise with the "additive" approach, defined in detail below, being taken in the current standardization of the web manifest. This document proposes an alternative approach that treats the manifest as "authoritative metadata" about a web application. What we mean by authoritative is also described in detail below.

Our alternative "authoritative" approach is not without its own set of pros and cons, but Mozilla would like to present it to other implementers for consideration - particularly as we believe it allows for a different life-cycle management than the current additive approach.

Additionally, as the folks standardizing the web manifest have not yet finalized the design of the specification (and no one fully implements it), the cost of switching models might not be too high if other implementers agree.

As such, we would appreciate your thoughts on which model would be best to pursue (i.e., continue down current "additive" path or take the "authoritative" approach... or maybe some kind of hybrid approach).

Manifest as additive, and its implications

To date, the W3C specification has been written with an assumption that the manifest provides additive metadata about a web application (i.e., a collection of web pages). It is additive in the sense that it overrides, amends, or works in concert with metadata found in a web page.

For instance, it is valid per spec to have a manifest that contains only the following information:

{
   "orientation":  "landscape",
   "display": "standalone",
   "scope":  "/clockapp/",
   "short_name": "Clock"
}

And have that associated with a page, "/clockapp/index.html" in the following manner:

<!doctype html>
<title>The World Clock — Worldwide</title>
<link rel=manifest href="//cdn.bar.com/manifest.json">
<meta name='application-name' content='World Clock'>
<link rel='icon' href='clock.ico' sizes='16x16 32x32 48x48 64x64'>

As per the current processing rules of the manifest spec, this allows the UA to merge what is declared in the manifest and whatever metadata can be gathered from the DOM of the page from which the web application is being "bookmarked" or "added to home screen".

Combining the raw JSON manifest and the metadata from the web page, would yield a "processed manifest" that would look like:

{
   "orientation":  "landscape",
   "display": "standalone",
   "scope":  "/clockapp/",
   "short_name": "Clock",
   "name": "World Clock",
   "icons": [{
       "src": "clock.ico",
       "sizes": "16x16 32x32 48x48 64x64"
    }]
}

At install time, the above processed manifest is used to compose a UI dialog that allows the user to install the application.
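
The merge described above could be sketched roughly as follows. The type names, the `PageMetadata` shape, and the precedence rule (manifest members win, page metadata only fills gaps) are assumptions made for illustration; the spec's actual processing steps are more detailed.

```typescript
interface IconEntry {
  src: string;
  sizes?: string;
}

interface ManifestMembers {
  name?: string;
  short_name?: string;
  scope?: string;
  display?: string;
  orientation?: string;
  icons?: IconEntry[];
}

// Hypothetical shape for metadata a user agent gathered from the page's DOM.
interface PageMetadata {
  applicationName?: string; // from <meta name="application-name">
  icons?: IconEntry[];      // from <link rel="icon">
}

// Additive model (assumed precedence): manifest members win,
// page metadata only fills in members the manifest omits.
function processAdditive(raw: ManifestMembers, page: PageMetadata): ManifestMembers {
  return {
    ...raw,
    name: raw.name !== undefined ? raw.name : page.applicationName,
    icons: raw.icons !== undefined ? raw.icons : page.icons,
  };
}

const clockManifest: ManifestMembers = {
  orientation: "landscape",
  display: "standalone",
  scope: "/clockapp/",
  short_name: "Clock",
};

const clockPage: PageMetadata = {
  applicationName: "World Clock",
  icons: [{ src: "clock.ico", sizes: "16x16 32x32 48x48 64x64" }],
};

// Prints an object with the same members as the processed manifest above.
console.log(JSON.stringify(processAdditive(clockManifest, clockPage), null, 2));
```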

Rationale of additive model

The rationale for the current additive design and processing model is to leverage legacy metadata declarations found in existing web content. For instance, research conducted by Mozilla in October 2013 showed that application-name: was used in 1,571 sites out of Alexa's top 78,000 sites (2%). Also, link@rel=icon (and favicon.ico) has been quite successful on the Web over the last decade, so the idea was to leverage those resources where possible.

In addition, as is the nature of all Web standards, it was assumed that cross-vendor implementation would be gradual - hence this additive model would allow developers to incrementally transition web page metadata from web pages to manifests over approximately 2-6 years (average time for cross-browser parity is +5-7 years). We are currently on ~year 2.

Pros

  • This approach fits a "traditional" web development model. A manifest works similarly to, for instance, CSS (in a very loose sense): values from the manifest are "applied" to a page when the application is opened from a user's home screen.
  • Works with CDNs.
  • No need for a MIME type.
  • Allows manifest to work in concert with existing metadata on a page (or group of pages).

Cons

  • Updating installation details is difficult: if some of the data is derived from the manifest, and the rest was derived from the web page, it can be complicated to update the icons/name/etc. of an installed web application.
  • Metadata is not authoritative: one app can use another application's metadata (possibly even across domains, CORS allowing).
  • There is no 1:1 mapping between manifest metadata and HTML5's metadata, so new link/meta types/relationships might need to be specified for fallback to work properly in HTML with new features (e.g., "scope"). This makes the manifest an alternative way of providing HTML meta tags about a page, which raises the question of whether it is worth standardizing a whole new format just for this metadata when the data could simply be included in a web page.

Manifest as authoritative

The manifest as authoritative means that the manifest serves as the absolute "source of truth" about a web application - making it distinct from metadata found in individual documents of a web application. As such, when processing the manifest, no fallback metadata is gleaned from the Document from which the manifest was derived.
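
By contrast with the additive model, an authoritative processor would ignore whatever the page declares. A minimal sketch, with hypothetical type and function names:

```typescript
interface ManifestMembers {
  name?: string;
  short_name?: string;
  icons?: { src: string; sizes?: string }[];
}

// Hypothetical shape for metadata found in the installing document.
interface PageMetadata {
  applicationName?: string;
  icons?: { src: string; sizes?: string }[];
}

// Authoritative model: the page metadata is accepted but deliberately unused,
// so the result depends on the manifest alone.
function processAuthoritative(raw: ManifestMembers, _page: PageMetadata): ManifestMembers {
  return { ...raw };
}

const authoritativeResult = processAuthoritative(
  { short_name: "Clock" },
  { applicationName: "World Clock", icons: [{ src: "clock.ico" }] },
);

console.log(authoritativeResult.short_name); // Clock
console.log(authoritativeResult.name);       // undefined
```

Because the output never depends on the page, re-fetching the manifest is enough to recompute it during an update.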

Rationale of authoritative approach

The rationale for the authoritative approach is to make the manifest a useful/standalone resource in its own right: with metadata describing a web app as a whole (all URLs within a defined "scope"), which is separate from metadata describing any single web page from which the manifest might be linked.

This allows a manifest to be used independently of any document that makes up the web application itself (e.g., from a marketplace). This is achieved by restricting the manifest to a particular origin:

  • having the manifest be same origin provides a light-weight trust mechanism to assert information about an application it hosts.

Pros

  • Manifest URL serves as a "stable" identifier for a web application.
  • Single "source of truth", making it easier to reason about updates/changes to the manifest: the metadata about the web application itself won't depend on any web page of the web application. This makes it simpler to perform updates, as the complete set of metadata can be gleaned from the manifest instead of the manifest plus an HTML document (as is the case in the additive model).
  • Marketplace-friendly: a developer can simply submit the link to the manifest to an online store (or even a regular website), and metadata about an application can be derived just from the manifest.

Cons

  • May require a MIME type. Historically, this has been problematic for developers who don't control their own server setups. For example, HTML had to drop its requirement of a MIME type on appcache manifests because of the number of developers that encountered issues trying to enable a particular type on a server (independently of the other problems inherent with appcache).
  • Breaks the ability to use manifests on a CDN. This could be a problem for many sites that rely on CDNs for static content that are held at other origins.
  • Might restrict customization and localization of the manifest - for instance, serving the right manifest to a user after he or she logs into a site.

@marcoscaceres
Member Author

Ok, so, I think the only sensible compromise position is:

  • Make manifest metadata authoritative (a user agent ignores a page's meta tags): this gives us the ability to perform updates, etc. reliably without relying on the document from which the page was installed.
  • Make only CORS-enabled fetches of the manifest the default, as per "Obtaining a Manifest should follow usual CORS rules with credentials" (#353). This allows cross-origin fetches, but gives content authors the ability to prevent other sites from using their manifests without permission.

Also, protection against XSS attacks is provided by manifest-src. So, evil.com won't be able to inject itself into good.com.

@marcoscaceres
Member Author

@benfrancis
Member

Make manifest metadata authoritative (a user agent ignores a page's meta tags): this gives us the ability to perform updates, etc. reliably without relying on the document from which the page was installed.

Yes.

Make only CORS-enabled fetches of the manifest the default, as per #353. This allows cross origin fetches, but provides content authors the ability to prevent other sites using their manifests without permission.

I think you keep misunderstanding the problem people are talking about here. The problem is not other people using your manifest for their own content (what use would that be to them?). It's other people re-packaging your content as an app by creating their own manifest for your content and showing ads in splashscreens, changing the start_url for phishing purposes or selling it in an app store etc.

This is why I think the solution needs to be on the app content end, not the manifest end, and is why I suggested the idea of using the CSP header to determine whether to render a page.

@benfrancis
Member

(it's similar to the phishing problem that X-Frame-Options solves, which is why I was exploring a similar solution)

@marcoscaceres
Member Author

It's other people re-packaging your content as an app by creating their own manifest for your content and showing ads in splashscreens, changing the start_url for phishing purposes or selling it in an app store etc.

I don't understand how is that even possible with the current spec? Can you show how you would do that, concretely, with say IRC cloud?

@marcoscaceres
Member Author

(you lose 10 points if you say "marketplace")

@benfrancis
Member

I don't understand how is that even possible with the current spec? Can you show how you would do that, concretely, with say IRC cloud?

The truth is that it mostly isn't a problem if you assume that web apps are only ever installed from a page of the app, which is the assumption the spec makes. A side effect of this is that the manifest is not a trustable resource in its own right, it can only be used in conjunction with a page of the app. This is why I'm pushing for an answer on whether installing from an app store is considered a valid use case of a web manifest. For example:

  • An evil developer creates a manifest at http://evil.com/manifest.json which has a start_url of http://irccloud.com/index.html
  • They submit the URL http://evil.com/manifest.json to the Firefox Marketplace or Windows Store to be featured as an app, costing $1.
  • A user installs the app from the app store, without reference to any page of the app
  • The evil developer changes the start_url of the manifest to http://evil.com/login.html
  • The user updates the app, launches it and logs into what they think is IRCCloud
  • The evil developer puts an ad in the splash screen of the app suggesting the user try out the new and improved product at evil2.com
  • The evil developer has $1, the user's username and password, and has them using their new evil2 product

As I understand it this was basically the rationale for the same-origin restriction on Firefox Apps. Whether or not this is important for web manifest depends largely on whether installing web apps from an app store, or using the manifest as a useful resource independently of a web page it might be referenced from, are considered valid use cases.
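
A same-origin restriction of the kind mentioned above would stop the scenario at its first step. The check below is a rough sketch of that idea, not the Firefox Apps implementation; the function name is invented for illustration.

```typescript
// Hypothetical store-side check: the start URL must share an origin
// with the manifest that declares it.
function startUrlMatchesManifestOrigin(manifestUrl: string, startUrl: string): boolean {
  const manifest = new URL(manifestUrl);
  // A relative start_url is resolved against the manifest URL.
  const start = new URL(startUrl, manifest.href);
  return start.origin === manifest.origin;
}

console.log(
  startUrlMatchesManifestOrigin("http://evil.com/manifest.json", "http://irccloud.com/index.html"),
); // false
console.log(
  startUrlMatchesManifestOrigin("http://irccloud.com/manifest.json", "/index.html"),
); // true
```

The manifest hosted on evil.com fails the check as soon as it points at irccloud.com, so it would be rejected before the later start_url swap could matter.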

@marcoscaceres
Member Author

The truth is that it mostly isn't a problem if you assume that web apps are only ever installed from a page of the app, which is the assumption the spec makes.

Yes, which is exactly why I've never understood what the hell you people were talking about :)

A side effect of this is that the manifest is not a trustable resource in its own right, it can only be used in conjunction with a page of the app. This is why I'm pushing for an answer on whether installing from an app store is considered a valid use case of a web manifest.

Not for this spec. No.

For example:

It can't do that. This is already banned.

-10 points (you were warned! :)).

As I understand it this was basically the rationale for the same-origin restriction on Firefox Apps. Whether or not this is important for web manifest depends largely on whether installing web apps from an app store,

It's not. The assumption is that you install at the application site, not from an app store.

or using the manifest as a useful resource independently of a web page it might be referenced from, are considered valid use cases.

This one is, but only in relation to performing updates of icons, etc.

@benfrancis
Member

Not for this spec. No.

OK, I'm fine with that. But are the Firefox Marketplace team, Microsoft and Crosswalk OK with that?

@marcoscaceres
Member Author

OK, I'm fine with that. But are the Firefox Marketplace team, Microsoft and Crosswalk OK with that?

Hence the ping to everyone. Note that we ripped the manifest out of the Sysapps Working Group to make it work with "The Web" (:tm:) - and not with marketplaces on purpose. Marketplaces have their own set of requirements which are incompatible with this specification.

If that's now changing again, this should bounce back to SysApps (at which point I would hand over the editorial reins to people who better understand the requirements around marketplaces, etc.).

@benfrancis
Member

OK, let's wait for feedback from others on whether the app store use case is essential to them.

In the mean time...

How does the current spec deal with this scenario?:

Doesn't this bypass the mechanism which is supposed to ensure that the start URL is same-origin with the page the app was installed from?

@marcoscaceres
Member Author

Doesn't this bypass the mechanism which is supposed to ensure that the start URL is same-origin with the page the app was installed from?

No. The start URL is resolved and forced same origin to the page the app was installed from. If that fails, you get the Document url. So, to update, you need to keep a record of the page where you installed from.
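
The rule described here might look roughly like the sketch below. It assumes the start URL is resolved against the manifest URL before the origin comparison; `processStartUrl` and its parameters are illustrative names, not spec terms.

```typescript
// Hypothetical helper: returns the start URL to use for an installed app,
// falling back to the installing document's URL whenever the declared
// start_url is missing, unparsable, or not same-origin with that document.
function processStartUrl(
  startUrl: string | undefined,
  manifestUrl: string,
  documentUrl: string,
): string {
  const documentParsed = new URL(documentUrl);
  if (startUrl === undefined) {
    return documentParsed.href;
  }
  let resolved: URL;
  try {
    resolved = new URL(startUrl, manifestUrl); // assumed base: the manifest URL
  } catch (error) {
    return documentParsed.href;
  }
  return resolved.origin === documentParsed.origin ? resolved.href : documentParsed.href;
}

console.log(
  processStartUrl("http://evil.com/login.html", "https://irccloud.com/manifest.json", "https://irccloud.com/"),
); // https://irccloud.com/
console.log(
  processStartUrl("/app/", "https://irccloud.com/manifest.json", "https://irccloud.com/"),
); // https://irccloud.com/app/
```

Since the comparison needs the installing document's origin, the user agent has to remember that URL in order to re-run the check on update.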

@jmajnert
Contributor

jmajnert commented May 1, 2015

Make manifest metadata authoritative (a user agent ignores a page's meta tags): this gives us the ability to perform updates, etc. reliably without relying on the document from which the page was installed.

+1. This is IMO the most sensible approach

Make only CORS-enabled fetches of the manifest the default, as per #353. This allows cross origin fetches, but provides content authors the ability to prevent other sites using their manifests without permission.

+1. As @benfrancis noted, this doesn't solve the rogue-app-store scenario in which the manifest is the only source of information about the app. IMHO, a sensible app store would validate such an app submission by visiting the app's site and checking, for example, whether the app links to the same manifest.

@jmajnert
Contributor

jmajnert commented May 1, 2015

There was once a discussion on the workflow of installing an app from the app store. From what I remember:

  • app store digests manifest (submitted or found by crawling the web). Nothing stops the app store from validating that the manifest is not malicious (e.g., by visiting the start URL and checking the original manifest, if one exists)
  • when user chooses to install an app from such a store (they click "Install" button), they are taken to the start_url and a normal installation flow from the UA is performed

For "special" app stores, like FxOS marketplace or Xwalk store, it's up to the store to validate the manifests and provide a special installation API if they wish to have their own installation UX
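
The validation step in that workflow could be sketched as below. It assumes the store has already loaded the page at start_url and collected the `href` values of its `<link rel="manifest">` elements; the function and parameter names are hypothetical.

```typescript
// Hypothetical store-side check: does the page at start_url link back
// to the manifest that was submitted to the store?
function pageLinksToManifest(
  submittedManifestUrl: string,
  pageUrl: string,
  manifestHrefs: string[],
): boolean {
  const submitted = new URL(submittedManifestUrl).href;
  return manifestHrefs.some((href) => {
    try {
      // Each href is resolved against the page it was found on.
      return new URL(href, pageUrl).href === submitted;
    } catch (error) {
      return false; // ignore hrefs that do not parse
    }
  });
}

// The page only links to its own manifest, not the submitted one.
console.log(
  pageLinksToManifest("http://evil.com/manifest.json", "http://irccloud.com/index.html", ["/manifest.json"]),
); // false
console.log(
  pageLinksToManifest("http://irccloud.com/manifest.json", "http://irccloud.com/index.html", ["/manifest.json"]),
); // true
```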

@alxlu

alxlu commented May 1, 2015

Make manifest metadata authoritative (a user agent ignores a page's meta tags): this gives us the ability to perform updates, etc. reliably without relying on the document from which the page was installed.

I agree with this too.

An evil developer creates a manifest at http://evil.com/manifest.json which has a start_url of http://irccloud.com/index.html
They submit the URL http://evil.com/manifest.json to the Firefox Marketplace or Windows Store to be featured as an app, costing $1.
A user installs the app from the app store, without reference to any page of the app
The evil developer changes the start_url of the manifest to http://evil.com/login.html
The user updates the app, launches it and logs into what they think is IRCCloud
The evil developer puts an ad in the splash screen of the app suggesting the user try out the new and improved product at evil2.com
The evil developer has $1, the user's username and password, and has them using their new evil2 product

Can't a developer already do something worse than this?

  • A malicious developer submits an app with a WebView pointing to foo.com
  • foo.com automatically redirects the user to http://irccloud.com/index.html
  • A user installs the app from the Store.
  • The malicious developer then changes foo.com to become malicious.
  • The user launches the app (and doesn't even have to update it), and logs into what they think is IRCCloud.

@marcoscaceres
Member Author

Ok, so I'm going to make manifest metadata authoritative and enable CORS by default. I think it's a fair compromise and will allow us to move forward.

@jmajnert
Contributor

jmajnert commented May 5, 2015

Ok, so I'm going to make manifest metadata authoritative and enable CORS by default. I think it's a fair compromise and will allow us to move forward.

+1
