<object> without "data" attribute probably needs to consider params to figure out its URL #387

Closed
bzbarsky opened this issue Dec 6, 2015 · 31 comments · Fixed by #7816
Labels
compat (Standard is not web compatible or proprietary feature needs standardizing), normative change, topic: embed and object

@bzbarsky
Contributor

bzbarsky commented Dec 6, 2015

Consider this testcase:

<object type=application/x-shockwave-flash>
  <param name=movie value=test.swf>
</object>

This apparently (according to https://bugzilla.mozilla.org/show_bug.cgi?id=517440#c0) loads test.swf in IE, Blink, and WebKit. Per spec, and in Firefox, it does not.

This param is NOT processed by Flash itself afaict. If it's getting loaded at all, it's the browser doing it. So the spec should probably specify this behavior, since most browsers do it.

It's worth figuring out what the actual required behavior here is, by the way. Today I ran into a page that had this markup on it:

<object>
  <param name=movie value=test.swf>
</object>

(note no type attribute) and WebKit/Blink happily loaded that as Flash. Neither IE nor Firefox did. I looked at the Blink code for this, and its behavior is summarized in https://bugzilla.mozilla.org/show_bug.cgi?id=517440#c2, assuming I understood it correctly.

So at the very least we need to figure out what set of param names reliably gets treated as a URL in IE. We also need to figure out whether we want the Blink/WebKit behavior of using a type sniffed from a param-provided URL if no type is specified. Seems like that would at least be consistent with what happens if no type is specified but the data attribute is specified.

@annevk added the normative change and compat labels Dec 14, 2015
@foolip
Member

foolip commented Jan 26, 2016

Here's where this happens in Blink:
https://chromium.googlesource.com/chromium/src/+/be94d60086b4453ea0f3769a9a3e85feb0cfbfe9/third_party/WebKit/Source/core/html/HTMLObjectElement.cpp#165

As @bzbarsky noted, there are four forms recognized: name=movie, name=src, name=code and name=url.
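A rough sketch of that lookup, in Python rather than Blink's C++. Only the four param names come from this thread. The function names, the `(name, value)` tuple representation, case-insensitive name matching, "first matching param wins", and "a data attribute takes precedence" are assumptions made for illustration (the last one matches what is described for Chromium later in this thread).

```python
from typing import Iterable, Optional, Tuple

# Param names treated as URL-bearing, per the four forms listed above.
URL_PARAM_NAMES = {"movie", "src", "code", "url"}


def is_url_param(name: str) -> bool:
    # Assumption: names are compared case-insensitively.
    return name.lower() in URL_PARAM_NAMES


def resolve_object_url(data: Optional[str],
                       params: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the URL an <object> would load under this sketch, or None."""
    # Assumption: a data attribute, when present, wins over any <param>.
    if data:
        return data
    # Assumption: the first URL-bearing <param> with a non-empty value is used.
    for name, value in params:
        if is_url_param(name) and value:
            return value
    return None
```

With this sketch, `resolve_object_url(None, [("movie", "test.swf")])` yields `"test.swf"`, while an `<object>` whose params are all non-URL names (such as `quality`) yields `None`.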

Is there any point in investigating the compat risk for this, or should we just spec all four forms?

@zcorpan
Member

zcorpan commented Jan 26, 2016

Maybe we only need those that are used for embedding Flash?

http://webdevdata.org/ data set 2015-01-08 (780 Mb) 87,000 pages

$ find . -type f -print0 | xargs -0 -P 4 -n 40 grep -iE "<object(\s[^>]*)?>.+<param\s+([^>]+\s)?name\s*=\s*[\"']?(movie|src|code|url)([\"'>]|\s).+</object\s*>" > ../params.txt
$ cd ..
$ grep -ivE "<object\s([^>]+\s)?data\s*=" params.txt > params-2.txt
$ grep -iEc "<param\s+([^>]+\s)?name\s*=\s*[\"']?movie" params-2.txt 
185
$ grep -iEc "<param\s+([^>]+\s)?name\s*=\s*[\"']?src" params-2.txt 
11
$ grep -iEc "<param\s+([^>]+\s)?name\s*=\s*[\"']?code" params-2.txt 
2
$ grep -iEc "<param\s+([^>]+\s)?name\s*=\s*[\"']?url" params.txt 
5
  • url and code are used for Java
  • url for WMP
  • movie and src for Flash

Most of these have an embed fallback though; excluding those with embed leaves only:

./ad/serpro.gov.br_ad691dde2366c82cdc7e59bb595e93ab.html.txt:                jq('#div_255801').append('<!--[if !IE]> Firefox and others will use outer object --><object classid="java:com.fluendo.player.Cortado.class" type="application/x-java-applet" archive="http://www.tv.serpro.gov.br/cortado.jar" height="200px" width="250px" ><!-- Konqueror browser needs the following param --><param name="archive" value="http://www.tv.serpro.gov.br/cortado.jar" /><param name="url" value="'+this.href+'"/><param name="local" value="false"/><param name="seekable" value="false"/><param name="autoplay" value="true"/><param name="live" value="true"/><!--<![endif]--><!-- MSIE (Microsoft Internet Explorer) will use inner object --><object classid="clsid:8AD9C840-044E-11D1-B3E9-00805F499D93" codebase="http://java.sun.com/update/1.5.0/jinstall-1_5_0-windows-i586.cab" height="200px" width="250px" ><param name="code" value="com.fluendo.player.Cortado" /><param name="archive" value="http://www.tv.serpro.gov.br/cortado.jar" /><param name="url" value="'+this.href+'"/><param name="local" value="false"/><param name="seekable" value="false"/><param name="autoplay" value="true"/><param name="live" value="true"/><strong>This browser does not have a Java Plug-in.<br /><a href="http://java.sun.com/products/plugin/downloads/index.html">Get the latest Java Plug-in here.</a></strong></object><!--[if !IE]> close outer object --></object><!--<![endif]-->');
./b5/sodastreamusa.com_b5e0db4dc89b9395d893060e4b8ec07e.html.txt:<script type="text/javascript" src="https://seal.verisign.com/getseal?host_name=www.sodastreamusa.com&amp;size=S&amp;use_flash=YES&amp;use_transparent=YES&amp;lang=en"></script><object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="https://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=5,0,0,0" id="s_s" width="100" height="72" align=""> <param name="movie" value="https://seal.websecurity.norton.com/getseal?at=1&amp;sealid=2&amp;dn=www.sodastreamusa.com&amp;lang=en"> <param name="loop" value="false"> <param name="menu" value="false"> <param name="quality" value="best"> <param name="wmode" value="transparent"> <param name="allowScriptAccess" value="always"></object>
./c3/asiafm.cn_c394b8ffa956011c585c4bdf566aee1a.html.txt:       $("#bfq").html("<object style='display:none;' id='WindowsMediaPlayer' name='WindowsMediaPlayer' width='100' height='50' classid='clsid:6BF52A52-394A-11d3-B153-00C04F79FAA6' viewastext=''> <param id='URL' name='URL' value='"+fmurl+"'><param name='volume' value='"+l+"'></object>")
./c3/asiafm.cn_c394b8ffa956011c585c4bdf566aee1a.html.txt:       $("#bfq").html("<object style='display:none;' id='WindowsMediaPlayer' name='WindowsMediaPlayer' width='100' height='50' classid='clsid:6BF52A52-394A-11d3-B153-00C04F79FAA6' viewastext=''> <param id='URL' name='URL' value='"+fmurl+"'><param name='volume' value='"+l+"'></object>")
./c3/asiafm.cn_c394b8ffa956011c585c4bdf566aee1a.html.txt:       $("#bfq").html("<object style='display:none;' id='WindowsMediaPlayer' name='WindowsMediaPlayer' width='100' height='50' classid='clsid:6BF52A52-394A-11d3-B153-00C04F79FAA6' viewastext=''> <param id='URL' name='URL' value='"+fmurl+"'><param name='volume' value='"+l+"'></object>") 

(1 site embedding Java, 1 site embedding Flash, 1 site embedding WMP.)
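For anyone who wants to rerun this kind of count without the shell pipeline, here is a hedged Python sketch of the same idea. The regular expressions are transcribed from the grep commands above; the function name and the line-based input are assumptions. Unlike the commands above, it applies the "no data attribute" filter to all four names uniformly, so its numbers would not necessarily match the ones reported here.

```python
import re
from collections import Counter

# Same pattern as the first grep: an <object> containing a URL-style <param>.
OBJECT_WITH_URL_PARAM = re.compile(
    r"<object(\s[^>]*)?>.+<param\s+([^>]+\s)?name\s*=\s*[\"']?"
    r"(movie|src|code|url)([\"'>]|\s).+</object\s*>",
    re.IGNORECASE,
)
# Same pattern as the second grep: an <object> that has a data attribute.
OBJECT_WITH_DATA = re.compile(r"<object\s([^>]+\s)?data\s*=", re.IGNORECASE)

PARAM_NAMES = ("movie", "src", "code", "url")


def count_url_params(lines):
    """Count, per param name, the lines that use it in a data-less <object>."""
    counts = Counter()
    for line in lines:
        if not OBJECT_WITH_URL_PARAM.search(line):
            continue
        if OBJECT_WITH_DATA.search(line):
            continue
        for name in PARAM_NAMES:
            # Like grep -c, this counts matching lines, not individual matches.
            pattern = r"<param\s+([^>]+\s)?name\s*=\s*[\"']?" + name
            if re.search(pattern, line, re.IGNORECASE):
                counts[name] += 1
    return counts
```

As with the grep version, this is line-oriented and regex-based, so markup split across lines or built up in script strings can be missed or miscounted.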

@foolip
Member

foolip commented Jan 26, 2016

Well, if anyone wants to avoid adding more cases than necessary, I can measure how often each is encountered in Blink, but that would add a lot of waiting to the process.

@zcorpan
Member

zcorpan commented Jan 26, 2016

My thinking was that nobody cares about non-Flash, given that Blink dropped applet and Firefox will drop support for all non-Flash plugins.

https://blog.mozilla.org/futurereleases/2015/10/08/npapi-plugins-in-firefox/

The data in webdevdata suggests movie and src cover all Flash cases, so we can add only those.

@bzbarsky
Contributor Author

We should really figure out what Edge does here.

@annevk
Member

annevk commented Jan 26, 2016

@travisleithead @jacobrossi, ideas?

@foolip
Member

foolip commented Jan 27, 2016

I tested IE11 on Win7, and it looks like only name=movie and name=src are supported; neither name=code nor name=url seems to do anything.

Spec'ing only name=movie and name=src SGTM!

@domenic
Member

domenic commented Jan 27, 2016

If you create a quick test case I can do Edge, in case they have decided to increase interop by implementing a couple more. (When I wake up. Going to sleep now, for realz!)

@foolip
Member

foolip commented Jan 27, 2016

@foolip
Member

foolip commented Jan 27, 2016

Good luck with that sleep thing.

@domenic
Member

domenic commented Jan 27, 2016

Looks like the same in Edge as IE11: movie and src only.

@zcorpan
Member

zcorpan commented Jan 27, 2016

OK so let's spec movie and src.

(note no type attribute) and WebKit/Blink happily loaded that as Flash.

It appears this only works if the URL ends with ".swf" (the spec has something about this as well for both embed and object).

Should we check for dynamic changes to params?

Whenever one of the following conditions occur:

  • the element is created,
  • the element is popped off the stack of open elements of an HTML parser or XML parser,
  • the element is not on the stack of open elements of an HTML parser or XML parser, and it is either inserted into a document or removed from a document,
  • the element's node document changes whether it is fully active,
  • one of the element's ancestor object elements changes to or from showing its fallback content,
  • the element's classid attribute is set, changed, or removed,
  • the element's classid attribute is not present, and its data attribute is set, changed, or removed,
  • neither the element's classid attribute nor its data attribute are present, and its type attribute is set, changed, or removed,
  • the element changes from being rendered to not being rendered, or vice versa,

...the user agent must queue a task to run the following steps to (re)determine what the object element represents. This task being queued or actively running must delay the load event of the element's node document.

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3845 ... probably yes.

@bzbarsky
Contributor Author

Should we check for dynamic changes to params?

Probably less critical in terms of compat, so it depends on whether we're viewing this as a "thing we should support and make nice" or a "thing we're adding just so sites don't break, make it simple".

It appears this only works if the URL ends with ".swf"

Or some other type handled by a plug-in, at least in Chrome.
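The extension-based fallback being described could look roughly like the following. This is only a sketch: the extension table is hypothetical (a real browser would consult its plugin registry), and ignoring the query string when checking the extension is an assumption, not something verified against any engine.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical extension-to-type table standing in for a plugin registry.
PLUGIN_TYPES_BY_EXTENSION = {
    ".swf": "application/x-shockwave-flash",
    ".pdf": "application/pdf",
}


def effective_type(type_attr: Optional[str], url: str) -> Optional[str]:
    """Pick the type for an <object>: explicit attribute first, else by extension."""
    if type_attr:
        return type_attr
    # Assumption: only the path is inspected, so "?query" parts are ignored.
    path = urlsplit(url).path.lower()
    for extension, mime_type in PLUGIN_TYPES_BY_EXTENSION.items():
        if path.endswith(extension):
            return mime_type
    return None
```

Under these assumptions, a param-provided `test.swf` with no type attribute is treated as Flash, while a URL with an unknown extension gets no type at all.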

Are we adding <param name="type"> as well?

@annevk
Member

annevk commented Jan 9, 2021

<object><param name=src value=https://html5.org/temp/abstract.pdf></object>

seems to work in Chrome and Safari, but not Firefox. @foolip @cdumez interested in removing support for this? Per @zcorpan's analysis above it's not used for PDF.

@domenic
Member

domenic commented Aug 9, 2021

@mfreed7 would you be up for adding a use counter for when https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/html/html_object_element.cc;l=148;drc=fb62432bab790c4f6006a4ce04084e214fd2d302;bpv=1;bpt=1 is hit for PDFs because of HTMLParamElement::IsURLParameter(name)?

blueboxd pushed a commit to blueboxd/chromium-legacy that referenced this issue Sep 4, 2021
See [1] for context, but there is interest in deprecating the
ability to specify an <object> containing a <param> element
that specifies the URL (via a name in {"code","data","movie",
"src", or "url"}). This CL adds a use counter for that
feature.

[1] whatwg/html#387

Bug: 572908
Change-Id: I7f152d05d992606224fb895bca6cd65a4bca15ea
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3139401
Commit-Queue: Mason Freed <masonf@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Auto-Submit: Mason Freed <masonf@chromium.org>
Reviewed-by: Joey Arhar <jarhar@chromium.org>
Cr-Commit-Position: refs/heads/main@{#918305}
@mfreed7
Contributor

mfreed7 commented Sep 8, 2021

@mfreed7 would you be up for adding a use counter for when https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/html/html_object_element.cc;l=148;drc=fb62432bab790c4f6006a4ce04084e214fd2d302;bpv=1;bpt=1 is hit for PDFs because of HTMLParamElement::IsURLParameter(name)?

Use counter added. I'll try to check back here in a bit when data is available.

@mfreed7
Contributor

mfreed7 commented Oct 7, 2021

Ok, early data says that usage of <object> with a URL <param> is low but non-zero, around 0.055% of page loads. This is above the "easy deprecation" threshold, so I'm not sure what we'd need/want to do here.

@domenic
Member

domenic commented Oct 7, 2021

Thanks for looking into it! It's possible a more sophisticated use counter could drive that number down (e.g., compute the URL with and without that logic to see if any Firefox fallback code would take over; or, be sure to only include PDF, and not any NaCl cases or similar). But probably we should just proceed with speccing the 2/3 browser behavior here, since the wins from the simpler 1/3 browser behavior are mostly for theoretical purity and spec authors, whereas a nontrivial deprecation imposes costs on implementers and web developers and users.

I can try working on that after vacation. Presumably we'll still get some good simplifications if this is the only impact param can have.

@domenic
Member

domenic commented Oct 12, 2021

@mfreed7 @annevk before I start such work, do you agree with that course of action, instead of trying to more aggressively pursue deprecation?

@annevk
Member

annevk commented Oct 13, 2021

@domenic at the triage meeting @mfreed7 was okay with taking a look at HTTP archive to see if these page loads actually end up being about PDF or are leftover Flash content. I think that would still be a worthwhile exercise.

@domenic
Member

domenic commented Nov 1, 2021

Ping @mfreed7 on those HTTP archive checks, ideally before the next triage meeting on Thursday :)

@mfreed7
Contributor

mfreed7 commented Nov 4, 2021

Thanks for the ping. So I took a look at this:

  1. The existing Chromium use counter only counts instances where the URL came from a <param> object inside the <object>, and not the case where the <object> had its own URL. In that case, the counter would not fire, and the URL from the <object> itself would be used. I thought this was the "fallback" logic mentioned in this comment but let me know if I missed another way for there to be a fallback URL that would get used if we disallowed <param name=src value=url>.
  2. To properly check whether the URL pointed to a PDF would require code in the browser, to check the mime type of the resource. I tried various things on HTTP Archive, such as looking for <object><param name=src value=something.pdf> with a (simplified) regex like this: (?i)<object[^>]*>[^<]*<param[^>]*pdf[^>]*>. The problem is that a) HTML can't be parsed properly with regex, b) searching for URLs that end in .pdf doesn't get all PDFs, c) my regex is very approximate, and most importantly d) many usages of <param> look like <param name="src" value="{path}"> so we can't see the filename.
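To make limitations (c) and (d) concrete, here is the quoted regex run in Python against three made-up snippets. The sample markup is invented for illustration and is not taken from HTTP Archive.

```python
import re

# The simplified pattern quoted above, unchanged.
PDF_PARAM = re.compile(r"(?i)<object[^>]*>[^<]*<param[^>]*pdf[^>]*>")

# Invented samples: a literal PDF URL, a templated value, and a misleading name.
samples = {
    "literal": '<object><param name=src value=something.pdf>',
    "templated": '<object><param name="src" value="{path}">',
    "false_positive": '<object><param name="pdfviewer" value="movie.swf">',
}

results = {label: bool(PDF_PARAM.search(html)) for label, html in samples.items()}
print(results)
# {'literal': True, 'templated': False, 'false_positive': True}
```

The templated case is missed because the filename is not in the markup, and the third case matches only because "pdf" appears somewhere inside the <param> tag.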

So while I did the HTTP Archive search, I don't really trust the results at all. I'm thinking we have two options:

  1. Add a proper use counter for <object>s that specify a <param> URL pointing to a resource that comes back as a PDF.
  2. Just make the change behind a flag and see what breaks in Canary/Dev.

I'm leaning toward option 2, but it's a riskier approach. Thoughts?

@mfreed7
Contributor

mfreed7 commented Nov 4, 2021

Ok, from the triage discussion, I took the action item to post the list of what I found on the HTTP Archive search. Please see the link below, which contains about 50 random examples I found. I took some time and went through the first ~20 or so to see what they were doing. There do appear to be a couple packages that people are using which would break, plus at least one that didn't look like it was from a package (or was a one-off). There are also a number of examples where there is a <param> pointing to a PDF, but the parent <object> also has the same URL/content, so it would likely continue working.

TL;DR I think there might be real compat issues with PDFs. But see what you think.

Results:
https://docs.google.com/spreadsheets/d/1Fo3F6IIOMFbXH116Y22950CSSksvuRLLwO3c5Kn8E90/edit?usp=sharing&resourcekey=0-U-u5Uecsr9aK2S-CWSwPDg

@domenic
Member

domenic commented Nov 4, 2021

I took a quick look at several of those pages and couldn't find any smoking gun broken site. Note that several of the "red" rows for jscripts.jquery.js are not an issue: that library does

            var objbuilder = '<div class="flex-video"><object width="100%" height="100%" data="data:application/pdf;base64,' + data.content + '" type="application/pdf" name="brochure">';
            objbuilder += '<param name="src" value="data:application/pdf;base64,' + data.content + '">';
            objbuilder += '</object></div>';

which falls back correctly.

Similarly I tried to go to a few homepages that use RD3 and they seem to just be including lots of RD3 libraries; I'm not sure they actually use the PDFPrint.js file.

So, it's still not clear to me whether this simplification is worth the risk, but I guess so far I still think it might work.

I will say that if you wanted to create a more comprehensive use counter for (1) what I would do is:

  • Compute the URL in the current fashion
  • Compute the URL without <param> having any input
  • Compare them. If they are different, and the result is a PDF, log a use counter hit.
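The three steps above can be sketched as follows. This is Python pseudologic, not Chromium code: every function name is hypothetical, `is_pdf` stands in for a check of the loaded resource's MIME type, and the URL resolution rule (data attribute first, then the first URL-style param) is an assumption.

```python
from typing import Callable, Iterable, Optional, Tuple

URL_PARAM_NAMES = {"movie", "src", "code", "url"}


def url_with_params(data: Optional[str],
                    params: Iterable[Tuple[str, str]]) -> Optional[str]:
    # Step 1: compute the URL in the current fashion (params may contribute).
    if data:
        return data
    for name, value in params:
        if name.lower() in URL_PARAM_NAMES and value:
            return value
    return None


def url_without_params(data: Optional[str]) -> Optional[str]:
    # Step 2: compute the URL with <param> having no input.
    return data or None


def should_log_use_counter(data: Optional[str],
                           params: Iterable[Tuple[str, str]],
                           is_pdf: Callable[[str], bool]) -> bool:
    # Step 3: log only if the two differ and the result is a PDF.
    current = url_with_params(data, params)
    without = url_without_params(data)
    return current is not None and current != without and is_pdf(current)
```

In a real browser the `is_pdf` check would have to run after the resource has loaded, since the URL alone does not reveal the content type.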

@mfreed7
Contributor

mfreed7 commented Jan 12, 2022

I will say that if you wanted to create a more comprehensive use counter for (1) what I would do is:

  • Compute the URL in the current fashion
  • Compute the URL without <param> having any input
  • Compare them. If they are different, and the result is a PDF, log a use counter hit.

So I finally got a chance to get back to this and add such a use counter. And I re-discovered what I commented about above:

  1. The existing Chromium use counter only counts instances where the URL came from a <param> object inside the <object>, and not the case where the <object> had its own URL. In that case, the counter would not fire, and the URL from the <object> itself would be used.

Therefore the existing counter actually does seem to be counting instances where the <param> URL is being used, i.e. the cases we'll be possibly breaking. In the case where the <object> itself contains a URL, that will already always be used, and any <param> children will be ignored (and the use counter will not fire). So the existing counter does everything in the three bullets above, except for the "and the result is a PDF" part.

So the only thing the existing use counter counts that it shouldn't would be a <param> pointing to something that never loads. Like a flash .swf file or another unsupported plugin. It turns out to be a bit difficult to use-count that, since the <object> embeds a separate document with its own navigation, and connecting the two isn't trivial. I'll keep taking a look, but I likely won't finish before tomorrow's triage meeting, and there definitely won't be data by then anyway.

Comments appreciated. I think we should discuss our plan of action assuming the current (trending downward) 0.04% actually represents usage that we'll break by deprecating <param> URLs.

@mfreed7
Contributor

mfreed7 commented Jan 18, 2022

So the only thing the existing use counter counts that it shouldn't would be a <param> pointing to something that never loads. Like a flash .swf file or another unsupported plugin. It turns out to be a bit difficult to use-count that, since the <object> embeds a separate document with its own navigation, and connecting the two isn't trivial. I'll keep taking a look, but I likely won't finish before tomorrow's triage meeting, and there definitely won't be data by then anyway.

So two new use counters have been added for this case. Both still only count (as above) when the containing <object> does not specify a URL, but the <param> does. But additionally, these two only count if a resource is successfully loaded from that URL. One for PDFs and one for non-PDFs:

These just landed, so it'll take a few months for data to show up.

@mfreed7
Contributor

mfreed7 commented Feb 28, 2022

These just landed, so it'll take a few months for data to show up.

Quick update: still no data.

@mfreed7
Contributor

mfreed7 commented Apr 4, 2022

Checking back in again here. The use counters in the comment above landed in Chrome 99.0.4832.0, which was in stable for most of March. The use counters show roughly this:

  • <param> that specifies a URL, inside an <object> that doesn't: 0.04%
  • As above, but URL successfully resolves to a PDF resource: 0.00002%
  • As above, but URL successfully resolves to a non-PDF resource: not measurable (not surprising, since PDF is the only Web-exposed plugin)

So the vast majority (99.95%) of <param> URL usage appears to point to invalid resources - likely mostly Flash. A very small percentage (0.05% of <param>-with-URL usage, 0.00002% of web page loads) would break if we deprecated this functionality.

If the above numbers are correct, it would seem "ok" to move ahead with deprecation.
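For anyone double-checking the percentages, the arithmetic works out as below. The variable names are illustrative; the two input figures are the ones quoted in this comment.

```python
# Figures quoted above, as percentages of page loads.
param_url_pct = 0.04       # <param> supplies the URL, <object> does not
pdf_loaded_pct = 0.00002   # same, and the URL resolves to a PDF

# Share of <param>-URL usage that actually loads a PDF, as a percentage.
would_break_pct = pdf_loaded_pct / param_url_pct * 100   # about 0.05
# Everything else loads nothing (for example, leftover Flash markup).
loads_nothing_pct = 100 - would_break_pct                # about 99.95

print(f"{would_break_pct:.2f}% would break, {loads_nothing_pct:.2f}% loads nothing")
```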

@mfreed7 added the agenda+ (To be discussed at a triage meeting) label Apr 4, 2022
@annevk
Member

annevk commented Apr 5, 2022

Thank you, that sounds very promising! \o/ (I suspect we'd still have to support HTMLParamElement indefinitely, but at least it no longer partakes in a processing model.)

@past removed the agenda+ (To be discussed at a triage meeting) label Apr 8, 2022
domenic added a commit that referenced this issue Apr 12, 2022
Given that plugins are gone from the web platform (with their full removal from the spec being tracked in #6003), it is not useful. In some browsers it can be used to figure out the URL of an <object>, even when that <object> is not being used for a plugin, via params named "movie", "src", "code", or "url". But we decided to remove this behavior from browsers instead of specifying it.

This retains the HTMLParamElement interface, as well as the parser behavior of <param>.

Closes #387. Helps with #6003.
@domenic
Member

domenic commented Apr 12, 2022

I have posted a spec PR for this at #7816 and hope we can proceed with the removal in Chromium soon.

domenic added a commit that referenced this issue Apr 21, 2022
Given that plugins are gone from the web platform (with their full removal from the spec being tracked in #6003), it is not useful. In some browsers it can be used to figure out the URL of an <object>, even when that <object> is not being used for a plugin, via params named "movie", "src", "code", or "url". But we decided to remove this behavior from browsers instead of specifying it.

This retains the HTMLParamElement interface, as well as the parser behavior of <param>.

Closes #387. Helps with #6003.
mfreed7 pushed a commit to mfreed7/html that referenced this issue Jun 3, 2022
Given that plugins are gone from the web platform (with their full removal from the spec being tracked in whatwg#6003), it is not useful. In some browsers it can be used to figure out the URL of an <object>, even when that <object> is not being used for a plugin, via params named "movie", "src", "code", or "url". But we decided to remove this behavior from browsers instead of specifying it.

This retains the HTMLParamElement interface, as well as the parser behavior of <param>.

Closes whatwg#387. Helps with whatwg#6003.
mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this issue Oct 14, 2022
See [1] for context, but there is interest in deprecating the
ability to specify an <object> containing a <param> element
that specifies the URL (via a name in {"code","data","movie",
"src", or "url"}). This CL adds a use counter for that
feature.

[1] whatwg/html#387

Bug: 572908
Change-Id: I7f152d05d992606224fb895bca6cd65a4bca15ea
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3139401
Commit-Queue: Mason Freed <masonf@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Auto-Submit: Mason Freed <masonf@chromium.org>
Reviewed-by: Joey Arhar <jarhar@chromium.org>
Cr-Commit-Position: refs/heads/main@{#918305}
NOKEYCHECK=True
GitOrigin-RevId: 8d40b3335c5e54869d4ffec5108f997753152be0
@mfreed7
Contributor

mfreed7 commented Oct 19, 2022

I have posted a spec PR for this at #7816 and hope we can proceed with the removal in Chromium soon.

Just a quick check-in on that process. I've been very slowly rolling out this change in Chromium, and I'm about to increase from 2% of stable to 5% of stable. But at this point, things are looking good: no reported bugs. I'll keep going slowly with the rollout, but I just wanted to say so far, so good. I'll stop commenting in this issue, but will continue updating the Chromium tracking bug. Of course, if a compat issue is reported, I'll report back here and re-open this bug.
