Skip to content
This repository has been archived by the owner on Jul 24, 2023. It is now read-only.

Auth_OpenID_Parse fails to find link attributes in HTML with uppercase HEAD tag #133

Closed
strk opened this issue Mar 18, 2017 · 9 comments
Closed

Comments

@strk
Copy link
Contributor

strk commented Mar 18, 2017

Offending HTML:

<HTML>
<HEAD>
    <TITLE>strk's publications</TITLE>
        <link rel="stylesheet"
     href="style.css" type="text/css">




  <link rel="openid2.provider" href="https://id.xxx.io/" />
  <link rel="openid2.local_id" href="https://strk.xxx.io/openid/" />

</HEAD>
<BODY BGCOLOR="#FFFFFF">

The Auth_OpenID_Parse's parseLinkAttrs returns empty when passed that HTML.
Why not using DOM rather than regexps in that function ?

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

Lowercasing the <head> tags (open and closed) seems to fix the parsing, which is weird because the regexp used to parse ends with a /si flags

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

Actually lowercasing just the first is enough.
This is with PHP 5.6.19-0+deb8u1

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

from #PHP IRC channel on Freenode:

18:17 <@TML> strk: But an XPath would be so much easier: $dom = new DOMDocument();
             $dom->loadHTML($data); $xpath = new DOMXPath($dom); $head =
             $xpath->query('//head')->items(0);

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

I restricted the input further.
This does not work:

<HTML>
<HEAD>  
  <link rel="openid2.provider" href="https://id.kbt.io/" />
  <link rel="openid2.local_id" href="https://strk.kbt.io/openid/" />
</HEAD>

While this works:

<HTML>
<head>  
  <link rel="openid2.provider" href="https://id.kbt.io/" />
  <link rel="openid2.local_id" href="https://strk.kbt.io/openid/" />
</HEAD>

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

The built regexp attempting to find that head block is this:

re:/<head\b(?!:)([^>]*?)(?:\/>|>(.*)(?:<\/?(?:head|body|html)\s*>|\Z))/si

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

Commenting out the $match = $match[0] line in function match around line 218 of Auth/OpenID/Parse.php fixes this case.

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

Sorry I was wrong, the actual fix is to always use preg_match rather than using mb_ereg_search after stripping flags (which is what makes that regexp case-sensitive).

So, what's the rationale to prefer mb_ereg_search over preg_match ?

@strk
Copy link
Contributor Author

strk commented Mar 18, 2017

and the $match = $match[0] line needs to go away when preg_match is used...

strk added a commit to strk/php-openid that referenced this issue Mar 18, 2017
@strk strk changed the title Auth_OpenID_Parse fails to find link attributes in HTML Auth_OpenID_Parse fails to find link attributes in HTML with uppercase HEAD tag Apr 25, 2017
@strk
Copy link
Contributor Author

strk commented Apr 25, 2017

@pfefferle here you can find the testcase if you want to give it a try

marcoceppi pushed a commit that referenced this issue Jan 10, 2018
* Fix case-sensitivity in discovery
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant