Unknown JSON-LD item #10
I'm looking to build a script that sees what data it can glean from any given URL, microdata first, then content. Your parser seems perfect for that, but I've noticed a case where an error is thrown.
I'm giving the following url:
And I'm getting the following warning:
Is it finding microdata but attempting to parse it as JSON-LD?
I've also noticed cases where no data is returned even though microdata is used on the page. Is this indicative of poor configuration on their end?
Thanks in advance
Here's a list of urls with data that either isn't being returned, or is buggy:
I appreciate that some of these may be down to the implementation of the microdata on the pages themselves.
@danjaywing Several things:
$parsers = \Jkphl\Micrometa\Parser\Microformats2::PARSE | // Microformats = 1
    \Jkphl\Micrometa\Parser\Microdata::PARSE |            // Microdata = 2
    \Jkphl\Micrometa\Parser\JsonLD::PARSE;                // JSON-LD = 4
$micrometaParser = new \Jkphl\Micrometa($url, null, $parsers);
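The `|` operators in the snippet combine the per-parser flags into a single bitmask. As a rough, library-independent sketch of how such a mask composes and how one format can be switched off, the following uses hypothetical constants (`PARSE_MICROFORMATS2`, `PARSE_MICRODATA`, `PARSE_JSONLD`) and a hypothetical helper `enabledFormats()`. Only the values 1, 2 and 4 are taken from the comments in the snippet; nothing here is Micrometa's own API.

```php
<?php
// Hypothetical stand-ins for the parser flags; the values mirror the
// comments in the snippet (Microformats = 1, Microdata = 2, JSON-LD = 4).
const PARSE_MICROFORMATS2 = 1;
const PARSE_MICRODATA = 2;
const PARSE_JSONLD = 4;

/**
 * Return the names of the formats whose flag is set in the given bitmask.
 * Illustrative helper only, not part of any library.
 */
function enabledFormats(int $mask): array
{
    $names = [
        PARSE_MICROFORMATS2 => 'microformats2',
        PARSE_MICRODATA => 'microdata',
        PARSE_JSONLD => 'json-ld',
    ];
    $enabled = [];
    foreach ($names as $flag => $name) {
        // A flag is enabled when its bit survives the bitwise AND.
        if (($mask & $flag) === $flag) {
            $enabled[] = $name;
        }
    }
    return $enabled;
}

// Combine all three flags with bitwise OR.
$all = PARSE_MICROFORMATS2 | PARSE_MICRODATA | PARSE_JSONLD;

// Clear the JSON-LD bit to leave only the two HTML-based formats.
$withoutJsonLd = $all & ~PARSE_JSONLD;

print_r(enabledFormats($withoutJsonLd));
```

Under these assumptions, a mask built this way could be passed wherever a combined parser flag is expected, for example to check whether a warning still appears with one format disabled.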
Thanks for the fix.
The following is an example of a page containing microdata that isn't parsed:
As you can see from the source code, there is a product type, but when I attempt to parse the url, no data for the product is retrieved.
Possibly an issue with their code. If your parser identifies 'mainEntityOfPage', does it begin parsing inside it?
referenced this issue on May 29, 2017
added a commit on May 30, 2017
@danjaywing FYI: I just published the next major release with improved support for additional formats. I did a rough check with your list of example files. They all yield results now, though I think there are still some issues with HTML Microdata parsing. I will track these further over at jkphl/rdfa-lite-microdata#6. Thanks again for this valuable set of examples!