Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

A Microdata Extractor for PHP 5

branch: master

Fetching latest commit…

Octocat-spinner-32-eaf2f5

Cannot retrieve the latest commit at this time

Octocat-spinner-32 examples
Octocat-spinner-32 itemscope
Octocat-spinner-32 md_extract
Octocat-spinner-32 unit_testing
Octocat-spinner-32 LICENSE
Octocat-spinner-32 README
README
To test this:

1. Make sure you have PHP 5 with Tidy ( http://www.php.net/manual/en/tidy.installation.php ) and MB_String ( http://ar.php.net/manual/en/mbstring.installation.php ) support.
2. Move the folder to an apache dir and access the /examples folder with your browser.

This has been tested so far in PHP 5.3.4.

Known Issues:

. There might be problems with non-ASCII-extending Characted Encodings. If you find this issue report it with a clear explanation of how the problem encoding works or even better yet, some code.
. Base treating of http://www.example.com/dir/ and http://www.example.com/dir is the same. Meaning href="a.jpg" will be translated to http://www.example.com/dir/a.jpg in both cases. I don't know if this is correct or not.

Future:

. Add a construct_by_uri() .. (I need some live pages with microdata to test this).
. Add a vocabulary validator framework on top of this.
. I might (I said MIGHT not WILL) port it to C or C++.

Emiliano Martínez Luque
http://www.metonymie.com

PS: This is way clearer than the Microformat parser I did a couple of years ago, that's because Microdata has way clearer syntax than microformats. I think that Microformats was a great idea but there were some design ideas that were overly complex and it took a lot of code gymnastics to implement them, and I really believe that Microdata is a better spec.   

PS2: If you need to contact me for whatever reason use the contact form in http://www.metonymie.com
Something went wrong with that request. Please try again.