A Microdata Extractor for PHP 5
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


To test this:

1. Make sure you have PHP 5 with Tidy ( http://www.php.net/manual/en/tidy.installation.php ) and MB_String ( http://ar.php.net/manual/en/mbstring.installation.php ) support.
2. Move the folder to an apache dir and access the /examples folder with your browser.

This has been tested so far in PHP 5.3.4.

Known Issues:

. There might be problems with non-ASCII-extending Characted Encodings. If you find this issue report it with a clear explanation of how the problem encoding works or even better yet, some code.
. Base treating of http://www.example.com/dir/ and http://www.example.com/dir is the same. Meaning href="a.jpg" will be translated to http://www.example.com/dir/a.jpg in both cases. I don't know if this is correct or not.


. Add a construct_by_uri() .. (I need some live pages with microdata to test this).
. Add a vocabulary validator framework on top of this.
. I might (I said MIGHT not WILL) port it to C or C++.

Emiliano Martínez Luque

PS: This is way clearer than the Microformat parser I did a couple of years ago, that's because Microdata has way clearer syntax than microformats. I think that Microformats was a great idea but there were some design ideas that were overly complex and it took a lot of code gymnastics to implement them, and I really believe that Microdata is a better spec.   

PS2: If you need to contact me for whatever reason use the contact form in http://www.metonymie.com