A Microdata Extractor for PHP 5


To test this:

1. Make sure you have PHP 5 with Tidy and mbstring support.
2. Move the folder into a directory served by Apache and open the /examples folder in your browser.
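
A quick way to confirm step 1 is to ask PHP which extensions are loaded. The snippet below is only a sketch: `missing_extensions()` is a hypothetical helper and not part of this library, and it uses `array()` syntax so that it also runs on PHP 5.3.

```php
<?php
// Hypothetical helper, not part of this library: returns the names of the
// requested extensions that are NOT loaded in the current PHP build.
function missing_extensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// The two extensions named in step 1.
$missing = missing_extensions(array('tidy', 'mbstring'));
if (count($missing) > 0) {
    echo 'Missing extensions: ' . implode(', ', $missing) . "\n";
} else {
    echo "Tidy and mbstring are both available.\n";
}
```

Run it through the same Apache setup as the examples, since the command-line PHP build can load a different set of extensions than the web server's.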

This has been tested so far in PHP 5.3.4.

Known Issues:

. There might be problems with non-ASCII-extending Character Encodings. If you run into this, please report it with a clear explanation of how the problematic encoding works or, better yet, some code.
. Base treating of and is the same. Meaning href="a.jpg" will be translated to in both cases. I don't know if this is correct or not.
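
For the encoding issue, one possible workaround is to re-encode the markup to UTF-8 before handing it to the extractor. This is only a sketch, assuming you already know the source encoding: `to_utf8()` is a hypothetical helper, and nothing here confirms that the extractor then handles the result correctly.

```php
<?php
// Hypothetical helper, not part of this library: re-encode markup to UTF-8
// when the source encoding is already known.
function to_utf8($html, $source_encoding)
{
    return mb_convert_encoding($html, 'UTF-8', $source_encoding);
}

// "hi" encoded as UTF-16LE: each ASCII character is followed by a zero byte,
// which is why UTF-16 does not extend ASCII.
$utf16 = "h\x00i\x00";
$utf8 = to_utf8($utf16, 'UTF-16LE');
```

Any charset declaration inside the markup itself still names the old encoding after conversion, so it may need adjusting as well.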


To do:

. Add a construct_by_uri() (I need some live pages with Microdata to test this).
. Add a vocabulary validator framework on top of this.
. I might (I said MIGHT, not WILL) port it to C or C++.

Emiliano Martínez Luque

PS: This is much cleaner than the Microformats parser I wrote a couple of years ago, because Microdata has a much clearer syntax than Microformats. I think Microformats was a great idea, but some of its design decisions were overly complex and took a lot of code gymnastics to implement. I really believe Microdata is the better spec.

PS2: If you need to contact me for whatever reason use the contact form in 