MapReduce with XML InputFormat.
This code cleans a Wikipedia XML data set and converts it to delimited text. It was used to extract film data from a Wikipedia archive for analysis.
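As a rough illustration of the clean-and-convert step, here is a minimal sketch in plain Java (no Hadoop) that turns one XML record into one delimited line. It is not the code in this repository. The class name PageToDelimited, the helper names, and the record layout (a <page> holding a <title> and a <revision><text>, as in the usual Wikipedia export) are assumptions made for the example; check Sample.xml for the real layout.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: converts a single XML record into one delimited line.
public class PageToDelimited {

    // Returns the text of the first element with the given tag, or "" if absent.
    public static String firstText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    // Removes the delimiter from a field and collapses whitespace runs
    // (including newlines) so each record stays on one line.
    public static String clean(String value, String delimiter) {
        return value.replace(delimiter, " ").replaceAll("\\s+", " ").trim();
    }

    // Assumed record layout: <page><title>..</title><revision><text>..</text></revision></page>
    public static String toDelimitedLine(String pageXml, String delimiter) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(pageXml)));
            String title = clean(firstText(doc, "title"), delimiter);
            String text = clean(firstText(doc, "text"), delimiter);
            return title + delimiter + text;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML record", e);
        }
    }

    public static void main(String[] args) {
        String page = "<page><title>Example Film</title>"
                + "<revision><text>A 1999 film.\nDirected by | someone.</text></revision></page>";
        System.out.println(toDelimitedLine(page, "|"));
    }
}
```

In a MapReduce job, logic like this would normally sit inside the mapper, which receives one XML record at a time and writes the delimited line as its output.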
A sample file, Sample.xml, is provided. If your XML structure is different, change the WikiMR driver program to match it.
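An XML input format for MapReduce typically cuts the input into records using a start tag and an end tag, and those tags are what you would usually adjust when the XML structure changes. The sketch below shows that idea in plain Java, without Hadoop. It is not the WikiMR driver itself. The class name TagRecordSplitter, the method name, and the <page> tags are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of tag-based record splitting.
public class TagRecordSplitter {

    // Returns every substring that begins with startTag and ends with the
    // next endTag, both tags included.
    public static List<String> splitRecords(String xml, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = xml.indexOf(startTag, from);
            if (start < 0) {
                break; // no more records
            }
            int end = xml.indexOf(endTag, start + startTag.length());
            if (end < 0) {
                break; // unterminated record: ignore the tail
            }
            end += endTag.length();
            records.add(xml.substring(start, end));
            from = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<mediawiki><page><title>A</title></page>\n"
                + "<page><title>B</title></page></mediawiki>";
        // Change these two tags if your records use a different element.
        for (String record : splitRecords(xml, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

Note that this simple match is literal: a start tag written as "<page>" will not match an element that carries attributes, such as <page id="1">.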
Have Fun!!!