Motorsport data YAML files, created with Python scripts that scrape and parse data from the Wikipedia API.
I used this Chrome scraping extension to find the proper XPaths for the relevant data on the following Wikipedia pages:
2013 Formula One season
2013 IndyCar Series season
2013 NASCAR Sprint Cup Series
2013 NASCAR Nationwide Series
2013 NASCAR Camping World Truck Series
2013 NHRA Mello Yello Drag Racing Series
After grabbing the appropriate Wikipedia URLs for the relevant race events, I used them to generate the corresponding URLs for Wikipedia's MediaWiki API.
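As a rough illustration of that URL conversion, here is a minimal sketch. The function name `wiki_url_to_api_url` is hypothetical, and the query parameters (`action=query`, `prop=revisions`, `rvprop=content`) are an assumption about which MediaWiki API call fits this job; the actual scripts may build their URLs differently.

```python
from urllib.parse import urlsplit, unquote, urlencode


def wiki_url_to_api_url(article_url):
    """Turn a Wikipedia article URL into a MediaWiki API URL (hypothetical helper).

    Assumes the article URL has the usual /wiki/<Title> form.
    """
    parts = urlsplit(article_url)
    prefix = "/wiki/"
    if not parts.path.startswith(prefix):
        raise ValueError("not a /wiki/ article URL: " + article_url)

    # The page title is everything after /wiki/, with percent-escapes decoded
    # so that urlencode() below does not escape them a second time.
    title = unquote(parts.path[len(prefix):])

    # Assumed parameters: ask for the wikitext of the latest revision as JSON.
    query = urlencode({
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "format": "json",
        "titles": title,
    })
    return f"{parts.scheme}://{parts.netloc}/w/api.php?{query}"


if __name__ == "__main__":
    print(wiki_url_to_api_url("https://en.wikipedia.org/wiki/2013_Daytona_500"))
```

Decoding the title before re-encoding keeps titles with accented characters or punctuation from being escaped twice.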
race_alias_tester.py
This is used to test the API URLs and to work out the parsing rules for extracting the race event aliases. This was a big pain, since Wikipedia articles do not have uniform formatting.
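To give a feel for what such a parsing rule can look like, here is a small sketch. It assumes, purely for illustration, that aliases appear as bold text (`'''...'''`) in the first paragraph of the article's wikitext; the real rules in race_alias_tester.py are not described here and are likely more involved. The function name and the sample text are made up.

```python
import re

# Bold text in wikitext is wrapped in three apostrophes on each side.
BOLD_RE = re.compile(r"'''(.+?)'''")


def extract_bold_aliases(wikitext):
    """Return the bold phrases from the first paragraph, in order, without duplicates.

    Assumption: the bold phrases in the lead paragraph are the event's aliases.
    """
    lead = wikitext.strip().split("\n\n")[0]
    aliases = []
    for match in BOLD_RE.findall(lead):
        alias = match.strip()
        if alias and alias not in aliases:
            aliases.append(alias)
    return aliases


if __name__ == "__main__":
    # Made-up sample input, not real article text.
    sample = (
        "The '''2013 Daytona 500''', also known as the '''Great American Race''', "
        "was a stock car race.\n\n"
        "== Report ==\nSome '''other bold''' text."
    )
    print(extract_bold_aliases(sample))
```

A single rule like this breaks quickly on articles that use templates or bold italics in the lead, which is the kind of inconsistency that makes per-series testing necessary.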
racing_yaml_creator.py
This is the main script. It reads the CSV files, parses the race event aliases from the Wikipedia API, and then creates the corresponding YAML file. Because Wikipedia data is not uniform, the generated YAML files needed some minor manual cleanup.
aliases_csv2yml.py
This is a quick script that creates the YAML file from a CSV file that already contains the race event aliases, as is the case for Formula One.
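The CSV-to-YAML step can be sketched roughly as follows. The column names (`name`, `aliases`), the `;` separator between aliases, and the output layout are all assumptions for illustration; the actual CSV and YAML structures used by aliases_csv2yml.py are not shown here. To stay within the standard library, this sketch writes the YAML by hand and uses `json.dumps` to quote strings, since JSON-style double-quoted strings are also valid YAML scalars.

```python
import csv
import io
import json


def csv_to_yaml(csv_text, name_col="name", alias_col="aliases", sep=";"):
    """Convert CSV text with a name column and an aliases column to YAML text.

    Hypothetical layout: one list entry per race, with a nested list of aliases.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # json.dumps produces a double-quoted string that YAML also accepts.
        lines.append(f"- name: {json.dumps(row[name_col], ensure_ascii=False)}")
        aliases = [a.strip() for a in row[alias_col].split(sep) if a.strip()]
        if aliases:
            lines.append("  aliases:")
            for alias in aliases:
                lines.append(f"    - {json.dumps(alias, ensure_ascii=False)}")
        else:
            lines.append("  aliases: []")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample row for illustration.
    sample_csv = "name,aliases\nMonaco Grand Prix,Monaco GP; Grand Prix de Monaco\n"
    print(csv_to_yaml(sample_csv), end="")
```

Quoting every string avoids surprises with values that YAML would otherwise read as numbers or booleans, at the cost of slightly noisier output.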