Read the .80 file format, used for 80legs web crawl results, in Python.
The URL and data are UTF-8 decoded.
For people interested in deserializing in other languages, the file format this creates and reads is:

    <classID><versionID><URL-SIZE><URL><DATA-SIZE><DATA>

Note that:
* The last 4 items (<URL-SIZE><URL><DATA-SIZE><DATA>) repeat for each URL/data pair.
* <classID>, <versionID>, <URL-SIZE>, and <DATA-SIZE> are encoded as 32-bit integers.
* The URL is encoded using UTF-8.
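
The layout described above can be sketched as a small reader. This is only an illustration under stated assumptions, not the package's actual implementation: the names `read_80`, `_read_int`, and `_read_exact` are hypothetical, the integers are assumed to be unsigned, the sizes are assumed to count bytes, and the byte order is not stated in the notes above, so it is exposed as a parameter (little-endian by default, which may be wrong for real files).

```python
import io
import struct


def _read_int(stream, byteorder):
    """Read one 32-bit integer; return None at a clean end of stream."""
    raw = stream.read(4)
    if not raw:
        return None
    if len(raw) != 4:
        raise ValueError("truncated 32-bit integer")
    # ASSUMPTION: unsigned integers; byte order is supplied by the caller.
    return struct.unpack(byteorder + "I", raw)[0]


def _read_exact(stream, size):
    """Read exactly `size` bytes or raise if the stream ends early."""
    raw = stream.read(size)
    if len(raw) != size:
        raise ValueError("truncated record")
    return raw


def read_80(stream, byteorder="<"):
    """Parse <classID><versionID> followed by repeated URL/data pairs.

    ASSUMPTION: byteorder "<" (little-endian) is a guess; pass ">" for
    big-endian. Sizes are assumed to be byte counts.
    """
    class_id = _read_int(stream, byteorder)
    version_id = _read_int(stream, byteorder)
    if class_id is None or version_id is None:
        raise ValueError("missing header")

    pairs = []
    while True:
        url_size = _read_int(stream, byteorder)
        if url_size is None:  # no more URL/data pairs
            break
        url = _read_exact(stream, url_size).decode("utf-8")
        data_size = _read_int(stream, byteorder)
        if data_size is None:
            raise ValueError("missing data size")
        data = _read_exact(stream, data_size).decode("utf-8")
        pairs.append((url, data))
    return class_id, version_id, pairs


# Usage: build a tiny in-memory file with one pair and read it back.
url_bytes = "http://example.com/".encode("utf-8")
data_bytes = "héllo".encode("utf-8")
sample = (
    struct.pack("<II", 7, 1)
    + struct.pack("<I", len(url_bytes)) + url_bytes
    + struct.pack("<I", len(data_bytes)) + data_bytes
)
class_id, version_id, pairs = read_80(io.BytesIO(sample))
```

If real files fail to parse with the default, trying `byteorder=">"` is the first thing to check, since absurdly large size values usually indicate the wrong byte order.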