Every repository with this icon (
Every repository with this icon (
| name | age | message | |
|---|---|---|---|
| |
README.html | ||
| |
README.textile | ||
| |
extractor.rb | ||
| |
mexico.original.txt | ||
| |
mexico.txt | ||
| |
mexico.yml | ||
| |
textilize.rb |
The Purpose
I usually find myself looking for the list of states and cities of Mexico, however, in almost every software available for use as open source, the mentiones list is not included, usually the only pre-loaded one is the USA listings of states and cities (very rarely but sometimes they include the cities).
The Source
At the SEPOMEX (Mexican Post Service, by it’s spanish acronym), there is a web page Descarga Códigos Postales Mexicanos that allows you to download a full listing in several formats, including CSV and TXT (pipe ‘|’ separated). Unfortunately CSV format is damaged (not properly formatted), but the TXT one is pristine. Fields 3 and 4 are the city and state respectively. The original file you download is in iso-8859-1 format, while the script works with utf-8, so it’s probably wise to do something like this before (the file you download from SEPOMEX is named CPdescarga.txt):
iconv -f iso-8859-1 -t utf-8 CPdescarga.txt > mexico.txt
The Extractor
The extractor.rb file is a ruby script that takes this files you download from SEPOMEX, in TXT format, and parses and extracts a listing of mexican states and cities, over 2800 different cities and locations are extracted by the script. The output is to standard out, and in YAML format, so you can easily incorporate it in your own projects.
The extractor.rb script expects to find the list of cities in the TXT format from SEPOMEX, but in utf-8 formatting. You don’t pass any parameters to extractor, and probably will need to pipe the output to a yaml file:
./extractor.rb > mexico.yml
Progress text is output to standard error, so it doesn’t interfer with the yaml output.
Other Things in Project
The textilize.rb program is just a snippet converted to a command line utility to transform a .textile file to .html.








