Description: YAML listing of Mexico's main cities (2800+), organized by state.
Homepage: http://rha7dotcom.blogspot.com/
Clone URL: git://github.com/rha7dotcom/ciudadesdemexico.git
README.textile

The Purpose

I often find myself looking for a list of the states and cities of Mexico. However, almost no open source software includes such a list. Usually the only pre-loaded one is the list of US states (only rarely are the cities included as well).

The Source

SEPOMEX (the Mexican Postal Service, by its Spanish acronym) has a web page, Descarga Códigos Postales Mexicanos, that lets you download a full listing in several formats, including CSV and TXT (pipe ‘|’ separated). Unfortunately, the CSV file is damaged (not properly formatted), but the TXT one is pristine. Fields 3 and 4 are the city and state, respectively. The file you download is encoded in ISO-8859-1, while the script works with UTF-8, so it’s probably wise to convert it first with something like this (the file you download from SEPOMEX is named CPdescarga.txt):

iconv -f iso-8859-1 -t utf-8 CPdescarga.txt > mexico.txt
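
If iconv is not available, the same conversion can be sketched in Ruby itself. This is only an illustration: convert_to_utf8 is a hypothetical helper that is not part of this project, and it assumes the whole file fits in memory.

```ruby
# Hypothetical helper (not part of this project): re-encode a file
# from ISO-8859-1 to UTF-8, mirroring the iconv command above.
def convert_to_utf8(source_path, target_path)
  # Tell Ruby how the bytes on disk are encoded.
  text = File.read(source_path, encoding: "ISO-8859-1")
  # Transcode to UTF-8 and write the result out.
  File.write(target_path, text.encode("UTF-8"))
end
```

A call such as convert_to_utf8("CPdescarga.txt", "mexico.txt") would then play the role of the iconv line.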

The Extractor

The extractor.rb file is a Ruby script that takes the file you download from SEPOMEX, in TXT format, and parses it to extract a listing of Mexican states and cities. The script extracts over 2800 different cities and locations. The output goes to standard output in YAML format, so you can easily incorporate it into your own projects.

The extractor.rb script expects to find the list of cities in the SEPOMEX TXT format, but encoded in UTF-8. You don’t pass any parameters to the extractor, and you will probably want to redirect the output to a YAML file:

./extractor.rb > mexico.yml

Progress text is written to standard error, so it doesn’t interfere with the YAML output.
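
The general approach can be sketched roughly as follows. This is not the actual extractor.rb: extract_cities and its parameters are hypothetical names, and the column positions are an assumption. The text above says fields 3 and 4 hold the city and state but does not say whether counting starts at zero or one, so the sketch assumes zero-based indices and leaves them adjustable.

```ruby
require "yaml"
require "set"

# Hypothetical sketch, not the real extractor.rb.
# city_field and state_field are ASSUMED zero-based column indices.
# progress is any IO-like object; it defaults to standard error so
# that progress text never mixes with the YAML on standard output.
def extract_cities(lines, city_field: 3, state_field: 4,
                   progress: $stderr, progress_every: 10_000)
  # Each state maps to a set, so repeated cities are stored only once.
  by_state = Hash.new { |hash, state| hash[state] = Set.new }

  lines.each_with_index do |line, index|
    count = index + 1
    progress.puts "#{count} lines read" if (count % progress_every).zero?

    fields = line.chomp.split("|", -1)
    city = fields[city_field].to_s.strip
    state = fields[state_field].to_s.strip
    next if city.empty? || state.empty? # skip lines without both values

    by_state[state] << city
  end

  # Return a plain hash with states and cities sorted alphabetically.
  by_state.sort.map { |state, cities| [state, cities.to_a.sort] }.to_h
end
```

Under those assumptions, puts extract_cities(File.foreach("mexico.txt", encoding: "UTF-8")).to_yaml would print the grouped listing to standard output while the progress lines go to standard error.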

Other Things in Project

The textilize.rb program is just a snippet turned into a command-line utility that converts a .textile file to .html.