Commit

feat(script): add script
fabiancook committed Nov 6, 2014
1 parent 59d1a27 commit 5f8b7ef
Showing 9 changed files with 271 additions and 0 deletions.
1 change: 1 addition & 0 deletions .idea/.name

8 changes: 8 additions & 0 deletions .idea/countries_org-scraper.iml

4 changes: 4 additions & 0 deletions .idea/encodings.xml

4 changes: 4 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

5 changes: 5 additions & 0 deletions .idea/scopes/scope_settings.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

209 changes: 209 additions & 0 deletions .idea/workspace.xml

26 changes: 26 additions & 0 deletions scraper.py
Expand Up @@ -21,3 +21,29 @@
# on Morph for Python (https://github.com/openaustralia/morph-docker-python/blob/master/pip_requirements.txt) and all that matters
# is that your final data is written to an Sqlite database called data.sqlite in the current working directory which
# has at least a table called data.
import scraperwiki
import lxml.html

# Fetch the country listing page.
html = scraperwiki.scrape("http://countrycode.org/")

root = lxml.html.fromstring(html)

i = 0

for tr in root.cssselect("#main_table_blue tbody tr"):
    i += 1
    tds = tr.cssselect("td")

    iso = tds[1].text_content()
    countryCode = tds[2].text_content()

    # The ISO column holds both codes separated by a slash, e.g. "NZ / NZL".
    isoSplit = iso.split('/')

    data = {
        'name': tds[0].text_content().strip(),
        'countryCode': int(countryCode),  # assumes a plain numeric dialing code
        'countryCodeUnique': i,
        'ISO2': isoSplit[0].strip(),
        'ISO3': isoSplit[1].strip()
    }

    # Write each row to data.sqlite, as the comment above notes morph.io expects.
    scraperwiki.sqlite.save(unique_keys=['countryCodeUnique'], data=data)
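Once the scraper has run, the rows it saved can be read back from data.sqlite with Python's built-in sqlite3 module. A minimal sketch, using an in-memory database and a hypothetical sample row standing in for the scraper's real output (the table and column names follow the dictionary built in the loop above):

```python
import sqlite3

# Connect to the output database; against real scraper output this would be
# sqlite3.connect("data.sqlite"), and morph.io expects a table named "data".
conn = sqlite3.connect(":memory:")

# Recreate the schema implied by the scraper's data dictionary and insert
# one hypothetical row for illustration.
conn.execute(
    "CREATE TABLE data (name TEXT, countryCode INTEGER, "
    "countryCodeUnique INTEGER, ISO2 TEXT, ISO3 TEXT)"
)
conn.execute(
    "INSERT INTO data VALUES (?, ?, ?, ?, ?)",
    ("New Zealand", 64, 1, "NZ", "NZL"),
)

# Query the way a consumer of the scraper's output would.
for name, code, unique, iso2, iso3 in conn.execute(
    "SELECT * FROM data WHERE ISO2 = ?", ("NZ",)
):
    print(name, code, iso2, iso3)  # prints: New Zealand 64 NZ NZL
```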
