Update scraper.py
walinchus committed Jun 2, 2017
1 parent 5ffaf6d commit 456a0fa
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions scraper.py
@@ -31,13 +31,13 @@ def scrapepdf(url):
#so we use try here to stop it breaking the whole thing
try:
#This line tests how many matches we get
-print 'SCHOOL NAME? ', name.text.encode('ascii', 'ignore')
+print 'SCHOOL NAME? ', name.text.encode('ascii', 'ignore')
#There's only one when tested, so let's store the first and only match
#see https://docs.python.org/2/howto/unicode.html
#for more on .encode('ascii', 'xmlcharrefreplace')
record['schoolname'] = schoolname[0].text.encode('ascii', 'xmlcharrefreplace')
-except AttributeError:
-print 'AttributeError - ignored'
+except AttributeError:
+print 'AttributeError - ignored'

#Now the date, which is in <text top="224" left="661" width="147" height="18" font="2"
dateinspected = pdfroot.findall('.//text[@top="224"]')
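
The hunk above relies on two ideas: selecting `<text>` elements by their `top` attribute, and choosing an error handler when encoding to ASCII. The following is a minimal sketch of both, not the scraper's own code. It assumes Python 3 and the standard-library `xml.etree.ElementTree` (the scraper itself is Python 2 and may use a different XML library); `SAMPLE_XML`, the `top="120"` position for the school name, and the helper `first_text` are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of the converted PDF; not taken from a real report.
SAMPLE_XML = (
    '<pdf2xml><page number="1">'
    '<text top="120" left="100" width="200" height="18" font="1">St. Jos\u00e9 School</text>'
    '<text top="224" left="661" width="147" height="18" font="2">2 June 2017</text>'
    '</page></pdf2xml>'
)


def first_text(root, top):
    """Return the text of the first <text> element whose top attribute equals `top`.

    Returns None when nothing matches, so callers can test for a miss
    instead of catching an exception.
    """
    matches = root.findall('.//text[@top="%s"]' % top)
    if not matches:
        return None
    return matches[0].text


pdfroot = ET.fromstring(SAMPLE_XML)

# 'ignore' silently drops characters that ASCII cannot represent.
name = first_text(pdfroot, "120")
ascii_ignore = name.encode('ascii', 'ignore')
print(ascii_ignore)   # b'St. Jos School'

# 'xmlcharrefreplace' keeps them as numeric character references instead.
ascii_xmlref = name.encode('ascii', 'xmlcharrefreplace')
print(ascii_xmlref)   # b'St. Jos&#233; School'

# Same attribute lookup as the dateinspected line in the hunk.
dateinspected = first_text(pdfroot, "224")
print(dateinspected)  # 2 June 2017
```

Returning `None` on a miss is one possible alternative to the `try`/`except AttributeError` pattern in the diff; which is preferable depends on how the rest of `scrapepdf` handles missing fields, which this excerpt does not show.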