Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken file raises GPXException, XMLSyntaxError, TypeError #10

Closed
johnjohndoe opened this issue Apr 24, 2013 · 5 comments
Closed

Broken file raises GPXException, XMLSyntaxError, TypeError #10

johnjohndoe opened this issue Apr 24, 2013 · 5 comments

Comments

@johnjohndoe
Copy link

When I use the parser with a broken GPX file an exception is raised. Here is the code I use. My first assumption was to catch a GPXException.

368  file = open(filepath)
369  try:
370      gpx = gpxpy.parse(file)
371  except gpxpy.gpx.GPXException:
372      print "GPXException for %s." % filepath
373      return 1

However the main reason is somewhere else.

Parsing points in sometrack.gpx
ERROR:root:expected '>', line 3125, column 29
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gpxpy-0.8.7-py2.7.egg/gpxpy/parser.py", line 209, in parse
    self.xml_parser = LXMLParser(self.xml)
  File "/usr/local/lib/python2.7/dist-packages/gpxpy-0.8.7-py2.7.egg/gpxpy/parser.py", line 107, in __init__
    self.dom = mod_etree.XML(self.xml)
  File "lxml.etree.pyx", line 2734, in lxml.etree.XML (src/lxml/lxml.etree.c:54411)
  File "parser.pxi", line 1578, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:82748)
  File "parser.pxi", line 1457, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:81546)
  File "parser.pxi", line 965, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:78216)
  File "parser.pxi", line 569, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:74472)
  File "parser.pxi", line 650, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75363)
  File "parser.pxi", line 590, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74696)
XMLSyntaxError: expected '>', line 3125, column 29

File "somescript.py", line 370, in extractpoints gpx = gpxpy.parse(file)
File "/usr/local/lib/python2.7/dist-packages/gpxpy-0.8.7-py2.7.egg/gpxpy/__init__.py",
     line 28, in parse raise mod_gpx.GPXException('Error parsing {0}: {1}'
                       .format(xml_or_file[0 : 100], parser.get_error()))
TypeError: 'file' object has no attribute '__getitem__'

Please refer to this Stackoverflow post which already suggests a solution to the problem.

Here is a sample file that produces the syntax error.

<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<gpx version="1.1" creator="nl.sogeti.android.gpstracker" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/gpx/1/1/gpx.xsd" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:gpx10="http://www.topografix.com/GPX/1/0" xmlns:ogt10="http://gpstracker.android.sogeti.nl/GPX/1/0">
<metadata>
<time>2013-03-29T13:52:22Z</time>
</metadata>
<trk>
<name>Track 2013-03-29 14:52</name>
<trkseg>
<trkpt lat="52.48128721585312" lon="13.452307177252663">
<ele>76.0</ele>
<time>2013-03-29T14:09:34Z</time>
<extensions>
<gpx10:speed>5.1076788902282715</gpx10:speed>
<ogt10:accuracy>10.0</ogt10:accuracy>
<gpx10:course>216.4146728515625</gpx10:course></extensions>
</trkpt>
<trkpt lat="52.48077771999975" lon="13.452037039115906">
<ele>78.0</ele>
<time>2013-03-29T14:09:47Z</time>
<extensions>
<gpx10:speed>4.982673168182373</gpx10:speed>
<ogt10:accuracy>10.0</ogt10:accuracy>
<gpx10:course>131.78286743164063</gpx10:course></extensions>
</trkpt>
<trkpt lat="52.480716389960826" lon="13.45214590285023">
<ele>78.0</ele>
<time>2013-03-29T14:09:49Z</time>
<extensions>
<gpx10:speed>4.982266426086426</gpx10:speed>
<ogt10:accuracy>10.0</ogt10:accuracy>
<gpx10:course>129.60655212402344</gpx10:course></extensions>
</trkpt>
<trkpt lat="52.48065528656295" lon="13.452274408986215">
<ele>78.0</ele>
<time>2013-03-29T14:09:51Z</time
@tkrajina
Copy link
Owner

Hi @johnjohndoe thanks for your bug submission. I changed and commited a new version. The parse() method will now fail on invalid XML file and the code where your problem was manifesting is deleted. Also, I deleted the is_valid() method (it there is a exception on parse() -- no need to check is_valid() later).

Hope this is OK for you?

@johnjohndoe
Copy link
Author

@tkrajina Looks good so far.
One thing I would rewrite is how you handle exceptions. I would not handle all errors as generic exceptions but add a specific XMLSyntaxError except block before. This is the error you know of at the point in time.

except XMLSyntaxError as e:
    mod_logging.debug('Error in:\n%s\n-----------\n' % self.xml)
    mod_logging.exception(e)
    raise mod_gpx.GPXException('Error parsing XML: %s' % str(e))
except Exception as e:
    mod_logging.debug('Error in:\n%s\n-----------\n' % self.xml)
    mod_logging.exception(e)
    raise mod_gpx.GPXException('Unknown error: %s' % str(e))

If you can accept a crash on an unknown error you could even omit the generic except block.

@tkrajina
Copy link
Owner

What about this: 8accbce ?

The idea is to have two kinds of gpxpy exceptions, one for invalid XML files and the other for invalid valid-XML-invalid-GPX files. You can choose if to detect all exceptions with GPXException or XML-specific with GPXXMLSyntaxException. If GPXXMLSyntaxException -- you can retrieve the original (which can be lxml or minidom) exception with e.original_exception.

@johnjohndoe
Copy link
Author

@tkrajina I disagree. I find it more complicated to handle multiple GPX*Exceptions as a user of the library. What would be the benefit of yet another wrapper for an exception caused by an internal module not part of the library? All the user needs is the original error message which you wrap into a GPXException. This is good enough imho.

@johnjohndoe
Copy link
Author

@tkrajina tkrajina closed this as completed May 5, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants