avoidance for xpath predicate error #6

Open
wants to merge 1 commit into
base: master
from

junichim commented Nov 2, 2018

I use trim_osc.py to update OpenStreetMap data.
I encountered an XPath error with a changes.osc.gz file. The error is below.

osm@map:~/tmp/osmosis_test3$ ~/src/regional/trim_osc.py -d gis -b 136.202763 34.482312 136.6553727 34.4756217  -z -v changes.osc.gz trimed.osc.gz
/home/osm/.local/lib/python3.5/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)
Traceback (most recent call last):
  File "/home/osm/src/regional/trim_osc.py", line 161, in <module>
    for nd in root.xpath('//way[@id={}]/nd'.format(row[0])):
  File "src/lxml/etree.pyx", line 1577, in lxml.etree._Element.xpath
  File "src/lxml/xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__
  File "src/lxml/xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result
lxml.etree.XPathEvalError: Error in xpath expression

This changes.osc.gz can be downloaded with osmosis using the following configuration.txt

# The URL of the directory containing change files.
baseUrl=https://planet.openstreetmap.org/replication/hour

# Defines the maximum time interval in seconds to download in a single invocation.
# Setting to 0 disables this feature.
#maxInterval = 3600
#maxInterval = 7200
maxInterval = 14400

and the following state.txt

#Thu Nov 01 23:49:21 JST 2018
sequenceNumber=53060
timestamp=2018-10-02T03\:00\:00Z

I suspect the cause of this error is the large size of the XML file, as in
https://mailman-mail5.webfaction.com/pipermail/lxml/20151125/015531.html

The changes.osc.gz file mentioned above is over 300MB.
But I don't know the real cause (this is only a guess).

So I tested your script on truncated changes.osc.gz data, consisting of only the first 58 lines of the original.
Your script ran without error.

Then I created this PR to change the XPath processing. My idea is to process the way elements without using an XPath predicate.

Please consider and check this PR.
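
For readers who hit the same error, here is a minimal sketch of the predicate-free idea described above. It is not the actual diff of this PR. The sample document, the `index_way_nodes` helper and the `rows` list are hypothetical, and the fallback import is there only so the sketch runs where lxml is not installed. The ways are walked once and their nd refs are stored in a dictionary, so no `//way[@id=...]` expression is evaluated.

```python
try:
    from lxml import etree  # the library trim_osc.py uses
except ImportError:  # fallback so the sketch also runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical, tiny stand-in for the contents of changes.osc.gz.
SAMPLE = b"""<osmChange version="0.6">
  <modify>
    <way id="4324115" version="2">
      <nd ref="101"/>
      <nd ref="102"/>
    </way>
    <way id="1675955" version="9">
      <nd ref="201"/>
    </way>
  </modify>
</osmChange>"""


def index_way_nodes(root):
    """Map each way id (as a string) to the list of its nd refs, in one pass."""
    refs_by_way = {}
    for way in root.iter('way'):
        # A way may occur more than once in a change file, so extend the list.
        refs = refs_by_way.setdefault(way.get('id'), [])
        refs.extend(nd.get('ref') for nd in way.findall('nd'))
    return refs_by_way


root = etree.fromstring(SAMPLE)
refs_by_way = index_way_nodes(root)

# Hypothetical stand-in for the way ids returned by the database query.
rows = [(4324115,), (999,)]
nodes = {}
for row in rows:
    # Dictionary lookup replaces root.xpath('//way[@id={}]/nd'.format(row[0])).
    for ref in refs_by_way.get(str(row[0]), []):
        nodes[ref] = True
```

Apart from avoiding the predicate, this scans the document once instead of once per database row, which may also help with large change files.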

Owner

Zverik commented Nov 2, 2018

Thanks for the pull request, junichim. Did you investigate what makes the xpath expression erroneous? Could it be that row[0] returns an empty expression?

junichim commented Nov 2, 2018

Sorry, I hadn't checked whether row[0] is empty.
I checked it just now.

    print("row[0]: ", row[0])
    for nd in root.xpath('//way[@id={}]/nd'.format(row[0])):
        nodes[nd.get('ref')] = True

The result is below.

row[0]:  4324115
Traceback (most recent call last):
  File "/home/osm/src/regional/trim_osc.py", line 156, in <module>
    for nd in root.xpath('//way[@id={}]/nd'.format(row[0])):
  File "src/lxml/etree.pyx", line 1577, in lxml.etree._Element.xpath
  File "src/lxml/xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__
  File "src/lxml/xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result
lxml.etree.XPathEvalError: Error in xpath expression

row[0] is not empty.
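
As an illustration of the hypothesis being ruled out here: an empty value would turn the expression into `//way[@id=]/nd`, which is not valid XPath. A small guard like the sketch below would report that case explicitly. The helper name `way_nd_xpath` is hypothetical and is not part of trim_osc.py.

```python
def way_nd_xpath(way_id):
    """Return the XPath for the nd children of one way, refusing unusable ids."""
    try:
        numeric_id = int(way_id)
    except (TypeError, ValueError):
        # None, '' or any non-numeric value ends up here.
        raise ValueError('unusable way id: {!r}'.format(way_id))
    return '//way[@id={}]/nd'.format(numeric_id)


print(way_nd_xpath(4324115))  # prints //way[@id=4324115]/nd
```

With such a guard, an empty row[0] would fail with a clear ValueError before lxml is called at all, so it could not be confused with the XPathEvalError shown above.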

junichim commented Nov 2, 2018

Sorry, part of my description above was wrong because I mixed up several experiments.

Wrong:

So I tested your script on truncated changes.osc.gz data, consisting of only the first 58 lines of the original.
Your script ran without error.

Right:

So I ran some commands similar to your script, and a simple XPath with a predicate, in a python3 console. With the original changes.osc.gz the error occurs, whereas with the truncated changes.osc.gz data it completed without error. The truncated data consists of only the first 58 lines of the original.

PierreHachard commented Nov 27, 2018

Hello,

Same as @junichim here: I'm trying to use the trim_osc.py script with a bounding box to trim the file changes.osc.gz for daily updates.
I hit the same problem. The script fails on the line for nd in root.xpath('//way[@id={}]/nd'.format(row[0])): with the error below:

row[0] : 1675955
Traceback (most recent call last):
  File "/pathto/regional/trim_osc.py", line 162, in <module>
    for nd in root.xpath('//way[@id={}]/nd'.format(row[0])):
  File "src/lxml/etree.pyx", line 1577, in lxml.etree._Element.xpath
  File "src/lxml/xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__
  File "src/lxml/xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result
lxml.etree.XPathEvalError: Invalid expression

The command I'm trying to run is:
trim_osc.py -d gis -b -5 47 8.3 51.35 -z changes.osc.gz trimmed.osc.gz
(the bounding box covers the north of France)

I'm getting the exact same error and row[0] isn't empty.
My file changes.osc.gz is 72MB.

I also tried running the same command with a smaller file (41MB) that includes the XML line <way id="1675955" version="9" timestamp="2018-11-09T10:06:20Z" uid="9018038" user="kasals" changeset="64319701">, which is the line the script fails on in the bigger file.
To my surprise, the script worked with the smaller file. Do you think it's an error related to XPath?
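
Since the failure seems to depend on file size rather than on the way itself, another option is to stream the file instead of querying a fully loaded tree. The sketch below is only an illustration under assumptions: `collect_way_nodes` and the sample data are hypothetical, it has not been tried against the files discussed here, and it is not what trim_osc.py or this PR does. It relies on `iterparse`, which exists both in lxml and in the standard library, and never calls `xpath`.

```python
import gzip
import io

try:
    from lxml import etree  # the library trim_osc.py uses
except ImportError:  # fallback so the sketch also runs without lxml
    import xml.etree.ElementTree as etree


def collect_way_nodes(source, way_ids):
    """Stream an osmChange document and collect the nd refs of the given ways."""
    wanted = {str(way_id) for way_id in way_ids}
    nodes = {}
    for _event, elem in etree.iterparse(source, events=('end',)):
        if elem.tag == 'way' and elem.get('id') in wanted:
            for nd in elem.findall('nd'):
                nodes[nd.get('ref')] = True
        if elem.tag in ('node', 'way', 'relation'):
            # The object is complete; drop its attributes and children.
            elem.clear()
    return nodes


# Hypothetical in-memory stand-in for changes.osc.gz.
sample = gzip.compress(
    b'<osmChange version="0.6"><modify>'
    b'<way id="1675955" version="9"><nd ref="201"/><nd ref="202"/></way>'
    b'<way id="42" version="1"><nd ref="301"/></way>'
    b'</modify></osmChange>'
)

with gzip.open(io.BytesIO(sample), 'rb') as stream:
    nodes = collect_way_nodes(stream, [1675955])
```

For a real file, `gzip.open('changes.osc.gz', 'rb')` would take the place of the in-memory sample. Note that this only reads data; the script also has to write the trimmed result, so this would not be a drop-in replacement.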

Do you have any news about this issue?

Thanks in advance.

PierreHachard commented Nov 28, 2018

Actually, I tried @junichim's pull request and it seems to do the job, avoiding the error.
Thank you.
