Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot load WMS capabilities from WMS provider in QGIS3 from standalone python #25761

Closed
qgib opened this issue Jan 16, 2018 · 17 comments
Closed
Labels
Bug Either a bug report, or a bug fix. Let's hope for the latter! Data Provider Related to specific vector, raster or mesh data providers High Priority Regression Something which used to work, but doesn't anymore

Comments

@qgib
Copy link
Contributor

qgib commented Jan 16, 2018

Author Name: David Marteau (@dmarteau)
Original Redmine Issue: 17866
Affected QGIS version: master
Redmine category:data_provider


When trying to load projects with wms or wcs layer defined from python: loading capabilities fail the first time with timeout error, thus preventing creating any project or layer from python.

The problem has been verified in Debian, Ubuntu and fresh OSX build from master branch.

The effect is that you cannot instanciate properly a QgsProject with a WMS layer.

This problem has been verified on Debian stretch, Ubuntu Xenial, OXS 10.11 from fresh master.

This can be tested by simply instanciate a QgsProject in python (in a standalone python script) from a .qgs project having a single wms layer: it will fail to instanciate the layer.

We have somehow been able to reduce the problem to the loading of the wmsprovider. The following python code provide a minimal example for reproducing the problem:

Note that when you run this script, clear the Qgis network cache between two invocations.

import os
import logging

from qgis.core import QgsApplication, QgsMessageLog
from qgis.core import QgsProject, QgsProviderRegistry

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

1. No Display
os.environ['QT_QPA_PLATFORM'] = 'offscreen'

qgis_application = QgsApplication([], False )
qgis_application.setPrefixPath('/usr/local', True)
qgis_application.initQgis()

1. Log any messages
def writelogmessage(message, tag, level):
    arg = '{}: {}'.format( tag, message )
    if level == QgsMessageLog.WARNING:
        logger.warning(arg)
    elif level == QgsMessageLog.CRITICAL:
        logger.error(arg)
    else:
        logger.info(arg)

messageLog = QgsApplication.messageLog()
messageLog.messageReceived.connect( writelogmessage )

wmsuri = ("contextualWMSLegend=0&crs=EPSG:4326&dpiMode=7&featureCount=10&format=image/jpeg"
          "&layers=s2cloudless&styles&tileMatrixSet=s2cloudless-wmsc-14"
          "&url=http://tiles.maps.eox.at/wms?" )

1. GetCapabilities Fail with Request Timeout !!!
provider = QgsProviderRegistry.instance().createProvider( "wms", wmsuri )

1. Run again => works !!!
provider = QgsProviderRegistry.instance().createProvider( "wms", wmsuri )

qgis_application.exitQgis()
del qgis_application

On failure, the logs will show the following errors

WMS: Download of capabilities failed: Operation canceled
Network: Network request https://tiles.maps.eox.at/wms?SERVICE=WMS&REQUEST=GetCapabilities timed out

After some investigation we have found that:

  • Sending a request from python directly with the QgsNetworkAccessManager and QEventLoop works as expected.
  • Calling from python a simple C++ method that do the same call fail: the QgsNetworkAccessManager never a send a finished signal from a successfully sent request
  • When tracing the c++ execution we have checked that the event loop has the running status.

@qgib
Copy link
Contributor Author

qgib commented Jan 16, 2018

Author Name: Alessandro Pasotti (@elpaso)


I suspect that a main event loop is required for a proper execution of the signals and slots in the network access manager, please have a look to the attached working test case.

But I still don't know why :(


  • 11939 was configured as test_provider_wms.py
  • status_id was changed from Open to Feedback
  • assigned_to_id was configured as Alessandro Pasotti

@qgib
Copy link
Contributor Author

qgib commented Jan 16, 2018

Author Name: David Marteau (@dmarteau)


I do not think this is totally right since after the first attempt (which fail) then things appears to be ok: IHMO there is some kind of init state condition concerning event loop which is not set on first run.

Anyway, runnig app.exec will defeat the purpose of embedding since it introduce a blocking state.

@qgib
Copy link
Contributor Author

qgib commented Jan 16, 2018

Author Name: Alessandro Pasotti (@elpaso)


Well, I don't really know what you're trying to do, but in QGIS application you do have a running main event loop and this is not an issue.

If you have a standalone non-GUI script that must run without an event loop you need a blocking condition or your script might terminate before the network reply comes back.

In any case you need an event loop (main or local).

Btw, in theory this whole thing should work without a main event loop because in the wms capabilities code there is a local event loop running, whose purpose is exactly to turn an asynchronous operation into a blocking one, what I don't understand is why it works the second time, I believe that the answer is in Qt QNetworkAccessManager core code, I don't see any obvious issue in QgsNetworkAccessManager or in the WMS code.

Just curious, was it working in 2.x ?


  • assigned_to_id removed Alessandro Pasotti

@qgib
Copy link
Contributor Author

qgib commented Jan 16, 2018

Author Name: David Marteau (@dmarteau)


Alessandro Pasotti wrote:

Well, I don't really know what you're trying to do, but in QGIS application you do have a running main event loop and this is not an issue.

This is true for desktop. This is not true for server: in the fcgi implementation the main loop is a fcgi loop which is not related
to the QGIS application loop.

If you have a standalone non-GUI script that must run without an event loop you need a blocking condition or your script might terminate before the network reply comes back.

Yes

In any case you need an event loop (main or local).

Yes

Btw, in theory this whole thing should work without a main event loop because in the wms capabilities code there is a local event loop running, whose purpose is exactly to turn an asynchronous operation into a blocking one

Exactly

, what I don't understand is why it works the second time,

In fact, this is the way it should work in the normal way: as you said it, because of the the local event, it should not block the first time.

I believe that the answer is in Qt QNetworkAccessManager core code, I don't see any obvious issue in QgsNetworkAccessManager or in > the WMS code.

Me neither, I have spent some time to check the c++ code and I did not see anything suspicious.
Curiously, when I wrote a test case in python directly with the QNetworkAccessManager (and a local QEventLoop) to mimic what's in the
c++ code then it work flawlessly.

Just curious, was it working in 2.x ?

This has to be checked, and it has to be checked also in the fcgi implementation of the Qgis3 server.

@qgib
Copy link
Contributor Author

qgib commented Jan 17, 2018

Author Name: Alessandro Pasotti (@elpaso)


I've translated the (failing) python test to C++ and it passes ... that's wierd.

    # This is failing
    def testWmsGetCapabilitiesLayer(self):
        self.layer = QgsRasterLayer(self.wmsuri, 'mywms', 'wms')
        self.assertTrue(self.layer.isValid())

     // This is passing:
     QgsRasterLayer layer( QStringLiteral( "contextualWMSLegend=0&crs=EPSG:4326&dpiMode=7&featureCount=10&format=image/jpeg"
                                            "&layers=s2cloudless&styles&tileMatrixSet=s2cloudless-wmsc-14"
                                            "&url=http://tiles.maps.eox.at/wms?" ),
                          QStringLiteral( "mywms" ),
                          QStringLiteral( "wms" ));
      QVERIFY(layer.isValid());

That's pointing us towards the initialization code in QGIS application: initQGIS doesn't do all the stuff...


  • 11946 was configured as testqgswmsprovider17866.cpp

@qgib
Copy link
Contributor Author

qgib commented Jan 17, 2018

Author Name: Alessandro Pasotti (@elpaso)


The test works fine in 2.x so it's a regression


  • regression was changed from 0 to 1

@qgib
Copy link
Contributor Author

qgib commented Jan 17, 2018

Author Name: Alessandro Pasotti (@elpaso)


  • priority_id was changed from Normal to High

@qgib
Copy link
Contributor Author

qgib commented Jan 17, 2018

Author Name: Alessandro Pasotti (@elpaso)


I did some more research with different endpoints and I've discovered something really confusing, I now believe that we hit a Qt bug.
Here is my evidence:
I've tested several endpoints (qgis-server, geoserver and remote wms), this is the summary:

  • qgis server on http://127.0.0.1 works
  • qgis server (same as above) on http://localhost fails
  • geoserver on a local docker with defined hostname fails
  • geoserver on a local docker (same as above) with the docker IP works
  • any remote hostname fails
  • any remote IP address works

For example (same server):

        fails = 'contextualWMSLegend=0&crs=EPSG:4326&dpiMode=7&featureCount=10&format=image/png&layers=track_1&styles=&url=https://www.movimentolento.it/resource/wms/'
        works = 'contextualWMSLegend=0&crs=EPSG:4326&dpiMode=7&featureCount=10&format=image/png&layers=track_1&styles=&url=http://85.94.200.131/resource/wms/'

So, we can assume that there is something going bad with hostname resolution, every time I use the IP address it works and it fails when using the hostname (of course I verified with a browser that all the getcapabilities URLs were working).

@qgib
Copy link
Contributor Author

qgib commented Jan 18, 2018

Author Name: Alessandro Pasotti (@elpaso)


Further info: a DNS look up prior to the connection works (Note: Since Qt 4.6.3 QHostInfo is using a small internal 60 second DNS cache for performance improvements.)

    def testLookup(self):
        self.loop = QEventLoop()
        def _p(info):
            self.loop.quit()
            self.assertTrue(info.addresses()[0].toString() == '85.94.200.130')
        QHostInfo.lookupHost('www.movimentolento.it', _p)
        self.loop.exec_()
        del(self.loop)
        # Without the DNS lookup the following fails
        self.layer = QgsRasterLayer(self.wmsuri_remote_ml_hostname, 'mywms', 'wms')
        self.assertTrue(self.layer.isValid())


@qgib
Copy link
Contributor Author

qgib commented Jan 18, 2018

Author Name: David Marteau (@dmarteau)


Allessandro, thanks for investigating this.

So you think that it is related to DNS ?

For information, In my tests, wathever the url given in the first call but all subsequent call on différent hostname were successful: I have tested this by first calling with a 'localhost' - which has no wms service at all, but fail with timeout, then doing other calls to - say 'tiles.maps.eox.at' - are all successfull.

So could we have a scenario where the DNS resolver is not correctly available/initialized for network opération at the first call and available for all subsequent calls ?

@qgib
Copy link
Contributor Author

qgib commented Jan 18, 2018

Author Name: Alessandro Pasotti (@elpaso)


So you think that it is related to DNS ?

I suspect it is related with QHostInfo::lookupHost() which is called by QAbstractSocketPrivate, there are a lot of signal-slot connections that being in a separate thread are probably bound to the GUI thread loop (which isn't running in our test cases) for a correct dispatch and delivery. Still this doesn't explain why subsequent calls on different hostnames works after the first fails.

For information, In my tests, wathever the url given in the first call but all subsequent call on différent hostname were successful: I have tested this by first calling with a 'localhost' - which has no wms service at all, but fail with timeout, then doing other calls to - say 'tiles.maps.eox.at' - are all successfull.
So could we have a scenario where the DNS resolver is not correctly available/initialized for network opération at the first call and available for all subsequent calls ?

I admit that I don't have a clear picture but yes, I think that DNS lookup initialization might be involved.

Would you please confirm that using an IP address instead of an hostname works?

@qgib
Copy link
Contributor Author

qgib commented Jan 18, 2018

Author Name: Alessandro Pasotti (@elpaso)


I understand that this is just a workaround, but calling

QHostInfo.fromName('') # Yes: it also works whith empty/invalid URL

initialize the system and then it works as expected, internally it dynamically load a bounch of system libraries (on unix).

@qgib
Copy link
Contributor Author

qgib commented Jan 24, 2018

Author Name: Alessandro Pasotti (@elpaso)


Lowering priority because we have a workaround.


  • priority_id was changed from High to Normal

@qgib
Copy link
Contributor Author

qgib commented Feb 22, 2018

Author Name: Giovanni Manghi (@gioman)


Making it "high" as is a regression.


  • operating_system was changed from All to
  • status_id was changed from Feedback to Open
  • priority_id was changed from Normal to High

@qgib
Copy link
Contributor Author

qgib commented Dec 28, 2018

Author Name: Paolo Cavallini (@pcav)


Could you please check again on current release?
Thanks.

@qgib
Copy link
Contributor Author

qgib commented Dec 28, 2018

Author Name: Giovanni Manghi (@gioman)


Paolo Cavallini wrote:

Could you please check again on current release?
Thanks.

Please change status to "feedback" when needed.


  • status_id was changed from Open to Feedback

@qgib
Copy link
Contributor Author

qgib commented Jan 14, 2019

Author Name: Nyall Dawson (@nyalldawson)


  • status_id was changed from Feedback to Closed
  • resolution was changed from to no timely feedback

@qgib qgib closed this as completed Jan 14, 2019
@qgib qgib added Bug Either a bug report, or a bug fix. Let's hope for the latter! High Priority Data Provider Related to specific vector, raster or mesh data providers Regression Something which used to work, but doesn't anymore labels May 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Either a bug report, or a bug fix. Let's hope for the latter! Data Provider Related to specific vector, raster or mesh data providers High Priority Regression Something which used to work, but doesn't anymore
Projects
None yet
Development

No branches or pull requests

1 participant