Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server] Allow better control of the response flow chain from server filters. #45158

Merged
merged 1 commit into from
Nov 3, 2021

Conversation

dmarteau
Copy link
Contributor

@dmarteau dmarteau commented Sep 20, 2021

Description

Allow better control the response flow for server filters.

Consider the following use case:

A plugin need to collect all chunks from the response (i.e a WFS GetFeature request) in order to rebuild the whole response for applying transformation before forwarding the new body.

This filter will clear each chunk in order not to send the wrong data.

While this may be ok most of the time, it becomes a problem when a filter must rebuild the whole response before inspecting or applying transformation to the complete response: in order to reconstruct a response from streamed chunks one must retrieve partial content then clear the body to prevent the original content to be flushed.

As side effect, the flush method is called with empty chunks . In this case this behavior is not desirable since, since some response backend implementation may interpret this as a stream terminator.
Also no subsequent filters should be notified because they will receive nothing, otherwise they have to check the validity of the date and eventually abort processing - this is not really api friendly.

This PR allow filters to control the call chain by implementing new filter callbacks that allow returning a control value for stopping propagation. This allow for a better control of streamed data but also of the whole response flow.

Changes

  • Deprecate bool QgsFilter::onRequestReady(), bool QgsFilter::onSendResponse(), bool QgsFilter::onResponseComplete()
  • Add new methods bool QgsFilter::onRequestReady(), bool QgsFilter::onSendResponse(), bool QgsFilter::onResponseComplete() which return boolean controlling data flow.
  • Handle flow control in QgsFilterResponseDecorator::flush()
  • New Tests

@github-actions github-actions bot added this to the 3.22.0 milestone Sep 20, 2021
@dmarteau dmarteau force-pushed the fix-server-filter-streaming branch 2 times, most recently from 03f5518 to 9f59c7a Compare September 20, 2021 13:21
@elpaso elpaso self-requested a review September 20, 2021 13:26
@Gustry Gustry added the Server Related to QGIS server label Sep 20, 2021
rldhont
rldhont previously approved these changes Sep 20, 2021
@dmarteau
Copy link
Contributor Author

I have made change so that onSendResponse() call sendResponse() and return true for propagation which is the default behavior.

@elpaso
Copy link
Contributor

elpaso commented Sep 21, 2021

@dmarteau , @rldhont I'd like to review this one, please don't merge.

@dmarteau
Copy link
Contributor Author

dmarteau commented Sep 21, 2021

@elpaso, no problem, we are waiting for your review.

@rldhont
Copy link
Contributor

rldhont commented Sep 21, 2021

Hi @elpaso, I was thinking of waiting for a second review anyway. We will wait for your review.

@rldhont rldhont dismissed their stale review September 21, 2021 07:35

Waiting for @elpaso review

@elpaso
Copy link
Contributor

elpaso commented Sep 22, 2021

@dmarteau I have a couple of questions about your sentence:

Doing this, an empty chunk is sent to the network.

Looking at the current code in master, if you call clearBody() (which calls truncate() on QgsFcgiServerResponse::) the buffer is cleared and a subsequent call to ::flush() won't send anything to the wire (except the headers, once) because mBuffer.bytesAvailable() will return false.

So, it seems to me (I haven't tested it though) that by truncating the response, a plugin sendResponse() method has a way to effectively prevent an empty chunk to be sent down to the wire.

This doesn't actually prevent a second filter to be executed, but that's a different issue than the one you mentioned and brings me to the second topic: what is this PR really about? It seems to me that what this PR really does is implementing a way for a filter to abort the execution of the subsequent filters, but there are other simpler ways to achieve the same goal without any change to the current implementation of the server: clearing the body is probably one of those because the subsequent filters won't have anything to process and could exit without doing anything but you could also set a global variable or attach it to any interface object (it's Python after all).

In the new test case you are basically testing if a second filter is executed after the first when propagation is false, here is my implementation of the same test, using current master, no changes required:

    def test_streaming_pipeline(self):
        """ Test streaming pipeline propagation
        """
        try:
            from qgis.server import QgsServerFilter
            from qgis.core import QgsProject
        except ImportError:
            print("QGIS Server plugins are not compiled. Skipping test")
            return

        # create a service for streming data
        class StreamedService(QgsService):

            def __init__(self):
                super().__init__()
                self._response = b"Should never appear"
                self._name = "TestStreamedService"
                self._version = "1.0"

            def name(self):
                return self._name

            def version(self):
                return self._version

            def executeRequest(self, request, response, project):
                response.setStatusCode(206)
                response.write(self._response)
                response.flush()

        class Filter1(QgsServerFilter):

            def sendResponse(self, ):
                request = self.serverInterface().requestHandler()
                request.clearBody()
                request.appendBody(b'A')
                request.sendResponse()
                request.appendBody(b'B')
                request.sendResponse()
                request.propagate = self.propagate

            def responseComplete(self):
                request = self.serverInterface().requestHandler()
                request.appendBody(b'C')

        class Filter2(QgsServerFilter):
            # This should be called only if filter1 propagate
            def sendResponse(self):
                request = self.serverInterface().requestHandler()
                if request.propagate:
                    request.appendBody(b'D')

        serverIface = self.server.serverInterface()
        serverIface.setFilters({})

        service0 = StreamedService()

        reg = serverIface.serviceRegistry()
        reg.registerService(service0)

        filter1 = Filter1(serverIface)
        filter2 = Filter2(serverIface)
        serverIface.registerFilter(filter1, 200)
        serverIface.registerFilter(filter2, 300)

        project = QgsProject()

        # Test no propagation
        filter1.propagate = False
        _, body = self._execute_request_project('?service=%s' % service0.name(), project=project)
        self.assertEqual(body, b'ABC')

        # Test with propagation
        filter1.propagate = True
        _, body = self._execute_request_project('?service=%s' % service0.name(), project=project)
        self.assertEqual(body, b'ABDC')

        serverIface.setFilters({})
        reg.unregisterService(service0.name(), service0.version())

So, either this PR brings nothing new to the table or the test is missing the target (or I am missing the point completely 😄 )

@dmarteau
Copy link
Contributor Author

Looking at the current code in master, if you call clearBody() (which calls truncate() on QgsFcgiServerResponse::) the buffer is cleared and a subsequent call to ::flush() won't send anything to the wire (except the headers, once) because mBuffer.bytesAvailable()

This is specific to the QgsFcgiServerResponse implementation of flush. AMHA This behavior should not depends from a specific implementation.

In the new test case you are basically testing if a second filter is executed after the first when propagation is false, here is my implementation of the same test, using current master, no changes required.

Your test assume that Filter1 knows about Filter2 details by examining the propagate member: this is not the case in real situation where filters plugins may come from totally differents sources (they are plugins after all...) and you cannot assume such cooperative behavior.

what is this PR really about? It seems to me that what this PR really does is implementing a way for a filter to abort the execution of the subsequent filters

This is about managing streamed response in way that filters don't have to assume if the data they receive is valid or not.

Consider the following - real - situation (described in the description of the PR)

A plugin need to collect all chunks from the response (i.e a WFS GetFeature request) in order to rebuild the whole response for applying transformation before forwarding the new body.
This filter will clear each chunk in order not to send the wrong data.
In this case no subsequent filters should be notified because they will receive nothing, otherwise they have to check the validity of the date and eventually abort processing - this is clearly not api friendly and is something that can/should be handled upstream.

@dmarteau
Copy link
Contributor Author

@elpaso by the way I think that there is problem with the actual implementation of QgsFcgiServerResponse::flush() since, afaik, an empty body is still a valid response even for request other than HEAD.

@elpaso
Copy link
Contributor

elpaso commented Sep 22, 2021

Looking at the current code in master, if you call clearBody() (which calls truncate() on QgsFcgiServerResponse::) the buffer is cleared and a subsequent call to ::flush() won't send anything to the wire (except the headers, once) because mBuffer.bytesAvailable()

This is specific to the QgsFcgiServerResponse implementation of flush. AMHA This behavior should not depends from a specific implementation.

The QgsBufferServerResponse works the same, so it's not implementation specific.

In the new test case you are basically testing if a second filter is executed after the first when propagation is false, here is my implementation of the same test, using current master, no changes required.

Your test assume that Filter1 knows about Filter2 details by examining the propagate member: this is not the case in real situation where filters plugins may come from totally differents sources (they are plugins after all...) and you cannot assume such cooperative behavior.

True, but there is a difference in a server environment: you are supposed to check and have complete control about what you install and I've never seen a server admin to wget and deploy a QGIS server plugin without checking it, in my experience server plugins are 100% custom implementations or at least customized or part of a bigger application (like g3wsuite for instance).

what is this PR really about? It seems to me that what this PR really does is implementing a way for a filter to abort the execution of the subsequent filters

This is about managing streamed response in way that filters don't have to assume if the data they receive is valid or not.

Consider the following - real - situation (described in the description of the PR)

A plugin need to collect all chunks from the response (i.e a WFS GetFeature request) in order to rebuild the whole response for applying transformation before forwarding the new body.
This filter will clear each chunk in order not to send the wrong data.
In this case no subsequent filters should be notified because they will receive nothing, otherwise they have to check the validity of the date and eventually abort processing - this is clearly not api friendly and is something that can/should be handled upstream.

Ok, so this PR is not about the "empty chunk marking the end" issue but about how to allow a plugin to break the filters chain.

I agree that it would be nice to have but I'd like to discuss it more:

  1. don't you think it should be applied to all three method (requestReady, sendResponse and responseComplete) ?
  2. can we do that in a way that doesn't introduce a new on... set of methods? I didn't check how/if it could be done and what wold it be returned from python if python returns nothing.

If it wasn't clear enough, I am ok with the implementation, what I don't like is the scenario where we have:

  • requestReady
  • sendResponse (deprecated)
  • onSendResponse <<< notice the on
  • responseComplete

The onSendResponse looks like an alien because:

  • starts with on
  • returns something

So, if there is no way to handle this without new names, how about changing (and deprecating) all three?

@dmarteau
Copy link
Contributor Author

dmarteau commented Sep 22, 2021

The QgsBufferServerResponse works the same, so it's not implementation specific.

I would rather say that QgsBufferServerResponse is just another specific implementation :-p, the actual api allow implementating custom QgsServerResponse.

you are supposed to check and have complete control about what you install

100% ok with that, but that do nat prevent that you may install different unrelated plugins. Here at 3liz we provides plugins for different unrelated purposes. One must check is that there is no incompatible plugins (i.e plugins that applies on same requests).

I didn't check how/if it could be done and what wold it be returned from python if python returns nothing.

In that case python return None and that is translated to false

The onSendResponse looks like an alien

They was several choices with the constraint do not break existing plugin implementations:

  1. Keep sendResponse and add a return value: that is, actual filters would return false and thus in order not to introduce some api break, one should inverse the logic (true: stop propagation, false: continue`) which, imho, seems awkward.
  2. Introduce a new method that return a value for breaking the chain and deprecate (or not) the other method.
  3. Set some variable that keep the propagate state. I'm really not in favor of this solution since it may introduce side effects if not reseted properly and would favor a more functional approach wich does not keep state under the hood.

don't you think it should be applied to all three method (requestReady, sendResponse, requestComplete)

Not really, I cannot think of any functional reason to block the chain on requestReady and requestComplete. Some plugins may just doing some monitoring at the end of the chain. In fact, the main reason to block the sendResponse is to prevent inneficient call when there is nothing to process.

So, if there is no way to handle this without new names, how about changing (and deprecating) all three?

Interesting, could your elaborate ?

@elpaso
Copy link
Contributor

elpaso commented Sep 22, 2021

They was several choices with the constraint do not break existing plugin implementations:

1. Keep `sendResponse` and add a return value: that is, actual filters would return `false` and thus in order not to introduce some api break, one should inverse the logic (`true`: stop propagation, `false`: continue`) which, imho, seems awkward.

I think I like this one, we just need to document it, and it won't break anything.

2. Introduce a new  method that return a value for breaking the chain and deprecate (or not) the other method.

3. Set some variable that keep the propagate state. I'm really not in favor of this solution since it may  introduce side effects if not reseted properly and would favor a more functional approach wich does not keep state under the hood.

don't you think it should be applied to all three method (requestReady, sendResponse, requestComplete)

Not really, I cannot think of any functional reason to block the chain on requestReady and requestComplete. Some plugins may just doing some monitoring at the end of the chain. In fact, the main reason to block the sendResponse is to prevent inneficient call when there is nothing to process.

Well, I can think of cases where we would want that, say for requestReady that it's an auth plugin that will issue a redirect: no need to process further.

So, if there is no way to handle this without new names, how about changing (and deprecating) all three?

Interesting, could your elaborate ?

I was meaning: add on to all three methods and deprecate the others, but at the end of the day I think I prefer option 1, returning true to break just need to be documented and it would be fine, the advantage is no API break and just a small documentation change.

@dmarteau
Copy link
Contributor Author

I think I prefer option 1, returning true to break just need to be documented and it would be fine, the advantage is no API break and just a small documentation change.

As matter of fact, this was my first choice when considering this PR, but I was pretty sure that would cause problem because of the inversion of the logic :-)

that it's an auth plugin that will issue a redirect: no need to process further.

Good point.

Ok, so if summarize:

  • No new method.
  • Add bool return value to all three original methods
  • Return true for do block, false otherwise

@elpaso
Copy link
Contributor

elpaso commented Sep 22, 2021

I think I prefer option 1, returning true to break just need to be documented and it would be fine, the advantage is no API break and just a small documentation change.

As matter of fact, this was my first choice when considering this PR, but I was pretty sure that would cause problem because of the inversion of the logic :-)

that it's an auth plugin that will issue a redirect: no need to process further.

Good point.

Ok, so if summarize:

* No new method.

* Add bool return value to all three original methods

* Return `true` for `do block`, `false` otherwise

This is technically an API break even if I doubt that anyone has ever implemented server plugins in C++ (besides me of course, but they are part of QGIS so no problems), I think we could make an exception.

Sounds good to me! Thank you for your patience 😺

@dmarteau
Copy link
Contributor Author

dmarteau commented Sep 22, 2021

Arg, I was wrong, SIP requires return value to be explicit:

718: TypeError: invalid result from Filter0.requestReady(), a 'bool' is expected not 'NoneType'
718: CMake Error at PyQgsServerPlugins.cmake:15 (MESSAGE):
718:   Test failed: Child aborted
718: 
718: 
1/1 Test #718: PyQgsServerPlugins ...............***Failed    0.77 sec

@dmarteau
Copy link
Contributor Author

dmarteau commented Sep 22, 2021

@elpaso maybe we can go with your first idea:

  • Creates three new methods prefixed with on .
  • deprecate (formally) the others.
  • by default makes the on methods call the legacy ones in order to prevent api break.

@dmarteau
Copy link
Contributor Author

Pushed new proposal.

Copy link
Contributor

@elpaso elpaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the (minor) comments have been addressed.

This change requires documentation updates of course, (and please don't forget the cookbook).

@@ -77,9 +77,31 @@ class SERVER_EXPORT QgsServerFilter
* getFeature requests, sendResponse() might have been called several times
* before the response is complete: in this particular case, sendResponse()
* is called once for each feature before hitting responseComplete()
*
* Note that this function is called recursively call if response is flushed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Note that this function is called recursively call if response is flushed
* Note that this function is called recursively if response is flushed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment has been removed, because it is false: calling QgsRequestHandler::sendResponse() call the initial QgsServerResonse::flush() and not the decorated one, so there is no recursion.

print("QGIS Server plugins are not compiled. Skipping test")
return

# create a service for streming data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# create a service for streming data
# create a service for streaming data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -48,13 +48,19 @@ Returns the :py:class:`QgsServerInterface` instance
%Docstring
Method called when the :py:class:`QgsRequestHandler` is ready and populated with
parameters, just before entering the main switch for core services.

This method is considered as deprecated and :py:func:`onRequestReady` should
be preferred.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
be preferred.
be used instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

%End

virtual void responseComplete();
%Docstring
Method called when the :py:class:`QgsRequestHandler` processing has done and
the response is ready, just after the main switch for core services
and before final sending response to FCGI stdout.

This method is considered as deprecated and :py:func:`onResponseComplete` should
be preferred.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
be preferred.
be used instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -65,8 +71,48 @@ the :py:func:`~QgsServerFilter.responseComplete` plugin hook. For streaming serv
getFeature requests, :py:func:`~QgsServerFilter.sendResponse` might have been called several times
before the response is complete: in this particular case, :py:func:`~QgsServerFilter.sendResponse`
is called once for each feature before hitting :py:func:`~QgsServerFilter.responseComplete`

This method is considered as deprecated and :py:func:`onSendResponse` should
be preferred.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
be preferred.
be used instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

/**
* Method called when the QgsRequestHandler processing has done and
* the response is ready, just after the main switch for core services
* and before final sending response to FCGI stdout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this specific to FCGI? Maybe better remove the mention to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not specific to FCGI. Removed.

@dmarteau
Copy link
Contributor Author

@elpaso Before merging, may be we can deprecate officially and use Q_NOWARN_DEPRECATED_PUSH for not triggering warnings ?

@elpaso
Copy link
Contributor

elpaso commented Sep 23, 2021

@elpaso Before merging, may be we can deprecate officially and use Q_NOWARN_DEPRECATED_PUSH for not triggering warnings ?

Yes, I think so, add a note that it will be removed in QGIS 4 (we also have a github issue tracker for QGIS 4 API changes).

I think there are also annotations for deprecated "since" and other stuff, just have a look to the existing deprecations.

@elpaso
Copy link
Contributor

elpaso commented Sep 23, 2021

we also have a github issue tracker for QGIS 4 API changes

Do you have a link ?

https://github.com/qgis/qgis4.0_api/issues/

@dmarteau dmarteau changed the title [server] Allow filters to control streamed response [server] Allow better control of the response flow chain from server filters. Sep 23, 2021
@dmarteau
Copy link
Contributor Author

Updated description

@dmarteau
Copy link
Contributor Author

@elpaso should the documentation target 3.22 ?

Concerning this PR, for my part, It seems that I have nothing more to add/change.

@elpaso
Copy link
Contributor

elpaso commented Sep 23, 2021

This seems to me more a feature than a bugfix, I'd probably wait for feature freeze to be lifted and add it to the next release.

@elpaso elpaso modified the milestones: 3.22.0, 3.24 (Feature) Sep 23, 2021
@elpaso elpaso added the Merge After Thaw For approved PRs which are ready to be merged as soon as master is thawed label Sep 23, 2021
@dmarteau
Copy link
Contributor Author

@elpaso

There an actual issue in server filter ATM and may be this is the right place to discuss it:

There is actually no way to inspect the loaded project before the request is processed. This may be a problem when one wants to get some metadata from the loaded project and modify the request parameter accordingly or inspect the project before processing the request. This is actually a recurrent problem we are facing when developping plugins.

The actual workflow is

No project loaded from here

  • onRequestReady() <- parameters may be modified here, eventually modifying project source parameter

Project is loaded

May be there is a missing method here for inspection of the project before entering request processing

  • onProjectReady( QgsProject* project )

Project is available from QgsProject::instance

  • onSendResponse() <- Request is beeing processed
  • onResponseComplete() <- Request has been processed

What do you think ?

@elpaso
Copy link
Contributor

elpaso commented Sep 30, 2021

@dmarteau I probably don't understand: why don't you load the project (from cache possibly) in onRequestReady() if you want to inspect it?

@dmarteau
Copy link
Contributor Author

I probably don't understand: why don't you load the project (from cache possibly) in onRequestReady() if you want to inspect it?

  1. Because this is not efficient, the project will be loaded twice
  2. Because this does not work when the project is directly passed to QgsServer::handleRequest. In this case the cache Manager is never sollicited (worst if you call if from the cache manager, if will be loaded twice).

@elpaso
Copy link
Contributor

elpaso commented Sep 30, 2021

I probably don't understand: why don't you load the project (from cache possibly) in onRequestReady() if you want to inspect it?

1. Because this is not efficient, the project will be loaded twice

Not if you load it from the cache, the cache is accessible from the server interface.

2. Because this does not work when the project is directly passed to `QgsServer::handleRequest`. In this case the cache Manager is never sollicited (worst if you call if from the cache manager, if will be loaded twice).

What prevents you from caching the project before passing in to handleRequest?

@dmarteau
Copy link
Contributor Author

What prevents you from caching the project before passing in to handleRequest?

In this case I do not see the point to pass directly a QgsProject.

Too make a long story short: the cache used is not necessarily the Qgis server implementation. The actual cache implementation suffer too many limitations (and I will make soon a QEP about it).

Embedders of qgis server may rely on alternative implementation or even not using any cache at all (think about project generated dynamically by code) and plugins should not be aware of this as they must work in all cases.

@elpaso
Copy link
Contributor

elpaso commented Sep 30, 2021

@dmarteau mind discussing these topic in a QEP? I don't think that a PR is the right place.

@dmarteau
Copy link
Contributor Author

@elpaso QEP available

@elpaso elpaso added the Frozen Feature freeze - Do not merge! label Oct 1, 2021
@nyalldawson nyalldawson removed Merge After Thaw For approved PRs which are ready to be merged as soon as master is thawed Frozen Feature freeze - Do not merge! labels Oct 22, 2021
@elpaso elpaso merged commit dc3e9eb into qgis:master Nov 3, 2021
@rldhont rldhont added API API improvement only, no visible user interface changes Changelog Items that are queued to appear in the visual changelog - remove after harvesting labels Nov 3, 2021
@zacharlie zacharlie added ChangelogHarvested This PR description has been harvested in the Changelog already. and removed Changelog Items that are queued to appear in the visual changelog - remove after harvesting labels May 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API API improvement only, no visible user interface changes ChangelogHarvested This PR description has been harvested in the Changelog already. Server Related to QGIS server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants