Monitor alternative responses returned from a resource #19

Closed
samperd opened this issue Mar 5, 2015 · 14 comments

@samperd

samperd commented Mar 5, 2015

User Story

As a user, admin
I want to configure a resource to expect a certain result from a "test"
So that I can ensure the service is not only alive, but also returning appropriate values.

Discussion

  • This feature could start with simple TEXT responses so a system can input a value into a simple text file and that value is read by GeoHealthCheck. This could allow many system scripts to simply output a status to a file (or even output a log file) to a WAF.
  • This feature could allow using XPath to define values in XML or HTML (e.g. TITLE=My Org).
  • This feature could also leverage regular expressions.
  • How would one search for an expected result in JSON?
  • Could you validate the output of a service, such as an XML or JSON document?
  • This feature would allow some "quality" checks as well as a service check.
  • This feature would allow you to not only check the presence of say a service or API, but also validate the output of that service.
  • Perhaps a service that is running but returns unexpected results is automatically YELLOW unless otherwise defined.

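The kinds of checks listed above (plain text, regex, XPath, JSON) could be sketched roughly as follows. This is only an illustration using the Python standard library; the function names are hypothetical and are not part of GeoHealthCheck, and `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import json
import re
import xml.etree.ElementTree as ET


def check_text(body, expected):
    """Pass if the expected plain-text value appears in the response body."""
    return expected in body


def check_regex(body, pattern):
    """Pass if the regular expression matches anywhere in the body."""
    return re.search(pattern, body) is not None


def check_xpath(body, path, expected):
    """Pass if the first element found at `path` has the expected text."""
    root = ET.fromstring(body)
    node = root.find(path)
    return node is not None and (node.text or "").strip() == expected


def check_json(body, keys, expected):
    """Pass if walking `keys` through the parsed JSON yields `expected`."""
    value = json.loads(body)
    try:
        for key in keys:
            value = value[key]
    except (KeyError, IndexError, TypeError):
        return False
    return value == expected
```

For example, `check_xpath(page, "./head/title", "My Org")` would cover the TITLE=My Org case for a well-formed XHTML page.
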
@justb4
Member

justb4 commented Sep 16, 2016

I can see the usefulness of this story. The implementation may become complex though. As a suggestion for a minimal implementation: have a text string/keyword that should (not) be present in the response to determine success or failure.

Example 1: testing a WMS request using XML failure responses (using EXCEPTIONS=application/vnd.ogc.se_xml). If the response contains the string <ServiceExceptionReport> the request has failed.

Example 2: testing WFS GetFeature URL: response should contain the keyword FeatureCollection.

The implementation should not be too complex and would cover many use cases. Many URL monitoring services like http://uptimerobot.com/ have this generic feature. A step further is to associate these URL resource type requests to a particular OGC: resource instance within GHC.
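A minimal sketch of such a keyword check, assuming the response body is already available as a string. The function name and the `must_exist` parameter are hypothetical, chosen only to illustrate the "should (not) be present" idea.

```python
def keyword_check(response_text, keyword, must_exist=True):
    """Return True when the keyword's presence matches the expectation.

    must_exist=True  -> pass only if the keyword is in the response
    must_exist=False -> pass only if the keyword is absent
    """
    found = keyword in response_text
    return found if must_exist else not found


# Example 1: a WMS error document should make the check fail.
wms_error = ("<ServiceExceptionReport><ServiceException>bad layer"
             "</ServiceException></ServiceExceptionReport>")
wms_ok = keyword_check(wms_error, "<ServiceExceptionReport>", must_exist=False)

# Example 2: a WFS GetFeature response should contain FeatureCollection.
wfs_response = "<wfs:FeatureCollection></wfs:FeatureCollection>"
wfs_ok = keyword_check(wfs_response, "FeatureCollection")
```
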

@tomkralidis
Member

tomkralidis commented Sep 18, 2016

This would be best applied to WWW:LINK resources given the other types have built in checks which are based on the given OGC standards.

Options for an initial WWW:LINK validation:

  • standard HTTP error code checking is already performed as part of the existing checks
  • add an in_text field (default=None) to the Run Resource model which represents text that must exist in the response to constitute a pass
  • field is optionally populated in the 'Add Resource' form when WWW:LINK is selected
  • when a Run is executed, if in_text is not none then a simple 'foo' in result_text type check is done
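The options above might look roughly like this. The `Resource` dataclass is a hypothetical stand-in for the real model (which is not shown in this thread), and `run_passes` is an illustrative name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Hypothetical stand-in for the model; only the fields needed here."""
    resource_type: str
    url: str
    in_text: Optional[str] = None  # text that must appear in the response


def run_passes(resource, status_code, result_text):
    """Combine the existing HTTP status check with the optional in_text check."""
    # Existing behaviour: HTTP error codes fail the run.
    if status_code >= 400:
        return False
    # New behaviour: only WWW:LINK resources with in_text set get the extra check.
    if resource.resource_type == "WWW:LINK" and resource.in_text is not None:
        return resource.in_text in result_text
    return True
```

Because `in_text` defaults to `None`, resources that never set it behave exactly as before.
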

Thoughts?

@justb4
Member

justb4 commented Sep 19, 2016

Yes, my proposal was for WWW:LINK resources. The other, mainly OGC: types will fetch and parse a GetCapabilities response, AFAICT. What I am seeking are health checks beyond GetCapabilities; sometimes that response may even come from a static file. OGC services can fail for many reasons. Usually one notices on Get requests (WMS GetMap, WFS GetFeature etc.) that the service is "unhealthy" without hard failures, e.g. a blank image. I realize that generic auto-generated Get* requests are tricky to implement, especially for WMS; maybe this could be done interactively with a map view.

Hence, having a WWW:LINK Resource check for a keyword (in_text) in the response text is a good first step towards more detailed health checks, so yes I agree with your proposal. My proposal above also had the reverse test: that the response should not contain the keyword like ServiceException. This is very useful as a generic check for OGC Get* responses, most notably, ServiceException on a WMS GetMap. Later we can add a "parent"-option to WWW:LINK, such that we can couple this to a parent OGC: service and have grouping.

As for implementation: I guess you meant the Resource Model not the Run Model? Adding a field/column in_text to Resource that would only apply to WWW:LINK would be a bit 'specific'. Like in our discussion on Metrics in #43, my suggestion is to add a generic field/column checklist or test_details as a JSON-docstring to Resource. At a later stage we may extend/fill this with OGC-specific Get*-checks. If we give each check-type a specific typeid, this may be a start for a plugin-system for specific checks! Examples:

WWW:LINK with WMS request:

{
  "checklist": [
    {
      "type": "hascontenttype",
      "properties": {
        "contenttype": "image/jpeg"
      }
    },
    {
      "type": "keywordnotexists",
      "properties": {
        "keyword": "ServiceException>"
      }
    }
  ]
}

WFS-specific (url is e.g. WFS GetFeature):

{
  "checklist": [
    {
      "type": "hascontenttype",
      "properties": {
        "contenttype": "text/xml; subtype=gml/2.1.2"
      }
    },
    {
      "type": "keywordexists",
      "properties": {
        "keyword": "FeatureCollection>"
      }
    }
  ]
}

Many more checks can be thought of: minimum file size (to prevent blank images), feature count, etc. For OGC-specific XML-based services this would entail parsing responses according to XML schemas via OWSLib, so there is less need for regexes. But we can start simple with the WWW:LINK keyword check using the checklist method.
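To make the checklist idea concrete, here is a rough sketch of how such a list could be evaluated. Everything here is an assumption for illustration: the `response` dict with `content_type` and `text` keys, the function names, and the `CHECKS` mapping keyed by the type ids from the examples above.

```python
def has_content_type(response, properties):
    return response["content_type"] == properties["contenttype"]


def keyword_exists(response, properties):
    return properties["keyword"] in response["text"]


def keyword_not_exists(response, properties):
    return properties["keyword"] not in response["text"]


# Maps each check type id to the function implementing it.
CHECKS = {
    "hascontenttype": has_content_type,
    "keywordexists": keyword_exists,
    "keywordnotexists": keyword_not_exists,
}


def run_checklist(checklist, response):
    """Return the type ids of all failed checks (empty list means healthy)."""
    failed = []
    for check in checklist:
        func = CHECKS[check["type"]]
        if not func(response, check["properties"]):
            failed.append(check["type"])
    return failed


wms_checklist = [
    {"type": "hascontenttype", "properties": {"contenttype": "image/jpeg"}},
    {"type": "keywordnotexists", "properties": {"keyword": "ServiceException>"}},
]
```

Adding a new check type would then only mean adding one function and one entry in `CHECKS`.
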

Does this make sense? Still a bit new to GHC...

@tomkralidis
Member

@justb4 thanks for the explanation, you're right I meant adding in_text to Resource, not Run. Let's do this as the initial iteration?

Having said this, I'm thinking we can design a plugin system which allows users to:

  • define/install 3rd party plugins to do specific healthchecks against the venerable WWW:LINK resource type
  • define/install 3rd party plugins to override existing healthchecks with specific tests

Putting things in plugins allows more flexibility without creating a meta-language/configuration. Thoughts?
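One possible shape for such a plugin mechanism is a simple registry keyed by resource type, where a later registration overrides an earlier one. This is only a sketch; the `register` decorator, the registry, and the check signatures are hypothetical and do not reflect any existing GHC API.

```python
# Registry of healthcheck functions, keyed by resource type.
_PLUGINS = {}


def register(resource_type):
    """Decorator that installs a healthcheck for a resource type.

    Registering again for the same type overrides the previous check.
    """
    def decorator(func):
        _PLUGINS[resource_type] = func
        return func
    return decorator


@register("WWW:LINK")
def default_link_check(status_code, text):
    # Built-in behaviour: only the HTTP status matters.
    return status_code < 400


@register("WWW:LINK")
def strict_link_check(status_code, text):
    # A third-party plugin overriding the default with a stricter test.
    return status_code < 400 and "ServiceException>" not in text


def run_healthcheck(resource_type, status_code, text):
    """Dispatch to whichever check is currently registered."""
    return _PLUGINS[resource_type](status_code, text)
```
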

@justb4
Member

justb4 commented Sep 21, 2016

An extensible plugin-system together with standard plugins for healthchecks would be valuable.
I am not sure what you mean by "without creating a meta-language/configuration". One of the reasons I was proposing a generic checklist config language has to do with the UI: a user would be able to select specific checks from available plugin-meta info and possibly parameterize these.

Thinking this a bit further and staying close to what I think was GHC's initial aim: to check the health of mainly OGC-services: I think having a "checklist" against any OGC-endpoint would be very valuable. Putting specific checks against WWW:LINKs without a reference to their (OGC) endpoint may be a bit diverting from that goal. Currently the default/minimum "checklist" for an OGC-endpoint is to fetch and parse its Capabilities. But each OGC-service could have an additional list of specific checks. Take a WMS endpoint, via the GUI we may want to select/add checks against GetMap for specific Layers and Bounding boxes for ServiceExceptions.
In the end the Resources themselves would be simple WWW:LINKs/checks, but coupled to their parent OGC endpoint (grouping, see #18). The UI would be the main challenge, with interactive maps etc., but we could start simple: provide specific URLs for an OGC endpoint and select checks against them.

@justb4
Member

justb4 commented Nov 30, 2016

Pending a re-architecture to support an extensive checking system (via plugins), our GHC instances at least would be helped a great deal by a check for any Exception within a WWW:LINK response, as we configure WWW:LINK Resources to access OGC layers. A simple check could be something like the following at https://github.com/geopython/GeoHealthCheck/blob/master/GeoHealthCheck/healthcheck.py#L82 :

            import re  # belongs with the module-level imports

            if resource_type == 'WWW:LINK':
                # getheader() returns None when the header is absent
                content_type = ows.info().getheader('Content-Type') or ''

                # Check content if the response is not an image
                if 'image/' not in content_type:
                    content = ows.read()

                    # Use the page <title> if present, else fall back to the URL
                    title_match = re.search("<title>(.+?)</title>", content)
                    title = title_match.group(1) if title_match else url

                    # Check for any OGC-Exceptions in response
                    except_match = re.search(
                        "ServiceException>|ExceptionReport>", content)

                    del content
                    if except_match:
                        raise Exception("Exception in response: %s"
                                        % except_match.group(0))

We could even add a config setting, WWW_LINK_EXCEPTION_CHECK, to enable the Exception check. I can make the above change, possibly via a new issue, as this would not completely solve the initial user story.
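Gating the check behind that setting could look roughly like this. The `CONFIG` dict, its default of `False`, and the helper name are assumptions for illustration; only the setting name and the regex come from the proposal above.

```python
import re

# Assumed config mapping; off by default so existing installs are unaffected.
CONFIG = {"WWW_LINK_EXCEPTION_CHECK": False}

EXCEPTION_RE = re.compile(r"ServiceException>|ExceptionReport>")


def find_ogc_exception(content_type, content, config=CONFIG):
    """Return the matched exception marker, or None.

    None is returned when the check is disabled, the response is an image,
    or no exception marker is found in the content.
    """
    if not config.get("WWW_LINK_EXCEPTION_CHECK", False):
        return None
    if "image/" in (content_type or ""):
        return None
    match = EXCEPTION_RE.search(content)
    return match.group(0) if match else None
```

The caller would then raise or record a failed Run whenever the function returns a non-None value.
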

@justb4
Member

justb4 commented Dec 20, 2016

@tomkralidis should I make the above change, with WWW_LINK_EXCEPTION_CHECK defaulting to False? For many installations, at least ours, this will be very useful, as OGC exceptions currently go uncaught (HTTP 200) for WWW:LINKs that point at OGC services.

@tomkralidis
Member

@justb4 sorry for the delay. When would an OGC service be registered as a WWW:LINK? We use OWSLib for interacting with OGC services, which properly raises exceptions in Python that we can catch accordingly to assess a given Run.

@justb4
Member

justb4 commented Jan 10, 2017

@tomkralidis in our setup we register individual OGC requests like GetMap and GetFeature as WWW:LINKs. Currently many error situations are not caught. The OGC: Resources are only checked for Capabilities responses, and those can never catch all individual request errors. Just like WWW:LINK responses are checked for a <title>, my idea is to check for any <*Exception> in non-image responses. Not ideal, but for the time being it will catch many errors where the Capabilities check and even the WWW:LINK check return 200 OK (see presentation).

Another option is to introduce an OGC:LINK for individual requests (GetMap, GetFeature, GetObservation, etc.) and always check for Exceptions.

Yet another is to allow a single keyword yes/no match, as many uptime checkers support. The problem is that this requires a DB mod.

@tomkralidis
Member

Perhaps we can keep WWW:LINK as is and introduce an in_text field holding a string which, when matched in the response, throws an exception? Yes, it's a DB mod/addition, which can have a default of None to preserve backwards compatibility. Thoughts, or would something break?

@justb4
Member

justb4 commented Jan 10, 2017

It is not clear to me how to silently upgrade the DB, or whether SQLAlchemy has some mechanism for that. Also, in_text is usually tagged with a yes/no flag saying whether that text should (not) be in the response, e.g. Exception with no. For example, for WMS GetMap (or WCS GetCoverage with TIFF expected) we expect an image but may receive an Exception string in the response. With just in_text we cannot check that condition unless we encode the condition in the column (Exception=no|yes), but that gets somewhat hacky.
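For what it is worth, a silent upgrade on a SQLite backend could be sketched with plain `sqlite3` as below, adding the text column plus a separate yes/no flag rather than encoding the condition in one column. The table name `resource`, the column names, and the assumption that the DB is SQLite are all hypothetical; this does not use or claim any SQLAlchemy migration facility.

```python
import sqlite3


def add_column_if_missing(conn, table, column, ddl):
    """Add a column only when it is not there yet, so re-running is harmless."""
    existing = [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]
    if column not in existing:
        conn.execute("ALTER TABLE %s ADD COLUMN %s %s" % (table, column, ddl))


# Simulate an existing installation with one registered resource.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("INSERT INTO resource (url) VALUES ('http://example.org/wms')")

# NULL in_text means "no text check", which preserves the old behaviour.
add_column_if_missing(conn, "resource", "in_text", "TEXT DEFAULT NULL")
# 1 = text must be present, 0 = text must be absent.
add_column_if_missing(conn, "resource", "in_text_expected", "INTEGER DEFAULT 1")


def text_check_passes(in_text, expected, response_text):
    """Apply the optional keyword check with its yes/no flag."""
    if in_text is None:
        return True
    return (in_text in response_text) == bool(expected)
```

With this layout, a GetMap resource would store `in_text='Exception'` and `in_text_expected=0`.
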

@tomkralidis
Member

ah, ok. Let's go with your proposal in #19 (comment) as first pass and iterate from there?

@justb4
Member

justb4 commented Feb 8, 2017

We now have a generic Exception-check for WWW:LINK resources. I would like to close this issue and resume in the new issue #82 to implement not just alternative responses but generic OGC-specific checks.

justb4 closed this as completed Feb 8, 2017
@samperd
Author

samperd commented Feb 8, 2017

+1
