Code Architecture

tomkralidis edited this page Jun 21, 2012 · 2 revisions

(n.b. all descriptions below are intended to provide a high level overview of how pycsw is implemented. For full details, please refer to the codebase)

Overview

pycsw is a CGI-based application written in Python that accepts HTTP GET and POST requests as per OGC:CSW 2.0.2. The basic flow of events is:

client request --> pycsw (handle request, produce response) --> server response

pycsw is always invoked through csw.py, which instantiates a server.Csw object and then calls its dispatch() method to handle the request and generate a response.

The server.Csw class sets up the server to handle OGC:CSW requests:

  • set up configuration (default.cfg)
  • initialize the underlying repository (database) connection and queryables model
  • set default HTTP properties (gzip compression)
  • generate GetDomain model
  • load any profile code (e.g. apiso)
  • set up transactions (if specified)
  • set up distributed search (if specified)
  • set up logging (if specified)

At this point, pycsw is ready to handle the request, using server.Csw.dispatch(), which does the following:

  • parse request (GET or POST or SOAP)
  • do basic parameter checking (service, version, request)
  • process the request accordingly
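
The parsing and basic parameter checking steps above can be sketched as follows. This is a minimal illustration for GET requests only, not pycsw's actual code: the function names parse_kvp() and check_basic_params(), the SUPPORTED_REQUESTS set (derived from the method names on this page) and the error strings are all assumptions made for the example.

```python
from urllib.parse import parse_qs

# Hypothetical list of operations, taken from the method names on this page.
SUPPORTED_REQUESTS = {
    "getcapabilities", "describerecord", "getdomain", "getrecords",
    "getrecordbyid", "getrepositoryitem", "transaction", "harvest",
}


def parse_kvp(query_string):
    """Parse a GET query string into a dict with lower-cased parameter names."""
    parsed = parse_qs(query_string, keep_blank_values=True)
    # Keep only the first value of each parameter.
    return {key.lower(): values[0] for key, values in parsed.items()}


def check_basic_params(kvp):
    """Return None if service, version and request look valid, else an error string."""
    if kvp.get("service") != "CSW":
        return "Invalid or missing 'service' parameter"
    request = kvp.get("request", "").lower()
    # Assumption: GetCapabilities may omit the version parameter.
    if request != "getcapabilities" and kvp.get("version") != "2.0.2":
        return "Invalid or missing 'version' parameter"
    if request not in SUPPORTED_REQUESTS:
        return "Invalid or missing 'request' parameter"
    return None
```

In a dispatcher built this way, a non-None result would be turned into an exception report and any valid request would be routed to the matching method.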

(whenever pycsw encounters an error, server.Csw.exceptionreport() is used to return an OGC ExceptionReport)

All server.Csw methods return lxml.etree.Element objects, which are then processed by server.Csw.write_response() and returned to the client as XML.
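
As a rough sketch of this pattern, the example below builds an exception report as an element tree and serializes it to XML. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml.etree so that it runs without extra dependencies. The function names mirror the ones on this page, but their signatures and bodies are assumptions, as is the version attribute value.

```python
import xml.etree.ElementTree as ET

# OWS Common namespace used for exception reports in CSW 2.0.2.
OWS_NS = "http://www.opengis.net/ows"
ET.register_namespace("ows", OWS_NS)


def exception_report(code, locator, text):
    """Build an ows:ExceptionReport element (hypothetical signature)."""
    # Assumption: "1.2.0" is used here as the report version.
    root = ET.Element("{%s}ExceptionReport" % OWS_NS, {"version": "1.2.0"})
    exc = ET.SubElement(
        root,
        "{%s}Exception" % OWS_NS,
        {"exceptionCode": code, "locator": locator},
    )
    ET.SubElement(exc, "{%s}ExceptionText" % OWS_NS).text = text
    return root


def write_response(element):
    """Serialize an element to an XML string for the HTTP response body."""
    return ET.tostring(element, encoding="unicode")


report = exception_report("InvalidParameterValue", "service", "Invalid value for service")
xml_body = write_response(report)
```

Keeping every method's return value as an element and serializing in one place means headers, encoding and compression only need to be handled once.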

server.Csw.getcapabilities()

  • handle SECTIONS parameter if specified
  • handle extra profile parameters if specified
  • set / process updatesequence
  • return response XML as lxml.etree.Element

server.Csw.describerecord()

  • perform GET validation
  • process the output of schemas as csw:SchemaComponent elements
  • return response XML as lxml.etree.Element

server.Csw.getdomain()

  • perform GET validation
  • process parameter name
    • validate against internal domain model
  • process property name
    • validate existence of property against self.repository.queryables['all']
    • query repository (SQL distinct query against XPath of queryable in records.xml)
  • return response XML as lxml.etree.Element
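
The property name branch above can be illustrated with an in-memory SQLite database. This is a sketch under several assumptions: the QUERYABLES mapping, the simplified record XML, the table layout and the get_domain_values() helper are invented for the example, and xml.etree.ElementTree path lookups stand in for real XPath evaluation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical queryables model: property name -> path inside the record XML.
QUERYABLES = {"dc:type": "type", "dc:title": "title"}


def query_xpath(xml, path):
    """Return the text of the first element matching path, or None."""
    return ET.fromstring(xml).findtext(path)


def get_domain_values(conn, property_name):
    """Return the sorted distinct values of a queryable across all records."""
    if property_name not in QUERYABLES:
        raise ValueError("unknown queryable: %s" % property_name)
    rows = conn.execute(
        "SELECT DISTINCT query_xpath(xml, ?) FROM records",
        (QUERYABLES[property_name],),
    )
    return sorted(row[0] for row in rows if row[0] is not None)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same name.
conn.create_function("query_xpath", 2, query_xpath)
conn.execute("CREATE TABLE records (identifier TEXT PRIMARY KEY, xml TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("a", "<record><title>Roads</title><type>dataset</type></record>"),
        ("b", "<record><title>Rivers</title><type>dataset</type></record>"),
        ("c", "<record><title>Map service</title><type>service</type></record>"),
    ],
)

domain = get_domain_values(conn, "dc:type")
```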

server.Csw.getrecords()

  • perform GET validation
  • query repository. SQL query, one of:
    • spatial (util.query_spatial())
    • aspatial (util.query_xpath())
    • spatial + aspatial
  • sorting (if specified)
  • do distributed searching (if specified)
  • write out results (based on outputschema)
    • distributed search results are returned verbatim
  • return response XML as lxml.etree.Element

server.Csw.getrecordbyid()

  • perform GET validation
  • query repository. SQL query by id (against records.identifier)
  • write out results (based on outputschema)
  • return response XML as lxml.etree.Element

server.Csw.getrepositoryitem()

  • wrapper around server.Csw.getrecordbyid()
  • gets raw XML record
  • return response XML as lxml.etree.Element

server.Csw.transaction()

  • validate XML document
  • insert mode
  • update mode
  • delete mode

server.Csw.harvest()

  • fetch XML from URL
  • insert into repository, or update if identifier exists
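
The insert-or-update behaviour described above could look roughly like the sketch below. It is not pycsw's implementation: the helper names fetch_xml() and upsert_record(), the two-column records table and the simplified record XML with a plain identifier element are assumptions for illustration.

```python
import sqlite3
import urllib.request
import xml.etree.ElementTree as ET


def fetch_xml(url):
    """Fetch a remote XML document as text (assumes a UTF-8 encoded response)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def upsert_record(conn, xml):
    """Insert the record, or update it if its identifier already exists."""
    identifier = ET.fromstring(xml).findtext("identifier")
    if identifier is None:
        raise ValueError("record has no identifier")
    exists = conn.execute(
        "SELECT 1 FROM records WHERE identifier = ?", (identifier,)
    ).fetchone()
    if exists:
        conn.execute(
            "UPDATE records SET xml = ? WHERE identifier = ?", (xml, identifier)
        )
        return "updated"
    conn.execute(
        "INSERT INTO records (identifier, xml) VALUES (?, ?)", (identifier, xml)
    )
    return "inserted"
```

A harvest call would then amount to upsert_record(conn, fetch_xml(url)), with the XML validated before it is written.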

Other Notes

  • server.Csw._gen_soap_wrapper() is the generic private method that wraps responses in SOAP
  • server/config.py sets the server's operation model in config.MODEL. Any modifications are then made by calling code (e.g. to add more queryables, typenames, etc.)
  • spatial queries are implemented with Shapely in server/util.py:query_spatial(), called via a SQL function bound back to this method
  • full text (e.g. '*:AnyText') style queries are via server/util.py:query_anytext(), called via a SQL function bound back to this method
  • XPath style queries are via server/util.py:query_xpath(), called via a SQL function bound back to this method
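
To illustrate what "called via a SQL function bound back to this method" can mean in practice, here is a minimal sketch using the standard library's sqlite3 module, whose create_function() registers a Python callable under a SQL function name. The query_anytext() body, the table layout and the sample records are assumptions for the example and do not reproduce pycsw's code.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_anytext(xml, term):
    """Return 1 if term occurs (case-insensitively) in any text node of xml, else 0."""
    text = " ".join(ET.fromstring(xml).itertext())
    return 1 if term.lower() in text.lower() else 0


conn = sqlite3.connect(":memory:")
# Bind the Python function so that SQL statements can call it by name.
conn.create_function("query_anytext", 2, query_anytext)
conn.execute("CREATE TABLE records (identifier TEXT PRIMARY KEY, xml TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("a", "<record><title>Road network</title></record>"),
        ("b", "<record><title>River basins</title></record>"),
    ],
)

# The database evaluates query_anytext() once per row while filtering.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT identifier FROM records WHERE query_anytext(xml, ?) = 1 "
        "ORDER BY identifier",
        ("river",),
    )
]
```

The appeal of this design is that filtering logic stays in Python while the database still drives iteration, sorting and paging. The cost is that such functions run for every candidate row and cannot use indexes.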