Add support for the IRIS Federated Catalog Service #1779

Closed
wants to merge 87 commits into from
Changes from 86 commits
4ef5db9
start work on parsing responses
CelsoReyes Mar 25, 2017
7b84a00
well on way to parser working
CelsoReyes Mar 25, 2017
a8cfe9f
working (rough) parser
CelsoReyes Mar 31, 2017
f68c8ee
initial fed_get_stations working
CelsoReyes Apr 1, 2017
f27b39d
trying to get tags in the data
CelsoReyes Apr 1, 2017
052d85c
started tackling the problem of recognizing which data we have./nboth…
CelsoReyes Apr 2, 2017
70e4c9c
cleanup
CelsoReyes Apr 2, 2017
9186e06
not much done
CelsoReyes Apr 5, 2017
8676ee8
moved items into federator_response_parser
CelsoReyes Apr 9, 2017
f6e905e
added tests removed streaming
CelsoReyes Apr 10, 2017
9b5c3c8
minor stuff
CelsoReyes Apr 13, 2017
8e80352
fixed dosctrings
CelsoReyes Apr 14, 2017
767ff1a
have basic station request working
CelsoReyes Apr 14, 2017
094690c
renamed and continued tightnening tests
CelsoReyes Apr 15, 2017
949f9d3
reorganizing, added parallel retrieve ability
CelsoReyes Apr 16, 2017
5f2c2d8
shuffling into proper classes
CelsoReyes Apr 16, 2017
6033aae
fedatalog_response_parser bug free maybe
CelsoReyes Apr 16, 2017
9bfab8a
started packaging, added datacenter metadata lookup
CelsoReyes Apr 17, 2017
9b3730b
trying to get this to work
CelsoReyes Apr 17, 2017
45bf812
doctests pass. station retrieval works at basic level
CelsoReyes Apr 18, 2017
11278e1
have it working in serial
CelsoReyes Apr 19, 2017
311c875
all doctests pass
CelsoReyes Apr 20, 2017
7bc9c57
station working again after reshuffle
CelsoReyes Apr 20, 2017
25dff7c
stations working better, get_waveform works
CelsoReyes Apr 20, 2017
3fd3714
simplified some code
CelsoReyes Apr 21, 2017
c113bf5
it all works. now to start handling args and errors
CelsoReyes Apr 22, 2017
c9cd8ea
moved include_/exclude_providers into RoutingClient, added more tests
CelsoReyes Apr 22, 2017
2cd1358
added file writing capabilities, I think
CelsoReyes Apr 22, 2017
6791838
unable to connect to base client. too many checks that do not match up.
CelsoReyes Apr 22, 2017
f353db3
added logging. disabled parallel requests
CelsoReyes Apr 23, 2017
e32791d
working with bulk requests
CelsoReyes Apr 25, 2017
bebee93
added retry ability, better requestitem handling
CelsoReyes Apr 28, 2017
2d17623
updating tests
CelsoReyes Apr 28, 2017
6ac7be0
Functioning version
CelsoReyes Apr 28, 2017
a2d2d0e
created overview chart for classes
CelsoReyes Apr 29, 2017
7c6b2e2
added docstring to top of main .py modules
CelsoReyes Apr 29, 2017
ce8dd1f
dded documentation
CelsoReyes Apr 29, 2017
9fb2459
fedcatalog_parser documentation full
CelsoReyes Apr 29, 2017
2e3cf93
fleshing out documentation for routing_client
CelsoReyes Apr 29, 2017
f4e1809
continued documenting, tests pass
CelsoReyes Apr 30, 2017
1d76c1e
rerouting for stations works, as does using existing/previous fedcata…
CelsoReyes Apr 30, 2017
983affb
rerouting seems to work for waveforms, too
CelsoReyes Apr 30, 2017
23f9067
activated parallel processing with 120s timeout. it is not default be…
CelsoReyes Apr 30, 2017
2fe32f3
pretty much done. added doc to __init__
CelsoReyes May 1, 2017
6341202
addressed most pep8 stuff
CelsoReyes May 5, 2017
6fea4b6
scaled back logging messages
CelsoReyes May 5, 2017
5048ed8
added exception for no data, removed some exterreneous files
CelsoReyes May 5, 2017
8a90af0
fixed errors introduced by oher fixes
CelsoReyes May 5, 2017
933c4c9
addressed TODO issues
CelsoReyes May 5, 2017
5c7a470
fixed major pull auto-issues mostly dealing with future
CelsoReyes May 6, 2017
1549f10
more flake8 fixes
CelsoReyes May 7, 2017
9c7423c
missed a spot
CelsoReyes May 7, 2017
f219955
tweaking future imports
CelsoReyes May 7, 2017
a3c5261
import cleanups
megies May 8, 2017
3bfbc0b
import cleanup
megies May 8, 2017
03b81b8
imports cleanup
megies May 8, 2017
7415437
removed extereneous files, modified fdsn.rst and moved the test
CelsoReyes May 8, 2017
4b5ac59
Add imports from future.
nick-falco Jul 20, 2017
ff09715
Change file from using CRLF to LF.
nick-falco Jul 20, 2017
03a7182
Fix failing test_federatedclient tests.
nick-falco Jul 21, 2017
5da3bc5
Addtional fixes for failing tests.
nick-falco Jul 21, 2017
f49cc9d
Python 2/3 fix.
nick-falco Jul 24, 2017
9ab0bcd
More test fixes.
nick-falco Jul 24, 2017
93cb548
Fix Python 2/3 unicode issues, PEP8 doctests, and removed unnecessary…
nick-falco Jul 25, 2017
c1fc58f
Remove unused import.
nick-falco Jul 25, 2017
7aec8f4
Add FederatedClient examples to fdsn/__init__.py.
nick-falco Aug 11, 2017
6ba56fe
Add url mappings based on the irisws-fedcatalog response instead of u…
nick-falco Aug 15, 2017
ab75eb0
Explictly list arguments in FederatedClient __init__, get_routing, an…
nick-falco Aug 15, 2017
2fab8f1
Flake8
nick-falco Aug 15, 2017
2275c75
Remove get_service_mappings doctest since it was failing due to pytho…
nick-falco Aug 15, 2017
d1a19ac
fdsn fedcatalog: docstring changes
megies Aug 17, 2017
c16bb58
fdsn fedcatalog: fix bug in passing up kwargs to base class
megies Aug 17, 2017
700ed89
fdsn fedcatalog: IPython pretty printing (+ some refactoring)
megies Aug 17, 2017
bd78590
fdsn fedcatalog: fix empty location code (still needs abligatory regr…
megies Aug 17, 2017
345aecd
Fix for InsecureRequestWarning: Unverified HTTPS request is being made.
nick-falco Aug 18, 2017
35cbcc1
Disable verbose logging. Default FederatedClient logging level set to…
nick-falco Aug 18, 2017
af8be5e
Remove hard coded FedCatalog service url.
nick-falco Aug 21, 2017
94914f9
Changed assert statements to explicit if not ...: raise ...
nick-falco Aug 22, 2017
1c2bb33
Add regression test for empty location code.
nick-falco Aug 22, 2017
bbdfd68
Fix for unicode conversion error.
nick-falco Aug 22, 2017
62741ca
Flake8
nick-falco Aug 22, 2017
aa6161f
Move some globals (PROVIDERS, get_existing_route) into the FederatedC…
adam-iris Aug 23, 2017
796a783
Cache FDSN clients for parallel and serial requests.
nick-falco Aug 30, 2017
52afb28
- Update FederatedClient to use urllib2 request library instead of ur…
nick-falco Sep 8, 2017
b78245f
- Remove remaining references to urllib3.
nick-falco Sep 11, 2017
6a00009
Merge branch 'master' into irisfederator
megies Oct 4, 2017
2a6e7f1
First try on Sphinx documentation fixes. +DOCS
nick-falco Oct 5, 2017
5 changes: 5 additions & 0 deletions misc/docs/source/packages/obspy.clients.fdsn.rst
Expand Up @@ -10,6 +10,7 @@
:nosignatures:

client.Client
client.FederatedClient

.. comment to end block

Expand All @@ -20,6 +21,10 @@
:nosignatures:

client
routers
routers.fedcatalog_parser
routers.fedcatalog_client.FederatedClient
routers.routing_client
mass_downloader
mass_downloader.domain
mass_downloader.mass_downloader.MassDownloader
Expand Down
68 changes: 65 additions & 3 deletions obspy/clients/fdsn/__init__.py
Expand Up @@ -3,16 +3,16 @@
obspy.clients.fdsn - FDSN web service client for ObsPy
======================================================
The obspy.clients.fdsn package contains a client to access web servers that
implement the FDSN web service definitions (https://www.fdsn.org/webservices/).
implement the `FDSN web service definitions`_.

:copyright:
The ObsPy Development Team (devs@obspy.org)
:license:
GNU Lesser General Public License, Version 3
(https://www.gnu.org/copyleft/lesser.html)

Basic Usage
-----------
Basic FDSN Client Usage
-----------------------

The first step is always to initialize a client object.

Expand Down Expand Up @@ -134,8 +134,69 @@
endtime=endtime)
inventory.plot()


Basic FDSN FedCatalog Client Usage
----------------------------------

The
:mod:`FDSN fedcatalog_client <obspy.clients.fdsn.routers.fedcatalog_client>`
Member

This link seems broken in the docs build, I think the reason is that the new submodules are not added to the sphinx package skeleton: https://github.com/obspy/obspy/blob/master/misc/docs/source/packages/obspy.clients.fdsn.rst

See also the build log for respective warnings/errors: http://docs.obspy.org/pull-requests/1779/log.txt

module provides federated
access to multiple web servers that implement the
`FDSN Station and Dataselect web service definitions
<https://www.fdsn.org/webservices/>`_.

The first step is always to initialize a :class:`FederatedClient` object.
Member

This link seems broken as well. This should work, hopefully: :class:`~obspy.clients.fdsn.FederatedClient`. If not, you have to specify the full submodule path.


>>> from obspy.clients.fdsn import FederatedClient
>>> client = FederatedClient()

(1) :meth:`~obspy.clients.fdsn.routers.fedcatalog_client.
FederatedClient.get_waveforms()`: The following
Member

broken link here too, likely due to aforementioned missing submodule entries in sphinx skeleton document (misc/docs/source/packages/obspy.clients.fdsn.rst)

example illustrates how to request 60 minutes of the ``"LHZ"`` channel of
station Apirathos, Naxos, Greece (``"APE"``) of the GEOFON (``"GE"``) network
for a seismic event around 2006-01-08T11:34:54.000 (UTC). Results are returned
as a :class:`~obspy.core.stream.Stream` object.

>>> from obspy import UTCDateTime
>>> t = UTCDateTime("2006-01-08T11:34:54.000")
>>> st = client.get_waveforms("GE", "APE", "", "LHZ", t, t + 60 * 60)
>>> st.plot() # doctest: +SKIP

.. plot::

from obspy import UTCDateTime
from obspy.clients.fdsn import Client
client = Client('GFZ')
t = UTCDateTime("2006-01-08T11:34:54.000")
st = client.get_waveforms("GE", "APE", "", "LHZ", t, t + 60 * 60)
st.plot()

(2) :meth:`~obspy.clients.fdsn.routers.fedcatalog_client.
FederatedClient.get_stations()`: Uses the IRIS Fed Catalog web service to
return station metadata as an
:class:`~obspy.core.inventory.inventory.Inventory` object.

>>> inventory = client.get_stations(network="GE", station="A*",
... channel="?HZ", level="station",
... endtime="2016-12-31")
>>> print(inventory) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
Inventory created at 2...Z
Sending institution: SeisComP3 (GFZ)
Contains:
Networks (1):
GE
Stations (4):
GE.APE (GEOFON Station Apirathos, Naxos)
GE.APE (NOA/GEOFON Station Apeiranthos,Naxos, Greece)
GE.APEZ (GEOFON Station Moni Apezanon, Greece)
GE.ARPR (GEOFON/MedNet/KOERI Station Arapgir, Turkey)
Channels (0):
<BLANKLINE>

Please see the documentation for each method for further information and
examples.

.. _FDSN web service definitions: https://www.fdsn.org/webservices/
"""
from __future__ import (absolute_import, division, print_function,
unicode_literals)
Expand All @@ -144,6 +205,7 @@

from .client import Client # NOQA
from .header import URL_MAPPINGS # NOQA
from .routers import FederatedClient # NOQA


# insert supported URL mapping list dynamically in docstring
Expand Down
142 changes: 81 additions & 61 deletions obspy/clients/fdsn/client.py
Expand Up @@ -20,22 +20,12 @@
import io
import os
import re
import sys
from socket import timeout as socket_timeout
import textwrap
import threading
import warnings
from collections import OrderedDict

if sys.version_info.major == 2:
from urllib import urlencode
import urllib2 as urllib_request
import Queue as queue
else:
from urllib.parse import urlencode
import urllib.request as urllib_request
import queue

from lxml import etree

import obspy
Expand All @@ -46,6 +36,15 @@
FDSNRedirectException, FDSNNoDataException)
from .wadl_parser import WADLParser

if PY2:
from urllib import urlencode
import urllib2 as urllib_request
import Queue as queue
else:
from urllib.parse import urlencode
import urllib.request as urllib_request
import queue


DEFAULT_SERVICE_VERSIONS = {'dataselect': 1, 'station': 1, 'event': 1}

Expand Down Expand Up @@ -889,7 +888,7 @@ def get_waveforms_bulk(self, bulk, quality=None, minimumlength=None,
url = self._build_url("dataselect", "query")

data_stream = self._download(url,
data=bulk.encode('ascii', 'strict'))
data=bulk)
data_stream.seek(0, 0)
if filename:
self._write_to_file_object(filename, data_stream)
Expand Down Expand Up @@ -1034,7 +1033,7 @@ def get_stations_bulk(self, bulk, level=None, includerestricted=None,
url = self._build_url("station", "query")

data_stream = self._download(url,
data=bulk.encode('ascii', 'strict'))
data=bulk)
data_stream.seek(0, 0)
if filename:
self._write_to_file_object(filename, data_stream)
Expand Down Expand Up @@ -1079,7 +1078,10 @@ def _get_bulk_string(self, bulk, arguments):
msg = ("Unrecognized input for 'bulk' argument. Please "
"contact developers if you think this is a bug.")
raise NotImplementedError(msg)
return bulk
if PY2:
return bulk
else:
return bulk.encode('ASCII')
Member

@megies For Python 3 I had to explicitly encode the bulk request string as ASCII (to be <class 'bytes'>). For Python 2 the bulk request "string" is of type future.types.newstr.newstr and worked as is.

Do you know of a better way to handle this?

Member

I can't say for sure without digging through the code and investigating which failing test you're fixing here (and what that test is doing). The main question is where this return value is getting used.

But if you're running into a situation where you have to encode or not depending on Py2/3, maybe the problem is further up. It's best practice to always do explicit de/encoding on I/O.
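
To make the "explicit de/encoding on I/O" suggestion concrete, here is a minimal sketch, not the code from this PR. The helper names (`build_bulk_text`, `to_wire`, `from_wire`) and the sample request line are hypothetical. The idea is to keep the bulk request as text everywhere inside the client and convert to bytes only at the point where it is handed to the network layer, so that no `PY2` branch is needed.

```python
import io


def build_bulk_text(lines):
    """Assemble the bulk request body as text; no bytes are involved yet."""
    return "\n".join(lines) + "\n"


def to_wire(text):
    """Encode once, right before the payload leaves the process.

    Hypothetical helper: the same call is made on every Python version,
    so the caller does not need a version check.
    """
    return text.encode("ascii", "strict")


def from_wire(raw):
    """Decode once, right after bytes arrive from the server."""
    return raw.read().decode("utf-8")


# Illustrative request line only (network, station, location, channel,
# start time, end time); "--" stands for an empty location code.
bulk_text = build_bulk_text(
    ["GE APE -- LHZ 2006-01-08T11:34:54 2006-01-08T12:34:54"])
body = to_wire(bulk_text)

# A BytesIO object stands in for the server response here.
reply = from_wire(io.BytesIO(b"ok"))
```

With this layout, helpers such as `_get_bulk_string` could keep returning text, and the single encode step would sit next to the download call, much like the `bulk.encode('ascii', 'strict')` argument shown earlier in this diff.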


def _write_to_file_object(self, filename_or_object, data_stream):
if hasattr(filename_or_object, "write"):
Expand Down Expand Up @@ -1296,53 +1298,7 @@ def _download(self, url, return_string=False, data=None, use_gzip=True):
url, opener=self._url_opener, headers=self.request_headers,
debug=self.debug, return_string=return_string, data=data,
timeout=self.timeout, use_gzip=use_gzip)
# get detailed server response message
if code != 200:
try:
server_info = data.read()
except Exception:
server_info = None
else:
server_info = server_info.decode('ASCII', errors='ignore')
if server_info:
server_info = "\n".join(
line for line in server_info.splitlines() if line)
# No data.
if code == 204:
raise FDSNNoDataException("No data available for request.",
server_info)
elif code == 400:
msg = ("Bad request. If you think your request was valid "
"please contact the developers.")
raise FDSNException(msg, server_info)
elif code == 401:
raise FDSNException("Unauthorized, authentication required.",
server_info)
elif code == 403:
raise FDSNException("Authentication failed.", server_info)
elif code == 413:
raise FDSNException("Request would result in too much data. "
"Denied by the datacenter. Split the request "
"in smaller parts", server_info)
# Request URI too large.
elif code == 414:
msg = ("The request URI is too large. Please contact the ObsPy "
"developers.", server_info)
raise NotImplementedError(msg)
elif code == 500:
raise FDSNException("Service responds: Internal server error",
server_info)
elif code == 503:
raise FDSNException("Service temporarily unavailable", server_info)
elif code is None:
if "timeout" in str(data).lower():
raise FDSNException("Timed Out")
else:
raise FDSNException("Unknown Error (%s): %s" % (
(str(data.__class__.__name__), str(data))))
# Catch any non 200 codes.
elif code != 200:
raise FDSNException("Unknown HTTP code: %i" % code, server_info)
raise_on_error(code, data)
return data

def _build_url(self, service, resource_type, parameters={}):
Expand Down Expand Up @@ -1442,7 +1398,7 @@ def run(self):
elif isinstance(wadl, FDSNRedirectException):
redirect_messages.add(str(wadl))
continue
elif wadl == "timeout":
elif wadl.decode('utf-8') == "timeout":
raise FDSNException("Timeout while requesting '%s'." % url)

if "dataselect" in url:
Expand Down Expand Up @@ -1622,6 +1578,67 @@ def build_url(base_url, service, major_version, resource_type,
return url


def raise_on_error(code, data):
"""
Raise an error for non-200 HTTP response codes

Note: Also used by the ~obspy.clients.fdsn.routers.fedcatalog_client
module.

:type code: int
:param code: HTTP response code
:type data: io.BytesIO
:param data: Data returned by the server
"""
# get detailed server response message
if code != 200:
try:
server_info = data.read()
except Exception:
server_info = None
else:
server_info = server_info.decode('ASCII', errors='ignore')
if server_info:
server_info = "\n".join(
line for line in server_info.splitlines() if line)
# No data.
if code == 204:
raise FDSNNoDataException("No data available for request.",
server_info)
elif code == 400:
msg = ("Bad request. If you think your request was valid "
"please contact the developers.")
raise FDSNException(msg, server_info)
elif code == 401:
raise FDSNException("Unauthorized, authentication required.",
server_info)
elif code == 403:
raise FDSNException("Authentication failed.", server_info)
elif code == 413:
raise FDSNException("Request would result in too much data. "
"Denied by the datacenter. Split the request "
"in smaller parts", server_info)
# Request URI too large.
elif code == 414:
msg = ("The request URI is too large. Please contact the ObsPy "
"developers.")
raise NotImplementedError(msg, server_info)
elif code == 500:
raise FDSNException("Service responds: Internal server error",
server_info)
elif code == 503:
raise FDSNException("Service temporarily unavailable", server_info)
elif code is None:
if "timeout" in str(data).lower():
raise FDSNException("Timed Out")
else:
raise FDSNException("Unknown Error (%s): %s" % (
(str(data.__class__.__name__), str(data))))
# Catch any non 200 codes.
elif code != 200:
raise FDSNException("Unknown HTTP code: %i" % code, server_info)


def download_url(url, opener, timeout=10, headers={}, debug=False,
return_string=True, data=None, use_gzip=True):
"""
Expand All @@ -1635,6 +1652,9 @@ def download_url(url, opener, timeout=10, headers={}, debug=False,
specified.

Performs a http GET if data=None, otherwise a http POST.

Note: Also used by the ~obspy.clients.fdsn.routers.fedcatalog_client
module.
"""
if debug is True:
print("Downloading %s %s requesting gzip compression" % (
Expand Down
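
The `raise_on_error` helper factored out in this diff maps HTTP status codes to exceptions. The sketch below shows the same idea in a table-driven form so it can be run in isolation. It is not the ObsPy implementation: the exception classes are local stand-ins (their hierarchy is an assumption), the name `raise_on_error_sketch` is hypothetical, and only a subset of the status codes is covered.

```python
import io


class FDSNException(Exception):
    """Local stand-in for the ObsPy exception of the same name."""


class FDSNNoDataException(FDSNException):
    """Local stand-in for the no-data case (HTTP 204)."""


# Status code -> (exception type, message). This table replaces the
# if/elif chain used in the diff; the messages are copied from it.
_ERRORS = {
    204: (FDSNNoDataException, "No data available for request."),
    401: (FDSNException, "Unauthorized, authentication required."),
    403: (FDSNException, "Authentication failed."),
    500: (FDSNException, "Service responds: Internal server error"),
    503: (FDSNException, "Service temporarily unavailable"),
}


def raise_on_error_sketch(code, data):
    """Raise for any code other than 200; return None on success."""
    if code == 200:
        return
    # Try to extract the server's explanation; tolerate unreadable data.
    try:
        server_info = data.read().decode("ascii", errors="ignore")
    except Exception:
        server_info = None
    fallback = (FDSNException, "Unknown HTTP code: %s" % code)
    exc_type, msg = _ERRORS.get(code, fallback)
    raise exc_type(msg, server_info)


# Usage: callers can treat "no data" separately from other failures.
try:
    raise_on_error_sketch(204, io.BytesIO(b"nothing here"))
except FDSNNoDataException as exc:
    no_data_args = exc.args
```

Because both a message and the server text are passed to the exception, a caller can log the second element of `exc.args` to see what the data center actually replied.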