Non-FDSN URL found in Routing service response #80

Closed
rizac opened this issue Jan 25, 2021 · 16 comments

@rizac

rizac commented Jan 25, 2021

Currently, the routing service:

www.orfeus-eu.org/eidaws/routing/1/query?format=post

provides a non-FDSN URL:

http://ws.resif.fr/resifws/ph5-dataselect/1/query

Client software that relies on FDSN URLs to automatically retrieve station and dataselect endpoints will therefore discard that URL.
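
To illustrate the behaviour described above, here is a minimal sketch of a client that groups routes by endpoint and sets aside non-FDSN URLs. It assumes the `format=post` response consists of blank-line-separated blocks, each starting with an endpoint URL followed by stream lines. The names `is_fdsn_url` and `split_routes`, the simplified `/fdsnws/` prefix check, and the sample network/station codes are assumptions made for illustration, not the actual client code:

```python
from urllib.parse import urlsplit


def is_fdsn_url(url):
    """Simplified check: the URL path must start with the /fdsnws/ prefix."""
    return urlsplit(url).path.startswith("/fdsnws/")


def split_routes(post_text):
    """Parse a routing response into (accepted, discarded) dicts: URL -> stream lines."""
    accepted, discarded = {}, {}
    for block in post_text.strip().split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        # First line of a block is the endpoint, the rest are stream epochs.
        url, streams = lines[0], lines[1:]
        target = accepted if is_fdsn_url(url) else discarded
        target.setdefault(url, []).extend(streams)
    return accepted, discarded


# Hypothetical response text; the network and station codes are made up.
sample = """http://ws.resif.fr/fdsnws/dataselect/1/query
XX STA1 -- HHZ 2020-01-01T00:00:00 2020-01-02T00:00:00

http://ws.resif.fr/resifws/ph5-dataselect/1/query
YY STA2 -- DPZ 2020-01-01T00:00:00 2020-01-02T00:00:00
"""

accepted, discarded = split_routes(sample)
print("accepted:", sorted(accepted))
print("discarded:", sorted(discarded))
```

With a check like this, the streams routed to the `resifws/ph5-dataselect` endpoint end up in `discarded` and are never requested.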

@rizac rizac changed the title No FDSN URL in Routing service: lanel No FDSN URL in Routing service Jan 25, 2021
@rizac rizac changed the title No FDSN URL in Routing service Non-FDSN URL found in Routing service response Jan 25, 2021
@jschaeff
Collaborator

Hi, thank you for reporting this issue.
PH5-dataselect is not standard, but conforms to the FDSN ws dataselect specifications, although it delivers miniseed data from data stored in the PH5 format.
I guess it's not a big deal if clients discard it. What do you think ?

@damb
Contributor

damb commented Jan 25, 2021

Hi, this happens because the dataselect service in http://ws.resif.fr/routing/eida_routing.xml is specified with a non-standard FDSN URL. When harvesting data from eidaws-routing localconfig configuration files, the URL is taken for granted and thus not validated. Actually, a solution where the configuration is validated would be preferable.

> I guess it's not a big deal if clients discard it. What do you think ?

@jschaeff, this means that every client using eidaws-routing is required to validate the response returned. Instead, it would be cleaner and much more efficient to do this once (i.e. when harvesting) in order to prevent every single client from validating the response over and over again. APIs should be considered as contracts such that everybody interacting with the API can trust the data shipped. Everything else should be considered as a bug.

What do you think?

@javiquinte
Collaborator

> Hi, thank you for reporting this issue.
> PH5-dataselect is not standard, but conforms to the FDSN ws dataselect specifications, although it delivers miniseed data from data stored in the PH5 format.
> I guess it's not a big deal if clients discard it. What do you think ?

Hi @jschaeff. Quick question: does the service handling the query method at that URL conform to the dataselect specification (parameters, types, returns miniSEED, etc.)?

@rizac
Author

rizac commented Jan 26, 2021

Thank you all for the reply.

Discarding data might not be a big deal (at least not for our client software), but I don't think the problem should be viewed from this perspective.
Quoting from the routing service page: "[...] To assist users to locate data, we have designed a Routing Service [...] it must serve this information [data location] in order to help the development of smart clients and/or services of higher level". Is this contract honored, if some clients/users cannot access data because of an unknown non-FDSN URL?
If I am not missing anything, does it sound more like a big deal now?

Best
R

@javiquinte
Collaborator

Hi everyone!
I don't see anything really problematic or against what has been promised. But I don't know if I have all the information. 🤔
The Routing Service only provides entry points for different services. The entry points must be compatible, of course, so that we don't mix dataselect with station-WS or with some ad-hoc services, which cannot be distinguished by an external client. Quoting our "contract":

Consistency with standards: Parameter names, formats and types follow similar existing services (e.g. FDSN web services) whenever possible.

Whenever possible... like in this case.

Again, if I understood properly, behind the URL provided by RESIF there is a dataselect service. Am I right @jschaeff ?
I understand that the URL does not follow the standard structure "http://domain/fdsnws/...". We must always try to find the standard, but reality and the standard sometimes slightly differ. That's why standards need to be revised. In some cases because the standard does not make sense.

We had the same case a couple of years ago, when many of us started to serve the data in the https port instead of the http. We had to change the standard and it took a lot of time. In the meantime, the data was served on https and I don't remember anyone discarding a whole data centre because of this.
Many data centers are in the same situation as RESIF. In particular the ones providing big datasets (that's why the PH5 is there 😉 ). So, big-data users should probably not discard these cases. Only if they are not dataselect services.

@javiquinte
Collaborator

javiquinte commented Jan 26, 2021

> Hi, this happens because the dataselect service in http://ws.resif.fr/routing/eida_routing.xml is specified with a non-standard FDSN URL. When harvesting data from eidaws-routing localconfig configuration files, the URL is taken for granted and thus not validated. Actually, a solution where the configuration is validated would be preferable.
>
> > I guess it's not a big deal if clients discard it. What do you think ?
>
> @jschaeff, this means that every client using eidaws-routing is required to validate the response returned. Instead, it would be cleaner and much more efficient to do this once (i.e. when harvesting) in order to prevent every single client from validating the response over and over again. APIs should be considered as contracts such that everybody interacting with the API can trust the data shipped. Everything else should be considered as a bug.
>
> What do you think?

Hi @damb !
As I see it, this is a completely different issue not related to the RESIF case, as that is a dataselect service (we don't care about the data format in the backstage). I understand from your message that you propose to validate the API from each of the entry points included in a Routing Service. Am I right?
If this is the case, we should probably move this idea to the Routing Service issue tracker.

@damb
Contributor

damb commented Jan 26, 2021

Hi @javiquinte,

> I understand from your message that you propose to validate the API from each of the entry points included in a Routing Service. Am I right?

Yes. When harvesting, the URLs should be validated and, if necessary (and possible), modified/translated in order to stick to FDSN standards.

As long as RESIF's PH5-driven fdsnws-dataselect implements the corresponding API (except for the URL path), RESIF could set up a reverse proxy with a properly configured URL path and adjust its eidaws-routing localconfig.

EDIT: Sorry, my bad. Modifying/translating the URL while harvesting does not work without setting up a corresponding reverse proxy. And once a reverse proxy is set up, it can easily be configured through RESIF's localconfig, which makes modifying URL paths obsolete.

> Whenever possible... like in this case.

Breaking clients because internal configuration was not properly validated shouldn't be a reason to invoke the "whenever possible". Also, clients shouldn't need to worry about the underlying backend implementation.

@jschaeff
Collaborator

Hi all,

resifws/ph5-dataselect/1 conforms to the fdsnws dataselect interface. You can build your URLs the same way. There are some specific options due to the PH5 to miniSEED conversion, but they can be left at their defaults. You can even tell the obspy client to use this URL for dataselect and it will work.
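
As a rough illustration of "building URLs the same way", the sketch below assembles a dataselect-style query against the PH5 endpoint using only the standard library. The helper name `build_query_url` and the network, station and channel values are hypothetical; the parameter names are the usual fdsnws-dataselect ones, and the PH5-specific options are simply omitted:

```python
from urllib.parse import urlencode

# Base URL as given in this thread (without the trailing /query).
PH5_DATASELECT = "http://ws.resif.fr/resifws/ph5-dataselect/1"


def build_query_url(base_url, **params):
    """Append /query and the URL-encoded parameters to a dataselect base URL."""
    return base_url.rstrip("/") + "/query?" + urlencode(params)


# Hypothetical stream selection; replace with a real network/station.
url = build_query_url(
    PH5_DATASELECT,
    network="XX",
    station="STA1",
    channel="DPZ",
    starttime="2020-01-01T00:00:00",
    endtime="2020-01-01T01:00:00",
)
print(url)
```

The same helper works unchanged with a standard `.../fdsnws/dataselect/1` base URL, which is the point being made here: only the path prefix differs.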

The service itself is not a standard, that's why we don't serve it under fdsnws prefix.

The only way for a user to read data from the networks stored as PH5 archive is to pass through this webservice. That's why I thought it is important to advertise it through the routing service.

If EIDA routing service is not ready to advertise it, I can remove it easily. It is also advertised through the FDSN: https://www.fdsn.org/ws/datacenters/1/query?name=RESIF&includeDatasets=True

@jschaeff
Collaborator

From what I understood, IRIS uses a dataselect interface on top of several other services delivering miniSEED data, so that there is only one entrypoint to the datacenter regardless of the backoffice details. We might do this also, someday.

@damb
Contributor

damb commented Jan 27, 2021

> You can even tell the obspy client to use this URL for dataselect and it will work.

Although obspy is a great framework, it is just another client using EIDA's FDSN webservice APIs.

> If EIDA routing service is not ready to advertise it, I can remove it easily.

It's not about the question whether eidaws-routing can handle it (because it does handle it already). It's about the fact that we break clients if we advertise non-standard URLs with eidaws-routing.

@jschaeff
Collaborator

The RESIF DC now supports FDSN-compliant URLs:
http://ws.resif.fr/ph5/fdsnws/dataselect/1
http://ws.resif.fr/ph5/fdsnws/availability/1

It is published in our routing tables.

@rizac
Author

rizac commented Mar 19, 2021

Thank you @jschaeff, but the above solution still breaks the standard, as the two URLs have the same domain.
The specification says that an FDSN URL is:

<site>/fdsnws/<service>/<majorversion>

where "<site> is the domain name of the data" (e.g., "google.com"). So, if I am not wrong, either

  1. the specification has to be changed to: "<site> is the portion of the URL up to "/fdsnws/..", or
  2. the solution posted above does not fix the issue

For the moment, our client software still breaks. I can easily implement solution 1. above, but I would first like confirmation that 1. is the standard, and to know why it is not mentioned in the official specification.
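
The two readings of `<site>` discussed above can be compared with a small sketch. The function `is_fdsn_url`, its `strict` flag and the regular expressions are assumptions made for illustration; they are not taken from the FDSN specification or from any existing client:

```python
import re
from urllib.parse import urlsplit

# fdsnws/<service>/<majorversion>, optionally followed by a method such as /query.
_TAIL = r"fdsnws/(?P<service>[a-z]+)/(?P<major>\d+)(?:/(?P<method>\w[\w.]*))?/?$"

# Strict reading: <site> is only the scheme and domain, so the path starts at /fdsnws/.
STRICT = re.compile(r"^/" + _TAIL)
# Lenient reading (option 1 above): <site> is everything before /fdsnws/.
LENIENT = re.compile(r"^/(?:.+/)?" + _TAIL)


def is_fdsn_url(url, strict=True):
    """Check the URL path against <site>/fdsnws/<service>/<majorversion>."""
    pattern = STRICT if strict else LENIENT
    return pattern.match(urlsplit(url).path) is not None


for candidate in (
    "http://ws.resif.fr/ph5/fdsnws/dataselect/1",
    "http://ws.resif.fr/resifws/ph5-dataselect/1/query",
):
    print(candidate, is_fdsn_url(candidate), is_fdsn_url(candidate, strict=False))
```

Under the strict reading the `/ph5/fdsnws/...` URL is rejected; under the lenient reading it is accepted, while the original `resifws/ph5-dataselect` URL fails both.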

Many thanks for your reply

@rizac
Author

rizac commented Apr 19, 2021

Many thanks for the fix, now the URL is at:

http://ph5ws.resif.fr/fdsnws/dataselect/1/query

which is standard FDSN.

However, with our client software, when downloading with a token, we now get an SSL certificate error:

ssl.CertificateError: hostname 'ph5ws.resif.fr' doesn't match either of 'ws.resif.fr', 'www.ws.resif.fr'

If the fix is not easy, it would be better to restore the previous non-standard FDSN URL, because that can be handled quite easily by client software (e.g. check for and exclude non-standard URLs), whereas the error above is beyond the client's control.

@jschaeff
Collaborator

While waiting for the certificates, I removed the ph5ws services from the routing.

@jschaeff
Collaborator

Our certificate is now OK.
I just published ph5ws.resif.fr/fdsnws/dataselect in our routing system.
Let's see if it works !

@jschaeff
Collaborator

Fixed
