Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIAv2 #206

Merged
merged 17 commits into from
Feb 27, 2020
Merged

SIAv2 #206

merged 17 commits into from
Feb 27, 2020

Conversation

andamian
Copy link
Contributor

@bsipocz
Copy link
Member

bsipocz commented Jan 23, 2020

I'm opening a separate PR to fix the CI.

@bsipocz bsipocz mentioned this pull request Jan 23, 2020
@cbanek
Copy link
Contributor

cbanek commented Jan 24, 2020

This looks awesome! I'm going to review this more closely tomorrow, but I like it. I get what you mentioned about should the searchv2 be on some kind of different object. I'm not sure about that but I'll think about it.

And I merged the build fix, so you can rebase on top of that and hopefully get your checkmark.

@funbaker
Copy link
Contributor

A few thoughts:

Why not using inheritance to handle all that duplicate code?
Both service versions differ only in how they present metadata, theres already a CapabilityMixin which can be used for sia2. This way one can also distinct between siav1 and v2, making it possible to use just one search function. Two query classes are fine imo, as there are enough differences.

Theres also some code for the DALI Values in pyvo.dal.adhoc for which it may make sense to put it into a separate module. This code also handles conversion between different units, which is missing in this PR yet.

It also would make sense to check if the registry code still works correctly for sia2 services.

@cbanek
Copy link
Contributor

cbanek commented Jan 24, 2020

I think I'm with @funbaker , I didn't see that the code was mostly a straight copy from SIAv1 (since that wasn't in the diff), but it seems like just inheriting it might be easier, and it looks like it would get rid of a bunch of code.

@andamian
Copy link
Contributor Author

The code is not optimized yet but I'll try to avoid duplication whenever possible.

I was trying to enhance search to support v1 and v2, but the biggest problem was the definition of position. In the current API, size denotes a rectangle which in v2 is defined as polygon or range. It has a default of 1.0 so it's hard to tell whether it was ignored or not. One way to do this is to define 3 classes for position: Circle, Range and Polygon in addition to the current functionality. If the user passed one of the 3, then size is ignored.

Another thing to consider is the fact that SIAv2 returns ObsCore results. We should probably implement an ObsCore module for the results since they can be used with TAP as well? The ObsCore software can also be the place to deal with units.

That gets us back to the search function. A common search would return results in different format depending on the service it uses. That was another argument it made me lean towards the search_v2 function.

What do you think?

@cbanek
Copy link
Contributor

cbanek commented Jan 27, 2020

I think it's totally fine to have 2 different search functions, since you really want to be explicit about if you're using v1 or v2 for a search since a given URL might not work with both?

I do like trying to reuse the code for the stuff behind the search with inheritance though.

Interesting point on the ObsCore results, what are you thinking there? It seems like there's a lot of fields, and most are optional, but I like the idea of wrapping that up a bit, although could you wrap some of the most common ones or the required ones?

@funbaker
Copy link
Contributor

funbaker commented Jan 28, 2020

In that case it'll be still better to have two different service classes than two different methods on the same class.
One service cannot do both versions.

I like the idea of explicitly implementing the DALI Data Types, as there is already some code for it.
They would be needed anyway for a single class solution.

A subclass of pyvo.dal.query.DALResults and pyvo.dal.query.Record could be used for ObsCoreResults, altough it'd need some more logic if used in TAP (Out of scope for this PR i think).

Edit: The DALI Code is in pyvo.dal.params

@molinaro-m
Copy link

SIAP-2.0 states that

The response from a successful call to the {query} resource is a table consistent with ObsTAP responses [...]

(http://www.ivoa.net/documents/SIA/20151223/REC-SIA-2.0-20151223.html#toc34)

In a sense an ObsCoreResults would be nice, but based on its TAP usage, with SIAP-2.0 reusing it
for its scope.

I'm really happy to see SIAv2 moving on in astropy/pyvo!

@andamian
Copy link
Contributor Author

@cbanek , @funbaker OK, separate search, separate service class, the SIAQuery class is already different and the SIA record will also be different. Should we just have an sia2 module that attempts to follow the API and design of sia for whomever is already familiar with it but it works with v2? Would that be a good way to transition when v1 is retired?

@funbaker
Copy link
Contributor

Well, I'm for keeping the interface as simple as possible, hence i spoke out for just one search function.
If the consensus is to make both versions explicit, it should happen on service-class level and not on search-method level, where it'd create ambiguity.

What about a sia2 module now and a more abstract image service later?

@codecov
Copy link

codecov bot commented Feb 6, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@0329e06). Click here to learn what that means.
The diff coverage is 15.15%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #206   +/-   ##
=========================================
  Coverage          ?   72.08%           
=========================================
  Files             ?       40           
  Lines             ?     4402           
  Branches          ?        0           
=========================================
  Hits              ?     3173           
  Misses            ?     1229           
  Partials          ?        0
Impacted Files Coverage Δ
pyvo/dal/__init__.py 100% <ø> (ø)
pyvo/dam/obscore.py 8.95% <0%> (ø)
pyvo/dal/params.py 86.58% <100%> (ø)
pyvo/dal/query.py 84.94% <100%> (ø)
pyvo/dal/sia2.py 70.17% <20.77%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0329e06...d12a46c. Read the comment docs.

@andamian
Copy link
Contributor Author

andamian commented Feb 6, 2020

@cbanek, @funbaker -Please have a look at the sia2.py file before I go too far. This is by no means finished but it gives you an idea of the scope. Some observations:

  • Search attributes are explicit. Users should not necessary be familiar with SIA2 in order to use it. I've tried to make the API accommodating and easy to use. For quantities search attributes it accepts any units as long as it can convert them to the standard units required by the spec
  • SIAService is now a VOSI service (still need to clean up the old attributes)
  • ObsCoreRecord is incomplete and it will be moved elsewhere so that it can be used by other services such as TAP. It provides properties to make it more obvious about the available fields. It also handles quantities. The idea is to avoid referring the user to the standard. Do you see any reason to replace this flat structure with a class structure that more closely follows the ObsCore data model?

Again, it's not ready for review yet, just to give you an idea of the direction and ensure I'm on the right track. Please feel free to comment at your earliest convenience so that I know whether I can continue or change direction. Thanks

@funbaker
Copy link
Contributor

funbaker commented Feb 6, 2020

Apart from a few minor issues i wonder why the properties return a copy.

Also i still think it'd be better to have dedicated classes for the DALI Types, submerged with the existing code. Can also be done later, but it wouldn't need the append methods in sia2.py, which would make it easier to read.

@andamian
Copy link
Contributor Author

andamian commented Feb 6, 2020

Apart from a few minor issues i wonder why the properties return a copy.

Also i still think it'd be better to have dedicated classes for the DALI Types, submerged with the existing code. Can also be done later, but it wouldn't need the append methods in sia2.py, which would make it easier to read.

Properties are lists because they can have multiple values that SIAv2 will OR in the query. One very simple example is calib_level. For raw data one might want 0 OR 1 as constraints, hence the list will be [0,1]. It returns a copy of the list so that the user cannot mess with it since it's sync-ed with the formatted version in the params dictionary. *_append is used to parse and validate the value before is added to the list and to the dictionary of formatted values.

The alternative was to give users access to the properties list and do the parsing and formatting of the values for the request params later before submitting it. What I didn't like with this approach was the fact that parsing would not fail when a property was set (or added) but much latter.

I'm not 100% satisfied with the API of the SIAQuery either but the big advantage of it is that it's an alternative to the search function (which is the main, most common entry point of the module) that allows to incrementally collect the constraints. e.g. instantiate a SIAQuery object and pass it to different pieces of code that might add constraints to it before executing it.

What DALI types are you thinking about? Can you give an example? If it's a better solution I rather do it now than later.

@funbaker
Copy link
Contributor

funbaker commented Feb 6, 2020

I speak about those DALI Types: http://www.ivoa.net/documents/DALI/20170517/REC-DALI-1.1.html#tth_sEc3

astropy's Quantity's can be used for that, and it should be convertable between votable fields and params (for datalink etc, see also https://github.com/astropy/pyvo/blob/master/pyvo/dal/params.py, this needs some overhaul anyway).

@andamian
Copy link
Contributor Author

andamian commented Feb 7, 2020

I see, but I can't just wrap my mind around it, in part because I'm not that familiar with all these standards and how they relate to each other. Are these DALI types used internally only or they are part of the module API? For example, I didn't consider necessary to expose a Point or a Circle or a Polygon in the interface when they can be easily expressed with SkyCoord and Quantities.

I still have to move (or at least split) ObsCoreRecord. I've started with a dam package where the ObsCore module will reside. I still have to figure out what goes into it.

I've pushed more changes if you want to have a look (ObsCoreRecord changes not included yet).

@funbaker
Copy link
Contributor

funbaker commented Feb 7, 2020

They are part of the API, both inernally and externally at different places. Having all the data type related code there would make the query api look less bloated, and it'd possible to write q.pos.append instead of q.pos_append.
The difference to Quantity and Skycoord is that DALI types have a certain format to be serialized for queries, and a defined number of elements.

There are already several places (sia2, datalink, soda) where those types are used, and I'm sure there will be more in the future, so it's also a question of duplicate code.

@msdemlei
Copy link
Contributor

msdemlei commented Feb 7, 2020 via email

@andamian
Copy link
Contributor Author

andamian commented Feb 7, 2020

Very valuable comments @funbaker & @msdemlei.

To get the level of abstraction right, I feel like we need to clarify one thing up front: who do you consider to be the typical user of pyvo? Is it someone that it's intimate with VO standards or the regular astronomer and user of astropy or astroquery packages?

If it's the former, then it's OK to have other VO abstractions in the API (the ObsCore result can be left in its VOTable format since the names and types of the columns are standard. Upper layers (astroquery) might provide the glue layer to map the astropy concepts into VO concepts and spare users from needing to know most if not any of the VO parts. If it's the latter, pyvo should shield the VO concepts as much as possible and talk to users in astropy/astroquery language.

This will guide us in future developments.

@funbaker
Copy link
Contributor

funbaker commented Feb 7, 2020

I'd say both.

pyvo presents the VO concept with the right level of abstraction, while still allowing lower-level work.

astroquery does map VO concepts indeed, but it doesn't work with the VO only.

If those DALI Objects are constructed in a way which allow tuples of Quantity and SkyCoord as well, this shouldn't be a problem.

This is already done e.g. here https://github.com/astropy/pyvo/blob/master/pyvo/dal/adhoc.py#L705

@cbanek
Copy link
Contributor

cbanek commented Feb 7, 2020

Just giving this another look now and I'm confused - it looks like there's an sia2.py but the names are the same as sia.py. Seems like this could be confusing with the same names. If this is part of the "it's not ready yet" then disregard 😄 .

@andamian
Copy link
Contributor Author

andamian commented Feb 7, 2020

Now I see what you mean @funbaker. I'll try to rework it so that that code in Soda and other places is not duplicated.
Question: Since the mapping from tuples to positional constructs is pretty straightforward (3 member tuple->Circle, 4 member tuple->Range, >4 member tuple -> Polygon), do we really need 3 types of arguments or a generic pos such as the one that I've been using in sia2 is sufficient?

@andamian
Copy link
Contributor Author

andamian commented Feb 7, 2020

@cbanek - that was on purpose. We are using the same design and same constructs so that for someone that is already familiar with the sia module is easier to transition to the sia2. But I appreciate your point too.

@msdemlei
Copy link
Contributor

msdemlei commented Feb 14, 2020 via email

pyvo/dal/adhoc.py Outdated Show resolved Hide resolved
pyvo/dal/__init__.py Outdated Show resolved Hide resolved
pyvo/dal/__init__.py Show resolved Hide resolved
pyvo/dal/adhoc.py Outdated Show resolved Hide resolved
pyvo/dal/params.py Outdated Show resolved Hide resolved
pyvo/dal/sia2.py Outdated Show resolved Hide resolved
pyvo/dal/params.py Outdated Show resolved Hide resolved
pyvo/dal/sia2.py Outdated Show resolved Hide resolved
pyvo/dal/sia2.py Outdated Show resolved Hide resolved
pyvo/dal/tests/test_sia.py Show resolved Hide resolved
@funbaker
Copy link
Contributor

Made some comment - not requested changes at this point, because i'll do another iteration anyway and I'm not 100% sure if all points are valid.

@andamian
Copy link
Contributor Author

@msdemlei - sold. Thank you for the nice explanation, some of which I've tried to capture in the docs. I've re-named it publisher_did for now to be consistent with existing creator_did and that's mainly because obs_ seems a bit redundant, but we could follow the ObsCore if that's more appropriate.

@funbaker
Copy link
Contributor

Well, LGTM when CI is green (unless someone else has something to say).

@andamian
Copy link
Contributor Author

I have no idea why the tests are failing on Travis. The data directory and the capabilities file are there. Any idea?

@funbaker
Copy link
Contributor

@andamian probably because the test data is not defined in pyvo/dal/tests/setup_package.py

@andamian
Copy link
Contributor Author

Yeah - thanks. Figured it out eventually. Confusing because it works fine locally.

What do you think should we do about the outstanding issues: pos types: tuples or proper Circle, Range or Polygon classes and the attribute names in ObsCoreMetadata? I'd like to have an agreement on these things.

@funbaker
Copy link
Contributor

I'd say proper types, with those tuples to be passes into the constructor, so that they can be magically autoconstructed when a tuple is passed. I did a similar thing in the other interfaces.

@andamian
Copy link
Contributor Author

By automagically you mean something like:

if instanceof(pos, tuple):
   pos = Circle(ra=pos[0], dec=pos[1], radius=pos[2])
elif instanceof(pos, Circle):
  pass
else:
   raise Error
```?
Or is there a better/smarter way? And these classes are available in the `pyvo.dal` package or somewhere higher?

@andamian
Copy link
Contributor Author

I'm really stuck with the sphinx error. Don't even know what where to look.

@funbaker
Copy link
Contributor

@andamian

By automagically you mean something like:

Something like this, yep.

And these classes are available in the pyvo.dal package or somewhere higher?

There's no dedicated classes, just some code in pyvo.dal.params which I wrote for datalink/soda.

About sphinx: Looks like a warning about a docstring which is converted into an error.

@andamian
Copy link
Contributor Author

@andamian

By automagically you mean something like:

Something like this, yep.

And these classes are available in the pyvo.dal package or somewhere higher?

There's no dedicated classes, just some code in pyvo.dal.params which I wrote for datalink/soda.

Those classes don't deal with Quantities, Units and don't validate the inputs. I'll need to move the validation code that I have for pos there.

About sphinx: Looks like a warning about a docstring which is converted into an error.

I know. Just can't figure out exactly which docstring it's referring to.

@andamian
Copy link
Contributor Author

andamian commented Feb 19, 2020

@funbaker - FYI, for using the Circle and Polygon classes, I will need to expand their usage. Instantiating one of these classes might result in ValueError (as opposed to DALServiceError). Calling code might turn them into DALServiceError or other error types depending on the context.

I don't see them being used at the moment, so no danger in breaking something I hope.

The other thing is that I can't tell what the param input is supposed to be because is not list or tuple.

@cbanek
Copy link
Contributor

cbanek commented Feb 19, 2020

I took a look at your sphinx issue and debugged it. Looks like it was an indentation issue. This is kind of cute to not repeat the long docstring, but it's really confusing to me. In the other place where you use it, the indentation is different.

Looking at the docs themselves, we are only showing the doc for the search function, we aren't including any other functions, as they would have the same names as the normal SIA classes...

Here's a gist that I cooked up that doesn't error, but it doesn't render properly either, since the rendering of what is a parameter or if it's the text of the previous one is the indentation level:
https://gist.github.com/cbanek/a1fb603fccc62be4f54f4f3211ce469e

My simple suggestion would just be to copy the comment block and try not to do some string manipulation on it when it's imported, then the indentation will be really much more obvious.

Copy link
Contributor

@cbanek cbanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. Thanks for doing all that extra position work!

@andamian
Copy link
Contributor Author

@funbaker & @cbanek - thank you for the code reviews and approvals. However, I am not comfortable to go ahead with the merge and release, at least not until I get more feedback on the ObsCoreMetadata attributes. Support for proper positional types can be added later but the ObsCoreMetadata attributes are hard to change back after they get out in the wild. The very least I can do is use the ObsCore names, but then the gain is very marginal since those names are already accessible as column names in the tabular representation. Ideas?

@andamian
Copy link
Contributor Author

andamian commented Feb 20, 2020

@funbaker & @cbanek pushed another commit. Attributes in ObsCoreMetadata match the names in ObsCore. I've also fixed the tests so now we can run remote-data ones. After your blessing, can we merge and possibly release?

I have another work in progress in astroquery this time. I've created a vo module that implements their API using pyvo goodies only and is dependent on these changes.

@msdemlei
Copy link
Contributor

msdemlei commented Feb 21, 2020 via email

@andamian andamian merged commit 2c59cea into astropy:master Feb 27, 2020
@andamian andamian deleted the sia2 branch November 30, 2020 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for SIA v2
6 participants