new xtype="uuid" #17

Closed
pdowler opened this issue Nov 23, 2022 · 9 comments
Labels
1.2 in scope for version 1.2

Comments

@pdowler
Collaborator

pdowler commented Nov 23, 2022

a common unique identifier value type with canonical ascii serialization, eg e0b895ca-2ee4-4f0f-b595-cbd83be40b04

main use case at CADC: primary key in databases with TAP access

@Zarquan
Member

Zarquan commented Nov 24, 2022

+1

@molinaro-m
Member

While I agree that UR[I]dentifiers would benefit from a standardised xtype (#16), I wonder if standardising one for uu[id]entifiers is needed. I guess it is the usual balance/threshold issue of reserving a word versus fixing it in a standard...

@Zarquan
Member

Zarquan commented Nov 28, 2022

UUID values have rules on what they can contain and a standard RFC 4122 for serializing them, so an xtype would mean the values can be validated.
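To illustrate the validation point, here is a minimal sketch of a check for the canonical hyphenated form. The function name `is_canonical_uuid` is hypothetical and not taken from any VO library. A regular expression is used because Python's `uuid.UUID()` constructor also accepts other spellings (no hyphens, braces, a `urn:uuid:` prefix), which is looser than a strict canonical-form check.

```python
import re

# Canonical RFC 4122 text form: 8-4-4-4-12 hex digits, 36 characters in total.
CANONICAL_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_canonical_uuid(value: str) -> bool:
    """Return True if value is a UUID in the canonical hyphenated form.

    Hypothetical helper for illustration only.
    """
    # fullmatch requires the whole string to match, so trailing text is rejected.
    return CANONICAL_UUID.fullmatch(value) is not None
```

A client could run such a check on every cell of a column declared with xtype="uuid" and report malformed values in one place.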

@Zarquan
Member

Zarquan commented Nov 28, 2022

@pdowler would this xtype apply to datatype=long[4], char[36] or either ?

@pdowler
Collaborator Author

pdowler commented Nov 28, 2022

I use datatype="char" arraysize="36" xtype="uuid" because it makes reading a query result in text and using uuid values in queries much simpler, cut&paste, etc.

And yes, using canonical ascii form as described in the above RFC.

@pdowler
Collaborator Author

pdowler commented Apr 17, 2023

In response to Marco: for my usage of xtype="uuid" it allows my VOTable parser to convert the chars into a UUID object rather than String. Without it, I would have to do the string-to-uuid in more code, potentially every piece of client code that encounters UUIDs.... and try to consistently deal with failure of the implicit validation (detect errors). So it makes my code better and if someone else cares about UUID it makes their code better. And as usual, if some other software doesn't know or care about xtype="uuid" they still get a string.
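A rough sketch of the parser behaviour described above, in Python rather than the Java used at CADC. The function `convert_cell` is a hypothetical name, not part of any existing VOTable library. It assumes the column is declared as datatype="char" and that the parser has the column's xtype at hand.

```python
import uuid
from typing import Optional, Union


def convert_cell(text: str, xtype: Optional[str]) -> Union[uuid.UUID, str]:
    """Convert one char-valued table cell according to the column's xtype.

    Hypothetical helper: a parser that understands xtype="uuid" returns a UUID
    object, and anything else falls through as a plain string.
    """
    if xtype == "uuid":
        # uuid.UUID raises ValueError on malformed input, so the implicit
        # validation happens once, here, instead of in every piece of client code.
        return uuid.UUID(text)
    return text
```

Software that does not know the xtype simply never takes the first branch and still gets the string.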

Without that xtype, I'd have to use something like xtype="opencadc:uuid" and no one else would gain the benefit.

Aside from our internal usage (caom2, storage-inventory, vospace, numericid for users), I'm seeing UUID more and more in other systems, for example OIDC sub.

@aragilar

Silly question, but are xtypes explicitly tied to a particular representation (so you couldn't have the same xtype for a UUID for char[36] and datatype=long[4], or if there were to be a 128bit int type), or is that something that might be assumed by parsers?

@pdowler
Collaborator Author

pdowler commented Apr 18, 2023

In the case of xtype="uuid", one could unambiguously allow for any of:

- datatype="char" arraysize="36" aka canonical hex representation
- datatype="long" arraysize="2" aka lower and upper bits in that order
- datatype="byte" arraysize="16" aka raw bytes

Clients could in principle parse any such values into a UUID object, so as output (in a VOTable column) all of those are usable and could co-exist.
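As a sketch of how a client might map all three serializations onto one UUID object, using Python's standard `uuid` module. The helper names are hypothetical. The long pair follows a literal reading of "lower and upper bits in that order" (low 64 bits first, then high 64 bits, each as a signed 64-bit value); that ordering is an assumption about the intent, not something the thread pins down.

```python
import struct
import uuid

MASK64 = (1 << 64) - 1


def to_longs(u: uuid.UUID) -> tuple:
    """Split a UUID into (low, high) signed 64-bit integers.

    ASSUMPTION: low half first, then high half.
    """
    low = u.int & MASK64
    high = u.int >> 64
    # Reinterpret the unsigned halves as signed 64-bit values.
    return struct.unpack(">qq", struct.pack(">QQ", low, high))


def from_longs(low: int, high: int) -> uuid.UUID:
    """Rebuild a UUID from (low, high) signed 64-bit integers."""
    low_u, high_u = struct.unpack(">QQ", struct.pack(">qq", low, high))
    return uuid.UUID(int=(high_u << 64) | low_u)


u = uuid.UUID("e0b895ca-2ee4-4f0f-b595-cbd83be40b04")
as_char = str(u)         # char[36]: canonical hex representation
as_bytes = u.bytes       # byte[16]: raw bytes, most significant byte first
as_longs = to_longs(u)   # long[2]: see the ordering assumption above
```

All three round-trip to the same object, which is why they could co-exist as output.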

As input (via an HTTP parameter or in an ADQL query) a service can only really say that it accepts one of those (eg datalink service descriptor) or that the column in the tap_schema is one of those types. In that context I always found char to be the most convenient: no url encoding needed for params, simple quoted string used in ADQL queries, and can generally cut&paste values without strange failures.
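A small sketch of why the char form is convenient as input. The parameter name `ID` and the table `example.observation` are made up for illustration and do not refer to a real service or schema.

```python
from urllib.parse import quote, urlencode

value = "e0b895ca-2ee4-4f0f-b595-cbd83be40b04"

# Hex digits and hyphens are unreserved URL characters, so percent-encoding
# leaves the canonical form unchanged.
encoded = quote(value, safe="")

# Hypothetical HTTP parameter name.
query_string = urlencode({"ID": value})

# Hypothetical table and column; the value is just a quoted string literal.
adql = f"SELECT * FROM example.observation WHERE id = '{value}'"
```

The same text can be pasted from a query result into either context without any conversion step.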

So, to actually answer the question: xtypes do specify a serialization that applies to values in VOTable, HTTP parameters, etc and we usually pick one serialization that works well in those places. We could allow for other serializations but it would have to add something useful.

A secondary concern is that the values should in principle be usable by s/w that doesn't grok the xtype. For example, if I have a tap service and a column is a primary key of xtype="uuid" I think datatype="char" makes that PK column more or less equally usable to a client that doesn't know what a uuid is - it's a little less type safe but that's all. Arrays of byte or long as a PK would be more complex for minimal benefit.

@pdowler
Collaborator Author

pdowler commented May 29, 2023

PR #24

@pdowler pdowler added the 1.2 in scope for version 1.2 label Jul 5, 2023
@pdowler pdowler closed this as completed Sep 14, 2023