[jts] Support all XML Schema 2 types for type field #124

Closed
rufuspollock opened this issue May 27, 2014 · 7 comments

Comments

@rufuspollock
Contributor

From http://www.w3.org/TR/xmlschema-2/

Specifically, the list is:

  • S = supported
  • A = alias
  • Y = plan to add
  • N = do not plan to add
N/A? anySimpleType                                        # already have any
S string; a sub-value of anySimpleType
N normalizedString; a sub-value of string
N token; a sub-value of normalizedString
N language; a sub-value of token
N Name; a sub-value of token
N NCName; a sub-value of Name
S boolean; a sub-value of anySimpleType
N/A? decimal; a sub-value of anySimpleType  # have number
S integer; a sub-value of decimal
N nonPositiveInteger; a sub-value of integer           # format type of integer??
N negativeInteger; a sub-value of nonPositiveInteger   # ditto
N long; a sub-value of integer                         # ditto
N int; a sub-value of long                             # alias?
N short; a sub-value of int                            # format type of integer?
N byte; a sub-value of short                           # ditto
N nonNegativeInteger; a sub-value of integer           # ditto
N unsignedLong; a sub-value of nonNegativeInteger      # ditto but ?? (when is this useful)
N unsignedInt; a sub-value of unsignedLong             # ditto
N unsignedShort; a sub-value of unsignedInt            # ditto
N unsignedByte; a sub-value of unsignedShort           # ditto but ??
N positiveInteger; a sub-value of nonNegativeInteger   # ditto but ?? - plus we have constraints
N float; a sub-value of anySimpleType                  # alias for number or sub-format of number or N?
N double; a sub-value of anySimpleType                 # format for number or N???
S duration; a sub-value of anySimpleType
S dateTime; a sub-value of anySimpleType
S time; a sub-value of anySimpleType
S date; a sub-value of anySimpleType
Y gYearMonth; a sub-value of anySimpleType
Y gYear; a sub-value of anySimpleType
? gMonthDay; a sub-value of anySimpleType
? gDay; a sub-value of anySimpleType
? gMonth; a sub-value of anySimpleType
? hexBinary; a sub-value of anySimpleType       # could add as format to string. what is use?
S base64Binary; a sub-value of anySimpleType    # supported as format on string
S anyURI; a sub-value of anySimpleType          # supported as format on string

What do we do about existing items not in that list?

  • number (aka decimal)
  • binary (aka base64Binary)
  • datetime (aka dateTime)

Suggest we deprecate and remap ...
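
A minimal sketch of what such a remap could look like, assuming a schema is a plain dict with a `fields` list. Only the three name pairs come from the list above; the function names `remap_type` and `remap_schema` are hypothetical and not part of any existing library.

```python
# Legacy JTS type names mapped to their XML Schema equivalents.
# The three pairs are the ones listed in this issue; nothing else is assumed.
LEGACY_TO_XSD = {
    "number": "decimal",
    "binary": "base64Binary",
    "datetime": "dateTime",
}


def remap_type(type_name):
    """Return the XML Schema name for a legacy type, or the name unchanged."""
    return LEGACY_TO_XSD.get(type_name, type_name)


def remap_schema(schema):
    """Return a copy of a schema dict with every declared field type remapped."""
    fields = []
    for field in schema.get("fields", []):
        new_field = dict(field)
        if "type" in new_field:
            new_field["type"] = remap_type(new_field["type"])
        fields.append(new_field)
    return dict(schema, fields=fields)
```

A reader that accepts both spellings during a deprecation period could call `remap_schema` once on load and work with the XML Schema names from then on.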

Pros

Full compatibility

Cons

Massively expanded list of types that implementors must support and users must know about.

@ldodds
Contributor

ldodds commented Jul 22, 2014

One option would be to:

  • Say that the type field should be a URI (allowing for extension)
  • State that the specification recommends XML Schema data types as the preferred way to handle all simple types, and that implementers shouldn't define new URIs for things covered by XML Schema.
  • State that implementations SHOULD support a subset of the XML Schema types as a useful working subset -- this constrains implementation overheads. The set we use in csvlint seemed to be a good compromise.

This puts a little more context around the usage, keeping the full compatibility, but avoiding implementers having to support all of it?
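
As a rough illustration of this option, the sketch below classifies a `type` URI as a supported XML Schema type, an XML Schema type outside the working subset, or a third-party extension. The namespace prefix is the standard XML Schema one; the contents of `SUPPORTED` are an assumption made for the example (the csvlint set is not listed in this thread), and `classify_type` is a hypothetical name.

```python
# Standard XML Schema namespace used to identify built-in datatypes by URI.
XSD_NS = "http://www.w3.org/2001/XMLSchema#"

# Assumed working subset, for illustration only.
SUPPORTED = {"string", "boolean", "integer", "decimal", "date", "dateTime"}


def classify_type(type_uri):
    """Classify a type URI as 'supported', 'unsupported-xsd' or 'extension'."""
    if type_uri.startswith(XSD_NS):
        local_name = type_uri[len(XSD_NS):]
        return "supported" if local_name in SUPPORTED else "unsupported-xsd"
    # Anything outside the XML Schema namespace is treated as an extension.
    return "extension"
```

An implementation could validate `supported` types strictly and fall back to treating the other two categories as plain strings.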

@rufuspollock
Contributor Author

@paulfitz @pwalsh @danfowler @jpmckinney @morty thoughts welcome here.

Below are my current thoughts:

My general feeling is we want to preserve simplicity. Thus:

On the question of special subtypes of e.g. integer, I've been thinking:

  • Where this would allow us to support a large number of backends it could be worth considering. E.g. if long were common to many systems? (I'm not sure it is)
  • Where this is to do with validation e.g. positiveInteger I'm inclined to say this stuff can go in constraints and we ignore it here
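
To make the constraints idea concrete, here is a sketch that turns each XML Schema integer subtype into a plain `integer` field plus bounds. The bounds are the value ranges defined by XML Schema; the `minimum` and `maximum` keys under `constraints` are assumed to follow the JTS constraints descriptor, and `to_integer_field` is a hypothetical helper.

```python
# (minimum, maximum) for each XML Schema integer subtype; None means unbounded.
INTEGER_SUBTYPE_BOUNDS = {
    "nonPositiveInteger": (None, 0),
    "negativeInteger": (None, -1),
    "nonNegativeInteger": (0, None),
    "positiveInteger": (1, None),
    "long": (-2**63, 2**63 - 1),
    "int": (-2**31, 2**31 - 1),
    "short": (-2**15, 2**15 - 1),
    "byte": (-2**7, 2**7 - 1),
    "unsignedLong": (0, 2**64 - 1),
    "unsignedInt": (0, 2**32 - 1),
    "unsignedShort": (0, 2**16 - 1),
    "unsignedByte": (0, 2**8 - 1),
}


def to_integer_field(name, xsd_type):
    """Build an integer field descriptor whose constraints encode the subtype."""
    minimum, maximum = INTEGER_SUBTYPE_BOUNDS[xsd_type]
    constraints = {}
    if minimum is not None:
        constraints["minimum"] = minimum
    if maximum is not None:
        constraints["maximum"] = maximum
    field = {"name": name, "type": "integer"}
    if constraints:
        field["constraints"] = constraints
    return field
```

For example, `to_integer_field("age", "positiveInteger")` yields an `integer` field with `{"minimum": 1}` as its only constraint, so no new type name is needed.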

Overall, I'm generally averse to adding lots more types or even formats as they add complexity for implementors and users and they seem quite specialist and mainly of value for validation. Happy to hear input as not certain here.

Finally, the special datetime types such as gYear and gYearMonth: I think these are convenient, but I'm still in two minds, especially about gDay and gMonth. Who will use these, and when?

Aside

As a general point I think we want to offer a pattern to users of how they would extend the type system if they did want to persist specific information. For example, SQL users may need to store the specific type information e.g. that it is double vs decimal.
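
One possible shape for such a pattern, purely as a sketch: keep the core `type` unchanged and carry backend-specific type information in a separate property. The `backendTypes` key and both helper names are hypothetical; nothing like them is defined by the spec discussed here.

```python
def with_backend_type(field, backend, native_type):
    """Return a copy of a field with a native type recorded for one backend.

    The hint lives under the hypothetical 'backendTypes' key so that the core
    'type' value stays within the small shared type list.
    """
    new_field = dict(field)
    hints = dict(new_field.get("backendTypes", {}))
    hints[backend] = native_type
    new_field["backendTypes"] = hints
    return new_field


def get_backend_type(field, backend, default=None):
    """Look up the native type recorded for a backend, if any."""
    return field.get("backendTypes", {}).get(backend, default)
```

A SQL exporter could then call `get_backend_type(field, "sql", default="decimal")` to recover `double` where it was recorded, while tools unaware of the extra key keep treating the field as a `number`.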

@pwalsh
Member

pwalsh commented Nov 19, 2015

I'm really not a fan of that type list from XML Schema 2, and think there is great value in us having a (much) smaller type surface.

I think the majority of that list can be handled in JTS as formats/conditions on types. Perhaps there is value in a section of the spec that demonstrates this.

Personally not too convinced on the g* sub-types of datetime.

However, having said that, I've actually built a rather complex calendar/scheduling app, where we definitely needed to use days/months/weeks/hours as distinct objects in the user interface. Persistent data storage was a combination of strings, and date time objects that stored some meaningless info (example: ignore date, just use hour, if schedule repeats weekly).

So, that gives us a use case for those g* objects - to be honest I think it is the only use case that exists (calendars).

@jpmckinney

I agree that the existing number and integer seem sufficient, and that we don't need all the other types that are mostly (if not all) min/max constraints.

The use cases for dates without years are limited. There is also no ISO or similar standard for representing these.

On the other hand, ISO 8601 describes formats matching gYear and gYearMonth, and those two have much more common use cases, so I support adding those.
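
For reference, a simplified sketch of parsing the two forms as ISO 8601 style `YYYY` and `YYYY-MM` strings. It deliberately ignores the negative years, years longer than four digits, and timezone suffixes that the full XML Schema lexical forms allow; `parse_gyear` and `parse_gyearmonth` are hypothetical names.

```python
import re

# Simplified lexical forms: four-digit year, optionally followed by -MM.
G_YEAR = re.compile(r"[0-9]{4}")
G_YEAR_MONTH = re.compile(r"([0-9]{4})-(0[1-9]|1[0-2])")


def parse_gyear(value):
    """Parse 'YYYY' into an int, raising ValueError on anything else."""
    if not G_YEAR.fullmatch(value):
        raise ValueError("not a gYear: %r" % (value,))
    return int(value)


def parse_gyearmonth(value):
    """Parse 'YYYY-MM' into a (year, month) tuple, raising ValueError otherwise."""
    match = G_YEAR_MONTH.fullmatch(value)
    if not match:
        raise ValueError("not a gYearMonth: %r" % (value,))
    return int(match.group(1)), int(match.group(2))
```

Returning plain integers rather than a date object avoids inventing a day or month that the source data never contained.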

@rufuspollock
Contributor Author

@pwalsh @jpmckinney thanks, very useful and judicious comments - as always!

@pwalsh
Member

pwalsh commented Jul 12, 2016

@rgrp can we move to close this?

@roll added the backlog label Aug 8, 2016
@rufuspollock
Contributor Author

FIXED. I've moved the two points from @jpmckinney re gYear and gYearMonth to #105


5 participants