Create a `format` Registry #845

Open
webron opened this Issue Nov 18, 2016 · 28 comments

Member

webron commented Nov 18, 2016

In reference to #607 and #811.

Instead of adding additional formats to the spec, we're considering two options: either a set of formats in a supporting guidelines document, or a formal OAI format registry with a set of guidelines on how to add new formats to it. The registry will serve as an official repository for these formats and tools could use it as a reference. The @OAI/tdc is currently leaning towards the option of the repository.

This ticket is a reminder to tackle it and finalize the approach and the guidelines for either case.

@MikeRalphson

Member

MikeRalphson commented Apr 28, 2017

Summary of existing formats used in the wild. https://github.com/Mermade/openapi-specification-extensions/blob/master/formats/combined.tsv

It may be worth recommending that format strings only contain lower-case letters, digits and hyphens, or some other restriction, to draw attention to the fact it is not intended as an essay on the content of the property or an enum, pattern or template.
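
A minimal sketch of how a tool might check such a restriction, assuming the rule is exactly "lower-case letters, digits and hyphens". The function name `is_simple_format_name` is hypothetical and not part of any OAS tooling:

```python
import re

# Assumed rule: one or more lower-case ASCII letters, digits or hyphens.
_FORMAT_NAME = re.compile(r"[a-z0-9-]+")


def is_simple_format_name(name: str) -> bool:
    """Return True if `name` uses only lower-case letters, digits and hyphens."""
    return _FORMAT_NAME.fullmatch(name) is not None


print(is_simple_format_name("date-time"))            # True
print(is_simple_format_name("yyyy-MM-dd HH:mm:ss"))  # False: upper case, spaces, colons
```

A check like this would flag format values that are really templates or descriptions, while still leaving `format` open-ended.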

@handrews

handrews commented Apr 3, 2018

@MikeRalphson

Member

MikeRalphson commented Apr 3, 2018

Proposed candidates discussed in other issues (work-in-progress):

| Type(s) | Format | Description | Issue |
|---|---|---|---|
| integer | int53 | 53-bit integer | #1517 |
| integer | int16 | Signed 16-bit integer (short) | |
| integer | int8 | Signed 8-bit integer (to help with misunderstandings of the current byte format) | |
| integer | uint8 | Unsigned 8-bit integer (to help with misunderstandings of the current byte format) | |
| number\|string | decimal | Fixed-point decimal numbers | #889 |
| string | int64s | 64-bit integer held as a string for interoperability reasons | #1517 |
| string | uuid | UUID v4 format - RFC4122 | #542 |
| string | base64url | URL-safe binary | #606 |
| string | time | Time of day - as defined by partial-time - RFC3339 | #358 |
| string | duration | Duration - as defined by xs:dayTimeDuration - XML Schema 1.1 / ISO 8601 | #359 |
@MikeRalphson

Member

MikeRalphson commented Apr 3, 2018

@handrews - I like the idea of potentially delegating responsibility for formats to a JSON Schema-led registry and collaborating on that.

For now, we'll keep a small standard set in one of the major specification drafts.

How close would we be on the "core" OAS formats?

@darrelmiller

Member

darrelmiller commented Apr 3, 2018

And there are a bunch more here #607

@handrews

handrews commented Apr 3, 2018

@MikeRalphson regarding "core" OAS formats:

  • "byte" should be done with "contentEncoding": "base64" as of draft-07
  • "binary" should be done with "contentMediaType": "application/octet-stream" as of draft-07
  • Both of the above have existed in some form for much longer than draft-07 under various names
  • The numeric formats are reasonable (to me) but we've never really sorted that out
  • "password" should be done through a UI vocabulary, so probably not a format

@darrelmiller regarding #607:

  • "decimal" is proposed for JSON Schema, but as a string format, if someone who understands the use cases and concerns will just write a $*%$@ PR for it. It could have been in draft-07 if anyone who was asking about it had stepped up. I don't think it's possible to reliably implement it as a number format, so I'm confused there.
  • "uuid" Is this just a UUID or a UUID URN? The latter should be {"type": "string", "format": "uri", "pattern": "^urn:uuid:"}
  • "duration" is one that I'd like to have, we already have the other date and time ones
  • "base64url" should also be "contentEncoding": "base64url", not format
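
The mappings in the two lists above can be sketched as schema fragments, written here as Python dicts. The variable names and the `matches_pattern` helper are illustrative only; the keywords and values are the ones quoted in the lists, and no full JSON Schema validator is assumed:

```python
import re

# OAS "format": "byte" expressed with the draft-07 keyword named above.
byte_schema = {"type": "string", "contentEncoding": "base64"}

# OAS "format": "binary" expressed with the draft-07 keyword named above.
binary_schema = {"type": "string", "contentMediaType": "application/octet-stream"}

# URL-safe binary, using contentEncoding rather than format, as suggested above.
base64url_schema = {"type": "string", "contentEncoding": "base64url"}

# A UUID URN, using the pattern quoted above.
uuid_urn_schema = {"type": "string", "format": "uri", "pattern": "^urn:uuid:"}


def matches_pattern(schema: dict, value: str) -> bool:
    """Apply only the schema's `pattern` keyword (an unanchored regex search)."""
    return re.search(schema["pattern"], value) is not None


print(matches_pattern(uuid_urn_schema, "urn:uuid:f81d4fae-7dec-11d0-a765-00a0c90a6e6b"))  # True
print(matches_pattern(uuid_urn_schema, "f81d4fae-7dec-11d0-a765-00a0c90a6e6b"))           # False
```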
@cyberphone

cyberphone commented Apr 23, 2018

What's the difference between "byte" and "binary"?
In most newer standards, byte-arrays are encoded as Base64Url rather than Base64.
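
For readers unfamiliar with the two encodings, a small sketch using Python's standard `base64` module shows where they differ. The input bytes are chosen purely to hit the two characters that change:

```python
import base64

# Two bytes whose encoding uses the last two symbols of the Base64 alphabet.
raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode("ascii")          # alphabet ends in '+' and '/'
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # alphabet ends in '-' and '_'

print(standard)  # +/8=
print(url_safe)  # -_8=
```

Both calls round-trip to the same bytes with their matching decoder; only the alphabet differs, which is why the URL-safe variant can be placed in URLs and file names without escaping.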

BigInteger support is currently available in Java, .NET, and probably in a bunch of other platforms as well. Java and .NET use entirely different BigInteger serialization schemes.

I would separate a possible BigNumber type from Decimal/Money because the latter is based on decimal arithmetic and usually does not come with exponents. If an application needs exponents, BigNumber would be the logical choice.

@cyberphone

cyberphone commented May 17, 2018

The JSON-P API for Java uses the following algorithm for long and a similar one for other extended numeric types:

if (value >= 9007199254740992 || value <= -9007199254740992)
    serializeAsString(value);
else
    serializeAsNumber(value);

IMO, it is a bad idea but they claim it is the "industry standard" as well as the correct interpretation of the JSON RFC.
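
To make the behaviour concrete, here is a rough Python rendering of the rule quoted above. It is a sketch of the described threshold logic only, not the JSON-P implementation, and the name `serialize_long` is made up for illustration:

```python
import json

# 2**53: at or beyond this magnitude, not every integer is exactly
# representable as an IEEE 754 double.
LIMIT = 9007199254740992


def serialize_long(value: int) -> str:
    """Emit a JSON string at or beyond the threshold, otherwise a JSON number."""
    if value >= LIMIT or value <= -LIMIT:
        return json.dumps(str(value))
    return json.dumps(value)


print(serialize_long(9007199254740991))  # 9007199254740991
print(serialize_long(9007199254740992))  # "9007199254740992"
```

One consequence visible in the sketch: the JSON type of the same property depends on its value, so a consumer has to accept both a number and a string.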

https://github.com/cyberphone/I-JSON-Number-System#extended-numeric-data-types-compliant-with-i-json

@ioggstream

ioggstream commented May 29, 2018

I propose the fixed-point numeric decimal as per

  • "DECIMAL" SQL standard "ISO/IEC 9075-2: 2016 12 15"
@cyberphone

cyberphone commented May 29, 2018

I propose the fixed-point numeric decimal as per
"DECIMAL" SQL standard "ISO/IEC 9075-2: 2016 12 15"

Don't you just love standards organizations that want money for their work? https://webstore.iec.ch/publication/59685

@ioggstream

ioggstream commented May 29, 2018

"DECIMAL" SQL standard "ISO/IEC 9075-2: 2016 12 15"
Don't you just love standards organizations that want money for their work?

Use the draft, Luke ;)

@grokify

grokify commented Jun 18, 2018

Specifically for date/times, given the large variety of formats that exist in the wild, would a format specification be more flexible and easy to implement in code generation tools when compared to a registry enumeration? For example, a format could use strftime or Go's time format.

@ioggstream

ioggstream commented Jun 18, 2018

Specifically for date/times, given the large variety of formats
that exist in the wild, ... could use strftime or Go's time format.

@grokify iiuc date-time uses JSON Schema's date-time which is based on RFC 3339 which is a subset of ISO 8601.

@grokify

grokify commented Jun 18, 2018

Specifically for date/times, given the large variety of formats that exist in the wild, ... could use strftime or Go's time format.

@grokify iiuc date-time uses JSON Schema's date-time which is based on RFC 3339 which is a subset of ISO 8601.

@ioggstream I understand that and I'm a big fan of RFC 3339. However, while restricting date-time to RFC 3339 is useful when creating an API, it is not as flexible for describing existing APIs that do not use it.

When looking at the registry provided by @MikeRalphson, there are a number of non-conforming date times:

https://github.com/Mermade/openapi-specification-extensions/blob/master/formats/combined.tsv

With the OAI format registry approach, would each of the date / time formats be listed as a separate format in the registry, e.g. not date-time?

A recent example of an API I'm attempting to use is the Insightly API. They provide an OpenAPI 2.0 spec that specifies date-time but they do not use RFC 3339. Their formats are listed in their documentation under "Date Formatting":

  • Query string: M/d/yyyy h:mm:ss AM/PM - 11/7/2015 8:07:05 AM
  • Object data: yyyy-MM-dd HH:mm:ss - 2015-04-10 21:15:00

This causes the Swagger Codegen SDK I'm attempting to use to raise exceptions with response data. In order to continue to use Swagger Codegen right now, my current thinking is one of the following:

  1. Modify Insightly Swagger Spec to use a simple string type with no format and do manual conversion in code
  2. Modify Insightly Swagger Spec and Swagger Codegen to use JSON Schema x-date-time-format property with auto-generated custom parsing

Aside from formal spec support which I'm hoping to have here, I'm leaning to 1 for a short term solution and 2 for a longer-term solution.

For an official spec solution, I was thinking it would be nice to specify something like a strftime or Golang string format so code generators know how to handle alternative date times. This can leverage date-time or be a different format to not confuse the issue with JSON Schema. I think something like this is necessary for better, practical codegen, either official or unofficial.
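
As a rough illustration of what a generator would need to emit, the two Insightly patterns quoted above can be handled with `strptime` in Python. The `%`-style strings are my own translation of those patterns (an assumption, not taken from Insightly's documentation), and the `%p` directive depends on the process locale:

```python
from datetime import datetime

# Assumed strptime equivalents of the two documented patterns.
QUERY_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # M/d/yyyy h:mm:ss AM/PM
OBJECT_FORMAT = "%Y-%m-%d %H:%M:%S"    # yyyy-MM-dd HH:mm:ss

query_value = datetime.strptime("11/7/2015 8:07:05 AM", QUERY_FORMAT)
object_value = datetime.strptime("2015-04-10 21:15:00", OBJECT_FORMAT)

print(query_value.isoformat())   # 2015-11-07T08:07:05
print(object_value.isoformat())  # 2015-04-10T21:15:00
```

Neither pattern carries a UTC offset, so both results are naive datetimes; that is information an RFC 3339 date-time would have included.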

Since OpenAPI Specification may need JSON Schema to support this approach, I've mentioned this for JSON Schema as well:

json-schema-org/json-schema-spec#613

I'm curious to hear the thoughts of those that perform code generation, including @webron who is also on the Swagger Codegen project.

@MikeRalphson

Member

MikeRalphson commented Jun 19, 2018

When looking at the registry provided by @MikeRalphson, there are a number of non-conforming date times:

https://github.com/Mermade/openapi-specification-extensions/blob/master/formats/combined.tsv

With the OAI format registry approach, would each of the date / time formats be listed as a separate format in the registry, e.g. not date-time?

Just for clarity, the list linked to above is not a "registry", it's merely a survey (and an out-of-date one at that) of format values used in the wild. There is no expectation that values from that list would be added unreviewed to the proposed registry. There is no guarantee that a format value used by one provider has the same meaning across all APIs which use it. format will remain an open-ended property where you can define your own values.

With regard to the Insightly API, both of your approaches sound fine, though I would add:

  1. Notify upstream API provider of the error in their OAS document

Certainly the meaning of date-time inherited from JSON Schema is very unlikely to change, and unless there is a common standard for custom date formatting (as opposed to Go's, JavaScript's, C's strftime, etc.), having such a format (or additional property) would not meet OAS's goal of being language-neutral.

@handrews

handrews commented Jun 19, 2018

@MikeRalphson @grokify it is correct that date-time and the other RFC 3339-based format values will not change.

As I noted in @grokify's issue filed on JSON Schema, our solution to the problem of managing format extensibility is being tracked at json-schema-org/json-schema-spec#563

While I think it's likely that we will add some duration formats alongside the existing date and time formats, that would be the extent of any additions to the JSON Schema spec itself. The myriad other possible ways to express these concepts would be handled as extensions.

@grokify

grokify commented Jun 20, 2018

@MikeRalphson

Notify upstream API provider of the error in their OAS document

I'm hesitant to do this because the current solution of falling back to a string is lossy, and that information is important. I prefer to have this information and then find some way to handle it myself. For Option 1, it's easy to write a script that removes the format value, and the same pass tells me which properties need additional processing for Option 2. Without this information, it would be harder to know, on a systematic basis, which properties Option 2 should apply to.
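
A minimal sketch of such a script, assuming the spec has already been loaded into Python dicts and lists. The function name, the sample schema, and the property names are invented for illustration and are not taken from the Insightly spec:

```python
from typing import Any, List, Optional


def strip_date_time(node: Any, path: str = "#",
                    found: Optional[List[str]] = None) -> List[str]:
    """Remove `"format": "date-time"` in place and return the paths where it was found."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("type") == "string" and node.get("format") == "date-time":
            del node["format"]
            found.append(path)
        for key, child in node.items():
            strip_date_time(child, f"{path}/{key}", found)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            strip_date_time(child, f"{path}/{index}", found)
    return found


# Hypothetical fragment of a Swagger 2.0 document.
spec = {
    "definitions": {
        "Contact": {
            "type": "object",
            "properties": {
                "DATE_CREATED_UTC": {"type": "string", "format": "date-time"},
                "FIRST_NAME": {"type": "string"},
            },
        }
    }
}

paths = strip_date_time(spec)
print(paths)  # ['#/definitions/Contact/properties/DATE_CREATED_UTC']
```

The returned list is the part that matters for Option 2: it records which properties carried the date-time hint before it was removed. The paths are not escaped as JSON Pointers, which is fine for a sketch.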

Correcting this information would be better once we have some general, alternative way to provide this.

I've built 5+ codegen SDKs so far and have to do Swagger spec processing on a few of them. I prefer this to having less information about the API.

@MikeRalphson

Member

MikeRalphson commented Jun 20, 2018

@grokify I would expect the API provider to replace, not remove the misleading and incorrect format value with one (or more!) of their own.

@grokify

grokify commented Jun 20, 2018

@grokify I would expect the API provider to replace, not remove the misleading and incorrect format value with one (or more!) of their own.

@MikeRalphson I would make a recommendation if there was a standard approach, but it seems there is no standard approach to replace this info at this time. I may look into implementing a custom codegen module at which time it may make sense to create a "de facto" standard approach. Until such time, I'm hesitant to encourage proliferation of non-standard approaches.

My understanding is that Swagger 2.0 Spec, and possibly OpenAPI 3.0 Spec, was not designed to specify all possible APIs. If this is the case, some APIs may not be adequately specified by the spec. With Swagger 2.0 Spec, some features in an API I manage could not be fully specified so we had to use a non-valid spec in our Swagger UI implementation. At the time multipart/mixed was not supported; however, I requested that it be supported in the 3.0 spec, so hopefully we can have a valid spec with 3.0: #303

@MikeRalphson

Member

MikeRalphson commented Jun 20, 2018

To me, it seems there is no standard approach to replace this info at this time. How would you recommend this spec be updated given the description of the API?

@grokify With any non-date-time value for format (which is an extensible property; anyone can define their own values, e.g. insightly-date-query / insightly-date-object). date-time means what it means, and shouldn't be used where it is inappropriate. I would expect codegen tools to have some kind of mechanism for overriding the behaviour of unknown format values, but this might not be the case.
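
One possible shape for such an override mechanism, sketched in Python. This is not how any particular codegen tool works; `FORMAT_PARSERS` and `decode` are hypothetical names, and the strptime patterns are an assumed translation of the Insightly formats mentioned earlier in the thread:

```python
from datetime import datetime
from typing import Callable, Dict

# Hypothetical provider-defined format names mapped to user-supplied parsers.
FORMAT_PARSERS: Dict[str, Callable[[str], object]] = {
    "insightly-date-query": lambda s: datetime.strptime(s, "%m/%d/%Y %I:%M:%S %p"),
    "insightly-date-object": lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"),
}


def decode(value: str, schema: dict) -> object:
    """Decode a string by its schema format; unknown formats fall back to the raw string."""
    parser = FORMAT_PARSERS.get(schema.get("format", ""))
    return parser(value) if parser else value


print(decode("2015-04-10 21:15:00", {"type": "string", "format": "insightly-date-object"}))
print(decode("anything", {"type": "string", "format": "some-unknown-format"}))
```

Because unknown values fall through unchanged, a document using custom format names stays usable by tools that have no handler registered for them.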

My understanding is that Swagger 2.0 Spec, and possibly OpenAPI 3.0 Spec, was not designed to specify all possible APIs.

Correct.

With Swagger 2.0 Spec, some features in an API I manage could not be fully specified so we had to use a non-valid spec in our Swagger UI implementation.

Did you have to hack Swagger-UI? I would recommend using the supported specification extensions to do this, and to not deviate from the spec itself. I don't see how having an invalid OAS document helps you at all.

@grokify

grokify commented Jun 20, 2018

@MikeRalphson Ah, I see format is an open value now. I would probably recommend using strftime, Go format, or some other format rather than something like insightly-date-query which would require an external reference. Something more generic like the following would be more useful in my mind, specifically for code generation.

"format": "strftime"
"x-format": "%Y-%m-%d %H:%M:%S"

or

"format": "go-time"
"x-format": "2006-01-02 15:04:05"

Did you have to hack Swagger-UI? I would recommend using the supported specification extensions to do this, and to not deviate from the spec itself. I don't see how having an invalid OAS document helps you at all.

For Swagger UI, I've hacked both its code and json-editor. If you read the referenced GitHub issue, there was no supported extension in 2.0.

As for how an invalid OAS spec helps, the goal of Swagger UI is to support UI based access to APIs. If not all APIs are supported by OAS spec and hacking Swagger UI can make those APIs available in the UI, it is beneficial to support that use case. We have a different version of our spec that validates that we use for code generation. Different specs for different use cases.

@MikeRalphson

Member

MikeRalphson commented Jun 20, 2018

@grokify glad we're on the same page now. Yes extending the schema object with something like x-format-specifier would be a suitable approach.

If you read the referenced GitHub issue, there was no supported extension in 2.0.

Which issue? If you mean in the spec, specification/vendor extensions are supported in OAS 2.0 in the same way as in 3.0.x

As for how an invalid OAS spec helps...

This isn't an argument for an invalid OAS document, it's an argument for an extended one.

@ioggstream

ioggstream commented Jun 20, 2018

@grokify imho

"format": "strftime"
"x-format": "%Y-%m-%d %H:%M:%S"

two wrongs won't make it right ;)

@grokify

grokify commented Jun 20, 2018

As for how an invalid OAS spec helps...

This isn't an argument for an invalid OAS document, it's an argument for an extended one.

@MikeRalphson An invalid spec is not a perfect solution and there is no general desire to have an invalid spec. It is a compromise given the state of the ecosystem which is more than the spec itself. Sometimes having a valid spec for a not-well supported capability means more invasive changes and maintenance are needed in ecosystem tools. Sometimes, these changes can be justified, but other times it's harder to do, especially when official support will be available in the future. In this specific scenario, having two specs, a valid one for interoperability and another specifically to be used in one app without direct consumption seemed like a reasonable compromise.

"format": "strftime"
"x-format": "%Y-%m-%d %H:%M:%S"

two wrongs won't make it right ;)

@ioggstream I'm trying to find a working, practical solution. This approach meets the requirements of being valid, providing more specification and supporting code generation better. A modification to this approach I'm thinking of is using x-pattern instead of x-format. JSON Schema already uses pattern for regular expressions and that seems similar to the use case here. While pattern always represents a regular expression, x-pattern can be format specific.

@ioggstream

ioggstream commented Jun 20, 2018

@grokify I'm currently working on a large API Ecosystem.

I understand the goal of documenting the "as-is", but:

  • using alternate format for dates is probably a bad practice
  • that should be blamed and made difficult
  • we should strive for harmonization and not proliferation

The only exception could be unix timestamp, which is probably better treated as numeric (@MikeRalphson have your say;).

If you provide a different use case (eg not tied to a bad spec) I'm still interested :)

Peace, R.

@grokify

grokify commented Jun 20, 2018

I understand the goal of documenting the "as-is", but:

  • using alternate format for dates is probably a bad practice
  • that should be blamed and made difficult
  • we should strive for harmonization and not proliferation

@ioggstream I understand this point of view from a standards perspective, but, from a practitioner perspective, we should prioritize working over perfect. Inability to support many APIs that exist will make the spec less useful.

For new APIs, we should certainly strive for implementation harmonization and I spend quite a bit of time thinking about how to spec new APIs via OpenAPI. However, there's also often a requirement to access existing APIs, which can take some time to update.

Using format and x-pattern is valid per the spec and seems like a reasonable compromise that can lead to improved processing harmonization sooner. With codegen SDKs, it can make the differences in date format transparent to app code.

@ansonkao

ansonkao commented Dec 3, 2018

Is time an officially recognized format, or not?

Above, I see it in the table like so:

| Type(s) | Format | Description | Issue |
|---|---|---|---|
| string | time | time of day - as defined by partial-time - RFC3339 | #358 |

But it is not on the official docs next to date and date-time

@MikeRalphson

Member

MikeRalphson commented Jan 17, 2019

@ansonkao

Is time an officially recognized format, or not? Above, I see it in the table like so... But it is not on the official docs next to date and date-time

The table above is a list of possible candidates to be added to a formats registry. So time is not an "official" format within the OAS, but as format is an open-ended property, you may use it today without problems.
