Should we allow fuzzy dates and use of ISO8601 notation for durationInDays #723

Closed
timgdavies opened this issue Jun 8, 2018 · 12 comments
Labels
Schema: Validation Relating to constraints in the JSON Schema

@timgdavies
Contributor

This issue is for discussion of possible changes that could be made in future versions of OCDS to the handling of dates. We want to identify if there are use cases that would benefit from this, and how it would impact users and applications.

The current situation and future considerations

Dates and times

Whenever OCDS includes a date, we ask for a full ISO date-time consisting of:

YYYY-MM-DDTHH:MM:SSZ

That is: year, month, day, hour, minutes, seconds and timezone.

This means that if a source system only has a date recorded, it has to set a time and timezone.

This was based on the understanding that in general:

  • Exact times are important in procurement contexts (e.g. knowing exactly when a bid has to be in);
  • The burden of identifying the 'default' time to use for any field should fall on the publisher, not the user of data.

However, this does mean that OCDS may not always fully represent what is in a source system: if a source system has only a date, adding a time when generating OCDS data adds information that was not in the source. This may justify relaxing our validation of date fields to allow simply YYYY-MM-DD in future.
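To make the trade-off concrete, here is a minimal Python sketch of what a publisher has to do today when the source system holds only a date. The function name `to_ocds_datetime` and the midnight-UTC default are assumptions for illustration; OCDS does not prescribe either.

```python
from datetime import date, datetime, time, timezone


def to_ocds_datetime(value: str, default_time: time = time(0, 0, 0)) -> str:
    """Return a full YYYY-MM-DDTHH:MM:SSZ string.

    Date-only input is padded with `default_time` in UTC. That default is an
    assumption made for this sketch; a publisher would pick its own.
    """
    if "T" in value:
        # Already a date-time. Older Pythons reject a trailing "Z", so swap it.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            raise ValueError("date-time has no timezone: " + value)
    else:
        # Date only: the time and timezone added here were not in the source.
        parsed = datetime.combine(
            date.fromisoformat(value), default_time, tzinfo=timezone.utc
        )
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

A date-only value such as "2018-06-08" comes back with a time that the source never recorded, which is exactly the loss of fidelity described above.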

Duration in Days

OCDS 1.1 introduced the durationInDays field to the Period object to accommodate cases where the length of a period is known, but the exact startDate or endDate are not, or are not specified.

A question has been raised as to whether the 'days' requirement is expressive enough, and whether the full ISO 8601 duration syntax could be supported.

This would allow expression of concepts such as 'Two Months', which, once an actual startDate or endDate is known, can be expanded to an exact number of days. A duration given initially in days, by contrast, is a fuzzy number that has to make assumptions about the average length of a month.
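As a rough illustration of that expansion, the sketch below turns a duration in whole months into an exact day count once a start date is known. The helper names `add_months` and `exact_days` are hypothetical, and clamping to the last day of a shorter month is one possible convention, not something OCDS defines.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day to month end."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(start.day, last_day))


def exact_days(start: date, months: int) -> int:
    """Number of calendar days covered by `months` months from `start`."""
    return (add_months(start, months) - start).days


# 'Two Months' is 59 days from 1 January 2018 but 62 days from 1 July 2018.
print(exact_days(date(2018, 1, 1), 2), exact_days(date(2018, 7, 1), 2))
```

The same 'Two Months' yields different day counts depending on the start date, which is why a day count fixed in advance can only be approximate.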

Questions for input

We would welcome input on use cases from either publishers or data users that need (a) fuzzy dates; (b) complex durations.

We would welcome reflections on whether there are backwards compatible changes that can be made here to accommodate any such use cases.

@timgdavies timgdavies added the Schema Relating to other changes in the JSON Schema (renamed fields, schema properties, etc.) label Jun 8, 2018
@timgdavies timgdavies added this to the 2.0 milestone Jun 8, 2018
@timgdavies
Contributor Author

There is a blog post at https://www.open-contracting.org/2018/06/08/territory-map-storing-sharing-open-contracting-data/ which explores some aspects of modelling dates in OCDS and design principles that have been applied to date.

@mpostelnicu

mpostelnicu commented Jun 8, 2018

I've just read your post. I was wondering if you've considered optionally using a format attribute, in ISO 8601 format notation, and letting the user input the date in whatever format she sees fit, while specifying that format. An example would be date="2018-02-03", format="yyyy-MM-dd". Or you can always use a more comprehensive format when needed, like format="yyyy-MM-dd'T'HH:mm:ssZ". That would pose some problems when it comes to schema validation, but would provide more flexibility.

The same goes for duration: we can keep durationInDays and provide a duration field with the full duration description as provided by the standard, like duration="P3Y6M4DT12H30M5S" => "three years, six months, four days, twelve hours, thirty minutes, and five seconds". And use the oneOf schema validation attribute to ensure you cannot use both at the same time in the same element...
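For anyone wanting to see what reading such a value involves, here is a small Python sketch that splits an ISO 8601 duration string into its components. It only covers the plain PnYnMnDTnHnMnS form with whole numbers (no weeks, no fractions), and the function name `parse_duration` is made up for this example.

```python
import re

# Subset of ISO 8601 durations: integers only, no week ("W") designator.
_DURATION = re.compile(
    r"P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)


def parse_duration(text: str) -> dict:
    """Return the components present in the duration, e.g. {"months": 2}."""
    match = _DURATION.fullmatch(text)
    if match is None or text.endswith("T"):
        raise ValueError("not a supported ISO 8601 duration: " + text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        # A bare "P" carries no components and is not a valid duration.
        raise ValueError("empty duration: " + text)
    return parts


print(parse_duration("P3Y6M4DT12H30M5S"))
```

Note that "M" means months before the "T" and minutes after it, so the position in the string matters.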

Just bouncing some ideas.

@timgdavies
Contributor Author

Thanks @mpostelnicu: useful reflections. Whilst I can see specifying a format working quite nicely in XML idiom (value and attributes), it feels less familiar/easy to model in JSON. It also places more burden on the user, who has to process the data more heavily before the common use case of comparing dates etc.

On duration though, I like the idea of a backwards compatible duration field as an option instead of (or even alongside?) durationInDays. I say alongside, as I can envisage cases where both might be specified, but if given alongside one another, the validation would need to check that the durationInDays is legitimately within the possible range specified by duration.
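A very rough sketch of that consistency check, in Python: it derives loose lower and upper bounds from the shortest and longest possible month (28 and 31 days) and year (365 and 366 days). The function names are hypothetical, and the bounds are deliberately loose (twelve consecutive months can never total 12 × 28 days, for example), so this would accept some values a stricter check would reject.

```python
from typing import Tuple


def day_bounds(years: int = 0, months: int = 0, days: int = 0) -> Tuple[int, int]:
    """Loose (min, max) number of days a years/months/days duration can span."""
    low = years * 365 + months * 28 + days
    high = years * 366 + months * 31 + days
    return low, high


def is_consistent(duration_in_days: int, years: int = 0, months: int = 0,
                  days: int = 0) -> bool:
    """True if durationInDays falls within the loose bounds of the duration."""
    low, high = day_bounds(years, months, days)
    return low <= duration_in_days <= high


# A two-month duration alongside durationInDays of 61 passes; 90 does not.
print(is_consistent(61, months=2), is_consistent(90, months=2))
```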

@duncandewhurst
Contributor

In the Open Contracts Prishtina platform there is a 'contract year' field, populated as follows:

We take all the contracts from an importer and we convert them from csv to json. All contracts are grouped based on the year they were signed for example 2017.csv holds all the contracts signed in 2017. Sometimes some of the contracts don't have the signing date so we take the year based on the csv document they are.

Relaxing date validation to permit just YYYY would enable this to be disclosed in contract/dateSigned.

@ColinMaudry
Member

The French essential procurement data standard specifies the contract length in months. So far, to convert the data to OCDS, we multiply number of months by 30.5.

@jpmckinney jpmckinney added Schema: Validation Relating to constraints in the JSON Schema and removed Schema Relating to other changes in the JSON Schema (renamed fields, schema properties, etc.) labels Jul 22, 2020
@jpmckinney jpmckinney modified the milestones: 2.0.0, 1.2.0 Strict Jul 22, 2020
@jpmckinney jpmckinney added this to To do in OCDS 1.2 via automation May 19, 2021
@ColinMaudry
Member

ColinMaudry commented Aug 18, 2021

In EU eForms, periods of time are expressed in straight days, months or years. Examples:

 <cbc:DurationMeasure unitCode="DAY">150</cbc:DurationMeasure>
 <cbc:DurationMeasure unitCode="MONTH">6</cbc:DurationMeasure>
 <cbc:DurationMeasure unitCode="YEAR">4</cbc:DurationMeasure>

Mapping this data to OCDS requires multiplying the months by 30 and the years by 365. Over long periods of time (e.g. concessions of 50 years), a period converted this way can be off by more than a week.
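To put a number on the drift, this Python sketch compares the years × 365 approximation with the actual calendar span. The start date is a hypothetical one chosen for illustration, and `calendar_days` as written does not handle a 29 February start.

```python
from datetime import date


def approx_days(years: int) -> int:
    """The conversion described above: every year counts as 365 days."""
    return years * 365


def calendar_days(start: date, years: int) -> int:
    """Actual days between `start` and the same date `years` years later."""
    end = start.replace(year=start.year + years)  # fails for 29 February
    return (end - start).days


start = date(2021, 1, 1)  # hypothetical concession start date
drift = calendar_days(start, 50) - approx_days(50)
print(drift)  # number of leap days the approximation leaves out
```

For this start date the 50-year span contains 12 leap days, so the converted durationInDays comes out 12 days short of the calendar span.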

@duncandewhurst
Contributor

Should this issue be in the 1.2.0 strict milestone given that it's about relaxing validation requirements and/or adding new fields?

@jpmckinney jpmckinney modified the milestones: 1.2.0 Strict, 1.2.0 Feb 10, 2022
@jpmckinney
Member

jpmckinney commented Feb 10, 2022

I probably meant to add it to 1.2 and misclicked.

@neelima-j

@odscjen @duncandewhurst In your work with the EU eForms mapping, have you seen anything that changes the direction of this issue, i.e. to relax validation for date fields and add an ISO 8601 formatted duration field?

@duncandewhurst
Contributor

Nope. In eForms, dates are expressed in ISO8601 format and durations as described in #723 (comment).

@jpmckinney jpmckinney moved this from To do to To do: Validations in OCDS 1.2 Dec 8, 2022
@duncandewhurst
Contributor

So far this issue has gathered the following demand:

  • Relax date validation: one publisher (to permit year-only, which is arguably not a date...)
  • Allow ISO8601 durations: two publishers (workarounds exist, but ISO8601 would improve accuracy)

@jpmckinney based on the 'two publishers rule of thumb', I think that we leave relaxing date validation for now. I think that we can leave ISO8601 durations too, since the current workarounds don't seem to be causing any problems. Sound good?

@jpmckinney
Member

I agree that a plain year is semantically different from a date in the case provided, where the year is more like a unit of aggregation. (For dates far in the past, the year might be the only known value, but we aren't dealing with such uncertainty.)

For duration, ultimately, any analysis will presumably want to compare apples to apples and therefore convert the values to a common unit, which is probably going to be "day". (Indeed, ease of use is the point of the linked blog post.)

Also, in practice, I don't think buyers are setting durations of e.g. "3 years" and expecting the contract to end at the precise offset, including leap days (and leap seconds). So, I think multiplying by 365 for years is fine. There are 30.416̅ days in an average month, but I think multiplying by 30 is also fine (and avoids having to round the product), because buyers aren't trying to be that precise when setting durations in months.
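As a sketch of that convention, the Python below maps a value and unit to durationInDays using fixed factors of 30 and 365. The unit codes are the ones from the eForms examples earlier in this thread; the names `DAYS_PER_UNIT` and `to_duration_in_days` are made up for illustration.

```python
# Fixed conversion factors: whole numbers, so the product never needs rounding.
DAYS_PER_UNIT = {"DAY": 1, "MONTH": 30, "YEAR": 365}


def to_duration_in_days(value: int, unit_code: str) -> int:
    """Convert a duration in days, months or years to a whole number of days."""
    try:
        return value * DAYS_PER_UNIT[unit_code]
    except KeyError:
        raise ValueError("unsupported unit code: " + unit_code) from None


print(to_duration_in_days(6, "MONTH"), to_duration_in_days(4, "YEAR"))
print(365 / 12)  # the average month length that 30 approximates
```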

Since the issue was not created based on external demand, and since the demand remains largely theoretical, I'll close.

OCDS 1.2 automation moved this from To do: Validations to Done Jun 20, 2023