Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Added iso duration parsing #941

Merged
merged 2 commits into from

5 participants

@ichernev
Owner

As requested per #861.

@ichernev
Owner

@redben, @mj1856 any thoughts on this. You requested that in the first place :)

@mj1856
Collaborator

Haven't had a chance to evaluate it yet, but I'm looking forward to it! Thanks for working this in!

@redben

@ichernev Just tested, works like charm ! thanks a lot for this :)

@bollwyvl

tl; dr: +1, did some more work on my fork

I am working with roundtripped RDF data, in which reliable (de)serialization of time is pretty key. Right now I have decided to use a @dordille's plugin (moment-isoduration), but this PR, along with a Duration.toIsoString would seem like a great inclusion for core, as the ISO spec is more portable than the C# one (any technical merit aside).

Some notes, on looking at this PR, moment-isoduration, and goog.date.Interval.fromIsoString as well as ISO 8601 and W3C xsd(which is what RDF is implemented against):

  • durations can be negative
  • week is not mentioned in the specs
  • only the seconds field can be a decimal (all the others can only be ints)
  • while one can omit any component, you then have to remove the field letter
  • some other edge cases regarding lack of time, mostly handled by goog.date

Unless there is some implementation out there that requires ignoring any of the above, it seems like an overly-permissive parser is actually hurtful to implementing the spec.

i took a stab at ripping off goog.date and the plugin into my fork, and have added some tests while removing ones that don't agree with the spec.

In serialization, there are some cases where there is still strangeness, i.e. half a second after a year ago or moment.duration({y: -1, s: 1.5}) yields -P1YT0.5S... which isn't right... but the assumptions I am seeing internally (i.e. time is made of _months, _days and _milliseconds), suggest that the two models are probably irreconcilable. Any thoughts here?

Thanks for everyone's work up to here, would love to see this in moment!

@mj1856
Collaborator

@bollwyvl - Good comments. Thanks!

On your points:

  • Yes, durations can be negative. good call.
  • Weeks are mentioned in the spec. Check section 4.4.3.2.
  • Actually, the spec says that the smallest value in the period string can have decimals, but it doesn't require that seconds be the smallest value. So a value like P3.5Y is valid, while P3.5Y4D is not. (section 4.4.3.2 again, subsection a and b.) Personally, I can't think of what their reasoning is for not allowing that, but that's what the spec says. I don't think it would hurt to support decimals in any position. (correct me if I am wrong please)
  • Yes, a letter without a number is invalid (except P and T).
  • Can you elaborate on your last bullet point please?

The implementation I would want to exchange with in C# is in Noda Time. You can read their docs on period string parsing here. Although they are clear that it is not perfectly compatible with the ISO spec. For example, no decimal values at all there. Are you using some other c# implementation?

I haven't tested fully, but we should check specifically that time values come after a T designator, and that a T is only present when a time component exists. In other words P3M means three months, while PT3M means 3 minutes.

@bollwyvl

@mj1856: nothing like a good old fashioned spec read-off!

tl; dr: i read poorly. let's pick which kind of broken we want.

  • negative: at least that part is easy... the current C# implementation already had sign
  • yup, seeing the weeks now. In the ISO spec, the only example with W is as an alternate format, without any other components, but the normative §4.4.3.2 narration does include them with a W designator. §3.2.6 in the W3C spec doesn't.
  • decimals: yup. likewise the W3C spec claims only seconds can be fractional.
  • lonely letters: this at least is uncontroversial
  • this is the "no T without H|M|S". the part i grabbed from goog.date handles this with post-parsed elements, instead of with an extra-complicated regex, which seems reasonable.

I'm not using C# at all :) My roundtripped data is to the n3 format of RDF, as well as SPARQL for queries. Both of these, through RDF, inherit the XSD definitions of duration. If I was working on the backend, I'd probably be using python, in which case I would probably use isodate, which implements parsing like this. This one also allows weeks, the extended format, and doesn't check the floatiness of each of the components. However, unlike Noda, it doesn't introduce additional units (i.e. ticks). Their test suite is pretty good, though, so I might go ahead and grab that.

So somewhere between the overly-permissive implementations in other languages, the overly-restrictive format of W3C XSD and the somewhat internally inconsistent ISO spec there is a feature (and anti-feature) set that fills an important place. I'll noodle on some of these closer readings of the spec, and try to write some "misuse cases" for some of the nasty bits above.

@bollwyvl

I updated my fork with the more lenient parsing stance of isodate and Noda, which includes:

  • week components (as in PnnW and P...nnW...)
  • floats in any component (not just the last one)

Also following these on the serialization side, I have adopted P0D for the zero duration, rather than goog.date's PT0S (but not without screwing it up once, first).

I also added a bunch of test cases: some are from isodate, some are for non-parseable inputs, including the ones mentioned by @mj1856. With the exceptionless style i am seeing in the rest of the codebase, I am just checking whether asSeconds comes up zero, which is what would happen if the regexen don't hit at all, i.e. moment.duration('something-unparseable').asSeconds() === 0.

I feel this is getting sort of complete, with the following exceptions:

  • §4.4.3.3's "alternative format", PYYYYMMDDThhmmss, which isodate supports, Noda doesn't, and the W3C doesn't mention, I am disinclined to implement.
  • Then there's the seriously precise one, §4.4.4.1, that represents a duration with a start date and an alternate format, i.e. the number of seconds between when the ball dropped on new year's eve 1999 and the second i wrote this which seems... of dubious merit for inclusion
  • The recurring one, §4.5, which seems pretty cool, but also a bit vague.

Inputs welcome!

@mj1856
Collaborator

@bollwyvl - Thanks for being so detailed! Allow me some time to digest. It sounds like you are more invested in this than I am currently, so thank you for providing the analysis.

@mj1856
Collaborator

We might also want to aim for compatibility with JSR310 in the upcoming Java 8. They have had a lot more time and incentive to investigate this properly. Here is the javadoc on ISO duration parsing:
http://download.java.net/jdk8/docs/api/java/time/Period.html#parse(java.lang.CharSequence)

@bollwyvl

Sure, another integration point. Looks like it should be basically compatible with the "lenient" forms: the W component is allowed, but decimals are not. However, I see no mention of time, i.e. T... Not sure what that omission means...

I don't know what the baseline should be, as there is no way for one regex to do all of these things, and since we haven't got named subgroups, we can't even reuse much of the code between these different "features".

So this begs the question of whether implementation-specific dialects are required, much like localizations. Sounds pretty horrible, but might be the best in the long run if there is in fact no uniformity on how rigidly people want to implement this spec. Sounds like we have identified:

  • w3c
  • noda
  • py
  • java

As to what the baseline, i.e. implementation in moment.js (and not in an optional dialect file) should be: on one hand, I would vote for the W3C one, as it is the most concise and restrictive, and has other specs built on it. On the other hand, a more lenient parser (like the one most recently pushed to my fork based on your insights from Noda and isodate) might just "get the job done," though finding weird edge cases might be... exciting.

certainly happy to help in any way I can!

@ichernev
Owner

just my 2 cents: if we're going to put duration parsing in core moment, I'd vote for a quick, fast, forgiving impl, as the rest of the code (something similar to this PR). If a more robust implementation is needed it can go in a plugin. I just dont see the majority of moment users utilizing a strict duration parser. For those that do want it a plugin should be more than enough.

@icambron
Collaborator

Agreeing with @ichernev: simple && permissive > perfectly on spec || explicit support for several specs. @bollwyvl and @mj1856 what shape is this fork in? It sounds like it might be ready -- I don't know much about ISO-8601 durations, but the things you list as omissions don't sound very concerning.

@bollwyvl

Just rebased up to master, now in a new branch (-rebased), and changed the public API of duration.toIsoString to .toISOString to line up with moment and Date' s. With this, I feel pretty good about what is there: it pretty closely follows the pattern from the incumbent C# parsing, but would still appreciate some eyes on the diff.

I guess the other conversation about other formats can be pushed to later :)

@mj1856
Collaborator

I didn't spend a ton of time on it, but it looks pretty good in quick review. We don't have anything now, so I'd say let's push forward with this. We can always consider user feedback if there is some nuance we're not considering.

@icambron
Collaborator

I agree. I have a couple of small comments about the naming, but they'd be easier to express on a pull request. @bollwyvl want send one in?

@bollwyvl

Created #1113!

@ichernev ichernev merged commit 2a07b33 into moment:develop
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Jul 21, 2013
  1. @ichernev
  2. @ichernev
This page is out of date. Refresh to see the latest.
Showing with 41 additions and 9 deletions.
  1. +30 −9 moment.js
  2. +11 −0 test/moment/duration.js
View
39 moment.js
@@ -22,6 +22,7 @@
// ASP.NET json date format regex
aspNetJsonRegex = /^\/?Date\((\-?\d+)/i,
aspNetTimeSpanJsonRegex = /(\-)?(?:(\d*)\.)?(\d+)\:(\d+)\:(\d+)\.?(\d{3})?/,
+ isoDurationRegex = /^P(?:(?:([0-9,.]*)Y)?(?:([0-9,.]*)M)?(?:([0-9,.]*)D)?(?:T(?:([0-9,.]*)H)?(?:([0-9,.]*)M)?(?:([0-9,.]*)S)?)?|([0-9,.]*)W)$/,
// format tokens
formattingTokens = /(\[[^\[]*\])|(\\)?(Mo|MM?M?M?|Do|DDDo|DD?D?D?|ddd?d?|do?|w[o|w]?|W[o|W]?|YYYYY|YYYY|YY|gg(ggg?)?|GG(GGG?)?|e|E|a|A|hh?|HH?|mm?|ss?|SS?S?|X|zz?|ZZ?|.)/g,
@@ -1051,9 +1052,12 @@
var isDuration = moment.isDuration(input),
isNumber = (typeof input === 'number'),
duration = (isDuration ? input._input : (isNumber ? {} : input)),
- matched = aspNetTimeSpanJsonRegex.exec(input),
+ // matching against regexp is expensive, do it on demand
+ aspMatched = null,
+ isoMatched = null,
sign,
- ret;
+ ret,
+ parseIso;
if (isNumber) {
if (key) {
@@ -1061,15 +1065,32 @@
} else {
duration.milliseconds = input;
}
- } else if (matched) {
- sign = (matched[1] === "-") ? -1 : 1;
+ } else if (!!(aspMatched = aspNetTimeSpanJsonRegex.exec(input))) {
+ sign = (aspMatched[1] === "-") ? -1 : 1;
duration = {
y: 0,
- d: ~~matched[2] * sign,
- h: ~~matched[3] * sign,
- m: ~~matched[4] * sign,
- s: ~~matched[5] * sign,
- ms: ~~matched[6] * sign
+ d: ~~aspMatched[2] * sign,
+ h: ~~aspMatched[3] * sign,
+ m: ~~aspMatched[4] * sign,
+ s: ~~aspMatched[5] * sign,
+ ms: ~~aspMatched[6] * sign
+ };
+ } else if (!!(isoMatched = isoDurationRegex.exec(input))) {
+ parseIso = function (inp) {
+ // We'd normally use ~~inp for this, but unfortunately it also
+ // converts floats to ints.
+ // inp may be undefined, so careful calling replace on it.
+ var res = inp && parseFloat(inp.replace(',', '.'));
+ return isNaN(res) ? 0 : res;
+ };
+ duration = {
+ y: parseIso(isoMatched[1]),
+ M: parseIso(isoMatched[2]),
+ d: parseIso(isoMatched[3]),
+ h: parseIso(isoMatched[4]),
+ m: parseIso(isoMatched[5]),
+ s: parseIso(isoMatched[6]),
+ w: parseIso(isoMatched[7]),
};
}
View
11 test/moment/duration.js
@@ -198,6 +198,17 @@ exports.duration = {
test.done();
},
+ "instatiation from iso 8601 duration" : function (test) {
+ test.expect(6);
+ test.equal(moment.duration('P1Y2M3DT4H5M6S').asSeconds(), moment.duration({y: 1, M: 2, d: 3, h: 4, m: 5, s: 6}).asSeconds(), "all fields");
+ test.equal(moment.duration('P1W').asSeconds(), moment.duration({d: 7}).asSeconds(), "week field");
+ test.equal(moment.duration('P1M').asSeconds(), moment.duration({M: 1}).asSeconds(), "single month field");
+ test.equal(moment.duration('PT1M').asSeconds(), moment.duration({m: 1}).asSeconds(), "single minute field");
+ test.equal(moment.duration('P1MT2H').asSeconds(), moment.duration({M: 1, h: 2}).asSeconds(), "random fields missing");
+ test.equal(moment.duration('PY1MDT2HMS').asSeconds(), moment.duration({M: 1, h: 2}).asSeconds(), "random values missing");
+ test.done();
+ },
+
"humanize" : function (test) {
test.expect(32);
moment.lang('en');
Something went wrong with that request. Please try again.