[EN DateTime V2] Added support for cases like "April ninth through 15th" (#2905)#2994
[EN DateTime V2] Added support for cases like "April ninth through 15th" (#2905)#2994tellarin merged 3 commits intomicrosoft:masterfrom
Conversation
| var ordinalExtractions = config.OrdinalExtractor.Extract(text); | ||
|
|
||
| tokens.AddRange(MergeTwoTimePoints(text, reference)); | ||
| tokens.AddRange(MergeTwoTimePoints(text, new List<ExtractResult>(ordinalExtractions), reference)); |
There was a problem hiding this comment.
Why do we need to create a new list here?
There was a problem hiding this comment.
I was just following the same syntax used for the other methods in the list, will modify to pass directly the original list
There was a problem hiding this comment.
removed the creation of new lists from the other extracting methods
| { | ||
| er = this.config.DateExtractor.Extract(this.config.TokenBeforeDate + text, referenceDate); | ||
| if (er.Count >= 2) | ||
| er.ForEach(o => o.Start -= this.config.TokenBeforeDate.Length); |
There was a problem hiding this comment.
This seems a bit brittle. Can you add a code comment explaining the rationale?
Also, could you run some perf tests? All this new extraction/parsing seems a bit heavy.
There was a problem hiding this comment.
Couldn't some equivalent logic be run in Merge instead of directly in Period?
There was a problem hiding this comment.
implemented new approach by adding written ordinal numbers to the existing regexes, there is no perf impact in this way.
Test results for the mean test case execution time:
Base: (59.6 +/- 3.1)ms
Fix with OrdinalExtractor: (63.9 +/- 3.7)ms
Fix with regexes: (58.5 +/- 3.1)ms
| { | ||
| var pr1Value = (DateTimeResolutionResult)pr1.Value; | ||
| var pr2Value = (DateTimeResolutionResult)pr2.Value; | ||
| var timexList1 = pr1.TimexStr.Split(Constants.DateTimexConnector[0]); |
There was a problem hiding this comment.
the timex of the ordinal number needs to be updated to refer to the same month of the other date.
It is similar to what happens to the year in the method SyncYear.
Fix to issue #2905.
Modified methods MergeTwoTimePoints in Base DatePeriod Extractor/Parser to support cases where one of the date points is an ordinal number.
Fixed issue in Hindi where "पहले" (first) was wrongly extracted from "से पहले" (before).
Test cases added to EN DateTimeModel.