Skip to content

[EN DateTime V2] Added support for cases like "April ninth through 15th" (#2905)#2994

Merged
tellarin merged 3 commits intomicrosoft:masterfrom
aitelint:#2905
Jul 13, 2022
Merged

[EN DateTime V2] Added support for cases like "April ninth through 15th" (#2905)#2994
tellarin merged 3 commits intomicrosoft:masterfrom
aitelint:#2905

Conversation

@aitelint
Copy link
Copy Markdown
Contributor

Fix to issue #2905.

Modified methods MergeTwoTimePoints in Base DatePeriod Extractor/Parser to support cases where one of the date points is an ordinal number.
Fixed issue in Hindi where "पहले" (first) was wrongly extracted from "से पहले" (before).

Test cases added to EN DateTimeModel.

var ordinalExtractions = config.OrdinalExtractor.Extract(text);

tokens.AddRange(MergeTwoTimePoints(text, reference));
tokens.AddRange(MergeTwoTimePoints(text, new List<ExtractResult>(ordinalExtractions), reference));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to create a new list here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just following the same syntax used for the other methods in the list, will modify to pass directly the original list

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the creation of new lists from the other extracting methods

{
er = this.config.DateExtractor.Extract(this.config.TokenBeforeDate + text, referenceDate);
if (er.Count >= 2)
er.ForEach(o => o.Start -= this.config.TokenBeforeDate.Length);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit brittle. Can you add a code comment explaining the rationale?
Also, could you run some perf tests? All this new extraction/parsing seems a bit heavy.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't some equivalent logic be run in Merge instead of directly in Period?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I will check

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemented new approach by adding written ordinal numbers to the existing regexes, there is no perf impact in this way.

Test results for the mean test case execution time:
Base: (59.6 +/- 3.1)ms
Fix with OrdinalExtractor: (63.9 +/- 3.7)ms
Fix with regexes: (58.5 +/- 3.1)ms

{
var pr1Value = (DateTimeResolutionResult)pr1.Value;
var pr2Value = (DateTimeResolutionResult)pr2.Value;
var timexList1 = pr1.TimexStr.Split(Constants.DateTimexConnector[0]);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why timex parsing here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the timex of the ordinal number needs to be updated to refer to the same month of the other date.
It is similar to what happens to the year in the method SyncYear.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed method

@tellarin tellarin merged commit f4b0e70 into microsoft:master Jul 13, 2022
@aitelint aitelint deleted the #2905 branch July 13, 2022 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants