Added "fall" as a season & normalized "week of month" patterns #736

alicekwak · 2023-08-08T05:18:06Z

added fall as a season and added post-processing method for season homonyms (fall/spring)
normalized "week of month" patterns into date ranges (e.g., first week of May, last two weeks of June, etc.)
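
As a rough illustration of what turning "first two weeks of April" or "last two weeks of June" into a date range can look like, here is a minimal sketch. It assumes a "week" is a block of seven calendar days counted from the start or end of the month; the PR's WeekNormalizer may define the boundaries differently, and `WeekOfMonth`, `firstWeeks` and `lastWeeks` are hypothetical names.

```scala
import java.time.YearMonth

object WeekOfMonth {
  // Both functions return (startDay, endDay), inclusive, as 1-based days of the month.
  // Assumption: a "week" here is simply seven consecutive calendar days.

  // "first n weeks of <month>": days 1 through 7 * n.
  def firstWeeks(n: Int): (Int, Int) = (1, 7 * n)

  // "last n weeks of <month>": the final 7 * n days of that month.
  def lastWeeks(month: YearMonth, n: Int): (Int, Int) = {
    val lastDay = month.lengthOfMonth()
    (lastDay - 7 * n + 1, lastDay)
  }
}
```

Under these assumptions, `WeekOfMonth.lastWeeks(YearMonth.of(2023, 6), 2)` covers June 17 through June 30.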

kwalcock

Nice! I'm going to have to double check my own work, though. I'll run through it again before it gets merged.

kwalcock · 2023-08-08T07:16:54Z

main/src/main/scala/org/clulab/numeric/actions/NumericActions.scala

+        val prevWords = m.sentenceObj.words.slice(wordIndex - 2, wordIndex)
+        val contextWords = m.sentenceObj.words.slice(wordIndex - 5, wordIndex + 5)
+
+        (prevWords.contains("in") && prevWords.contains("the")) || prevWords.contains("this") || prevWords.contains("last") || prevWords.contains("every") ||


"the in" would pass the test as well as "in the" does. There is a containsSlice that might work to preserve the order.

Yes, your comment is correct. But I would argue it is overkill here.

Yes, that's a good point. The filter is not perfectly precise at the moment. My intention was to create a filter that is not too complicated but works reasonably well. If you anticipate any issues from the filter being inexact, I'd be more than happy to revise it!
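
For reference, a minimal sketch of the difference between the two checks discussed in this thread, assuming the preceding words are available as a sequence of strings. `SeasonContext`, `hasInThe` and `hasInTheUnordered` are hypothetical names used only for illustration.

```scala
object SeasonContext {
  // The two-word introduction we want to see, in this exact order.
  val inThe: Seq[String] = Seq("in", "the")

  // Order-insensitive check, as in the original diff: passes for "the in" too.
  def hasInTheUnordered(prevWords: Seq[String]): Boolean =
    prevWords.contains("in") && prevWords.contains("the")

  // Order-preserving check: "in" must be immediately followed by "the".
  def hasInThe(prevWords: Seq[String]): Boolean =
    prevWords.containsSlice(inThe)
}
```

With these definitions, `Seq("the", "in")` passes the unordered check but fails the `containsSlice` one.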

kwalcock · 2023-08-08T07:35:26Z

main/src/main/scala/org/clulab/numeric/actions/NumericActions.scala

+        val contextWords = m.sentenceObj.words.slice(wordIndex - 5, wordIndex + 5)
+
+        (prevWords.contains("in") && prevWords.contains("the")) || prevWords.contains("this") || prevWords.contains("last") || prevWords.contains("every") ||
+          contextWords.exists(_.matches("[0-9]{0,4}"))


To match the exists construction, one could take the three contains from above and write

contextWords.exists(Array("this", "last", "every").contains)

Comparison is case-sensitive here even though above I see equalsIgnoreCase. "This spring..." might not match.

Thanks! I didn't know how to make that simpler. I'll revise the code as you suggested. And I'll also think about how to make the comparison case-insensitive, as I want patterns like "This spring..." to match.
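
One possible way to make the comparison case-insensitive is to lowercase each word before the lookup. This is only a sketch under that assumption; `SeasonTriggers`, `triggers` and `hasTrigger` are hypothetical names and not part of the PR.

```scala
object SeasonTriggers {
  // Words that, in this sketch, signal that "fall"/"spring" is a season.
  val triggers: Set[String] = Set("this", "last", "every")

  // Lowercase each candidate word so that "This spring" matches as well.
  def hasTrigger(words: Seq[String]): Boolean =
    words.exists(word => triggers.contains(word.toLowerCase))
}
```

A `Set` is used here instead of an `Array` only because membership tests read naturally on it; for three elements the performance difference is negligible.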

kwalcock · 2023-08-08T07:45:44Z

main/src/test/scala/org/clulab/numeric/TestSeasonNormalizer.scala

+  behavior of "Default SeasonalCluProcessor"
+
+  it should "find true seasons in trueSeason1" in {
+    val processor = new CluProcessor()


There should probably be a single CluProcessor for the entire file of tests. It is very expensive to create. If the processor above is moved like so, it might work:

behavior of "Default SeasonalCluProcessor"
val processor = new CluProcessor()
it should "find autumn but not rainy season" in {

Thanks for the great suggestion! I didn't know if that's possible or not (and also didn't know that creating CluProcessor is very expensive!) so just followed the pattern I found from the other test. I'll change the code as you suggested and see how it works.

kwalcock · 2023-08-08T07:48:55Z

main/src/test/scala/org/clulab/numeric/TestWeekNormalizer.scala

+  behavior of "WeekCluProcessor"
+
+  it should "find first two weeks of April" in {
+    val processor = new CluProcessor()


Ditto about the single CluProcessor.

main/src/main/scala/org/clulab/numeric/WeekNormalizer.scala

kwalcock · 2023-08-08T15:46:34Z

FYI, @alicekwak.

[info] *** 5 TESTS FAILED ***
[error] Failed tests:
[error] 	org.clulab.numeric.TestNumericEntityRecognition
[error] 	org.clulab.numeric.TestEvalTimeNorm
[error] 	org.clulab.utils.TestDependencyUtils
[error] 	org.clulab.odin.TestMention

[info] - should recognize date ranges from seasons *** FAILED ***
[info]   "[O]" was not equal to "[B-DATE-RANGE]" (TestNumericEntityRecognition.scala:621)
[info]   Analysis:
[info]   "[O]" -> "[B-DATE-RANGE]"

[info] - should recognize date ranges with seasons *** FAILED ***
[info]   "[O]" was not equal to "[B-DATE-RANGE]" (TestNumericEntityRecognition.scala:621)
[info]   Analysis:
[info]   "[O]" -> "[B-DATE-RANGE]"

[info] - should not degrade in performance *** FAILED ***
[info]   0.8487485647201538 was not greater than or equal to 0.85 (TestEvalTimeNorm.scala:15)

[info] - should produce one head using findHeads *** FAILED ***
[info]   Some(0) was not equal to None (TestDependencyUtils.scala:75)
[info]   Analysis:
[info]   Some(x: 0 -> )

[info] - should get None when there are no roots *** FAILED ***
[info]   Some(0) was not equal to None (TestMention.scala:59)

alicekwak · 2023-08-08T20:44:49Z

Thanks! I'll fix the broken tests before I make this into a real PR.

alicekwak · 2023-08-20T17:03:51Z

Hi @kwalcock and @MihaiSurdeanu, I did some debugging and made some more tests pass. However, I'm still failing on three tests (TestParallel, TestDependencyUtils, TestMention) and I'm not sure why they are failing. Below is the console output that I'm seeing. If you have any idea why the tests are failing, could you let me know? Thank you!

[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] org.clulab.processors.TestParallel
[info] Run completed in 16 minutes, 36 seconds.
[info] Total number of tests run: 19
[info] Suites: completed 4, aborted 0
[info] Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Run completed in 16 minutes, 36 seconds.
[info] Total number of tests run: 548
[info] Suites: completed 79, aborted 0
[info] Tests: succeeded 546, failed 2, canceled 0, ignored 1, pending 0
[info] *** 2 TESTS FAILED ***
[error] Failed tests:
[error] org.clulab.utils.TestDependencyUtils
[error] org.clulab.odin.TestMention
[error] (main / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] (corenlp / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 1045 s (17:25), completed Aug 20, 2023, 6:28:33 AM

kwalcock · 2023-08-20T20:35:50Z

FWIW I'm looking at it.

Fix tests

alicekwak · 2023-08-27T06:13:41Z

Hi @kwalcock and @MihaiSurdeanu, I believe this PR is now ready to be reviewed and (hopefully) merged. Would you be able to have a look at it and let me know if there are any issues left? Thank you!

kwalcock

Some more concerns still...

kwalcock · 2023-08-27T07:41:06Z

main/src/main/scala/org/clulab/numeric/mentions/package.scala

@@ -892,6 +921,34 @@ package object mentions {
    else seasonNormalizer.norm(wordsOpt.get)
  }

+  private def getWeekRange(weekNormalizer: WeekNormalizer)(argName: String, m:Mention): Option[WeekRange] = {
+    val wordsOpt = getArgWords(argName, m)
+    print("this is wordsOpt: " ++ wordsOpt.get.mkString(" "))


This should probably be removed. Commenting out is OK. println is better for Scala.

Thanks for catching this! I removed the line.

main/src/main/resources/org/clulab/numeric/date-ranges.yml

kwalcock · 2023-08-27T16:11:53Z

main/src/main/scala/org/clulab/numeric/TempEvalFormatter.scala

@@ -51,7 +51,7 @@ object TempEvalFormatter {
    }
  }

-  private def convertLiteralMonth(s: String): Int = {
+  def convertLiteralMonth(s: String): Int = {


At some point these 12 ifs need to change.
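
One direction such a change could take is a table lookup in place of the chain of conditionals. The sketch below assumes months are recognized by their three-letter lowercase prefix, which may not match what `convertLiteralMonth` actually does; `MonthLookup` and `monthNumber` are hypothetical names, and returning an `Option` instead of an `Int` is a design choice of the sketch.

```scala
object MonthLookup {
  // Three-letter prefixes in calendar order; position + 1 is the month number.
  private val monthPrefixes: Seq[String] = Seq(
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec"
  )

  // Returns Some(1..12) for a recognized month name, None otherwise.
  def monthNumber(s: String): Option[Int] = {
    val lower = s.toLowerCase
    val index = monthPrefixes.indexWhere(prefix => lower.startsWith(prefix))
    if (index >= 0) Some(index + 1) else None
  }
}
```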

kwalcock · 2023-08-27T16:14:32Z

main/src/main/scala/org/clulab/numeric/mentions/package.scala

+    print("this is wordsOpt: " ++ wordsOpt.get.mkString(" "))
+
+    if (wordsOpt.isEmpty) None
+    else if (wordsOpt.get.mkString(" ").toLowerCase().equals("last week")) {getLastWeekRange(m)}


I will eventually do something about the duplicate calculation here.
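
A minimal sketch of one way to avoid repeating the calculation: build the lowercased phrase once inside `Option.map` and branch on it. The types and names here (`WeekPhrase`, `WeekKind`, `classify`) are hypothetical stand-ins, not the project's `WeekRange` API.

```scala
object WeekPhrase {
  // Hypothetical result type standing in for the project's week-range handling.
  sealed trait WeekKind
  case object LastWeek extends WeekKind
  case object OtherWeek extends WeekKind

  // The phrase is joined and lowercased once, then reused by every branch.
  // An empty Option yields None without any call to .get.
  def classify(wordsOpt: Option[Seq[String]]): Option[WeekKind] =
    wordsOpt.map { words =>
      val phrase = words.mkString(" ").toLowerCase
      if (phrase == "last week") LastWeek else OtherWeek
    }
}
```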

kwalcock · 2023-08-27T16:18:49Z

main/src/main/scala/org/clulab/numeric/actions/NumericActions.scala

+  // A common introduction to a season
+  val inThe: Array[String] = Array("in", "the")
+  // Match a 1 to 4 digit year
+  val yearPattern = Pattern.compile("[0-9]{1,4}")


If this really is for a year, I can only imagine 2 or 4 digits, like summer of '69 or 2023. Could this be something like "[0-9]{2}|[0-9]{4}"?

Thanks for the great suggestion! I just replaced the regex pattern with "[0-9]{2}|[0-9]{4}" as you suggested.
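
A small sketch of how the revised pattern behaves when the whole token must match; `YearPattern` and `isYear` are hypothetical names used only for this illustration.

```scala
import java.util.regex.Pattern

object YearPattern {
  // Accept exactly two digits or exactly four digits.
  val yearPattern: Pattern = Pattern.compile("[0-9]{2}|[0-9]{4}")

  // matches() requires the entire word to fit the pattern.
  def isYear(word: String): Boolean = yearPattern.matcher(word).matches()
}
```

Here "69" and "2023" are accepted, while "7", "123", and the empty string are rejected. A token such as "'69" with the apostrophe attached would not match unless the tokenizer separates the apostrophe first.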

MihaiSurdeanu

This LGTM!

kwalcock · 2023-09-01T17:04:41Z

Thanks, @alicekwak!

alicekwak added 3 commits July 31, 2023 09:46

1) added normalizer for imprecise dates (e.g., 'first week of April')…

71d6edf

…, 2) added fall to SEASON.tsv

fall/spring filter added

627eda9

normalizing 'last week/last two weeks of month' patterns

18cf67e

alicekwak requested review from MihaiSurdeanu and kwalcock August 8, 2023 05:18

kwalcock requested changes Aug 8, 2023

1) revised and moved new tests into TestNumericEntityRecognition 2) R…

b8f2694

…evised grammars and actions

kwalcock added 5 commits August 20, 2023 17:06

Fix test in dependency utils

eb5539f

Fix mention test

6695055

Smooth out NumericActions

2582a4a

Change variable name

beb9e14

Merge pull request #738 from clulab/kwalcock/date-revision

0149e78

Fix tests

alicekwak marked this pull request as ready for review August 27, 2023 06:10

kwalcock reviewed Aug 27, 2023

MihaiSurdeanu approved these changes Aug 28, 2023

alicekwak added 4 commits August 28, 2023 18:51

cleaned up unwanted lines

9152996

Merge branch 'master' into alice-date-revision

cc932cb

Forcing an empty commit.

f723a33

Merge remote-tracking branch 'origin/alice-date-revision' into alice-…

007d233

…date-revision

alicekwak requested a review from kwalcock September 1, 2023 04:55

kwalcock approved these changes Sep 1, 2023

kwalcock merged commit d2e5e4a into master Sep 1, 2023
1 check passed

kwalcock deleted the alice-date-revision branch September 1, 2023 15:45
