Added new rules to parse phrases for upcoming weeks. #491
upcoming weeks" and "upcoming xxx weeks"
"upcoming" phrase
@chinmay87 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
and year as well.
@yuanbing has updated the pull request.
@chinmay87 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@chinmay87 merged this pull request in 097b926.
Summary: The new rules parse phrases of the form "xxx upcoming weeks" and "upcoming xxx weeks".
Pull Request resolved: facebook#491
Test Plan: Imported from GitHub, without a Test Plan: line.
Differential Revision: D21959647
Pulled By: chinmay87
fbshipit-source-id: a062a8c7a6c2e23b921b1099b886fa589c69c454
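To make the two phrase shapes concrete, here is a minimal, self-contained Haskell sketch. It is not Duckling's rule code and does not use Duckling's API: `parseUpcomingWeeks` is a hypothetical helper, and it assumes the count is written in digits and that the words are separated by whitespace.

```haskell
import Data.Char (isDigit, toLower)

-- Hypothetical helper (not part of Duckling): returns the number of weeks
-- named by "<n> upcoming weeks" or "upcoming <n> weeks", if the input has
-- exactly one of those two shapes.
parseUpcomingWeeks :: String -> Maybe Int
parseUpcomingWeeks input =
  case words (map toLower input) of
    [n, "upcoming", unit] | isWeek unit -> readCount n
    ["upcoming", n, unit] | isWeek unit -> readCount n
    _ -> Nothing
  where
    -- Accept both the singular and the plural unit.
    isWeek w = w == "week" || w == "weeks"
    -- Only plain digit strings count as a number in this sketch.
    readCount s
      | not (null s) && all isDigit s = Just (read s)
      | otherwise = Nothing
```

In GHCi, `parseUpcomingWeeks "upcoming 3 weeks"` and `parseUpcomingWeeks "3 upcoming weeks"` both evaluate to `Just 3`, while a phrase outside the two shapes, such as `"next 3 weeks"`, yields `Nothing`.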
* Use mkRuleHoliday for KO language Summary: Use mkRuleHoliday for KO language Reviewed By: patapizza Differential Revision: D16441727 fbshipit-source-id: 167c47ed5f3960e9ab68d9b4b444ae9f1c63eda5 * Use mkRuleSeason for KO language Summary: Use mkRuleSeason for KO language. Reviewed By: patapizza Differential Revision: D16441908 fbshipit-source-id: 94c450461fe788561dbf41e7b811de3d699f504d * Use mkRuleDaysOfWeek for KO language Summary: Use mkRuleDaysOfWeek for KO language. Reviewed By: patapizza Differential Revision: D16441936 fbshipit-source-id: 1bee48cd59caede92cf43056ea5aa4d953827318 * Reuse helpers Summary: Reuse mkRuleHolidays, mkRuleSeasons, mkRuleDaysOfWeek, mkRuleMonths in the Arabic rules for Time, and guard against `isOkWithThisNext` for ruleThisTime, ruleNextTime and ruleLastTime. Reviewed By: chinmay87 Differential Revision: D16589997 fbshipit-source-id: f8e6d7fb6362abc314aa9eb04bfa4b965e1d1e41 * Reuse helpers Summary: Guarded against `isOkWithThisNext` for ruleThisTime, ruleNextTime and ruleLastTime. Changes based off of D16589879 and D7200945 Reviewed By: chinmay87 Differential Revision: D16678781 fbshipit-source-id: b1fba9067818d0cc587016651b35d54d5c70e621 * Add Pargat Diwas holiday | T28481361 Summary: Adding support for Pargat Diwas holiday from 2000 to 2030. Made changes to: - corpus.hs (included test) - computed.hs (included dates): Dates were checked on the sources: * https://www.drikpanchang.com/hindu-saints/valmiki/maharishi-valmiki-jayanti.html?year=2020 (year being changed) * https://www.festivalsdatetime.co.in/2018/11/2020-valmiki-jayanti-date-and-time-2020.html?m=1 (year being changed) For conflicting dates (one day difference), chose www.drikpanchang.com as the primary choice. 
- rules.hs (included regex for the festival) Generated all the related classifiers Reviewed By: chinmay87 Differential Revision: D16721179 fbshipit-source-id: c3c1023b39ca616374a5d936c08352ee0d4b210d * Add back Copyright header to EN_GB classifiers Summary: as title Reviewed By: chinmay87 Differential Revision: D16902441 fbshipit-source-id: 9b6d1fe9696191dffef5af315a47666edf715c0d * Duration/EN: <integer> and a half minutes Summary: Add rule for "<integer> and a half minutes" Differential Revision: D17108126 fbshipit-source-id: beaf3eea976572fdec292df6c17303abeb8fcdbf * reuse helpers Summary: guard against `isOkWithThisNext` for ruleThisTime, ruleNextTime and ruleLastTime. Required changes to be made in ruleNoon, ruleMidnight, and ruleWeekend Reviewed By: patapizza Differential Revision: D16999895 fbshipit-source-id: 2c18bd39692389f7e6a9d2fb82802e9dc509893d * Add Ugadi holiday Summary: Adding support for Ugadi holiday from 2000 to 2030. Reviewed By: chinmay87 Differential Revision: D17156412 fbshipit-source-id: b16f710912bf90a1a472661e31d83fbff7043cfe * fix missing latent corpus in tests Reviewed By: chinmay87 Differential Revision: D17225649 fbshipit-source-id: 5a41372737c31e87ec944824b852516de531d376 * latent entities Summary: Adding latent matching rules. Matching Numerical to QuantityData with Unnamed as unit Reviewed By: chinmay87 Differential Revision: D17225711 fbshipit-source-id: 8e423454e5e7b83eb8de4cabfd4f85a2a35b7a6d * Time: Extend support for Ramadan and Eid al-Fitr from 1950 to 2000 Summary: Previously Duckling was supporting Ramadan and Eid al-Fitr from 2000 to 2029. Now the range has been expanded to start from 1950. 
Reviewed By: patapizza Differential Revision: D17319760 fbshipit-source-id: 008ef7ffef77f293666a32381f663db07b8130ae * En/Time: Support "Saint" Patrick's Day Summary: Improved the regex to support "saint" Reviewed By: patapizza Differential Revision: D17327533 fbshipit-source-id: 97b8290c416adbc4b32d543b01f408016c62075a * Time/EN: Added Ratha-Yatra holiday Summary: Added dates for Ratha-Yatra I double checked the dates, but still could have messed something up. Reviewed By: patapizza Differential Revision: D17358944 fbshipit-source-id: bf7c5f81c83c312166c5e3c960534606b95f0c2e * EN/FR support temperature with decimal values (#408) Summary: Allowing the support of decimal values like in human body temperature. Example: 98.6°F Pull Request resolved: https://github.com/facebook/duckling/pull/408 Reviewed By: girifb Differential Revision: D17401806 Pulled By: patapizza fbshipit-source-id: f04b768e2f6cb48c9c50977a5807d62e38f8d545 * Add support for computed holidays in ZH Reviewed By: chinmay87 Differential Revision: D17431708 fbshipit-source-id: 88b95877f49c0f46e4c9817020c266c37974daa2 * Translate periodic holidays Summary: Adding support for periodic holidays in Italian Reviewed By: chinmay87 Differential Revision: D17204304 fbshipit-source-id: 06adf1c5f673263cab7269033d2a312baca842eb * Translated all computed holidays from English into German and added HolidayHelpers to reduce code duplication. Summary: Added German langauage support for all computed holidays currently supported in English. Also created HolidayHelpers, which contains more complex holiday calculations used across languages. Reviewed By: patapizza Differential Revision: D17386163 fbshipit-source-id: 9dd88f8b0d699e5d7254a5ba7114bfa01b15176a * Translate periodic holidays Summary: Add the support for periodic holidays in Spanish (the ones that make sense in the language). 
Skipped Periodic Holidays: Año Nuevo, Navidad, Nochevieja: These holidays already had rules defined for them in the Rules file (see ruleAnoNuevo, ruleNavidad, ruleNochevieja) Ugadi: This holiday does not have a translation into Spanish. Note: Ignoring Lint issues for "Line too long", to follow convention set by original file / previous diffs. Reviewed By: chinmay87 Differential Revision: D17506908 fbshipit-source-id: 7b43768443dcd4f020a20758995a63ab88706b35 * Remove ANN pragma for duckling Summary: Use of ANN pragma may slow down compilation time because of TemplateHaskell. Because of that, using comment style ignore would be preferable. For more information on ways to ignore hints with hlint, please see https://github.com/ndmitchell/hlint#ignoring-hints Reviewed By: patapizza Differential Revision: D17365266 fbshipit-source-id: 71e4952738bba17b4d2ec2a18b31b4b7e3f509db * Time/PL: refactor Summary: Refactor Time/PL code by reusing mkRuleHolidays and mkRuleSeasons, and guarding against isOkWithThisNext for ruleLastTime, ruleNextTime, ruleThisTime, ,and mkOkForThisNext for ruleWeekend Add another polish holidays Reviewed By: chinmay87 Differential Revision: D17395534 fbshipit-source-id: d4ec591b0aad71f8f5e144ff5274491d55dc97f6 * Time/EN: Added support for Ganesh Chaturthi Summary: Added support for Ganesh/Vinayaka Chaturthi Hindu holiday from 2000 to 2030 Reviewed By: haoxuany Differential Revision: D17675368 fbshipit-source-id: 2d53ad2592fc8d234bd7a3cbac2bddeaa45b220b * Duration/EN support for <integer> hour and <integer> Summary: Resolves durations such as "2 hours and ten" to 130 minutes or "1 hour and 15" to 75 minutes. 
Reviewed By: zhpzuo Differential Revision: D17822118 fbshipit-source-id: 7da5c0e43ced91cb924046f764c133a66af8ee4d * Time/EN: Added rama navami holiday Summary: Added support for Rama Navami holiday from 2000 to 2030 Reviewed By: chinmay87 Differential Revision: D17881237 fbshipit-source-id: f3f17d67d178fa8fbcb8ae640c3bfc17bc3e21d3 * Add "<part-of-day> at "<time-of-day>" rule. Summary: Parts of day are time ranges, e.g. "tonight" is a range from 6:00pm to midnight. We have intersect logic in place to resolve a string like "tonight at 7pm" to one time, at 7pm. But if the time is outside of the part of day's range (e.g. "tonight at 5pm"), the string is resolved to 2 separate times ("tonight" and "at 5pm"). These changes resolve e.g. "tonight at xx" to "xx" irrespective of the range of tonight, as long as the am/pm makes sense (so "tonight at 5am" would still resolve to 2 separate times - "tonight" and at "5am"). "this/early morning at xx" gets resolved to "xx am". All other parts of day get resolved to "xx pm", with one exception: all parts of day resolve "... at 12" to midnight. Differential Revision: D17694898 fbshipit-source-id: 1e24023759bb942659285d18a6a4d0b09f77c9da * Readd a test. Summary: This got removed in a previous commit, readd this to confirm this functionality is still working. Reviewed By: haoxuany Differential Revision: D18175640 fbshipit-source-id: 3d06efe3537e1a517f412ed739f3cc34a9b3105b * Time/EN_US: Modify this|last|next <cycle> Summary: We weren't capturing cases like "the first Saturday of the month", due to "the month" not being properly parsed. 
Reviewed By: haoxuany Differential Revision: D18193355 fbshipit-source-id: 2c4e83a3f22b0fe306ce7662ade85434a0016784 * Time/EN: Support 'the <day-of-month > of <month>' Summary: We weren't capturing cases like "the second of february" as it was matching with the "the <cycle> of <time>" rule Differential Revision: D18249651 fbshipit-source-id: 09e214f585b96d07af4d5043de61445f4e156c54 * reuse helpers in HR rules Summary: [Duckling][Time][HR] Reuse helpers: cleanup code by reusing helper functions in HR rules Reviewed By: chinmay87 Differential Revision: D18310913 fbshipit-source-id: 0efd69121d25f6a0967b104bfaf97a2b3096ed30 * keep unicode output in tests sane Summary: Make test failure outputs readable by proper printing of `Data.Text`, using the `unpack` function rather than relying on the implementation of the `Show` typeclass for `Text` Reviewed By: patapizza Differential Revision: D18367058 fbshipit-source-id: b5aece3c8818f16dfe4c55235f6b9a183ba6f70f * Add Numeral dimension for new language TH (#399) Summary: Hello, I am new to Haskell, but I would like to add Thai language (TH) to Duckling. I have tried to extended Duckling by adding Numeral dimension for new language TH. Please have a look at it and see what we can improve. Thanks! Pull Request resolved: https://github.com/facebook/duckling/pull/399 Reviewed By: patapizza Differential Revision: D17651508 Pulled By: haoxuany fbshipit-source-id: 4b3ee1352f239eee637958f5e9dce68430352a0a * Duration/EN: parse Xm as X minutes and X.Y hrs as X.Y hours Summary: modified the regex pattern for minutes to include m alone, as well as the regex pattern for ruleDurationDotNumeralHours to pass h, hr, and hrs Reviewed By: patapizza Differential Revision: D18799727 fbshipit-source-id: df4d0bd53407b427254169454e647e43e073795e * Duration/EN: Leverage TimeGrain for number.number hours Summary: also kill redundant `isGrain` helper from `Time`. 
Reviewed By: dwhit, haoxuany Differential Revision: D18937649 fbshipit-source-id: ed658cc3bac70e6592dabae536a31a4c2da8a578 * fix imports for Types Summary: apparently this is breaking the external build, fix this Reviewed By: patapizza Differential Revision: D19104360 fbshipit-source-id: bc75f698b483a7f4f5b2905e11cf52fd36c1f0a9 * Add Time dimension for language BG Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/403 Reviewed By: haoxuany Differential Revision: D18348752 Pulled By: patapizza fbshipit-source-id: ce3b5c76cb2cf39114216842529d4eaa8df5b93f * Added Slovak (sk) language with numeral dimension and tests. Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/428 Reviewed By: haoxuany Differential Revision: D18348514 Pulled By: patapizza fbshipit-source-id: 9b0b9c2caa9fec8330746059eefa6185a8f3e072 * AF Setup + Numeral (#422) Summary: - Setup Afrikaans (AF) language - Added Numeral Dimension Some of the paths have changed, and some extra files were necessary, after basing initial work off https://github.com/facebook/duckling/commit/24d3f199768be970149412c95b1c1bf5d76f8240 I followed some of the Numeral examples from Dutch as well as Hungarian, since Afrikaans and Dutch have some similarities. One thing was examples for numbers having the number as an example, which I didn't do here, because I'm not sure it's necessary. Pull Request resolved: https://github.com/facebook/duckling/pull/422 Reviewed By: awalterschulze Differential Revision: D18348617 Pulled By: patapizza fbshipit-source-id: b8c4218629c264b48d6f2cecc4c23e2e281a64da * Time/EN_US: orthodox good friday Summary: Supporting "orthodox good friday" in addition to "orthodox great friday" in the regex Reviewed By: chinmay87 Differential Revision: D19604033 fbshipit-source-id: c6ca68fc34e284304ca2ba07a8f1bf81378c3558 * Adding Locales for ES Numeral Summary: Adding locale rules for ES Numeral because Spain use "," as decimal but south american country use "." 
as decimal. Wiki: https://en.wikipedia.org/wiki/Decimal_separator Reviewed By: haoxuany Differential Revision: D20040111 fbshipit-source-id: e2a4bfc2928df19976ef98e90ee82e7d21b52313 * Time/EN_US: Super Tuesday Summary: Adding new holiday. Reviewed By: haoxuany Differential Revision: D20193781 fbshipit-source-id: c8be523293b7b6ee836965c8914e3db58cc41085 * Time/EN: the (nth) closest (day) to (time) Summary: Leveraging `predNthClosest` helper in English rules. "the second closest monday to february 6" "the closest tax day to boss day 2018" Reviewed By: haoxuany Differential Revision: D20214444 fbshipit-source-id: b6be32f63097d221aa7ccc6df4e3639e4deee4a9 * Fix build Summary: Adding new locale rules to cabal file Reviewed By: patapizza Differential Revision: D20288009 fbshipit-source-id: 71fe63973b4bc58d2fa7952af725b11238c99ef9 * Enabling TimeGrain (#460) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/460 Exposing the TimeGrain feature Reviewed By: patapizza Differential Revision: D20250270 fbshipit-source-id: 726f85eebd95ae31d911ebd9a43428d549aba877 * AmountOfMoney/ES: Add support for intervals Summary: This change applies roughly the same rules for supporting intervals in Spanish AmountOfMoney that we suppor in English: intervals using `entre _ e _` / `de _ a _` / `_ - _` with either money in both slots or a number in the first slot and money in the second. My Spanish is okay but not great - I'm confident these rules are good and cover the most likely phrases, but there's probably room to add more coverage. Reviewed By: patapizza Differential Revision: D20425979 fbshipit-source-id: deb17fc331e1aa192d91dd47bc7f3864a246f0be * Numeral/EN: Fixes ambiguous parses when both ruleNegative and ruleMultiply apply (#406) Summary: I noticed two ambiguous parses would occur when both ruleNegative and ruleMultiply would apply. 
For example: "minus three million two hundred thousand" ``` *Duckling.Debug> debug (makeLocale EN Nothing) "minus three million two hundred thousand" [This Numeral] compose by multiplication (minus three million two hundred thousand) -- negative numbers (minus three million two hundred) -- -- regex (minus) -- -- intersect 2 numbers (three million two hundred) -- -- -- compose by multiplication (three million) -- -- -- -- integer (0..19) (three) -- -- -- -- -- regex (three) -- -- -- -- powers of tens (million) -- -- -- -- -- regex (million) -- -- -- compose by multiplication (two hundred) -- -- -- -- integer (0..19) (two) -- -- -- -- -- regex (two) -- -- -- -- powers of tens (hundred) -- -- -- -- -- regex (hundred) -- powers of tens (thousand) -- -- regex (thousand) negative numbers (minus three million two hundred thousand) -- regex (minus) -- intersect 2 numbers (three million two hundred thousand) -- -- compose by multiplication (three million) -- -- -- integer (0..19) (three) -- -- -- -- regex (three) -- -- -- powers of tens (million) -- -- -- -- regex (million) -- -- compose by multiplication (two hundred thousand) -- -- -- compose by multiplication (two hundred) -- -- -- -- integer (0..19) (two) -- -- -- -- -- regex (two) -- -- -- -- powers of tens (hundred) -- -- -- -- -- regex (hundred) -- -- -- powers of tens (thousand) -- -- -- -- regex (thousand) ``` This PR fixes this ambiguity and Duckling will only return the second (correct) parse. Pull Request resolved: https://github.com/facebook/duckling/pull/406 Test Plan: regen'd classifiers (no-op) :test Duckling.Tests Imported from GitHub, without a `Test Plan:` line. 
Reviewed By: chinmay87, girifb Differential Revision: D20303354 Pulled By: patapizza fbshipit-source-id: 280b0e33b7c944f9d87a7c23afda2f6a843e28a4 * Duration: Rename `timesOneAndAHalf` to `nPlusOneHalf` Summary: When I first skimmed our rules for "half an hour" vs "an hour and a half" I actually thought there might be a bug, because `timesOneAndAHalf` sounds like it's actually multiplying by `1.5`. There's no bug, the implementation is entirely correct, but it does not multiply by 1.5, it adds .5 to any integer value at the given grain. This diff renames the function to be more descriptive. Handy trick for doing this kind of refactor without IDE tooling: ``` find duckling/Duckling/Duration/ -name 'Rules.hs'| xargs sed -i 's/timesOneAndAHalf/nPlusOneHalf/g' ``` Reviewed By: haoxuany Differential Revision: D20456966 fbshipit-source-id: 35020685f091a41618b30b7e5f95dbfa48509b88 * Add type=value to JSON response for Email, PhoneNumber and Url Summary: For consistency. Reviewed By: jtliao Differential Revision: D20524369 fbshipit-source-id: 44031667adccab9bca7b3b6d42c80878bb96ccae * Time/es: Fix ruleYearLatent Summary: Fix `ruleYearLatent` to be the same as the one in `en`. We don't want to match numerals that could have been hours. Reviewed By: patapizza Differential Revision: D20683975 fbshipit-source-id: cdef9b1b5f8a21dc5e207ed2a7afcad84c56a596 * Fix to dockerfile so PCRE regex works. (#467) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/467 Reviewed By: chinmay87 Differential Revision: D20700248 Pulled By: patapizza fbshipit-source-id: 17f933106c6f18fcd93b73f42af458220d93b6cf * AmountOfMoney/EN: Make ruleIntervalMax, ruleIntervalMin symmetric Summary: When I was working on some related diffs, I noticed that there were some asymmetries between the regexes for ruleIntervalMax and ruleIntervalMin: - we had no support for "at most", even though we did have "at least" - we had no support for "not? 
less than" - the ordering of the different constructions didn't match This a minor tweak to make things match better Reviewed By: patapizza Differential Revision: D20484594 fbshipit-source-id: c3c54a9cc1b83402e42634b7a98a1a3b8cc5e09c * Improve Docker build (#341) Summary: * Reduces size of final image from 5GB to 130MB * Builds any checkout (not locked to the master) * Doesn't run stack on CMD (executes static build of Duckling instead) Pull Request resolved: https://github.com/facebook/duckling/pull/341 Reviewed By: chinmay87 Differential Revision: D21083018 Pulled By: patapizza fbshipit-source-id: d909158f20f5b8da5b0248a25103b850797bc3a3 * Time/es: Make "n horas" latent". (#478) Summary: 1. ~~Fixed broken build due to the problem with main test entry point;~~ 2. Fixed the ambiguous results caused by mishandling the ranking rules for parsing frames in ES. For example "una hora" be interpreted either as "Duration" or "1pm" in "Time" dimension. And the expected result should be in "Duration" dimension. 3. ~~ignore stack lock file~~ Pull Request resolved: https://github.com/facebook/duckling/pull/478 Test Plan: ``` :test Endpoint.Duckling.Tests --hide-successes [1003 of 1003] Endpoint.Duckling.Tests (Duckling.Api changed) Ok, two modules loaded. 
All 357 tests passed (79.69s) ``` ``` haxlsh> H.io $ debug (makeLocale ES Nothing) "de una horas" [This Time, This Duration] <integer> <unit-of-duration> (una horas) -- number (0..15) (una) -- -- regex (una) -- hora (grain) (horas) -- -- regex (horas) [Entity {dim = "duration", body = "una horas", value = RVal Duration (DurationData {value = 1, grain = Hour}), start = 3, end = 12, latent = False, enode = Node {nodeRange = Range 3 12, token = Token Duration (DurationData {value = 1, grain = Hour}), children = [Node {nodeRange = Range 3 6, token = Token Numeral (NumeralData {value = 1.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 3 6, token = Token RegexMatch (GroupMatch ["una","","a","","",""]), children = [], rule = Nothing}], rule = Just "number (0..15)"},Node {nodeRange = Range 7 12, token = Token TimeGrain Hour, children = [Node {nodeRange = Range 7 12, token = Token RegexMatch (GroupMatch ["ora"]), children = [], rule = Nothing}], rule = Just "hora (grain)"}], rule = Just "<integer> <unit-of-duration>"}}] it :: [Entity] ``` Reviewed By: fascpt Differential Revision: D21770015 Pulled By: chinmay87 fbshipit-source-id: 3056fcf656140c9d65b70b5c604a286ea2c307b2 * Fixed a problem with paring fractional time phrase for hours and minutes. (#483) Summary: Current behavior: "an hour and 45 minutes" -> parsed as "1 hour" [dimension: "Duration"] "a minute and 30 seconds" ->parsed as "1 minute" [dimension: "Duration"] Expected behavior: "an hour and 45 minutes" -> "105 minutes" with dimension as "Duration" "a minute and 30 seconds" -> "90 seconds" with dimension as "Duration" The fix: adding new rule to handle this duration composition pattern. 
(<some duration> and <some other duration>) Pull Request resolved: https://github.com/facebook/duckling/pull/483 Reviewed By: haoxuany Differential Revision: D21850773 Pulled By: chinmay87 fbshipit-source-id: 62eb6859e0ce2b88cf8ae48d836a1a6a1ac8705d * Fixed a problem with parsing "day of month" that contains "dia" in it (#487) Summary: Current: "el dia nueve" -> "9pm" of current day Expected: "el dia nueve" -> 9th of current or next month Fix: added new ES rule to handle the pattern like "el dia <day of month>" Pull Request resolved: https://github.com/facebook/duckling/pull/487 Reviewed By: girifb Differential Revision: D21850807 Pulled By: chinmay87 fbshipit-source-id: d8edd81273c7e5f700b440ccc8c7e7bded679051 * Added new rule for "midday" (#490) Summary: added new EN rule to parse the phrases that contain "midday". Pull Request resolved: https://github.com/facebook/duckling/pull/490 Differential Revision: D21959562 Pulled By: chinmay87 fbshipit-source-id: f9ab45aecd551e8959d00b0025ed38b616ed6b14 * Fixed problem with parsing fractional (with decimal) minutes (#484) Summary: Current behavior: a sentence with the pattern "xxx.yyy minutes" is parsed as yyy minutes. Expected behavior: xxx.yyy minutes = 60*xxx + 0.yyy*60 seconds. For example: "15.5 minutes" = 60*15 + 0.5*60 = 930 seconds Pull Request resolved: https://github.com/facebook/duckling/pull/484 Reviewed By: haoxuany Differential Revision: D21850782 Pulled By: chinmay87 fbshipit-source-id: c007901d4dd6476e5e383a13892ecff9b2191fff * Added support for parsing new ES duration phrases like half hour, quarter of hour.
(#489) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/489 Differential Revision: D21959268 Pulled By: chinmay87 fbshipit-source-id: 2b785b44da5437c7b27af098daef551139dad990 * Fixed the problem with parsing fractional hour phrase that contains "quarter" or "quarters" (#485) Summary: Current: if the fractional hour expression describes the hour fraction with term like "quarter or quarters", then duckling couldn't correctly recognize it. Expected: Duckling should be able to identify this kind of expression and parse it correctly. Fix: Add new rule to parse the fractional hour pattern that contains the keyword like "quarter or quarters". Pull Request resolved: https://github.com/facebook/duckling/pull/485 Test Plan: Imported from GitHub, without a Test Plan: line. Reviewed By: haoxuany Differential Revision: D21850804 Pulled By: chinmay87 fbshipit-source-id: 818b7b3f37e3f8a6d1a7d579db19fb2cfb2763f4 * ES/Duration: Add Copyright header to tests file Summary: as title Reviewed By: girifb Differential Revision: D21998107 fbshipit-source-id: 7c1c91db9a1ebf29d702930570341dc3b6b0ce65 * Duckling probabilistic layer bug fix Summary: while computing a score used to rank in Duckling, it currently sums up the log likelihoods learned during training. While ranking, the goal is to find the (same span) parse candidate which is _more_ likely to lead to a *correct* parse. However, the old logic was summing up the "more confident of the two classes" log likelihood.From what I understand this is the part which feels wrong. I created an example of two rules: #1. a rule where the classifier learns that the rule is very confidently NOT the correct parse. - okdata (positive class) is very low confidence (high negative number prior) - kodata (negative class) is very high confidence (low negative number prior) #2. a rule where the classifier is confident that it is the correct parse, but not Very Confident. 
- okdata (positive class) is high confidence (nonzero, but low negative number prior) - kodata (negative class) is very low confidence (high negative number prior) these two rules match the same regex, thus the same span. While duckling parses it, it turns out, that rule #1 ranks higher than rule #2. The reason why is because #1 is MORE confident that it is the INCORRECT (does not contribute to) parse than rule #2. Does this make sense? to solve this problem, I changed the ranking score estimation to use only the positive class scores (okdata). In the example above, it fixes it so rule #2 would end up ranking higher because the positive class confidence is higher than #1's positive class confidence. Would really love some deeper input from Duckling experts. I re-learned haskell and learned haxl to craft a small example here, and I am very new to Duckling (just started reading the ranking code on Friday). I know Duckling is battle-tested but I also don't believe that means a bug can't exist. And further, this specific bug may not happen a whole lot for 2 reasons: - there are not a lot of rules which end up higher negative confidence than positive (requires enough negative corpus examples over positive ones) - ranking uses span width first, and only when the spans are equivalent does the score based ranking come into play. So it requires that 2 rules match the same span before any actual score calculation even matters. Reviewed By: patapizza Differential Revision: D22009276 fbshipit-source-id: 13491689d39d810da526fa4bb8b6e526d4cafd35 * Added new rules to parse phrases for upcoming weeks. (#491) Summary: the new rules could parse phrases in the form of xxx upcoming weeks upcoming xxx weeks Pull Request resolved: https://github.com/facebook/duckling/pull/491 Test Plan: Imported from GitHub, without a Test Plan: line. 
Differential Revision: D21959647 Pulled By: chinmay87 fbshipit-source-id: a062a8c7a6c2e23b921b1099b886fa589c69c454 * Fix a problem with parsing ES time phrase Summary: The root cause was an error in parsing an ES numeral value [1-9] that is spelled with two words instead of one. For example, "cero dos" should be parsed as "dos". Currently it is parsed as two numeral values: 0 and 2. Reviewed By: chinmay87 Differential Revision: D22162804 fbshipit-source-id: 949956935a21e742f6788e7afa788ff728dd9a8d * Added new rule to support ES phrase for "next week". (#493) Summary: Please note that the major difference from the existing rule for next week is that the new phrase doesn't have the leading "la" or anything with similar meaning. Pull Request resolved: https://github.com/facebook/duckling/pull/493 Test Plan: Imported from GitHub, without a Test Plan: line. Reviewed By: patapizza Differential Revision: D21981169 Pulled By: yuanbing fbshipit-source-id: 7478d1262c3a4599d359b485b28a547ad5f44b76 * Updated the rule to parse ordinal day of month in ES (#495) Summary: the rule is updated to conform with the natural expression of "ordinal day of month". Pull Request resolved: https://github.com/facebook/duckling/pull/495 Differential Revision: D22054297 Pulled By: yuanbing fbshipit-source-id: d9d8e00311d4d3121685ab5b09f6c1f52f3077c9 * support "noon" phrase in ES Summary: This fix adds support for parsing an alternative ES phrase for "noon". Currently the supported ES phrase for "noon" is "mediodia"; the alternative form is "medio<whitespace*>dia".
Reviewed By: chinmay87 Differential Revision: D22188049 fbshipit-source-id: 798b83be75798f3b0d695a0f01a65dc84af98e22 * Added new rule to support composite duration phrases in ES (#498) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/498 Test Plan: In haxlsh: H.io $ debug (makeLocale ES Nothing) "dos hora y treinta y cinco minutos" [This Duration] Reviewed By: chinmay87 Differential Revision: D22054695 Pulled By: yuanbing fbshipit-source-id: b4486141bf7ccb0e538e40ce40fadd7daef374a8 * added new rule to parse phrase in the pattern "xxx minutes to <hour-of-day>" (#500) Summary: Current: 20 minutes to 2pm tomorrow -> 20 minutes (dimension: Time) Expected: 20 minutes to 2pm tomorrow -> 1:40pm of next day (dimension: Time) Pull Request resolved: https://github.com/facebook/duckling/pull/500 Reviewed By: chinmay87 Differential Revision: D22200580 Pulled By: yuanbing fbshipit-source-id: e47e5b5aaf4e3644c7032096caa75672a8543087 * Fixed a problem in parsing ES timestamp Summary: There are two types of ES phrases for timestamp to support: 1. "para las seis cero dos pm" 2. "para las 6 0 2 pm" The solution is to: 1. add a new rule to parse two-digit number between 1 and 9 (inclusive); 2. modify the regex pattern to support the additional optional phrase "para" in front of "las". Reviewed By: chinmay87 Differential Revision: D22218800 fbshipit-source-id: 58f692beb6f10834c0ab639b31bf239bf4a1970e * Added new rule to parse ES phrase for time of day (in the afternoon) (#496) Summary: Current: "seis dos de la tarde" -> "dos de la tarde" or 2pm; note that the term "seis" is dropped.
Expected: "seis dos de la tarde" -> "seis dos de la tarde" or 6:02pm Pull Request resolved: https://github.com/facebook/duckling/pull/496 Test Plan: H.io $ debug (makeLocale ES Nothing) "seis dos de la tarde" [This Time] Reviewed By: chinmay87 Differential Revision: D22054328 Pulled By: yuanbing fbshipit-source-id: 1ecb05885fc506176cc04768aa158279c7e7fd4f * Updated the rule to parse "last <day-of-week> of <time>" Summary: current: last friday in october -> the date of Friday of previous week expected: last friday in october -> the date of the last Friday of the month of October Reviewed By: chinmay87 Differential Revision: D22201326 fbshipit-source-id: 1983c1b9c24aa356977af7def42d5ba07c7f08be * Add support for spelled out time of Summary: Current: "twelve zero three" -> 12:00pm Expected: "twelve zero three" -> 12:03pm The root cause was that Duckling doesn't support this kind of pattern for timestamp. The peculiarity here was that the number "three" was spelled as "zero three", which Duckling failed to understand. Reviewed By: chinmay87 Differential Revision: D22313140 fbshipit-source-id: 9e481a142a16b94c61b1770e7f8be036497419f8 * added new rule to handle ES phrase for next week (#497) Summary: Current: "siguiente semana" -> [] // empty result Expected: "siguiente semana" -> "next week" Pull Request resolved: https://github.com/facebook/duckling/pull/497 Test Plan: haxlsh> H.io $ debug (makeLocale ES Nothing) "siguiente semana" [This Time] Reviewed By: chinmay87 Differential Revision: D22054455 Pulled By: yuanbing fbshipit-source-id: 576e96a49eebace9b5baa382efac2e266e651d8e * added support to parse ordinal day-of-week Summary: Current: "first monday of last month" -> the date of the first Monday starting from the current time. Note that here the term "last month" is dropped. Expected: "first monday of last month" -> the date of the first Monday of the previous month.
Reviewed By: chinmay87 Differential Revision: D22300243 fbshipit-source-id: 16622860c52ec2ce9c7a7bcd6094192255aa5a0b * Added support for parsing year composed of multiple ES words Summary: The root cause is the lack of support for the composition of numerals in ES. For example, "mil novecientos noventa" is parsed as three individual numbers: 1000, 900 and 90, respectively. Instead, the expected result is a single numeral value that is the sum of the aforementioned three numbers. The same expectation extends to compositions with an arbitrary number of numeral values. Reviewed By: chinmay87 Differential Revision: D22192034 fbshipit-source-id: 476489145b83297b82d88f3451020c867e2d08aa * Tweak the rule for parsing "tomorrow" in ES Summary: There are two rules for parsing "manana" (dimension: Time): one resolves to "morning", while the other resolves to "tomorrow". The first (or "morning") rule resolves to a LATENT result, while the second (or "tomorrow") rule resolves to a NON-LATENT result. If Duckling is called with the "latent" option turned off, the "tomorrow" rule prevails. However, if Duckling is invoked with the "latent" option turned on, the "morning" rule is preferred. The solution (for now) is to steer the classifier towards the "tomorrow" rule by adding a large number of (identical) examples for the "tomorrow" rule. Reviewed By: chinmay87 Differential Revision: D22425277 fbshipit-source-id: 2f139eec0c38b9b5227f27d9f09f6264e7cf86cd * Fixed the problem parsing "next <day-of-week>" Summary: If the current time is: 07/07/2020 (tuesday), Current: "next saturday" -> 07/11/2020 Expected: "next saturday" -> 07/18/2020 According to Quora (https://www.quora.com/When-is-this-Monday-and-next-Monday-Are-they-the-same#:~:text='Next%20Monday'%20is%20Monday%20of,the%20first%20Monday%20after%20today.), the term "next saturday" means the first saturday in the week after the current (this) week, regardless of the current day of week. 
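The "next <day-of-week>" reading described above can be checked with a small date calculation. The following is only a sketch of that arithmetic, not the rule as implemented in Duckling; it assumes weeks start on Monday, and the names `weekdayIndex`, `nextWeekday` and `saturday` are hypothetical helpers introduced for illustration.

```haskell
import Data.Time.Calendar (Day, addDays, fromGregorian, toModifiedJulianDay)

-- | Day-of-week index with Monday = 0 .. Sunday = 6.
-- Modified Julian Day 0 (1858-11-17) was a Wednesday, hence the +2 shift.
weekdayIndex :: Day -> Integer
weekdayIndex d = (toModifiedJulianDay d + 2) `mod` 7

-- | "next <day-of-week>" read as: that weekday in the week after the
-- current one. Assumption: weeks run Monday to Sunday, and 'target'
-- uses the same Monday = 0 indexing as 'weekdayIndex'.
nextWeekday :: Integer -> Day -> Day
nextWeekday target today =
  addDays (7 - weekdayIndex today + target) today
  -- (7 - weekdayIndex today) moves to next week's Monday;
  -- adding 'target' then moves to the requested weekday.

-- | Hypothetical constant for readability.
saturday :: Integer
saturday = 5
```

For the example in the summary, `nextWeekday saturday (fromGregorian 2020 7 7)` yields 2020-07-18, whereas the old behaviour corresponds to the Saturday of the current week (2020-07-11).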
Reviewed By: haoxuany Differential Revision: D22420499 fbshipit-source-id: c2bd28b9fda78ff3cb0418a50c3b302be350b02d * Fixed a problem in parsing mult-word timestamp for ES Summary: Current: "seis cero cinco pm" [dimension Time] -> "cero cinco pm" or "5 pm" here the term "seis" was dropped because it was treated as "6" in "Numeral" dimension. Expected: "seis cero cinco pm" -> "6:05 pm" The root cause was that the rule "<hour-of-day> <integer> (as relative minutes)" dropped the first term "hour-of-day" if it was parsed as a latent token. Reviewed By: chinmay87 Differential Revision: D22553028 fbshipit-source-id: abc92bb369c23d2b3084641eab2a2dabb87dbc66 * Fixed the rule for parsing "coming <time cycle>" Summary: Currently the term "coming" is being treated the same way as "this" or "current". The expected treatment should be the same as the term "next". Reviewed By: chinmay87 Differential Revision: D22435156 fbshipit-source-id: b0b20d8a38014267fb7d037b685ce126f602bda7 * Export default module name 'Main' from within TestMain.hs file (#512) Summary: **Summary** **Current** `stack test` fails with an error "output was redirected with -o, but no output will be generated because there is no Main module" **Expected** `stack test` should run tests to completion The cause here seems to be that the [`main-is` flag](https://github.com/facebook/duckling/blob/a88e0669f7d5889bda182b61bd05cdae697a2c07/duckling.cabal#L851) supplies the *filename* in which to begin tests, but expects to find a *module* named `Main` there by default. 
Two possible fixes are possible - either: - [Add a ghc-options flag](https://github.com/facebook/duckling/issues/505#issue-650474748) to specify a module name; confusingly the flag name is also `main-is` - Use the default `Main` module name within TestMain.hs (the approach taken here is the latter, since this avoids duplicating use of flags named `main-is` in slightly different contexts) **References** - https://github.com/facebook/duckling/issues/505 - https://github.com/haskell/cabal/issues/4315 **Version Info** ```sh $ stack --version 1.9.3.1 x86_64 Compiled with: - Cabal-2.4.0.1 # <remainder of output omitted> ``` Resolves https://github.com/facebook/duckling/issues/505 Pull Request resolved: https://github.com/facebook/duckling/pull/512 Reviewed By: girifb Differential Revision: D22799888 Pulled By: patapizza fbshipit-source-id: 2c0808790e6671e6bc3c9b1f322e57b8dc32a8cc * Re-sync with internal repository * Time/DE: Don't parse "so" Summary: "so" is an adverb in German: https://github.com/wit-ai/wit/issues/1860 It's also a short form for "Sonntag" (Sunday); making the dot mandatory. Reviewed By: haoxuany Differential Revision: D22900791 fbshipit-source-id: 8dc873f79a21ca2add074f9c664e84fae56f1e67 * NL/Duration: Add "anderhalf uur" (#502) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/502 Reviewed By: patapizza Differential Revision: D22260625 Pulled By: haoxuany fbshipit-source-id: bf44fdab7def19f6dd0e0ef7763c112a3b024396 * Time/FR: Some speed up Summary: Guarding against grains, shortening regexes. 
Reviewed By: jtliao Differential Revision: D23387716 fbshipit-source-id: de84d0efa79c4ae10bd9fbf14e82a724fee1a1f2 * Time/EN: Fix empty group match Summary: sad_palpatine Differential Revision: D23718913 fbshipit-source-id: 363bf9a43d8d1cd77405882bc70a7fa1a1de2dbe * Remove dependency on Data.Some (#533) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/533 In recent versions of Data.Some the name of the constructor, `This` has changed name to `Some`. This has become rather problematic for us to migrate so we're just going to remove the dependency. The meat of this diff is adding the type `Seal` to `Duckling.Types`. That type replaces `Some`. Reviewed By: pepeiborra Differential Revision: D23929459 fbshipit-source-id: 8ff4146ecba4f1119a17899961b2d877547f6e4f * Update dependencies/CI Summary: This PR accomplishes several things: - removes dist-newstyle (local build artifacts should not be checked in) - extends the .gitignore to include many common build artifacts/editor artifacts - allow more modern dependencies (upper bounds of many were out of date by one or two years' worth of releases) - upgrade stack lts (9.2 -> 14.2) to GHC 8.6.5 - regenerate .travis.yml using the now-standard haskell-ci (many haskell core libraries use this), instead of the outdated script that was maintained by hvr; as a precursor to this, the tested-with versions were updated Reviewed By: patapizza Differential Revision: D24623967 fbshipit-source-id: 838fe571df0b8d44106349659ce8ce8ab82f0bc6 * Adds new rules of accentuation of the Portuguese (#531) Summary: Keeps accents consistent, "quinquagésimo" there is no more "Ü". 
Pull Request resolved: https://github.com/facebook/duckling/pull/531 Reviewed By: patapizza Differential Revision: D23770703 Pulled By: chessai fbshipit-source-id: f8a34c02028faf9f51eca6a016b5bad988a83f04 * Dockerfile: debugs the build and uses Debian Buster everywhere (#539) Summary: The Dockerfile build part did not copy the Duckling implementation into the container, making the build fail. I also harmonized the target Debian to Buster, that is the one currently hidden behind `haskell:8`. Pull Request resolved: https://github.com/facebook/duckling/pull/539 Reviewed By: patapizza Differential Revision: D24688839 Pulled By: chessai fbshipit-source-id: 0ffcc4d28a599b7edad668730117828d26e116ad * Quantity rules for Spanish (ES) Summary: Spanish (ES) will now have all the same quantity rules as English (EN) (which I think is the most-supported language), plus more. This includes the following: * bowls - (bol(es)?|tazón(es)?|cuencos?|platos? (soperos?)|(hondos?)) (EN does not currently have this) * cups - (tazas?) * dishes - (platos?|fuentes?) (EN does not currently have this) * grams - (((m(ili)?)|(k(ilo)?))?g(ramo)?s?) * ounces - ((onzas?)|oz) * pints - (pintas?) (EN does not currently have this) * pounds - ((lb|libra)s?) * quarts - (cuartos? de galón) (EN does not currently have this) * tablespoons - (cucharadas? (grande)?) (EN does not currently have this) * teaspoons - (cucharaditas?) (EN does not currently have this) Reviewed By: patapizza Differential Revision: D24628214 fbshipit-source-id: 2e8d500661f30fa0928cb7d3f21470afc01e2285 * adds frequent durations in German (#509) Summary: Found a lacking frequent duration in German and a small typo in the existing one. 
Pull Request resolved: https://github.com/facebook/duckling/pull/509 Reviewed By: patapizza Differential Revision: D24690104 Pulled By: chessai fbshipit-source-id: b49a7a636abf5b92f2fe7c0d5b2ca2fe64acbaa2 * ghc88x compat (#550) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/550 Reviewed By: haoxuany Differential Revision: D24844625 Pulled By: chessai fbshipit-source-id: 52dcf5f9488386f7f407535e876bff1207823fe0 * fix common windows build issue (#549) Summary: * use regex-pcre-builtin by default on windows * update cabal version to 2.2 to support leading commas - requires the very first line in cabal file be the cabal-version line - BSD3 is not BSD-3-Clause (don't ask me why) resolves https://github.com/facebook/duckling/issues/547 Pull Request resolved: https://github.com/facebook/duckling/pull/549 Reviewed By: haoxuany Differential Revision: D24838317 Pulled By: chessai fbshipit-source-id: 376eb30a94ab88420915b868dffddb252fd08e76 * make duckling time not treat 0:xx and 12:xx ambiguously Reviewed By: haoxuany Differential Revision: D24929661 fbshipit-source-id: 3858d14ef1655f079daa33d2b159e8cb918a70ac * Support for more Hindi numbers (#552) Summary: Add support for additional Hindi numbers like 300, 81, 150, 1000, 1520. These are not supported in the current master version. Pull Request resolved: https://github.com/facebook/duckling/pull/552 Reviewed By: ashwinp-fb, girifb Differential Revision: D25072230 Pulled By: chessai fbshipit-source-id: 35277a2349384bcf44a20e74852113f5c010e618 * FA Setup (#520) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/520 Reviewed By: patapizza Differential Revision: D25072459 Pulled By: chessai fbshipit-source-id: 5db72eda36fe166a452b2345cab75fb1508b192b * Improve german time approximation (#435) Summary: Improves the recognition of German time approximation language and removes a single error in the rule of <time-of-day> approximately. 
Pull Request resolved: https://github.com/facebook/duckling/pull/435 Reviewed By: patapizza Differential Revision: D24934281 Pulled By: chessai fbshipit-source-id: 641bcb6a7e5c26e66c735fe13bccae9b7a8909ae * ES/Ordinal: Fixes "tercero" pattern regex (#477) Summary: Missing "tercer" regex in rule Pull Request resolved: https://github.com/facebook/duckling/pull/477 Reviewed By: patapizza Differential Revision: D24934794 Pulled By: chessai fbshipit-source-id: a51f6fe3187749885784bfaacfee09cf26a8df6d * GitHub CI over Travis (#555) Summary: Facebook is migrating away from Travis CI, to GitHub actions. Pull Request resolved: https://github.com/facebook/duckling/pull/555 Reviewed By: patapizza Differential Revision: D25228779 Pulled By: chessai fbshipit-source-id: a392b93e5a7b02d1f47b477b6c459901d3171e05 * Document how to pass dimensions to the example application Summary: External users are repeatedly confused by lack of results from the duckling example executable. We should just go through all dimensions for the duckling call in the example app. Reviewed By: patapizza Differential Revision: D25468199 fbshipit-source-id: 6cf56b130d4d0aa3181f098d6a7c9a133bfa85ff * Fix typo in PL Duration Rules (#426) Summary: 'miej' in Polish is the imperative form of the verb 'mieć' (to have). "mniej więcej" means "more or less" and it was the intention here. Pull Request resolved: https://github.com/facebook/duckling/pull/426 Reviewed By: patapizza, girifb Differential Revision: D25546380 Pulled By: chessai fbshipit-source-id: 1047b83109cab917f1f4dbe87b667f8ccd2fb92d * Support abbreviation of Crore and Lakh Summary: Crore (1e7) and Lakh (1e5) are both commonly used to describe an amount of Indian currency. Common abbreviations are "Cr" (Crore) and "lkh", "L", "lac" (lakh). 
Additionally, common spellings of "crore" include "karor" and "koti" Reviewed By: patapizza Differential Revision: D25550546 fbshipit-source-id: 0c1479d9027431cb0d1182b5117eabca6f939cb2 * Add a new Arabic locale (EG) (#554) Summary: Egyptian Arabic is a dialect of Arabic that is mostly a spoken language that is used in everyday communications. This PR adds new locale to Arabic to support the differences between Modern Standard Arabic (MSA) and Egyptian Arabic (EG). I have mainly depended on the different locales of Spanish that are supported by Duckling to create the new Egyptian Arabic locale. New modifications are added to the `Numeral` dimension since I didn't spot differences in other dimensions. Pull Request resolved: https://github.com/facebook/duckling/pull/554 Reviewed By: patapizza Differential Revision: D25543502 Pulled By: chessai fbshipit-source-id: 4cbb7be78a52071c8681380077f0b4dc033a60de * ExampleMain: fix build failure (#560) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/560 Reviewed By: patapizza Differential Revision: D25564850 Pulled By: chessai fbshipit-source-id: 631f96a3ed71b9d7707560ff6bfe7596feee2305 * add: support for quarter to, quarter past and half in HI (#423) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/423 Reviewed By: girifb Differential Revision: D25573001 Pulled By: chessai fbshipit-source-id: 5474f108e968bdfb53ebc2518b46f28befdeba89 * Adding Numerical Dimention support for Telugu language (#470) Summary: This pull request is to add support for Telugu language (Numerical Dimension) to Duckling Pull Request resolved: https://github.com/facebook/duckling/pull/470 Differential Revision: D25546700 Pulled By: chessai fbshipit-source-id: 1d88ee27da8a577a4a79ff31be8cb55ed6444c4e * Time/PL - new rules (#538) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/538 Reviewed By: haoxuany Differential Revision: D24640854 Pulled By: chessai fbshipit-source-id: 
51eb0d530b143511f79992a91ca8f465b7860b6e * Add CreditCardNumber to common dimensions (#563) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/563 Reviewed By: girifb Differential Revision: D25624047 Pulled By: chessai fbshipit-source-id: b50cf34f4a28bfcbd4a0ca3479debc5a5c118b5e * Correct CDT TimeZone offset Summary: CDT is UTC -5. (-5 hours) * (60 minutes/hour) = -300 minutes. 540 was probably a copy/paste error. Reviewed By: girifb Differential Revision: D25877623 fbshipit-source-id: de4f84f2564cbb154aec95eee63c458c64f8a85f * Add ASAP, at the moment to EN time (#405) Summary: * "at the moment" is considered identical to "now". * "ASAP" is considered identical to "from now" Pull Request resolved: https://github.com/facebook/duckling/pull/405 Reviewed By: patapizza Differential Revision: D26009483 Pulled By: chessai fbshipit-source-id: addf4c509e69d413cae279601c64f72710eba11f * Numeral/ZH: support more common expressions (#516-1) (#522) Summary: **1st set of changes from pull request https://github.com/facebook/duckling/issues/516 Supporting more common expressions, such as fraction, half, dozen, in Chinese. Pull Request resolved: https://github.com/facebook/duckling/pull/522 Reviewed By: patapizza Differential Revision: D23428893 Pulled By: chessai fbshipit-source-id: 3454ac70a4bfff90dc282560916a0fae9969f521 * NL/amount-of-money (#504) Summary: Currently values like 1000.000 (in Dutch, . is the thousands separator) are not recognised, as the ruleDecimalWithThousandsSeparator requires the decimal part (e.g. 1000.000,34) to be present. This PR adds some data and changes the ruleDecimalWithThousandsSeparator to make the decimal part optional. 
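To illustrate what "making the decimal part optional" means for inputs such as `1000.000` and `1000.000,34`, here is a minimal stand-alone sketch. It is not the regex-based `ruleDecimalWithThousandsSeparator` from Duckling; `parseDutchNumber` is a hypothetical helper, and for simplicity it does not check that the dots sit at valid grouping positions.

```haskell
import Data.Char (isDigit)

-- | Parse a number that uses '.' as thousands separator and an
-- OPTIONAL ',' decimal part, e.g. "1000.000" or "1000.000,34".
-- Simplification (assumption): dot positions are not validated.
parseDutchNumber :: String -> Maybe Double
parseDutchNumber s =
  case break (== ',') s of
    -- No comma at all: the decimal part is absent, which is now allowed.
    (intPart, "") -> build intPart "0"
    -- A comma must be followed by at least one digit.
    (intPart, _ : decPart)
      | null decPart -> Nothing
      | otherwise -> build intPart decPart
  where
    build intPart decPart
      | null digits = Nothing
      | not (all isDigit digits) = Nothing
      | not (all isDigit decPart) = Nothing
      | otherwise = Just (read (digits ++ "." ++ decPart))
      where
        -- Drop the thousands separators before reading the value.
        digits = filter (/= '.') intPart
```

With this sketch both `parseDutchNumber "1000.000"` and `parseDutchNumber "1000.000,34"` succeed, while a dangling comma such as `"1000.000,"` is rejected.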
Pull Request resolved: https://github.com/facebook/duckling/pull/504 Reviewed By: patapizza, girifb Differential Revision: D26078885 Pulled By: chessai fbshipit-source-id: b1679c713e1d17a168d34a3cc556b6c36a571d75 * Time&Duration/ZH: support Cantonese and more common expressions (#516-2) (#523) Summary: **2nd set of changes from pull request https://github.com/facebook/duckling/issues/516 Supporting Cantonese and more common expressions in Chinese. Adding rules file for Duration/ZH. Pull Request resolved: https://github.com/facebook/duckling/pull/523 Reviewed By: haoxuany Differential Revision: D23428901 Pulled By: chessai fbshipit-source-id: 6d04c97b63bac966eb61d77cab2f08f7543dbbf0 * NL/Duration: Support composite durations (#503) Summary: E.g. "1 uur en drie kwartier", "1 dag 4 uur", etc. Pull Request resolved: https://github.com/facebook/duckling/pull/503 Reviewed By: patapizza Differential Revision: D22260615 Pulled By: chessai fbshipit-source-id: 40689f7630b4d5bab498df730528ce6bf768fa89 * skip logfile creation if no logging (#377) Summary: **Motivation** Currently the log files and the log directory for the server are always created, even if the logging is disabled. If duckling is used on OpenShift the file creation leads to errors if no volume mount is defined. **Proposed Change**: Only create log files / log directory if the logging is enabled. Pull Request resolved: https://github.com/facebook/duckling/pull/377 Reviewed By: patapizza Differential Revision: D26148878 Pulled By: chessai fbshipit-source-id: f8e2b1a38586121d854a4826c322b4b859cc9c6b * Add Arabic rule for a week ago (#379) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/379 Reviewed By: patapizza Differential Revision: D26149123 Pulled By: chessai fbshipit-source-id: 5f0bca88fc1b64da5d93fcf715996d58a972fda2 * Polish(PL) - Support for seventy, eighty, ninety (#417) Summary: Support for polish equivalents of seventy, eighty, ninety. 
Pull Request resolved: https://github.com/facebook/duckling/pull/417 Reviewed By: patapizza Differential Revision: D26130642 Pulled By: chessai fbshipit-source-id: 4a0be944dcd0a9dea155caae145cf4a38537753f * implement 'the day after tomorrow' in Romanian Summary: adds a rule for 'the day after tomorrow' in Romanian. regenerates classifiers. Reviewed By: girifb Differential Revision: D26155042 fbshipit-source-id: 80005ab94a10f9fbf242c9a712bd040e4f6bc477 * parse latent year intervals Summary: adds a new rule that parses year intervals such as "1960 - 1961". see inline comments for heuristics. Reviewed By: patapizza Differential Revision: D25840835 fbshipit-source-id: 851a5b1c78440cbf065bf9f20a05c78d4967ea3c * Adds UAH currency Type and examples to EN and RU Corpus (#433) Summary: This PR adds UAH currency Type and examples to EN and RU Corpus Pull Request resolved: https://github.com/facebook/duckling/pull/433 Reviewed By: girifb Differential Revision: D25102990 Pulled By: chessai fbshipit-source-id: ed40e8dfcf145a65c7e6d87158da0efacb32e256 * Add initial support for volumes in Chinese Reviewed By: girifb Differential Revision: D26183123 Pulled By: chessai fbshipit-source-id: 1acd27d5172cfb5bccbeb1576700e2c60a8e3907 * Minor Volume.FR improvement: add "Centilitre" type (#354) Summary: Minor Volume.FR improvement: add "Centilitre" type. This is useful for recipe parsing. Pull Request resolved: https://github.com/facebook/duckling/pull/354 Reviewed By: patapizza Differential Revision: D26193246 Pulled By: chessai fbshipit-source-id: ddd551e062b8efeff1e786e30e35815c0c29a34c * Parse more date formats in Norwegian (#395) Summary: In general there are some clashes between time formats `hhmm` and date formats `ddmm`. For example, depending on context, `22.10` can mean clock time ten past ten or the twenty second of october. In general it's correct to interpret this as clock time, as Duckling currently does. 
But there are some cases not currently covered by Duckling where we have more unambiguous dates, e.g. `12.03.2018` and `27.11`. These are included here (in addition to midnight `24:00` which was also missing). #### Changes: - Bug in `ruleDdmm` regex meant that dates on the format `dd/mm` where `mm > 9` were not parsed - `ruleYyyymmdd` now also parses dots and forward slashes, i.e. `2012.05.14` and `2012/05/14` - New rule `rule2400` parses `24:00` and `24.00` (I elected not to include it in `ruleMidnighteodendOfDay` as it has grain minute rather than day) - New rule `ruleDmm` parses `1/10`, `9.12` etc - New rule `ruleDDm` parses `10/3`, `11.1` etc - New rule `ruleDdDotMm` parses `25.02`, `31.10` etc - `ruleDdmmyyyy` now also parses dots, i.e. `03.10.1983` - New tests Pull Request resolved: https://github.com/facebook/duckling/pull/395 Reviewed By: patapizza Differential Revision: D26193069 Pulled By: chessai fbshipit-source-id: cf711807fa1d40be2303f2426d74ded40c2e23b3 * Extend numeral rules Summary: - Extend fraction rule - add mixed fraction rules - add prefix of 10/100/10_000 rules Reviewed By: girifb Differential Revision: D26191175 Pulled By: chessai fbshipit-source-id: c2f6b74602e1b8061e0c556721ad8e36821fdb5c * Quantity/EN: Support k.g k.g. (#570) Summary: Adding . in between kilogram units used to be extracted as a Numeral instead of Quantity. 
Pull Request resolved: https://github.com/facebook/duckling/pull/570 Reviewed By: patapizza Differential Revision: D26199687 Pulled By: chessai fbshipit-source-id: 65e39f20296946d5762d7180b12878f4e66ea701 * Use System.FilePath.Posix Summary: Results in no change on linux/macos, but this is necessary on windows to prevent paths from being botched Reviewed By: girifb Differential Revision: D25893201 fbshipit-source-id: ca79dd8a766aecf27562044865d9bc258a4e8d11 * extend AmountOfMoney rules Summary: Add rules: - `hkd` as HKD, and related rules (prefix and suffix) - dollar and <amount-of-money> rule - dollar and a half rule - intersection for <amount-of-money> and `a half` Changed: - dime and dollar rules now have improved coverage Reviewed By: girifb Differential Revision: D26191724 Pulled By: chessai fbshipit-source-id: bf63b6eaa751fb96dcf341fa2b66db06a6eeca79 * Extend distance rules Summary: Add rules: - one meter and <dist> - <dist> meters and <dist> Reviewed By: girifb Differential Revision: D26191350 Pulled By: chessai fbshipit-source-id: 52c85c94647e98fba866c24d3386eea988f7f58c * AmountOfMoney - extend interval support Reviewed By: haoxuany Differential Revision: D26254863 Pulled By: chessai fbshipit-source-id: dfc06f9831de2d50c11d252429c4fb9b8c1eb13a * Volume - extend interval support Reviewed By: haoxuany Differential Revision: D26255089 Pulled By: chessai fbshipit-source-id: e4bdb0aa3c1be55dff0a5577155a3d0469d6762d * Distance - introduce interval rules Reviewed By: haoxuany Differential Revision: D26256269 Pulled By: chessai fbshipit-source-id: 0c3ca267158fd5189fef5540d5bbb903b0dd00b4 * add: support for composite duration in hindi (#425) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/425 Reviewed By: girifb Differential Revision: D26263097 Pulled By: chessai fbshipit-source-id: 29605023746a30dc286ffb246eb30fdc4067cbd8 * Time - add more common expressions Summary: Added: last <duration> <time> <day-of-month> Reviewed By: haoxuany 
Differential Revision: D26263977 Pulled By: chessai fbshipit-source-id: b00ece753593a7fabe45bbaa9e1f013860e38d80 * Be more permissive with numerals [20, 90] Summary: There are a handful of more spelling for russian numbers [20, 30 .. 90] that we aren't handling. Additionally, we optimise for recall over precision by allowing some invalid spellings that could be understandable typos. Reviewed By: patapizza Differential Revision: D26285711 Pulled By: chessai fbshipit-source-id: fd8a8f373d228a526e79b22326eff48bb966310d * Adds german times rules like "Übernächste Woche" (week after next) (#330) Summary: fixes https://github.com/facebook/duckling/issues/329 and allows for recognizing of terms like übernächste woche Pull Request resolved: https://github.com/facebook/duckling/pull/330 Reviewed By: girifb Differential Revision: D26284196 Pulled By: chessai fbshipit-source-id: 160e73668b835c83adb0fd1c396a8a2977e86516 * adds german time rule for expressions like: Montag in 3 Wochen (#332) Summary: closes https://github.com/facebook/duckling/issues/331 Pull Request resolved: https://github.com/facebook/duckling/pull/332 Reviewed By: girifb Differential Revision: D26283481 Pulled By: chessai fbshipit-source-id: 054c6467a69896ff3ebbd1f9bc0734aadf1b6dbe * Add Time dimension for RU language Summary: Used b40e2147a9e7e5445e3c42ffc4b45b30d3b1b052 as reference Reviewed By: kappa Differential Revision: D24773196 Pulled By: chessai fbshipit-source-id: 7cc008c0ee80f930efd76e39bb16ca91ec94b641 * add: support for specific times in HI duration (#424) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/424 Reviewed By: girifb Differential Revision: D26411920 Pulled By: chessai fbshipit-source-id: 3f0063e4786688579f2f53f46b31bda5d222d402 * forward factor parse tree for exploit in T85548324 Summary: due to exploit in T85548324, factoring the input to get a smaller parse tree (the existing one parses tail recursively, whereas this one uses ruleIntersect which is still 
bad, but slightly better). Differential Revision: D26657170 fbshipit-source-id: fe3a738073b4d30ae401521bb692f4a4bba48d96 * Typo correction (#574) Summary: This commit includes typo corrections for the Turkish equivalents of `half` and `three` Pull Request resolved: https://github.com/facebook/duckling/pull/574 Reviewed By: girifb Differential Revision: D26726718 Pulled By: chessai fbshipit-source-id: 840c2d8e491057b6ccec81562ff64356789f587d * Combine duplicated examples Summary: I was looking at adding support for "next week" constructions in Spanish to close https://github.com/facebook/duckling/issues/553 (which it appears has already been handled), when I noticed that the equivalent logic for English has been split into two separate examples: "coming week" isn't in the same example as other equivalent constructs like "upcoming week" and "next week". This diff combines them, which I think is clearer and takes fewer lines of code. Reviewed By: chessai Differential Revision: D26892322 fbshipit-source-id: 68ca4644759198fc79d963ae080495c3f2d4a923 * Update classifiers Summary: I was testing an unrelated change (which doesn't change classifier scores) and reran classifiers just to be safe, and I noticed that the scores changed. This diff updates them. Reviewed By: chessai Differential Revision: D26892970 fbshipit-source-id: c7da3e3b7d01955f98b287a3ff4e7c1ff2837c7f * Load timezones more leniently. (#582) Summary: On some Linux systems, such as NixOS, /usr/share/zoneinfo does not exist. What does exist in its place is /etc/zoneinfo. So, we should try to load that if /usr/share/zoneinfo does not exist. 
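The fallback described above amounts to "use the first zoneinfo directory that exists". A minimal sketch of that idea follows; it is not the code from the pull request, and `firstExistingWith`, `zoneinfoCandidates` and `findZoneinfoDir` are hypothetical names. The existence check is passed in as a parameter so the selection logic can be exercised without touching the real file system.

```haskell
import System.Directory (doesDirectoryExist)

-- | Return the first candidate path accepted by the given check,
-- trying the candidates in order.
firstExistingWith
  :: Monad m => (FilePath -> m Bool) -> [FilePath] -> m (Maybe FilePath)
firstExistingWith _ [] = return Nothing
firstExistingWith exists (p : ps) = do
  ok <- exists p
  if ok then return (Just p) else firstExistingWith exists ps

-- | Candidate directories, in order of preference (taken from the
-- summary above: the usual location first, then the NixOS one).
zoneinfoCandidates :: [FilePath]
zoneinfoCandidates = ["/usr/share/zoneinfo", "/etc/zoneinfo"]

-- | Pick the zoneinfo directory on the current machine, if any.
findZoneinfoDir :: IO (Maybe FilePath)
findZoneinfoDir = firstExistingWith doesDirectoryExist zoneinfoCandidates
```

A caller would then load timezone files from the returned directory and handle `Nothing` as "no timezone database found".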
Pull Request resolved: https://github.com/facebook/duckling/pull/582 Reviewed By: girifb Differential Revision: D27086925 Pulled By: chessai fbshipit-source-id: f4a38822be9888d57034f67a6f7abd17d56d38b8 * Feature/Turkish money (#579) Summary: Added the amount of money dimension for the Turkish language Pull Request resolved: https://github.com/facebook/duckling/pull/579 Test Plan: :test Endpoint.Duckling.Test Reviewed By: haoxuany, bugra Differential Revision: D27017300 Pulled By: chessai fbshipit-source-id: e8cb257a2953675f54269ed358948e8cbe38af7b * Time - #444 Handle 2-digit date in existing d/m/y rule Summary: The pattern laid out in the bug report https://github.com/facebook/duckling/issues/444 is actually already handled by the pattern `<day-of-month>(ordinal or number)/<named-month>/year`. The problem is purely that the regular expression doesn't match 2-digit years, so the pattern is getting skipped rather than evaluated. This diff fixes the regexp and adds a new example with a 2-digit year. 
This fixes the bug report: ``` > debug (makeLocale EN Nothing) "10-Apr-15" [Seal Time] <day-of-month>(ordinal or number)/<named-month>/year (10-Apr-15) -- integer (numeric) (10) -- -- regex (10) -- regex (-) -- April (Apr) -- -- regex (Apr) -- regex (-) -- regex (15) [Entity {dim = "time", body = "10-Apr-15", value = RVal Time (TimeValue (SimpleValue (InstantValue {vValue = 2015-04-10 00:00:00 -0200, vGrain = Day})) [SimpleValue (InstantValue {vValue = 2015-04-10 00:00:00 -0200, vGrain = Day})] Nothing), start = 0, end = 9, latent = False, enode = Node {nodeRange = Range 0 9, token = Token Time TimeData{latent=False, grain=Day, form=Nothing, direction=Nothing, holiday=Nothing, hasTimezone=False}, children = [Node {nodeRange = Range 0 2, token = Token Numeral (NumeralData {value = 10.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 0 2, token = Token RegexMatch (GroupMatch ["10"]), children = [], rule = Nothing}], rule = Just "integer (numeric)"},Node {nodeRange = Range 2 3, token = Token RegexMatch (GroupMatch []), children = [], rule = Nothing},Node {nodeRange = Range 3 6, token = Token Time TimeData{latent=False, grain=Month, form=Just (Month {month = 4}), direction=Nothing, holiday=Nothing, hasTimezone=False}, children = [Node {nodeRange = Range 3 6, token = Token RegexMatch (GroupMatch []), children = [], rule = Nothing}], rule = Just "April"},Node {nodeRange = Range 6 7, token = Token RegexMatch (GroupMatch []), children = [], rule = Nothing},Node {nodeRange = Range 7 9, token = Token RegexMatch (GroupMatch ["15"]), children = [], rule = Nothing}], rule = Just "<day-of-month>(ordinal or number)/<named-month>/year"}}] ``` Reviewed By: chessai Differential Revision: D27106007 fbshipit-source-id: 4751672aef807464febef87f6d22d7270bd335df * Style tweaks Summary: The facebook internal linters prefer us to avoid excessive point-free style and extra $ where we could instead move existing brackets. 
Making those style tweaks for Time/EN/Rules.hs because I was looking at the file as part of Reviewed By: chessai Differential Revision: D27108042 fbshipit-source-id: 7c8e76578476ea14d655131943e693c5159b12d2 * Ignore the no-"-1" linter error for duckling Summary: By default, Facebook Haskell code has a lint error banning -1s, because at one point it was used as a default value in some API handlers and it's better to use Option + Nothing for this use case. But in Duckling - particularly in the Time module - it's quite common that we actually want to work with -1 as a meaningful integer value, involving some kind of offset. Reviewed By: chessai Differential Revision: D27118984 fbshipit-source-id: 5fe2200e8005a20855d7fdd3a8eb2ad33291edc8 * Update README.md (#585) Summary: Convert `This` to `Seal` in order to make this example working. Pull Request resolved: https://github.com/facebook/duckling/pull/585 Reviewed By: girifb Differential Revision: D27235685 Pulled By: chessai fbshipit-source-id: 71a712a622b5d9d10f7842276a2b8f60f962477e * bump dependencies (#588) Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/588 Reviewed By: girifb Differential Revision: D27435008 Pulled By: chessai fbshipit-source-id: f0b0a42752cfebcf290cbc6c6d194ba09724670f * Time Dimension for TR locale (#584) Summary: Added time dimension for Turkish language Pull Request resolved: https://github.com/facebook/duckling/pull/584 Differential Revision: D27235743 P…
The new rules parse phrases of the form:
xxx upcoming weeks
upcoming xxx weeks
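
As a rough illustration of the two word orders, the sketch below recognises both forms and extracts the week count. It is not the Duckling rule added by this pull request: `upcomingWeeks` is a hypothetical helper, it assumes `xxx` is written in digits (the real rules work on Duckling's numeral tokens), and it says nothing about which time interval the phrase resolves to.

```haskell
import Data.Char (isDigit, toLower)

-- | Recognise "<n> upcoming weeks" and "upcoming <n> weeks" and
-- return n. Assumption: n is given in digits, e.g. "3".
upcomingWeeks :: String -> Maybe Int
upcomingWeeks input =
  case words (map toLower input) of
    -- "xxx upcoming weeks"
    [n, "upcoming", unit] | isWeek unit -> count n
    -- "upcoming xxx weeks"
    ["upcoming", n, unit] | isWeek unit -> count n
    _ -> Nothing
  where
    isWeek w = w == "week" || w == "weeks"
    count n
      | not (null n) && all isDigit n = Just (read n)
      | otherwise = Nothing
```

Both `upcomingWeeks "3 upcoming weeks"` and `upcomingWeeks "upcoming 3 weeks"` return the same count, which is the point of supporting the two orders.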