Duckling v0.2.0.0

@chessai chessai released this 16 Apr 17:06

Core

  • Bump versions on dependencies
  • ExampleMain: add support for setting the port
  • Support for GHC >= 8.8.x && GHC <= 9.0.1
  • Relicense to BSD-3
  • Probabilistic layer bug fix
  • Remove dependency on Data.Some
  • Depend on regex-pcre-builtin on Windows
  • Update cabal-version to 2.2
  • Migrate to GitHub CI over Travis
  • Some attempts at getting smaller parse trees (less exploitable)

Rulesets

  • Common

    • Correct the CDT TimeZone offset: CDT is -300, not 540
    • Fixed BST and IST offsets. BST is +1 (60) and IST is +5:30 (330)
    • Make unicode output in tests sane by not relying on Show
    • Add type=value to JSON response for Email, PhoneNumber, and Url, for consistency
    • CreditCardNumber: new! (duckling can now recognise credit cards)
    • Email: Add DE (German) + IS (Icelandic) spelled out email
    • Numeral: Don't accept dashes ('-') as token separators
    • Time: Add Karva Chauth holiday
    • Time: Add Vaisakhi holiday
    • Time: Add daylight savings start/end times to holidays
    • Time: Add Purim and Shushan Purim (Jewish holidays)
    • Time: Add Guru Gobind Singh Jayanti holiday
    • Time: Extend support for Ramadan and Eid al-Fitr from 1950 to 2050
    • Time: Add support for Parsi New Year
    • Time: Add support for Dayananda Saraswati Jayanti holiday
    • Time: Extend support for Mawlid from 1950 to 1998
    • Time: Add Rabindra Jayanti holiday
    • Time: Add Guru Ravidass Jayanti holiday
    • Time: Extend support for Eid al-Adha from 1950 to 2000
    • Time: Add Krishna Janmashtami holiday
    • Time: Add Mahavir Jayanti holiday
    • Time: Add Maha Shivaratri holiday
    • Time: add Ugadi holiday
    • Volume: interval support
  • AF (Afrikaans)

    • Numeral: new!
  • AR (Arabic)

    • Add a new locale: EG (Egyptian)
    • AmountOfMoney: Add more variants of EGP
    • Numeral: Add support for numerals written in Arabic script
    • Numeral: Support decimals and comma-separated integers
    • PhoneNumber: new!
    • Time: Add periodic times
    • Time: Add rule for a week ago
  • BG (Bulgarian)

    • Time: new!
  • BN (Bengali)

    • Numeral: new!
  • CA (Catalan)

    • Numeral: new!
    • Ordinal: new!
  • DA (Danish)

    • Ordinal: Add support for larger spelled-out ordinals
    • Time: recognise abbreviation 'kl'
  • DE (German)

    • Distance: new!
    • Duration: Add more common durations
    • Numeral: Fix typo on "fünfzehn"(15)
    • Time: Fix wrong parsing of YYYY-MM-DD dates
    • Time: Add rule for 'the day before yesterday'
    • Time: Fix a bug for "fünfter"
    • Time: support for "am ersten Dezember" to "am einunddreißigsten Dezember"
    • Time: now recognise "der fünfte Dezember"
    • Time: Add support for (pre-)computed holidays
    • Time: Don't parse 'so'
    • Time: improve approximations
    • Time: Add rule for week after next
    • Time: Add rule for montag in n wochen
    • Volume: new!
  • EN (English)

    • AmountOfMoney: Add support for subunits of dollars, e.g. nickels, dimes, quarters, as well as the number of coins
    • AmountOfMoney: Add support for latent
    • AmountOfMoney: Add support for lakh and crore
    • AmountOfMoney: correct regex to match three letter currency codes beginning with C
    • AmountOfMoney: Make ruleInterval{Max,Min} symmetric
    • AmountOfMoney: Support abbreviations of lakh and crore
    • AmountOfMoney: Add UAH currency
    • AmountOfMoney: Add HKD
    • AmountOfMoney: dollar and <amount-of-money>
    • AmountOfMoney: dollar and a half
    • AmountOfMoney: intersection for <amount-of-money> and a half
    • AmountOfMoney: extend interval support
    • AmountOfMoney: Add more variants of EGP
    • Distance: Support composite distances
    • Distance: one meter and <distance>
    • Distance: <distance> meters and <distance>
    • Distance: Add interval support
    • Duration: Fix composite durations without delimiters. It previously only worked with commas/'and'-separated tokens
    • Duration: Support more words in ruleDurationNumeralMore
    • Duration: <integer> and a half minutes
    • Duration: support for <integer> hour and <integer>
    • Duration: parse Xm as X minutes and X.Y hrs as X.Y hours
    • Duration: Leverage TimeGrain for x.y hours
    • Duration: Support composite duration
    • Numeral: Fix ambiguous parses when both ruleNegative and ruleMultiply apply
    • Numeral: Extend fraction rule
    • Numeral: Add mixed fraction rules
    • Numeral: Add prefix of 10/100/10_000 rules
    • Quantity: Support 'k.g' and 'k.g.' for kilograms
    • Temperature: Support temperatures with decimal values
    • Time: Fix palm sunday regex
    • Time: Fix latent time of days like "ten thirty"
    • Time: Don't parse time of days above 12 with meridiem like "13 am"
    • Time: Add rule to match <time> + <timezone>
    • Time: Add "all week"/"rest of the week" rules
    • Time: Parse time intervals like: 2015-03-28 17:00:00/2015-03-28 21:00:00
    • Time: Add rules to handle "DD/MM/YYYY" or "DD MM YYYY"
    • Time: Add support for YYYYQQ and YYQQ expressions, like '2018Q4' and '18Q4'
    • Time: Add rule for 'during <month>'
    • Time: Restrict 'on' absorption to days
    • Time: Fix durations, upper intervals should all be exclusive
    • Time: Fix grain on some intervals for time-of-days
    • Time: Add rule 'in <duration> at <time-of-day>'
    • Time: Add <datetime> - <datetime> (interval) timezone rule
    • Time: '<day> in <duration>' should only operate on grain > Hour
    • Time: Add '<integer> <day-of-week> from <time>' rule
    • Time: Add <duration> past <time>
    • Time: Make 'the week' resolve to interval from today to end of week
    • Time: Add support for <hour>h<min>
    • Time: Add rule for a quarter after <hour-of-day>
    • Time: Support 'Martin Luther Kings Day'
    • Time: Add more holiday aliases (Veteran Day, Mardi Gras, St. Paddy's Day)
    • Time: Intersect "9 tomorrow morning"
    • Time: Support 'Chinese New Years'
    • Time: Support "Saint" Patrick's Day
    • Time: Add Ratha-Yatra Holiday
    • Time: Add support for Ganesh Chaturthi holiday
    • Time: Add Rama Navami holiday
    • Time: Add '<part-of-day> at <time-of-day>' rule
    • Time: Modify 'this|last|next <cycle>' to capture things like "the month" in "the first Saturday of the month"
    • Time: Support 'the <day-of-month> of <month>'
    • Time: Add Orthodox Good Friday holiday
    • Time: The (nth) closest (day) to (time)
    • Time: Add rule for 'midday'
    • Time: Fix a problem with parsing fractional minutes
    • Time: parse 'upcoming xxx weeks' and 'xxx upcoming weeks'
    • Time: Add new rule to parse phrases in the pattern 'xxx minutes to <hour-of-day>'
    • Time: Add support for spelled-out times
    • Time: Add support to parse <ordinal> <day-of-week>
    • Time: Fix parsing 'next <day-of-week>'
    • Time: Treat 'coming' as 'next'
    • Time: Make duckling time not treat 0:xx and 12:xx ambiguously
    • Time: Add 'asap' and 'at the moment'
    • Time: Parse latent year intervals
    • Time: last <duration>
    • Time: <time> <day-of-month>
    • Volume: Extend interval support
    • Time: Handle 2-digit date in existing DD/MM/YYYY rule
  • EN_UK (English, United Kingdom)

    • Duration: new!
    • Time: new!
  • EN_US (English, United States)

    • Time: Super Tuesday
  • ES (Spanish)

    • AmountOfMoney: Add support for intervals
    • Duration: Add support for parsing phrases like half hour, quarter of an hour
    • Duration: Support composite durations
    • Numeral: Fix typo of 22 (veinto -> veinti)
    • Numeral: Add locale rules, since Spain and South America use different decimal separators
    • Numeral: parse 'cero dos' as 'dos'
    • Numeral: composite numerals
    • Ordinal: Add missing 'tercer' regex
    • Quantity: new!
    • Time: Add support for periodic holidays
    • Time: Fix ruleYearLatent to not match numerals that could be hours
    • Time: Make 'n horas' latent
    • Time: Fixed a problem with parsing 'day of month' that contains 'dia'
    • Time: Add rule for 'next week'
    • Time: <ordinal> <day-of-month>
    • Time: support for 'noon'
    • Time: Allow para before timestamps
    • Time: Add rule to parse '<time-of-day> (in the afternoon)'
    • Time: Add rule to parse 'last <day-of-week> of <time>'
    • Time: Add rule for next week
    • Time: Add more corpus for tomorrow so that tomorrow latent doesn't prevail
    • Time: Fix multi-word timestamps
  • FA (Persian)

    • Numeral: new!
  • FI (Finnish)

    • Numeral: new!
  • FR (French)

    • Temperature: Support temperatures with decimal values
    • Time: Update month interval rules to handle ordinals and spelled-out numerals (e.g. "du premier au quinze juin")
    • Volume: Add 'Centilitre' type
  • HE (Hebrew)

    • AmountOfMoney: add more rules
  • HI (Hindi)

    • Duration: Pakhwada (पखवाड़ा) is 15 days, not a fortnight
    • Duration: Add support for composite duration
    • Duration: Add support for specific times
    • Numeral: Support more numerals
    • Temperature: new!
    • Time: Add support for quarter till, quarter past, and half
  • ID (Indonesian)

    • AmountOfMoney: support intervals
  • IS (Icelandic)

    • Numeral: new!
  • IT (Italian)

    • AmountOfMoney: new!
    • Distance: new!
    • Duration: Fix ruleDurationAgo
  • KA (Georgian)

    • AmountOfMoney: new!
    • Duration: new!
    • Numeral: new!
    • Ordinal: new!
    • Time: new!
    • Time: Improvements for times in the past
    • Time: Add support for quarters
  • KM (Khmer)

    • Distance: new!
    • Numeral: new!
    • Ordinal: new!
    • Quantity: new!
    • Temperature: new!
    • Volume: new!
  • KN (Kannada)

    • Numeral: new!
  • KO (Korean)

    • AmountOfMoney: support intervals
  • LO (Lao)

    • Numeral: new!
  • ML (Malayalam)

    • Numeral: new!
    • Ordinal: new!
  • MN (Mongolian)

    • Numeral: new!
  • NL (Dutch)

    • AmountOfMoney: make decimal thousands separator optional
    • Duration: Add 'anderhalf uur'
    • Duration: support composite durations
    • Quantity: new!
    • Time: Add support for King's Day (Koningsdag)
    • Time: stop 'for <number>' from resolving as times
    • Volume: remove gallon
    • Volume: Fix typo in milliliter
  • NO (Norwegian)

    • AmountOfMoney: Parse more currencies
    • Numeral: The written numeral 8 had a typo: "otte" -> "åtte"
    • Numeral: Parse powers of ten with spaces as well as dots
    • Numeral: Add more textual powers of ten
    • Numeral: Parse textual numbers from 21 to 99 with and without spaces
    • Time: Add support for half an hour before as e.g. "halv to"
    • Time: Add support for alternative clock denotation "klokka"
    • Time: Add support for alternative tomorrow denotation "i morra"
    • Time: Add some more unambiguous datetime parsing
  • PL (Polish)

    • Numeral: Add support for seventy, eighty, ninety
    • Duration: Fix typo: miej -> mniej
    • Time: Add 'evening' to corpus
  • PT (Portuguese)

    • Ordinal: Add 13..99
    • Ordinal: Corpus, keep accents consistent
    • Time: Last and quarter expressions
  • RO (Romanian)

    • Numeral: Fix multipliers with values above 20. In Romanian, for numerals above 20, we say '20 de milioane', not '20 milioane'.
    • Time: Implement 'the day after tomorrow'
  • RU (Russian)

    • AmountOfMoney: Add UAH currency
    • Numeral: Be more permissive with numerals [20, 90]
    • Time: new!
  • SK (Slovak)

    • Numeral: new!
  • SW (Swahili)

    • Numeral: new!
  • TA (Tamil)

    • Ordinal: new!
  • TE (Telugu)

    • Numeral: new!
  • TH (Thai)

    • Numeral: new!
  • TR (Turkish)

    • AmountOfMoney: new!
    • Time: new!
  • VI (Vietnamese)

    • AmountOfMoney: Add interval support
    • AmountOfMoney: ruleNg, ruleDollar, ruleVND modified to better capture usage of VI
    • Numeral: Add "ngàn", a common synonym of "nghìn", and "chục", colloquially used to count tens
    • Numeral: Remove ? in some regex where they don't make sense
    • Time: Fix double-digit month matching
    • Time: don't parse ngày
  • ZH (Chinese)

    • Duration: Add more common expressions
    • Numeral: Add more common expressions
    • Time: Parse YYYY-MM
    • Time: Value before month can be integer or chinese char
    • Time: Add support for (pre-)computed holidays
    • Time: Add more common expressions
    • Volume: new!
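
The timezone corrections listed under Common give each offset in minutes relative to UTC (CDT -300, BST 60, IST 330). The following is a minimal Python sketch of that arithmetic only. It is not Duckling's Haskell code, and the names OFFSETS_MINUTES and format_offset are hypothetical, chosen for illustration.

```python
# Offsets in minutes relative to UTC, as stated in the release notes.
# OFFSETS_MINUTES and format_offset are illustrative names, not Duckling identifiers.
OFFSETS_MINUTES = {"CDT": -300, "BST": 60, "IST": 330}


def format_offset(minutes: int) -> str:
    """Render an offset in minutes as a signed HH:MM string."""
    sign = "-" if minutes < 0 else "+"
    hours, mins = divmod(abs(minutes), 60)
    return f"{sign}{hours:02d}:{mins:02d}"


for name, minutes in OFFSETS_MINUTES.items():
    print(name, format_offset(minutes))
# prints:
# CDT -05:00
# BST +01:00
# IST +05:30
```

The previous CDT value of 540 would have rendered as +09:00, which is why the sign and magnitude both needed correcting.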

Server

  • Skip logfile creation if no logging
  • Document how example application can use specific dimensions only
  • Use System.FilePath.Posix instead of System.FilePath
  • Use all dimensions by default
  • Load timezones more leniently
  • Docker: Reduce size of image drastically
  • Dockerfile: Use Debian Buster
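
As a rough illustration of the Server items above (configurable port, specific dimensions only), here is a hedged Python sketch that builds a request for the example server. The /parse path, the locale and text fields, and port 8000 follow the example in the project README. Passing dims as a JSON-encoded list of dimension names is an assumption about the example application, and build_parse_request is a hypothetical helper, not part of Duckling.

```python
import json
from urllib.parse import urlencode


def build_parse_request(text, locale="en_GB", dims=None, host="0.0.0.0", port=8000):
    """Return (url, form-encoded body) for a POST to the example server.

    Assumption: the server accepts an optional 'dims' field holding a
    JSON-encoded list of dimension names; omitting it uses all dimensions.
    """
    url = f"http://{host}:{port}/parse"
    fields = {"locale": locale, "text": text}
    if dims is not None:
        fields["dims"] = json.dumps(dims)
    return url, urlencode(fields).encode("utf-8")


url, body = build_parse_request("tomorrow at eight", dims=["time"])
print(url)
print(body.decode("utf-8"))
# To send it (requires a running server):
#   import urllib.request
#   urllib.request.urlopen(url, data=body)
```

Leaving dims as None matches the "use all dimensions by default" behaviour noted above. Change the port argument if the server was started on a different port.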