In this notebook, we demonstrate temporal information extraction (IE) using [duckling](https://github.com/FraBle/python-duckling), a Python wrapper for wit.ai's [Duckling](https://github.com/facebookarchive/duckling_old) Clojure library.

Duckling is a library developed by Facebook that is designed to extract structured data from natural-language text.
It is used to extract entities such as times, dates, numbers, units of measurement, and others
that can be interpreted and used in structured form.

# Set up

In [17]:
# pip install jpype1==1.3.0 duckling==1.8.0


In [18]:
# Workaround: replace _parse_int in the installed duckling.py with the version below

    # def _parse_int(self, java_number):
    #     # Attempt to convert java.lang.String to Python int
    #     try:
    #         return int(str(java_number))
    #     except ValueError:
    #         return java_number



In [19]:
from duckling import DucklingWrapper
from pprint import pprint

In [20]:
d = DucklingWrapper()
print(d.parse_time(u'Let\'s meet at 11:45am'))

[{'dim': 'time', 'text': 'at 11:45am', 'start': 11, 'end': 21, 'value': {'value': '2024-09-22T11:45:00.000+07:00', 'grain': 'minute', 'others': [{'grain': 'minute', 'value': '2024-09-22T11:45:00.000+07:00'}, {'grain': 'minute', 'value': '2024-09-23T11:45:00.000+07:00'}, {'grain': 'minute', 'value': '2024-09-24T11:45:00.000+07:00'}]}}]


Extracting time from text


In [40]:
pprint(d.parse_time(u'Let\'s meet at 11:45am'))
print('----------------------------------------------------------------------')
pprint(d.parse_time(u'You owe me twenty bucks, please call me today'))

[{'dim': 'time',
  'end': 21,
  'start': 11,
  'text': 'at 11:45am',
  'value': {'grain': 'minute',
            'others': [{'grain': 'minute',
                        'value': '2024-09-22T11:45:00.000+07:00'},
                       {'grain': 'minute',
                        'value': '2024-09-23T11:45:00.000+07:00'},
                       {'grain': 'minute',
                        'value': '2024-09-24T11:45:00.000+07:00'}],
            'value': '2024-09-22T11:45:00.000+07:00'}}]
----------------------------------------------------------------------
[{'dim': 'time',
  'end': 45,
  'start': 40,
  'text': 'today',
  'value': {'grain': 'day',
            'others': [{'grain': 'day',
                        'value': '2024-09-22T00:00:00.000+07:00'}],
            'value': '2024-09-22T00:00:00.000+07:00'}},
 {'dim': 'time',
  'end': 17,
  'start': 11,
  'text': 'twenty',
  'value': {'grain': 'year',
            'others': [],
            'value': '2020-01-01T00:00:00.000+07:00'}}]
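
The `value` fields in the output above are ISO 8601 strings. Here is a minimal sketch of turning them into timezone-aware `datetime` objects. The helper name `to_datetimes` is made up for illustration, and the sample list is copied from the output above rather than produced by a live Duckling call. Only the single-instant shape shown above is handled; any result whose `value` is not a plain string is skipped.

```python
from datetime import datetime

# Sample copied from the parse_time output above (not a live Duckling call).
sample = [
    {'dim': 'time', 'text': 'today', 'start': 40, 'end': 45,
     'value': {'grain': 'day',
               'others': [{'grain': 'day',
                           'value': '2024-09-22T00:00:00.000+07:00'}],
               'value': '2024-09-22T00:00:00.000+07:00'}},
    {'dim': 'time', 'text': 'twenty', 'start': 11, 'end': 17,
     'value': {'grain': 'year',
               'others': [],
               'value': '2020-01-01T00:00:00.000+07:00'}},
]


def to_datetimes(results):
    """Hypothetical helper: return (text, grain, datetime) tuples in text order."""
    out = []
    for r in sorted(results, key=lambda r: r['start']):
        v = r['value']
        if not isinstance(v.get('value'), str):
            continue  # skip shapes this sketch does not handle
        out.append((r['text'], v['grain'], datetime.fromisoformat(v['value'])))
    return out


for text, grain, dt in to_datetimes(sample):
    print(text, grain, dt.isoformat())
```

Note that `twenty` in the second sentence is matched as the year 2020, so downstream code may want to inspect `grain` and `text` before trusting a match.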


Extracting temperature from text

In [41]:
pprint(d.parse_temperature(u'Let\'s change the temperatur from thirty two celsius to 65 degrees'))
print('----------------------------------------------------------------------')
pprint(d.parse_temperature(u"It's getting hotter day by day, yesterday it was thirty-five degrees celcius today its 37 degrees "))

[{'dim': 'temperature',
  'end': 65,
  'start': 55,
  'text': '65 degrees',
  'value': {'unit': 'degree', 'value': 65.0}},
 {'dim': 'temperature',
  'end': 51,
  'start': 33,
  'text': 'thirty two celsius',
  'value': {'unit': 'celsius', 'value': 32.0}}]
----------------------------------------------------------------------
[{'dim': 'temperature',
  'end': 97,
  'start': 87,
  'text': '37 degrees',
  'value': {'unit': 'degree', 'value': 37.0}},
 {'dim': 'temperature',
  'end': 76,
  'start': 49,
  'text': 'thirty-five degrees celcius',
  'value': {'unit': 'celsius', 'value': 35.0}}]
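
In the output above, a bare "degrees" comes back with the unit `degree`, which carries no scale. One way to normalize the results is sketched below. The helper `to_celsius` and its `bare_degree_unit` parameter are hypothetical, and the unit name `fahrenheit` is an assumption, since only `degree` and `celsius` appear in the output above.

```python
def to_celsius(value, unit, bare_degree_unit='celsius'):
    """Hypothetical helper: convert a Duckling-style temperature to Celsius.

    A bare 'degree' is interpreted as bare_degree_unit, which is an assumption
    the caller has to make based on context.
    """
    if unit == 'degree':
        unit = bare_degree_unit
    if unit == 'celsius':
        return float(value)
    if unit == 'fahrenheit':  # assumed unit name, not seen in the output above
        return (float(value) - 32.0) * 5.0 / 9.0
    raise ValueError('unsupported unit: %r' % unit)


# Sample copied from the first parse_temperature output above.
sample = [
    {'dim': 'temperature', 'text': '65 degrees', 'start': 55, 'end': 65,
     'value': {'unit': 'degree', 'value': 65.0}},
    {'dim': 'temperature', 'text': 'thirty two celsius', 'start': 33, 'end': 51,
     'value': {'unit': 'celsius', 'value': 32.0}},
]

for r in sample:
    v = r['value']
    # Here we guess that the bare "65 degrees" is meant as Fahrenheit.
    print(r['text'], '->', to_celsius(v['value'], v['unit'], bare_degree_unit='fahrenheit'))
```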


Extracting timezone from text

In [42]:
pprint(d.parse_timezone(u"Let's meet at 10pm IST"))
print('----------------------------------------------------------------------')
pprint(d.parse_timezone(u"Let's meet at 22:00 EST"))

[{'dim': 'timezone',
  'end': 22,
  'start': 19,
  'text': 'IST',
  'value': {'value': 'IST'}}]
----------------------------------------------------------------------
[{'dim': 'timezone',
  'end': 23,
  'start': 20,
  'text': 'EST',
  'value': {'value': 'EST'}}]


Extracting number from text

In [24]:
d.parse_number(u"Hey i am a 20 year old student from Alaska")

[{'dim': 'number',
  'text': '20',
  'start': 11,
  'end': 13,
  'value': {'value': 20.0}}]

Extracting ordinals from text

In [45]:
pprint(d.parse_ordinal(u"I came 2nd and u came 1st in a race"))
print('----------------------------------------------------------------------')
pprint(d.parse_ordinal(u"I came second and u came first in a race"))

[{'dim': 'ordinal',
  'end': 10,
  'start': 7,
  'text': '2nd',
  'value': {'value': 2}},
 {'dim': 'ordinal',
  'end': 25,
  'start': 22,
  'text': '1st',
  'value': {'value': 1}}]
----------------------------------------------------------------------
[{'dim': 'ordinal',
  'end': 13,
  'start': 7,
  'text': 'second',
  'value': {'value': 2}},
 {'dim': 'ordinal',
  'end': 30,
  'start': 25,
  'text': 'first',
  'value': {'value': 1}}]


Extracting currency and value from text

In [50]:
pprint(d.parse_money(u"This meal costs 3$"))
print('----------------------------------------------------------------------')
pprint(d.parse_money(u"This meal costs 3 Baht"))
print('----------------------------------------------------------------------')
pprint(d.parse_money(u"This meal costs 3 Rupee"))

[{'dim': 'amount-of-money',
  'end': 18,
  'start': 16,
  'text': '3$',
  'value': {'unit': '$', 'value': 3.0}}]
----------------------------------------------------------------------
[]
----------------------------------------------------------------------
[{'dim': 'amount-of-money',
  'end': 23,
  'start': 16,
  'text': '3 Rupee',
  'value': {'unit': 'INR', 'value': 3.0}}]
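
The Baht example above returns an empty list. If a currency matters for your use case and is not recognized, one option is a small regex fallback that mimics the shape of Duckling's results. The sketch below is not part of duckling: the pattern, the helper name `parse_baht`, and the `THB` unit label are all assumptions.

```python
import re

# Assumed pattern: a number followed by "baht", "THB", or the ฿ sign.
BAHT_RE = re.compile(r'(\d+(?:\.\d+)?)\s*(?:baht\b|thb\b|฿)', re.IGNORECASE)


def parse_baht(text):
    """Hypothetical fallback that returns results shaped like parse_money output."""
    return [
        {'dim': 'amount-of-money',
         'text': m.group(0),
         'start': m.start(),
         'end': m.end(),
         'value': {'unit': 'THB', 'value': float(m.group(1))}}
        for m in BAHT_RE.finditer(text)
    ]


print(parse_baht("This meal costs 3 Baht"))
```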


Extracting email ids from text

In [27]:
pprint(d.parse_email(u"my email is abcxyz@gmail.com"))

[{'dim': 'email',
  'text': 'abcxyz@gmail.com',
  'start': 12,
  'end': 28,
  'value': {'value': 'abcxyz@gmail.com'}}]

Extracting the durations in a text

In [28]:
d.parse_duration(u"I have been working on this project for 4 hrs every month for almost 2 years.")

[{'dim': 'duration',
  'text': '4 hrs',
  'start': 40,
  'end': 45,
  'value': {'value': 4.0,
   'unit': 'hour',
   'year': None,
   'month': None,
   'day': None,
   'hour': 4,
   'minute': None,
   'second': None}},
 {'dim': 'duration',
  'text': '2 years',
  'start': 69,
  'end': 76,
  'value': {'value': 2.0,
   'unit': 'year',
   'year': 2,
   'month': None,
   'day': None,
   'hour': None,
   'minute': None,
   'second': None}}]
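
The duration values above can be mapped onto `datetime.timedelta` when the unit has a fixed length. The sketch below uses a hypothetical helper, `to_timedelta`, and returns `None` for `year` and `month`, because their length depends on the calendar.

```python
from datetime import timedelta

# Units from the output above that have a fixed length, mapped to timedelta keywords.
_FIXED_UNITS = {'second': 'seconds', 'minute': 'minutes', 'hour': 'hours', 'day': 'days'}


def to_timedelta(duration_value):
    """Hypothetical helper: convert a Duckling-style duration value to a timedelta.

    Returns None for units without a fixed length (for example 'year' or 'month').
    """
    keyword = _FIXED_UNITS.get(duration_value['unit'])
    if keyword is None:
        return None
    return timedelta(**{keyword: duration_value['value']})


# Values copied from the parse_duration output above.
four_hours = {'value': 4.0, 'unit': 'hour', 'year': None, 'month': None,
              'day': None, 'hour': 4, 'minute': None, 'second': None}
two_years = {'value': 2.0, 'unit': 'year', 'year': 2, 'month': None,
             'day': None, 'hour': None, 'minute': None, 'second': None}

print(to_timedelta(four_hours))
print(to_timedelta(two_years))
```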

Extracting urls from text

In [51]:
pprint(d.parse_url(u"The official website for the book Practical NLP is http://www.practicalnlp.ai/"))
print('----------------------------------------------------------------------')
d.parse_url(u"ฉันกำลังเข้าเว็บ www.เฉลิม.com")

[{'dim': 'url',
  'end': 78,
  'start': 51,
  'text': 'http://www.practicalnlp.ai/',
  'value': {'value': 'http://www.practicalnlp.ai/'}},
 {'dim': 'url',
  'end': 78,
  'start': 51,
  'text': 'http://www.practicalnlp.ai/',
  'value': {'value': 'http://www.practicalnlp.ai/'}}]
----------------------------------------------------------------------


[]

Extracting phone numbers from text

In [33]:
d.parse_phone_number(u"My phone number is 0918494100 and my adress is 637 Huakwang Bangkok 1234518")  # didn't demo this due to privacy reasons

[{'dim': 'phone-number',
  'text': '0918494100 ',
  'start': 19,
  'end': 30,
  'value': {'value': '0918494100 '}},
 {'dim': 'phone-number',
  'text': '1234518',
  'start': 68,
  'end': 75,
  'value': {'value': '1234518'}}]
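
In the output above, the trailing number in the address is also reported as a phone number, and the first match carries a trailing space. A possible post-processing step is sketched below. The helper `clean_phone_numbers` is hypothetical, and the `min_digits=9` threshold is an arbitrary assumption that should be tuned to the numbering plan you expect.

```python
def clean_phone_numbers(results, min_digits=9):
    """Hypothetical helper: drop short matches and trim trailing whitespace."""
    cleaned = []
    for r in results:
        digits = ''.join(ch for ch in r['value']['value'] if ch.isdigit())
        if len(digits) < min_digits:
            continue  # too short to be a phone number under our assumption
        text = r['text'].rstrip()
        cleaned.append({
            'dim': r['dim'],
            'text': text,
            'start': r['start'],
            'end': r['start'] + len(text),  # recompute after trimming
            'value': {'value': digits},
        })
    return cleaned


# Sample copied from the parse_phone_number output above.
sample = [
    {'dim': 'phone-number', 'text': '0918494100 ', 'start': 19, 'end': 30,
     'value': {'value': '0918494100 '}},
    {'dim': 'phone-number', 'text': '1234518', 'start': 68, 'end': 75,
     'value': {'value': '1234518'}},
]

print(clean_phone_numbers(sample))
```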

In general, it is a good idea to combine these functions, or whichever subset your use case requires, into a single pipeline.
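
A minimal sketch of such a pipeline follows. The helper `extract_all` is hypothetical: it looks up the `parse_<dim>` methods used in this notebook by name, merges their results, and sorts them by position in the text. `FakeParser` is only a stand-in so that the sketch runs without a JVM; its canned result is copied from the `parse_number` output above.

```python
def extract_all(parser, text, dims=('time', 'number', 'money', 'email', 'url')):
    """Hypothetical helper: run the selected parse_<dim> methods and merge results."""
    results = []
    for dim in dims:
        parse = getattr(parser, 'parse_' + dim)  # e.g. parser.parse_time
        results.extend(parse(text))
    # Results are not guaranteed to come back in text order, so sort by offsets.
    return sorted(results, key=lambda r: (r['start'], r['end']))


class FakeParser:
    """Stand-in for DucklingWrapper with canned results, for illustration only."""

    def parse_number(self, text):
        return [{'dim': 'number', 'text': '20', 'start': 11, 'end': 13,
                 'value': {'value': 20.0}}]

    def parse_email(self, text):
        return []


merged = extract_all(FakeParser(),
                     "Hey i am a 20 year old student from Alaska",
                     dims=('email', 'number'))
print(merged)
```

With the real wrapper, the call would look like `extract_all(d, text)`, where `d` is the `DucklingWrapper` instance created earlier.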