# Wikipedia Articles with temporal timeline
Getting the articles from Wikipedia and creating a temporal timeline of the sentences

## Excerpt
(1) Analysis of various temporal parsers like Natty, DateParser, etc.<br>
(2) Getting articles using Wikipedia article<br>
(3) Temporal timeline using temporal words and context<br>
(4) Word Embedding of each sentence using Google Word2Vec<br>

#### Necessary Python files to download

In [None]:
# !pip install --upgrade pip
# !pip install qtconsole ipywidgets widgetsnbextension
# !pip install seaborn
# !pip install nltk
# import nltk
# nltk.download("stopwords")
# nltk.download('abc')
# nltk.download('punkt')
# !pip install gensim
# !pip install wikipedia
# !pip install natty
# !pip install dateparser

In [94]:
#Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sqlite3 as sql
import seaborn as sns
from time import time
import random
import gensim
import warnings
import wikipedia as wiki
from IPython.display import display
from IPython.html import widgets
from ipywidgets import *
from nltk import tokenize
import datetime


warnings.filterwarnings("ignore")

%matplotlib inline 
# sets the backend of matplotlib to the 'inline' backend:
#With this backend, the output of plotting commands is displayed inline within frontends like the Jupyter notebook,
#directly below the code cell that produced it. The resulting plots will then also be stored in the notebook document.

## For Temporal parsing of Words

In [262]:
#Using Natty Parser for parsing temporal words supports multiple languages
#https://dateparser.readthedocs.io/en/latest/
from natty import DateParser
query = "I need a desk for day after tomorrow at 8pm or 1 weeks after"
dp = DateParser(query)
print("Results")
for result in dp.result():
    print(result.date()," ",result.time())
# print(dp.result())

Results
2018-07-23   20:00:00
2018-07-28   20:00:00


In [85]:
#DateParser -> Works only with dates text not with whole text
import dateparser
dateparser.parse("2002 Jan  21")

datetime.datetime(2002, 1, 21, 0, 0)

Other NLP date parser<br>
Tried all of them which were for python<br>
https://docs.google.com/spreadsheets/d/1dKt0R247B8Mx5sFXd7htSOQB-B5kMODM2ydmjp9cr80/edit#gid=0

#### Used Combination of 2 parser to get the best of both the worlds for better accuracy(Just scroll down) 

In [273]:
#Reference & Credits:https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/timex.py
import re
import string
import os
import sys

# Requires eGenix.com mx Base Distribution
# http://www.egenix.com/products/python/mxBase/
try:
    from mx.DateTime import *
except ImportError:
    pass

# Predefined strings.
numbers = "(^a(?=\s)|one|two|three|four|five|six|seven|eight|nine|ten| \
          eleven|twelve|thirteen|fourteen|fifteen|sixteen|seventeen| \
          eighteen|nineteen|twenty|thirty|forty|fifty|sixty|seventy|eighty| \
          ninety|hundred|thousand)"
day = "(monday|tuesday|wednesday|thursday|friday|saturday|sunday)"
week_day = "(monday|tuesday|wednesday|thursday|friday|saturday|sunday)"
month = "(january|february|march|april|may|june|july|august|september| \
          october|november|december)"
dmy = "(year|day|week|month)"
rel_day = "(today|yesterday|tomorrow|tonight|tonite|now)"
exp1 = "(before|after|earlier|later|ago)"
exp2 = "(this|next|last)"
iso = "\d+[/-]\d+[/-]\d+ \d+:\d+:\d+\.\d+"
year = "((?<=\s)\d{4}|^\d{4})"
regxp1 = "((\d+|(" + numbers + "[-\s]?)+) " + dmy + "s? " + exp1 + ")"
regxp2 = "(" + exp2 + " (" + dmy + "|" + week_day + "|" + month + "))"

reg1 = re.compile(regxp1, re.IGNORECASE)
reg2 = re.compile(regxp2, re.IGNORECASE)
reg3 = re.compile(rel_day, re.IGNORECASE)
reg4 = re.compile(iso)
reg5 = re.compile(year)

def tag(text):

    # Initialization
    timex_found = []

    # re.findall() finds all the substring matches, keep only the full
    # matching string. Captures expressions such as 'number of days' ago, etc.
    found = reg1.findall(text)
    found = [a[0] for a in found if len(a) > 1]
    for timex in found:
        timex_found.append(timex)

    # Variations of this thursday, next year, etc
    found = reg2.findall(text)
    found = [a[0] for a in found if len(a) > 1]
    for timex in found:
        timex_found.append(timex)

    # today, tomorrow, etc
    found = reg3.findall(text)
    for timex in found:
        timex_found.append(timex)

    # ISO
    found = reg4.findall(text)
    for timex in found:
        timex_found.append(timex)

    # Year
    found = reg5.findall(text)
    for timex in found:
        timex_found.append(timex)

    # Tag only temporal expressions which haven't been tagged.
    temp_words=[]
    for timex in timex_found:
#         print(timex)
        temp_words.append(timex)
#         text = re.sub(timex + '(?!</TIMEX2>)', '<TIMEX2>' + timex + '</TIMEX2>', text)

    return temp_words

# Hash function for week days to simplify the grounding task.
# [Mon..Sun] -> [0..6]
hashweekdays = {
    'Monday': 0,
    'Tuesday': 1,
    'Wednesday': 2,
    'Thursday': 3,
    'Friday': 4,
    'Saturday': 5,
    'Sunday': 6}

# Hash function for months to simplify the grounding task.
# [Jan..Dec] -> [1..12]
hashmonths = {
    'January': 1,
    'February': 2,
    'March': 3,
    'April': 4,
    'May': 5,
    'June': 6,
    'July': 7,
    'August': 8,
    'September': 9,
    'October': 10,
    'November': 11,
    'December': 12}

# Hash number in words into the corresponding integer value
def hashnum(number):
    if re.match(r'one|^a\b', number, re.IGNORECASE):
        return 1
    if re.match(r'two', number, re.IGNORECASE):
        return 2
    if re.match(r'three', number, re.IGNORECASE):
        return 3
    if re.match(r'four', number, re.IGNORECASE):
        return 4
    if re.match(r'five', number, re.IGNORECASE):
        return 5
    if re.match(r'six', number, re.IGNORECASE):
        return 6
    if re.match(r'seven', number, re.IGNORECASE):
        return 7
    if re.match(r'eight', number, re.IGNORECASE):
        return 8
    if re.match(r'nine', number, re.IGNORECASE):
        return 9
    if re.match(r'ten', number, re.IGNORECASE):
        return 10
    if re.match(r'eleven', number, re.IGNORECASE):
        return 11
    if re.match(r'twelve', number, re.IGNORECASE):
        return 12
    if re.match(r'thirteen', number, re.IGNORECASE):
        return 13
    if re.match(r'fourteen', number, re.IGNORECASE):
        return 14
    if re.match(r'fifteen', number, re.IGNORECASE):
        return 15
    if re.match(r'sixteen', number, re.IGNORECASE):
        return 16
    if re.match(r'seventeen', number, re.IGNORECASE):
        return 17
    if re.match(r'eighteen', number, re.IGNORECASE):
        return 18
    if re.match(r'nineteen', number, re.IGNORECASE):
        return 19
    if re.match(r'twenty', number, re.IGNORECASE):
        return 20
    if re.match(r'thirty', number, re.IGNORECASE):
        return 30
    if re.match(r'forty', number, re.IGNORECASE):
        return 40
    if re.match(r'fifty', number, re.IGNORECASE):
        return 50
    if re.match(r'sixty', number, re.IGNORECASE):
        return 60
    if re.match(r'seventy', number, re.IGNORECASE):
        return 70
    if re.match(r'eighty', number, re.IGNORECASE):
        return 80
    if re.match(r'ninety', number, re.IGNORECASE):
        return 90
    if re.match(r'hundred', number, re.IGNORECASE):
        return 100
    if re.match(r'thousand', number, re.IGNORECASE):
      return 1000

# Given a timex_tagged_text and a Date object set to base_date,
# returns timex_grounded_text
def ground(tagged_text, base_date):

    # Find all identified timex and put them into a list
    timex_regex = re.compile(r'<TIMEX2>.*?</TIMEX2>', re.DOTALL)
    timex_found = timex_regex.findall(tagged_text)
    timex_found = map(lambda timex:re.sub(r'</?TIMEX2.*?>', '', timex), \
                timex_found)

    # Calculate the new date accordingly
    for timex in timex_found:
        timex_val = 'UNKNOWN' # Default value

        timex_ori = timex   # Backup original timex for later substitution

        # If numbers are given in words, hash them into corresponding numbers.
        # eg. twenty five days ago --> 25 days ago
        if re.search(numbers, timex, re.IGNORECASE):
            split_timex = re.split(r'\s(?=days?|months?|years?|weeks?)', \
                                                              timex, re.IGNORECASE)
            value = split_timex[0]
            unit = split_timex[1]
            num_list = map(lambda s:hashnum(s),re.findall(numbers + '+', \
                                          value, re.IGNORECASE))
            timex = sum(num_list) + ' ' + unit

        # If timex matches ISO format, remove 'time' and reorder 'date'
        if re.match(r'\d+[/-]\d+[/-]\d+ \d+:\d+:\d+\.\d+', timex):
            dmy = re.split(r'\s', timex)[0]
            dmy = re.split(r'/|-', dmy)
            timex_val = str(dmy[2]) + '-' + str(dmy[1]) + '-' + str(dmy[0])

        # Specific dates
        elif re.match(r'\d{4}', timex):
            timex_val = str(timex)

        # Relative dates
        elif re.match(r'tonight|tonite|today', timex, re.IGNORECASE):
            timex_val = str(base_date)
        elif re.match(r'yesterday', timex, re.IGNORECASE):
            timex_val = str(base_date + RelativeDateTime(days=-1))
        elif re.match(r'tomorrow', timex, re.IGNORECASE):
            timex_val = str(base_date + RelativeDateTime(days=+1))

        # Weekday in the previous week.
        elif re.match(r'last ' + week_day, timex, re.IGNORECASE):
            day = hashweekdays[timex.split()[1]]
            timex_val = str(base_date + RelativeDateTime(weeks=-1, \
                            weekday=(day,0)))

        # Weekday in the current week.
        elif re.match(r'this ' + week_day, timex, re.IGNORECASE):
            day = hashweekdays[timex.split()[1]]
            timex_val = str(base_date + RelativeDateTime(weeks=0, \
                            weekday=(day,0)))

        # Weekday in the following week.
        elif re.match(r'next ' + week_day, timex, re.IGNORECASE):
            day = hashweekdays[timex.split()[1]]
            timex_val = str(base_date + RelativeDateTime(weeks=+1, \
                              weekday=(day,0)))

        # Last, this, next week.
        elif re.match(r'last week', timex, re.IGNORECASE):
            year = (base_date + RelativeDateTime(weeks=-1)).year

            # iso_week returns a triple (year, week, day) hence, retrieve
            # only week value.
            week = (base_date + RelativeDateTime(weeks=-1)).iso_week[1]
            timex_val = str(year) + 'W' + str(week)
        elif re.match(r'this week', timex, re.IGNORECASE):
            year = (base_date + RelativeDateTime(weeks=0)).year
            week = (base_date + RelativeDateTime(weeks=0)).iso_week[1]
            timex_val = str(year) + 'W' + str(week)
        elif re.match(r'next week', timex, re.IGNORECASE):
            year = (base_date + RelativeDateTime(weeks=+1)).year
            week = (base_date + RelativeDateTime(weeks=+1)).iso_week[1]
            timex_val = str(year) + 'W' + str(week)

        # Month in the previous year.
        elif re.match(r'last ' + month, timex, re.IGNORECASE):
            month = hashmonths[timex.split()[1]]
            timex_val = str(base_date.year - 1) + '-' + str(month)

        # Month in the current year.
        elif re.match(r'this ' + month, timex, re.IGNORECASE):
            month = hashmonths[timex.split()[1]]
            timex_val = str(base_date.year) + '-' + str(month)

        # Month in the following year.
        elif re.match(r'next ' + month, timex, re.IGNORECASE):
            month = hashmonths[timex.split()[1]]
            timex_val = str(base_date.year + 1) + '-' + str(month)
        elif re.match(r'last month', timex, re.IGNORECASE):

            # Handles the year boundary.
            if base_date.month == 1:
                timex_val = str(base_date.year - 1) + '-' + '12'
            else:
                timex_val = str(base_date.year) + '-' + str(base_date.month - 1)
        elif re.match(r'this month', timex, re.IGNORECASE):
                timex_val = str(base_date.year) + '-' + str(base_date.month)
        elif re.match(r'next month', timex, re.IGNORECASE):

            # Handles the year boundary.
            if base_date.month == 12:
                timex_val = str(base_date.year + 1) + '-' + '1'
            else:
                timex_val = str(base_date.year) + '-' + str(base_date.month + 1)
        elif re.match(r'last year', timex, re.IGNORECASE):
            timex_val = str(base_date.year - 1)
        elif re.match(r'this year', timex, re.IGNORECASE):
            timex_val = str(base_date.year)
        elif re.match(r'next year', timex, re.IGNORECASE):
            timex_val = str(base_date.year + 1)
        elif re.match(r'\d+ days? (ago|earlier|before)', timex, re.IGNORECASE):

            # Calculate the offset by taking '\d+' part from the timex.
            offset = int(re.split(r'\s', timex)[0])
            timex_val = str(base_date + RelativeDateTime(days=-offset))
        elif re.match(r'\d+ days? (later|after)', timex, re.IGNORECASE):
            offset = int(re.split(r'\s', timex)[0])
            timex_val = str(base_date + RelativeDateTime(days=+offset))
        elif re.match(r'\d+ weeks? (ago|earlier|before)', timex, re.IGNORECASE):
            offset = int(re.split(r'\s', timex)[0])
            year = (base_date + RelativeDateTime(weeks=-offset)).year
            week = (base_date + \
                            RelativeDateTime(weeks=-offset)).iso_week[1]
            timex_val = str(year) + 'W' + str(week)
        elif re.match(r'\d+ weeks? (later|after)', timex, re.IGNORECASE):
            offset = int(re.split(r'\s', timex)[0])
            year = (base_date + RelativeDateTime(weeks=+offset)).year
            week = (base_date + RelativeDateTime(weeks=+offset)).iso_week[1]
            timex_val = str(year) + 'W' + str(week)
        elif re.match(r'\d+ months? (ago|earlier|before)', timex, re.IGNORECASE):
            extra = 0
            offset = int(re.split(r'\s', timex)[0])

            # Checks if subtracting the remainder of (offset / 12) to the base month
            # crosses the year boundary.
            if (base_date.month - offset % 12) < 1:
                extra = 1

            # Calculate new values for the year and the month.
            year = str(base_date.year - offset // 12 - extra)
            month = str((base_date.month - offset % 12) % 12)

            # Fix for the special case.
            if month == '0':
                month = '12'
            timex_val = year + '-' + month
        elif re.match(r'\d+ months? (later|after)', timex, re.IGNORECASE):
            extra = 0
            offset = int(re.split(r'\s', timex)[0])
            if (base_date.month + offset % 12) > 12:
                extra = 1
            year = str(base_date.year + offset // 12 + extra)
            month = str((base_date.month + offset % 12) % 12)
            if month == '0':
                month = '12'
            timex_val = year + '-' + month
        elif re.match(r'\d+ years? (ago|earlier|before)', timex, re.IGNORECASE):
            offset = int(re.split(r'\s', timex)[0])
            timex_val = str(base_date.year - offset)
        elif re.match(r'\d+ years? (later|after)', timex, re.IGNORECASE):
            offset = int(re.split(r'\s', timex)[0])
            timex_val = str(base_date.year + offset)

        # Remove 'time' from timex_val.
        # For example, If timex_val = 2000-02-20 12:23:34.45, then
        # timex_val = 2000-02-20
        timex_val = re.sub(r'\s.*', '', timex_val)

        # Substitute tag+timex in the text with grounded tag+timex.
        tagged_text = re.sub('<TIMEX2>' + timex_ori + '</TIMEX2>', '<TIMEX2 val=\"' \
            + timex_val + '\">' + timex_ori + '</TIMEX2>', tagged_text)

    return tagged_text

#Important Temporal Parser Function
def temporal(query):
    import nltk
    
    #PARSER 1--------------------------------------------
    #Using nltk_contrib to get temporal words only
    temporal_words = tag(query) 
#     print(temporal_words)
    #Using Date Parser
    import dateparser
    time1=[]
    for word in temporal_words:
        time1.append(dateparser.parse(word))
#     print(time1)
    
    # PARSER 2 --------------------------------------------------- 
    #Using Natty Parser 
    from natty import DateParser
    dp = DateParser(query)
#     time2=[]
    time2 = dp.result()
#     print(time2)
    
    #If None
    if time1==None:
        final_time = time2
    elif time2==None:
        final_time = time1
    else:
        final_time = time1 + time2
        
#     print(final_time)
    final=[]
    for t in final_time:
        if t:
            final.append(t.strftime("%Y-%m-%d %H:%M:%S"))
    return final

#Main Function
#     query = "Modi in a 2002 today speech said that he will protect rights of X. Previous year, he used to say that he will protect rights of Y"
query = "I need a desk for tomorrow,now at 3 pm "
time = temporal(query)
print(time)

['2018-07-22 18:47:31', '2018-07-21 18:47:31', '2018-07-22 18:47:31', '2018-07-21 18:47:31']


## Using Google's Trained W2Vec on Google News for Word Embedding further
Using Google trained word2vec as it it trained on huge amount of data hence will give a good embedding in Word2Vec

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2017/06/06062705/Word-Vectors.png)

#### Run the below line to download Googles Pretrained Word2Vec on Google News 
### Warning:[3.64 GB File]

In [None]:
# !wget --header="Host: e-2106e5ff6b.cognitiveclass.ai" --header="User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" --header="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8" --header="Accept-Language: en-US,en;q=0.9" --header="Cookie: _ga=GA1.2.1009651095.1527270727; _xsrf=2|d66eb8d7|8e30b1015ec501038d0632ff567bddb6|1529904261; ajs_user_id=null; ajs_group_id=null; ajs_anonymous_id=%2287156a62-c4e7-42e5-a5df-9ba768a5469e%22; notice_behavior=implied,eu; edxloggedin=true; sessionid=l63re20cjn71sq7pp96604uq8g5uxojc; edx-user-info="{\"username\": \"rickyaditya2711\"\054 \"version\": 1\054 \"email\": \"rickyaditya2711@gmail.com\"\054 \"header_urls\": {\"learner_profile\": \"https://courses.cognitiveclass.ai/u/rickyaditya2711\"\054 \"logout\": \"https://courses.cognitiveclass.ai/logout\"\054 \"account_settings\": \"https://courses.cognitiveclass.ai/account/settings\"}}"; session=.eJxVj9tugkAURX_FnGdi5FaBxKSgoiZKtYq0Ng0ZYIBRGBQGEY3_XjBt2r6us1f2PjdwjzhPEcWUgcbyEnOASha7LDtgCtoNOg3_gBoJaneSiIVh7x0UXf0wWsTyNNkZorCZKbm5Z6ZAgrAotpbZz3aC4fRn0vyqWFYujKlOp97YWF963kVdE10I-KoKzFLE7OS9rtUzNmRnPlrOlq6a6Updnh00TspafjLEGJ18aSjtfRu4Zo0HGsD9885BgRL2GNiilX18tye70-LNXLw4puw7vLzaWrWdxCN701OH0WAAjVQWOHcJDbPWxCkiSSPnxD_UKCCs-bLP889Ry7t-lgIHIckL5lKU4iaoPzINTdAvnJRH1rJ_mc4PbQu_L39r2i2V55IANEWWROn-BWX4gJ4.DjTHVQ.HCZOYATkCpndDUC4V0a_Fr4CLLU" --header="Connection: keep-alive" "https://e-2106e5ff6b.cognitiveclass.ai/files/Amazon%20Fine%20Food%20Reviews%20Dataset/GoogleNews-vectors-negative300.bin?download=1" -O "GoogleNews-vectors-negative300.bin" -c

In [4]:
from gensim.models import KeyedVectors

w2v_model_google = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True) #Loading the model from file in the disk

### Function To get stuff from Wiki and timeline the sentences in chronological order according to temporal meanings

In [243]:
#For sorting datetime objects
def date_key(a):
    """
    a: date as string
    """
    if a:
        a = datetime.datetime.strptime(a, '%Y-%m-%d %H:%M:%S').date()
        return a
    else: 
        return None

def search_wiki(query,n=10):
    #Getting the first result from Wikipedia
    article = wiki.search(query, results=1, suggestion=False)
    
    #If Article exsists
    if article:
        print("Article:",article[0],"\n")
        #Getting the first 'n' lines form Wiki article
        results = wiki.summary(article[0], sentences=n, chars=0, auto_suggest=True, redirect=True)
        
        for index,result in enumerate(tokenize.sent_tokenize(results)):
            print("Sentence",index,":",result)
            dates = sorted(temporal(result), key=date_key)
            if dates:
                print("Dates:",dates)
                print("Final:",dates[0])
            else:
                print("No temporal words")
            sent_vec = np.zeros(300)
            
            #If there is only single word in sentence
            if " " not in result:
                try:
                    sent_vec = w2v_model_google.wv[result]
                except:
                    pass
            #If result is sentence
            else:    
                for word in result.split():
                    try:
                        vec = w2v_model_google.wv[word]
                        sent_vec += vec
                    except:
                        continue
#                     print(word)
            print("Embedding:",sent_vec[:10],"\n")
    else:
        #If there is no article
        print("No results found!")

In [218]:
query = "Gandhi"
n = 5
search_wiki(query,n)

Article: Mahatma Gandhi 

Sentence 0 : Mohandas Karamchand Gandhi (; Hindustani: [ˈmoːɦənd̪aːs ˈkərəmtʃənd̪ ˈɡaːnd̪ʱi] ( listen); 2 October 1869 – 30 January 1948) was an Indian activist who was the leader of the Indian independence movement against British rule.
Dates: ['1869-07-21 00:00:00', '1869-10-02 17:28:13', '1948-07-21 00:00:00']
Final: 1869-07-21 00:00:00
Embedding: [ 0.95883179  1.26251221  1.83483887  1.56100464 -0.31054688 -1.6329689
 -0.09558105 -1.6733551   1.18566895  0.0423584 ] 

Sentence 1 : Employing nonviolent civil disobedience, Gandhi led India to independence and inspired movements for civil rights and freedom across the world.
No temporal words
Embedding: [ 0.68292236  1.25226307  1.70947266  2.90075684 -0.30595016 -1.59990311
  1.0426178  -1.50811768  0.65785217  0.67185974] 

Sentence 2 : The honorific Mahātmā (Sanskrit: "high-souled", "venerable")—applied to him first in 1914 in South Africa—is now used worldwide.
Dates: ['1914-01-01 17:28:13', '1914-07-21 0

## Timeline Sentences

In [277]:
def timeline_sentences(query,n=10):
    #Getting the first result from Wikipedia
    article = wiki.search(query, results=1, suggestion=False)
    
    #If Article exsists
    if article:
        
#         print("Article:",article[0],"\n")
        #Getting the first 'n' lines form Wiki article
        results = wiki.summary(article[0], sentences=n, chars=0, auto_suggest=True, redirect=True)
        
        sortable_articles =[]
        unsortable_articles =[]
        for index,result in enumerate(tokenize.sent_tokenize(results)):
#             print("Sentence",index,":",result)
            dates = sorted(temporal(result), key=date_key)
            if dates:
                final = dates[0]
                sortable_articles.append({"sent":result,"dates":dates,"final":final})
            else:
                unsortable_articles.append({"sent":result})

        sortable_articles = sorted(sortable_articles,key= lambda k:date_key(k["final"]))
        articles = sortable_articles + unsortable_articles
        for index,article in enumerate(articles,1): 
            if "dates" in article:
#                 print(article["dates"])
                print(article["final"])
            else:
                print("None")
            print("Sentence",index,":",article["sent"])
            
            sent_vec = np.zeros(300)
            #If there is only single word in sentence
            if " " not in result:
                try:
                    sent_vec = w2v_model_google.wv[result]
                except:
                    pass
            #If result is sentence
            else:    
                for word in result.split():
                    try:
                        vec = w2v_model_google.wv[word]
                        sent_vec += vec
                    except:
                        continue
#                     print(word)
            print("Embedding:",sent_vec[:10],"\n")
    else:
        #If there is no article
        print("No results found!")

# Enter Query here

In [279]:
#Enter the article you want ot find from wikipedia
query = "Gandhi"
#Enter the no. of sentences you need to find
n = 5
timeline_sentences(query,n)

1869-07-21 00:00:00
Sentence 1 : Mohandas Karamchand Gandhi (; Hindustani: [ˈmoːɦənd̪aːs ˈkərəmtʃənd̪ ˈɡaːnd̪ʱi] ( listen); 2 October 1869 – 30 January 1948) was an Indian activist who was the leader of the Indian independence movement against British rule.
Embedding: [ 0.85571289  1.49536133  1.04638672  0.14562988  0.2623291  -0.25268555
 -0.04956055 -1.33813477  0.62542725  0.58557129] 

1914-01-01 18:48:16
Sentence 2 : The honorific Mahātmā (Sanskrit: "high-souled", "venerable")—applied to him first in 1914 in South Africa—is now used worldwide.
Embedding: [ 0.85571289  1.49536133  1.04638672  0.14562988  0.2623291  -0.25268555
 -0.04956055 -1.33813477  0.62542725  0.58557129] 

1915-07-21 00:00:00
Sentence 3 : After his return to India in 1915, he set about organising peasants, farmers, and urban labourers to protest against excessive land-tax and discrimination.
Embedding: [ 0.85571289  1.49536133  1.04638672  0.14562988  0.2623291  -0.25268555
 -0.04956055 -1.33813477  0.6254272

In [278]:
#Enter the article you want ot find from wikipedia
query = "Elon Musk"
#Enter the no. of sentences you need to find
n = 10
timeline_sentences(query,n)

1971-07-21 00:00:00
Sentence 1 : Elon Reeve Musk  (; born June 28, 1971) is a business magnate, investor and engineer.
Embedding: [ 0.6031189   0.9591217   1.21661377  0.42248535 -1.30715942 -1.81364059
  0.68589783 -3.22961426  2.80395508  1.1048584 ] 

1995-07-21 00:00:00
Sentence 2 : He began a Ph.D. in applied physics and material sciences at Stanford University in 1995 but dropped out after two days to pursue an entrepreneurial career.
Embedding: [ 0.6031189   0.9591217   1.21661377  0.42248535 -1.30715942 -1.81364059
  0.68589783 -3.22961426  2.80395508  1.1048584 ] 

1999-07-21 00:00:00
Sentence 3 : He subsequently co-founded Zip2, a web software company, which was acquired by Compaq for $340 million in 1999.
Embedding: [ 0.6031189   0.9591217   1.21661377  0.42248535 -1.30715942 -1.81364059
  0.68589783 -3.22961426  2.80395508  1.1048584 ] 

2000-07-21 00:00:00
Sentence 4 : It merged with Confinity in 2000 and became PayPal, which was bought by eBay for $1.5 billion in October 

### References:
(1) https://news.ycombinator.com/item?id=8653901<br>
(2) https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/timex.py<br>
(3) https://docs.google.com/spreadsheets/d/1dKt0R247B8Mx5sFXd7htSOQB-B5kMODM2ydmjp9cr80/<br>
(4) https://wikipedia.readthedocs.io/en/latest/code.html#api<br>
(5) https://dateparser.readthedocs.io/en/latest/