Updated the explanation and modified import

siddharthkv7 committed Sep 26, 2018
1 parent 776f101 commit 7d8dafa2f3f4da171b98f22295c50e0699991ed2
Showing with 1 addition and 1 deletion.
  1. +1 −1 9 Lexical Dispersion Plot.ipynb
@@ -94,7 +94,7 @@
"## Explanation \n",
"\n",
"### Tokenisation\n",
-"Firstly, we check which language the function is present in. Then we try to sort them accordingly, sending the Indian ones one way, and English and Latin the other. Both these groups have been assigned their own separate tokenizer. We use the CLTK Indian tokenizer for Indian languages and the NLTK `word_tokenize` method for the other two languages.\n",
+"Firstly, we check which language the function is present in. Then we try to sort them accordingly, sending the Indian ones one way, and English and Latin the other. Both these groups have been assigned their own separate tokenizer. We use the CLTK `TokenizeSentence()` for Indian languages and the NLTK `word_tokenize` method for the other two languages.\n",
"\n",
"### Locating Matches and Plotting\n",
"This is a pretty straightforward task where we select matches from the text and store their positions in the text in order to display them on the graph. This is achieved using simple loops. It is followed by basic plotting and manipulating data points to produce the lexical dispersion plot."
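
The match-locating and plotting steps described in the notebook text above can be sketched roughly as follows. This is not the notebook's actual code: `simple_tokenize`, `locate_matches`, and `plot_dispersion` are hypothetical names, and the regex tokenizer is only a stand-in for the CLTK and NLTK tokenizers mentioned in the explanation. The plotting helper assumes matplotlib is installed and is not called here.

```python
import re


def simple_tokenize(text):
    """Stand-in tokenizer (assumption): keep runs of word characters only.

    The notebook dispatches to CLTK or NLTK depending on the language;
    swap that call in here.
    """
    return re.findall(r"\w+", text)


def locate_matches(tokens, targets, ignore_case=True):
    """Return {target: [token offsets]} for each occurrence of each target."""
    norm = (lambda s: s.lower()) if ignore_case else (lambda s: s)
    lookup = {norm(t): t for t in targets}
    positions = {t: [] for t in targets}
    for offset, token in enumerate(tokens):
        target = lookup.get(norm(token))
        if target is not None:
            positions[target].append(offset)
    return positions


def plot_dispersion(positions, total_tokens):
    """Draw one row of tick marks per target word (requires matplotlib)."""
    import matplotlib.pyplot as plt

    words = list(positions)
    fig, ax = plt.subplots()
    for row, word in enumerate(words):
        offsets = positions[word]
        # Each occurrence becomes a vertical tick at its token offset.
        ax.plot(offsets, [row] * len(offsets), "|", markersize=12)
    ax.set_yticks(list(range(len(words))))
    ax.set_yticklabels(words)
    ax.set_xlim(0, total_tokens)
    ax.set_xlabel("Word offset")
    ax.set_title("Lexical dispersion plot")
    plt.show()


text = "The cat sat. The dog sat near the cat."
tokens = simple_tokenize(text)
positions = locate_matches(tokens, ["cat", "sat"])
```

Calling `plot_dispersion(positions, len(tokens))` would then render the plot. Keeping the position lookup separate from the drawing code makes the matching step easy to check without a display.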
