Commit

Update the documentation to improve the latex format
aboSamoor committed Apr 20, 2015
1 parent bad73e8 commit 7ab8a5e
Showing 6 changed files with 60 additions and 163 deletions.
8 changes: 5 additions & 3 deletions docs/Detection.rst
@@ -68,21 +68,23 @@ the confidence in the detection went down for the first line
.. code:: python
for line in mixed_text.strip().splitlines():
print(line, "\n")
print(line + u"\n")
for language in Detector(line).languages:
print(language)
print("\n")
.. parsed-literal::
(u'China (simplified Chinese: \u4e2d\u56fd; traditional Chinese: \u4e2d\u570b),', '\n')
China (simplified Chinese: 中国; traditional Chinese: 中國),
name: English code: en confidence: 71.0 read bytes: 887
name: Chinese code: zh_Hant confidence: 11.0 read bytes: 1755
name: un code: un confidence: 0.0 read bytes: 0
(u"officially the People's Republic of China (PRC), is a sovereign state located in East Asia.", '\n')
officially the People's Republic of China (PRC), is a sovereign state located in East Asia.
name: English code: en confidence: 98.0 read bytes: 1291
name: un code: un confidence: 0.0 read bytes: 0
name: un code: un confidence: 0.0 read bytes: 0
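The change from `print(line, "\n")` to `print(line + u"\n")` in this hunk presumably targets Python 2 run without `from __future__ import print_function`, where the parenthesised arguments form a tuple and the tuple's repr is printed. The sketch below approximates both outputs on Python 3; `old_style` and `new_style` are illustrative names, and `ascii()` only mimics the Python 2 escaping (it does not add the `u` prefix).

```python
# Sketch only: approximates the two output styles shown in the hunk above.
line = u"China (simplified Chinese: \u4e2d\u56fd; traditional Chinese: \u4e2d\u570b),"

# Old call on Python 2: print (line, "\n") prints a tuple's repr, with
# non-ASCII characters escaped. ascii() gives a similar rendering on Python 3.
old_style = ascii((line, "\n"))

# New call: concatenation builds one string, so the text itself is printed.
new_style = line + u"\n"

print(old_style)
print(new_style)
```

Concatenating before printing gives the same readable output on both interpreter versions, which is likely why the documentation output changed from escaped tuples to plain text.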
75 changes: 15 additions & 60 deletions docs/Download.rst
@@ -77,22 +77,6 @@ Library Interface
from polyglot.downloader import downloader
downloader.download("embeddings2.en")
.. parsed-literal::
[polyglot_data] Downloading package embeddings2.en to
[polyglot_data] /home/rmyeid/polyglot_data...
[polyglot_data] Package embeddings2.en is already up-to-date!
.. parsed-literal::
True
Collections
-----------

@@ -198,54 +182,25 @@ polyglot named entity recognition subsystem, as the following:

.. code:: python
downloader.supported_languages(task="ner2")
print(downloader.supported_languages_table(task="ner2"))
.. parsed-literal::
['Polish',
'Turkish',
'Russian',
'Indonesian',
'Czech',
'Arabic',
'Korean',
'Catalan; Valencian',
'Italian',
'Thai',
'Romanian, Moldavian, Moldovan',
'Tagalog',
'Danish',
'Finnish',
'German',
'Persian',
'Dutch',
'Chinese',
'French',
'Portuguese',
'Slovak',
'Hebrew (modern)',
'Malay',
'Slovene',
'Bulgarian',
'Hindi',
'Japanese',
'Hungarian',
'Croatian',
'Ukrainian',
'Serbian',
'Lithuanian',
'Norwegian',
'Latvian',
'Swedish',
'English',
'Greek, Modern',
'Spanish; Castilian',
'Vietnamese',
'Estonian']
1. Polish 2. Turkish 3. Russian
4. Indonesian 5. Czech 6. Arabic
7. Korean 8. Catalan; Valencian 9. Italian
10. Thai 11. Romanian, Moldavian, ... 12. Tagalog
13. Danish 14. Finnish 15. German
16. Persian 17. Dutch 18. Chinese
19. French 20. Portuguese 21. Slovak
22. Hebrew (modern) 23. Malay 24. Slovene
25. Bulgarian 26. Hindi 27. Japanese
28. Hungarian 29. Croatian 30. Ukrainian
31. Serbian 32. Lithuanian 33. Norwegian
34. Latvian 35. Swedish 36. English
37. Greek, Modern 38. Spanish; Castilian 39. Vietnamese
40. Estonian
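The numbered three-column layout above can be approximated with plain string formatting. The following is a minimal sketch, not polyglot's implementation: `languages_table` and its `columns`, `width`, and `max_len` parameters are hypothetical names chosen for illustration.

```python
def languages_table(names, columns=3, width=26, max_len=20):
    """Hypothetical helper: lay out names as a numbered, multi-column table."""
    cells = []
    for i, name in enumerate(names, start=1):
        # Shorten long names, e.g. "Romanian, Moldavian, Moldovan".
        if len(name) > max_len:
            name = name[:max_len].rstrip() + " ..."
        # Right-align the index, then pad the cell to a fixed width.
        cells.append("{:>3}. {}".format(i, name).ljust(width))
    # Group the cells into rows of `columns` entries each.
    rows = [cells[i:i + columns] for i in range(0, len(cells), columns)]
    return "\n".join("".join(row).rstrip() for row in rows)


print(languages_table(["Polish", "Turkish", "Russian", "Indonesian"]))
```

Returning a string rather than printing inside the helper mirrors how the documentation wraps `supported_languages_table` in `print(...)`.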
You can view all the available and/or installed collections or packages
10 changes: 3 additions & 7 deletions docs/Transliteration.rst
@@ -14,14 +14,10 @@ Dēmokratía".
Languages Coverage
------------------

**TODO**

Describe how did we get these models

.. code:: python
from polyglot.downloader import downloader
print(downloader.supported_languages_table("transliteration2", 3))
print(downloader.supported_languages_table("transliteration2"))
.. parsed-literal::
@@ -52,8 +48,8 @@ Describe how did we get these models
Download Necessary Models
^^^^^^^^^^^^^^^^^^^^^^^^^
Downloading Necessary Models
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: python
10 changes: 6 additions & 4 deletions notebooks/Detection.ipynb
@@ -128,7 +128,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 14,
"metadata": {
"collapsed": false
},
@@ -137,13 +137,15 @@
"name": "stdout",
"output_type": "stream",
"text": [
"(u'China (simplified Chinese: \\u4e2d\\u56fd; traditional Chinese: \\u4e2d\\u570b),', '\\n')\n",
"China (simplified Chinese: 中国; traditional Chinese: 中國),\n",
"\n",
"name: English code: en confidence: 71.0 read bytes: 887\n",
"name: Chinese code: zh_Hant confidence: 11.0 read bytes: 1755\n",
"name: un code: un confidence: 0.0 read bytes: 0\n",
"\n",
"\n",
"(u\"officially the People's Republic of China (PRC), is a sovereign state located in East Asia.\", '\\n')\n",
"officially the People's Republic of China (PRC), is a sovereign state located in East Asia.\n",
"\n",
"name: English code: en confidence: 98.0 read bytes: 1291\n",
"name: un code: un confidence: 0.0 read bytes: 0\n",
"name: un code: un confidence: 0.0 read bytes: 0\n",
@@ -154,7 +156,7 @@
],
"source": [
"for line in mixed_text.strip().splitlines():\n",
" print(line, \"\\n\")\n",
" print(line + u\"\\n\")\n",
" for language in Detector(line).languages:\n",
" print(language)\n",
" print(\"\\n\")"
103 changes: 27 additions & 76 deletions notebooks/Download.ipynb
@@ -125,31 +125,11 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[polyglot_data] Downloading package embeddings2.en to\n",
"[polyglot_data] /home/rmyeid/polyglot_data...\n",
"[polyglot_data] Package embeddings2.en is already up-to-date!\n"
]
},
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"from polyglot.downloader import downloader\n",
"downloader.download(\"embeddings2.en\")"
@@ -179,7 +159,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 6,
"metadata": {
"collapsed": false
},
@@ -234,7 +214,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 7,
"metadata": {
"collapsed": false
},
@@ -245,7 +225,7 @@
"True"
]
},
"execution_count": 3,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -270,7 +250,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 8,
"metadata": {
"collapsed": false
},
@@ -287,7 +267,7 @@
" u'tsne2']"
]
},
"execution_count": 4,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -305,63 +285,34 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"['Polish',\n",
" 'Turkish',\n",
" 'Russian',\n",
" 'Indonesian',\n",
" 'Czech',\n",
" 'Arabic',\n",
" 'Korean',\n",
" 'Catalan; Valencian',\n",
" 'Italian',\n",
" 'Thai',\n",
" 'Romanian, Moldavian, Moldovan',\n",
" 'Tagalog',\n",
" 'Danish',\n",
" 'Finnish',\n",
" 'German',\n",
" 'Persian',\n",
" 'Dutch',\n",
" 'Chinese',\n",
" 'French',\n",
" 'Portuguese',\n",
" 'Slovak',\n",
" 'Hebrew (modern)',\n",
" 'Malay',\n",
" 'Slovene',\n",
" 'Bulgarian',\n",
" 'Hindi',\n",
" 'Japanese',\n",
" 'Hungarian',\n",
" 'Croatian',\n",
" 'Ukrainian',\n",
" 'Serbian',\n",
" 'Lithuanian',\n",
" 'Norwegian',\n",
" 'Latvian',\n",
" 'Swedish',\n",
" 'English',\n",
" 'Greek, Modern',\n",
" 'Spanish; Castilian',\n",
" 'Vietnamese',\n",
" 'Estonian']"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
" 1. Polish 2. Turkish 3. Russian \n",
" 4. Indonesian 5. Czech 6. Arabic \n",
" 7. Korean 8. Catalan; Valencian 9. Italian \n",
" 10. Thai 11. Romanian, Moldavian, ... 12. Tagalog \n",
" 13. Danish 14. Finnish 15. German \n",
" 16. Persian 17. Dutch 18. Chinese \n",
" 19. French 20. Portuguese 21. Slovak \n",
" 22. Hebrew (modern) 23. Malay 24. Slovene \n",
" 25. Bulgarian 26. Hindi 27. Japanese \n",
" 28. Hungarian 29. Croatian 30. Ukrainian \n",
" 31. Serbian 32. Lithuanian 33. Norwegian \n",
" 34. Latvian 35. Swedish 36. English \n",
" 37. Greek, Modern 38. Spanish; Castilian 39. Vietnamese \n",
" 40. Estonian \n"
]
}
],
"source": [
"downloader.supported_languages(task=\"ner2\")"
"print(downloader.supported_languages_table(task=\"ner2\"))"
]
},
{
17 changes: 4 additions & 13 deletions notebooks/Transliteration.ipynb
@@ -17,7 +17,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 1,
"metadata": {
"collapsed": false
},
@@ -33,18 +33,9 @@
"## Languages Coverage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TODO**\n",
"\n",
" Describe how did we get these models"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {
"collapsed": false
},
@@ -82,14 +73,14 @@
],
"source": [
"from polyglot.downloader import downloader\n",
"print(downloader.supported_languages_table(\"transliteration2\", 3))"
"print(downloader.supported_languages_table(\"transliteration2\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Download Necessary Models"
"#### Downloading Necessary Models"
]
},
{
