# Downloading Models

Polyglot requires a model for each task and language.
The library cannot function without these models.
Because some of the models are large, we distribute them separately through a download manager. The download manager has several modes of operation.

## Modes of Operation

### Interactive Mode Interface

In [None]:
!polyglot download

Polyglot Downloader
---------------------------------------------------------------------------
  d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
---------------------------------------------------------------------------
Downloader> 

### Command Line Interface

In [2]:
!polyglot download --help

usage: polyglot download [-h] [--dir DIR] [--quiet] [--force] [--exit-on-error] [--url SERVER_INDEX_URL] [packages [packages ...]]

positional arguments:
  packages              packages to be downloaded

optional arguments:
  -h, --help            show this help message and exit
  --dir DIR             download package to directory DIR
  --quiet               work quietly
  --force               download even if already installed
  --exit-on-error       exit if an error occurs
  --url SERVER_INDEX_URL
                        download server index url


In [6]:
!polyglot download morph2.en

[polyglot_data] Downloading package morph2.en to
[polyglot_data]     /home/rmyeid/polyglot_data...
[polyglot_data]   Package morph2.en is already up-to-date!
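
The flags listed in the help output can also be assembled programmatically, for example when scripting downloads. The following is a minimal sketch and not part of polyglot: `build_download_command` is a hypothetical helper that only builds the argument list from the flags shown above (`--dir`, `--quiet`, `--force`, `--exit-on-error`). It assumes the `polyglot` executable is on the `PATH` when the command is eventually run.

```python
import subprocess


def build_download_command(packages, directory=None, quiet=False,
                           force=False, exit_on_error=False):
    """Build the argument list for `polyglot download` (hypothetical helper)."""
    command = ["polyglot", "download"]
    if directory is not None:
        # --dir DIR: download packages to directory DIR
        command += ["--dir", directory]
    if quiet:
        command.append("--quiet")
    if force:
        # --force: download even if already installed
        command.append("--force")
    if exit_on_error:
        command.append("--exit-on-error")
    # Package or collection names come last, as positional arguments.
    command += list(packages)
    return command


command = build_download_command(["morph2.en"], quiet=True)
print(command)

# To actually run the command (requires polyglot to be installed):
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting issues if a directory name contains spaces.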


### Library Interface

In [5]:
from polyglot.downloader import downloader
downloader.download("embeddings2.en")

[polyglot_data] Downloading package embeddings2.en to
[polyglot_data]     /home/rmyeid/polyglot_data...
[polyglot_data]   Package embeddings2.en is already up-to-date!


True
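
When several packages are needed, the same call can be repeated in a loop. The sketch below is an illustration rather than part of polyglot: `download_packages` and `failed_packages` are hypothetical helpers. `download_packages` accepts any object with a `download(name)` method, such as the `downloader` imported above, and records what each call returns. It assumes, based on the `True` shown in the output above, that a truthy return value indicates success.

```python
def download_packages(downloader, package_names):
    """Call downloader.download() for each name and collect the results."""
    results = {}
    for name in package_names:
        # bool() normalizes whatever download() returns to True/False.
        results[name] = bool(downloader.download(name))
    return results


def failed_packages(results):
    """Return the names whose download did not report success."""
    return [name for name, ok in results.items() if not ok]
```

With polyglot installed, this could be called as `download_packages(downloader, ["embeddings2.en", "morph2.en"])`.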

## Collections

As you may have noticed by now, we can install a specific model by specifying its task name and target language.

The package name format is `task_name.language_code`.
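
To make the naming scheme concrete, here is a small sketch that splits a package name into its two parts. `split_package_name` is a hypothetical helper, not a polyglot function, and it assumes the language code is everything after the last dot.

```python
def split_package_name(package_name):
    """Split 'task_name.language_code' into (task_name, language_code)."""
    # rpartition splits on the LAST dot in the name.
    task_name, separator, language_code = package_name.rpartition(".")
    if not separator or not task_name or not language_code:
        raise ValueError(
            "expected 'task_name.language_code', got %r" % package_name
        )
    return task_name, language_code


print(split_package_name("morph2.en"))
print(split_package_name("embeddings2.ar"))
```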

#### Language Collections

Packages are grouped by language. For example, to download all the models that are specific to Arabic, use the Arabic collection, whose name is **LANG:** followed by the Arabic language code, `ar`.

Therefore, we can just run:

In [10]:
!polyglot download LANG:ar

[polyglot_data] Downloading collection u'LANG:ar'
[polyglot_data]    | 
[polyglot_data]    | Downloading package tsne2.ar to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    | Downloading package transliteration2.ar to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    |   Package transliteration2.ar is already up-to-
[polyglot_data]    |       date!
[polyglot_data]    | Downloading package morph2.ar to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    |   Package morph2.ar is already up-to-date!
[polyglot_data]    | Downloading package counts2.ar to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    | Downloading package sentiment2.ar to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    | Downloading package embeddings2.ar to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    | Downloading package ner2.ar to
[polyglot_data]    |     /home/rmyeid

#### Task Collections

Packages are also grouped by task. For example, to download all the models that perform transliteration, use the collection whose name is **TASK:** followed by the task name.

Therefore, we can just run:

In [13]:
downloader.download("TASK:transliteration2")

[polyglot_data] Downloading collection u'TASK:transliteration2'
[polyglot_data]    | 
[polyglot_data]    | Downloading package transliteration2.nn to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    |   Package transliteration2.nn is already up-to-
[polyglot_data]    |       date!
[polyglot_data]    | Downloading package transliteration2.no to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    |   Package transliteration2.no is already up-to-
[polyglot_data]    |       date!
[polyglot_data]    | Downloading package transliteration2.nl to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    |   Package transliteration2.nl is already up-to-
[polyglot_data]    |       date!
[polyglot_data]    | Downloading package transliteration2.az to
[polyglot_data]    |     /home/rmyeid/polyglot_data...
[polyglot_data]    |   Package transliteration2.az is already up-to-
[polyglot_data]    |       date!
[polyglot_data]    | Downloadi

True
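
The two collection prefixes and the plain package format can be told apart by inspecting the identifier. The sketch below illustrates this; the helper names (`language_collection`, `task_collection`, `classify_identifier`) are hypothetical and not part of polyglot, and the code only manipulates strings without contacting the download server.

```python
LANGUAGE_PREFIX = "LANG:"
TASK_PREFIX = "TASK:"


def language_collection(language_code):
    """Name of the collection holding all packages for one language."""
    return LANGUAGE_PREFIX + language_code


def task_collection(task_name):
    """Name of the collection holding all packages for one task."""
    return TASK_PREFIX + task_name


def classify_identifier(identifier):
    """Return (kind, value) for a download identifier."""
    if identifier.startswith(LANGUAGE_PREFIX):
        return "language collection", identifier[len(LANGUAGE_PREFIX):]
    if identifier.startswith(TASK_PREFIX):
        return "task collection", identifier[len(TASK_PREFIX):]
    # Anything without a collection prefix is treated as a single package.
    return "package", identifier


for identifier in (language_collection("ar"),
                   task_collection("transliteration2"),
                   "morph2.en"):
    print(identifier, "->", classify_identifier(identifier))
```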

## Language & Task Support

We can query the download manager for the tasks that polyglot supports for a given language, as follows:

In [14]:
downloader.supported_tasks(lang="en")

[u'embeddings2',
 u'counts2',
 u'pos2',
 u'ner2',
 u'sentiment2',
 u'morph2',
 u'tsne2']
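
The task names returned here can be combined with a language code to form package names. The sketch below assumes that each supported task corresponds to a package named according to the `task_name.language_code` format described earlier; `package_names` is a hypothetical helper, and the task list is copied from the output above rather than fetched from the server.

```python
def package_names(task_names, language_code):
    """Combine task names with a language code into package names."""
    return ["%s.%s" % (task, language_code) for task in task_names]


# Task names copied from the supported_tasks(lang="en") output above.
tasks = ["embeddings2", "counts2", "pos2", "ner2",
         "sentiment2", "morph2", "tsne2"]
print(package_names(tasks, "en"))
```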

We can query the download manager for the languages supported by polyglot's named entity recognition subsystem, as follows:

In [19]:
downloader.supported_languages(task="ner2")

['Polish',
 'Turkish',
 'Russian',
 'Indonesian',
 'Czech',
 'Arabic',
 'Korean',
 'Catalan; Valencian',
 'Italian',
 'Thai',
 'Romanian, Moldavian, Moldovan',
 'Tagalog',
 'Danish',
 'Finnish',
 'German',
 'Persian',
 'Dutch',
 'Chinese',
 'French',
 'Portuguese',
 'Slovak',
 'Hebrew (modern)',
 'Malay',
 'Slovene',
 'Bulgarian',
 'Hindi',
 'Japanese',
 'Hungarian',
 'Croatian',
 'Ukrainian',
 'Serbian',
 'Lithuanian',
 'Norwegian',
 'Latvian',
 'Swedish',
 'English',
 'Greek, Modern',
 'Spanish; Castilian',
 'Vietnamese',
 'Estonian']

You can view all the downloaded packages, as well as the ones available in the index, through the `list` function:

In [14]:
downloader.list(show_packages=False)

Using default data directory (/home/rmyeid/polyglot_data)
 Data server index for <polyglot-models>
Collections:
  [ ] LANG:af............. Afrikaans            packages and models
  [ ] LANG:als............ als                  packages and models
  [ ] LANG:am............. Amharic              packages and models
  [ ] LANG:an............. Aragonese            packages and models
  [ ] LANG:ar............. Arabic               packages and models
  [ ] LANG:arz............ arz                  packages and models
  [ ] LANG:as............. Assamese             packages and models
  [ ] LANG:ast............ Asturian             packages and models
  [ ] LANG:az............. Azerbaijani          packages and models
  [ ] LANG:ba............. Bashkir              packages and models
  [ ] LANG:bar............ bar                  packages and models
  [ ] LANG:be............. Belarusian           packages and models
  [ ] LANG:bg............. Bulgarian            packages and models
  [ 