- Git for Windows.
- Romanian Word Embeddings from fastText. Download and unzip the file; you'll need it for training the model.
- Transformer from TreeBank xml to CoNLL-U format (download link will be provided).
If you have Windows 10 Pro or Windows 10 Enterprise editions installed:
- Open the Docker download page on Docker Hub
- Click on the Get Docker button

Note: This step requires having a Docker account.
For other versions of Windows, download Docker Toolbox.
Run the installer and follow the on-screen instructions to get Docker on your machine.
- Copy the address of this repository from the Clone or download button
- Open the Git GUI application (press the Win button -> type Git GUI -> press Enter, or type git gui into a command line)
- In the Source Location field, paste the address from the first step
- In the Target Directory field, specify where you want to clone (something like C:\Git\eurolan-2019)
- Make sure that the Recursively clone submodules checkbox is checked
- Press the Clone button
- Copy the word embeddings file from the Prerequisites section into the data directory.
- Copy the corpus files into the data\corpus directory.
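Before building the image, it may help to confirm that the files landed where the steps above put them. The following is a minimal sketch, not part of the repository: the function name `check_data_layout` is hypothetical, and since this guide does not name the embeddings file, the check only assumes that some non-hidden file sits directly in the data directory.

```python
from pathlib import Path
from typing import List


def check_data_layout(repo_root: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the layout looks fine."""
    problems: List[str] = []
    data_dir = Path(repo_root) / "data"
    corpus_dir = data_dir / "corpus"

    if not data_dir.is_dir():
        # Nothing else can be checked without the data directory.
        return [f"data directory not found: {data_dir}"]

    # Assumption: any non-hidden regular file directly under data/ is the embeddings file.
    has_embeddings = any(
        p.is_file() and not p.name.startswith(".") for p in data_dir.iterdir()
    )
    if not has_embeddings:
        problems.append(f"no embeddings file found in {data_dir}")

    if not corpus_dir.is_dir():
        problems.append(f"corpus directory not found: {corpus_dir}")
    elif not any(p.is_file() for p in corpus_dir.iterdir()):
        problems.append(f"corpus directory is empty: {corpus_dir}")

    return problems
```

Calling `check_data_layout(r"C:\Git\eurolan-2019")` would return an empty list when both copy steps have been done, and one message per missing piece otherwise.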
This step will build a Docker image of NLP-Cube - an open source natural language processing pipeline.
To build the image:
- Open a command prompt or PowerShell as administrator
- Navigate to the directory where the repository was cloned (e.g. cd C:\Git\eurolan-2019)
- Run the following command:

docker build -t eurolan2019/nlp-cube -f .\nlp-cube\docker\Dockerfile --build-arg extranotebook=notebooks/eurolan-2019.ipynb --build-arg extranotebookname="6. EUROLAN 2019" .

Note: Make sure to include the final . in the command. It specifies the build context.
When the build is finished, run docker images in the same command window. You should see eurolan2019/nlp-cube in the REPOSITORY column of the output.
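The visual check of the REPOSITORY column can also be scripted. The sketch below only parses text that has already been captured from `docker images`; the helper name `image_is_listed` is hypothetical, and it assumes the default tabular output where the first line is a header and the first column is the repository name.

```python
def image_is_listed(docker_images_output: str, repository: str) -> bool:
    """Return True if `repository` appears in the first column of `docker images` output."""
    lines = docker_images_output.strip().splitlines()
    # Skip the header line (REPOSITORY, TAG, ...) and compare the first field of each row.
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == repository:
            return True
    return False
```

The text to parse could be captured with something like `subprocess.run(["docker", "images"], capture_output=True, text=True).stdout` and then passed in as `image_is_listed(output, "eurolan2019/nlp-cube")`.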
To run NLP-Cube, open a command prompt or PowerShell as administrator and run the following command:

docker run -p 8888:8888 -v <path to repository>/data:/data --name nlp-cube eurolan2019/nlp-cube

Make sure to replace <path to repository> with the path where the repository was cloned, and replace \ with /.
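The path rewrite described above is easy to get wrong by hand, so here is a minimal sketch of it. It follows the /c/Git/eurolan-2019 form this guide uses for a repository in C:\Git\eurolan-2019 (lowercase drive letter as the first path segment, forward slashes throughout). The function names `to_docker_mount_path` and `build_run_command` are hypothetical helpers, not part of Docker or of this repository.

```python
from pathlib import PureWindowsPath


def to_docker_mount_path(windows_path: str) -> str:
    """Convert an absolute Windows path such as C:\\Git\\repo to the /c/Git/repo form."""
    p = PureWindowsPath(windows_path)
    # Only plain drive-letter paths (e.g. "C:\...") are handled; UNC and relative paths are rejected.
    if not p.is_absolute() or len(p.drive) != 2 or p.drive[1] != ":":
        raise ValueError(f"expected an absolute drive-letter path, got: {windows_path!r}")
    drive_letter = p.drive[0].lower()
    # p.parts[0] is the anchor (e.g. "C:\\"); the remaining parts are directory names.
    rest = "/".join(p.parts[1:])
    return f"/{drive_letter}/{rest}" if rest else f"/{drive_letter}"


def build_run_command(repo_path: str) -> str:
    """Fill the repository path into the docker run command from this guide."""
    mount = to_docker_mount_path(repo_path)
    return (
        f"docker run -p 8888:8888 -v {mount}/data:/data "
        "--name nlp-cube eurolan2019/nlp-cube"
    )
```

Printing `build_run_command(r"C:\Git\eurolan-2019")` gives a command line that can be pasted into the prompt.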
For example, if you cloned the repository in C:\Git\eurolan-2019, use the following:

docker run -p 8888:8888 -v /c/Git/eurolan-2019/data:/data --name nlp-cube eurolan2019/nlp-cube

In the output, find lines similar to the ones below:
To access the notebook, open this file in a browser:
file:///root/.local/share/jupyter/runtime/nbserver-7-open.html
Or copy and paste one of these URLs:
http://(a588c2c2adde or 127.0.0.1):8888/?token=27b88ef0a9c5d4c74ec54846be42ab9b1215a05adac4ce35

Copy the URL from the output, keeping only one of the hosts listed in parentheses (for example 127.0.0.1), and paste it into the browser. It will open the Jupyter Notebook Server interface. From there, click on the examples folder and then on the eurolan-2019.ipynb file.
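Because the URL printed by the server lists two alternative hosts in parentheses, it cannot be pasted as-is. The sketch below shows one way to turn the captured output into a usable address; the function name `notebook_url` is hypothetical, and the default host 127.0.0.1 is an assumption that may not hold with Docker Toolbox, where the container typically runs inside a virtual machine with its own address.

```python
import re
from typing import Optional

# Matches the port and token in a line like
# "http://(a588c2c2adde or 127.0.0.1):8888/?token=<hex digits>".
_TOKEN_URL = re.compile(r"https?://.*?:(?P<port>\d+)/\?token=(?P<token>[0-9a-f]+)")


def notebook_url(container_output: str, host: str = "127.0.0.1") -> Optional[str]:
    """Return a browser-ready notebook URL, or None if no token URL is found."""
    for line in container_output.splitlines():
        match = _TOKEN_URL.search(line)
        if match:
            return f"http://{host}:{match.group('port')}/?token={match.group('token')}"
    return None
```

Passing the text printed by `docker run` to `notebook_url` yields a single address; a different `host` can be supplied when the container is not reachable on 127.0.0.1.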