- Git for Windows.
- Romanian Word Embeddings from fastText. Download and unzip the file; you'll need it for training the model.
- Transformer from TreeBank XML to CoNLL-U format (download link will be provided).
Setting up the environment
If you have the Windows 10 Pro or Windows 10 Enterprise edition installed:
- Go to the Docker download page on Docker Hub
- Click on the Get Docker button
Note: This step requires a Docker account.
For other versions of Windows, download Docker Toolbox.
Run the installer and follow the on-screen instructions to install Docker on your machine.
Clone the repository
- Copy the address of this repository from the Clone or download button
- Open the Git GUI application (press the Win button -> type Git GUI -> press Enter, or type git gui into a command line)
- In the Source Location field, paste the address from the first step
- In the Target Directory field, specify where you want to clone (something like
- Make sure that the Recursively clone submodules checkbox is checked
- Press the
- Copy the word embeddings file from the Prerequisites section into
- Copy the corpus files into
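For readers who prefer a script over Git GUI, the clone step above can be approximated with the git command line. The sketch below is a minimal illustration: the function names `clone_command` and `clone` are hypothetical, and it assumes that `git` is on the PATH. The `--recurse-submodules` option of `git clone` plays the role of the Recursively clone submodules checkbox.

```python
import subprocess
from typing import List


def clone_command(repo_url: str, target_dir: str) -> List[str]:
    """Build a git command that clones a repository together with its submodules."""
    # --recurse-submodules corresponds to the "Recursively clone submodules"
    # checkbox in Git GUI.
    return ["git", "clone", "--recurse-submodules", repo_url, target_dir]


def clone(repo_url: str, target_dir: str) -> None:
    """Run the clone; raises CalledProcessError if git reports a failure."""
    subprocess.run(clone_command(repo_url, target_dir), check=True)


# Usage (fill in the address copied in the first step and your own directory):
# clone("<address of this repository>", r"<target directory>")
```

Copying the word embeddings and corpus files still has to be done by hand afterwards, as described in the list above.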
This step builds a Docker image of NLP-Cube, an open-source natural language processing pipeline.
To build the image:
- Open a command prompt or PowerShell as administrator
- Navigate to the directory where the repository was cloned (e.g.
- Run the following command
docker build -t eurolan2019/nlp-cube -f .\nlp-cube\docker\Dockerfile --build-arg extranotebook=notebooks/eurolan-2019.ipynb --build-arg extranotebookname="6. EUROLAN 2019" .
Note: Make sure to include the final "." in the command. It specifies the build context.
When the build is finished, run docker images in the same command window. You should see eurolan2019/nlp-cube in the REPOSITORY column of the output.
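The check described above can also be scripted. The following is a minimal sketch, assuming that the `docker` CLI is on the PATH; the function names are hypothetical. It relies on the `--format` option of `docker images` to print only the repository names, one per line.

```python
import subprocess
from typing import List


def parse_repositories(listing: str) -> List[str]:
    """Split the output of `docker images --format "{{.Repository}}"` into names."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def list_repositories() -> List[str]:
    """Ask Docker for the repository name of every local image."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_repositories(result.stdout)


def image_is_built(repositories: List[str], name: str = "eurolan2019/nlp-cube") -> bool:
    """Return True if the image name used in the build command is listed."""
    return name in repositories


# Usage, after the build has finished:
# print(image_is_built(list_repositories()))
```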
To run NLP-Cube, open a command prompt or PowerShell as administrator and run the following command:
docker run -p 8888:8888 -v <path to repository>/data:/data --name nlp-cube eurolan2019/nlp-cube
Make sure to replace <path to repository> with the path where the repository was cloned and replace
E.g. if you cloned the repository in C:\Git\eurolan-2019, use the following:
docker run -p 8888:8888 -v /c/Git/eurolan-2019/data:/data --name nlp-cube eurolan2019/nlp-cube
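The path rewrite in the example above (drive letter lowercased and placed behind a leading slash, backslashes turned into forward slashes) can be sketched in Python. The helper name `to_docker_volume_path` is hypothetical, and the conversion only mirrors the single example shown here; other Docker setups on Windows may accept or require a different path form.

```python
from pathlib import PureWindowsPath


def to_docker_volume_path(windows_path: str) -> str:
    """Convert e.g. C:\\Git\\eurolan-2019 to /c/Git/eurolan-2019.

    Hypothetical helper: it reproduces the path form used in the example
    above and assumes a path that starts with a drive letter.
    """
    path = PureWindowsPath(windows_path)
    drive = path.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a path with a drive letter, got {windows_path!r}")
    # parts[0] is the drive anchor ("C:\\"); the remaining parts are folder names.
    folders = path.parts[1:]
    return "/" + "/".join([drive[0].lower(), *folders])


repo = r"C:\Git\eurolan-2019"
volume = to_docker_volume_path(repo) + "/data:/data"
print(volume)  # /c/Git/eurolan-2019/data:/data
```

The printed value is the argument passed to -v in the command above.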
In the output, find lines similar to the following:
To access the notebook, open this file in a browser:
    file:///root/.local/share/jupyter/runtime/nbserver-7-open.html
Or copy and paste one of these URLs:
    http://(a588c2c2adde or 127.0.0.1):8888/?token=27b88ef0a9c5d4c74ec54846be42ab9b1215a05adac4ce35
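The URL in this output lists two host names in parentheses, so it cannot be pasted into a browser verbatim; one of the two has to be chosen. The sketch below shows one way to derive a usable URL, assuming the output has the shape shown above. The function name `browser_url` and the regular expression are illustrative, not part of Jupyter or NLP-Cube.

```python
import re
from typing import Optional

# Matches the URL form shown in the output above, e.g.
# http://(a588c2c2adde or 127.0.0.1):8888/?token=...
_URL_PATTERN = re.compile(
    r"http://\((?P<container>[^\s)]+) or (?P<local>[^\s)]+)\)"
    r":(?P<port>\d+)/\?token=(?P<token>[0-9a-f]+)"
)


def browser_url(server_output: str) -> Optional[str]:
    """Return a URL that can be pasted into a browser, or None if none is found."""
    match = _URL_PATTERN.search(server_output)
    if match is None:
        return None
    # Keep the loopback address: the first name is the container's host name,
    # which is normally not resolvable from the Windows host.
    return f"http://{match['local']}:{match['port']}/?token={match['token']}"


sample = (
    "Or copy and paste one of these URLs: "
    "http://(a588c2c2adde or 127.0.0.1):8888/"
    "?token=27b88ef0a9c5d4c74ec54846be42ab9b1215a05adac4ce35"
)
print(browser_url(sample))
```

With Docker Toolbox, the published port may be reachable on the address of the Docker virtual machine instead of 127.0.0.1, so the host part may need to be adjusted there.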
Copy the URL from the output and paste it into your browser. This opens the Jupyter Notebook Server interface. From there, click on the examples folder and afterwards on the