- Git for Windows.
- Romanian Word Embeddings from fastText. Download and unzip the file; you'll need it for training the model.
- Transformer from TreeBank xml to CoNLL-U format (download link will be provided).
If you have the Windows 10 Pro or Windows 10 Enterprise edition installed:
- Open the Docker download page on Docker Hub.
- Click the Get Docker button. Note: This step requires a Docker account.
For other versions of Windows, download Docker Toolbox.
Run the installer and follow the on-screen instructions to get Docker on your machine.
- Copy the address of this repository from the Clone or download button.
- Open the Git GUI application (press the Win button -> type Git GUI -> press Enter, or type git gui into a command line).
- In the Source Location field, paste the address from the first step.
- In the Target Directory field, specify where you want to clone (something like C:\Git\eurolan-2019).
- Make sure that the Recursively clone submodules checkbox is checked.
- Press the Clone button.
- Copy the word embeddings file from the Prerequisites section into the data directory.
- Copy the corpus files into the data\corpus directory.
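Before building the image, it can help to confirm that the files landed where the later steps expect them. The following is a minimal sketch, not part of the repository: the function name `check_data_layout` is hypothetical, and it assumes only that the unzipped embeddings sit as a regular file directly inside data and that data\corpus contains at least one file.

```python
from pathlib import Path


def check_data_layout(repo_root):
    """Return a list of problems found; an empty list means the layout looks fine."""
    data_dir = Path(repo_root) / "data"
    corpus_dir = data_dir / "corpus"
    problems = []

    if not data_dir.is_dir():
        problems.append(f"missing directory: {data_dir}")
        return problems

    # Assumption: the unzipped embeddings are a regular file directly inside data/.
    if not any(p.is_file() for p in data_dir.iterdir()):
        problems.append(f"no embeddings file found in {data_dir}")

    # Assumption: any file inside data/corpus counts as a corpus file.
    if not corpus_dir.is_dir():
        problems.append(f"missing directory: {corpus_dir}")
    elif not any(p.is_file() for p in corpus_dir.iterdir()):
        problems.append(f"no corpus files found in {corpus_dir}")

    return problems
```

Calling `check_data_layout(r"C:\Git\eurolan-2019")` returns an empty list when both checks pass. The check is deliberately loose because the exact file names are not fixed here.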
This step will build a Docker image of NLP-Cube, an open source natural language processing pipeline.
To build the image:
- Open a command prompt or PowerShell as administrator.
- Navigate to the directory where the repository was cloned (e.g. cd C:\Git\eurolan-2019).
- Run the following command:
docker build -t eurolan2019/nlp-cube -f .\nlp-cube\docker\Dockerfile --build-arg extranotebook=notebooks/eurolan-2019.ipynb --build-arg extranotebookname="6. EUROLAN 2019" .
Note: Make sure to include the final . in the command. It specifies the build context.
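The build command has two details that are easy to lose when retyping it: the quoted notebook name containing spaces, and the trailing build context. As a hedged sketch, the same command can be assembled as an argument list in Python, which sidesteps shell quoting. The helper names `docker_build_args` and `build_image` are hypothetical; the arguments themselves are copied from the command above.

```python
import subprocess
from pathlib import Path


def docker_build_args():
    """Assemble the docker build command as a list, one element per argument."""
    dockerfile = Path("nlp-cube") / "docker" / "Dockerfile"
    return [
        "docker", "build",
        "-t", "eurolan2019/nlp-cube",
        "-f", str(dockerfile),
        "--build-arg", "extranotebook=notebooks/eurolan-2019.ipynb",
        # No extra quotes needed: the whole string is passed as one argument.
        "--build-arg", "extranotebookname=6. EUROLAN 2019",
        ".",  # build context: the current directory
    ]


def build_image(repo_root):
    """Run the build from the repository root; raises if docker exits with an error."""
    subprocess.run(docker_build_args(), cwd=repo_root, check=True)
```

Calling `build_image(r"C:\Git\eurolan-2019")` from an administrator session would run the same build as the command line shown in the steps.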
When the build is finished, run docker images in the same command window. You should see eurolan2019/nlp-cube in the REPOSITORY column of the output.
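The same check can be scripted instead of read by eye. This is a sketch under stated assumptions: it relies on the `--format "{{.Repository}}"` option of docker images to print one repository name per line, and the helper names `repositories_from_listing` and `image_is_built` are hypothetical.

```python
import subprocess


def repositories_from_listing(listing):
    """Turn one-repository-per-line text into a set of repository names."""
    return {line.strip() for line in listing.splitlines() if line.strip()}


def image_is_built(name="eurolan2019/nlp-cube"):
    """Ask docker for its repository names and check whether ours is among them."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return name in repositories_from_listing(result.stdout)
```

`image_is_built()` returns True once the build has produced the image. The parsing step is kept separate so it can be exercised without Docker installed.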
To run NLP-Cube, open a command prompt or PowerShell as administrator and run the following command:
docker run -p 8888:8888 -v <path to repository>/data:/data --name nlp-cube eurolan2019/nlp-cube
Make sure to replace <path to repository> with the path where the repository was cloned, and replace \ with /.
For example, if you cloned the repository in C:\Git\eurolan-2019, use the following (note that the drive prefix C:\ becomes /c/):
docker run -p 8888:8888 -v /c/Git/eurolan-2019/data:/data --name nlp-cube eurolan2019/nlp-cube
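The path rewrite in the example above (backslashes to forward slashes, drive letter to a lowercase leading segment) can be sketched in Python as follows. The function names `to_docker_mount_path` and `volume_argument` are hypothetical, and the conversion simply mirrors the C:\Git\eurolan-2019 to /c/Git/eurolan-2019 example; other Docker setups may accept different path forms.

```python
from pathlib import PureWindowsPath


def to_docker_mount_path(windows_path):
    """Convert a drive-letter Windows path to the /c/... form used in the example."""
    path = PureWindowsPath(windows_path)
    drive = path.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a path with a drive letter, got: {windows_path!r}")
    # parts[0] is the drive anchor (e.g. "C:\\"); the rest are the folder names.
    folders = path.parts[1:]
    return "/" + drive[0].lower() + "".join("/" + part for part in folders)


def volume_argument(repo_path):
    """Build the value passed to -v: the repository's data folder mounted at /data."""
    return to_docker_mount_path(repo_path) + "/data:/data"
```

For instance, `volume_argument(r"C:\Git\eurolan-2019")` yields the same mount string that appears after -v in the command above.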
In the output, find lines similar to these:
To access the notebook, open this file in a browser:
file:///root/.local/share/jupyter/runtime/nbserver-7-open.html
Or copy and paste one of these URLs:
http://(a588c2c2adde or 127.0.0.1):8888/?token=27b88ef0a9c5d4c74ec54846be42ab9b1215a05adac4ce35
Copy the URL from the output and paste it into the browser. It will open the Jupyter Notebook Server interface. From there, click on the examples folder and then on the eurolan-2019.ipynb file.
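The URL printed by the server lists two possible hosts in parentheses, so it usually needs a small edit before it works in a browser. The sketch below pulls the port and token out of the log text and rebuilds the URL with a single host. The function name `notebook_url` is hypothetical, and 127.0.0.1 as the default host is an assumption that fits the 8888:8888 port mapping above; on some setups (for example Docker Toolbox) the host may need to be a different address.

```python
import re

# Matches both "http://host:port/?token=..." and the
# "http://(name or 127.0.0.1):port/?token=..." form shown in the log.
_URL_PATTERN = re.compile(
    r"http://(?:\([^)]*\)|[^\s:/]+):(\d+)/\?token=([0-9a-fA-F]+)"
)


def notebook_url(log_text, host="127.0.0.1"):
    """Return a browser-ready notebook URL, or None if no token URL is in the log."""
    match = _URL_PATTERN.search(log_text)
    if match is None:
        return None
    port, token = match.groups()
    return f"http://{host}:{port}/?token={token}"
```

Passing the container output as `log_text` gives a URL that can be pasted into the browser directly.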