Files with CD (Spanish Congress) interventions. They are processed in the following way:
[Processes so far]
The original workflow is kept in a separate SCRIPT repository. However, this original workflow is refined for Parlamint 3.0.
1.9. Numbering interventions and speeches is possible with separate scripts. Adding paragraphs and sentences is possible with a separate script (by Saturnino Luz)
<note> </notes>
are specified further according to TEI-PARLAMINT notes (documentation: https://clarin-eric.github.io/ParlaMint/#sec-comments)
NOTICE Implementation can be placed in cd2parmamint.xsl or in a separate script (then Makefile modification is needed)
Notice Makefile steps should be implemented from bottom to top. Targets are run with "make" command and the target (for instance: make cnv1).
If you are working on Windows, it is advisable to install a Windows Subsystem for Linux (WSL) to use bash command-line tools on Windows and run makefile smoothly. Several errors will crop up while trying to run the makefile.
2.0. Enable WSL
Open Windows Features, scroll down and check Windows Susbsystem for Linux. Select OK and restart Windows.
2.1. Open PowerShell or Command prompt (terminal) as an administrator (right click)
type wsl --list --online to view a list of available WSL distributions that can be installed
2.2. Type the specific distribution (for example, Ubuntu): wsl --install -d Ubuntu and press enter. Restart your computer
2.3. After restarting, it may take a while to install your desirable distribution.
2.4. Once installation of Ubuntu is complete, you'll be prompted to enter your username and password
2.5. Launch Ubuntu from the start menu on Windows
2.6. Do apt upgrade and apt update
Update by typing
$ sudo apt-get update
you'll be prompted to enter your password (step 7)
Upgrade by typing
$ sudo apt-get upgrade
2.7. Install packages and dependencies (you'll be prompted several times "Do you want to continue? [Y/n]" type Y)
$ sudo apt install moreutils
$ sudo apt install make
$ sudo apt install parallel
$ sudo apt install openjdk-19-jre-headless
$ sudo apt install unzip
There's a script loaded from the Ukrainian ParlaMint repository so you need to have svn installed:
$ sudo apt install subversion
$ sudo apt install jing
2.8. Saxon needs to be in the right place. This setup works (you should place SaxonHE12-3J here):
/opt/SaxonHE12-3J/saxon-he-12.3.jar
-rw-r--r-- 1 root root 5559891 Jul 12 11:45 /opt/SaxonHE12-3J/saxon-he-12.3.jar
go to /otp and make a dir to install SaxonHE12-3J.zip
$ cd /otp
$ sudo mkdir SaxonHE12-3J
Download SaxonHE12-3J.zip from [https://www.saxonica.com/download/] and move the file to /otp/SaxonHE12-3J
$ sudo mv /mnt/c/Users/XXXXX/Downloads/SaxonHE12-3J.zip /opt/SaxonHE12-3J
where XXXXX is your windows username
Unzip SaxonHE12-3J.zip
$ cd /otp/SaxonHE12-3J
$ sudo unzip SaxonHE12-3J.zip
then, you must create a symlink here
$ sudo ln -s /opt/SaxonHE12-3J/saxon-he-12.3.jar /usr/share/java/saxon.jar
verify that you have saxon.jar in the right place: /usr/share/java/saxon.jar
$ ll /usr/share/java/saxon.jar
lrwxrwxrwx 1 root root 35 Jul 12 11:52 /usr/share/java/saxon.jar -> /opt/SaxonHE12-3J/saxon-he-12.3.jar
Notice you should use Saxon-HE, because it allows you to run XSLT2.0 scripts
Now you can run makefile.
the script https://github.com/calzada/PARLAMINT-ES-MC/blob/master/bin/ana_work_stanza.py (by Luciana de Macedo) is working! But my machine is taking around 1 hour per file! It is important to have the best of GPU.
To run de Macedo's script: