FoLiA-2text aborts on a metadata issue #37

Closed
martinreynaert opened this issue Nov 12, 2019 · 2 comments

@martinreynaert
Contributor

This happened on the Staten-Generaal Digitaal FoLiA, which has been processed by just about every other FoLiA tool before.

[1]+ Aborted nohup /exp/sloot/usr/local/bin/FoLiA-2text -t 120 --class=Ticcl -e ticcl.xml -o /reddata/POLMASH/FOLIALangCatTICCLTXT/d/nl/proc/sgd/ /reddata/POLMASH/FOLIALangCatTICCL/d/nl/proc/sgd/ > /reddata/POLMASH/FOLIALangCatTICCLTXT.sgd.20191112.stdout 2> /reddata/POLMASH/FOLIALangCatTICCLTXT.sgd.20191112.stderr

reynaert@maize:/reddata/POLMASH$ cat /reddata/POLMASH/FOLIALangCatTICCLTXT.sgd.20191112.stderr
nohup: ignoring input
WARNING: foreign-data found in metadata of type 'native'
changing type to 'foreign'
WARNING: foreign-data found in metadata of type 'native'
changing type to 'foreign'
WARNING: foreign-data found in metadata of type 'native'
changing type to 'foreign'
terminate called recursively
terminate called after throwing an instance of 'folia::NoSuchText'
reynaert@maize:/reddata/POLMASH$

I have no idea.

@martinreynaert
Contributor Author

Addendum:

Before it crashed it created 5 output files, one of which had actual output:

reynaert@maize:~$ ls -l /reddata/POLMASH/FOLIALangCatTICCLTXT/d/nl/proc/sgd/
total 2
-rw-r--r-- 1 reynaert reynaert 0 Nov 12 16:41 nl.proc.sgd.d.186918700000115.folia.lc.ticcl.xml.txt
-rw-r--r-- 1 reynaert reynaert 0 Nov 12 16:41 nl.proc.sgd.d.189618970000236.folia.lc.ticcl.xml.txt
-rw-r--r-- 1 reynaert reynaert 0 Nov 12 16:41 nl.proc.sgd.d.190419050000115.folia.lc.ticcl.xml.txt
-rw-r--r-- 1 reynaert reynaert 0 Nov 12 16:41 nl.proc.sgd.d.195419550000051.2.folia.lc.ticcl.xml.txt
-rw-r--r-- 1 reynaert reynaert 1398 Nov 12 16:41 nl.proc.sgd.d.198819890000898.8.folia.lc.ticcl.xml.txt

Before this run, I tested it on a small number of files and it ran as expected.

@kosloot
Contributor

kosloot commented Nov 13, 2019

Well, this is caused by running FoLiA-2text on files without any text in them.
This was not foreseen :{
I committed a patch to skip such documents.
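
The patch itself is not shown in this thread. As a rough illustration of the "skip such documents" idea only, here is a minimal C++ sketch. Everything in it is a hypothetical stand-in, not the libfolia API: `NoSuchText` mimics the `folia::NoSuchText` exception from the log, while `Doc`, `extract_text` and `convert_all` are invented for the example. The point is that the exception is caught per document, so one empty file no longer takes down the whole run.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for folia::NoSuchText (the real class lives in libfolia).
struct NoSuchText : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for a parsed document: a file name plus its text.
struct Doc {
    std::string name;
    std::string text;
};

// Hypothetical extraction step: throws when the document holds no text,
// which is the situation described in this issue.
std::string extract_text(const Doc& doc) {
    if (doc.text.empty()) {
        throw NoSuchText("no text in " + doc.name);
    }
    return doc.text;
}

// Convert every document, skipping those without text instead of aborting.
// Appends each extracted text to `out` and returns the number converted.
std::size_t convert_all(const std::vector<Doc>& docs,
                        std::vector<std::string>& out) {
    std::size_t converted = 0;
    for (const auto& doc : docs) {
        try {
            out.push_back(extract_text(doc));
            ++converted;
        } catch (const NoSuchText& e) {
            // Report and move on; the remaining documents are still processed.
            std::cerr << "skipping: " << e.what() << '\n';
        }
    }
    return converted;
}
```

In a multi-threaded run such as the `-t 120` invocation above, the catch would need to sit inside each worker's loop body, since an exception that escapes a worker thread generally ends in `std::terminate` rather than reaching the caller.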

kosloot pushed a commit that referenced this issue Nov 14, 2019