Avoid superfluous file copies in workspace directory #227

Closed
wrznr opened this issue Jan 4, 2019 · 4 comments · Fixed by #303

@wrznr
Contributor

wrznr commented Jan 4, 2019

Using https://raw.githubusercontent.com/OCR-D/assets/master/data/dietrich_fuehrer_1839/data/mets.xml, each processing step copies its input files from the corresponding subdirectory into the workspace directory. For example, when the KRAKEN-BIN filegroup is used as input for region detection, the files from the output directory KRAKEN-BIN are copied to . (the workspace root).
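
To check whether a workspace is affected, one could look for files in the workspace root whose content also exists in one of the subdirectories. The following is only a minimal sketch under some assumptions: `find_root_copies` and `file_digest` are hypothetical helpers (not part of OCR-D core), and the copies are assumed to be byte-identical to the originals, so files are compared by SHA-256 digest rather than by name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_root_copies(workspace: Path) -> dict:
    """Map each top-level file to the subdirectory files it duplicates.

    Hypothetical helper for illustration; assumes copies are byte-identical.
    """
    # Index every file below a subdirectory by its content digest.
    by_digest = defaultdict(list)
    for subdir in sorted(p for p in workspace.iterdir() if p.is_dir()):
        for path in sorted(subdir.rglob("*")):
            if path.is_file():
                by_digest[file_digest(path)].append(path)

    # Report top-level files whose content also exists in a subdirectory.
    copies = {}
    for path in sorted(workspace.iterdir()):
        if path.is_file():
            matches = by_digest.get(file_digest(path))
            if matches:
                copies[path] = matches
    return copies
```

Calling `find_root_copies(Path("."))` from the directory that contains mets.xml returns an empty dict when no top-level file duplicates a file in a subdirectory. Comparing by content rather than by file name is a deliberate choice here, since a copy does not necessarily keep the name of the original.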

@wrznr
Contributor Author

wrznr commented Jul 3, 2019

This can be closed, right?

@mikegerber
Contributor

Seems to be fixed, using these versions:

ocrd (1.0.0b10)
ocrd-kraken (0.1.0)
ocrd-modelfactory (1.0.0b10)
ocrd-models (1.0.0b10)
ocrd-tesserocr (0.2.2)
ocrd-typegroups-classifier (0.0.1)
ocrd-utils (1.0.0b10)
ocrd-validators (1.0.0b10)

@mikegerber
Contributor

mikegerber commented Jul 3, 2019

It still happens with ocrd-ocropy-segment 0.0.3:

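#!/bin/bash -x
# Work in a throwaway directory
cd "$(mktemp -d)"

# Install ocrd-ocropy into a fresh virtualenv
virtualenv -p /usr/bin/python3 venv
. venv/bin/activate
pip install ocrd-ocropy

# Fetch and unpack the test workspace; -O keeps "?dl=1" out of the local file name
wget -O bernd_lebensbeschreibung_1738.ocrd.zip "https://www.dropbox.com/s/ettix80l6ul2h70/bernd_lebensbeschreibung_1738.ocrd.zip?dl=1"
dtrx bernd_lebensbeschreibung_1738.ocrd.zip

# Run line segmentation on the image file group
cd bernd_lebensbeschreibung_1738.ocrd/data
ocrd-ocropy-segment -m mets.xml -I OCR-D-IMG -O OCR-D-SEG-LINES

# Show the installed versions and the resulting workspace contents
pip list | grep ocrd
ls -l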

yields:

[...]
ocrd                   1.0.0b10 
ocrd-fork-ocropy       1.4.0a3  
ocrd-modelfactory      1.0.0b10 
ocrd-models            1.0.0b10 
ocrd-ocropy            0.0.3    
ocrd-utils             1.0.0b10 
ocrd-validators        1.0.0b10 
total 23348
-rw-rw-r--. 1 mike mike   13069 Jul  3 14:37 mets.xml
drwxrwxr-x. 2 mike mike     100 Feb 26 15:21 OCR-D-GT-SEG-BLOCK
drwxrwxr-x. 2 mike mike     100 Feb 26 15:21 OCR-D-GT-SEG-LINE
drwxrwxr-x. 2 mike mike     100 Feb 26 15:21 OCR-D-GT-SEG-PAGE
drwxrwxr-x. 2 mike mike     100 Feb 26 15:21 OCR-D-IMG
-rw-rw-r--. 1 mike mike 7957484 Jul  3 14:37 OCR-D-IMG_0001
-rw-rw-r--. 1 mike mike 7957484 Jul  3 14:37 OCR-D-IMG_0002
-rw-rw-r--. 1 mike mike 7973726 Jul  3 14:37 OCR-D-IMG_0003
drwxrwxr-x. 2 mike mike     100 Jul  3 14:37 OCR-D-SEG-LINES

Is this a problem with ocrd-ocropy or with core? I don't use ocrd-ocropy at the moment because of OCR-D/ocrd_ocropy#4, and it seems it is going to be replaced (OCR-D/ocrd_ocropy#5), so maybe this issue can be closed.

@wrznr
Contributor Author

wrznr commented Jul 18, 2019

@kba Could you please clarify whether this is a core or a processor problem?

wrznr added this to the Developer Workshop milestone Jul 18, 2019
kba mentioned this issue Aug 13, 2019
kba closed this as completed in #303 Sep 6, 2019