I've quickly made a PR (#932) with a change that uses the temporary directory as it is if the path is absolute, and behaves as before if the path is relative.
Maybe you can test it. 😅
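The absolute-versus-relative behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the class `TempPathResolver`, the method `resolveTempPath`, and the example grobid-home location are hypothetical names chosen for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of GROBID's API.
class TempPathResolver {

    // Returns the configured temp path unchanged (apart from normalization)
    // when it is absolute, otherwise resolves it against grobid-home.
    static Path resolveTempPath(Path grobidHome, String configuredTemp) {
        Path temp = Paths.get(configuredTemp);
        if (temp.isAbsolute()) {
            return temp.normalize();
        }
        return grobidHome.resolve(temp).normalize();
    }

    public static void main(String[] args) {
        // Assumed grobid-home location, for demonstration only.
        Path home = Paths.get("/srv/grobid/grobid-home");
        System.out.println(resolveTempPath(home, "/run/grobid/tmp"));
        System.out.println(resolveTempPath(home, "tmp"));
    }
}
```

Note that `Path.resolve` already returns its argument unchanged when that argument is absolute, so the explicit `isAbsolute()` check is there mainly to make the intent obvious.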
In our use of GROBID, we have machines with a reasonable number of cores and RAM (e.g., 30 cores, 40GB RAM), but poor disk I/O. This makes it important to have GROBID not write to disk, or to use a ramdisk (aka, virtual RAM-backed partition) if it must (e.g., for interaction with pdfalto).
In the past it was possible to configure grobid.temp to point to, e.g., /run/grobid/tmp, which we configured on Linux to be a ramdisk. In newer versions of GROBID, it looks like this doesn't work any more, due to this change: c8e11b8#diff-65f7e37a114e9b9339efbb8ec03c4b19aec2f6998f127d539b6a07b01aa9b303L360-R362
E.g., if we configure this via YAML, then I can see GROBID writing PDF files to: /srv/grobid/grobid-service-0.7.0-131-gdd0251d9f/grobid-home/run/grobid/tmp/origin2651762335153943539.pdf (i.e., the setting is treated as a relative path under grobid-home, not as an absolute path).
I don't know the Java APIs well enough to recommend an alternative function to use, but it seems like it should be possible to use grobid-home as a prefix for relative paths, but fall back and allow absolute paths if the grobid.temp variable is an absolute path.
Separately, I can also see files like /tmp/MIME2368838021331894851.tmp getting written, and it seems like the GROBID Java process is writing these. I think this is due to Jersey? I vaguely remember being able to control the location these get written to using the TMPDIR UNIX environment variable in the past, but that doesn't seem to be working. It would be great to be able to control this location, or just have it be the same as grobid.temp.
A workaround for the first issue (absolute paths not possible) is to create a symlink to the location I want. I can't think of a way to do that with the second problem without having the entire /tmp directory be a ramdisk or symlink, which could have other unintended consequences.
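On the /tmp/MIME*.tmp files: on Linux, the JVM normally takes its default temporary directory from the `java.io.tmpdir` system property (usually /tmp) rather than from the TMPDIR environment variable. The sketch below is only a way to check which directory a given JVM will use for default temp files. It assumes, without having verified it against GROBID or Jersey, that the multipart handling relies on that default location; the class `TmpDirCheck` and the probe file are hypothetical.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical diagnostic for illustration only; not part of GROBID or Jersey.
class TmpDirCheck {

    // Creates a temp file in the JVM's default temp directory, using the same
    // prefix and suffix as the files observed in /tmp.
    static File createProbe() throws IOException {
        File probe = File.createTempFile("MIME", ".tmp");
        probe.deleteOnExit();
        return probe;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("TMPDIR         = " + System.getenv("TMPDIR"));
        System.out.println("probe file     = " + createProbe().getAbsolutePath());
    }
}
```

Running this once normally and once with `-Djava.io.tmpdir=/run/grobid/tmp` on the `java` command line should show whether the probe file follows the property. If it does, passing the same option to the GROBID service JVM may move the MIME*.tmp files as well, provided the assumption above holds.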