Sometimes an application fails partway through processing, because of OOM or some other reason, and all page batches and image descriptions processed so far are lost.
A few issues arise from this:
you have to start processing from scratch and hope it does not fail this time.
if you use a remote VLM provider for image descriptions, you spend tokens and money reprocessing the same data over and over again.
if you are resource constrained, you risk never processing some documents at all because of repeated failures. It's all or nothing.
It would be great if there was an option to enable caching of intermediate results of page processing and image descriptions.
For example, diskcache would be a good fit for this task.
Reasons to implement caching:
more reliable processing on resource-constrained systems.
resume from the last failure point, so that large documents can eventually be processed after a few restarts.
spend less time, money and compute on processing the same data over and over again.
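To make the request more concrete, here is a rough sketch of the resume-from-cache pattern I have in mind. It is only an illustration under assumptions: `process_document`, `batch_key`, `open_cache` and the `process_batch` callback are hypothetical names, not part of this project's API, and keying batches by a hash of the document bytes is just one possible choice. It uses `diskcache.Cache` when the package is installed and falls back to the standard library's `shelve` otherwise, so the sketch runs either way.

```python
import hashlib
import os
import tempfile

try:
    # diskcache is a third-party package (pip install diskcache).
    from diskcache import Cache

    def open_cache(directory):
        return Cache(directory)
except ImportError:
    # Stdlib fallback so the sketch still runs without diskcache.
    import shelve

    def open_cache(directory):
        os.makedirs(directory, exist_ok=True)
        return shelve.open(os.path.join(directory, "results"))


def batch_key(doc_bytes, batch_index):
    """Key a batch by a hash of the document, so a changed file gets new keys."""
    digest = hashlib.sha256(doc_bytes).hexdigest()
    return f"{digest}:batch:{batch_index}"


def process_document(doc_bytes, num_batches, process_batch, cache_dir):
    """Process every batch, reusing results stored by an earlier run."""
    results = []
    with open_cache(cache_dir) as cache:
        for i in range(num_batches):
            key = batch_key(doc_bytes, i)
            if key in cache:
                results.append(cache[key])  # already done, skip the work
                continue
            value = process_batch(doc_bytes, i)  # may raise, e.g. on OOM
            cache[key] = value  # stored before moving on to the next batch
            results.append(value)
    return results


# Hypothetical batch processor that fails once on batch 2 to simulate a crash.
calls = []
state = {"fail_once": True}

def flaky_batch(doc_bytes, i):
    calls.append(i)
    if i == 2 and state["fail_once"]:
        state["fail_once"] = False
        raise MemoryError("simulated OOM")
    return f"batch {i} text"

cache_dir = tempfile.mkdtemp()
doc = b"example document bytes"
try:
    process_document(doc, 4, flaky_batch, cache_dir)
except MemoryError:
    pass  # the first run died at batch 2

# The second run reuses batches 0 and 1 and only recomputes 2 and 3.
results = process_document(doc, 4, flaky_batch, cache_dir)
```

The same pattern would apply to image descriptions, with the key derived from the image bytes plus whatever prompt and model settings affect the output, so that repeated runs do not pay the VLM provider again for identical inputs.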