docx conversion on debian does not work | Libreoffice crash #784
Comments
Hello @AlexKvrlp, did you provide enough CPU/memory to your Gotenberg instance?
Hi @gulien, I think so. There is no limitation for the container.
Edit:
Edit for clarity: The system works, but randomly fails (many times a day on v8 vs. almost never on v7.4.2). Hi, I have been busy trying different combinations of settings while trying not to post an issue, but after seeing this, I'm tossing in a +1. I had two instances running on 7.4.2 with the default settings. I MIGHT see a single 503 error each day, and most of the time, no errors at all. I couldn't use any version past 7.4.2 because I would get many more 503 errors. Anyway, I just upgraded to v8 thinking that the old issue was probably resolved by now, but no. I would like to avoid going back to 7.4.2, but if that's what ends up happening, I'll be OK with it. If you need some logs, what's the best way to go about getting those? Thanks!
I don't think our two issues have the same cause. In my case it runs perfectly on system A but fails completely on system B. Edit:
In that case, sorry to hijack the thread. However, if you would humor me (just in case), maybe try out the version I mentioned? If it still doesn't work, I'll remove my comments to clean out the clutter.
Thanks a lot @JocoLabs! 7.4.2 works! Again: THANKS
I'm sorry. The above information that it fails on version 7 was not correct.
Hey guys, looks like your issues are somehow related to: #763. TL;DR:
What's weird is that on my end, I have zero issues, locally or on my demo instance. The issue does not seem really common, otherwise a lot of people would complain about it 🤔 Anyway, could you try:
FROM gotenberg/gotenberg:8.0.2
USER root
# Remove the bundled LibreOffice, then reinstall it from the image's configured apt sources.
RUN DEBIAN_FRONTEND=noninteractive apt-get remove -y -qq libreoffice && \
    DEBIAN_FRONTEND=noninteractive apt-get autoremove -y -qq && \
    apt-get update -qq && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y -qq --no-install-recommends libreoffice && \
    libreoffice --version
# Drop back to the unprivileged user.
USER gotenberg
Here it will replace the existing LibreOffice version (from Could you test if it works better in your case?
Sorry, but I failed to build the image. Here is the output:
@AlexKvrlp I've updated the Dockerfile, could you try again?
Thanks for the Dockerfile, I'll give that a shot. I knew something changed between the versions, I just wasn't sure which part (and I was trying to avoid creating an issue if no one else was). Lastly, reading over other issues, it almost seems like a race condition with the locks, GC, or something else. As I mentioned, even with API and LO startup timeouts at 500s, still no go. My guess is the race condition hits and it's stuck in a deadlock until the timeout catches it (the timeout hit is always the API timeout, never startup, in my experience). Anyway, thanks for this. I will test when I can, as the only way for me is to put it into production to mirror the environment that causes the issue (don't wait up for me).
A race condition would cause a « go » panic AFAIK 🤔. Regarding the lock, that's a possibility, but it should happen way more often IMO, or the conditions are rare. To clarify, does it happen only on startup? Or on restart? Or on conversion requests?
Also, do you have the complete error message?
@AlexKvrlp it looks like your error is happening on LibreOffice's first start:
No idea why, but I wonder if it is related to the LibreOffice version.
I just reworked the logging to warn, as well as tweaked the timeout on each instance so I can see which one broke. Lastly, of my four instances, two are running the vanilla version, and two are running the version using the older LO as per the Dockerfile above. I'll keep you posted.
Many thanks for your help!! Sadly the build failed again:
@AlexKvrlp @JocoLabs I've pushed the image
Thanks, I fired up gulnap/gotenberg:libreoffice-bookworm. Unfortunately no improvement. :-/
Are you using the same version of Docker locally and on your production server? In my case it is working as expected:
No idea what's happening there.
OK, that could be the problem. For development I'm using 24.0.6 too. Production is on 19.03.12. But does the Docker version matter?
🤷‍♂️ I don't know, to be honest. But your local setup and mine are using a Docker version that is a few major releases away from your production version. I'm not familiar with the Docker versioning system, but I guess it might have some impact.
OK, I'll instruct our server providers to update Docker to 24.0.6. I'll let you know if it helped. But it could take a few days until I get a response. Thanks again
@gulien I'm very sorry, but I'm only now seeing that you haven't made any conversion here. The error always only occurs for me when I convert a Docx file or start the container with the argument --libreoffice-auto-start. I will report back once the Docker upgrade is done.
Yes but I've started Gotenberg with the argument
Oh yes, sorry missed
I got a quick response from our server provider. We are using Debian 9. The highest possible Docker version is 19.03.15.
Hi, In all instances (using your backport, and vanilla), the logged error is I did see something on SO about someone testing routes in Go, and if too many came in, it would throw that error almost instantly. It seemed related to the context timeout being tied to ALL of them, and not just a single one. During the right times of day, it would be possible for me to send enough docs to the API that it might just be hitting that 90s for all in the queue (and not 90 per item). I don't know Go well enough to see if that is the case with your code. Lastly, I can fall back to 7.4.2, I don't want to take up too much time if I'm an edge case. EDIT
@JocoLabs are you using the LibreOffice module in « stateless » mode in version 7.4.2 (i.e., a dedicated LibreOffice instance per conversion)? In the current version, Gotenberg has only one LibreOffice instance running and a lock mechanism to ensure that one and only one conversion is done at a time. If there are a lot of incoming requests, some of them may not be able to acquire the lock for a conversion before timing out. You can check out the queue size of LibreOffice with https://gotenberg.dev/docs/routes#metrics-route. I'd suggest increasing your number of Gotenberg instances to mitigate this issue.
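To make the lock-plus-timeout effect concrete, here is a minimal back-of-the-envelope model in Python. It is not Gotenberg's code: it simply assumes a burst of requests arriving at the same moment behind a single converter, with each request's timeout covering both the wait for the lock and the conversion itself. The function name and the 12-second conversion time are hypothetical; the 90-second timeout is the figure mentioned earlier in the thread.

```python
def burst_outcomes(n_requests, conversion_s, timeout_s):
    """Model n_requests arriving at t=0 behind a single converter.

    Each request's deadline (timeout_s) covers waiting plus converting.
    Returns a list of "ok" / "timeout" strings in queue order.
    """
    outcomes = []
    clock = 0.0  # time at which the single converter becomes free
    for _ in range(n_requests):
        finish = clock + conversion_s
        if finish <= timeout_s:
            outcomes.append("ok")
            clock = finish
        else:
            # The deadline passes while waiting or in mid-conversion.
            outcomes.append("timeout")
            clock = max(clock, timeout_s)
    return outcomes


# Hypothetical numbers: 10 documents at once, 12 s each, 90 s timeout.
results = burst_outcomes(10, 12.0, 90.0)
print(results.count("ok"), "ok,", results.count("timeout"), "timed out")
# prints "7 ok, 3 timed out"
```

Under these assumptions, only the requests at the back of the queue fail, and how many fail depends on the burst size rather than on any single document. Splitting the same burst across more instances shortens each queue, which is in line with the suggestion above.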
@gulien I was unaware of how LO was running in any version. I just had it running with out-of-the-box settings, and saw next to zero issues. If 7.4.2 was running stateless, and 8.x is stateful, is there a flag to make it stateless in 8.x? I am doing a final trial of four instances with timeouts at 500s. If I see enough failures, I will just roll back. Either way, thanks again for the effort with this. EDIT:
No stateless mode now. The main difference is that a unitary conversion is a lot faster. Also, instead of infinitely scaling LibreOffice instances inside a container, with the risk of resource starving, it is now up to your infrastructure to handle the scaling of Gotenberg instances. My point being it is now easier to define a strategy that fits your needs, because most Docker orchestrators know how to scale containers up and down (and it's cheap).
I understand your reasoning. Thanks for the input. I will just roll back because each of my servers has 64 cores and 256 GB of memory at my disposal. I can just tune Docker to have no limits, which is much easier than configuring a whole orchestration setup (which will end up using the same amount of resources anyway). Thanks again, and hopefully AlexKvrlp gets his stuff fixed up. EDIT:
Dumb question (because I have limited knowledge on the topic). Thinking about your auto scaling comments: if auto scaling is based on resource usage, and you only run a single instance of LO doing one conversion at a time (managing your own internal work queue), what would actually trigger the orchestrator to start new instances? Lastly, because I'm getting off topic now, if there is further discussion here (maybe for others who are silent but having a similar issue), would it be helpful to start a new issue for this topic (scaling)?
I mean, in your case, you may just spin up more instances by default. Best option is to check que |
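One possible signal for scaling, instead of CPU or memory, is the LibreOffice queue size exposed by the metrics route linked earlier. The sketch below only shows how such a value could be read from Prometheus-style text output in Python. The metric name `gotenberg_libreoffice_requests_queue_size`, the sample payload, and the threshold helper are assumptions for illustration; check the actual output of your own instance before relying on any of them.

```python
# Assumed metric name; verify it against your instance's metrics output.
QUEUE_METRIC = "gotenberg_libreoffice_requests_queue_size"

# Hypothetical payload in the Prometheus text exposition format.
SAMPLE = """\
# HELP gotenberg_libreoffice_requests_queue_size Current number of queued requests.
# TYPE gotenberg_libreoffice_requests_queue_size gauge
gotenberg_libreoffice_requests_queue_size 3
"""


def read_gauge(metrics_text, metric_name):
    """Return the value of the first sample named metric_name, or None.

    Simplification: assumes label values contain no spaces.
    """
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, rest = line.partition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name == metric_name and rest:
            return float(rest.split()[0])
    return None


def needs_more_instances(queue_size, max_queue):
    """Hypothetical rule: scale out when the queue exceeds max_queue."""
    return queue_size is not None and queue_size > max_queue


queue = read_gauge(SAMPLE, QUEUE_METRIC)
print(queue, needs_more_instances(queue, max_queue=2))
```

In practice the text would come from an HTTP request to the metrics route rather than from a constant, and the threshold would be tuned to typical conversion times and the configured API timeout.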
I have first results:
So this error occurs only on Debian 9 (stretch). I will try to find a workaround and will report it here, if there is any. If not, it's OK for me too. We will upgrade the production OS asap. Maybe a note in the documentation would be an option.
Thanks for the details @AlexKvrlp 👍
Documentation has been updated 👍
Hi,
thank you for such a great tool! It works fine in my development environment (Ubuntu on WSL and Ubuntu).
But on our production system I only get an "internal Server Error".
Our production server runs Debian with the 4.9.0.19-amd64 kernel. The Docker version is 19.03.12.
Following the documentation, I checked the resources, but there is no limitation. I think 32 GB of memory should be enough. ;-)
So I tried to track down the problem. If I start the container with --libreoffice-auto-start enabled, the container doesn't start.
There are some tickets that suggest increasing --libreoffice-start-timeout. I increased it up to 400 seconds, but still no luck.
Next I checked the LibreOffice version with
docker exec -it gotenberg libreoffice --version
and got
ERROR 4 forking process
To me, it seems that LibreOffice is unable to start. That's why no conversion is possible. But how do I fix it?
At the moment I'm out of ideas for how to get Gotenberg running.
I would really appreciate any hints. :-)