unoconv not running with apache, but runs fine with root #87

Closed
noego opened this issue Sep 12, 2012 · 59 comments

Comments

@noego

noego commented Sep 12, 2012

I traced the error to this bit of the code:

        try:
            print >>sys.stderr, "lalala %s", of.basepath
            import uno, unohelper
            print >>sys.stderr, "lelele"
            office = of
            break
        except:
#            debug_office()
            print >>sys.stderr, "unoconv: Cannot find a suitable pyuno library and python binary combination in %s" % of
            print >>sys.stderr, "ERROR:", sys.exc_info()[1]
            print >>sys.stderr

When running as root I get this:

# /usr/bin/unoconv -f pdf -o "/var/www/tmp/test2" "/var/www/tmp/test1"
lalala %s /opt/libreoffice3.6
lelele

Then, running the same command through a web page served by Apache, I get this:

lalala %s /opt/libreoffice3.6
unoconv: Cannot find a suitable pyuno library and python binary combination in /opt/libreoffice3.6
ERROR: Error during bootstrapping uno (RuntimeException):cannot open file:///root/.ure/types.rdb: 13
lalala %s /opt/libreoffice3.6
unoconv: Cannot find a suitable pyuno library and python binary combination in /opt/libreoffice3.6
ERROR: Error during bootstrapping uno (RuntimeException):cannot open file:///root/.ure/types.rdb: 13

So basically the try bit fails with apache but works as root. I suppose it might be a permissions issue, or something in the environment, but I haven't been able to find any info.

Thanks for your attention!

lalala %s /opt/libreoffice3.6
unoconv: Cannot find a suitable pyuno library and python binary combination in /opt/libreoffice3.6
ERROR: Error during bootstrapping uno (RuntimeException):cannot open file:///root/.ure/types.rdb: 13

unoconv: Cannot find a suitable office installation on your system.
ERROR: Please locate your office installation and send your feedback to:
http://github.com/dagwieers/unoconv/issue

@dagwieers
Member

From the exception you can see it tries to open /root/.ure/types.rdb. So the first question is why ?
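
For anyone reading along: the trailing 13 in that message looks like an errno value, and errno 13 on Linux is EACCES (permission denied). The path suggests the process believes its HOME is /root. A minimal Python sketch to reproduce the lookup from inside the failing process follows; the helper name `probe_read` is made up for illustration, and the assumption that the path is built from HOME is only inferred from the error message.

```python
import errno
import os


def probe_read(path):
    """Try to open path for reading; return None on success or the errno name on failure."""
    try:
        with open(path, "rb"):
            return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


# Rebuild the path from the error message using whatever HOME this process sees.
# (Assumption: the lookup is derived from HOME, as the /root prefix suggests.)
candidate = os.path.join(os.environ.get("HOME", ""), ".ure", "types.rdb")
print("HOME =", os.environ.get("HOME"))
print(candidate, "->", probe_read(candidate))
```

A result of ENOENT would only mean the file is absent, whereas EACCES would mean the process cannot even look inside that HOME, which matches the error shown in the report.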

@noego
Author

noego commented Sep 13, 2012

I have no clue. From a grep, the one giving the error is /opt/libreoffice3.6/program/libpyuno.so

This is with RHEL6 and libreoffice3.6, though it also happens with 3.5. Maybe it's some environment variable?

@dagwieers
Member

Please see this error report: http://code.google.com/p/archivematica/issues/detail?id=961

Is it possible you have more than one instance running ? Can you use flock in order to avoid running more than one instance ? Can you make sure no other LibreOffice instance is running ?

@noego
Author

noego commented Sep 13, 2012

Just one office running. Just one soffice binary on the entire HD. Thanks for the help, by the way, I appreciate it. I'm messing around with paths now to see if I can get something to stick.

@dagwieers
Member

@dagwieers
Member

And possibly more relevant: https://bugs.freedesktop.org/show_bug.cgi?id=50123

Might be an installation problem. What distribution or version ? Are you using the latest unoconv v0.6 ?

@noego
Author

noego commented Sep 13, 2012

Using 0.6, but same thing happens with 0.5. Checking the link for ideas, thanks again! Also, using RHEL 6 with libreoffice from the libreoffice rpms.

@dagwieers
Member

Ok, this is the same environment as me. I am looking into it by doing:

[root@moria ~]# sudo -u apache unoconv -vvvv -f pdf /tmp/document-example.odt 
Verbosity set to level 4
Using office base path: /usr/lib64/libreoffice
Using office binary path: /usr/lib64/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib64/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=5659)
DEBUG: Process /usr/lib64/libreoffice/program/soffice.bin (pid=5659) exited with 77.
Error: Unable to connect or start own listener. Aborting.

If I force a specific LibreOffice 3.6, I get:

[root@moria ~]# UNO_PATH=/opt/libreoffice3.6 sudo -E -u apache unoconv -vvvv -f pdf /tmp/document-example.odt 
Verbosity set to level 4
Using office base path: /opt/libreoffice3.6
Using office binary path: /opt/libreoffice3.6/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /opt/libreoffice3.6/program/soffice.bin.
LibreOffice listener successfully started. (pid=5721)
terminate called after throwing an instance of 'com::sun::star::uno::RuntimeException'
DEBUG: Process /opt/libreoffice3.6/program/soffice.bin (pid=5721) exited with -6.
Error: Unable to connect or start own listener. Aborting.

@dagwieers
Member

Doing the same with a real user, it does work:

[root@moria ~]# sudo -E -H -u dag unoconv -vvvv -f pdf /tmp/document-example.odt 
Verbosity set to level 4
Using office base path: /usr/lib64/libreoffice
Using office binary path: /usr/lib64/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib64/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=5843)
Input file: /tmp/document-example.odt
Selected output format: Portable Document Format [.pdf]
Selected office filter: writer_pdf_Export
Used doctype: document
Output file: /tmp/document-example.pdf
DEBUG: Terminating LibreOffice instance.
DEBUG: Waiting for LibreOffice instance to exit.

The difference in environment is:

[root@moria Downloads]# diff <(sudo -E -H -u apache env) <(sudo -E -H -u dag env)
8c8
< USER=apache
---
> USER=dag
19,20c19,20
< HOME=/var/www
< LOGNAME=apache
---
> HOME=/home/dag
> LOGNAME=dag
28c28
< USERNAME=apache
---
> USERNAME=dag

@noego
Author

noego commented Sep 13, 2012

For me, it works with the apache user:

UNO_PATH=/opt/libreoffice3.6 sudo -E -u apache unoconv -vvvv -f pdf -o "/var/www/tmp/test2" "/var/www/tmp/test1"
/opt/libreoffice3.6
lalala %s /opt/libreoffice3.6
lelele
Verbosity set to level 4
Using office base path: /opt/libreoffice3.6
Using office binary path: /opt/libreoffice3.6/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
Input file: /var/www/tmp/test1
Selected output format: Portable Document Format [.pdf]
Selected office filter: writer_pdf_Export
Used doctype: document
Output file: /var/www/tmp/test2

The problem is when running it from -inside- apache.

@noego
Author

noego commented Sep 13, 2012

In fact, it's a problem with either the PHP or Apache I compiled. I couldn't use the RHEL6 version because the SOAP lib doesn't work properly and had to roll out my own version.

The distro installed version actually works (have it running on a separate port), so what I'm going to do is make the original version do the unoconv processing, while the custom one I built does the rest. Stop gap until I can figure this specific problem out.

@noego
Author

noego commented Sep 13, 2012

Aaaaand... somehow it fixed itself. I don't know what I did, but somehow it just started working.

@dagwieers
Member

I noticed that I had to change the apache user's shell to something other than /sbin/nologin to make it work through sudo. I don't know if you had to do something similar. I would be interested to learn what modifications you had to make so that unoconv works through Apache. We have had several people reporting problems, and one person already described it as part of the documentation. Your input could be vital to help others...

@noego
Author

noego commented Sep 14, 2012

I wish I could shed some light on it, but the truth is that I don't know what I did. The samba user still has the nologin bit.

The last things I remember doing were:

  1. Uninstalled libreoffice 3.6.1
  2. Installed whatever version of libreoffice yum offered
  3. Uninstalled that version because it didn't work
  4. Installed libreoffice 3.6.1 again

I was in the process of looking through the links you gave me to check when someone else told me "it works"

But I might have done more, IDK.

Oh, one thing I did which _MIGHT_ have something to do with it is change /etc/environment

@noego
Author

noego commented Sep 14, 2012

(Continued from above, hit the comment button by accident)

In /etc/environment I added this line at some point soon before the thing started working:

UNO_PATH=/opt/libreoffice3.6

@noego
Author

noego commented Sep 14, 2012

It -may- also have to do with having a unoconv -l running. I killed that process before things started working also.

@dagwieers
Member

@noego These are all good suggestions for the next person to test. Over time I am confident we will find it.

The unoconv -l looks very suspicious to me as it is a fact that LibreOffice does not like multiple instances running at the same time. Random errors may occur, or it may fail all the time. So you should serialize the conversion requests by using something like flock rather than have multiple requests end up on the running instances at the same time.
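
A minimal sketch of that kind of serialization in Python, assuming the conversions are launched from a Python process on Linux: every caller takes an exclusive `fcntl.flock` lock on a shared lock file before starting its conversion. The helper name `run_serialized` and the lock path `/tmp/unoconv.lock` are made up for illustration and are not part of unoconv.

```python
import fcntl
import subprocess


def run_serialized(cmd, lock_path="/tmp/unoconv.lock"):
    """Run cmd while holding an exclusive lock, so only one conversion runs at a time."""
    with open(lock_path, "w") as lock_file:
        # Blocks here until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            return subprocess.call(cmd)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Called as `run_serialized(["unoconv", "-f", "pdf", "/tmp/document-example.odt"])`, a second request simply waits until the first one has finished instead of hitting the same office instance concurrently.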

@dagwieers
Member

BTW Thanks for sharing your insights !

@kaplun
Contributor

kaplun commented Sep 17, 2012

Hi, I don't know if this might help, but we are also using (a patched) unoconv to run Open/LibreOffice on SLC5/6, triggered from Apache (mod_wsgi). Additionally, we are running unoconv (and that's why we patched it) as the user nobody in a dedicated home directory, so that LibreOffice can't possibly mess with Apache's files.

We found that we need to explicitly pass a HOME environment variable through unoconv to LibreOffice, pointing to a directory where LibreOffice has write permission (in our case, as the user nobody). Without this HOME variable getting through the chain of sudo calls, unoconv and LibreOffice, we were also seeing the original error message you're mentioning.
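
As a rough illustration of passing HOME explicitly (not the actual Invenio patch, which is linked later in the thread), a caller written in Python could copy its environment, override HOME, and hand the result to the child process. The helper name `run_with_home` is hypothetical.

```python
import os
import subprocess


def run_with_home(cmd, home):
    """Run cmd with HOME forced to a directory the child process can write to."""
    if not os.access(home, os.W_OK | os.X_OK):
        raise PermissionError("HOME candidate is not writable: %s" % home)
    env = dict(os.environ)  # copy, so the parent's own environment is untouched
    env["HOME"] = home
    return subprocess.call(cmd, env=env)
```

Setting HOME on the `env` passed to `subprocess.call` keeps the override local to the conversion, so the web application itself keeps whatever HOME it started with.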

@dagwieers
Member

@kaplun I'd like to hear any recommendations you have for making this part of unoconv. In what way did you patch this, and is there something we can improve to accommodate this kind of use? I would expect setting HOME before activation should work, but I could be wrong too ;-)

@kaplun
Contributor

kaplun commented Sep 28, 2012

Hi @dagwieers!
In inveniosoftware/invenio@86fae6f you can find the basic patch that we apply to unoconv in order to run it within our project.

What we are trying to achieve is to be able to run unoconv from Apache, but with the privileges of the "nobody" user, manipulating files in a restricted dedicated directory. It's a sort of sandboxed execution (e.g. to prevent malicious macros etc.).

For this reason we advise our users to set up the sudoers file to allow the Apache user to run our customized unoconv as the user nobody.

In the unoconv script we enforce a given directory as the $HOME directory for LibreOffice with rights only to the user nobody.

This happens in line inveniosoftware/invenio@86fae6f#L0R275

but is also replicated at:
inveniosoftware/invenio@86fae6f#L0R1146

in order to be sure that in every condition LibreOffice receives the correct HOME environment variable.

I am not sure if setting HOME in both places is actually needed, but since we started doing it we are no longer experiencing the above mentioned error.

Additionally, our customization includes a way to kill a LibreOffice listener, but we are not yet satisfied enough with the current implementation to propose it for pulling.

@dagwieers
Member

Looking at your patch I wonder:

  • Is there anything we could do to unoconv to accommodate your usage (e.g. add an option to set HOME) ?
  • Is there a way to detect when HOME is not acceptable? Maybe we can be smarter about it: test for the erroneous condition and default to an acceptable HOME in /tmp or /dev/shm instead.
  • What is the reason for setting SelectPdfVersion=1 ?
  • Do you know that this can also be set using -e SelectPdfVersion=1 on the command line ?
  • Is .pdfa an accepted standard extension ? I guess we could add it as a known export format (and set the PDF/A option automatically in this case).
  • Do you have any reproducible cases where unoconv does not exit cleanly and leaves stale office instances? We have to report these upstream in order to make the software more stable and more mature.

Anticipating your reply, sincerely ;-)

@dagwieers dagwieers reopened this Sep 28, 2012
@kaplun
Contributor

kaplun commented Oct 5, 2012

Hi Dag,

On Friday 28 September 2012 at 10:19:57, Dag Wieers wrote:

  • Is there anything we could do to unoconv to accommodate your usage (e.g.
    add an option to set HOME) ?

Well, that would be awesome.

  • Is there a way to detect when HOME is not acceptable? Maybe we can be
    smarter about it: test for the erroneous condition and default to an
    acceptable HOME in /tmp or /dev/shm instead.

In our case we really would like to control HOME, because, since we are
running unoconv/libreoffice as the user nobody, we provide them with a HOME
where only nobody can read/write. In our software we further retrieve the
result of document conversion from this HOME directory and move it elsewhere
with appropriate rights.

  • What is the reason for setting SelectPdfVersion=1 ?

Simply to enforce PDF/A. Our software aims to be used to implement digital
archives, so we convert uploaded documents to formats designed for digital
preservation (such as PDF/A) or at least to open formats (such as .odt etc.).

  • Do you know that this can also be set using -e SelectPdfVersion=1 on the
    command line ?

Well, I wasn't sure about it. Indeed this should be the preferred way. I will
change our code to use the CLI rather than hardcoding the flag.

Is .pdfa an accepted standard extension ?

Unfortunately I believe not.

  • Do you have any reproducible cases where unoconv does not exit cleanly
    and leaves stale office instances? We have to report these upstream in
    order to make the software more stable and more mature.

Ouch. That is kind of difficult for me to reproduce. Also, I often test in a
production environment, where we take whatever version of Open/LibreOffice is
provided with RHEL5 and RHEL6. Often these are already superseded releases...

Thanks a lot for all your time in creating and maintaining this precious piece
of software!

Samuele 

Samuele Kaplun
Invenio Developer ** http://invenio-software.org/

@s0600204

Hi, I've put something together over in s0600204@8bb1e9a that might be of interest - an attempt to add the option to specify HOME within the passed arguments.

I've tested this on CentOS 64-bit, with LibreOffice 3.4.5 and 3.6.1.2, and it works - kind of.

With both versions of LibreOffice, the provided folder is used as HOME without problems - as long as it has a LibreOffice profile folder within it. If it doesn't, LibreOffice (both versions) creates the profile folder and then exits, returning code 81.

If I understand the flow of this bug report: https://bugs.freedesktop.org/show_bug.cgi?id=43989, this not only appears to be expected behaviour, but happens whenever LibreOffice is freshly installed on a computer. However, normally soffice.bin is called from within something else (soffice.exe on windows, soffice on linux) that catches it and restarts LibreOffice automatically.

As unoconv calls soffice.bin directly, perhaps unoconv needs to start parsing the exit code so that should it receive this one, it knows to restart LibreOffice. Thoughts?
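
To make the idea concrete, here is a minimal Python sketch of restarting a process when it exits with that code. It is only an illustration under stated assumptions: the value 81 is the one observed above, the helper name `run_with_restart` is hypothetical, and unoconv's real listener start-up does not necessarily wait for the process like this.

```python
import subprocess

# Exit code observed in this thread right after the profile folder is first created.
RESTART_EXIT_CODE = 81


def run_with_restart(cmd, max_restarts=1, timeout=None):
    """Run cmd; if it exits with RESTART_EXIT_CODE, start it again, up to max_restarts times."""
    attempts = 0
    while True:
        code = subprocess.call(cmd, timeout=timeout)
        if code != RESTART_EXIT_CODE or attempts >= max_restarts:
            return code
        attempts += 1
```

Capping the number of restarts avoids looping forever if the profile folder can never be created, for example because HOME is not writable.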

It should also be noted that the different versions of LibreOffice take different times to create the profile folder and exit. On my test system, version 3.4.5 takes only a second or two, whilst version 3.6 takes over 8 seconds, causing unoconv to time out unless a longer value is set.

@rucomes

rucomes commented Nov 8, 2012

Just in case this saves someone's day.

After spending hours trying to fix this, we realized that $HOME env value inside apache/php was /root.
It appears this has something to do with the way apache switches itself to another user.

Adding this line (or another useful path for your case) will make the difference:

putenv('HOME=/home/apache/');

@s0600204

s0600204 commented Nov 8, 2012

rucomes,

The reason why Apache is using /root as its HOME env value is quite possibly because Apache is running as root (or attempting to do so) on your system.

I'm no expert but, if I were you, I'd seriously look into this.

@intellisense

Please any help in this regard? Also I have installed unoconv using apt-get.

@sockmonk

Anything interesting in the unoconv log files?

@intellisense

@sockmonk Which log files are you talking about? The listener is running fine; the problem is that I just can't connect to it. As you can see above, I have set the verbosity level to 4. Are there any other log files that unoconv maintains internally?

@sockmonk

I'm talking about the log files whose paths are in the supervisor conf file
from your first email.

@intellisense

@sockmonk the log files are empty. As I understand it, the unoconv log files configured in supervisor are for the listener, so how could they be populated when something goes wrong while connecting to the listener?

@intellisense

The command netstat -na|grep 2002 does not show anything. It means that, running as www-data, soffice.bin silently failed to bind to port 2002, although the listener is running, as you can see in the ps aux result above. Any hints now?
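
The same check can be scripted from the calling side. A minimal Python sketch, assuming the listener is expected on 127.0.0.1 port 2002 (the connection string shown in the debug output earlier in this thread); the helper name `listener_reachable` is made up for illustration.

```python
import socket


def listener_reachable(host="127.0.0.1", port=2002, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False while the soffice.bin process is alive, the process started but never opened its socket, which points at a start-up problem rather than a connection problem.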

@pataquets
Contributor

Had the same problem under Ubuntu Trusty+Apache2.4+PHP5.5.
Apache2.4 initscript unsets $HOME var. Also running under Docker, the best you can get is having an inherited $HOME var set to /root where Apache will not be able to write.
Solved by setting $HOME to /tmp before shell_exec'ing unoconv from PHP.
HTH.

@dagwieers
Member

First of all, I don't know if we fixed this issue with commit d9810b7 (issue #224).
(It certainly shouldn't bail out any longer when PATH is unset !)

If it isn't fixed, what would be the best solution for unoconv ?

  1. Have unoconv set HOME to the user's HOME when it is not set ?
  2. Add an option to unoconv (-E/--env) so the user can provide their own environment variables ?
  3. Let unoconv make a secure temporary directory in /tmp and use that instead as HOME ?

I can see advantages and disadvantages in each.

  1. This is what users are expecting right now, but does not allow flexibility and may not work out-of-the-box.
  2. This provides the most flexibility, but may not work out-of-the-box
  3. Will work out-of-the-box in the majority of cases, but does not allow flexibility. Do we want to use /tmp by default ?

A combination of 1-2 or 2-3 is possible as well.
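
To give an idea of what option 2 could look like, here is a small Python sketch using argparse. The -E/--env option is only the proposal from the list above, so treat it as hypothetical rather than an existing unoconv flag; `build_env` and `parse_args` are illustrative names as well.

```python
import argparse
import os


def build_env(pairs, base=None):
    """Merge KEY=VALUE strings into a copy of the base environment."""
    env = dict(os.environ if base is None else base)
    for pair in pairs or []:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("expected KEY=VALUE, got %r" % pair)
        env[key] = value
    return env


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="unoconv-sketch")
    # Hypothetical option: may be repeated, e.g. -E HOME=/tmp/uno -E LANG=C
    parser.add_argument("-E", "--env", action="append", default=[], metavar="KEY=VALUE")
    parser.add_argument("files", nargs="*")
    return parser.parse_args(argv)
```

The resulting dictionary would then be handed to whatever call launches the office process, leaving the environment of unoconv itself untouched.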

@pataquets
Contributor

Also, [temporary?] options are:

  • Document this somewhere
  • Warn with a descriptive error message if $HOME not set or not writable.

@s0600204

s0600204 commented Jul 5, 2015

I just ran a couple of tests locally (and repeated them to be sure): If memory serves, LibreOffice used to attempt to create $HOME/.libreoffice as a place to temporarily store the file it's converting (and set up some user defaults). It now uses $HOME/.config/libreoffice which I personally think is a better location.

Anyhow, I changed the folder permissions for the webserver's $HOME/.config directory so it could be read but not written to. LO created $HOME/libreoffice instead for its personal space (note the lack of the .). (Making ./.config writable again caused LO to create a ./libreoffice subfolder there, clearly showing a preference.)

I then erased the $HOME/.config directory and set the webserver's $HOME to be readable but not writable by said webserver (which is the default state of things for apache the last time I tried CentOS (6.3)). LO could not create its folder and the conversion failed. The last message I got from unoconv was LibreOffice listener successfully started and I only got the message once.

My conclusion is that this is not yet fixed, but we are very close now. Options 2+3 combined sound good. Should work out of the box, with flexibility if needed.

(LibreOffice 4.3.3, Lighttpd 1.4.35 and Python 2.7.9 on Debian 8, by the way)

@dagwieers
Member

dagwieers added a commit that referenced this issue Jul 8, 2015
Minor readme addition about HOME directory (#87)
@VInodKumar41287

Hi,
unoconv is not running under Apache when called from a Perl CGI script.
We are using Ubuntu and unoconv 0.6. It works in a terminal and the PDF file is created, but under Apache no file is created and no error is raised; the output is simply blank.
Can anyone please help us?
Thanks,

@hermannkm

hermannkm commented Jul 12, 2016

Having had the same problems, and after studying these discussions, I decided to try hardcoding HOME to /tmp at about line 1216 in /usr/bin/unoconv:

### Main entrance
    ...
    os.environ['HOME'] = '/tmp'

which made everything work instantly (calling unoconv with apache user via moodle 3.1).
Working with Ubuntu 16.04 LTS, unoconv installed via apt-get (0.7-1.1; again, after trying the latest github-clone which had the same problems).

@begincalendar

The workaround from @hermannkm also works for me (after experiencing this error from the master version), but I think a simpler solution (albeit a bit clunky) is to delete the /home/user/libreoffice directory, where "user" is the same user that you use to run unoconv.

That worked for me, but I haven't been able to work out why.
I did notice a .lock file in /home/user/libreoffice/4/ before I first deleted the libreoffice directory, but when I try reintroducing that file, libreoffice doesn't crash again.

@leafisme

On CentOS 7.3 with nginx and PHP via php-fpm, the environment seen by PHP is cleared by php-fpm.

You can use putenv to set PATH in the PHP code, for example:

putenv("PATH=/sbin:/bin:/usr/sbin:/usr/bin"); 
var_dump(shell_exec('unoconv -vvvv -f pdf -o 123.pdf 123.doc'));

Or you can set the environment variable in a one-line shell command:

var_dump(shell_exec('PATH=/sbin:/bin:/usr/sbin:/usr/bin'.' unoconv -vvvv -f pdf -o 123.pdf 123.doc'));

Or you can change php-fpm.d/www.conf so that the environment is passed through to PHP, by adding this line:

clear_env = no

and then restart php-fpm:

systemctl restart php-fpm.service

@eldy

eldy commented Oct 10, 2017

The command /usr/bin/unoconv --debug -vvvv -f pdf /tmp/document-example.odt was a success with a common user but failed with user www-data (user of apache).
I had to set the home dir /var/www to read/write to have the command working successfully.

@regebro
Member

regebro commented Nov 24, 2017

Is it not feasible to set $HOME to something writeable for the script that calls unoconv? I'm not sure unoconv is the place to fix this, it's really a configuration error.

But if we need to have unoconv work around this, making a temporary home seems the only solution. That seems both the wrong place and the wrong way to solve this, as we need to start and stop listeners and delete those directories a lot.

A check that $HOME exists and is writeable and an error message if it isn't would make sense though.
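
Such a check could be quite small. A minimal Python sketch, purely illustrative (the function name `check_home` and the message wording are made up, not taken from unoconv):

```python
import os


def check_home(environ=None):
    """Return None if HOME is usable, else a human-readable description of the problem."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME")
    if not home:
        return "HOME is not set"
    if not os.path.isdir(home):
        return "HOME (%s) does not exist or is not a directory" % home
    if not os.access(home, os.W_OK | os.X_OK):
        return "HOME (%s) is not writable by the current user" % home
    return None
```

A caller would print the returned message to stderr and exit early, instead of letting the office process fail later with a less obvious error.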

@castmetal

Hi, I resolved this with unotools. I hope this helps others: https://pypi.python.org/pypi/unotools

@spooky360

Hi,
I had the same problem: unoconv fails as the www-data user but works with sudo.
After several hours trying to figure it out, I tried to launch soffice as the www-data user.

javaldx "could not find a java runtime environment"

So my problem was coming from JRE and www-data user.
I've copied the ~/.config/libreoffice/4/user/config/javasettings_Linux_X86_64.xml file from my user to www-data home directory and it works now from apache.

(I'm sorry if i haven't posted this in the right thread, but the issue was the same I think)

s0600204 added a commit to s0600204/unoconv that referenced this issue Oct 16, 2020
When Open/LibreOffice runs, it looks for a folder containing user-specific
configuration in the invoking user's home directory. If it cannot find it,
Open/LibreOffice attempts to create it.

For users running `unoconv` manually, this isn't a problem.

For users attempting to run this on a webserver, the invoking user is the http
daemon (e.g. `apache`, `nginx`, `lighttpd`), and the home folder is typically
the webserver's root directory (e.g. `/srv/http`, `/var/www`).

Unfortunately, there appear to be some scenarios where this folder is read-only,
even for the daemon (see unoconv#87, unoconv#449). This leads to errors, as Open/LibreOffice
is unable to create its configuration folder.

(It is assumed that if the configuration folder already exists, it must be
usable, else how would it have been created?)

This commit solves the issue by detecting this, emitting a warning message, and
using a temporary folder as a temporary "Home".
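
The behaviour described in that commit message can be sketched roughly as follows. This is not the code from the referenced commit: it is a simplified Python illustration that only looks at whether HOME itself is writable, and the name `ensure_usable_home` is hypothetical.

```python
import os
import sys
import tempfile


def ensure_usable_home(environ, stream=sys.stderr):
    """Point HOME at a fresh temporary directory if the current one is not writable.

    Returns the temporary directory path if one was created, else None.
    """
    home = environ.get("HOME")
    if home and os.path.isdir(home) and os.access(home, os.W_OK | os.X_OK):
        return None
    # mkdtemp creates the directory readable and writable only by the current user.
    temp_home = tempfile.mkdtemp(prefix="unoconv-home-")
    stream.write("warning: HOME (%r) is not usable, using %s instead\n" % (home, temp_home))
    environ["HOME"] = temp_home
    return temp_home
```

The caller is responsible for removing the returned directory (for example with `shutil.rmtree`) once the office process has exited.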
@regebro regebro closed this as completed Nov 12, 2021