Uyuni 2022.03 configuration fails with "Tomcat failed to start properly or the installer ran out of tries." #5278

Closed
gabjef opened this issue Apr 25, 2022 · 15 comments
Labels
bug Something isn't working

Comments

@gabjef

gabjef commented Apr 25, 2022

Problem description

Failed Tomcat deployment during configuration of Uyuni 2022.03 on a greenfield openSUSE Leap 15.3 instance.

Version of Uyuni Server and Proxy (if used)

Repository     : uyuni-server-stable
Name           : Uyuni-Server-release
Version        : 2022.03-172.4.uyuni1
Arch           : x86_64
Vendor         : obs://build.opensuse.org/systemsmanagement:Uyuni
Support Level  : Level 3
Installed Size : 1.4 KiB
Installed      : Yes (automatically)
Status         : up-to-date
Source package : Uyuni-Server-release-2022.03-172.4.uyuni1.src
Summary        : Uyuni Server
Description    :
    Uyuni lets you efficiently manage physical, virtual,
    and cloud-based Linux systems. It provides automated and cost-effective
    configuration and software management, asset management, and system
    provisioning.
NAME="openSUSE Leap"
VERSION="15.3"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.3"
PRETTY_NAME="openSUSE Leap 15.3"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.3"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"

Not using Uyuni Proxy

Details about the issue

Ran YaST to set up the initial /root/setup_env.sh file, then ran mgr-setup as root:

uyuni-lab-linux-mgmt:~ # /usr/lib/susemanager/bin/mgr-setup -s
Asserting correct java version...
/var/spacewalk already exists. Leaving it untouched.
Filesystem type for /var/cache is xfs - ok.
Open needed firewall ports...
running
success
success
CREATE ROLE
* Loading answer file: /root/spacewalk-answers.
** Database: Setting up database connection for PostgreSQL backend.
** Database: Populating database.
** Database: --clear-db option used.  Clearing database.
** Database: Shutting down spacewalk services that may be using DB.
** Database: Services stopped.  Clearing DB.
*** Progress: ###############################
* Configuring tomcat.
* Setting up users and groups.
* Performing initial configuration.
* Configuring apache SSL virtual host.
** /etc/apache2/vhosts.d/vhost-ssl.conf has been backed up to vhost-ssl.conf-swsave
* Configuring jabberd.
* Creating SSL certificates.
** SSL: Generating CA certificate.
** SSL: Generating server certificate.
** Database: Setting up report database.
** Database: --clear-db option used.  Clearing report database.
*** Progress: #
** Database: Installation complete.
* Deploying configuration files.
* Update configuration in database.
* Setting up Cobbler..
* Restarting services.
Tomcat failed to start properly or the installer ran out of tries.  Please check /var/log/tomcat/catalina.out or /var/log/tomcat/catalina.$(date +%Y-%m-%d).log for errors.
ERROR: spacewalk-setup failed
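
The installer message points at the Tomcat logs. As a minimal sketch, assuming the default log location named in that message, the helper below prints only the SEVERE entries so the failing deployment is easy to spot. The function name `severe_lines` is hypothetical and not part of Uyuni or Tomcat.

```shell
# Print the SEVERE entries from a Tomcat log file.
# severe_lines is a hypothetical helper, not part of Uyuni or Tomcat.
severe_lines() {
    # $1: path to a catalina log file
    grep -F ' SEVERE ' "$1"
}

# Example (path taken from the installer message):
# severe_lines "/var/log/tomcat/catalina.$(date +%Y-%m-%d).log"
```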

/root/setup_env.sh:

export MANAGER_FORCE_INSTALL='0'
export ACTIVATE_SLP='n'
export MANAGER_ADMIN_EMAIL='xxxx.xxxxx@tierpoint.com'
export MANAGER_ENABLE_TFTP='y'
export MANAGER_IP='172.18.62.60'
export MANAGER_DB_PORT='5432'
export DB_BACKEND='postgresql'
export MANAGER_DB_HOST='localhost'
export MANAGER_DB_NAME='uyuni'
export MANAGER_DB_PROTOCOL='TCP'
export MANAGER_PASS='xxxxxxxxx'
export MANAGER_PASS2='xxxxxxxxx'
export MANAGER_USER='uyuni'
export LOCAL_DB='1'
export CERT_CITY='St. Louis'
export CERT_COUNTRY='US'
export CERT_EMAIL='xxxx.xxxxx@tierpoint.com'
export CERT_O='Tierpoint'
export CERT_OU='OSA'
export CERT_PASS='xxxxxxxxxxx'
export CERT_PASS2='xxxxxxxxxx'
export CERT_STATE='MO'

Uploaded all logs from /var/log/rhn in ZIP file.
uyuni-rhn-install-logs.zip

@gabjef gabjef added bug Something isn't working P5 labels Apr 25, 2022
@gabjef
Author

gabjef commented Apr 25, 2022

Attached Tomcat log:
catalina.2022-04-25.log

@DENISKI

DENISKI commented Apr 25, 2022

I get the same error after upgrading from 2022.02, with the same stack trace.
SEVERE [main] org.apache.catalina.startup.HostConfig.deployDirectory Error deploying web application directory [/srv/tomcat/webapps/rhn]

@juliogonzalez
Member

For the record, it was reported at Gitter that users are able to reproduce with an upgrade to 2022.03 when the IBM JVM is not present.

I am creating a VM with the problem reproduced using Uyuni Master/Develop (future 2022.04). I was already able to reproduce the problem locally. For whatever reason, the testsuite is not able to reproduce it.

Since users were able to upgrade to 2022.03 days ago without issue, I suspect this is something that changed on Leap 15.3. We'll need a Java developer to have a look tomorrow.

@juliogonzalez
Member

I was able to reproduce so far with:

  • Brand new 2022.03, locally, on VirtualBox
  • Brand new 2022.04 (master), locally, on VirtualBox
  • Brand new 2022.04 (master), at m251.mgr.suse.de

Failure at m251.mgr.suse.de:

image

At /var/log/tomcat/catalina.2022-04-26.log:

26-Apr-2022 23:46:59.658 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/srv/tomcat/webapps/rhn]
26-Apr-2022 23:47:05.468 SEVERE [main] org.apache.catalina.startup.HostConfig.deployDirectory Error deploying web application directory [/srv/tomcat/webapps/rhn]
        java.lang.IllegalStateException: Error starting child
                at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:720)
                at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:690)
                at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:705)
                at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1132)
                at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1865)
                at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
                at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
                at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
                at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
                at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:1044)
                at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:429)
                at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1575)
                at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:309)
                at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:123)
                at org.apache.catalina.util.LifecycleBase.setStateInternal(LifecycleBase.java:423)
                at org.apache.catalina.util.LifecycleBase.setState(LifecycleBase.java:366)
                at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:936)
                at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:841)
                at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
                at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384)
                at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374)
                at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
                at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
                at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140)
                at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:909)
                at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:262)
                at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
                at org.apache.catalina.core.StandardService.startInternal(StandardService.java:421)
                at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
                at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:930)
                at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
                at org.apache.catalina.startup.Catalina.start(Catalina.java:633)
                at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                at java.base/java.lang.reflect.Method.invoke(Method.java:566)
                at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:343)
                at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:474)
        Caused by: org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHost[localhost].StandardContext[/rhn]]
                at org.apache.catalina.util.LifecycleBase.handleSubClassException(LifecycleBase.java:440)
                at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:198)
                at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:717)
                ... 37 more
        Caused by: java.lang.NullPointerException
                at org.apache.tomcat.util.scan.StandardJarScanner.process(StandardJarScanner.java:382)
                at org.apache.tomcat.util.scan.StandardJarScanner.scan(StandardJarScanner.java:195)
                at org.apache.catalina.startup.ContextConfig.processJarsForWebFragments(ContextConfig.java:1971)
                at org.apache.catalina.startup.ContextConfig.webConfig(ContextConfig.java:1129)
                at org.apache.catalina.startup.ContextConfig.configureStart(ContextConfig.java:775)
                at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:301)
                at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:123)
                at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5044)
                at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
                ... 38 more
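
With a trace this long, the useful part is the chain of causes at the bottom. A small sketch that prints each "Caused by:" line together with the first stack frame under it; `cause_chain` is a hypothetical helper name, and the log path in the example is the one quoted in this comment.

```shell
# Print the exception chain of a Java stack trace: each "Caused by:" line,
# followed by the first stack frame beneath it.
# cause_chain is a hypothetical helper name.
cause_chain() {
    # $1: path to a catalina log file
    awk '
        /^[ \t]*Caused by:/ { print; want_frame = 1; next }
        want_frame && /^[ \t]*at / { print; want_frame = 0 }
    ' "$1"
}

# Example:
# cause_chain /var/log/tomcat/catalina.2022-04-26.log
```

For the trace in this comment, the last frame printed is the `StandardJarScanner.process` call that throws the NullPointerException.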

Java developers and QE pinged so they can have a look tomorrow ASAP at m251.mgr.suse.de. The VM has a snapshot with only Leap 15.3 installed and updated, in case a reinstall is needed, to check what's going on.

@nodeg @vandabarata, without a fix for this, we can't release 2022.04. It's a blocker.

@juliogonzalez juliogonzalez removed the P5 label Apr 26, 2022
@aaannz
Contributor

aaannz commented Apr 26, 2022

Is there any symlinked jar in /srv/tomcat/webapps/rhn directory? Looks really similar to https://bz.apache.org/bugzilla/show_bug.cgi?id=65397

@aaannz
Contributor

aaannz commented Apr 26, 2022

There is a patch for tomcat in related bug ( https://bz.apache.org/bugzilla/attachment.cgi?id=37944&action=diff ) @mbussolotto Can you take a look?

@juliogonzalez
Member

Not saying that's not the issue, but I wonder why it doesn't affect the testsuite.

In case it helps, here's the last change for tomcat9 at SLE15SP3 (propagated to Leap 15.3): https://build.opensuse.org/package/rdiff/SUSE:SLE-15-SP3:Update/tomcat?linkrev=base&rev=13

BTW, I asked @lumarel for the list of packages (he did the upgrade days ago, and it still worked for him), so we can compare it with the packages at m251 or (if needed) at the 2022.03 that I have locally, where I also reproduced.

@juliogonzalez
Member

The list of packages from @lumarel: https://rpa.st/FZJQ, quoting him:

This is from my prod machine, if I remember correctly I upgraded right after the fixes got pushed to the repo (so right after your mail about the fixes)

@juliogonzalez
Member

And the list of packages from my broken 2022.03:
rpm.list.gz

@mcalmer
Contributor

mcalmer commented Apr 27, 2022

There are "dead symlinks" in /srv/tomcat/webapps/rhn/WEB-INF/lib/.
It seems the "netty" package has changed:

lrwxrwxrwx 1 tomcat tomcat   32 26. Apr 16:26 netty-buffer.jar -> /usr/share/java/netty-buffer.jar
lrwxrwxrwx 1 tomcat tomcat   31 26. Apr 16:26 netty-codec.jar -> /usr/share/java/netty-codec.jar
lrwxrwxrwx 1 tomcat tomcat   32 26. Apr 16:26 netty-common.jar -> /usr/share/java/netty-common.jar
lrwxrwxrwx 1 tomcat tomcat   33 26. Apr 16:26 netty-handler.jar -> /usr/share/java/netty-handler.jar
lrwxrwxrwx 1 tomcat tomcat   34 26. Apr 16:26 netty-resolver.jar -> /usr/share/java/netty-resolver.jar
lrwxrwxrwx 1 tomcat tomcat   35 26. Apr 16:26 netty-transport.jar -> /usr/share/java/netty-transport.jar
lrwxrwxrwx 1 tomcat tomcat   54 26. Apr 16:26 netty-transport-native-unix-common.jar -> /usr/share/java/netty-transport-native-unix-common.jar
$> l /usr/share/java/netty/
total 3604
drwxr-xr-x 1 root root    750 26. Apr 23:29 ./
drwxr-xr-x 1 root root  13960 26. Apr 23:29 ../
-rw-r--r-- 1 root root 300518 21. Apr 10:35 netty-buffer.jar
-rw-r--r-- 1 root root  64166 21. Apr 10:35 netty-codec-dns.jar
-rw-r--r-- 1 root root  35453 21. Apr 10:35 netty-codec-haproxy.jar
-rw-r--r-- 1 root root 466433 21. Apr 10:35 netty-codec-http2.jar
-rw-r--r-- 1 root root 626324 21. Apr 10:35 netty-codec-http.jar
-rw-r--r-- 1 root root 272780 21. Apr 10:35 netty-codec.jar
-rw-r--r-- 1 root root  42029 21. Apr 10:35 netty-codec-memcache.jar
-rw-r--r-- 1 root root  97779 21. Apr 10:35 netty-codec-mqtt.jar
-rw-r--r-- 1 root root  43445 21. Apr 10:35 netty-codec-redis.jar
-rw-r--r-- 1 root root  19196 21. Apr 10:35 netty-codec-smtp.jar
-rw-r--r-- 1 root root 116745 21. Apr 10:35 netty-codec-socks.jar
-rw-r--r-- 1 root root  28576 21. Apr 10:35 netty-codec-stomp.jar
-rw-r--r-- 1 root root 512168 21. Apr 10:35 netty-common.jar
-rw-r--r-- 1 root root   2970 21. Apr 10:35 netty-dev-tools.jar
-rw-r--r-- 1 root root 351382 21. Apr 10:35 netty-handler.jar
-rw-r--r-- 1 root root  23276 21. Apr 10:35 netty-handler-proxy.jar
-rw-r--r-- 1 root root 153044 21. Apr 10:35 netty-resolver-dns.jar
-rw-r--r-- 1 root root  36796 21. Apr 10:35 netty-resolver.jar
-rw-r--r-- 1 root root 470839 21. Apr 10:35 netty-transport.jar
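
The listing shows links pointing at /usr/share/java/*.jar while the files now live under /usr/share/java/netty/. A minimal sketch for finding such dead links in one directory, using only shell tests and `readlink`; `dead_symlinks` is a hypothetical helper name.

```shell
# List symbolic links in a directory whose targets do not exist.
# dead_symlinks is a hypothetical helper name.
dead_symlinks() {
    # $1: directory to inspect (not recursive)
    for entry in "$1"/*; do
        # -L: entry is a symlink; ! -e: following it finds nothing
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
        fi
    done
}

# Example (directory from this comment):
# dead_symlinks /srv/tomcat/webapps/rhn/WEB-INF/lib
```

An empty result would mean every link in the directory resolves.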

@mcalmer
Contributor

mcalmer commented Apr 27, 2022

$> zypper se -s netty
S | Name                   | Type       | Version                 | Arch   | Repository
--+------------------------+------------+-------------------------+--------+-------------------------------------------------------------
i | netty                  | package    | 4.1.75-150200.4.9.1     | x86_64 | Update repository with updates from SUSE Linux Enterprise 15
v | netty                  | package    | 4.1.75-150200.4.6.2     | x86_64 | Update repository with updates from SUSE Linux Enterprise 15
v | netty                  | package    | 4.1.44.Final-4.5.uyuni1 | noarch | uyuni-server-devel
v | netty                  | package    | 4.1.13-bp153.2.46       | x86_64 | Main Repository
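
To read off which candidate is actually installed, the table can be filtered for the row whose status column starts with "i". This is a rough sketch that assumes the column layout shown in this output; `installed_row` is a hypothetical helper name.

```shell
# From `zypper se -s <name>` table output on stdin, print the version and
# repository of the row marked installed ("i" in the first column).
# installed_row is a hypothetical helper name; it assumes the six-column
# layout shown in this thread.
installed_row() {
    awk -F'|' '
        $1 ~ /^i/ {
            version = $4; repo = $6
            gsub(/^ +| +$/, "", version)
            gsub(/^ +| +$/, "", repo)
            print version " from " repo
        }
    '
}

# Example:
# zypper se -s netty | installed_row
```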

@mcalmer
Contributor

mcalmer commented Apr 27, 2022

If you perform an "update", the vendor change prevents the newer version from being installed.
But if you perform a "dup" or pass --allow-vendor-change, you get the Leap package.

@juliogonzalez
Member

juliogonzalez commented Apr 27, 2022

Workaround for 2022.03

spacewalk-service stop
zypper ref
zypper install --repo uyuni-server-stable -f netty
spacewalk-service start

After this, the WebUI works again
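
One way to confirm that the workaround took effect is to check that the installed netty is the Uyuni build. This is a sketch under the assumption that Uyuni packages carry "uyuni" in the release tag, as in 4.1.44.Final-4.5.uyuni1 listed earlier in this thread; `is_uyuni_build` is a hypothetical helper name.

```shell
# Return success if an RPM version-release string looks like a Uyuni build.
# Assumption: Uyuni packages carry "uyuni" in the release tag, as in
# 4.1.44.Final-4.5.uyuni1; this is a heuristic, not an official check.
is_uyuni_build() {
    case "$1" in
        *-*uyuni*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# is_uyuni_build "$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' netty)" \
#     && echo "netty comes from the Uyuni repository"
```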

Fix for 2022.04 (not released yet)

#5296

@juliogonzalez
Member

juliogonzalez commented Apr 27, 2022

Explanation

On April 20th, 07:03 UTC, netty 4.1.75 was pushed to the SLE-15-SP3 codestream, which also bumped netty on Leap 15.3 from 4.1.13 to 4.1.75.

Uyuni, as of today, requires our own version of netty (4.1.14), with some specific symlinks.

Until now, our version (4.1.14) was newer than what was provided on Leap 15.3 (4.1.13), so our netty package always got installed.

After 4.1.75 was released, new installations just got the newest package available, so they got broken.
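
The ordering argument can be illustrated with a version sort. This is only a sketch: it assumes GNU `sort -V` as a stand-in for the package manager's own comparison, which also weighs vendor and repository, and `highest_version` is a hypothetical helper name.

```shell
# Print the highest of the given version strings.
# Assumption: GNU `sort -V` ordering stands in for the package manager's own
# comparison, which is adequate for plain dotted versions like these.
highest_version() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}

# Before April 20th, only these two were on offer:
# highest_version 4.1.13 4.1.14
# After April 20th, the Leap update joins the candidates:
# highest_version 4.1.13 4.1.14 4.1.75
```

The first call selects 4.1.14 and the second selects 4.1.75, which matches the behaviour seen on new installations.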

For existing installations that got updated from an old version to 2022.03, users should not be affected unless:

@mcalmer
Contributor

mcalmer commented May 11, 2022

Problem fixed, closing issue
