Windows: upgrade from 3.5.4 -> 3.7.4 can fail due to computed node name case differences #1568

@Bhaal22

Description

Hi,

I am experiencing migration issues when upgrading from 3.5.4 to 3.7.4 in a Windows environment.
I prepared Windows Docker containers to make the issue easier to reproduce. We see the same issue on Windows virtual machines.

How to reproduce

  • set your machine name to rmq (this is easier in a container than on a real Windows machine)
  • start a RabbitMQ 3.5.4 (OTP 18.3) instance
  • stop the broker
  • start a RabbitMQ 3.7.4 (OTP 20.0) instance using the same RABBITMQ_BASE folder
  • the migration fails with this message:
BOOT FAILED
===========

Error description:
    init:do_boot/3 line 793
    init:start_em/1 line 1085
    rabbit:start_it/1 line 445
    rabbit:'-boot/0-fun-0-'/0 line 296
    rabbit_upgrade:run_mnesia_upgrades/2 line 155
    rabbit_upgrade:die/2 line 209
    io:format(<0.56.0>, "\n\n****\n\nCluster upgrade needed but other disc nodes shut down after this one.\nPlease first star...", [])
error:badarg
Log file(s) (may contain more information):
   c:/rmq-data/log/RABBIT~1.LOG
   c:/rmq-data/log/rabbit@rmq_upgrade.log

{"init terminating in do_boot",badarg}
init terminating in do_boot (badarg)

Crash dump is being written to: c:\rmq-data\log\erl_crash.dump...done

Investigation

if "!RABBITMQ_NODENAME!"=="" (
    if "!NODENAME!"=="" (
        set RABBITMQ_NODENAME=rabbit@!COMPUTERNAME!
    ) else (
        set RABBITMQ_NODENAME=!NODENAME!
    )
)

In the default case, the script above generates rabbit@COMPUTERNAME, with the host part all in uppercase.

if "!RABBITMQ_NODENAME!"=="" (
    if "!NODENAME!"=="" (
        REM We use Erlang to query the local hostname because
        REM !COMPUTERNAME! and Erlang may return different results.
        REM Start erl with -sname to make sure epmd is started.
        call "%ERLANG_HOME%\bin\erl.exe" -A0 -noinput -boot start_clean -sname rabbit-prelaunch-epmd -eval "init:stop()." >nul 2>&1
        for /f "delims=" %%F in ('call "%ERLANG_HOME%\bin\erl.exe" -A0 -noinput -boot start_clean -eval "net_kernel:start([list_to_atom(""rabbit-gethostname-"" ++ os:getpid()), %NAMETYPE%]), [_, H] = string:tokens(atom_to_list(node()), ""@""), io:format(""~s~n"", [H]), init:stop()."') do @set HOSTNAME=%%F
        set RABBITMQ_NODENAME=rabbit@!HOSTNAME!
        set HOSTNAME=
    ) else (
        set RABBITMQ_NODENAME=!NODENAME!
    )
)

Here RabbitMQ generates rabbit@hostname, where hostname has the same value (and the same case) as the output of the hostname command in cmd.
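
To make the mismatch concrete, here is a minimal Python sketch of the two naming schemes. It is only an illustration, not RabbitMQ code: the function names are hypothetical, and it assumes that the stored node name and the newly computed one are compared case-sensitively (Erlang node names are atoms, and atoms are case-sensitive).

```python
def legacy_node_name(computername: str) -> str:
    """Mimic the first script: rabbit@!COMPUTERNAME! (hypothetical helper)."""
    return "rabbit@" + computername


def erlang_style_node_name(hostname: str) -> str:
    """Mimic the second script: rabbit@<host part reported by Erlang> (hypothetical helper)."""
    return "rabbit@" + hostname


def differs_only_by_case(a: str, b: str) -> bool:
    """True when two node names are different strings but equal ignoring case."""
    return a != b and a.lower() == b.lower()


if __name__ == "__main__":
    # Values taken from the reproduction steps: machine name "rmq",
    # which the COMPUTERNAME variable exposes in uppercase.
    old = legacy_node_name("RMQ")
    new = erlang_style_node_name("rmq")
    print(old, new, old == new, differs_only_by_case(old, new))
    # prints: rabbit@RMQ rabbit@rmq False True
```

With a machine named rmq, the two names differ only by case, which is consistent with the upgrade treating the previously recorded node as a different disc node.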

Workaround

  • delete the db folder (not really an option)
  • manually set the RABBITMQ_NODENAME environment variable
  • rename the machine so that its name is all capital letters
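
For the second workaround, the value to pin is the name the older script would have computed. The sketch below reproduces that lookup order in Python and prints a matching set command. The helper name pinned_node_name is hypothetical, and the sample environment is a literal dictionary; on a real machine one would pass os.environ instead.

```python
from typing import Mapping


def pinned_node_name(environ: Mapping[str, str]) -> str:
    """Return the node name the older script would pick (hypothetical helper).

    Lookup order mirrors the older batch logic:
    RABBITMQ_NODENAME, then NODENAME, then rabbit@COMPUTERNAME.
    """
    if environ.get("RABBITMQ_NODENAME"):
        return environ["RABBITMQ_NODENAME"]
    if environ.get("NODENAME"):
        return environ["NODENAME"]
    computername = environ.get("COMPUTERNAME")
    if not computername:
        raise KeyError("COMPUTERNAME is not set; cannot derive the node name")
    return "rabbit@" + computername


if __name__ == "__main__":
    # Sample environment for the machine used in the reproduction steps.
    sample_env = {"COMPUTERNAME": "RMQ"}
    print("set RABBITMQ_NODENAME=" + pinned_node_name(sample_env))
    # prints: set RABBITMQ_NODENAME=rabbit@RMQ
```

The printed line would then be run in the same cmd session before starting the 3.7.4 broker, so that the new version keeps using the name recorded by 3.5.4.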

How to reproduce with Windows Docker containers

DockerHub images are built from this repository: https://github.com/gsx-solutions/rmq-win

docker volume create rmq-data

docker run --rm -h rmq -v rmq-data:c:\rmq-data -ti gsxsolutions/rmq:3.5.4
docker run --rm -h rmq -v rmq-data:c:\rmq-data -ti gsxsolutions/rmq:3.7.4

Using -h RMQ instead of -h rmq in both commands makes the upgrade work.

Thank you for your work and support.
