Windows: upgrade from 3.5.4 -> 3.7.4 can fail due to computed node name case differences #1568
Comments
Thank you for your time. Team RabbitMQ uses GitHub issues for specific actionable items engineers can work on. GitHub issues are not used for questions, investigations, root cause analysis, discussions of potential issues, etc. (as defined by this team). We get at least a dozen questions through various venues every single day, often light on details. Please post this to rabbitmq-users. Thank you.
Cluster upgrades between feature versions require an ordered restart, which
hints at. That and more (e.g. Blue/Green deployment migrations) are documented in the Upgrade guide.
Yes, and it is in the matrix: 3.5.x to 3.7.x is supported.
We test quite a few upgrade permutations, including an upgrade from 3.5.8 as part of our CI pipeline. A cluster upgrade from 3.5.x to 3.7.x will require a cluster-wide shutdown with an ordered restart, as the docs explain. Or you can do a Blue/Green deployment upgrade. Sorry, but there is no evidence of a bug. This is mailing list material at this point.
The cluster has a single member node. Do you have a CI pipeline for Windows environments? Consider the case where the COMPUTERNAME environment variable != hostname: if my hostname is "rmq", then the environment variable COMPUTERNAME is "RMQ" (uppercase).
I now see you have a section about case sensitivity of node names. This is a never-ending source of fun on Windows, and keeping track of what was done by default in which version from years ago is not realistic. Setting I suspect that other operating systems which tend to use case-insensitive filesystems are not affected.
It could also be an idea not to fall back on default values for such an env variable. I think we can have the same issue on Linux if the machine is renamed in /etc/hostname using uppercase, e.g. initially rmq and then RMQ.
@Bhaal22 we have Windows package tests but not upgrades on Windows. I filed a documentation guides issue and this will be covered. I'm not sure we can safely force a particular case, given that there are years' worth of releases that do not do that. Perhaps we can emit a warning of some kind when Scenarios where hostnames have changed are the operator's responsibility. Explicitly setting
The issue I see with this is that the file nodes_running_at_shutdown is then updated as if it were a cluster with 2 members. Yeah, I understand.
We recognise that it is not ideal and can be very confusing. Thank you for getting to the bottom of it. If @lukebakken has ideas about what kind of change would be reasonably safe here, we'd be happy to file another issue and consider it. We can modify the list loaded from This is yet another argument for Blue/Green deployment upgrades, which our docs don't promote enough.
@michaelklishin in fact, an issue already existed and you did a PR.
I agree, we are now in a weird state, with some releases using uppercase and others not.
The PR was indeed closed without merging because we figured it was not a safe thing to do.
Hi,
I am experiencing migration issues from 3.5.4 to 3.7.4 in a Windows environment.
I prepared Windows Docker containers to make the issue easier to reproduce. We see the same issue on Windows virtual machines.
How to reproduce
Investigations done
It looks like cluster membership is case sensitive.
From what I read in the code:
in the default case, RabbitMQ will generate rabbit@COMPUTERNAME (all in uppercase)
And here RabbitMQ generates rabbit@hostname, where hostname has the same value as the output of the cmd hostname command
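To illustrate the mismatch described above, here is a minimal Python sketch (not RabbitMQ code) that compares a node name built from the COMPUTERNAME environment variable with one built from the system hostname. The helpers `node_name` and `differs_only_by_case` are hypothetical names introduced for this example, and the `rabbit@` prefix is taken from the node names quoted above.

```python
import os
import socket


def node_name(host: str, prefix: str = "rabbit") -> str:
    """Build a node name of the form prefix@host (hypothetical helper)."""
    return f"{prefix}@{host}"


def differs_only_by_case(a: str, b: str) -> bool:
    """Return True when two names are unequal as written but equal ignoring case."""
    return a != b and a.casefold() == b.casefold()


if __name__ == "__main__":
    # COMPUTERNAME is set by Windows; it is usually absent on other systems.
    env_host = os.environ.get("COMPUTERNAME")
    sys_host = socket.gethostname()
    if env_host is None:
        print("COMPUTERNAME is not set; nothing to compare")
    else:
        from_env = node_name(env_host)
        from_hostname = node_name(sys_host)
        verdict = (
            "case-only mismatch"
            if differs_only_by_case(from_env, from_hostname)
            else "no case-only mismatch"
        )
        print(from_env, from_hostname, verdict)
```

If the two names differ only by case, a case-sensitive membership check would treat them as two different nodes, which matches the behaviour reported here.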
Workaround
How to reproduce with windows docker containers
DockerHub images are built from this repository: https://github.com/gsx-solutions/rmq-win
Then you can just use -h RMQ to make it work.
Thank you for your work and support.