Internal server error 500 after upgrading from 5.3.0 to 5.3.2 #12185

Closed
Lugrider opened this issue Dec 15, 2023 · 7 comments

@Lugrider

What happened?

After upgrading the EMQX Docker version from 5.3.0 to 5.3.2, opening Integrations in the app shows an error that corresponds to the API server response when accessing https://mydomain.com/api/v5/bridges:

{
    "code": "INTERNAL_ERROR",
    "message": "error, {case_clause,{throw,bridge_not_found}}, [{emqx_bridge_api,'/bridges',2,[{file,\"emqx_bridge_api.erl\"},{line,474}]},{minirest_handler,apply_callback,3,[{file,\"minirest_handler.erl\"},{line,123}]},{minirest_handler,handle,2,[{file,\"minirest_handler.erl\"},{line,51}]},{minirest_handler,init,2,[{file,\"minirest_handler.erl\"},{line,27}]},{cowboy_handler,execute,2,[{file,\"cowboy_handler.erl\"},{line,41}]},{cowboy_stream_h,execute,3,[{file,\"cowboy_stream_h.erl\"},{line,318}]},{cowboy_stream_h,request_process,3,[{file,\"cowboy_stream_h.erl\"},{line,302}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,240}]}]"
}
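For readers who want to see at a glance where the request failed, the stack trace embedded in the "message" field can be split into frames. The following is only a minimal sketch: the regular expression assumes frames are always printed in the shape seen above, and `parse_frames` and `SAMPLE` are hypothetical names used for illustration (the sample is shortened to the first two frames of the response).

```python
import json
import re

# One Erlang stack frame as printed in the "message" field, for example:
# {emqx_bridge_api,'/bridges',2,[{file,"emqx_bridge_api.erl"},{line,474}]}
# Assumption: every frame follows this exact layout.
FRAME_RE = re.compile(
    r"\{(?P<module>\w+),(?P<function>'[^']*'|\w+),(?P<arity>\d+),"
    r"\[\{file,\"(?P<file>[^\"]+)\"\},\{line,(?P<line>\d+)\}\]\}"
)


def parse_frames(response_body: str) -> list[tuple[str, str, int, int]]:
    """Return (module, function, arity, line) for each frame in the error body."""
    message = json.loads(response_body)["message"]
    return [
        (m["module"], m["function"].strip("'"), int(m["arity"]), int(m["line"]))
        for m in FRAME_RE.finditer(message)
    ]


# Shortened copy of the response above (first two frames only).
SAMPLE = r'''{
    "code": "INTERNAL_ERROR",
    "message": "error, {case_clause,{throw,bridge_not_found}}, [{emqx_bridge_api,'/bridges',2,[{file,\"emqx_bridge_api.erl\"},{line,474}]},{minirest_handler,apply_callback,3,[{file,\"minirest_handler.erl\"},{line,123}]}]"
}'''

if __name__ == "__main__":
    for module, function, arity, line in parse_frames(SAMPLE):
        print(f"{module}:{function}/{arity} (line {line})")
```

The first frame is the one that raised the error, here the `/bridges` handler in `emqx_bridge_api`.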

Rules are kept, but Data Bridges are not shown.

What did you expect to happen?

Show my previously created Data Bridges and Flow, without any error.

How can we reproduce it (as minimally and precisely as possible)?

I don't know whether it is reproducible, but I have about four Rules and two Data Bridges (HTTP and MQTT).

Anything else we need to know?

No response

EMQX version

sysdescr  : EMQX
version   : 5.3.2
datetime  : 2023-12-15T16:35:01.893248595+00:00
uptime    : 38 minutes, 7 seconds

OS version

$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
$ uname -a
Linux myhostname.com 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 GNU/Linux

Log files

@Lugrider Lugrider added the BUG label Dec 15, 2023
@Lugrider
Author

Lugrider commented Dec 18, 2023

I can confirm that the problem is with 5.3.2, version 5.3.1 works fine.

@thalesmg
Contributor

Hi.

  1. Are there any logs you could share? If they could be set to debug before trying to upgrade, that would be best.
  2. Could you share data/configs/cluster.hocon? Please double check and censor any sensitive information before sharing it.

@Lugrider
Author

Here are the logs. The data/configs/cluster.hocon file is not present, so I added data/configs/cluster-override.conf instead.
emqx.zip

NOTE: After upgrading and collecting the logs, while running 5.3.2, I disabled the file log and all MQTT authentication. When I went back to 5.3.1, EMQX crashed, so I am now on 5.3.2 because I can't go back. I suppose that the configuration changes I made make the new configuration incompatible with 5.3.1 and cause the crash. Is it possible to go back to 5.3.1?

@thalesmg
Contributor

Thanks for the extra info.

I managed to reproduce your issue by starting on 5.3.0, creating an HTTP bridge, and then using the same config on 5.3.2. That produces the following log when starting EMQX:

2023-12-19T16:48:11.708766+00:00 [error] msg: action_references_nonexistent_bridges, mfa: emqx_rule_engine:with_parsed_rule/3(490), hint: this rule will be disabled, nonexistent_bridge_ids: #{nonexistent_bridge_ids => [{http,p1}]}, rule_id: <<"web">>

However, I could not observe the same issue when accessing the bridges API: the bridge is returned correctly when listing all bridges or getting its config.

Did you observe the error while trying to list the bridges in a "mixed cluster"? That is: a cluster with some nodes in 5.3.0, some in 5.3.2? That could explain how you experience it.

This particular error should be resolved in 5.4.0 (still unreleased), provided the whole cluster is running the same (new) version.

I suppose that the configuration changes I made make the new configuration incompatible with 5.3.1 and cause the crash. Is it possible to go back to 5.3.1?

Automatically downgrading the configuration file is not supported, I'm afraid. If you changed something in 5.3.2, then it probably saved the new state in the new format. You would need to inspect the logs when EMQX crashes and see which configurations it complains about, and then try to manually fix the config before starting it in an older EMQX version.
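As a starting point for that log inspection, a small script can pull out the error-level entries and their `msg` field. This is only a sketch: it assumes the text log format of the line quoted earlier in this comment (timestamp, level in square brackets, then comma-separated `key: value` pairs), and `error_messages` is a hypothetical helper name, not part of EMQX.

```python
import re

# Assumed line layout: "<timestamp> [<level>] msg: <name>, key: value, ..."
LINE_RE = re.compile(r"^(?P<time>\S+) \[(?P<level>\w+)\] (?P<rest>.*)$")
MSG_RE = re.compile(r"\bmsg: (?P<msg>[^,]+)")


def error_messages(lines):
    """Yield (timestamp, msg) for every line logged at the error level."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m["level"] != "error":
            continue
        msg = MSG_RE.search(m["rest"])
        # Fall back to the whole remainder if no "msg:" field is present.
        yield m["time"], (msg["msg"] if msg else m["rest"])


# The startup log line quoted above, used as sample input.
SAMPLE_LINE = (
    "2023-12-19T16:48:11.708766+00:00 [error] msg: "
    "action_references_nonexistent_bridges, mfa: "
    "emqx_rule_engine:with_parsed_rule/3(490), hint: this rule will be "
    "disabled, nonexistent_bridge_ids: #{nonexistent_bridge_ids => "
    '[{http,p1}]}, rule_id: <<"web">>'
)

if __name__ == "__main__":
    for timestamp, msg in error_messages([SAMPLE_LINE]):
        print(timestamp, msg)
```

Each distinct `msg` value is a hint about which part of the config the older version rejects.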

@Lugrider
Author

Did you observe the error while trying to list the bridges in a "mixed cluster"? That is: a cluster with some nodes in 5.3.0, some in 5.3.2? That could explain how you experience it.

No, I only have one node, but in the /mnesia folder there are many emqx@x.x.x.x folders created during my tests and with older versions of EMQX. I don't know whether they can cause any issue or whether I can simply delete the unused ones.

I suppose that the configuration changes I made make the new configuration incompatible with 5.3.1 and cause the crash. Is it possible to go back to 5.3.1?

Automatically downgrading the configuration file is not supported, I'm afraid. If you changed something in 5.3.2, then it probably saved the new state in the new format. You would need to inspect the logs when EMQX crashes and see which configurations it complains about, and then try to manually fix the config before starting it in an older EMQX version.

OK, no problem. I only wanted to know whether there is an easy way; I have backups (emqx ctl data export).
An easy and fast way to restore previous configurations for downgrades after "failed" updates would be appreciated. In the config folder there are many cluster-override.conf.xxx.bak (dated) files, but I don't know whether I can use these files as a substitute for my config or need something more...

@thalesmg
Contributor

No, I only have one node, but in the /mnesia folder there are many emqx@x.x.x.x folders created during my tests and with older versions of EMQX. I don't know whether they can cause any issue or whether I can simply delete the unused ones.

Those shouldn't affect your single node. If you are sure the folders are not used by your node (i.e., they don't match the name of such single node or any other in the cluster), then you can remove them.
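To make that check less error-prone, the folders can be listed and compared against the node names in use before anything is removed. A minimal sketch, assuming the folders are named after the node (`emqx@...`) as described above; `unused_mnesia_dirs` is a hypothetical helper and it only lists candidates, it deletes nothing.

```python
from pathlib import Path


def unused_mnesia_dirs(mnesia_root, node_names):
    """Return names of emqx@... folders that match none of the given node names."""
    keep = set(node_names)
    return sorted(
        p.name
        for p in Path(mnesia_root).iterdir()
        if p.is_dir() and p.name.startswith("emqx@") and p.name not in keep
    )
```

Pass the mnesia folder path and the names of every node that is still in use, then review the returned list by hand before removing anything.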

OK, no problem. I only wanted to know whether there is an easy way; I have backups (emqx ctl data export).
An easy and fast way to restore previous configurations for downgrades after "failed" updates would be appreciated. In the config folder there are many cluster-override.conf.xxx.bak (dated) files, but I don't know whether I can use these files as a substitute for my config or need something more...

Both methods should work (using an older .bak file that was created with the older version, or importing a file generated by emqx ctl data export).
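For the .bak route, one way to find a file written by the older version is to look only at backups last modified before the upgrade. This is a sketch under assumptions: it relies on file modification times rather than the date in the file name (whose exact format is not shown here), and `backups_before` is a hypothetical helper name.

```python
from pathlib import Path


def backups_before(config_dir, cutoff_epoch):
    """Return .bak files modified before cutoff_epoch (seconds), newest first."""
    candidates = [
        p
        for p in Path(config_dir).glob("cluster-override.conf.*.bak")
        if p.stat().st_mtime < cutoff_epoch
    ]
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
```

The first entry of the result is the most recent backup that predates the given cutoff, for example the time of the upgrade to 5.3.2.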

@id id added #triage/wait and removed BUG labels Jan 3, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 17, 2024