
Performance issue with NATS transporter (v2.x.x) #1237

Closed
icebob opened this issue Aug 8, 2023 Discussed in #1235 · 12 comments


icebob commented Aug 8, 2023

Discussed in #1235

Originally posted by mrprigun August 8, 2023
Hello, I ran into some obstacles in my use case while trying to execute Moleculer actions. Briefly, I have a service (the gateway for the main app, using moleculer-web) that looks like this:

module.exports = {
  name: "test",
  actions: {
    hello: {
      rest: "GET /hello",
      handler(ctx) {
        ctx.meta.$responseType = "text/plain";
        return "Hello Moleculer";
      }
    },

    second_call: {
      rest: "GET /second_call",
      handler(ctx) {
        // ...some local processing happens here...
        // `params_action_results_above` is a placeholder for the result of that processing
        return ctx.call('external.call', params_action_results_above);
      }
    },
  }
};

external.call is located on another node and is invoked via the NATS transporter. By design, the hello and second_call actions will be put under load processing distinct tasks. Before publishing to production I benchmarked each action and got these results on my laptop:

  • The hello action works pretty well: ~600 rps
  • But second_call was heavily degraded: I got only ~40 rps

I also received similar results after deploying to a prod-like environment. I believe the slowdown comes from the external call, which is expected, but why is the overhead so large? I've tried different load-balancing strategies and the bulkhead feature but didn't see any significant improvement. Is there a way to configure Moleculer to improve this behaviour, or is it some kind of bug?
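
For reference, a minimal sketch of the broker configuration such a two-node setup typically uses; the nodeID and NATS URL here are assumptions for illustration, not values from the original report:

// moleculer.config.js (sketch)
module.exports = {
  nodeID: "gateway-node",                  // hypothetical node name
  transporter: "nats://localhost:4222",    // remote calls such as external.call travel over NATS
};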


icebob commented Aug 8, 2023

It looks like it's an issue with the nats library version 2.x.x. With the previous 1.x.x version the performance is fine.
There is an open issue (2 years ago) about it in NATS repo: nats-io/nats.js#438


mrprigun commented Aug 9, 2023

For more context, libs:

"moleculer": "0.14.31",
"nats": "2.15.1",

OS used for local testing: macOS 13.5
Service images in the k8s cluster are based on node:18-alpine


icebob commented Aug 9, 2023

Could you switch back to nats 1.4.12 to check the performance with this version as well?


mrprigun commented Aug 9, 2023

I already tried that. Initially it was using nats 1.4.12 and the results were pretty similar; that's why I moved to 2.15.1.


icebob commented Aug 9, 2023

What is the NATS server version?


mrprigun commented Aug 9, 2023

2.9.11; more specifically, docker.io/bitnami/nats:2.9.11-debian-11-r0


icebob commented Aug 9, 2023

Please try with the latest version, 2.9.21.


mrprigun commented Aug 9, 2023

It appears that I've identified the root cause of the issue. While experimenting with various broker settings, I found that disabling the metrics feature resolved the problem. In my project I use a StatsD reporter; everything works as expected with the Prometheus reporter. The StatsD reporter configuration looks like this:

{
  type: 'StatsD',
  options: {
    // Server host
    host: 'localhost',
    // Server port
    port: 8125,
    // Maximum payload size
    maxPayloadSize: 1300,
  }
},

So it definitely doesn't seem to be a NATS issue. I'll turn off StatsD for now.
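
For context, a minimal sketch of where such a reporter entry sits in the broker options and how the metrics feature can be disabled, following the standard Moleculer metrics configuration (the values mirror the snippet above and are not copied from the original project):

// moleculer.config.js (sketch)
module.exports = {
  metrics: {
    enabled: true,              // set to false to switch metrics off entirely, as done above
    reporter: [
      {
        type: 'StatsD',
        options: {
          host: 'localhost',    // StatsD server host
          port: 8125,           // StatsD server port
          maxPayloadSize: 1300, // maximum payload size
        }
      }
    ]
  }
};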


icebob commented Aug 9, 2023

It's strange, because I can reproduce this issue without any metrics, only with the 2.x.x nats lib.


icebob commented Aug 9, 2023

I've found the problem inside the nats library. The sending logic was changed to a queue-based approach in the 2.x.x version. By skipping this logic I could reach 30,000 msg/sec instead of 40 msg/sec. I've opened an issue in the NATS client repo: nats-io/nats.js#581
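
For illustration, a rough sketch of the kind of serial request/reply benchmark that surfaces this gap; the action name comes from the thread, while the nodeID, NATS URL and duration are assumptions:

// bench.js (sketch)
const { ServiceBroker } = require("moleculer");

const broker = new ServiceBroker({
  nodeID: "bench-node",
  transporter: "nats://localhost:4222",
  logger: false
});

async function run() {
  await broker.start();
  await broker.waitForServices("external");    // the remote service from the report

  const durationMs = 10000;
  const end = Date.now() + durationMs;
  let count = 0;
  while (Date.now() < end) {
    await broker.call("external.call", {});    // one request/reply round trip over the transporter
    count++;
  }
  console.log(`~${Math.round(count / (durationMs / 1000))} req/sec`);
  await broker.stop();
}

run().catch(err => { console.error(err); process.exit(1); });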


mrprigun commented Aug 9, 2023

This is a significant improvement; waiting for the fix then 😃

Pushed my tests just in case: https://github.com/mrprigun/moleculer-benchmark-test. There are two dedicated nodes and NATS in docker-compose. With the StatsD reporter enabled I get ~170 rps; with the reporter disabled it's ~2k rps on my laptop.

NATS server: 2.9.21
NATS lib: 2.15.1
Node: v18.15.0
OS: macOS 13.5


icebob commented Aug 19, 2023

NATS fixed the issue in 2.16.0. My results:

[screenshot: benchmark results after upgrading to nats 2.16.0]

icebob closed this as completed Aug 19, 2023