Performance and Architectural Questions On Camel-K #1196

Closed
davesargrad opened this issue Jan 13, 2020 · 4 comments
@davesargrad

davesargrad commented Jan 13, 2020

This question is directed at the camel-k architects/experts.

### Question 1: Performance Concerns

We are looking at a system where we will need to run hundreds, perhaps even thousands, of camel-k integrations. I am trying to understand the overhead associated with a single camel-k integration.

I'm assuming that 100 integrations would result in 100 instances of a camel-k based pod. This means each pod would require a single JVM, hence 100 JVM instances.

I could envision an alternative architectural approach that would perform the same 100 functions in a single pod, as long as that pod is running all 100 routes.

Is it possible to create an integration that implements N routes, rather than a single route per integration?

Can you please help us understand how to most efficiently architect a camel-k based solution that must run the equivalent of hundreds or thousands of routes?

What architectural guidance can you offer to help us arrive at an appropriate, performance-focused camel-k based architecture?

I'm interested in understanding how to architect the system to find the right architectural balance with a focus on vCPU and RAM resources.


Now that @nicolaferraro has answered the performance part of my question (Question 1), I thought I'd extend this issue to get further architectural guidance rather than create another issue.


### Question 2: Deployment Architecture

Currently we use kamel to install an integration "from java source". Is it possible to prebuild an integration into a deployable Docker image instead?

In general, though we can deploy from source, our practice is to deploy from tested, compiled source.

### Question 3: XML Validation against schema

We have routes that will receive XML payloads and need to validate the XML against a schema. Furthermore, these routes will need to filter based on XML content. Do you have examples of routes that validate against a schema and also process the content?

One assumption we are making is that we will store the schema on an Archiva server running within our Kubernetes cluster. Hence the schema should be accessible to the integration via URL. Perhaps this is a good architectural assumption, perhaps not.
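
For orientation, the two steps described above (validate against a schema, then filter on content) can be sketched with nothing but the JDK's `javax.xml.validation` and `javax.xml.xpath` APIs. This is a minimal sketch under stated assumptions, not a Camel K route: the class name `XmlGate` and its methods are hypothetical, and the schema is passed in as text, standing in for whatever would be fetched from the schema URL. Inside a route, Camel's own `validator` component and XPath predicates would probably be the more natural fit.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: validates an XML payload against an XSD, then filters on its content. */
final class XmlGate {
    private final Schema schema;

    /** Compiles the XSD once. The text is assumed to have been loaded from the schema URL. */
    XmlGate(String xsdText) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        this.schema = factory.newSchema(new StreamSource(new StringReader(xsdText)));
    }

    /** Returns true when the payload is well-formed and conforms to the schema. */
    boolean isValid(String xml) {
        try {
            // A Validator is not thread-safe, so a fresh one is created per payload.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    /** Returns true when the payload is valid and the XPath predicate evaluates to true. */
    boolean accepts(String xml, String xpathPredicate) throws XPathExpressionException {
        if (!isValid(xml)) {
            return false;
        }
        Object result = XPathFactory.newInstance().newXPath()
                .evaluate(xpathPredicate, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        return (Boolean) result;
    }
}
```

For example, with a schema that declares an `order` element containing a `priority` child, `new XmlGate(xsd).accepts(payload, "/order/priority = 'high'")` would pass only valid, high-priority orders. The schema is compiled once in the constructor because a `Schema` can be reused across messages, while a `Validator` is created per call.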

All guidance appreciated

@nicolaferraro
Member

I can try to give some ideas to reason on, but there's no general rule valid for all scenarios.

You can run multiple routes in the same JVM, just with `kamel run Routes1.java Routes2.java RoutesN.java --name routes-pack`.

The question is when you want to do that and why. I think you should apply the same reasoning that people use when dealing with microservices architectures, where the big integration containing 1000 routes is the monolith.

E.g. do you want independent scalability for some integration flows? Then deploy them separately so you can set the number of replicas independently. If you use Knative they will scale automatically depending on the load, but if you have a single fat integration containing all routes, you need to scale the whole thing, which is heavyweight (it takes more time to start up and uses far more resources than needed).

E.g. do you have multiple teams? If so, you probably want each team to be responsible for its own deployments without having to synchronize with other teams. You may also want an update to one integration not to interfere with other integrations that are already running (but it will if they share the same JVM).

I think many other principles that apply to microservices apply also here. You'll end up somewhere in the middle between a single fat integration and an integration per route.
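
The trade-off between one integration per route and one fat integration can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes one pod (and one JVM) per integration, as in the original question, and a deliberately simple cost model: a fixed per-JVM overhead plus a small per-route cost. The class `PackingEstimate`, its parameters, and any figures passed to it are hypothetical, not Camel K measurements.

```java
/** Hypothetical sizing helper. All inputs are illustrative assumptions, not measured values. */
final class PackingEstimate {

    /** Number of integrations (one pod and one JVM each) needed to host all routes. */
    static int pods(int totalRoutes, int routesPerIntegration) {
        if (totalRoutes < 0 || routesPerIntegration <= 0) {
            throw new IllegalArgumentException("totalRoutes must be >= 0 and routesPerIntegration > 0");
        }
        // Ceiling division: a partially filled integration still needs its own pod.
        return (totalRoutes + routesPerIntegration - 1) / routesPerIntegration;
    }

    /** Rough total memory in MiB: fixed JVM overhead per pod plus a cost per route. */
    static long memoryMiB(int totalRoutes, int routesPerIntegration, long jvmBaseMiB, long perRouteMiB) {
        return pods(totalRoutes, routesPerIntegration) * jvmBaseMiB + (long) totalRoutes * perRouteMiB;
    }
}
```

With made-up inputs of 200 MiB per JVM and 5 MiB per route, 1000 routes cost 205000 MiB at one route per integration and 25000 MiB at ten routes per integration. The model deliberately ignores independent scaling and team ownership, which are the reasons given above for not packing everything into one JVM.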

The long-term goal of Camel K is to let you split integrations based on domain logic rather than resource utilization, by drastically reducing the amount of resources needed. We've already done some work on reducing the resources used in the cluster and we'll do a lot more.

What you have now:

  • Knative services available for HTTP based endpoints: they shut down the JVM when not used
  • CronJob (new, see "Add native support for CronJobs" #1197): they activate a JVM only when they need to run

What we're working on:

  • Quarkus native compilation: so you don't run a full JVM but a tiny binary for each integration, which uses resources comparable to those of a Go application
  • KEDA autoscalers: to run integrations only when they need to process data and stop them when idle

In particular, Quarkus is a game changer. The camel-quarkus repository already contains a lot of Camel components that can compile to native (Camel K will be able to compile to native transparently in one of the next releases, we're working on it).

If I had to run 1000 integration flows, the first thing I would do is check whether the Camel components I need are in the list of Quarkus extensions, and contribute what's missing. That would allow me to care less about non-functional requirements and focus on business logic and maintainability.

@davesargrad
Author

To @nicolaferraro

This is simply awesome architectural feedback. I appreciate it greatly. I understand the tradeoffs you describe, and indeed I do see huge value in the microservices architecture. I will resist the temptation to build an integration monolith that runs thousands of routes.

I think you are correct that we may end up with a balanced set of integrations, several routes per integration, where these routes are related in some fashion.

In the end game we will have thousands or even tens of thousands of routes to implement, and my goal is to drive a sensible architecture that doesn't require excessive VM resources. Your reasoning will be at the core of my thinking.

I will look into quarkus. We are interested in such game changers. It is good to know that you are also focused on this.

@davsclaus
Contributor

@nicolaferraro there are some great bits here, maybe you/we could put together a blog post and get it posted on your blog + camel website.

@davesargrad changed the title from "Performance Question On Camel-K" to "Performance and Architectural Questions On Camel-K" on Jan 15, 2020
@lburgazzoli added the "area/documentation" (Documentation task) label on Jun 5, 2020
@github-actions
Contributor

This issue has been automatically marked as stale due to 90 days of inactivity.
It will be closed if no further activity occurs within 15 days.
If you think that’s incorrect or the issue should never stale, please simply write any comment.
Thanks for your contributions!
