Offer multiple flavors of Dapr sidecar #3168

Closed
artursouza opened this issue May 13, 2021 · 27 comments · Fixed by #6290 or #6469
Labels: area/components, discussion, feedback-wanted, kind/feature (New feature request), P1, pinned, triaged/unresolved (Indicates an issue that cannot or will not be resolved)

Comments

@artursouza
Member

artursouza commented May 13, 2021

Today, the Dapr sidecar is a monolith aggregating all component implementations: Kafka, Redis, Azure CosmosDB, Azure Service Bus, AWS DynamoDB, etc. As more components get implemented, we should consider offering different "flavors" of the sidecar to keep it small.

  • Complete: includes everything, like it is today.
  • Azure: only Azure components.
  • Alibaba: only Alibaba components.
  • AWS: only AWS components.
  • GCP: only GCP components.
  • OSS: only OSS components, like Kafka and Redis.

We should also make it easy for apps to choose their flavor (maybe just a different tag) of daprd.

UPDATE

There might be some components that would need to be present in all builds: all the DNS resolution implementations, for example.

@wcs1only
Contributor

I suspect the potential permutations are quite huge. What would really be powerful to our users is to be able to provide a way to do this arbitrarily.

But agreed, this is absolutely something we need.

@artursouza
Member Author

@wcs1only I agree that ideally it should be pick and choose. This is a step until we can offer an a-la-carte daprd.

@mukundansundar
Contributor

As @artursouza and @wcs1only mentioned, ideally we need an a-la-carte option in some form of plugin-based model. But in the meantime this might be good. Additionally, certain components should be available across all implementations, e.g. name resolution, possibly middleware, etc.

@artursouza artursouza pinned this issue May 13, 2021
@artursouza artursouza unpinned this issue May 13, 2021
@AaronCrawfis
Contributor

AaronCrawfis commented May 13, 2021

Agree, while à la carte is the eventual goal I think we should have these versions:

daprd Version OSS comp. Azure comp. AWS comp. GCP comp. Alibaba comp.
Base edition
Azure edition
AWS edition
GCP edition
Alibaba edition
Full edition

@artursouza
Member Author

Making sure we all get our French right, it is called "à la carte" :)

@msfussell
Member

This should be addressed with pluggable components of your choice, that is all that is needed.

@wcs1only
Contributor

It would be really nice if our build system could take in a component list and then maintain a matrix of the flavors as, say, individual GitHub Actions. That would also make it easy for downstream customers to fork our build with their own arbitrary list.

@AaronCrawfis
Contributor

This should be addressed with pluggable components of your choice, that is all that is needed.

See #3168 (comment)

@artursouza
Member Author

Pluggable components solve a different problem (though there is an intersection with this one). Pluggable components will allow the Dapr sidecar to use implementations that are not in the Dapr repo, maybe even private source. This issue does not exclude the need for pluggable components.

@CodeMonkeyLeet
Contributor

I think the suggested breakdown along cloud provider editions plus OSS, per Aaron's matrix, makes a lot of sense; I would expect most distributed apps to be developed against a single cloud provider plus some cloud-agnostic components.

From the future extensibility perspective though, the OSS component edition seems most likely to require additional refinement, in the sense that it has the greatest opportunity to accumulate multiple components that fill similar roles, only one of which will be used in that role for any given deployment. If we make that the baseline for all other editions, there's the potential for bloat creeping into the cloud editions as well. Depending on how likely folks think this is (or if we might move to an alternative like a fully à la carte solution before this becomes an issue), maybe having a core OSS set plus a less frequently used, purely cloud-agnostic OSS edition makes sense? It still creates the problem of composing a cloud edition with an OSS edition separately though.

I also like Charlie's suggestion of making it easier to build and deploy your own flavor as a lower cost alternative/precursor to dapr providing a full à la carte solution. It reduces the stakes of figuring out the "ideal" partitioning of components if users at least have a fallback to customize for their needs.

@AaronCrawfis
Contributor

AaronCrawfis commented May 13, 2021

Was thinking about this a little more and have some notes:

  • We should only go down the path of cloud-specific versions of daprd if we intend to have this feature after à la carte is released. If à la carte makes this feature unnecessary, we should not spend time now creating them.
  • In a perfect à la carte solution, users don't have to preemptively decide what components to load into daprd. Instead, the presence of a component config file tells Dapr to install and initialize that component at runtime.
    • This means you always get the slimmest possible set of components no matter where you're running, determined by what component specs you have.
    • Cloud-specific versions are no longer needed and should not be pursued, as users just specify components for that cloud and get a cloud-specific daprd.
  • If dynamic loading of components at runtime is not possible due to technical issues with Go, then we can keep cloud-specific daprd images, with the separate ability to create your own à la carte version of dapr with whatever you want.

@yaron2
Member

yaron2 commented May 14, 2021

If dynamic loading of components at runtime is not possible due to technical issues with Go

This is, for the very foreseeable future, a true statement. The à la carte solution should not be expected in the next year or so, and if I'm not being cautious I'd even say 2 years or more. There's almost zero progress on this front in the Go community.

Since we cannot guarantee the à la carte solution, the correct thing to do would be to go with cloud-specific daprd images.

In addition, I'd like us to try to articulate the reasons for doing so, and posing a scenario-based question will help.

For example: For what reasons will someone deploy a GCP daprd image (when running on GCP) instead of the standard daprd image? What benefits does it provide?

@artursouza
Member Author

I see two advantages: smaller and predictable sidecar size and predictable dependencies where random libraries are not included in the build. Remember that Go's init function can cause some damage like we saw in the Aerospike lib.
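The `init` concern can be illustrated with a minimal sketch. In Go, the `init` functions of every package linked into the binary run at startup, whether or not the program later uses that package. A single file cannot show a separate library, so the `registry` and `register` names below are hypothetical stand-ins for code a linked-in component library might run before `main`.

```go
package main

import "fmt"

// registry is a hypothetical stand-in for the components linked into a binary.
var registry = map[string]bool{}

// register records a component name; a real library may do far more work here.
func register(name string) { registry[name] = true }

// init runs before main for every package linked into the binary,
// even if nothing in the program ever calls into that package.
func init() {
	register("unused-component")
}

func main() {
	// No component was requested, yet init has already run.
	fmt.Println(len(registry), "component(s) initialized before main")
}
```

Leaving a package out of the build is the only way to guarantee its `init` code never runs.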

@yaron2
Member

yaron2 commented May 14, 2021

OK, let's start with smaller and predictable sidecar size.

In terms of predictability, both the full edition and a vendor edition are predictable. You always know which components exist in every release of both, and the binary and Docker image sizes are published and readily available.

As for smaller - is this really an issue though? For example, I tested the two extreme versions.

The base (OSS) edition is 68.1 MB in size. The full edition is 86 MB in size. Now the question is, who benefits from saving 18 MB of disk space?

The difference is even more negligible when you compare Docker image sizes:

The base (OSS) edition is 20.4 MB while the full edition is 24.5 MB.

where random libraries are not included in the build

This is not true. In the case of Aerospike, someone taking the base (OSS) edition will get every library (including the Aerospike one) that's included. Someone taking just the Azure edition, for example, will have components that contain other "random" libraries.

In addition, there's a bit of confusion here: the Aerospike library wasn't a "random library" (I'm not sure what random means in this context), it was THE official Aerospike library. Libraries in general have bugs, as we've seen with the Azure Go SDK one. Building vendor-specific images is no guarantee of avoiding such bugs; the one in Aerospike's case was rare and platform-specific.

@AaronCrawfis
Contributor

predictable dependencies where random libraries are not included in the build.

From a security perspective this would be another scenario that customers would want. It limits the blast radius of any potential future breach.

For example, let's say there's a vulnerability in cloud provider X's pubsub component. A user of cloud provider Y wouldn't want that vulnerable code in their daprd image. Even if it may not be code that's run (since the component is never used), this also prevents a malicious user from exploiting this vulnerability or exfiltrating data by somehow injecting a Dapr component spec for the vulnerable component and then running the bad code.

The base (OSS) edition is 68.1 MB in size. The full edition is 86 MB in size. Now the question is, who benefits from saving 18 MB of disk space?

It looks like size isn't a huge savings, but I still see 18 MB as a positive factor, although not a deciding factor in the end.

Lastly, I see cloud-branded daprd images/containers to be a good marketing strategy. Just by having an Azure image, AWS image, Alibaba image, and GCP image, we signal that Dapr runs co-equally across any platform/cloud and users first seeing Dapr can easily see "this works on my cloud" without digging in to really understand the components. Could be a good way to get new users.

@yaron2
Member

yaron2 commented May 18, 2021

From a security perspective this would be another scenario that customers would want. It limits the blast radius of any potential future breach.

Dapr is an OSS project, so I'm not sure what you mean by customers; I'll assume this means users. I think we should revisit this if we see users asking for this.

but I still see 18 MB as a positive factor

Again, I see 18 MB (and 4 MB in Docker terms) as having no impact whatsoever.

Lastly, I see cloud-branded daprd images/containers to be a good marketing strategy.

As a maintainer of the Dapr OSS project, I find it hard to make a decision for the project based on marketing concerns. Having vendor specific images does not contribute anything to the project's ability to run in those environments.

Based on the current info, I'm marking this as triaged unresolved for now and adding the discussion label so that in the future we can revisit this.

@yaron2 yaron2 added triaged/unresolved Indicates an issue that cannot or will not be resolved discussion labels May 18, 2021
@wcs1only
Contributor

wcs1only commented May 19, 2021

So, I've been researching this as part of an embedded device exploration, and in this case, both disk space and in memory binary size are indeed a major concern. The memory budget for the user is measured in hundreds of MB, and our ability to meet this will make the difference between them being able to use Dapr or not. I did an experimental build with just the user's desired components and the delta was significant:

-rwxrwxr-x 1 charlie charlie  81M May 19 14:08 daprd

charlie@ubuntu:~$ pmap -x 2114939
2114939:   /home/charlie/.dapr/bin/daprd --app-id Followertyphoon-Knave --dapr-http-port 39311 --dapr-grpc-port 42531 --log-level info --app-max-concurrency -1 --app-protocol http --components-path /home/charlie/.dapr/components --metrics-port 33755 --app-port 3888 --placement-host-address localhost:50005 --config /home/charlie/.dapr/config.yaml
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000010000   37892   24708       0 r-x-- daprd
0000000002520000   42644   24212       0 r---- daprd
0000000004ed0000    1384     936     336 rw--- daprd
000000000502a000     384     208     208 rw---   [ anon ]
0000004000000000   65536    9980    9980 rw---   [ anon ]
0000ffff84c17000   40132    2476    2476 rw---   [ anon ]
0000ffff87348000     512       0       0 -----   [ anon ]
0000ffff873c8000       4       4       4 rw---   [ anon ]
0000ffff873c9000  523836       0       0 -----   [ anon ]
0000ffffa7358000       4       4       4 rw---   [ anon ]
0000ffffa7359000   65476       0       0 -----   [ anon ]
0000ffffab34a000       4       4       4 rw---   [ anon ]
0000ffffab34b000    8180       0       0 -----   [ anon ]
0000ffffabb48000       4       4       4 rw---   [ anon ]
0000ffffabb49000    1020       0       0 -----   [ anon ]
0000ffffabc48000     384      44      44 rw---   [ anon ]
0000ffffabca8000       4       4       0 r----   [ anon ]
0000ffffabca9000       4       4       0 r-x--   [ anon ]
0000ffffe6d09000     132      16      16 rw---   [ stack ]
---------------- ------- ------- ------- 
total kB          787536   62604   13076

-rwxrwxr-x 1 charlie charlie  36M May 19 13:44 daprd
charlie@ubuntu:~$ pmap -x 2113028
2113028:   /home/charlie/.dapr/bin/daprd --app-id Followertyphoon-Knave --dapr-http-port 39311 --dapr-grpc-port 42531 --log-level info --app-max-concurrency -1 --app-protocol http --components-path /home/charlie/.dapr/components --metrics-port 33755 --app-port 3888 --placement-host-address localhost:50005 --config /home/charlie/.dapr/config.yaml
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000010000   18016   12576       0 r-x-- daprd
00000000011b0000   17500   11288       0 r---- daprd
00000000022d0000    1028     580     172 rw--- daprd
00000000023d1000     280     116     116 rw---   [ anon ]
0000004000000000   65536    6072    6072 rw---   [ anon ]
0000ffff7ba80000   39876    1820    1820 rw---   [ anon ]
0000ffff7e171000     512       0       0 -----   [ anon ]
0000ffff7e1f1000       4       4       4 rw---   [ anon ]
0000ffff7e1f2000  523836       0       0 -----   [ anon ]
0000ffff9e181000       4       4       4 rw---   [ anon ]
0000ffff9e182000   65476       0       0 -----   [ anon ]
0000ffffa2173000       4       4       4 rw---   [ anon ]
0000ffffa2174000    8180       0       0 -----   [ anon ]
0000ffffa2971000       4       4       4 rw---   [ anon ]
0000ffffa2972000    1020       0       0 -----   [ anon ]
0000ffffa2a71000     384      44      44 rw---   [ anon ]
0000ffffa2ad1000       4       4       0 r----   [ anon ]
0000ffffa2ad2000       4       4       0 r-x--   [ anon ]
0000ffffdada5000     132      16      16 rw---   [ stack ]
---------------- ------- ------- -------
total kB          741800   32536    8256

As you can see, the code and data sections of the binary make up the majority of the memory savings, but there was significant improvement to anon allocations as well.
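To make that breakdown concrete, here is a small sketch that redoes the arithmetic on the RSS column of the two `pmap` listings above. The `footprint` type and `savings` function are names made up for this illustration; the numbers are copied from the output.

```go
package main

import "fmt"

// footprint holds resident-set sizes in kB copied from the pmap output above.
type footprint struct {
	binaryRSS []int // RSS of the three mappings backed by the daprd file
	totalRSS  int   // RSS column of the "total kB" line
}

// sum adds up a slice of kB values.
func sum(xs []int) int {
	t := 0
	for _, x := range xs {
		t += x
	}
	return t
}

// savings returns the RSS saved in binary-backed mappings, in all other
// mappings (anon and stack), and in total, going from full to trimmed.
func savings(full, trimmed footprint) (binary, other, total int) {
	total = full.totalRSS - trimmed.totalRSS
	binary = sum(full.binaryRSS) - sum(trimmed.binaryRSS)
	other = total - binary
	return binary, other, total
}

func main() {
	full := footprint{binaryRSS: []int{24708, 24212, 936}, totalRSS: 62604}
	trimmed := footprint{binaryRSS: []int{12576, 11288, 580}, totalRSS: 32536}

	binary, other, total := savings(full, trimmed)
	fmt.Printf("binary-backed: %d kB, anon+stack: %d kB, total: %d kB\n", binary, other, total)
	fmt.Printf("share from binary mappings: %.1f%%\n", 100*float64(binary)/float64(total))
}
```

With these figures the binary-backed mappings account for 25412 kB of the 30068 kB RSS reduction (about 84.5%), and the remaining 4656 kB comes from anon allocations, since the stack is 16 kB in both runs.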

I'm not so much concerned about maintaining "flavors", but having our build system be able to spit out a daprd with an arbitrary list of components I think has the potential to provide value, and would be relatively easy to build. A code generator which spits out a generated ./cmd/daprd/main.go based on an arbitrary list of components would allow for easy production of daprd binaries that contain only needed code. If we decide maintaining flavors is worth our time, then each flavor would just be a list of components as an argument to the build system. This would also make it easy for users to fork dapr and produce their own custom flavors as well.

As for what the matrix of flavors should be, should we decide to go that route, I am less interested in provider-specific flavors and more interested in the maturity of the included components. So flavors could include "only GA", "with beta", "with alpha". A misbehaving alpha component was the reason daprd had >8s startup times on Windows for all versions prior to 1.2 (see dapr/components-contrib#865).
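As a rough sketch of maturity-based flavors, assuming each component carries a stability level: a flavor is then just a filter over the full list. The `Stability` type, the `Flavor` function and the example paths are hypothetical and do not reflect the real status of any component.

```go
package main

import "fmt"

// Stability is a hypothetical maturity level for a component.
type Stability int

const (
	Alpha Stability = iota
	Beta
	GA
)

// Component pairs a hypothetical import path with its maturity level.
type Component struct {
	Path      string
	Stability Stability
}

// Flavor returns the components whose stability is at least minLevel,
// so Flavor(all, GA) is "only GA" and Flavor(all, Alpha) is "with alpha".
func Flavor(all []Component, minLevel Stability) []Component {
	var out []Component
	for _, c := range all {
		if c.Stability >= minLevel {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// Paths and levels are illustrative only.
	all := []Component{
		{Path: "example/stable-store", Stability: GA},
		{Path: "example/beta-binding", Stability: Beta},
		{Path: "example/alpha-pubsub", Stability: Alpha},
	}
	// "with beta" flavor: GA and beta components, no alpha.
	for _, c := range Flavor(all, Beta) {
		fmt.Println(c.Path)
	}
}
```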

@wcs1only
Contributor

I propose that we implement the following as an intermediate step down the path to flavored Dapr builds:

Our build system should take a configuration file as an input with a list of components to include in the build. Based on this input, we should generate a component loader that is used by cmd/daprd/main.go. The output would be a daprd binary that only contains the desired modules, and nothing else. This gives us two short term benefits:

  • Ease of adding new components to dapr/dapr code.
  • Anyone that wants can show up with their own input file and have their own flavor.

In the long term, if we want to maintain flavors as this proposal suggests, doing so becomes as easy as maintaining a YAML file.

I took a first pass at this, and I propose the following format for the component list yaml file:


root_package: "github.com/dapr/components-contrib/"
component_typenames:
  secretstores:
    - runtime_callback: WithSecretStores
      components:
        - path: "aws/secretmanager"
        - path: "azure/keyvault"
...
  bindings:
    - runtime_callback: WithInputBindings
      components:
        - path: "aws/kinesis"
        - path: "zeebe/jobworker"
...
    - runtime_callback: WithOutputBindings
      components:
        - path: "alicloud/oss"
        - path: "apns"
...

With that data, and some minor changes to some component initialization conventions, the code generation piece of this would be relatively straightforward. We could be building daprd instances with arbitrary components in no time.

@yaron2
Member

yaron2 commented May 27, 2021

I support that. Making our (project) and users consumption of components more manageable is a goal with clear benefits. Reducing memory footprint aimed for embedded devices is not (currently) a goal for Dapr.

@artursouza
Member Author

Would this also modify go.mod based on the components that need to be built?

@yaron2
Member

yaron2 commented May 28, 2021

Would this also modify go.mod based on the components that need to be built?

The version of components contrib could default to the latest release and be configurable.

@wcs1only
Contributor

wcs1only commented Jun 1, 2021

Would this also modify go.mod based on the components that need to be built?

Correct me if I'm wrong, but I don't think it needs to. This would be implemented in dapr/dapr, and go.mod doesn't track individual sub-packages, only components-contrib as a whole.

@yaron2
Member

yaron2 commented Jun 1, 2021

It's just the components contrib version, there's no need to specify individual sub packages, in the scope of this change or outside of it.

@wcs1only
Contributor

wcs1only commented Jun 8, 2021

I split off the work of autogenerating component import code based off a YAML file into #3271 since we determined that would be worthwhile to build in 1.3, even if we don't end up going down the flavor route.

@greenie-msft greenie-msft moved this from Planned (Committed) to Backlog in Dapr Roadmap Jun 15, 2021
@artursouza artursouza removed this from Backlog in Dapr Roadmap Jun 18, 2021
@dapr-bot
Collaborator

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

@dapr-bot dapr-bot added stale Issues and PRs without response and removed stale Issues and PRs without response labels Jul 17, 2021
@ewassef

ewassef commented Aug 3, 2021

I've submitted a proposal to solve this here #3513

@artursouza
Member Author

I am reviving this discussion. Pluggable components are now a reality in Dapr, although not yet stable, and components are still built into the Dapr sidecar. To reduce the binary size and exposure to regressions, we can offer variants of the Dapr sidecar images. To begin with, we can offer two variants: "all" and "just stable components".

10 participants