
Cluster access for Cilium #12

Closed
tgraf opened this issue Aug 23, 2016 · 6 comments

tgraf commented Aug 23, 2016

If you are interested in filing a request for access to the CNCF Community Cluster, please fill out the details below.

If you are just filing an issue, ignore/delete those fields and file your issue.

First Name

Thomas

Last Name

Graf

Email

thomas@cilium.io

Company/Organization

Cilium

Project Title

Cilium

What existing problem or community challenge does this work address? (Please include any past experience or lessons learned)

Cilium addresses large-scale policy enforcement and addressing for containers with the help of BPF and IPv6. We would love to run continuous regression tests against development Linux kernels to catch regressions in Linux kernel code, in particular code involving BPF.
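
As a flavor of what such a continuous regression probe could look like, here is a minimal sketch (hypothetical file name and harness, not Cilium code) that asks the kernel to load the smallest valid eBPF program through the bpf(2) syscall; a development kernel that regresses BPF program loading would fail this immediately:

    /* bpf_smoke.c - minimal "does this kernel still load eBPF?" probe.
     * Hypothetical test-harness snippet, not Cilium code.
     * Build: gcc -o bpf_smoke bpf_smoke.c (needs Linux uapi headers). */
    #include <linux/bpf.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static __u64 ptr_to_u64(const void *ptr)
    {
        return (__u64)(unsigned long)ptr;
    }

    int main(void)
    {
        /* Smallest valid program: r0 = 0; exit. */
        struct bpf_insn insns[] = {
            { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
            { .code = BPF_JMP | BPF_EXIT },
        };
        char log[4096] = "";
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
        attr.insn_cnt  = sizeof(insns) / sizeof(insns[0]);
        attr.insns     = ptr_to_u64(insns);
        attr.license   = ptr_to_u64("GPL");
        attr.log_buf   = ptr_to_u64(log);
        attr.log_size  = sizeof(log);
        attr.log_level = 1;

        int fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
        if (fd < 0) {
            perror("BPF_PROG_LOAD");
            fputs(log, stderr); /* verifier log explains the rejection */
            return 1;
        }
        puts("BPF_PROG_LOAD ok");
        close(fd);
        return 0;
    }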

Briefly describe the project

Cilium provides fast in-kernel networking and security policy enforcement for containers, based on eBPF programs generated on the fly. It is an experimental project aiming to enable emerging kernel technologies such as BPF and XDP for containers.

However, we regard the testing effort as relevant to any container-related networking solution that relies on kernel functionality.

Do you intend to measure specific metrics during the work? Please describe briefly

  • Network latency and throughput as a function of scale
  • Determine how many application metrics can be measured via kprobes/BPF, etc. (see the sketch after this list)
  • Measure the scalability limits of the BPF-based policy enforcement model
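
In practice, metrics collected by kprobe-attached BPF programs are exported to userspace through BPF maps. The following sketch (hypothetical, userspace side only; in a real setup a kernel-side program would increment the counter) shows that map plumbing via the bpf(2) syscall:

    /* map_metric.c - sketch of the BPF map plumbing that kprobe/BPF
     * metric collection relies on. Userspace side only; hypothetical. */
    #include <linux/bpf.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static int sys_bpf(int cmd, union bpf_attr *attr)
    {
        return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
    }

    int main(void)
    {
        union bpf_attr attr;
        __u32 key = 0;
        __u64 count = 0;

        /* One-slot array map: key 0 -> 64-bit event counter. */
        memset(&attr, 0, sizeof(attr));
        attr.map_type    = BPF_MAP_TYPE_ARRAY;
        attr.key_size    = sizeof(key);
        attr.value_size  = sizeof(count);
        attr.max_entries = 1;

        int map_fd = sys_bpf(BPF_MAP_CREATE, &attr);
        if (map_fd < 0) { perror("BPF_MAP_CREATE"); return 1; }

        /* A kprobe-attached program would bump this slot in the kernel;
         * here we only read it back to show the userspace side. */
        memset(&attr, 0, sizeof(attr));
        attr.map_fd = map_fd;
        attr.key    = (__u64)(unsigned long)&key;
        attr.value  = (__u64)(unsigned long)&count;
        if (sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr) < 0) {
            perror("BPF_MAP_LOOKUP_ELEM");
            return 1;
        }

        printf("metric[0] = %llu\n", (unsigned long long)count);
        close(map_fd);
        return 0;
    }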

Which members of the CNCF community and/or end-users would benefit from your work?

All members with an interest in container networking, or in BPF technology for both instrumentation and networking.

Is the code that you're going to be running 100% open source? If so, what is the URL or URLs where it is located?

https://github.com/cilium/cilium

Do you commit to publishing your results and upstreaming the open source code resulting from your work? Do you agree to this within 2 months of cluster use?

Yes

Will your testing involve containers? If not, could it? What would be entailed in changing your processes to containerize your workload?

Containers only

Are there identified risks which would prevent you from achieving significant results in the project?

We need to run a very recent Linux kernel (4.8-rcX) on the bare-metal systems, and we require IPv6 to be allowed on the network.
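
For completeness, a small host-side sketch of checking these two prerequisites (illustrative only; whether the datacenter network actually passes IPv6 traffic would still need a separate connectivity probe):

    /* node_check.c - illustrative host-side check of the prerequisites
     * above: kernel >= 4.8 and IPv6 enabled on the node. */
    #include <stdio.h>
    #include <sys/utsname.h>
    #include <unistd.h>

    int main(void)
    {
        struct utsname u;
        int major = 0, minor = 0;

        if (uname(&u) != 0) { perror("uname"); return 1; }
        sscanf(u.release, "%d.%d", &major, &minor);
        if (major > 4 || (major == 4 && minor >= 8))
            printf("kernel %s ok\n", u.release);
        else
            printf("kernel %s is older than 4.8\n", u.release);

        /* IPv6 support shows up as /proc/sys/net/ipv6 when enabled. */
        if (access("/proc/sys/net/ipv6", F_OK) == 0)
            puts("IPv6 enabled on this node");
        else
            puts("IPv6 missing (disabled or not compiled in)");
        return 0;
    }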

Have you requested CNCF cluster resources or access in the past? If ‘no’, please skip the next three questions.

No

Please state your contributions to the open source community and any other relevant initiatives

10+ years Linux kernel development, OVS, BPF, Docker, Kubernetes, ...

Number of nodes requested (minimum 20 nodes, maximum 500 nodes). In Q3, maximum increases to 1000 nodes.

We are happy with any number of nodes we can acquire.

Duration of request (minimum 24 hours)

Ideally, we could run a small number of nodes continuously to allow for regression testing as Linux kernel and BPF development progresses.

With or Without an operating system (Restricted to CNCF pre-defined OS and versions)?

We need to run a recent Linux kernel (4.8-rcX) with BPF capabilities enabled, plus the Docker runtime.

How will this testing advance cloud native computing (specifically containerization, orchestration, microservices, or some combination)?

The kernel community currently does not include any regression or automated tests which cover the special scale and performance requirements of cloud native workloads. The aim is to close this gap.

Any other relevant details we should know about while preparing the infrastructure?

@caniszczyk (Contributor)

Thanks for your request!

cc: @cncf/intel-cluster-team

@cncfclusterteam (Contributor)

@tgraf We have a "2 week maximum" use requirement per cluster request. I see in your request that it states "open ended". We can allocate 36 nodes immediately to you, but the allocation would only last 2 weeks. Will that be sufficient for you to complete your work? /cc @cncf/intel-cluster-team

@cncfclusterteam (Contributor)

@tgraf We haven't heard back regarding our previous message. Can you comment?

tgraf (Author) commented Sep 7, 2016

@cncfclusterteam We are still discussing what makes sense here. What the community would really benefit from is continuous testing. Once scale-related regressions have found their way into the code, it is very hard to track them down again without continuous access to infrastructure to reproduce the issues.

dankohn (Contributor) commented Sep 7, 2016

Cc @zsmith928

Zac at Packet has generously offered some bare metal resources to the etcd team to assist with their continuous integration processes. He may be able to help you out as well.

dankohn (Contributor) commented Jan 10, 2017

@tgraf Out of curiosity, were you able to get resources from @zsmith928 and create a CI setup?
