<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a>
The [Bus and Elephant Factor Dataset](https://github.com/geekygirldawn/k8s_data/tree/main/datasets/bus-elephant_cncf) and the analysis in this notebook were created by [Dawn Foster](https://fastwonderblog.com/) and are licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).

# Bus and Elephant Factor Data for Graduated and Incubating Projects

Data gathered on 2023-08-22

This notebook gives a quick view of bus factor and elephant factor for graduated and incubating CNCF projects. It combines data from the new [Devstats Elephant/Bus Factor in Repository Groups](https://all.devstats.cncf.io/d/84/elephant-bus-factor-in-repository-groups?orgId=1&var-period_name=Last%20year&var-metric=prs) dashboard with TAG Contributor Strategy projects data from the Landscape, which is used to map each project to its maturity level.
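
For readers new to these metrics, here is a minimal sketch of how a bus or elephant factor could be computed from raw contribution counts. It assumes the metric is the smallest number of top contributors (individuals for bus factor, organizations for elephant factor) whose combined share reaches 50%, which is consistent with every `BF%` value in the tables in this notebook being above 50. The function name `concentration_factor`, the threshold handling, and the sample counts are all hypothetical and are not taken from Devstats.

```python
def concentration_factor(counts, threshold=50.0):
    """Return (factor, share) for a mapping of contributor -> contribution count.

    factor is the number of top contributors needed to reach `threshold`
    percent of all contributions; share is their combined percentage.
    The 50% threshold is an assumption, not the confirmed Devstats rule.
    """
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0

    running = 0
    # Walk contributors from largest to smallest contribution count
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for position, (_, count) in enumerate(ranked, start=1):
        running += count
        share = 100 * running / total
        if share >= threshold:
            return position, round(share, 2)
    return len(counts), 100.0


# Hypothetical PR counts per organization
single_org = concentration_factor({"org-a": 60, "org-b": 25, "org-c": 15})
shared = concentration_factor({"org-a": 30, "org-b": 30, "org-c": 25, "org-d": 15})
print(single_org, shared)
```

In the first example one organization alone passes the threshold, so the factor is 1. In the second, two organizations are needed.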

**Important Note / Caveat**: People frequently change jobs, so some of the organizational affiliation data will be out of date. To keep it as current as possible, a full-time employee researches and updates this data, and individuals can also update their own entries.

If you clone this repo, you should be able to run this notebook yourself to explore the data in more detail, or you can grab the CSV files bus_by_level.csv or elephant_by_level.csv to analyze using your favorite CSV / spreadsheet app.
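
If you prefer pandas to a spreadsheet, the sketch below shows one way to explore the exported files. To keep it self-contained, it reads a small inline excerpt instead of the real file. The excerpt keeps only four of the columns, with values copied from the tables in this notebook. With a clone of the repo, you would pass `'elephant_by_level.csv'` to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Simplified excerpt of elephant_by_level.csv (subset of columns and rows)
sample_csv = io.StringIO(
    "Project/Repository Group,BF,BF%,level\n"
    "CRI-O,1,84.88,Graduated\n"
    "Envoy,2,51.22,Graduated\n"
    "Keycloak,1,89.39,Incubating\n"
    "K3s,1,91.89,Sandbox\n"
)
df = pd.read_csv(sample_csv)

# Projects where a single organization accounts for the majority of
# contributions, most concentrated first
single_org = df[df["BF"] == 1].sort_values("BF%", ascending=False)
names = single_org["Project/Repository Group"].tolist()
print(names)
```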

In [53]:
# This cell builds the data structures from the input files

import pandas as pd
import requests
import yaml

pd.set_option('display.max_rows', 500)

# Fetch the Devstats project list, which records each project's status
projects_file = requests.get("https://raw.githubusercontent.com/cncf/devstats/master/projects.yaml")
projects_file.raise_for_status()  # stop early if the download failed
projects = yaml.safe_load(projects_file.content.decode("utf-8"))

# Bus factor (users) and elephant factor (orgs) data files
busDF = pd.read_csv('users__2023-08-22_last_year.csv')
elephantDF = pd.read_csv('orgs_2023-08-22_last_year.csv')

# Map each project's name to its status (maturity level)
levels_dict = {}
for project in projects['projects'].values():
    levels_dict[project['name']] = project['status']

# Add the level to each row and save the combined data
busDF['level'] = busDF['Project/Repository Group'].map(levels_dict)
busDF.to_csv('bus_by_level.csv')

elephantDF['level'] = elephantDF['Project/Repository Group'].map(levels_dict)
elephantDF.to_csv('elephant_by_level.csv')
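

One thing to keep in mind about the cell above: `Series.map` with a dictionary leaves `NaN` wherever a name has no matching key, so any repository group whose name does not appear in the projects file ends up with no level and is silently excluded from the filtered tables below. This is a small sketch for spotting those rows. The dictionary and the group name "Example Group" are hypothetical stand-ins for the real `levels_dict` and data.

```python
import pandas as pd

# Hypothetical stand-in for levels_dict
levels = {"Envoy": "Graduated", "Falco": "Incubating"}

frame = pd.DataFrame(
    {"Project/Repository Group": ["Envoy", "Falco", "Example Group"]}
)
frame["level"] = frame["Project/Repository Group"].map(levels)

# Rows that did not match any project name have a missing level
unmapped = frame.loc[frame["level"].isna(), "Project/Repository Group"].tolist()
print(unmapped)
```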

# Graduated Projects by Organization (Elephant Factor)

In [54]:
elephantDF.loc[elephantDF['level'] == 'Graduated'].sort_values(by=['BF'])

Unnamed: 0,Project/Repository Group,BF,BF%,Bus/Elephant Factor Organizations,Oth. #,Oth. %,Top 10 %,Top Organizations,Rem. #,Rem. %,level
158,Vitess,1,70.83,PlanetScale,32,29.17,97.41,"PlanetScale, Cisco, Slack Technologies Inc., T...",23,2.59,Graduated
151,TiKV,1,60.27,PingCAP,18,39.73,98.91,"PingCAP, Independent, Air Works India (Enginee...",9,1.09,Graduated
31,CRI-O,1,84.88,Red Hat Inc.,16,15.12,98.37,"Red Hat Inc., Intel Corporation, Independent, ...",7,1.63,Graduated
51,Flux,1,77.19,Weaveworks Inc.,50,22.81,92.98,"Weaveworks Inc., Independent, (Robots), Cyberc...",41,7.02,Graduated
55,Harbor,1,58.94,VMware Inc.,46,41.06,90.66,"VMware Inc., Hewlett, DaoCloud Network Technol...",37,9.34,Graduated
134,Rook,1,73.84,Red Hat Inc.,27,26.16,93.64,"Red Hat Inc., SUSE LLC, Cybozu, Cloudical, Ind...",18,6.36,Graduated
98,Linkerd,1,84.18,Buoyant Inc.,28,15.82,94.63,"Buoyant Inc., DATAWIRE AG, 株式会社アンドパッド, adidas,...",19,5.37,Graduated
44,Envoy,2,51.22,"Google LLC, Tetrate.io",81,48.78,86.65,"Google LLC, Tetrate.io, Lyft Inc., Tencent Hol...",73,13.35,Graduated
46,etcd,2,57.23,"VMware Inc., Google LLC",44,42.77,90.52,"VMware Inc., Google LLC, Independent, Red Hat ...",36,9.48,Graduated
143,SPIFFE,2,52.14,"Pacific Northwest National Laboratory, Philips",19,47.86,94.87,"Pacific Northwest National Laboratory, Philips...",11,5.13,Graduated


# Graduated Projects by Individual (Bus Factor)

In [55]:
busDF.loc[busDF['level'] == 'Graduated'].sort_values(by=['BF'])

Unnamed: 0,Project/Repository Group,BF,BF%,Bus/Elephant Factor Users,Oth. #,Oth. %,Top 10 %,Top Users,Rem. #,Rem. %,level
145,SPIFFE,3,53.45,"kfox1111, marcofranssen, mrsabath",49,46.55,76.73,"kfox1111, marcofranssen, mrsabath, faisal-memo...",42,23.27,Graduated
31,CRI-O,3,56.73,"saschagrunert, haircommander, sohankunkerkar",83,43.27,74.53,"saschagrunert, haircommander, sohankunkerkar, ...",76,25.47,Graduated
135,Rook,4,50.49,"subhamkrai, travisn, parth-gr, BlaineEXE",89,49.51,66.86,"subhamkrai, travisn, parth-gr, BlaineEXE, Madh...",83,33.14,Graduated
99,Linkerd,4,52.41,"hawkw, olix0r, alpeb, adleong",94,47.59,71.95,"hawkw, olix0r, alpeb, adleong, mateiidavid, kf...",88,28.05,Graduated
160,Vitess,5,52.37,"frouioui, GuptaManan100, shlomi-noach, mattlor...",111,47.63,77.41,"frouioui, GuptaManan100, shlomi-noach, mattlor...",106,22.59,Graduated
146,SPIRE,5,56.16,"guilhermocc, rturner3, azdagron, amartinezfayo...",42,43.84,77.17,"guilhermocc, rturner3, azdagron, amartinezfayo...",37,22.83,Graduated
46,etcd,5,50.93,"ahrtr, serathius, jmhbnz, fuweid, chaochn47",196,49.07,59.17,"ahrtr, serathius, jmhbnz, fuweid, chaochn47, t...",191,40.83,Graduated
56,Harbor,5,52.75,"YangJiao0817, AllForNothing, chlins, MinerYang...",176,47.25,70.37,"YangJiao0817, AllForNothing, chlins, MinerYang...",171,29.63,Graduated
157,TUF,5,53.73,"jku, lukpueh, fridex, mnm678, joshuagl",41,46.27,65.67,"jku, lukpueh, fridex, mnm678, joshuagl, rdimit...",36,34.33,Graduated
52,Flux,7,52.8,"stefanprodan, dholbach, pjbgf, aryan9600, hidd...",146,47.2,63.47,"stefanprodan, dholbach, pjbgf, aryan9600, hidd...",143,36.53,Graduated


# Incubating Projects by Organization (Elephant Factor)

In [56]:
elephantDF.loc[elephantDF['level'] == 'Incubating'].sort_values(by=['BF'])

Unnamed: 0,Project/Repository Group,BF,BF%,Bus/Elephant Factor Organizations,Oth. #,Oth. %,Top 10 %,Top Organizations,Rem. #,Rem. %,level
33,CubeFS,1,61.31,Guangdong OPPO Mobile Telecommunications Corp....,6,38.69,100.0,Guangdong OPPO Mobile Telecommunications Corp....,0,0.0,Incubating
74,Knative,1,55.07,Red Hat Inc.,50,44.93,95.67,"Red Hat Inc., VMware Inc., International Busin...",41,4.33,Incubating
72,Keycloak,1,89.39,Red Hat Inc.,43,10.61,96.35,"Red Hat Inc., Hitachi Ltd., Codecentric AG, Bo...",34,3.65,Incubating
71,Keptn,1,76.36,Dynatrace LLC,19,23.64,98.55,"Dynatrace LLC, Antal International, WeMakeDevs...",10,1.45,Incubating
96,Kyverno,1,60.68,PayFit,75,39.32,92.51,"PayFit, CNCF, Dell, move:elevator, Nirmata Inc...",66,7.49,Incubating
54,gRPC,1,85.6,Google LLC,56,14.4,96.67,"Google LLC, Microsoft Corporation, Adimen, Ind...",47,3.33,Incubating
48,Falco,1,64.5,Sysdig Inc.,45,35.5,96.32,"Sysdig Inc., Independent, MegiTeam, CLASTIX, S...",36,3.68,Incubating
100,Longhorn,1,86.26,SUSE LLC,8,13.74,100.0,"SUSE LLC, Dell, NetApp Inc, Aditro, ScrollStac...",0,0.0,Incubating
42,Dragonfly,1,69.35,Ant Group,14,30.65,99.46,"Ant Group, Alibaba.com, Bytedance Ltd, FNST, a...",5,0.54,Incubating
36,Dapr,1,76.19,Microsoft Corporation,44,23.81,95.76,"Microsoft Corporation, Jetstack Ltd, Diagrid, ...",35,4.24,Incubating


# Incubating Projects by Individual (Bus Factor)

In [57]:
busDF.loc[busDF['level'] == 'Incubating'].sort_values(by=['BF'])

Unnamed: 0,Project/Repository Group,BF,BF%,Bus/Elephant Factor Users,Oth. #,Oth. %,Top 10 %,Top Users,Rem. #,Rem. %,level
97,Kyverno,1,51.26,eddycharly,201,48.74,78.64,"eddycharly, realshuting, chipzoller, fjogeleit...",192,21.36,Incubating
28,Contour,2,64.45,"skriss, sunjayBhatia",60,35.55,83.2,"skriss, sunjayBhatia, izturn, tsaarni, fangfpe...",52,16.8,Incubating
42,Dragonfly,2,50.22,"gaius-qi, jiangliu",79,49.78,79.97,"gaius-qi, jiangliu, jim3ma, imeoer, fcgxz2003,...",71,20.03,Incubating
16,Chaos Mesh,3,63.36,"g1eny0ung, STRRL, cwen0",54,36.64,75.95,"g1eny0ung, STRRL, cwen0, FingerLeader, nioshie...",47,24.05,Incubating
121,OpenMetrics,3,75.0,"SphinxKnight, beorn7, bogdandrutu",1,25.0,100.0,"SphinxKnight, beorn7, bogdandrutu, baloo",0,0.0,Incubating
120,OpenKruise,3,52.32,"zmberg, chrisliu1995, veophi",76,47.68,68.95,"zmberg, chrisliu1995, veophi, songkang7, diann...",69,31.05,Incubating
43,Emissary-ingress,3,56.15,"LanceEa, LukeShu, haq204",32,43.85,77.69,"LanceEa, LukeShu, haq204, d6e-automaton, ddymk...",25,22.31,Incubating
63,in-toto,3,50.36,"adityasaky, marcelamelara, lukpueh",31,49.64,76.64,"adityasaky, marcelamelara, lukpueh, TomHennen,...",24,23.36,Incubating
10,Buildpacks,4,56.22,"natalieparellano, joe-kimmel-vmw, AidanDelaney...",68,43.78,72.61,"natalieparellano, joe-kimmel-vmw, AidanDelaney...",62,27.39,Incubating
14,cert-manager,4,54.11,"SgtCoDFish, inteon, wallrj, irbekrm",188,45.89,68.67,"SgtCoDFish, inteon, wallrj, irbekrm, maelvls, ...",182,31.33,Incubating
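

It can also be useful to look at both metrics side by side for the same project. The sketch below joins the two views on the project name. It uses a few `BF` values copied from the Incubating tables above rather than the full `elephantDF` and `busDF`, and the suffixed column names are just one possible naming choice.

```python
import pandas as pd

# BF values copied from the Incubating tables (excerpt only)
elephant = pd.DataFrame(
    {"Project/Repository Group": ["Kyverno", "Dragonfly", "Keycloak"], "BF": [1, 1, 1]}
)
bus = pd.DataFrame(
    {"Project/Repository Group": ["Kyverno", "Dragonfly", "Contour"], "BF": [1, 2, 2]}
)

# Inner join keeps only the projects present in both excerpts
both = elephant.merge(
    bus, on="Project/Repository Group", suffixes=("_elephant", "_bus")
)
print(both)
```

Because the excerpts only overlap on Kyverno and Dragonfly, those are the only rows in the result. With the full data frames, the join would cover every project that appears in both files.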


# Sandbox Projects by Organization (Elephant Factor)

In [58]:
elephantDF.loc[elephantDF['level'] == 'Sandbox'].sort_values(by=['BF'])

Unnamed: 0,Project/Repository Group,BF,BF%,Bus/Elephant Factor Organizations,Oth. #,Oth. %,Top 10 %,Top Organizations,Rem. #,Rem. %,level
92,kubewarden,1,95.93,SUSE LLC,8,4.07,100.0,"SUSE LLC, Red Hat Inc., Motius, Capgemini, CNC...",0,0.0,Sandbox
65,K3s,1,91.89,SUSE LLC,25,8.11,96.07,"SUSE LLC, Independent, Grid Dynamics, FORTH, H...",16,3.93,Sandbox
116,OpenFeature,1,62.98,Dynatrace LLC,29,37.02,93.26,"Dynatrace LLC, Independent, Skillshare, OpenFe...",20,6.74,Sandbox
67,K8up,1,57.14,VSHN AG,3,42.86,100.0,"VSHN AG, amazee.io, Helio AG, Independent",0,0.0,Sandbox
114,OpenEBS,1,83.15,MayaData Inc. (f/k/a CloudByte Inc),14,16.85,99.3,"MayaData Inc. (f/k/a CloudByte Inc), DataCore ...",5,0.7,Sandbox
125,ORAS,1,73.67,Microsoft Corporation,21,26.33,97.55,"Microsoft Corporation, Amazon, Independent, CN...",12,2.45,Sandbox
73,Keylime,1,58.18,International Business Machines Corporation,2,41.82,100.0,"International Business Machines Corporation, R...",0,0.0,Sandbox
75,ko,1,53.95,Chainguard Inc.,16,46.05,90.79,"Chainguard Inc., Trendyol Group, Google LLC, G...",7,9.21,Sandbox
76,Konveyor,1,79.74,Red Hat Inc.,6,20.26,100.0,"Red Hat Inc., International Business Machines ...",0,0.0,Sandbox
77,kpt,1,74.34,Google LLC,14,25.66,99.34,"Google LLC, Cruise Automation, Spotify AB, Bol...",5,0.66,Sandbox


# Sandbox Projects by Individual (Bus Factor)

In [59]:
# Filter on 'Sandbox' so the results match this section's heading
busDF.loc[busDF['level'] == 'Sandbox'].sort_values(by=['BF'])
