## Summary

### tl;dr

Notes:
* The Redis project is owned by a company (Redis) and was relicensed on 2024-03-20.
* Caveat: This is purely a summary of the data and has not been validated by anyone within the community.
* The analysis covers contributions to one repo: https://github.com/redis/redis

Quite a few of the contributions to Redis before the relicense came from Redis employees; however, this has become more pronounced since Redis relicensed in March 2024. Before the relicense, there were significant contributions from employees of other companies. In the year before the relicense, Redis employees made more additions and deletions (figures that can be inflated by large formatting changes), but contributors from outside Redis made almost twice as many commits (319 versus 164 among people with at least 5 commits).
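As a quick check on the commit comparison above, the shares can be recomputed from the totals reported in the "1 year before relicense" section of this analysis: 592 commits in total, 164 from Redis employees and 319 from non-employees, both counted at the 5-commit threshold used in the code. The `share` helper is a name introduced here purely for illustration.

```python
# Figures copied from the "1 year before relicense" section of this analysis.
total_commits = 592
redis_commits = 164      # Redis employees with >= 5 commits
external_commits = 319   # non-employees with >= 5 commits


def share(part, whole):
    """Format part/whole as a percentage with two decimals (illustrative helper)."""
    return format(part / whole, ".2%")


print("Redis employees:", share(redis_commits, total_commits))
print("Non-employees:", share(external_commits, total_commits))
print("Ratio of non-employee to employee commits:",
      round(external_commits / redis_commits, 2))
```

The ratio is just under 2, which is where "almost twice as many commits" comes from.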

Here are a few notable examples of contributions in the year leading up to the relicense from employees of Amazon, Alibaba, Huawei, Tencent, and Ericsson. All of these people have transitioned from Redis to the Valkey fork, with madolson, enjoy-binbin, and zuiderkwast among the top contributors to Valkey.

|People|Company|Commits|Additions|Deletions|
|:---|:---|:---|:---|:---|
| lyq2333 | Alibaba | 10 | 349 | 190 |
| soloestoy | Alibaba | 31 | 828 | 364 |
| hwware | Huawei Technologies | 25 | 763 | 224 |
| roshkhatri | Amazon | 10 | 1917 | 732 |
| hpatro | Amazon | 14 | 1437 | 632 |
| madolson | Amazon | 24 | 3310 | 1636 |
| enjoy-binbin | Tencent Cloud | 146 | 3502 | 1115 |
| zuiderkwast | Ericsson Software Technology | 16 | 13057 | 10622 |

### After relicense - 1 year (2024-03-20 - 2025-03-20)

The external contributors from Amazon, Alibaba, Tencent, Huawei, and Ericsson who made 10 or more commits in the year leading up to the relicense mostly stopped contributing afterward (some made 1-4 commits after the relicense, which likely reflects work already in progress).

Redis Employees with >= 5 commits:
* People: 10 13.51% of people
* Commits: 303 75.00% of total commits
* Additions: 65901 66.16% of total additions
* Deletions: 16430 72.71% of total deletions

Non-Employees with >= 5 commits:
* People: 0 0.00% of people
* Commits: 0 0.00% of total commits
* Additions: 0 0.00% of total additions
* Deletions: 0 0.00% of total deletions

Totals in dataset of people with >= 5 commits:
* 66.16% of total additions
* 72.71% of total deletions

### After relicense - 6 months (2024-03-20 - 2024-09-20)


Redis Employees with >= 5 commits:
* People: 7 15.22% of people
* Commits: 154 74.40% of total commits
* Additions: 38270 75.36% of total additions
* Deletions: 10464 72.06% of total deletions

Non-Employees with >= 5 commits:
* People: 0 0.00% of people
* Commits: 0 0.00% of total commits
* Additions: 0 0.00% of total additions
* Deletions: 0 0.00% of total deletions

Totals in dataset of people with >= 5 commits:
* 75.36% of total additions
* 72.06% of total deletions

### 1 year before relicense (2023-03-20 - 2024-03-20)

Redis Employees with >= 5 commits:
* People: 6 6.45% of people
* Commits: 164 27.70% of total commits
* Additions: 189656 80.11% of total additions
* Deletions: 83122 73.89% of total deletions

Non-Employees with >= 5 commits:
* People: 12 12.90% of people
* Commits: 319 53.89% of total commits
* Additions: 28334 11.97% of total additions
* Deletions: 16684 14.83% of total deletions

Totals in dataset of people with >= 5 commits:
* 92.08% of total additions
* 88.72% of total deletions

External Contributors with 5+ commits:

|People|Company|Commits|Additions|Deletions|
|:---|:---|:---|:---|:---|
| CharlesChen888 | Looking for a job | 21 | 1362 | 529 |
| lyq2333 | Alibaba | 10 | 349 | 190 |
| meiravgri | None | 5 | 1272 | 361 |
| soloestoy | Alibaba | 31 | 828 | 364 |
| judeng | None | 10 | 489 | 275 |
| hwware | Huawei Technologies | 25 | 763 | 224 |
| moshekaplan | None | 7 | 48 | 4 |
| roshkhatri | Amazon | 10 | 1917 | 732 |
| hpatro | Amazon | 14 | 1437 | 632 |
| madolson | Amazon | 24 | 3310 | 1636 |
| enjoy-binbin | Tencent Cloud | 146 | 3502 | 1115 |
| zuiderkwast | Ericsson Software Technology | 16 | 13057 | 10622 |


# 1 year after relicense (2024-03-20 - 2025-03-20)

In [8]:
from pprint import pprint
import collections
import pandas as pd
import pickle

# Pickle files generated by this script:
# https://github.com/chaoss/wg-data-science/blob/main/dataset/license-changes/fork-case-study/commits_people.py

people_pickle_a = '../data-files/redis_people_2024-03-20T00:00:00.000+00:002025-03-20T00:00:00.000+00:00.pkl'

with open(people_pickle_a, 'rb') as f:
    person_dict_a = pickle.load(f)
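The pickle paths used in each section of this notebook appear to follow one naming pattern (project, start timestamp, end timestamp). A small sketch of building the path from two dates is shown below; `people_pickle_path` and its parameters are hypothetical names, and the pattern is inferred only from the paths that appear in this notebook.

```python
def people_pickle_path(start, end, project="redis", data_dir="../data-files"):
    """Build a pickle path matching the naming pattern seen in this notebook.

    Hypothetical helper: `start` and `end` are ISO dates such as '2024-03-20'.
    """
    # Each date is followed by a fixed midnight-UTC timestamp suffix.
    suffix = "T00:00:00.000+00:00"
    return f"{data_dir}/{project}_people_{start}{suffix}{end}{suffix}.pkl"


one_year_after = people_pickle_path("2024-03-20", "2025-03-20")
print(one_year_after)
```

This keeps the date window in one place, so the section heading and the file being loaded are less likely to drift apart.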

In [15]:
people = len(person_dict_a)
commits = 0
additions = 0
deletions = 0

for key,value in person_dict_a.items():
    # Normalize company names; when no company is listed, use email domains to derive Redis and Amazon affiliations
    if value['company'] is None:
        for email in value['email']:
            if any(x in email.lower() for x in ['redis.com', 'redislabs.com']):
                person_dict_a[key]['company'] = 'Redis Labs'
            if 'amazon.com' in email:
                person_dict_a[key]['company'] = 'Amazon'
    elif 'redis' in value['company'].lower():
        person_dict_a[key]['company'] = 'Redis Labs'
    elif 'alibaba' in value['company'].lower():
        person_dict_a[key]['company'] = 'Alibaba'
    elif any(x in value['company'].lower() for x in ['aws','amazon']):
        person_dict_a[key]['company'] = 'Amazon'
    
        
    # Get descriptive statistics
    commits = commits + value['commits']
    additions = additions + value['additions']
    deletions = deletions + value['deletions']
    
print("People:", people)
print("Commits:", commits)
print("Additions", additions)
print("Deletions", deletions)

People: 74
Commits: 404
Additions 99614
Deletions 22596
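The same normalization logic is repeated for every time window in this notebook. One way to avoid the copy-paste is to move it into a function, sketched below. `normalize_company` is a hypothetical name, and the sketch assumes each record has the `company` and `email` fields shown in the printed output. Unlike the cell above, it lowercases the email before the Amazon check as well.

```python
def normalize_company(company, emails):
    """Return a normalized company label (hypothetical helper).

    Mirrors the rules used in this notebook: substring matches on the
    company field, falling back to email domains when company is None.
    """
    if company is None:
        label = None
        for email in emails:
            addr = email.lower()
            if any(domain in addr for domain in ("redis.com", "redislabs.com")):
                label = "Redis Labs"
            if "amazon.com" in addr:
                label = "Amazon"
        return label

    lowered = company.lower()
    if "redis" in lowered:
        return "Redis Labs"
    if "alibaba" in lowered:
        return "Alibaba"
    if any(name in lowered for name in ("aws", "amazon")):
        return "Amazon"
    # Anything else is left exactly as the contributor wrote it.
    return company


print(normalize_company(None, ["valentino@redis.com"]))
print(normalize_company("Tencent Cloud", ["binloveplay1314@qq.com"]))
```

Because these are substring matches, a company name that merely contains "aws" or "redis" would also be relabeled, so manual review of the results is still needed.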


In [10]:
for key,value in person_dict_a.items():
    if (value['commits'] >= 5) and (value['company'] == None):
        print(key,value)

tezc {'email': ['ozantezcan@gmail.com'], 'deletions': 3695, 'commits': 29, 'additions': 10214, 'company': None, 'name': 'Ozan Tezcan'}


In [11]:
# Manual Fixes
person_dict_a['tezc']['company'] = 'Redis Labs' # confirmed in conversation with Madelyn Olson on 2024-09-18

# Remove bots (no error if the key is absent)
person_dict_a.pop('dependabot[bot]', None)

In [12]:
org_people = 0
org_commits = 0
org_additions = 0
org_deletions = 0

other_people = 0
other_commits = 0
other_additions = 0
other_deletions = 0

for key,value in person_dict_a.items():
    # Only people with at least 5 commits are counted in either group
    if value['commits'] >= 5:
        if value['company'] == 'Redis Labs':
            org_people += 1
            org_commits = org_commits + value['commits']
            org_additions = org_additions + value['additions']
            org_deletions = org_deletions + value['deletions']
        else:
            other_people += 1
            other_commits = other_commits + value['commits']
            other_additions = other_additions + value['additions']
            other_deletions = other_deletions + value['deletions']
            print(key,value)

print("\nRedis Employees with >= 5 commits:", "\n* People:", org_people, format(org_people/people, ".2%"), "of people")
print("* Commits:", org_commits, format(org_commits/commits, ".2%"), "of total commits")
print("* Additions:", org_additions, format(org_additions/additions, ".2%"), "of total additions")
print("* Deletions:", org_deletions, format(org_deletions/deletions, ".2%"), "of total deletions")

print("\nNon-Employees with >= 5 commits:", "\n* People:", other_people, format(other_people/people, ".2%"), "of people")
print("* Commits:", other_commits, format(other_commits/commits, ".2%"), "of total commits")
print("* Additions:", other_additions, format(other_additions/additions, ".2%"), "of total additions")
print("* Deletions:", other_deletions, format(other_deletions/deletions, ".2%"), "of total deletions")

print("\nTotals in dataset of people with >= 5 commits:")
print('*', format((other_additions + org_additions)/additions, ".2%"), "of total additions")
print('*', format((other_deletions + org_deletions)/deletions, ".2%"), "of total deletions")


Redis Employees with >= 5 commits:
* People: 10 13.51% of people
* Commits: 303 75.00% of total commits
* Additions: 65901 66.16% of total additions
* Deletions: 16430 72.71% of total deletions

Non-Employees with >= 5 commits:
* People: 0 0.00% of people
* Commits: 0 0.00% of total commits
* Additions: 0 0.00% of total additions
* Deletions: 0 0.00% of total deletions

Totals in dataset of people with >= 5 commits:
* 66.16% of total additions
* 72.71% of total deletions
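The employee versus non-employee tally above is recomputed with near-identical code in each section. A minimal sketch of the same tally as a reusable function follows. `summarize` and its parameters are hypothetical names, and the two sample records are copied from output shown in this notebook for the one-year window (tezc after the manual company fix, and hpatro).

```python
def summarize(person_dict, org="Redis Labs", min_commits=5):
    """Tally people, commits, additions and deletions for org vs. everyone else.

    Hypothetical helper: only people with at least `min_commits` are counted.
    """
    fields = ("commits", "additions", "deletions")
    totals = {
        "org": {"people": 0, "commits": 0, "additions": 0, "deletions": 0},
        "other": {"people": 0, "commits": 0, "additions": 0, "deletions": 0},
    }
    for value in person_dict.values():
        if value["commits"] < min_commits:
            continue  # below the threshold, not counted in either group
        group = "org" if value["company"] == org else "other"
        totals[group]["people"] += 1
        for field in fields:
            totals[group][field] += value[field]
    return totals


# Two records taken from this notebook's output, used only as a small sample.
sample = {
    "tezc": {"company": "Redis Labs", "commits": 29,
             "additions": 10214, "deletions": 3695},
    "hpatro": {"company": "Amazon", "commits": 2,
               "additions": 148, "deletions": 23},
}
result = summarize(sample)
print(result["org"])
print(result["other"])
```

With a threshold of 5, hpatro's 2 commits fall below the cutoff, which is the same effect that leaves the non-employee group empty in the output above.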


In [48]:
# Make it easy for the print statements to be copied into a Markdown table

adds=0
dels=0

print('|People|Company|Commits|Additions|Deletions|')
print('|:---|:---|:---|:---|:---|')
for key,value in person_dict_a.items():
    try:
        if (value['company'] != 'Redis Labs') and (value['additions'] >= 50):
            adds+=value['additions']
            dels+=value['deletions']
            print('|', key,'|', value['company'], '|', value['commits'],'|', value['additions'],'|', value['deletions'],'|')

    except:
        pass

|People|Company|Commits|Additions|Deletions|
|:---|:---|:---|:---|:---|
| bjosv | @Ericsson  | 1 | 413 | 6 |
| guowangy | None | 3 | 77 | 33 |
| PingXie | Google | 1 | 2205 | 213 |
| alsoalgo | None | 3 | 51 | 24 |
| enjoy-binbin | Tencent Cloud | 3 | 3263 | 308 |
| raz-mon | None | 3 | 660 | 31 |
| hpatro | Amazon | 2 | 148 | 23 |
| ClaytonNorthey92 | None | 2 | 826 | 12 |
| lipzhu | Intel | 3 | 77 | 33 |
| j178 | None | 1 | 198 | 135 |
| madolson | Amazon | 3 | 2387 | 228 |
| zuiderkwast | Ericsson Software Technology | 4 | 3813 | 255 |
| raffertyyu | None | 1 | 64 | 2 |
| xbasel | None | 1 | 2205 | 213 |
| uriyage | Amazon | 1 | 174 | 15 |
| vitahlin | None | 4 | 311 | 91 |
| ranshid | Amazon | 2 | 4410 | 426 |
| naglera | Amazon | 4 | 4540 | 450 |
| Nugine | None | 3 | 373 | 18 |
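Since the table cell above exists to produce Markdown that can be pasted elsewhere, the rows can also be built as a single string. The sketch below assumes the same record fields; `markdown_table` and `keep` are hypothetical names, and the sample records are copied from this notebook's output.

```python
def markdown_table(person_dict, keep):
    """Return a Markdown table of the records for which keep(value) is true.

    Hypothetical helper mirroring the print-based cell in this notebook.
    """
    lines = [
        "|People|Company|Commits|Additions|Deletions|",
        "|:---|:---|:---|:---|:---|",
    ]
    for key, value in person_dict.items():
        if keep(value):
            lines.append(
                f"| {key} | {value['company']} | {value['commits']} | "
                f"{value['additions']} | {value['deletions']} |"
            )
    return "\n".join(lines)


# Small sample copied from this notebook's output.
sample = {
    "tezc": {"company": "Redis Labs", "commits": 29,
             "additions": 10214, "deletions": 3695},
    "hpatro": {"company": "Amazon", "commits": 2,
               "additions": 148, "deletions": 23},
}
table = markdown_table(
    sample,
    keep=lambda v: v["company"] != "Redis Labs" and v["additions"] >= 50,
)
print(table)
```

Returning a string rather than printing makes it possible to write the table to a file or compare it against an earlier copy, which helps catch hand-copied tables that have gone stale.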



# 6 months after relicense (2024-03-20 - 2024-09-20)

In [50]:
from pprint import pprint
import collections
import pandas as pd
import pickle

# Pickle files generated by this script:
# https://github.com/chaoss/wg-data-science/blob/main/dataset/license-changes/fork-case-study/commits_people.py

people_pickle_a = '../data-files/redis_people_2024-03-20T00:00:00.000+00:002024-09-20T00:00:00.000+00:00.pkl'

with open(people_pickle_a, 'rb') as f:
    person_dict_a = pickle.load(f)

In [51]:
people = len(person_dict_a)
commits = 0
additions = 0
deletions = 0

for key,value in person_dict_a.items():
    # Normalize company names; when no company is listed, use email domains to derive Redis and Amazon affiliations
    if value['company'] is None:
        for email in value['email']:
            if any(x in email.lower() for x in ['redis.com', 'redislabs.com']):
                person_dict_a[key]['company'] = 'Redis Labs'
            if 'amazon.com' in email:
                person_dict_a[key]['company'] = 'Amazon'
    elif 'redis' in value['company'].lower():
        person_dict_a[key]['company'] = 'Redis Labs'
    elif 'alibaba' in value['company'].lower():
        person_dict_a[key]['company'] = 'Alibaba'
    elif any(x in value['company'].lower() for x in ['aws','amazon']):
        person_dict_a[key]['company'] = 'Amazon'
    
        
    # Get descriptive statistics
    commits = commits + value['commits']
    additions = additions + value['additions']
    deletions = deletions + value['deletions']
    
print("People:", people)
print("Commits:", commits)
print("Additions", additions)
print("Deletions", deletions)

People: 46
Commits: 207
Additions 50786
Deletions 14521


In [52]:
for key,value in person_dict_a.items():
    if (value['commits'] >= 5) and (value['company'] == None):
        print(key,value)

YaacovHazan {'company': None, 'additions': 758, 'name': None, 'deletions': 280, 'email': ['31382944+YaacovHazan@users.noreply.github.com'], 'commits': 5}
tezc {'company': None, 'additions': 5472, 'name': 'Ozan Tezcan', 'deletions': 3282, 'email': ['ozantezcan@gmail.com'], 'commits': 19}


In [53]:
# Manual Fixes
person_dict_a['YaacovHazan']['company'] = 'Redis Labs' # https://www.linkedin.com/in/yaacov-hazan-b8043a99/
person_dict_a['tezc']['company'] = 'Redis Labs' # confirmed in conversation with Madelyn Olson on 2024-09-18

# Remove bots (no error if the key is absent)
person_dict_a.pop('dependabot[bot]', None)

In [54]:
org_people = 0
org_commits = 0
org_additions = 0
org_deletions = 0

other_people = 0
other_commits = 0
other_additions = 0
other_deletions = 0

for key,value in person_dict_a.items():
    # Only people with at least 5 commits are counted in either group
    if value['commits'] >= 5:
        if value['company'] == 'Redis Labs':
            org_people += 1
            org_commits = org_commits + value['commits']
            org_additions = org_additions + value['additions']
            org_deletions = org_deletions + value['deletions']
        else:
            other_people += 1
            other_commits = other_commits + value['commits']
            other_additions = other_additions + value['additions']
            other_deletions = other_deletions + value['deletions']
            print(key,value)

print("\nRedis Employees with >= 5 commits:", "\n* People:", org_people, format(org_people/people, ".2%"), "of people")
print("* Commits:", org_commits, format(org_commits/commits, ".2%"), "of total commits")
print("* Additions:", org_additions, format(org_additions/additions, ".2%"), "of total additions")
print("* Deletions:", org_deletions, format(org_deletions/deletions, ".2%"), "of total deletions")

print("\nNon-Employees with >= 5 commits:", "\n* People:", other_people, format(other_people/people, ".2%"), "of people")
print("* Commits:", other_commits, format(other_commits/commits, ".2%"), "of total commits")
print("* Additions:", other_additions, format(other_additions/additions, ".2%"), "of total additions")
print("* Deletions:", other_deletions, format(other_deletions/deletions, ".2%"), "of total deletions")

print("\nTotals in dataset of people with >= 5 commits:")
print('*', format((other_additions + org_additions)/additions, ".2%"), "of total additions")
print('*', format((other_deletions + org_deletions)/deletions, ".2%"), "of total deletions")


Redis Employees with >= 5 commits:
* People: 7 15.22% of people
* Commits: 154 74.40% of total commits
* Additions: 38270 75.36% of total additions
* Deletions: 10464 72.06% of total deletions

Non-Employees with >= 5 commits:
* People: 0 0.00% of people
* Commits: 0 0.00% of total commits
* Additions: 0 0.00% of total additions
* Deletions: 0 0.00% of total deletions

Totals in dataset of people with >= 5 commits:
* 75.36% of total additions
* 72.06% of total deletions


In [55]:
# Make it easy for the print statements to be copied into a Markdown table

adds=0
dels=0

print('|People|Company|Commits|Additions|Deletions|')
print('|:---|:---|:---|:---|:---|')
for key,value in person_dict_a.items():
    try:
        if (value['company'] != 'Redis Labs') and (value['additions'] >= 5):
            adds+=value['additions']
            dels+=value['deletions']
            print('|', key,'|', value['company'], '|', value['commits'],'|', value['additions'],'|', value['deletions'],'|')

    except:
        pass

|People|Company|Commits|Additions|Deletions|
|:---|:---|:---|:---|:---|
| ClaytonNorthey92 | None | 2 | 826 | 12 |
| enjoy-binbin | Tencent Cloud | 2 | 1058 | 95 |
| stevelipinski | None | 1 | 41 | 30 |
| paoloredis | None | 1 | 24 | 0 |
| uriyage | Amazon | 1 | 174 | 15 |
| madolson | Amazon | 1 | 174 | 15 |
| AcherTT | None | 1 | 29 | 0 |
| vitahlin | None | 2 | 260 | 83 |
| judeng | None | 1 | 6 | 36 |
| guowangy | None | 1 | 40 | 24 |
| guanmengshi | None | 1 | 28 | 0 |
| yveslb | None | 2 | 785 | 31 |
| zuiderkwast | Ericsson Software Technology | 3 | 1608 | 42 |
| dev-jonghoonpark | None | 1 | 30 | 101 |
| bjosv | @Ericsson  | 1 | 413 | 6 |
| udi-speedb | None | 1 | 21 | 1 |
| linzihao1999 | Amazon | 1 | 17 | 14 |
| j178 | None | 1 | 198 | 135 |
| lyq2333 | Alibaba | 1 | 11 | 0 |
| naglera | Amazon | 2 | 130 | 24 |
| cyy-tag | None | 2 | 18 | 2 |
| lipzhu | Intel | 1 | 40 | 24 |
| hpatro | Amazon | 2 | 148 | 23 |



In [25]:
# These are the people with fewer than 5 commits (below the threshold used above), regardless of employer
adds = 0
dels = 0
for key,value in person_dict_a.items():
    if (value['commits'] < 5):
        print(key,value)
        adds+=value['additions']
        dels+=value['deletions']
print("Adds:", adds, "Dels", dels)

ClaytonNorthey92 {'company': None, 'additions': 826, 'name': None, 'deletions': 12, 'email': ['clayton.northey@gmail.com'], 'commits': 2}
enjoy-binbin {'company': 'Tencent Cloud', 'additions': 1058, 'name': 'Binbin', 'deletions': 95, 'email': ['binloveplay1314@qq.com'], 'commits': 2}
panzhongxian {'company': 'Tencent', 'additions': 4, 'name': 'Zhongxian Pan', 'deletions': 11, 'email': ['panzhongxian0532@gmail.com'], 'commits': 1}
valentinogeron {'company': 'Redis Labs', 'additions': 86, 'name': 'Valentino Geron', 'deletions': 21, 'email': ['valentino@redis.com'], 'commits': 1}
mxmlkzdh {'company': 'Google', 'additions': 0, 'name': 'Max Malekzadeh', 'deletions': 1, 'email': ['11231195+mxmlkzdh@users.noreply.github.com'], 'commits': 1}
stevelipinski {'company': None, 'additions': 41, 'name': 'Steve', 'deletions': 30, 'email': ['7024856+stevelipinski@users.noreply.github.com'], 'commits': 1}
CoolThi {'company': None, 'additions': 1, 'name': None, 'deletions': 3, 'email': ['xchy233@gmail.c

# 1 year before relicense (2023-03-20 - 2024-03-20)

In [26]:
from pprint import pprint
import collections
import pandas as pd
import pickle

# Pickle files generated by this script:
# https://github.com/chaoss/wg-data-science/blob/main/dataset/license-changes/fork-case-study/commits_people.py

people_pickle_1yr = '../data-files/redis_people_2023-03-20T00:00:00.000+00:002024-03-20T00:00:00.000+00:00.pkl'

with open(people_pickle_1yr, 'rb') as f:
    person_dict_1yr = pickle.load(f)
    
len(person_dict_1yr)

93

In [27]:
people = len(person_dict_1yr)
commits = 0
additions = 0
deletions = 0

for key,value in person_dict_1yr.items():
    # Normalize company names; when no company is listed, use email domains to derive Redis and Amazon affiliations
    if value['company'] is None:
        for email in value['email']:
            if any(x in email.lower() for x in ['redis.com', 'redislabs.com']):
                person_dict_1yr[key]['company'] = 'Redis Labs'
            if 'amazon.com' in email:
                person_dict_1yr[key]['company'] = 'Amazon'
    elif 'redis' in value['company'].lower():
        person_dict_1yr[key]['company'] = 'Redis Labs'
    elif 'alibaba' in value['company'].lower():
        person_dict_1yr[key]['company'] = 'Alibaba'
    elif any(x in value['company'].lower() for x in ['aws','amazon']):
        person_dict_1yr[key]['company'] = 'Amazon'
    
        
    # Get descriptive statistics
    commits = commits + value['commits']
    additions = additions + value['additions']
    deletions = deletions + value['deletions']
    
print("People:", people)
print("Commits:", commits)
print("Additions", additions)
print("Deletions", deletions)

People: 93
Commits: 592
Additions 236730
Deletions 112490


In [28]:
for key,value in person_dict_1yr.items():
    if (value['commits'] >= 10) and (value['company'] == None):
        print(key,value)

YaacovHazan {'name': None, 'company': None, 'additions': 159, 'commits': 12, 'email': ['31382944+YaacovHazan@users.noreply.github.com'], 'deletions': 183}
yossigo {'name': 'Yossi Gottlieb', 'company': None, 'additions': 8040, 'commits': 11, 'email': ['yossigo@gmail.com'], 'deletions': 1179}
judeng {'name': None, 'company': None, 'additions': 489, 'commits': 10, 'email': ['abc3844@126.com'], 'deletions': 275}


In [29]:
# Manual Fixes
person_dict_1yr['YaacovHazan']['company'] = 'Redis Labs' # https://www.linkedin.com/in/yaacov-hazan-b8043a99/
person_dict_1yr['yossigo']['company'] = 'Redis Labs'  # https://www.linkedin.com/in/yossi-gottlieb-40842/
person_dict_1yr['tezc']['company'] = 'Redis Labs' # confirmed in conversation with Madelyn Olson on 2024-09-18


# Remove bots (no error if the key is absent)
person_dict_1yr.pop('dependabot[bot]', None)

In [30]:
for key,value in person_dict_1yr.items():
    try:
        if (value['commits'] >= 10) and (value['company'] != 'Redis Labs'):
            print(key,value['company'],value['commits'],value['additions'],value['deletions'])
    except:
        pass

CharlesChen888 Looking for a job 21 1362 529
lyq2333 Alibaba 10 349 190
soloestoy Alibaba 31 828 364
judeng None 10 489 275
hwware Huawei Technologies 25 763 224
roshkhatri Amazon 10 1917 732
hpatro Amazon 14 1437 632
madolson Amazon 24 3310 1636
enjoy-binbin Tencent Cloud 146 3502 1115
zuiderkwast Ericsson Software Technology 16 13057 10622


In [31]:
org_people = 0
org_commits = 0
org_additions = 0
org_deletions = 0

other_people = 0
other_commits = 0
other_additions = 0
other_deletions = 0

for key,value in person_dict_1yr.items():
    # Only people with at least 5 commits are counted in either group
    if value['commits'] >= 5:
        if value['company'] == 'Redis Labs':
            org_people += 1
            org_commits = org_commits + value['commits']
            org_additions = org_additions + value['additions']
            org_deletions = org_deletions + value['deletions']
        else:
            other_people += 1
            other_commits = other_commits + value['commits']
            other_additions = other_additions + value['additions']
            other_deletions = other_deletions + value['deletions']
            print(key,value)

print("\nRedis Employees with >= 5 commits:", "\n* People:", org_people, format(org_people/people, ".2%"), "of people")
print("* Commits:", org_commits, format(org_commits/commits, ".2%"), "of total commits")
print("* Additions:", org_additions, format(org_additions/additions, ".2%"), "of total additions")
print("* Deletions:", org_deletions, format(org_deletions/deletions, ".2%"), "of total deletions")

print("\nNon-Employees with >= 5 commits:", "\n* People:", other_people, format(other_people/people, ".2%"), "of people")
print("* Commits:", other_commits, format(other_commits/commits, ".2%"), "of total commits")
print("* Additions:", other_additions, format(other_additions/additions, ".2%"), "of total additions")
print("* Deletions:", other_deletions, format(other_deletions/deletions, ".2%"), "of total deletions")

print("\nTotals in dataset of people with >= 5 commits:")
print('*', format((other_additions + org_additions)/additions, ".2%"), "of total additions")
print('*', format((other_deletions + org_deletions)/deletions, ".2%"), "of total deletions")

CharlesChen888 {'name': 'Chen Tianjie', 'company': 'Looking for a job', 'additions': 1362, 'commits': 21, 'email': ['chentianjie.ctj@alibaba-inc.com', 'TJ_Chen@outlook.com'], 'deletions': 529}
lyq2333 {'name': 'Yanqi Lv', 'company': 'Alibaba', 'additions': 349, 'commits': 10, 'email': ['lvyanqi.lyq@alibaba-inc.com', '50293466+lyq2333@users.noreply.github.com'], 'deletions': 190}
meiravgri {'name': None, 'company': None, 'additions': 1272, 'commits': 5, 'email': ['109056284+meiravgri@users.noreply.github.com'], 'deletions': 361}
soloestoy {'name': 'zhaozhao.zz', 'company': 'Alibaba', 'additions': 828, 'commits': 31, 'email': ['zhaozhao.zz@alibaba-inc.com'], 'deletions': 364}
judeng {'name': None, 'company': None, 'additions': 489, 'commits': 10, 'email': ['abc3844@126.com'], 'deletions': 275}
hwware {'name': 'Wen Hui', 'company': 'Huawei Technologies', 'additions': 763, 'commits': 25, 'email': ['wen.hui.ware@gmail.com'], 'deletions': 224}
moshekaplan {'name': 'Moshe Kaplan', 'company': 

In [32]:
# Make it easy for the print statements to be copied into a Markdown table
print('|People|Company|Commits|Additions|Deletions|')
print('|:---|:---|:---|:---|:---|')
for key,value in person_dict_1yr.items():
    try:
        if (value['company'] != 'Redis Labs') and (value['commits'] >= 5):
            print('|', key,'|', value['company'], '|', value['commits'],'|', value['additions'],'|', value['deletions'],'|')

    except:
        pass

|People|Company|Commits|Additions|Deletions|
|:---|:---|:---|:---|:---|
| CharlesChen888 | Looking for a job | 21 | 1362 | 529 |
| lyq2333 | Alibaba | 10 | 349 | 190 |
| meiravgri | None | 5 | 1272 | 361 |
| soloestoy | Alibaba | 31 | 828 | 364 |
| judeng | None | 10 | 489 | 275 |
| hwware | Huawei Technologies | 25 | 763 | 224 |
| moshekaplan | None | 7 | 48 | 4 |
| roshkhatri | Amazon | 10 | 1917 | 732 |
| hpatro | Amazon | 14 | 1437 | 632 |
| madolson | Amazon | 24 | 3310 | 1636 |
| enjoy-binbin | Tencent Cloud | 146 | 3502 | 1115 |
| zuiderkwast | Ericsson Software Technology | 16 | 13057 | 10622 |


Manually copied from above

|People|Company|Commits|Additions|Deletions|
|:---|:---|:---|:---|:---|
| CharlesChen888 | Looking for a job | 21 | 1362 | 529 |
| felipou | None | 1 | 138 | 0 |
| lyq2333 | Alibaba | 10 | 349 | 190 |
| meiravgri | None | 5 | 1272 | 361 |
| slavak | None | 3 | 267 | 75 |
| soloestoy | Alibaba | 31 | 828 | 364 |
| judeng | None | 10 | 489 | 275 |
| hwware | Huawei Technologies | 25 | 763 | 224 |
| roshkhatri | Amazon | 10 | 1917 | 732 |
| hpatro | Amazon | 14 | 1437 | 632 |
| madolson | Amazon | 24 | 3310 | 1636 |
| vitarb | None | 2 | 2312 | 1128 |
| panjf2000 | @gnet-io | 3 | 163 | 73 |
| enjoy-binbin | Tencent Cloud | 146 | 3502 | 1115 |
| zuiderkwast | Ericsson Software Technology | 16 | 13057 | 10622 |

# 2 years before relicense (2022-03-20 - 2024-03-20)

In [33]:
from pprint import pprint
import collections
import pandas as pd
import pickle

# Pickle files generated by this script:
# https://github.com/chaoss/wg-data-science/blob/main/dataset/license-changes/fork-case-study/commits_people.py

people_pickle_2yr = '../data-files/redis_people_2022-03-20T00:00:00.000+00:002024-03-20T00:00:00.000+00:00.pkl'

with open(people_pickle_2yr, 'rb') as f:
    person_dict_2yr = pickle.load(f)
    
len(person_dict_2yr)

184

In [34]:
people = len(person_dict_2yr)
commits = 0
additions = 0
deletions = 0

for key,value in person_dict_2yr.items():
    # Normalize company names; when no company is listed, use email domains to derive Redis and Amazon affiliations
    if value['company'] is None:
        for email in value['email']:
            if any(x in email.lower() for x in ['redis.com', 'redislabs.com']):
                person_dict_2yr[key]['company'] = 'Redis Labs'
            if 'amazon.com' in email:
                person_dict_2yr[key]['company'] = 'Amazon'
    elif 'redis' in value['company'].lower():
        person_dict_2yr[key]['company'] = 'Redis Labs'
    elif 'alibaba' in value['company'].lower():
        person_dict_2yr[key]['company'] = 'Alibaba'
    elif any(x in value['company'].lower() for x in ['aws','amazon']):
        person_dict_2yr[key]['company'] = 'Amazon'
    
        
    # Get descriptive statistics
    commits = commits + value['commits']
    additions = additions + value['additions']
    deletions = deletions + value['deletions']
    
print("People:", people)
print("Commits:", commits)
print("Additions", additions)
print("Deletions", deletions)

People: 184
Commits: 1331
Additions 323721
Deletions 132462


In [35]:
for key,value in person_dict_2yr.items():
    if (value['commits'] >= 10) and (value['company'] == None):
        print(key,value)

YaacovHazan {'additions': 164, 'deletions': 184, 'name': None, 'email': ['31382944+YaacovHazan@users.noreply.github.com'], 'commits': 13, 'company': None}
tezc {'additions': 7944, 'deletions': 569, 'name': 'Ozan Tezcan', 'email': ['ozantezcan@gmail.com'], 'commits': 20, 'company': None}
yossigo {'additions': 10351, 'deletions': 1522, 'name': 'Yossi Gottlieb', 'email': ['yossigo@gmail.com'], 'commits': 25, 'company': None}
dependabot[bot] {'additions': 58, 'deletions': 59, 'name': None, 'email': ['49699333+dependabot[bot]@users.noreply.github.com'], 'commits': 11, 'company': None}
judeng {'additions': 610, 'deletions': 378, 'name': None, 'email': ['abc3844@126.com'], 'commits': 23, 'company': None}


In [36]:
# Manual Fixes
person_dict_2yr['YaacovHazan']['company'] = 'Redis Labs' # https://www.linkedin.com/in/yaacov-hazan-b8043a99/
person_dict_2yr['yossigo']['company'] = 'Redis Labs'  # https://www.linkedin.com/in/yossi-gottlieb-40842/
person_dict_2yr['tezc']['company'] = 'Redis Labs' # confirmed in conversation with Madelyn Olson on 2024-09-18

# Remove bots (no error if the key is absent)
person_dict_2yr.pop('dependabot[bot]', None)

In [37]:
for key,value in person_dict_2yr.items():
    try:
        if (value['commits'] >= 10) and (value['company'] != 'Redis Labs'):
            print(key,value['company'],value['commits'],value['additions'],value['deletions'])
    except:
        pass

CharlesChen888 Looking for a job 25 1701 559
ncghost1 Magic School 10 62 15
pizhenwei ByteDance 22 1576 824
enjoy-binbin Tencent Cloud 263 6749 2457
soloestoy Alibaba 40 1025 511
judeng None 23 610 378
lyq2333 Alibaba 10 349 190
madolson Amazon 67 11318 3320
hpatro Amazon 20 1791 728
hwware Huawei Technologies 57 1205 330
roshkhatri Amazon 11 1964 734
zuiderkwast Ericsson Software Technology 38 15994 11944


In [38]:
org_people = 0
org_commits = 0
org_additions = 0
org_deletions = 0

other_people = 0
other_commits = 0
other_additions = 0
other_deletions = 0

for key,value in person_dict_2yr.items():
    # Only people with at least 5 commits are counted in either group
    if value['commits'] >= 5:
        if value['company'] == 'Redis Labs':
            org_people += 1
            org_commits = org_commits + value['commits']
            org_additions = org_additions + value['additions']
            org_deletions = org_deletions + value['deletions']
        else:
            other_people += 1
            other_commits = other_commits + value['commits']
            other_additions = other_additions + value['additions']
            other_deletions = other_deletions + value['deletions']
            print(key,value)

print("\nRedis Employees with >= 5 commits:", "\n* People:", org_people, format(org_people/people, ".2%"), "of people")
print("* Commits:", org_commits, format(org_commits/commits, ".2%"), "of total commits")
print("* Additions:", org_additions, format(org_additions/additions, ".2%"), "of total additions")
print("* Deletions:", org_deletions, format(org_deletions/deletions, ".2%"), "of total deletions")

print("\nNon-Employees with >= 5 commits:", "\n* People:", other_people, format(other_people/people, ".2%"), "of people")
print("* Commits:", other_commits, format(other_commits/commits, ".2%"), "of total commits")
print("* Additions:", other_additions, format(other_additions/additions, ".2%"), "of total additions")
print("* Deletions:", other_deletions, format(other_deletions/deletions, ".2%"), "of total deletions")

print("\nTotals in dataset of people with >= 5 commits:")
print('*', format((other_additions + org_additions)/additions, ".2%"), "of total additions")
print('*', format((other_deletions + org_deletions)/deletions, ".2%"), "of total deletions")

devnexen {'additions': 139, 'deletions': 17, 'name': 'David CARLIER', 'email': ['devnexen@gmail.com'], 'commits': 7, 'company': None}
CharlesChen888 {'additions': 1701, 'deletions': 559, 'name': 'Chen Tianjie', 'email': ['chentianjie.ctj@alibaba-inc.com', 'TJ_Chen@outlook.com'], 'commits': 25, 'company': 'Looking for a job'}
ncghost1 {'additions': 62, 'deletions': 15, 'name': 'Eriri', 'email': ['275955589@qq.com', 'eriri233@qq.com', '41555481+ncghost1@users.noreply.github.com'], 'commits': 10, 'company': 'Magic School'}
pizhenwei {'additions': 1576, 'deletions': 824, 'name': 'zhenwei pi', 'email': ['pizhenwei@bytedance.com'], 'commits': 22, 'company': 'ByteDance'}
enjoy-binbin {'additions': 6749, 'deletions': 2457, 'name': 'Binbin', 'email': ['binloveplay1314@qq.com'], 'commits': 263, 'company': 'Tencent Cloud'}
soloestoy {'additions': 1025, 'deletions': 511, 'name': 'zhaozhao.zz', 'email': ['zhaozhao.zz@alibaba-inc.com', '276441700@qq.com'], 'commits': 40, 'company': 'Alibaba'}
judeng