# Summary

### tl;dr

Notes:
* The Terraform project is owned by a company (HashiCorp) and was relicensed on 2023-08-10
* Caveat: This is purely a summary of the data and has not been validated by anyone within the community
* Analysis of contributions to one repo: https://github.com/hashicorp/terraform

Terraform has been dominated by employee contributions, with over 90% of the added and deleted lines coming from HashiCorp employees. The numbers are similar in the year before the relicense and in the year after.

### 1 Year before Relicense (2022-08-10 - 2023-08-10)
HashiCorp Employees with >= 5 commits:
* People: 21 (15.00% of people)
* Commits: 971
* Additions: 202612 (93.30% of total additions)
* Deletions: 81019 (94.71% of total deletions)

Other - people with >= 5 commits:
* People: 2 (1.43% of people)
* Commits: 13
* Additions: 84 (0.04% of total additions)
* Deletions: 33 (0.04% of total deletions)

Totals in dataset of people with >=5 commits:
* 93.34% of total additions
* 94.75% of total deletions

### After relicense (2023-08-10 to 2024-08-10)

Note: I suspect that one of the people in the Other category works at HashiCorp (https://github.com/ritsok)

HashiCorp employees with >= 5 commits:
* People: 24 (22.86% of people)
* Commits: 1620
* Additions: 672393 (90.04% of total additions)
* Deletions: 242052 (93.38% of total deletions)

Other people with >= 5 commits:
* People: 2 (1.90% of people)
* Commits: 18
* Additions: 353 (0.05% of total additions)
* Deletions: 354 (0.14% of total deletions)

Totals in dataset of people with >=5 commits:
* 93.52% of total deletions
* 90.09% of total additions
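
As a quick arithmetic check on the summary, the sketch below recomputes the HashiCorp share of additions and deletions for both windows. The counts are copied from the notebook output further down. The `periods` dictionary and the `share` helper are names introduced here for illustration and are not part of the original analysis.

```python
# Counts copied from the notebook output for each one-year window.
periods = {
    "before relicense": {
        "org_additions": 202612, "org_deletions": 81019,
        "total_additions": 217158, "total_deletions": 85545,
    },
    "after relicense": {
        "org_additions": 672393, "org_deletions": 242052,
        "total_additions": 746753, "total_deletions": 259211,
    },
}


def share(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


# Percentage of all added/deleted lines that came from HashiCorp
# employees with >= 5 commits, per window.
shares = {
    label: {
        "additions_pct": share(p["org_additions"], p["total_additions"]),
        "deletions_pct": share(p["org_deletions"], p["total_deletions"]),
    }
    for label, p in periods.items()
}

for label, s in shares.items():
    print(f"{label}: {s['additions_pct']:.2f}% of additions, "
          f"{s['deletions_pct']:.2f}% of deletions")
```

Both windows come out above 90% on both measures, which is the basis for the tl;dr statement.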

# 1 Year before Relicense (2022-08-10 - 2023-08-10)

In [40]:
from pprint import pprint
import collections
import pandas as pd
import pickle

# Pickle files generated by this script:
# https://github.com/chaoss/wg-data-science/blob/main/dataset/license-changes/fork-case-study/commits_people.py

people_pickle_1yr = '../data-files/terraform_people_2022-08-10T00:00:00.000+00:002023-08-10T00:00:00.000+00:00.pkl'

with open(people_pickle_1yr, 'rb') as f:
    person_dict_1yr = pickle.load(f)
    
len(person_dict_1yr)

140

In [41]:
people = len(person_dict_1yr)
commits = 0
additions = 0
deletions = 0

for key,value in person_dict_1yr.items():
    # Normalize company names and use emails to derive HashiCorp affiliations
    if value['company'] is None:
        for email in value['email']:
            if 'hashicorp.com' in email:
                person_dict_1yr[key]['company'] = 'HashiCorp'
    elif 'hashi' in value['company'].lower():
        person_dict_1yr[key]['company'] = 'HashiCorp'
        
    # Get descriptive statistics
    commits = commits + value['commits']
    additions = additions + value['additions']
    deletions = deletions + value['deletions']
    
print("People:", people)
print("Commits:", commits)
print("Additions", additions)
print("Deletions", deletions)

People: 140
Commits: 1189
Additions 217158
Deletions 85545
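
The affiliation rule used in the cell above is applied again, unchanged, to the second time window. A minimal sketch of the same rule as a reusable function follows. The function name `normalize_company`, its parameters, and the `sample_people` records are hypothetical and only serve to illustrate the rule; they are not taken from the pickle files.

```python
def normalize_company(person, org_name="HashiCorp",
                      org_key="hashi", email_domain="hashicorp.com"):
    """Return a normalized company for one person record.

    Mirrors the notebook's rule: with no company set, fall back to the
    email addresses; otherwise match the company string loosely.
    """
    company = person.get("company")
    if company is None:
        # No self-reported company: look for the organization's email domain.
        if any(email_domain in email for email in person.get("email", [])):
            return org_name
        return None
    if org_key in company.lower():
        # Covers variants such as "@hashicorp" or "Hashicorp, Inc."
        return org_name
    return company


# Hypothetical records, for illustration only.
sample_people = {
    "alice": {"company": None, "email": ["alice@hashicorp.com"]},
    "bob": {"company": "@hashicorp", "email": ["bob@example.com"]},
    "carol": {"company": None, "email": ["carol@users.noreply.github.com"]},
    "dave": {"company": "Example Inc", "email": ["dave@example.com"]},
}

normalized = {login: normalize_company(p) for login, p in sample_people.items()}
print(normalized)
```

People who commit only from a `users.noreply.github.com` address and list no company stay unaffiliated under this rule, which is why the manual fixes later in the notebook are still needed.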


In [42]:
for key,value in person_dict_1yr.items():
    if (value['commits'] >= 5) and (value['company'] is None):
        print(key,value)

iKunal-Singh {'deletions': 21, 'additions': 21, 'company': None, 'name': 'Kunal Singh', 'email': ['109537406+iKunal-Singh@users.noreply.github.com'], 'commits': 7}
hc-github-team-tf-core {'deletions': 21, 'additions': 21, 'company': None, 'name': None, 'email': ['hc-github-team-tf-core@users.noreply.github.com'], 'commits': 6}
apparentlymart {'deletions': 9534, 'additions': 9804, 'company': None, 'name': 'Martin Atkins', 'email': ['mart@degeneration.co.uk'], 'commits': 98}
laurapacilio {'deletions': 2077, 'additions': 709, 'company': None, 'name': 'Laura Pacilio', 'email': ['83350965+laurapacilio@users.noreply.github.com'], 'commits': 75}


In [43]:
# Manual Fixes
person_dict_1yr['apparentlymart']['company'] = 'HashiCorp' # https://www.linkedin.com/in/martin-atkins-ab5a84239
person_dict_1yr['laurapacilio']['company'] = 'HashiCorp' # https://www.linkedin.com/in/laura-pacilio/
person_dict_1yr['rkoron007']['company'] = 'HashiCorp' # https://www.linkedin.com/in/rose-koron/


# Remove bots (pop with a default ignores a missing key)
person_dict_1yr.pop('hc-github-team-tf-core', None)

In [44]:
org_people = 0
org_commits = 0
org_additions = 0
org_deletions = 0

other_people = 0
other_commits = 0
other_additions = 0
other_deletions = 0

for key,value in person_dict_1yr.items():
    # Only count people with at least 5 commits in the period
    if value['commits'] >= 5:
        if value['company'] == 'HashiCorp':
            org_people += 1
            org_commits = org_commits + value['commits']
            org_additions = org_additions + value['additions']
            org_deletions = org_deletions + value['deletions']
        else:
            other_people += 1
            other_commits = other_commits + value['commits']
            other_additions = other_additions + value['additions']
            other_deletions = other_deletions + value['deletions']
            # Print non-HashiCorp contributors for manual review
            print(key,value)

# Shares are formatted as percentages (the raw ratios are fractions of 1)
print("\nHashiCorp Employees with >= 5 commits:", "\n* People:", org_people, f"({org_people/people:.2%} of people)")
print("* Commits:", org_commits)
print("* Additions:", org_additions, f"({org_additions/additions:.2%} of total additions)")
print("* Deletions:", org_deletions, f"({org_deletions/deletions:.2%} of total deletions)")

print("\nOther - people with >= 5 commits:", "\n* People:", other_people, f"({other_people/people:.2%} of people)")
print("* Commits:", other_commits)
print("* Additions:", other_additions, f"({other_additions/additions:.2%} of total additions)")
print("* Deletions:", other_deletions, f"({other_deletions/deletions:.2%} of total deletions)")

print("\nTotals in dataset of people with >=5 commits:")
print('*', f"{(other_additions + org_additions)/additions:.2%} of total additions")
print('*', f"{(other_deletions + org_deletions)/deletions:.2%} of total deletions")

iKunal-Singh {'deletions': 21, 'additions': 21, 'company': None, 'name': 'Kunal Singh', 'email': ['109537406+iKunal-Singh@users.noreply.github.com'], 'commits': 7}
brittandeyoung {'deletions': 12, 'additions': 63, 'company': 'Hagerty', 'name': 'Brittan DeYoung', 'email': ['32572259+brittandeyoung@users.noreply.github.com'], 'commits': 6}

HashiCorp Employees with >= 5 commits: 
* People: 21 (15.00% of people)
* Commits: 971
* Additions: 202612 (93.30% of total additions)
* Deletions: 81019 (94.71% of total deletions)

Other - people with >= 5 commits: 
* People: 2 (1.43% of people)
* Commits: 13
* Additions: 84 (0.04% of total additions)
* Deletions: 33 (0.04% of total deletions)

Totals in dataset of people with >=5 commits:
* 93.34% of total additions
* 94.75% of total deletions


# After relicense (2023-08-10 to 2024-08-10)

In [45]:
from pprint import pprint
import collections
import pandas as pd
import pickle

# Pickle files generated by this script:
# https://github.com/chaoss/wg-data-science/blob/main/dataset/license-changes/fork-case-study/commits_people.py

people_pickle_a = '../data-files/terraform_people_2023-08-10T00:00:00.000+00:002024-08-10T00:00:00.000+00:00.pkl'

with open(people_pickle_a, 'rb') as f:
    person_dict_a = pickle.load(f)

In [46]:
people = len(person_dict_a)
commits = 0
additions = 0
deletions = 0

for key,value in person_dict_a.items():
    # Normalize company names and use emails to derive HashiCorp affiliations
    if value['company'] is None:
        for email in value['email']:
            if 'hashicorp.com' in email:
                person_dict_a[key]['company'] = 'HashiCorp'
    elif 'hashi' in value['company'].lower():
        person_dict_a[key]['company'] = 'HashiCorp'
        
    # Get descriptive statistics
    commits = commits + value['commits']
    additions = additions + value['additions']
    deletions = deletions + value['deletions']
    
print("People:", people)
print("Commits:", commits)
print("Additions", additions)
print("Deletions", deletions)

People: 105
Commits: 1773
Additions 746753
Deletions 259211


In [47]:
for key,value in person_dict_a.items():
    if (value['commits'] >= 5) and (value['company'] is None):
        print(key,value)

vinod827 {'company': None, 'commits': 8, 'additions': 12375, 'email': ['24762720+vinod827@users.noreply.github.com', 'vinod827@gmail.com'], 'deletions': 6126, 'name': 'Vinod Kumar'}
ritsok {'company': None, 'commits': 7, 'additions': 335, 'email': ['8647768+ritsok@users.noreply.github.com', 'msokolova13@gmail.com'], 'deletions': 334, 'name': 'rita'}
hashicorp-tsccr[bot] {'company': None, 'commits': 8, 'additions': 152, 'email': ['129506189+hashicorp-tsccr[bot]@users.noreply.github.com', 'hashicorp-tsccr[bot]@users.noreply.github.com'], 'deletions': 152, 'name': None}
apparentlymart {'company': None, 'commits': 432, 'additions': 152420, 'email': ['mart@degeneration.co.uk'], 'deletions': 53069, 'name': 'Martin Atkins'}
hc-github-team-es-release-engineering {'company': None, 'commits': 8, 'additions': 944, 'email': ['82989873+hc-github-team-es-release-engineering@users.noreply.github.com'], 'deletions': 113, 'name': None}
trujillo-adam {'company': None, 'commits': 38, 'additions': 194, 'e

In [48]:
# Manual Fixes
person_dict_a['apparentlymart']['company'] = 'HashiCorp' # https://www.linkedin.com/in/martin-atkins-ab5a84239
person_dict_a['laurapacilio']['company'] = 'HashiCorp' # https://www.linkedin.com/in/laura-pacilio/
person_dict_a['vinod827']['company'] = 'HashiCorp' # https://www.linkedin.com/in/vinod-kumar-285226192/
person_dict_a['trujillo-adam']['company'] = 'HashiCorp' # https://www.linkedin.com/in/adam-trujillo-05a3176/
person_dict_a['rkoron007']['company'] = 'HashiCorp' # https://www.linkedin.com/in/rose-koron/


# Remove bots / automation accounts (pop with a default ignores a missing key)
for bot in ('hc-github-team-tf-core',
            'hashicorp-tsccr[bot]',
            'hc-github-team-es-release-engineering'):
    person_dict_a.pop(bot, None)
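
The automation accounts above were identified by hand. As a hedged sketch, a naming heuristic can surface candidates for that manual review. The two patterns below (a `[bot]` suffix and an `hc-github-team-` prefix) are an assumption derived only from the three logins removed in this notebook, and the function names and `sample_people` values are hypothetical.

```python
def looks_automated(login):
    """Heuristic only: flag logins that resemble the accounts removed above."""
    return login.endswith("[bot]") or login.startswith("hc-github-team-")


def flag_automation_accounts(person_dict):
    """Return the sorted logins in person_dict that match the heuristic."""
    return sorted(login for login in person_dict if looks_automated(login))


# Placeholder records: the logins of the three removed accounts plus one
# made-up human login. The commit counts are not real data.
sample_people = {
    "hc-github-team-tf-core": {"commits": 1},
    "hashicorp-tsccr[bot]": {"commits": 1},
    "hc-github-team-es-release-engineering": {"commits": 1},
    "example-user": {"commits": 1},
}

flagged = flag_automation_accounts(sample_people)
print("Candidates for review:", flagged)

# Build a cleaned copy rather than deleting in place, so the cell can be
# re-run without errors.
cleaned = {login: record for login, record in sample_people.items()
           if login not in flagged}
print("Remaining:", sorted(cleaned))
```

A heuristic like this can miss automation accounts with ordinary names and can flag real people, so the flagged list is a starting point for review rather than a replacement for it.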

In [49]:
org_people = 0
org_commits = 0
org_additions = 0
org_deletions = 0

other_people = 0
other_commits = 0
other_additions = 0
other_deletions = 0

for key,value in person_dict_a.items():
    # Only count people with at least 5 commits in the period
    if value['commits'] >= 5:
        if value['company'] == 'HashiCorp':
            org_people += 1
            org_commits = org_commits + value['commits']
            org_additions = org_additions + value['additions']
            org_deletions = org_deletions + value['deletions']
        else:
            other_people += 1
            other_commits = other_commits + value['commits']
            other_additions = other_additions + value['additions']
            other_deletions = other_deletions + value['deletions']
            # Print non-HashiCorp contributors for manual review
            print(key,value)

# Shares are formatted as percentages (the raw ratios are fractions of 1)
print("\nHashiCorp employees with >= 5 commits:", "\n* People:", org_people, f"({org_people/people:.2%} of people)")
print("* Commits:", org_commits)
print("* Additions:", org_additions, f"({org_additions/additions:.2%} of total additions)")
print("* Deletions:", org_deletions, f"({org_deletions/deletions:.2%} of total deletions)")

print("\nOther people with >= 5 commits:", "\n* People:", other_people, f"({other_people/people:.2%} of people)")
print("* Commits:", other_commits)
print("* Additions:", other_additions, f"({other_additions/additions:.2%} of total additions)")
print("* Deletions:", other_deletions, f"({other_deletions/deletions:.2%} of total deletions)")

print("\nTotals in dataset of people with >=5 commits:")
print('*', f"{(other_additions + org_additions)/additions:.2%} of total additions")
print('*', f"{(other_deletions + org_deletions)/deletions:.2%} of total deletions")

ritsok {'company': None, 'commits': 7, 'additions': 335, 'email': ['8647768+ritsok@users.noreply.github.com', 'msokolova13@gmail.com'], 'deletions': 334, 'name': 'rita'}
cadamini {'company': 'Information developer at InVision', 'commits': 11, 'additions': 18, 'email': ['christian.adamini@invision.de'], 'deletions': 20, 'name': 'Christian Adamini'}

HashiCorp employees with >= 5 commits: 
* People: 24 (22.86% of people)
* Commits: 1620
* Additions: 672393 (90.04% of total additions)
* Deletions: 242052 (93.38% of total deletions)

Other people with >= 5 commits: 
* People: 2 (1.90% of people)
* Commits: 18
* Additions: 353 (0.05% of total additions)
* Deletions: 354 (0.14% of total deletions)

Totals in dataset of people with >=5 commits:
* 90.09% of total additions
* 93.52% of total deletions
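
The two halves of this notebook repeat the same aggregation with different input files. A minimal sketch of that aggregation as a single function is shown below. The name `summarize`, its parameters, and the `sample_people` records are hypothetical. Note one difference from the cells above: the notebook computes its totals before the bot accounts are removed, whereas this sketch computes totals from whatever dictionary it is given.

```python
def summarize(person_dict, org="HashiCorp", min_commits=5):
    """Split contributors with >= min_commits into org and other groups.

    Returns (totals, groups); totals cover every record in person_dict,
    including people below the commit threshold.
    """
    totals = {
        "people": len(person_dict),
        "additions": sum(p["additions"] for p in person_dict.values()),
        "deletions": sum(p["deletions"] for p in person_dict.values()),
    }
    groups = {
        name: {"people": 0, "commits": 0, "additions": 0, "deletions": 0}
        for name in ("org", "other")
    }
    for person in person_dict.values():
        if person["commits"] < min_commits:
            continue  # below the threshold: counted in totals only
        group = groups["org" if person.get("company") == org else "other"]
        group["people"] += 1
        for field in ("commits", "additions", "deletions"):
            group[field] += person[field]
    return totals, groups


# Hypothetical records, for illustration only.
sample_people = {
    "org-dev": {"company": "HashiCorp", "commits": 10,
                "additions": 900, "deletions": 400},
    "community-dev": {"company": None, "commits": 6,
                      "additions": 50, "deletions": 25},
    "drive-by": {"company": "Example Inc", "commits": 1,
                 "additions": 50, "deletions": 75},
}

totals, groups = summarize(sample_people)
for name, group in groups.items():
    print(f"{name}: {group['people']} people, {group['commits']} commits, "
          f"{group['additions'] / totals['additions']:.2%} of additions, "
          f"{group['deletions'] / totals['deletions']:.2%} of deletions")
```

Calling the function once per pickle file would keep the before and after windows on identical logic, so a fix made for one period cannot be forgotten for the other.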
