# Revenue growth by client volume segment

Here we want to look at the performance of various customer segments by annual performance. The customer segments are:

1. Less than \\$100K per year
2. \\$100K to \\$999K per year
3. \\$1M to \\$10M per year
4. More than \\$10M per year

The metrics we want to collect per segment are:

- ~~Average annual revenue increase (percent change year over year)~~
- ~~Average donor count growth (percent change year over year)~~
- ~~Average transaction value by dollar amount for one time donations~~
- ~~Average first time donor retention rate~~
- ~~Average donor retention rate~~
- ~~Total revenue by dollar value~~
- ~~re-run all metrics for top 20% and again for segmented of the top 20%~~

In [328]:
import sys
sys.path.insert(1, '../../scripts/')
from s3_support import *

from IPython.display import Markdown, display
import pandas as pd
import numpy as np

In [329]:
SEGMENTS = [
    (0, 100000),
    (100000, 1000000),
    (1000000, 10000000),
    10000000
]

# 1. Isolate organization segments

We need to identify which organizations belong to which revenue segment. We will do that by pulling the organizations annual revenue, summing all transactions grouped by org and year.

## Total revenue vs fundraising revenue

_Total revenue_ is to be considered all recorded revenue for an organization, whether it was processed on our platform or not. 

This is identified in the Bloomerang data by the `total_revenue` column. For the Qgiv data, this is retrieved from the database by the `transaction.status in ('A', 'U', 'CR', 'PR')` parameter.


_Fundraising revenue_ is to be considered all revenue processed on our platform for an organization.

This is identified in the Bloomerang data by the `fundraising_revenue` column. For the Qgiv data, this is retrieved from the database by the `transaction.status='A'` parameter.

## Bloomerang data

Loading bloomerang data from an export from Jaime Luong in `bloomerang_orgs_year.csv`.

Data updated with backfilled, estimated segment tags in `rev-segment-data/2024-12-9.jaime.Total Revenue, Fundraising Revenue, Donor Count.csv`

In [330]:
bloom_orgs = pd.read_csv("rev-segment-data/2024-12-11.mufaddal.revenue-donors.csv")

In [331]:
bloom_orgs.head(3)

Unnamed: 0,DatabaseName,created_year,transaction_year,segment,total_revenue,fundraising_revenue,donor_count
0,100blackmenindianapolis,2015,2018,3) $1M - $10M,502991.74,45615.69,247
1,100blackmenindianapolis,2015,2019,3) $1M - $10M,1481676.3,36257.02,201
2,100blackmenindianapolis,2015,2020,3) $1M - $10M,533568.08,34370.13,233


`total_revenue` is all reported revenues, including those are not reported on platform (ie, only known from Cause IQ filings). `fundraising_revenue` is funds that are processed on platform.

In [332]:
bloom_orgs['org'] = bloom_orgs['DatabaseName']
bloom_orgs['donors'] = bloom_orgs['donor_count']
bloom_orgs['year'] = bloom_orgs['transaction_year']

# online 
# bloom_orgs['volume'] = bloom_orgs['fundraising_revenue']
# all (online + offline)
bloom_orgs['volume'] = bloom_orgs['total_revenue']


drop_cols = ['DatabaseName', 'total_revenue', 'donor_count', 'transaction_year']
bloom_orgs.drop(drop_cols, axis=1, inplace=True)

In [333]:
print("Bloomerang data set:")
print("-"*40)
print("{:,} entries retrieved".format(len(bloom_orgs)))
print("{:,} distinct organizations".format(len(bloom_orgs['org'].unique())))
print("{} to {} year range".format(bloom_orgs['year'].min(), bloom_orgs['year'].max()))
print("${:,.2f} mean annual volume for all orgs, all years".format(bloom_orgs['volume'].mean()))
print("{:,} organizations with volume of $1 or less".format(len(bloom_orgs[bloom_orgs['volume']<=1])))
print("{:,} organizations with volume greater than $1".format(len(bloom_orgs[bloom_orgs['volume']>1]['org'].unique())))

Bloomerang data set:
----------------------------------------
74,515 entries retrieved
14,799 distinct organizations
2018 to 2024 year range
$716,823.25 mean annual volume for all orgs, all years
31 organizations with volume of $1 or less
14,783 organizations with volume greater than $1


In [334]:
bloom_orgs.tail(3)

Unnamed: 0,created_year,segment,fundraising_revenue,org,donors,year,volume
74512,2023,2) $100K - $999K,0.0,zyep,110,2022,43116.0
74513,2023,2) $100K - $999K,30610.57,zyep,158,2023,78250.57
74514,2023,2) $100K - $999K,20006.52,zyep,89,2024,47046.52


In [335]:
bloom_grpd = bloom_orgs.groupby('org')['volume'].sum().reset_index()
bloom_grpd.sort_values("volume", ascending=False, inplace=True)
bloom_top_20 = bloom_grpd.head(int(len(bloom_grpd) * .2))

bloom_orgs['is top 20'] = bloom_orgs['org'].isin(bloom_top_20['org'])

print("Bloomerang top 20% of orgs, lifetime volume:")
print("-"*40)
print("{:,} orgs in top 20%".format(len(bloom_top_20)))
print("All orgs volume/org: ${:,.2f} mean; ${:,.2f} median".format(bloom_grpd['volume'].mean(), bloom_grpd['volume'].median()))
print("Top 20% orgs volume/org: ${:,.2f} mean; ${:,.2f} median".format(bloom_top_20['volume'].mean(), bloom_top_20['volume'].median()))

Bloomerang top 20% of orgs, lifetime volume:
----------------------------------------
2,959 orgs in top 20%
All orgs volume/org: $3,609,303.61 mean; $846,402.70 median
Top 20% orgs volume/org: $14,215,768.42 mean; $8,600,600.88 median


In [336]:
bo_23 = bloom_orgs[bloom_orgs['year']==2023]
perc_lt5k = len(bo_23[bo_23['volume']<5000]) / len(bo_23)

print("{:,} orgs in 2023".format(len(bo_23)))
print("{:,.2f}% orgs w/ <$5k fundraising in 2023".format(perc_lt5k * 100.))

10,995 orgs in 2023
6.84% orgs w/ <$5k fundraising in 2023


## Qgiv data

transaction statuses:
- 'A': return 'Accepted';
- 'CR': return 'CR';
- 'U': return 'Uploaded';
- 'PR': return 'Promise';

In [337]:
total_revenue_where = "status in ('A', 'U', 'CR', 'PR')"
fundraising_revenue = "status='A'"

In [338]:
q = '''select
            org,
            year,
            sum(amount) as volume,
            avg(case when recurring=0 then amount else null end) as avg_onetime,
            count(case when recurring=0 then id else null end) as onetime_count,
            count(case when recurring_origin=1 then id else null end) as rec_count,
            count(distinct(email)) as donors
        from transactions
        where {} 
        group by year, org'''.format(total_revenue_where)
orgs = redshift_query_read(q, schema='production')

In [339]:
print("Qgiv data set:")
print("-"*40)
print("{:,} entries retrieved".format(len(orgs)))
print("{:,} distinct organizations".format(len(orgs['org'].unique())))
print("{} to {} year range".format(orgs['year'].min(), orgs['year'].max()))
print("${:,.2f} mean annual volume for all orgs, all years".format(orgs['volume'].mean()))
print("{:,} organizations with volume of $1 or less".format(len(orgs[orgs['volume']<=1])))
print("{:,} organizations with volume greater than $1".format(len(orgs[orgs['volume']>1]['org'].unique())))

Qgiv data set:
----------------------------------------
39,718 entries retrieved
11,733 distinct organizations
2006 to 2024 year range
$98,012.66 mean annual volume for all orgs, all years
1,830 organizations with volume of $1 or less
10,526 organizations with volume greater than $1


In [340]:
orgs.tail(3)

Unnamed: 0,org,year,volume,avg_onetime,onetime_count,rec_count,donors
39715,448863,2024,28468.0,115.723577,246,0,174
39716,455209,2024,2411.0,219.181818,11,0,11
39717,453664,2024,1.0,1.0,1,0,1


In [341]:
orgs_grpd = orgs.groupby('org')['volume'].sum().reset_index()
orgs_grpd.sort_values("volume", ascending=False, inplace=True)
orgs_top_20 = orgs_grpd.head(int(len(orgs_grpd) * .2))

orgs['is top 20'] = orgs['org'].isin(orgs_top_20['org'])

print("Qgiv top 20% of orgs, lifetime volume:")
print("-"*40)
print("{:,} orgs in top 20%".format(len(orgs_top_20)))
print("All orgs volume/org: ${:,.2f} mean; ${:,.2f} median".format(orgs_grpd['volume'].mean(), orgs_grpd['volume'].median()))
print("Top 20% orgs volume/org: ${:,.2f} mean; ${:,.2f} median".format(orgs_top_20['volume'].mean(), orgs_top_20['volume'].median()))

Qgiv top 20% of orgs, lifetime volume:
----------------------------------------
2,346 orgs in top 20%
All orgs volume/org: $331,787.86 mean; $22,431.00 median
Top 20% orgs volume/org: $1,525,451.96 mean; $545,616.88 median


In [342]:
orgs_23 = orgs[orgs['year']==2023]
perc_lt5k = len(orgs_23[orgs_23['volume']<5000]) / len(orgs_23)

print("{:,} orgs in 2023".format(len(orgs_23)))
print("{:,.2f}% orgs w/ <$5k fundraising in 2023".format(perc_lt5k * 100.))

4,522 orgs in 2023
30.67% orgs w/ <$5k fundraising in 2023


In [343]:
print("Bloomgerang + Qgiv 2023, sub-$5k fundraising")

len_all = len(orgs_23) + len(bo_23)
len_lt5k = len(orgs_23[orgs_23['volume']<5000]) + len(bo_23[bo_23['volume']<5000])

print("{:,} total orgs".format(len_all))
print("{:,.2f}% of total orgs w/ < $5k funds raised in 2023".format((len_lt5k / len_all) * 100.))

Bloomgerang + Qgiv 2023, sub-$5k fundraising
15,517 total orgs
13.78% of total orgs w/ < $5k funds raised in 2023


## tagging segments

Using `10` as the _unidentified_ segment

### tagging qgiv orgs

In [344]:
qgiv_org_revenue = pd.read_csv("2024.12.06 Qgiv CauseIQ Rev Numbers.2.csv")

In [345]:
print("{:,} rows".format(len(qgiv_org_revenue)))
qgiv_org_revenue.dropna(subset=['qgiv_account_c'], inplace=True)
print("{:,} rows".format(len(qgiv_org_revenue)))

9,687 rows
9,684 rows


In [346]:
orgs_last_rev_data = []
for o in orgs['org'].unique():
    last_vol = orgs[orgs['org']==o].sort_values('year', ascending=True).iloc[-1]['volume']
    orgs_last_rev_data.append({
        'org': o,
        'last_revenue': last_vol
    })

In [347]:
def map_qgiv_segments(org):
    revenue = None
    try:
        _df = qgiv_org_revenue[qgiv_org_revenue['qgiv_account_c'].astype(int)==int(org)]
        _df = _df[~_df['annual_revenue_numbers_c'].isna()]
        
        if len(_df) > 0:
            revenue = _df['annual_revenue_numbers_c'].max()
        else:
            try:
                revenue = [r for r in orgs_last_rev_data if r['org']==int(org)][0]['last_revenue']
            except:
                pass
    except:
        pass
        
    if revenue is not None:
        for i in range(0, len(SEGMENTS)):
            if i == 3:
                if revenue > SEGMENTS[i]:
                    return i
            else:
                if revenue <= SEGMENTS[i][1]:
                    return i
            
    return 10

In [348]:
print("Qgiv revenue segments:")
orgs['segment'] = orgs['org'].apply(map_qgiv_segments)
print(orgs['segment'].value_counts())

Qgiv revenue segments:
0    16184
1    10190
2     9963
3     3381
Name: segment, dtype: int64


### tagging bloomerang orgs

In [349]:
'''
bloom['segment'] = ['3) $1M - $10M', '2) $100K - $999K', '1) Less than $100K',
                   'Unknown revenue', '4) $10M+']
'''

def map_bloom_segments(o):
    try:
        return int(str(o['segment'])[0]) - 1
    except:
        return 10

In [350]:
print("Bloomerang revenue segments:")
bloom_orgs['segment'] = bloom_orgs[['org', 'segment']].apply(map_bloom_segments, axis=1)
print(bloom_orgs['segment'].value_counts())

Bloomerang revenue segments:
1    30745
2    28941
0     7466
3     7363
Name: segment, dtype: int64


In [351]:
print("Bloomerang org counts by revenue segment:")
bloom_orgs.groupby(['org'])['segment'].first().value_counts()

Bloomerang org counts by revenue segment:


1    6136
2    5103
0    2311
3    1249
Name: segment, dtype: int64

In [352]:
jaime_segs = pd.read_csv("rev-segment-data/2024-12-9.jaime.Total Revenue, Fundraising Revenue, Donor Count.csv")

print("{:,} rows".format(len(jaime_segs)))
print("{:,} orgs".format(len(jaime_segs['DatabaseName'].unique())))
print("{} to {}".format(jaime_segs['transaction_year'].min(), jaime_segs['transaction_year'].max()))

print()
print("Mapped segment org counts:")
print(jaime_segs.groupby('DatabaseName')['segment'].first().reset_index().apply(map_bloom_segments, axis=1).value_counts())

74,511 rows
14,795 orgs
2018 to 2024

Mapped segment org counts:
1    6138
2    5103
0    2308
3    1246
dtype: int64


# 2. Analysis

## 1. Average transaction value by dollar amount for one time donations

In [353]:
q = '''select
            amount
        from transactions
        where
            status='A' and
            recurring=0'''
trans = redshift_query_read(q, schema='production')

print("Average one time donation value, all orgs, all time:")
print("-"*40)
print("mean: ${:.2f}".format(trans['amount'].mean()))
print("median: ${:.2f}".format(trans['amount'].median()))

q = '''select
            amount
        from transactions
        where
            status='A' and
            recurring=0 and
            year>=2021'''
trans = redshift_query_read(q, schema='production')
print()
print("Average one time donation value, all orgs, 2021+:")
print("-"*40)
print("mean: ${:.2f}".format(trans['amount'].mean()))
print("median: ${:.2f}".format(trans['amount'].median()))

q = '''select
            amount
        from transactions
        where
            status='A' and
            recurring=0 and
            year=2023'''
trans = redshift_query_read(q, schema='production')
print()
print("Average one time donation value, all orgs, 2023:")
print("-"*40)
print("mean: ${:.2f}".format(trans['amount'].mean()))
print("median: ${:.2f}".format(trans['amount'].median()))

Average one time donation value, all orgs, all time:
----------------------------------------
mean: $170.72
median: $50.00

Average one time donation value, all orgs, 2021+:
----------------------------------------
mean: $179.14
median: $50.00

Average one time donation value, all orgs, 2023:
----------------------------------------
mean: $182.96
median: $50.00


In [354]:
output_str = "Average value per transaction by dollar value and transaction type counts, by segments, all years"
data = []
for i in range(len(SEGMENTS)):
    count_ot_mean = orgs[orgs['segment']==i]['onetime_count'].mean()
    count_ot_median = orgs[orgs['segment']==i]['onetime_count'].median()
    count_rec_mean = orgs[orgs['segment']==i]['rec_count'].mean()
    count_rec_median = orgs[orgs['segment']==i]['rec_count'].median()
    avg_ot_mean = orgs[orgs['segment']==i]['avg_onetime'].mean()
    avg_ot_median = orgs[orgs['segment']==i]['avg_onetime'].median()
    
    try:
        k = "\\\${:,} to \\\${:,}".format(SEGMENTS[i][0], SEGMENTS[i][1])
    except:
        k = "\\\${:,}+".format(SEGMENTS[i])
    
    data.append({
        'key': k,
        'sample size': "{:,}".format(len(orgs[orgs['segment']==i])),
        'mean onetime count': "{:,.2f}".format(count_ot_mean),
        'median onetime count': "{:,.2f}".format(count_ot_median),
        'mean recurring count': "{:,.2f}".format(count_rec_mean),
        'median recurring count': "{:,.2f}".format(count_rec_median),
        'mean avg onetime value': "\\\${:,.2f}".format(avg_ot_mean),
        'median avg onetime value': "\\\${:,.2f}".format(avg_ot_median)
    })
    
output_str += '\n'
output_str += pd.DataFrame(data).to_markdown()

Markdown(output_str)

Average value per transaction by dollar value and transaction type counts, by segments, all years
|    | key                           | sample size   | mean onetime count   |   median onetime count |   mean recurring count |   median recurring count | mean avg onetime value   | median avg onetime value   |
|---:|:------------------------------|:--------------|:---------------------|-----------------------:|-----------------------:|-------------------------:|:-------------------------|:---------------------------|
|  0 | \\$0 to \\$100,000            | 16,184        | 141.44               |                     24 |                   4.34 |                        0 | \\$223.33                | \\$120.32                  |
|  1 | \\$100,000 to \\$1,000,000    | 10,190        | 288.02               |                     76 |                   8.4  |                        1 | \\$307.92                | \\$162.87                  |
|  2 | \\$1,000,000 to \\$10,000,000 | 9,963         | 497.75               |                    168 |                  13.89 |                        1 | \\$546.34                | \\$185.65                  |
|  3 | \\$10,000,000+                | 3,381         | 1,065.38             |                    199 |                  65.52 |                        1 | \\$395.64                | \\$202.88                  |

In [355]:
output_str = "Segment counts for the year 2023:"

data = []
orgs_2023 = orgs[orgs['year']==2023]
for i in range(len(SEGMENTS)):
    len_orgs = len(orgs_2023[orgs_2023['segment']==i])
    try:
        k = "\\\${:,} to \\\${:,}".format(SEGMENTS[i][0], SEGMENTS[i][1])
    except:
        k = "\\\${:,}+".format(SEGMENTS[i])
        
    data.append({
        'key': k,
        'orgs': "{:,}".format(len_orgs)
    })
    
output_str += "\n"
output_str += pd.DataFrame(data).to_markdown()

Markdown(output_str)

Segment counts for the year 2023:
|    | key                           | orgs   |
|---:|:------------------------------|:-------|
|  0 | \\$0 to \\$100,000            | 1,425  |
|  1 | \\$100,000 to \\$1,000,000    | 1,388  |
|  2 | \\$1,000,000 to \\$10,000,000 | 1,282  |
|  3 | \\$10,000,000+                | 427    |

In [356]:
output_str = "Average onetime/recurring for 2023:"

data = []
for i in range(len(SEGMENTS)):
    count_ot_mean = orgs_2023[orgs_2023['segment']==i]['onetime_count'].mean()
    count_ot_median = orgs_2023[orgs_2023['segment']==i]['onetime_count'].median()
    count_rec_mean = orgs_2023[orgs_2023['segment']==i]['rec_count'].mean()
    count_rec_median = orgs_2023[orgs_2023['segment']==i]['rec_count'].median()
    mn_ot_mean = orgs_2023[orgs_2023['segment']==i]['avg_onetime'].mean()
    mdn_ot_mean = orgs_2023[orgs_2023['segment']==i]['avg_onetime'].median()
    
    try:
        k = "\\\${:,} to \\\${:,}".format(SEGMENTS[i][0], SEGMENTS[i][1])
    except:
        k = "\\\${:,}+".format(SEGMENTS[i])
    
    data.append({
        'key': k,
        'sample size': "{:,}".format(len(orgs[orgs['segment']==i])),
        'mean onetime count': "{:,.2f}".format(count_ot_mean),
        'median onetime count': "{:,.2f}".format(count_ot_median),
        'mean recurring count': "{:,.2f}".format(count_rec_mean),
        'median recurring count': "{:,.2f}".format(count_rec_median),
        'mean avg onetime value': "\\\${:,.2f}".format(mn_ot_mean),
        'median onetime value': "\\\${:,.2f}".format(mdn_ot_mean)
    })
    
output_str += "\n"
output_str += pd.DataFrame(data).to_markdown()

Markdown(output_str)

Average onetime/recurring for 2023:
|    | key                           | sample size   | mean onetime count   |   median onetime count |   mean recurring count |   median recurring count | mean avg onetime value   | median onetime value   |
|---:|:------------------------------|:--------------|:---------------------|-----------------------:|-----------------------:|-------------------------:|:-------------------------|:-----------------------|
|  0 | \\$0 to \\$100,000            | 16,184        | 86.73                |                   17   |                   2.02 |                        0 | \\$249.78                | \\$121.06              |
|  1 | \\$100,000 to \\$1,000,000    | 10,190        | 317.12               |                  101   |                   6.57 |                        0 | \\$381.06                | \\$205.55              |
|  2 | \\$1,000,000 to \\$10,000,000 | 9,963         | 605.83               |                  225.5 |                  13.06 |                        1 | \\$677.43                | \\$226.59              |
|  3 | \\$10,000,000+                | 3,381         | 1,354.90             |                  249   |                  93.28 |                        1 | \\$588.71                | \\$237.34              |

In [357]:
output_str = "Average onetime transaction value (2023), by org:\n"

mn_all = orgs['avg_onetime'].mean()
mdn_all = orgs['avg_onetime'].median()

top_20_2023 = orgs[orgs['is top 20']]
mn_top20 = top_20_2023['avg_onetime'].mean()
mdn_top20 = top_20_2023['avg_onetime'].median()

data = [{
    "key": "all orgs",
    "mean": "\\\${:,.2f}".format(mn_all),
    "median": "\\\${:,.2f}".format(mdn_all)
},{
    "key": "top 20% orgs",
    "mean": "\\\${:,.2f}".format(mn_top20),
    "median": "\\\${:,.2f}".format(mdn_top20)
}]

In [358]:
rep_forms = pd.read_csv("../representative forms/filtered_forms.csv")

q = '''select id as form, org from form'''
org_form = redshift_query_read(q, schema='production')

rep_forms['org'] = rep_forms['form'].apply(lambda x: org_form[org_form['form'].astype(int)==int(x)]['org'].iloc[0])

In [359]:
rep_orgs = orgs_2023[orgs_2023['org'].isin(rep_forms['org'].tolist())]
mn_rep = rep_orgs['avg_onetime'].mean()
mdn_rep = rep_orgs['avg_onetime'].median()

data.append({
    "key": "representative forms",
    "mean": "\\\${:,.2f}".format(mn_rep),
    "median": "\\\${:,.2f}".format(mdn_rep)
})

output_str += pd.DataFrame(data).to_markdown()

Markdown(output_str)

Average onetime transaction value (2023), by org:
|    | key                  | mean      | median    |
|---:|:---------------------|:----------|:----------|
|  0 | all orgs             | \\$341.16 | \\$153.16 |
|  1 | top 20% orgs         | \\$520.90 | \\$213.01 |
|  2 | representative forms | \\$344.33 | \\$197.55 |

In [360]:
output_str = "Average one time values by segment, representative forms (2023):"
output_str += "\n"

data = []
for i in range(len(SEGMENTS)):
    avg_ot_mean = rep_orgs[rep_orgs['segment']==i]['avg_onetime'].mean()
    mdn_ot_mean = rep_orgs[rep_orgs['segment']==i]['avg_onetime'].median()
    
    try:
        k = "\\\${:,} to \\\${:,}:".format(SEGMENTS[i][0], SEGMENTS[i][1])
    except:
        k = "\\\${:,}+:".format(SEGMENTS[i])
        
    data.append({
        "key": k,
        "mean onetime value": "\\\${:.2f}".format(avg_ot_mean),
        "median onetime value": "\\\${:.2f}".format(mdn_ot_mean)
    })

output_str += pd.DataFrame(data).to_markdown()

Markdown(output_str)

Average one time values by segment, representative forms (2023):
|    | key                            | mean onetime value   | median onetime value   |
|---:|:-------------------------------|:---------------------|:-----------------------|
|  0 | \\$0 to \\$100,000:            | \\$196.18            | \\$127.51              |
|  1 | \\$100,000 to \\$1,000,000:    | \\$277.70            | \\$188.70              |
|  2 | \\$1,000,000 to \\$10,000,000: | \\$373.81            | \\$221.36              |
|  3 | \\$10,000,000+:                | \\$634.59            | \\$233.03              |

In [361]:
q = """select
            form,
            id,
            amount
        from transactions where
            status='A' and
            recurring=0 and
            year=2023"""
ot_trans = redshift_query_read(q, schema='production')

In [362]:
ot_trans['rep_form'] = ot_trans['form'].isin(rep_forms['form'].unique())

In [363]:
ot_trans.groupby(['rep_form'])['amount'].agg(['mean', 'median']).reset_index()

Unnamed: 0,rep_form,mean,median
0,False,199.86012,51.49
1,True,125.588414,50.0


## 2. Total revenue by dollar value

In [364]:
print("Average volume by segment for all years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    vol_mean = orgs[orgs['segment']==i]['volume'].mean()
    vol_median = orgs[orgs['segment']==i]['volume'].median()
    try:
        print("${:,} to ${:,}: ${:,.2f} mean; ${:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], vol_mean, vol_median))
    except:
        print("${:,}+: ${:,.2f} mean; ${:,.2f} median".format(SEGMENTS[i], vol_mean, vol_median))

Average volume by segment for all years:
----------------------------------------
$0 to $100,000: $28,608.78 mean; $4,935.50 median
$100,000 to $1,000,000: $68,068.62 mean; $17,897.45 median
$1,000,000 to $10,000,000: $150,896.41 mean; $40,294.50 median
$10,000,000+: $364,644.25 mean; $52,255.00 median


In [365]:
print("Segment average volume for the year 2023:")
print("-"*40)
orgs_2023 = orgs[orgs['year']==2023]
for i in range(len(SEGMENTS)):
    vol_mean = orgs_2023[orgs_2023['segment']==i]['volume'].mean()
    vol_median = orgs_2023[orgs_2023['segment']==i]['volume'].median()
    try:
        print("${:,} to ${:,}: ${:,.2f} mean; ${:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], vol_mean, vol_median))
    except:
        print("${:,}+: ${:,.2f} mean; ${:,.2f} median".format(SEGMENTS[i], vol_mean, vol_median))

Segment average volume for the year 2023:
----------------------------------------
$0 to $100,000: $17,350.40 mean; $4,100.00 median
$100,000 to $1,000,000: $75,782.03 mean; $25,407.85 median
$1,000,000 to $10,000,000: $183,427.72 mean; $64,289.35 median
$10,000,000+: $768,425.06 mean; $69,626.06 median


## 3. Average annual revenue increase (percent change year over year)

Data source: queried from GBQ (Bloomerang) and Redshift (Qgiv, `Transactions` table) for data containing each org, processing volume amount, and year. This data is iterated over to calculate the year over year change for each org to build a second dataset. This second dataset is used to calculate various metrics with different timeframes and filters.

Limiting dataset to 2021 forward. Leaving start years in data set for the moment to see raw values, will likely just use median's to avoid artificially altering the dataset.

In [366]:
DISPLAY_GROUPS = ['all', '3+ years & $5k in 2023']
DISPLAY_COLUMNS = ['mean', 'median']

# repeat rows for all then for top 20%
DISPLAY_ROWS = ['all years', '2021, 2022, 2023', '2023']
SEGMENT_ROWS = ['< $100K', '$100K to $1M', '$1M to $10M', '> $10M']

### qgiv

In [367]:
filtered_2023_orgs_qgiv = orgs[(orgs['year']==2023)&(orgs['volume']>5000)]['org'].unique().tolist()

print("{:,} total orgs".format(len(orgs['org'].unique())))
print("{:,} filtered orgs (>$5k in 2023)".format(len(filtered_2023_orgs_qgiv)))

11,733 total orgs
3,133 filtered orgs (>$5k in 2023)


In [368]:
org_growth = None
org_growth_3years = None
for org in orgs['org'].unique():
    _df = orgs[orgs['org']==org].copy().sort_values('year', ascending=True)
    _df['volume yoy'] = _df['volume'].pct_change()
    _df.dropna(inplace=True)
    
    org_growth = pd.concat([_df.copy(), org_growth])
    
    try:
        org_growth_3years = pd.concat([_df.iloc[1:].copy(), org_growth_3years])
    except:
        pass

In [369]:
print("Averages across dataset, no groupings:")
print("-"*40)
print()

mn_all = org_growth['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = org_growth['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = org_growth[org_growth['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = org_growth[org_growth['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = org_growth[org_growth['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = org_growth[org_growth['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn = org_growth_3years['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn = org_growth_3years['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = org_growth_3years[org_growth_3years['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = org_growth_3years[org_growth_3years['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = org_growth_3years[org_growth_3years['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = org_growth_3years[org_growth_3years['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes:")
display(pd.DataFrame(table_3years).transpose())

Averages across dataset, no groupings:
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"96,119.04%",9.33%
"2021, 2022, 2023","103,563.36%",5.28%
2023,"166,982.34%",5.87%



Orgs w/ 3+ years of records, varying timeframes:


Unnamed: 0,mean,median
all years,"96,119.04%",9.33%
"2021, 2022, 2023",641.84%,0.78%
2023,434.89%,1.89%


In [392]:
print("Grouped by org")
print("-"*40)
print()

mn_all = org_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = org_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = org_growth[org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = org_growth[org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = org_growth[org_growth['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = org_growth[org_growth['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn = org_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn = org_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = org_growth_3years[org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = org_growth_3years[org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = org_growth_3years[org_growth_3years['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = org_growth_3years[org_growth_3years['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_235k = org_growth_3years[(org_growth_3years['year']==2023)&(org_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_235k = org_growth_3years[(org_growth_3years['year']==2023)&(org_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    },
    '2023 & $5k': {
        'mean': "{:,.2f}%".format(mn_235k * 100.), 
        'median': "{:,.2f}%".format(mdn_235k * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes, averages across dataset:")
display(pd.DataFrame(table_3years).transpose())

Grouped by org
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"162,110.73%",12.71%
"2021, 2022, 2023","174,323.23%",8.52%
2023,"166,982.34%",5.87%



Orgs w/ 3+ years of records, varying timeframes, averages across dataset:


Unnamed: 0,mean,median
all years,"162,110.73%",12.71%
"2021, 2022, 2023",603.40%,1.49%
2023,434.89%,1.89%
2023 & $5k,519.69%,5.69%


In [371]:
print("Grouped by org, top 20%")
print("-"*40)
print()

mn_all = org_growth[org_growth['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = org_growth[org_growth['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = org_growth[org_growth['is top 20']&org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = org_growth[org_growth['is top 20']&org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = org_growth[org_growth['is top 20']&(org_growth['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = org_growth[org_growth['is top 20']&(org_growth['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn_all = org_growth_3years[org_growth_3years['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = org_growth_3years[org_growth_3years['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = org_growth_3years[org_growth_3years['is top 20']&org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = org_growth_3years[org_growth_3years['is top 20']&org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = org_growth_3years[org_growth_3years['is top 20']&(org_growth_3years['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = org_growth_3years[org_growth_3years['is top 20']&(org_growth_3years['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes, averages across dataset:")
display(pd.DataFrame(table_3years).transpose())

Grouped by org, top 20%
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"285,202.64%",13.43%
"2021, 2022, 2023","321,262.93%",9.38%
2023,"280,456.28%",3.98%



Orgs w/ 3+ years of records, varying timeframes, averages across dataset:


Unnamed: 0,mean,median
all years,431.06%,5.20%
"2021, 2022, 2023",826.13%,4.11%
2023,298.48%,1.76%


In [372]:
print("Org growth by revenue segment, all orgs, 21-23")
segment_data = []
for i in range(len(SEGMENTS)):
    seg_growth = org_growth[(org_growth['year'].isin([2021, 2022, 2023]))&(org_growth['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    
    segment_data.append({
        'segment': i,
        'mean': "{:,.2f}%".format(mn_growth),
        'median': "{:,.2f}%".format(mdn_growth)
    })

display(pd.DataFrame(segment_data))

print()
print("Org growth by revenue segment, orgs w/ 3+ years, 21-23")
segment_data = []
for i in range(len(SEGMENTS)):
    seg_growth = org_growth_3years[(org_growth_3years['year'].isin([2021, 2022, 2023]))&(org_growth_3years['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    
    segment_data.append({
        'segment': i,
        'mean': "{:,.2f}%".format(mn_growth),
        'median': "{:,.2f}%".format(mdn_growth)
    })

display(pd.DataFrame(segment_data))

Org growth by revenue segment, all orgs, 21-23


Unnamed: 0,segment,mean,median
0,0,"66,465.21%",-2.13%
1,1,"128,115.85%",15.70%
2,2,"91,966.27%",11.12%
3,3,"953,801.77%",7.51%



Org growth by revenue segment, orgs w/ 3+ years, 21-23


Unnamed: 0,segment,mean,median
0,0,855.84%,-10.23%
1,1,664.49%,6.06%
2,2,205.06%,4.24%
3,3,728.92%,1.79%


### bloomerang

In [373]:
filtered_2023_orgs_bloom = bloom_orgs[(bloom_orgs['year']==2023)&(bloom_orgs['volume']>5000)]['org'].unique().tolist()

print("{:,} total orgs".format(len(orgs['org'].unique())))
print("{:,} filtered orgs (>$5k in 2023)".format(len(filtered_2023_orgs_bloom)))

11,733 total orgs
10,220 filtered orgs (>$5k in 2023)


In [374]:
bloom_org_growth = None
bloom_org_growth_3years = None
for org in bloom_orgs['org'].unique():
    _df = bloom_orgs[bloom_orgs['org']==org].copy().sort_values('year', ascending=True)
    _df['volume yoy'] = _df['volume'].pct_change()
    _df.dropna(inplace=True)
    
    bloom_org_growth = pd.concat([_df, bloom_org_growth])
    
    try:
        bloom_org_growth_3years = pd.concat([_df.iloc[1:], bloom_org_growth_3years])
    except:
        pass

In [391]:
print("Averages across dataset, no groupings:")
print("-"*40)
print()

mn_all = bloom_org_growth['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = bloom_org_growth['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = bloom_org_growth[bloom_org_growth['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = bloom_org_growth[bloom_org_growth['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = bloom_org_growth[bloom_org_growth['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = bloom_org_growth[bloom_org_growth['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn = bloom_org_growth_3years['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn = bloom_org_growth_3years['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = bloom_org_growth_3years[bloom_org_growth_3years['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = bloom_org_growth_3years[bloom_org_growth_3years['year'].isin([2021, 2022, 2023])]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = bloom_org_growth_3years[bloom_org_growth_3years['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = bloom_org_growth_3years[bloom_org_growth_3years['year']==2023]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().median()
mn_235k = bloom_org_growth_3years[(bloom_org_growth_3years['year']==2023)&(bloom_org_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_235k = bloom_org_growth_3years[(bloom_org_growth_3years['year']==2023)&(bloom_org_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    },
    '2023 & $5k': {
        'mean': "{:,.2f}%".format(mn_235k * 100.), 
        'median': "{:,.2f}%".format(mdn_235k * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes:")
display(pd.DataFrame(table_3years).transpose())

Averages across dataset, no groupings:
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"2,220.92%",2.96%
"2021, 2022, 2023","2,324.87%",8.07%
2023,"1,569.46%",1.80%



Orgs w/ 3+ years of records, varying timeframes:


Unnamed: 0,mean,median
all years,"2,220.92%",2.96%
"2021, 2022, 2023",350.15%,6.34%
2023,299.10%,0.64%
2023 & $5k,312.77%,2.53%


In [376]:
print("Grouped by org")
print("-"*40)
print()

mn_all = bloom_org_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = bloom_org_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = bloom_org_growth[bloom_org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = bloom_org_growth[bloom_org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = bloom_org_growth[bloom_org_growth['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = bloom_org_growth[bloom_org_growth['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn = bloom_org_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn = bloom_org_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = bloom_org_growth_3years[bloom_org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = bloom_org_growth_3years[bloom_org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = bloom_org_growth_3years[bloom_org_growth_3years['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = bloom_org_growth_3years[bloom_org_growth_3years['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes, averages across dataset:")
display(pd.DataFrame(table_3years).transpose())

Grouped by org
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"1,508.71%",4.14%
"2021, 2022, 2023","1,951.85%",8.64%
2023,"1,572.66%",1.93%



Orgs w/ 3+ years of records, varying timeframes, averages across dataset:


Unnamed: 0,mean,median
all years,"1,508.71%",4.14%
"2021, 2022, 2023",371.15%,6.61%
2023,299.64%,0.77%


In [377]:
print("Grouped by org, top 20%")
print("-"*40)
print()

mn_all = bloom_org_growth[bloom_org_growth['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = bloom_org_growth[bloom_org_growth['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = bloom_org_growth[bloom_org_growth['is top 20']&bloom_org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = bloom_org_growth[bloom_org_growth['is top 20']&bloom_org_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = bloom_org_growth[bloom_org_growth['is top 20']&(bloom_org_growth['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = bloom_org_growth[bloom_org_growth['is top 20']&(bloom_org_growth['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn_all = bloom_org_growth_3years[bloom_org_growth_3years['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = bloom_org_growth_3years[bloom_org_growth_3years['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = bloom_org_growth_3years[bloom_org_growth_3years['is top 20']&bloom_org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = bloom_org_growth_3years[bloom_org_growth_3years['is top 20']&bloom_org_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = bloom_org_growth_3years[bloom_org_growth_3years['is top 20']&(bloom_org_growth_3years['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = bloom_org_growth_3years[bloom_org_growth_3years['is top 20']&(bloom_org_growth_3years['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes, averages across dataset:")
display(pd.DataFrame(table_3years).transpose())

Grouped by org, top 20%
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,683.29%,3.16%
"2021, 2022, 2023","1,705.39%",6.52%
2023,758.87%,0.63%



Orgs w/ 3+ years of records, varying timeframes, averages across dataset:


Unnamed: 0,mean,median
all years,352.72%,0.62%
"2021, 2022, 2023",721.66%,6.18%
2023,22.16%,0.42%


In [378]:
print("Org growth by revenue segment, all orgs, 21-23")
segment_data = []
for i in range(len(SEGMENTS)):
    seg_growth = bloom_org_growth[(bloom_org_growth['year'].isin([2021, 2022, 2023]))&(bloom_org_growth['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    
    segment_data.append({
        'segment': i,
        'mean': "{:,.2f}%".format(mn_growth),
        'median': "{:,.2f}%".format(mdn_growth)
    })

display(pd.DataFrame(segment_data))

print()
print("Org growth by revenue segment, orgs w/ 3+ years, 21-23")
segment_data = []
for i in range(len(SEGMENTS)):
    seg_growth = bloom_org_growth_3years[(bloom_org_growth_3years['year'].isin([2021, 2022, 2023]))&(bloom_org_growth_3years['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    
    segment_data.append({
        'segment': i,
        'mean': "{:,.2f}%".format(mn_growth),
        'median': "{:,.2f}%".format(mdn_growth)
    })

display(pd.DataFrame(segment_data))

Org growth by revenue segment, all orgs, 21-23


Unnamed: 0,segment,mean,median
0,0,"2,949.13%",4.94%
1,1,"1,075.89%",10.96%
2,2,"1,193.56%",7.76%
3,3,"7,634.31%",5.53%



Org growth by revenue segment, orgs w/ 3+ years, 21-23


Unnamed: 0,segment,mean,median
0,0,"2,355.93%",-1.45%
1,1,313.98%,8.09%
2,2,56.78%,6.94%
3,3,16.82%,4.56%


### merged (qgiv + bloomerang)

In [379]:
bloom_org_growth_3years.head(2)

Unnamed: 0,created_year,segment,fundraising_revenue,org,donors,year,volume,is top 20,volume yoy
74510,2023,1,0.0,zyep,223,2020,59847.0,False,2.252729
74511,2023,1,0.0,zyep,164,2021,52138.0,False,-0.128812


In [380]:
cols = ['year', 'segment', 'org', 'volume', 'volume yoy', 'is top 20', 'donors']

bloom_org_growth['year'] = bloom_org_growth['year']
bloom_org_growth_3years['year'] = bloom_org_growth_3years['year']

all_growth = pd.concat([bloom_org_growth[cols], org_growth[cols]])
all_growth_3years = pd.concat([bloom_org_growth_3years[cols], org_growth_3years[cols]])

In [390]:
print("Grouped by org")
print("-"*40)
print()

mn_all = all_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = all_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = all_growth[all_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = all_growth[all_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = all_growth[all_growth['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = all_growth[all_growth['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn = all_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn = all_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = all_growth_3years[all_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = all_growth_3years[all_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = all_growth_3years[all_growth_3years['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = all_growth_3years[all_growth_3years['year']==2023].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_235k = all_growth_3years[(all_growth_3years['year']==2023)&(all_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_235k = all_growth_3years[(all_growth_3years['year']==2023)&(all_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    },
    '2023 & $5k': {
        'mean': "{:,.2f}%".format(mn_235k * 100.), 
        'median': "{:,.2f}%".format(mdn_235k * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes, averages across dataset:")
display(pd.DataFrame(table_3years).transpose())

Grouped by org
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"56,724.14%",6.33%
"2021, 2022, 2023","48,183.17%",8.60%
2023,"41,496.23%",2.91%



Orgs w/ 3+ years of records, varying timeframes, averages across dataset:


Unnamed: 0,mean,median
all years,"56,724.14%",6.33%
"2021, 2022, 2023",425.58%,5.66%
2023,328.83%,0.96%
2023 & $5k,351.28%,3.32%


In [382]:
print("Average year over year volume growth for all organizations, 2023:")
print("-"*40)
mn_growth = all_growth_3years[all_growth_3years['year'].isin([2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
print("Mean growth: {:,.2f}%".format(mn_growth * 100.))
mdn_growth = all_growth_3years[all_growth_3years['year'].isin([2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
print("Median growth: {:,.2f}%".format(mdn_growth * 100.))
print()

print("Average year over year volume growth for 2023, by revenue segments, orgs w/ 3+ years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg_growth = all_growth_3years[(all_growth_3years['year'].isin([2023]))&(all_growth_3years['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_growth, mdn_growth))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_growth, mdn_growth))
        
print()
print("Average year over year volume growth for all organizations, across 2023, >$5k 2023:")
print("-"*40)
mn_growth = all_growth_3years[all_growth_3years['year'].isin([2023])&(all_growth_3years['volume']>5000)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
print("Mean growth: {:,.2f}%".format(mn_growth * 100.))
mdn_growth = all_growth_3years[all_growth_3years['year'].isin([2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
print("Median growth: {:,.2f}%".format(mdn_growth * 100.))
print()

Average year over year volume growth for all organizations, 2023:
----------------------------------------
Mean growth: 328.83%
Median growth: 0.96%

Average year over year volume growth for 2023, by revenue segments, orgs w/ 3+ years:
----------------------------------------
$0 to $100,000: 210.44% mean; -6.19% median
$100,000 to $1,000,000: 572.38% mean; 3.85% median
$1,000,000 to $10,000,000: 137.34% mean; 0.62% median
$10,000,000+: 233.80% mean; -1.18% median

Average year over year volume growth for all organizations, across 2023, >$5k 2023:
----------------------------------------
Mean growth: 351.28%
Median growth: 0.96%



In [383]:
print("Grouped by org, top 20%")
print("-"*40)
print()

mn_all = all_growth[all_growth['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = all_growth[all_growth['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = all_growth[all_growth['is top 20']&all_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = all_growth[all_growth['is top 20']&all_growth['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = all_growth[all_growth['is top 20']&(all_growth['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = all_growth[all_growth['is top 20']&(all_growth['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_all_orgs = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("All orgs, varying timeframes:")
display(pd.DataFrame(table_all_orgs).transpose())
print()

mn_all = all_growth_3years[all_growth_3years['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_all = all_growth_3years[all_growth_3years['is top 20']].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_212223 = all_growth_3years[all_growth_3years['is top 20']&all_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_212223 = all_growth_3years[all_growth_3years['is top 20']&all_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()
mn_23 = all_growth_3years[all_growth_3years['is top 20']&(all_growth_3years['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean()
mdn_23 = all_growth_3years[all_growth_3years['is top 20']&(all_growth_3years['year']==2023)].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median()

table_3years = {
    'all years': {
        'mean': "{:,.2f}%".format(mn_all * 100.),
        'median': "{:,.2f}%".format(mdn_all * 100.)
    },
    '2021, 2022, 2023': {
        'mean': "{:,.2f}%".format(mn_212223 * 100.), 
        'median': "{:,.2f}%".format(mdn_212223 * 100.)
    },
    '2023': {
        'mean': "{:,.2f}%".format(mn_23 * 100.), 
        'median': "{:,.2f}%".format(mdn_23 * 100.)
    }
}

print("Orgs w/ 3+ years of records, varying timeframes, averages across dataset:")
display(pd.DataFrame(table_3years).transpose())

Grouped by org, top 20%
----------------------------------------

All orgs, varying timeframes:


Unnamed: 0,mean,median
all years,"120,736.71%",6.54%
"2021, 2022, 2023","120,533.76%",7.05%
2023,"100,290.83%",1.79%



Orgs w/ 3+ years of records, varying timeframes, averages across dataset:


Unnamed: 0,mean,median
all years,384.02%,2.04%
"2021, 2022, 2023",758.02%,5.60%
2023,114.29%,0.71%


In [384]:
print("Org growth by revenue segment, all orgs, 21-23")
segment_data = []
for i in range(len(SEGMENTS)):
    seg_growth = all_growth[(all_growth['year'].isin([2021, 2022, 2023]))&(all_growth['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    
    segment_data.append({
        'segment': i,
        'mean': "{:,.2f}%".format(mn_growth),
        'median': "{:,.2f}%".format(mdn_growth)
    })

display(pd.DataFrame(segment_data))

print()
print("Org growth by revenue segment, orgs w/ 3+ years, 21-23")
segment_data = []
for i in range(len(SEGMENTS)):
    seg_growth = all_growth_3years[(all_growth_3years['year'].isin([2021, 2022, 2023]))&(all_growth_3years['segment']==i)]
    mn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    
    segment_data.append({
        'segment': i,
        'mean': "{:,.2f}%".format(mn_growth),
        'median': "{:,.2f}%".format(mdn_growth)
    })

display(pd.DataFrame(segment_data))

Org growth by revenue segment, all orgs, 21-23


Unnamed: 0,segment,mean,median
0,0,"36,865.95%",1.82%
1,1,"26,785.35%",11.70%
2,2,"20,334.51%",8.17%
3,3,"257,541.03%",6.18%



Org growth by revenue segment, orgs w/ 3+ years, 21-23


Unnamed: 0,segment,mean,median
0,0,"1,574.02%",-6.19%
1,1,375.48%,7.71%
2,2,83.68%,6.77%
3,3,184.40%,4.22%


In [389]:
mdn_per_org = all_growth_3years.groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna()
len_orgs = len(mdn_per_org)
ten_perc = int(len_orgs * .1)

mn_growth = mdn_per_org.iloc[ten_perc:-ten_perc].mean()
mdn_growth = mdn_per_org.iloc[ten_perc:-ten_perc].median()


mdn_per_org = all_growth_3years[all_growth_3years['year'].isin([2021, 2022, 2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna()
len_orgs = len(mdn_per_org)
ten_perc = int(len_orgs * .1)

mn_growth_2123 = mdn_per_org.iloc[ten_perc:-ten_perc].mean()
mdn_growth_2123 = mdn_per_org.iloc[ten_perc:-ten_perc].median()

mdn_per_org = all_growth_3years[all_growth_3years['year'].isin([2023])].groupby('org')['volume yoy'].median().replace([np.inf, -np.inf], np.nan).dropna()
len_orgs = len(mdn_per_org)
ten_perc = int(len_orgs * .1)

mn_growth_23 = mdn_per_org.iloc[ten_perc:-ten_perc].mean()
mdn_growth_23 = mdn_per_org.iloc[ten_perc:-ten_perc].median()

table_data = {
    'middle 80%': {
        'mean': "{:,.2f}%".format(mn_growth * 100.),
        'median': "{:,.2f}%".format(mdn_growth * 100.)
    },
    'middle 80%, 21-23': {
        'mean': "{:,.2f}%".format(mn_growth_2123 * 100.),
        'median': "{:,.2f}%".format(mdn_growth_2123 * 100.)
    },
    'middle 80%, 23': {
        'mean': "{:,.2f}%".format(mn_growth_23 * 100.),
        'median': "{:,.2f}%".format(mdn_growth_23 * 100.)
    }
}
print("Middle 80%, grouped by org")
display(pd.DataFrame(table_data).transpose())

Middle 80%, grouped by org


Unnamed: 0,mean,median
middle 80%,"1,561.68%",-0.81%
"middle 80%, 21-23",428.61%,6.58%
"middle 80%, 23",369.69%,1.34%


In [327]:
print("Not filtering $5k:")
print("All orgs:")
display(all_growth_3years[(all_growth_3years['year']==2023)]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().agg(['mean', 'median']).reset_index())

print()
print("Top 20%")
display(all_growth_3years[(all_growth_3years['is top 20'])&(all_growth_3years['year']==2023)]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().agg(['mean', 'median']).reset_index())


print()
print("Filtering 3+ years of records & > $5k funds raised in 2023")

filtered_2023_orgs = filtered_2023_orgs_bloom + filtered_2023_orgs_qgiv

print("All orgs:")
display(all_growth_3years[(all_growth_3years['year']==2023)&(all_growth_3years['org'].isin(filtered_2023_orgs))]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().agg(['mean', 'median']).reset_index())

print()
print("Top 20%")
display(all_growth_3years[(all_growth_3years['is top 20'])&(all_growth_3years['year']==2023)&(all_growth_3years['org'].isin(filtered_2023_orgs))]['volume yoy'].replace([np.inf, -np.inf], np.nan).dropna().agg(['mean', 'median']).reset_index())

Not filtering $5k:
All orgs:


Unnamed: 0,index,volume yoy
0,mean,87.157458
1,median,0.045847



Top 20%


Unnamed: 0,index,volume yoy
0,mean,137.028706
1,median,0.041822



Filtering 3+ years of records & > $5k funds raised in 2023
All orgs:


Unnamed: 0,index,volume yoy
0,mean,108.093754
1,median,0.111888



Top 20%


Unnamed: 0,index,volume yoy
0,mean,140.540804
1,median,0.052594


### 5 years growth

comparing to blackbaud 5 year growth stat language

Here we're looking at full calendar years, dropping the first two entries (first partial year, second year in which we will see insane YoY growth), limiting view to the first 5 years of org activity. The final dataset will be all orgs, growth from years 3 to 5.

In [395]:
five_years = []

for org in all_growth_3years['org'].unique():
    _df = all_growth_3years[all_growth_3years['org']==org].sort_values('year', ascending=True).iloc[:2]
    total_vol = _df['volume'].sum()
    
    five_years.append({
        'org': org,
        'max yoy': _df['volume yoy'].max(),
        'total growth perc': total_vol / _df['volume'].iloc[0],
        'segment': _df['segment'].iloc[0],
        'is top 20': _df['is top 20'].iloc[0]
    })

  # Remove the CWD from sys.path while we load stuff.
  # Remove the CWD from sys.path while we load stuff.


In [398]:
five_years_df = pd.DataFrame(five_years)

In [402]:
five_years_df[['max yoy', 'total growth perc']].replace([np.inf, -np.inf], np.nan).dropna().agg(['mean', 'median']).reset_index()

Unnamed: 0,index,max yoy,total growth perc
0,mean,33.907547,11.785431
1,median,0.416831,1.992663


In [405]:
five_years_df.replace([np.inf, -np.inf], np.nan).dropna().groupby('segment')[['max yoy', 'total growth perc']].agg(['mean', 'median']).reset_index()

Unnamed: 0_level_0,segment,max yoy,max yoy,total growth perc,total growth perc
Unnamed: 0_level_1,Unnamed: 1_level_1,mean,median,mean,median
0,0,25.815348,0.290187,18.788867,1.668786
1,1,10.961072,0.444025,8.612562,2.024107
2,2,63.617603,0.429123,13.613896,2.048073
3,3,36.367648,0.449668,2.779686,2.03213


## 4. Average donor count growth (percent change year over year)

Queries were run on Redshift (Qgiv) and GBQ (Bloomerang) to pull unique donors per year per org. These numbers were aggregated to calculate year over year changes for each organization and these year over year change percentages were averaged at various timeframes and additional org segmentation.

### qgiv

Queried for unique donors per organization per year from accepted transactions. Inherently limited to organizations that have processed live transactions.

#### donor count

In [1336]:
print("Average donor count per org, per year, all orgs:")
mn = orgs['donors'].mean()
mdn = orgs['donors'].median()
print("- All years: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = orgs[orgs['year'].isin([2021, 2022, 2023])]['donors'].mean()
mdn = orgs[orgs['year'].isin([2021, 2022, 2023])]['donors'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = orgs[orgs['year']==2023]['donors'].mean()
mdn = orgs[orgs['year']==2023]['donors'].median()
print("- 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

print()
print("Average donor count per org, per year, top 20%:")
mn = orgs[orgs['is top 20']]['donors'].mean()
mdn = orgs[orgs['is top 20']]['donors'].median()
print("- All years: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = orgs[(orgs['year'].isin([2021, 2022, 2023]))&(orgs['is top 20'])]['donors'].mean()
mdn = orgs[(orgs['year'].isin([2021, 2022, 2023]))&(orgs['is top 20'])]['donors'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = orgs[(orgs['year']==2023)&(orgs['is top 20'])]['donors'].mean()
mdn = orgs[(orgs['year']==2023)&(orgs['is top 20'])]['donors'].median()
print("- 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

Average donor count per org, per year, all orgs:
- All years: mean: 242.75; median: 51.00
- 2021, 2022, 2023: mean: 293.24; median: 62.00
- 2023: mean: 303.57; median: 65.00

Average donor count per org, per year, top 20%:
- All years: mean: 500.31; median: 171.00
- 2021, 2022, 2023: mean: 620.15; median: 227.00
- 2023: mean: 653.17; median: 243.00


In [1337]:
print("Average donor count per segment, per year, across all years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = orgs[orgs['segment']==i]
    mn_donors = seg['donors'].mean()
    mdn_donors = seg['donors'].median()
    try:
        print("${:,} to ${:,}: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count per segment, per year, 2021, 2022, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = orgs[(orgs['segment']==i)&(orgs['year'].isin([2021, 2022, 2023]))]
    mn_donors = seg['donors'].mean()
    mdn_donors = seg['donors'].median()
    try:
        print("${:,} to ${:,}: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i], mn_donors, mdn_donors))

print()
print("Average donor count per segment, per year, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = orgs[(orgs['segment']==i)&(orgs['year']==2023)]
    mn_donors = seg['donors'].mean()
    mdn_donors = seg['donors'].median()
    try:
        print("${:,} to ${:,}: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count per segment, per year, across all years:
----------------------------------------
$0 to $100,000: 91.79 mean; 20.00 median
$100,000 to $1,000,000: 178.50 mean; 60.00 median
$1,000,000 to $10,000,000: 370.33 mean; 130.00 median
$10,000,000+: 784.65 mean; 152.00 median

Average donor count per segment, per year, 2021, 2022, 2023:
----------------------------------------
$0 to $100,000: 76.79 mean; 15.00 median
$100,000 to $1,000,000: 199.47 mean; 72.00 median
$1,000,000 to $10,000,000: 427.77 mean; 162.00 median
$10,000,000+: 965.07 mean; 192.00 median

Average donor count per segment, per year, 2023:
----------------------------------------
$0 to $100,000: 73.52 mean; 16.00 median
$100,000 to $1,000,000: 209.04 mean; 79.00 median
$1,000,000 to $10,000,000: 446.93 mean; 168.00 median
$10,000,000+: 948.89 mean; 192.00 median


#### donor growth

In [1338]:
donor_growth = None
donor_growth_3years = None
for org in orgs['org'].unique():
    cols = ['org', 'year', 'donors', 'is top 20', 'segment']
    _df = orgs[orgs['org']==org].copy().sort_values('year', ascending=True)[cols]
    _df['donors yoy'] = _df['donors'].pct_change()
    _df.dropna(inplace=True)
    donor_growth = pd.concat([_df, donor_growth])
    
    if len(_df) > 1:
        donor_growth_3years = pd.concat([_df.iloc[1:], donor_growth_3years])

In [1339]:
donor_growth.head(2)

Unnamed: 0,org,year,donors,is top 20,segment,donors yoy
39499,443005,2024,1,False,0,-0.666667
38121,441372,2018,63,False,0,-0.376238


In [1340]:
donor_growth_3years.head(2)

Unnamed: 0,org,year,donors,is top 20,segment,donors yoy
38183,442026,2019,43,False,0,7.6
37417,442026,2020,2,False,0,-0.953488


In [1341]:
data = []

k = "Average donor count growth per org, per year, all orgs"

mn = donor_growth['donors yoy'].mean()
mdn = donor_growth['donors yoy'].median()
data.append({
    'k': 'All years',
    'mean': mn,
    'median': mdn
})

mn = donor_growth[donor_growth['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = donor_growth[donor_growth['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
data.append({
    'k': '2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = donor_growth[donor_growth['year']==2023]['donors yoy'].mean()
mdn = donor_growth[donor_growth['year']==2023]['donors yoy'].median()
data.append({
    'k': '2023',
    'mean': mn,
    'median': mdn
})

print()
print("Average donor count growth per org, per year, top 20%:")

mn = donor_growth[donor_growth['is top 20']]['donors yoy'].mean()
mdn = donor_growth[donor_growth['is top 20']]['donors yoy'].median()
data.append({
    'k': 'Top 20%, all years',
    'mean': mn,
    'median': mdn
})

mn = donor_growth[(donor_growth['year'].isin([2021, 2022, 2023]))&(donor_growth['is top 20'])]['donors yoy'].mean()
mdn = donor_growth[(donor_growth['year'].isin([2021, 2022, 2023]))&(donor_growth['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = donor_growth[(donor_growth['year']==2023)&(donor_growth['is top 20'])]['donors yoy'].mean()
mdn = donor_growth[(donor_growth['year']==2023)&(donor_growth['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2023',
    'mean': mn,
    'median': mdn
})

pd.DataFrame(data)


Average donor count growth per org, per year, top 20%:


Unnamed: 0,k,mean,median
0,All years,4.960202,0.0
1,"2021, 2022, 2023",4.4823,-0.021555
2,2023,6.531115,0.0
3,"Top 20%, all years",6.500088,0.033898
4,"Top 20%, 2021, 2022, 2023",6.099237,-0.008658
5,"Top 20%, 2023",9.732035,0.0


In [1342]:
data = []

k = "Average donor count growth per org, per year, orgs w/ 3+ years"

mn = donor_growth_3years['donors yoy'].mean()
mdn = donor_growth_3years['donors yoy'].median()
data.append({
    'k': 'All years',
    'mean': mn,
    'median': mdn
})

mn = donor_growth_3years[donor_growth_3years['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = donor_growth_3years[donor_growth_3years['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
data.append({
    'k': '2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = donor_growth_3years[donor_growth_3years['year']==2023]['donors yoy'].mean()
mdn = donor_growth_3years[donor_growth_3years['year']==2023]['donors yoy'].median()
data.append({
    'k': '2023',
    'mean': mn,
    'median': mdn
})

mn = donor_growth_3years[donor_growth_3years['is top 20']]['donors yoy'].mean()
mdn = donor_growth_3years[donor_growth_3years['is top 20']]['donors yoy'].median()
data.append({
    'k': 'Top 20%, all years',
    'mean': mn,
    'median': mdn
})

mn = donor_growth_3years[(donor_growth_3years['year'].isin([2021, 2022, 2023]))&(donor_growth_3years['is top 20'])]['donors yoy'].mean()
mdn = donor_growth_3years[(donor_growth_3years['year'].isin([2021, 2022, 2023]))&(donor_growth_3years['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = donor_growth_3years[(donor_growth_3years['year']==2023)&(donor_growth_3years['is top 20'])]['donors yoy'].mean()
mdn = donor_growth_3years[(donor_growth_3years['year']==2023)&(donor_growth_3years['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2023',
    'mean': mn,
    'median': mdn
})

print(k)
pd.DataFrame(data)

Average donor count growth per org, per year, orgs w/ 3+ years


Unnamed: 0,k,mean,median
0,All years,0.734453,-0.039494
1,"2021, 2022, 2023",0.922698,-0.057096
2,2023,1.234282,-0.026339
3,"Top 20%, all years",0.591824,-0.007797
4,"Top 20%, 2021, 2022, 2023",0.868023,-0.035714
5,"Top 20%, 2023",1.274821,-0.017241


In [1343]:
print("Average per org donor growth by year, all orgs:")
print("-"*40)
for year in sorted(donor_growth['year'].unique()):
    _df = donor_growth[donor_growth['year']==year]
    mn = _df['donors yoy'].dropna().mean()
    mdn = _df['donors yoy'].dropna().median()
    mn_top20 = _df[_df['is top 20']]['donors yoy'].dropna().mean()
    mdn_top20 = _df[_df['is top 20']]['donors yoy'].dropna().median()
    print("{} | all: {:.2f}% mean; {:.2f}% median | top 20%: {:.2f}% mean; {:.2f}% median".format(year, mn, mdn, mn_top20, mdn_top20))

Average per org donor growth by year, all orgs:
----------------------------------------
2007 | all: 4.25% mean; 4.25% median | top 20%: 4.25% mean; 4.25% median
2008 | all: 5.11% mean; 1.20% median | top 20%: 7.48% mean; 1.80% median
2009 | all: 8.65% mean; 0.52% median | top 20%: 13.69% mean; 0.83% median
2010 | all: 2.04% mean; 0.38% median | top 20%: 1.84% mean; 0.45% median
2011 | all: 1.68% mean; 0.24% median | top 20%: 2.00% mean; 0.34% median
2012 | all: 2.80% mean; 0.23% median | top 20%: 2.36% mean; 0.25% median
2013 | all: 5.88% mean; 0.12% median | top 20%: 8.95% mean; 0.26% median
2014 | all: 2.40% mean; 0.04% median | top 20%: 3.85% mean; 0.20% median
2015 | all: 1.12% mean; 0.05% median | top 20%: 1.23% mean; 0.12% median
2016 | all: 2.59% mean; 0.04% median | top 20%: 3.95% mean; 0.10% median
2017 | all: 7.27% mean; 0.04% median | top 20%: 10.94% mean; 0.10% median
2018 | all: 6.59% mean; 0.00% median | top 20%: 7.57% mean; 0.05% median
2019 | all: 6.68% mean; 0.01% med

In [1344]:
print("Average donor count year over year growth per segment, across all years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = donor_growth[donor_growth['segment']==i]
    mn_donors = seg['donors yoy'].mean() * 100.
    mdn_donors = seg['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count year over year growth per segment, 2021, 2022, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = donor_growth[donor_growth['segment']==i]
    mn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count year over year growth per segment, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = donor_growth[donor_growth['segment']==i]
    mn_donors = seg[seg['year'].isin([2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count year over year growth per segment, across all years:
----------------------------------------
$0 to $100,000: 267.19% mean; -3.23% median
$100,000 to $1,000,000: 378.03% mean; 2.90% median
$1,000,000 to $10,000,000: 582.71% mean; 1.10% median
$10,000,000+: 1,595.20% mean; 1.79% median

Average donor count year over year growth per segment, 2021, 2022, 2023:
----------------------------------------
$0 to $100,000: 221.16% mean; -10.71% median
$100,000 to $1,000,000: 301.99% mean; 0.00% median
$1,000,000 to $10,000,000: 416.05% mean; -1.83% median
$10,000,000+: 1,703.81% mean; -1.58% median

Average donor count year over year growth per segment, 2023:
----------------------------------------
$0 to $100,000: 279.01% mean; -6.87% median
$100,000 to $1,000,000: 430.97% mean; 0.00% median
$1,000,000 to $10,000,000: 543.33% mean; -0.95% median
$10,000,000+: 2,772.78% mean; 0.52% median


In [1345]:
print("Average donor count year over year growth per segment, 2021, 2022, 2023, orgs w/ 3+ years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = donor_growth_3years[donor_growth_3years['segment']==i]
    mn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count year over year growth per segment, 2021, 2022, 2023, orgs w/ 3+ years:
----------------------------------------
$0 to $100,000: 83.61% mean; -14.29% median
$100,000 to $1,000,000: 62.73% mean; -0.58% median
$1,000,000 to $10,000,000: 132.71% mean; -4.66% median
$10,000,000+: 85.54% mean; -4.35% median


### bloomerang

#### donor count

In [1346]:
bloom_orgs.head()

Unnamed: 0,created_year,segment,total_revenue,org,volume,donors,year,is top 20
0,2015,2,502991.74,100blackmenindianapolis,45615.69,247,2018,False
1,2015,2,1481676.3,100blackmenindianapolis,36257.02,201,2019,False
2,2015,2,533568.08,100blackmenindianapolis,34370.13,233,2020,False
3,2015,2,73572.08,100blackmenindianapolis,35644.52,73,2021,False
4,2015,2,2943.19,100blackmenindianapolis,2740.01,5,2022,False


In [1347]:
print("Average donor count per org, per year, all orgs:")
mn = bloom_orgs['donors'].mean()
mdn = bloom_orgs['donors'].median()
print("- All years: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = bloom_orgs[bloom_orgs['year'].isin([2021, 2022, 2023])]['donors'].mean()
mdn = bloom_orgs[bloom_orgs['year'].isin([2021, 2022, 2023])]['donors'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = bloom_orgs[bloom_orgs['year']==2023]['donors'].mean()
mdn = bloom_orgs[bloom_orgs['year']==2023]['donors'].median()
print("- 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

print()
print("Average donor count per org, per year, top 20%:")
mn = bloom_orgs[bloom_orgs['is top 20']]['donors'].mean()
mdn = bloom_orgs[bloom_orgs['is top 20']]['donors'].median()
print("- All years: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = bloom_orgs[(bloom_orgs['year'].isin([2021, 2022, 2023]))&(bloom_orgs['is top 20'])]['donors'].mean()
mdn = bloom_orgs[(bloom_orgs['year'].isin([2021, 2022, 2023]))&(bloom_orgs['is top 20'])]['donors'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = bloom_orgs[(bloom_orgs['year']==2023)&(bloom_orgs['is top 20'])]['donors'].mean()
mdn = bloom_orgs[(bloom_orgs['year']==2023)&(bloom_orgs['is top 20'])]['donors'].median()
print("- 2023: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

Average donor count per org, per year, all orgs:
- All years: mean: 465.13; median: 191.00
- 2021, 2022, 2023: mean: 482.72; median: 199.00
- 2023: mean: 462.18; median: 197.00

Average donor count per org, per year, top 20%:
- All years: mean: 1,093.64; median: 536.00
- 2021, 2022, 2023: mean: 1,153.73; median: 558.00
- 2023: mean: 1,092.27; median: 547.50


In [1348]:
print("Average donor count per segment, per year, across all years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_orgs[bloom_orgs['segment']==i]
    mn_donors = seg['donors'].mean()
    mdn_donors = seg['donors'].median()
    try:
        print("${:,} to ${:,}: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count per segment, per year, 2021, 2022, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_orgs[(bloom_orgs['segment']==i)&(bloom_orgs['year'].isin([2021, 2022, 2023]))]
    mn_donors = seg['donors'].mean()
    mdn_donors = seg['donors'].median()
    try:
        print("${:,} to ${:,}: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i], mn_donors, mdn_donors))

print()
print("Average donor count per segment, per year, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_orgs[(bloom_orgs['segment']==i)&(bloom_orgs['year']==2023)]
    mn_donors = seg['donors'].mean()
    mdn_donors = seg['donors'].median()
    try:
        print("${:,} to ${:,}: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f} mean; {:,.2f} median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count per segment, per year, across all years:
----------------------------------------
$0 to $100,000: 85.47 mean; 38.00 median
$100,000 to $1,000,000: 253.68 mean; 145.00 median
$1,000,000 to $10,000,000: 639.83 mean; 310.00 median
$10,000,000+: 1,046.12 mean; 384.00 median

Average donor count per segment, per year, 2021, 2022, 2023:
----------------------------------------
$0 to $100,000: 81.68 mean; 39.00 median
$100,000 to $1,000,000: 266.55 mean; 154.00 median
$1,000,000 to $10,000,000: 672.01 mean; 327.00 median
$10,000,000+: 1,114.10 mean; 394.50 median

Average donor count per segment, per year, 2023:
----------------------------------------
$0 to $100,000: 81.51 mean; 37.00 median
$100,000 to $1,000,000: 262.57 mean; 157.00 median
$1,000,000 to $10,000,000: 662.61 mean; 323.00 median
$10,000,000+: 1,000.35 mean; 386.00 median


#### donor growth

In [1349]:
bloom_donor_growth = None
bloom_donor_growth_3years = None
for org in bloom_orgs['org'].unique():
    _df = bloom_orgs[bloom_orgs['org']==org].copy().sort_values('year', ascending=True)
    _df['donors yoy'] = _df['donors'].pct_change()
    _df.dropna(inplace=True)
    bloom_donor_growth = pd.concat([_df, bloom_donor_growth])
    
    if len(_df) > 1:
        bloom_donor_growth_3years = pd.concat([_df.iloc[1:], bloom_donor_growth_3years])

In [1350]:
data = []

k = "Average donor count growth per org, per year, all orgs"

mn = bloom_donor_growth['donors yoy'].mean()
mdn = bloom_donor_growth['donors yoy'].median()
data.append({
    'k': 'All years',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth[bloom_donor_growth['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = bloom_donor_growth[bloom_donor_growth['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
data.append({
    'k': '2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth[bloom_donor_growth['year']==2023]['donors yoy'].mean()
mdn = bloom_donor_growth[bloom_donor_growth['year']==2023]['donors yoy'].median()
data.append({
    'k': '2023',
    'mean': mn,
    'median': mdn
})


mn = bloom_donor_growth[bloom_donor_growth['is top 20']]['donors yoy'].mean()
mdn = bloom_donor_growth[bloom_donor_growth['is top 20']]['donors yoy'].median()
data.append({
    'k': 'Top 20%, all years',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth[(bloom_donor_growth['year'].isin([2021, 2022, 2023]))&(bloom_donor_growth['is top 20'])]['donors yoy'].mean()
mdn = bloom_donor_growth[(bloom_donor_growth['year'].isin([2021, 2022, 2023]))&(bloom_donor_growth['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth[(bloom_donor_growth['year']==2023)&(bloom_donor_growth['is top 20'])]['donors yoy'].mean()
mdn = bloom_donor_growth[(bloom_donor_growth['year']==2023)&(bloom_donor_growth['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2023',
    'mean': mn,
    'median': mdn
})

print(k)
pd.DataFrame(data)

Average donor count growth per org, per year, all orgs


Unnamed: 0,k,mean,median
0,All years,1.785754,-0.02154
1,"2021, 2022, 2023",1.806347,0.0
2,2023,0.912199,-0.008197
3,"Top 20%, all years",1.306146,-0.029412
4,"Top 20%, 2021, 2022, 2023",0.40371,-0.018657
5,"Top 20%, 2023",0.437327,-0.023776


In [1351]:
data = []

k = "Average donor count growth per org, per year, orgs w/ 3+ years"

mn = bloom_donor_growth_3years['donors yoy'].mean()
mdn = bloom_donor_growth_3years['donors yoy'].median()
data.append({
    'k': 'All years',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth_3years[bloom_donor_growth_3years['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = bloom_donor_growth_3years[bloom_donor_growth_3years['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
data.append({
    'k': '2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth_3years[bloom_donor_growth_3years['year']==2023]['donors yoy'].mean()
mdn = bloom_donor_growth_3years[bloom_donor_growth_3years['year']==2023]['donors yoy'].median()
data.append({
    'k': '2023',
    'mean': mn,
    'median': mdn
})


mn = bloom_donor_growth_3years[bloom_donor_growth_3years['is top 20']]['donors yoy'].mean()
mdn = bloom_donor_growth_3years[bloom_donor_growth_3years['is top 20']]['donors yoy'].median()
data.append({
    'k': 'Top 20%, all years',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth_3years[(bloom_donor_growth_3years['year'].isin([2021, 2022, 2023]))&(bloom_donor_growth_3years['is top 20'])]['donors yoy'].mean()
mdn = bloom_donor_growth_3years[(bloom_donor_growth_3years['year'].isin([2021, 2022, 2023]))&(bloom_donor_growth_3years['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = bloom_donor_growth_3years[(bloom_donor_growth_3years['year']==2023)&(bloom_donor_growth_3years['is top 20'])]['donors yoy'].mean()
mdn = bloom_donor_growth_3years[(bloom_donor_growth_3years['year']==2023)&(bloom_donor_growth_3years['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2023',
    'mean': mn,
    'median': mdn
})

print(k)
pd.DataFrame(data)

Average donor count growth per org, per year, orgs w/ 3+ years


Unnamed: 0,k,mean,median
0,All years,0.521924,-0.050621
1,"2021, 2022, 2023",0.593273,-0.012343
2,2023,0.352369,-0.018395
3,"Top 20%, all years",0.302161,-0.046254
4,"Top 20%, 2021, 2022, 2023",0.189039,-0.020221
5,"Top 20%, 2023",0.054966,-0.024306


In [1352]:
print("Average donor count growth per org, per year, all orgs:")
mn = bloom_donor_growth['donors yoy'].mean()
mdn = bloom_donor_growth['donors yoy'].median()
print("- All years: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = bloom_donor_growth[bloom_donor_growth['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = bloom_donor_growth[bloom_donor_growth['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

mn = bloom_donor_growth[bloom_donor_growth['year']==2023]['donors yoy'].mean()
mdn = bloom_donor_growth[bloom_donor_growth['year']==2023]['donors yoy'].median()
print("- 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

print()
print("Average donor count growth per org, per year, top 20%:")
mn = bloom_donor_growth[bloom_donor_growth['is top 20']]['donors yoy'].mean()
mdn = bloom_donor_growth[bloom_donor_growth['is top 20']]['donors yoy'].median()
print("- All years: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

mn = bloom_donor_growth[(bloom_donor_growth['year'].isin([2021, 2022, 2023]))&(bloom_donor_growth['is top 20'])]['donors yoy'].mean()
mdn = bloom_donor_growth[(bloom_donor_growth['year'].isin([2021, 2022, 2023]))&(bloom_donor_growth['is top 20'])]['donors yoy'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

mn = bloom_donor_growth[(bloom_donor_growth['year']==2023)&(bloom_donor_growth['is top 20'])]['donors yoy'].mean()
mdn = bloom_donor_growth[(bloom_donor_growth['year']==2023)&(bloom_donor_growth['is top 20'])]['donors yoy'].median()
print("- 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

Average donor count growth per org, per year, all orgs:
- All years: mean: 1.79; median: -0.02
- 2021, 2022, 2023: mean: 1.81%; median: 0.00%
- 2023: mean: 0.91%; median: -0.01%

Average donor count growth per org, per year, top 20%:
- All years: mean: 1.31%; median: -0.03%
- 2021, 2022, 2023: mean: 0.40%; median: -0.02%
- 2023: mean: 0.44%; median: -0.02%


In [1353]:
print("Average per org donor growth by year, all orgs:")
print("-"*40)
for year in sorted(bloom_donor_growth['year'].unique()):
    _df = bloom_donor_growth[bloom_donor_growth['year']==year]
    mn = _df['donors yoy'].dropna().mean()
    mdn = _df['donors yoy'].dropna().median()
    mn_top20 = _df[_df['is top 20']]['donors yoy'].dropna().mean()
    mdn_top20 = _df[_df['is top 20']]['donors yoy'].dropna().median()
    print("{} | all: {:.2f}% mean; {:.2f}% median | top 20%: {:.2f}% mean; {:.2f}% median".format(year, mn, mdn, mn_top20, mdn_top20))

Average per org donor growth by year, all orgs:
----------------------------------------
2018 | all: 0.00% mean; 0.00% median | top 20%: 0.00% mean; 0.00% median
2019 | all: 2.10% mean; 0.04% median | top 20%: 2.72% mean; 0.02% median
2020 | all: 2.54% mean; 0.03% median | top 20%: 3.19% mean; 0.04% median
2021 | all: 1.98% mean; 0.02% median | top 20%: 0.52% mean; -0.00% median
2022 | all: 2.54% mean; -0.01% median | top 20%: 0.26% mean; -0.02% median
2023 | all: 0.91% mean; -0.01% median | top 20%: 0.44% mean; -0.02% median
2024 | all: 0.74% mean; -0.23% median | top 20%: 0.76% mean; -0.23% median


In [1354]:
print("Average donor count year over year growth per segment, across all years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_donor_growth[bloom_donor_growth['segment']==i]
    mn_donors = seg['donors yoy'].mean() * 100.
    mdn_donors = seg['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count year over year growth per segment, 2021, 2022, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_donor_growth[bloom_donor_growth['segment']==i]
    mn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count year over year growth per segment, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_donor_growth[bloom_donor_growth['segment']==i]
    mn_donors = seg[seg['year'].isin([2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count year over year growth per segment, across all years:
----------------------------------------
$0 to $100,000: 164.25% mean; -5.21% median
$100,000 to $1,000,000: 211.41% mean; -0.69% median
$1,000,000 to $10,000,000: 168.11% mean; -2.56% median
$10,000,000+: 99.31% mean; -3.93% median

Average donor count year over year growth per segment, 2021, 2022, 2023:
----------------------------------------
$0 to $100,000: 175.19% mean; -2.08% median
$100,000 to $1,000,000: 266.18% mean; 1.71% median
$1,000,000 to $10,000,000: 119.93% mean; -1.12% median
$10,000,000+: 67.87% mean; -2.16% median

Average donor count year over year growth per segment, 2023:
----------------------------------------
$0 to $100,000: 154.21% mean; -3.30% median
$100,000 to $1,000,000: 121.60% mean; 0.17% median
$1,000,000 to $10,000,000: 47.39% mean; -1.97% median
$10,000,000+: 74.09% mean; -2.06% median


In [1355]:
print("Average donor count year over year growth per segment, 2021, 2022, 2023, orgs w/ 3+ years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = bloom_donor_growth_3years[bloom_donor_growth_3years['segment']==i]
    mn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count year over year growth per segment, 2021, 2022, 2023, orgs w/ 3+ years:
----------------------------------------
$0 to $100,000: 74.20% mean; -6.61% median
$100,000 to $1,000,000: 67.39% mean; 0.00% median
$1,000,000 to $10,000,000: 57.36% mean; -1.76% median
$10,000,000+: 24.54% mean; -2.60% median


### merged (qgiv + bloomerang)

In [1356]:
cols = ['org', 'segment', 'year', 'is top 20', 'donors yoy']
bloom_donor_growth['year'] = bloom_donor_growth['year']
bloom_donor_growth_3years['year'] = bloom_donor_growth_3years['year']

all_growth_donors = pd.concat([donor_growth[cols], bloom_donor_growth[cols]])
all_growth_donors_3years = pd.concat([donor_growth_3years[cols], bloom_donor_growth_3years[cols]])

In [1357]:
all_growth_donors.tail(3)

Unnamed: 0,org,segment,year,is top 20,donors yoy
4,100blackmenindianapolis,2,2022,False,-0.931507
5,100blackmenindianapolis,2,2023,False,-0.6
6,100blackmenindianapolis,2,2024,False,99.5


In [1358]:
all_growth_donors_3years.head(3)

Unnamed: 0,org,segment,year,is top 20,donors yoy
38183,442026,0,2019,False,7.6
37417,442026,0,2020,False,-0.953488
36995,447544,2,2023,False,-0.666667


In [1359]:
data = []

k = "Average donor count growth per org, per year, all orgs"

mn = all_growth_donors['donors yoy'].mean()
mdn = all_growth_donors['donors yoy'].median()
data.append({
    'k': 'All years',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors[all_growth_donors['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = all_growth_donors[all_growth_donors['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
data.append({
    'k': '2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors[all_growth_donors['year']==2023]['donors yoy'].mean()
mdn = all_growth_donors[all_growth_donors['year']==2023]['donors yoy'].median()
data.append({
    'k': '2023',
    'mean': mn,
    'median': mdn
})


mn = all_growth_donors[all_growth_donors['is top 20']]['donors yoy'].mean()
mdn = all_growth_donors[all_growth_donors['is top 20']]['donors yoy'].median()
data.append({
    'k': 'Top 20%, all years',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors[(all_growth_donors['year'].isin([2021, 2022, 2023]))&(all_growth_donors['is top 20'])]['donors yoy'].mean()
mdn = all_growth_donors[(all_growth_donors['year'].isin([2021, 2022, 2023]))&(all_growth_donors['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors[(all_growth_donors['year']==2023)&(all_growth_donors['is top 20'])]['donors yoy'].mean()
mdn = all_growth_donors[(all_growth_donors['year']==2023)&(all_growth_donors['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2023',
    'mean': mn,
    'median': mdn
})

print(k)
pd.DataFrame(data)

Average donor count growth per org, per year, all orgs


Unnamed: 0,k,mean,median
0,All years,2.795794,-0.011673
1,"2021, 2022, 2023",2.453774,-0.002045
2,2023,2.340035,-0.006594
3,"Top 20%, all years",3.648678,-0.005758
4,"Top 20%, 2021, 2022, 2023",2.42417,-0.016667
5,"Top 20%, 2023",3.87702,-0.018435


In [1360]:
data = []

k = "Average donor count growth per org, per year, orgs w/ 3+ years"

mn = all_growth_donors_3years['donors yoy'].mean()
mdn = all_growth_donors_3years['donors yoy'].median()
data.append({
    'k': 'All years',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors_3years[all_growth_donors_3years['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = all_growth_donors_3years[all_growth_donors_3years['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
data.append({
    'k': '2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors_3years[all_growth_donors_3years['year']==2023]['donors yoy'].mean()
mdn = all_growth_donors_3years[all_growth_donors_3years['year']==2023]['donors yoy'].median()
data.append({
    'k': '2023',
    'mean': mn,
    'median': mdn
})


mn = all_growth_donors_3years[all_growth_donors_3years['is top 20']]['donors yoy'].mean()
mdn = all_growth_donors_3years[all_growth_donors_3years['is top 20']]['donors yoy'].median()
data.append({
    'k': 'Top 20%, all years',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors_3years[(all_growth_donors_3years['year'].isin([2021, 2022, 2023]))&(all_growth_donors_3years['is top 20'])]['donors yoy'].mean()
mdn = all_growth_donors_3years[(all_growth_donors_3years['year'].isin([2021, 2022, 2023]))&(all_growth_donors_3years['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2021, 2022, 2023',
    'mean': mn,
    'median': mdn
})

mn = all_growth_donors_3years[(all_growth_donors_3years['year']==2023)&(all_growth_donors_3years['is top 20'])]['donors yoy'].mean()
mdn = all_growth_donors_3years[(all_growth_donors_3years['year']==2023)&(all_growth_donors_3years['is top 20'])]['donors yoy'].median()
data.append({
    'k': 'Top 20%, 2023',
    'mean': mn,
    'median': mdn
})

print(k)
pd.DataFrame(data)

Average donor count growth per org, per year, orgs w/ 3+ years


Unnamed: 0,k,mean,median
0,All years,0.587893,-0.04811
1,"2021, 2022, 2023",0.664707,-0.018393
2,2023,0.55464,-0.019672
3,"Top 20%, all years",0.434089,-0.034014
4,"Top 20%, 2021, 2022, 2023",0.414257,-0.022624
5,"Top 20%, 2023",0.481832,-0.022925


In [1361]:
print("Average donor count growth per org, per year, all orgs:")
mn = all_growth_donors['donors yoy'].mean()
mdn = all_growth_donors['donors yoy'].median()
print("- All years: mean: {:,.2f}; median: {:,.2f}".format(mn, mdn))

mn = all_growth_donors[all_growth_donors['year'].isin([2021, 2022, 2023])]['donors yoy'].mean()
mdn = all_growth_donors[all_growth_donors['year'].isin([2021, 2022, 2023])]['donors yoy'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

mn = all_growth_donors[all_growth_donors['year']==2023]['donors yoy'].mean()
mdn = all_growth_donors[all_growth_donors['year']==2023]['donors yoy'].median()
print("- 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

print()
print("Average donor count growth per org, per year, top 20%:")
mn = all_growth_donors[all_growth_donors['is top 20']]['donors yoy'].mean()
mdn = all_growth_donors[all_growth_donors['is top 20']]['donors yoy'].median()
print("- All years: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

mn = all_growth_donors[(all_growth_donors['year'].isin([2021, 2022, 2023]))&(all_growth_donors['is top 20'])]['donors yoy'].mean()
mdn = all_growth_donors[(all_growth_donors['year'].isin([2021, 2022, 2023]))&(all_growth_donors['is top 20'])]['donors yoy'].median()
print("- 2021, 2022, 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

mn = all_growth_donors[(all_growth_donors['year']==2023)&(all_growth_donors['is top 20'])]['donors yoy'].mean()
mdn = all_growth_donors[(all_growth_donors['year']==2023)&(all_growth_donors['is top 20'])]['donors yoy'].median()
print("- 2023: mean: {:,.2f}%; median: {:,.2f}%".format(mn, mdn))

Average donor count growth per org, per year, all orgs:
- All years: mean: 2.80; median: -0.01
- 2021, 2022, 2023: mean: 2.45%; median: -0.00%
- 2023: mean: 2.34%; median: -0.01%

Average donor count growth per org, per year, top 20%:
- All years: mean: 3.65%; median: -0.01%
- 2021, 2022, 2023: mean: 2.42%; median: -0.02%
- 2023: mean: 3.88%; median: -0.02%


In [1491]:
all_growth_donors.groupby('org')['donors yoy'].median().mean(), all_growth_donors[all_growth_donors['year']==2023]['donors yoy'].agg(['mean', 'median'])

(2.9892881484311737,
 mean      2.340035
 median   -0.006594
 Name: donors yoy, dtype: float64)

In [1362]:
print("Average per org donor growth by year, all orgs:")
print("-"*40)
for year in sorted(all_growth_donors['year'].unique()):
    _df = all_growth_donors[all_growth_donors['year']==year]
    mn = _df['donors yoy'].dropna().mean()
    mdn = _df['donors yoy'].dropna().median()
    mn_top20 = _df[_df['is top 20']]['donors yoy'].dropna().mean()
    mdn_top20 = _df[_df['is top 20']]['donors yoy'].dropna().median()
    print("{} | all: {:.2f}% mean; {:.2f}% median | top 20%: {:.2f}% mean; {:.2f}% median".format(year, mn, mdn, mn_top20, mdn_top20))

Average per org donor growth by year, all orgs:
----------------------------------------
2007 | all: 4.25% mean; 4.25% median | top 20%: 4.25% mean; 4.25% median
2008 | all: 5.11% mean; 1.20% median | top 20%: 7.48% mean; 1.80% median
2009 | all: 8.65% mean; 0.52% median | top 20%: 13.69% mean; 0.83% median
2010 | all: 2.04% mean; 0.38% median | top 20%: 1.84% mean; 0.45% median
2011 | all: 1.68% mean; 0.24% median | top 20%: 2.00% mean; 0.34% median
2012 | all: 2.80% mean; 0.23% median | top 20%: 2.36% mean; 0.25% median
2013 | all: 5.88% mean; 0.12% median | top 20%: 8.95% mean; 0.26% median
2014 | all: 2.40% mean; 0.04% median | top 20%: 3.85% mean; 0.20% median
2015 | all: 1.12% mean; 0.05% median | top 20%: 1.23% mean; 0.12% median
2016 | all: 2.59% mean; 0.04% median | top 20%: 3.95% mean; 0.10% median
2017 | all: 7.27% mean; 0.04% median | top 20%: 10.94% mean; 0.10% median
2018 | all: 6.52% mean; 0.00% median | top 20%: 7.47% mean; 0.05% median
2019 | all: 3.03% mean; 0.04% med

In [1363]:
print("Average donor count year over year growth per segment, across all years:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = all_growth_donors[all_growth_donors['segment']==i]
    mn_donors = seg['donors yoy'].mean() * 100.
    mdn_donors = seg['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count year over year growth per segment, 2021, 2022, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = all_growth_donors[all_growth_donors['segment']==i]
    mn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))
        
print()
print("Average donor count year over year growth per segment, 2023:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = all_growth_donors[all_growth_donors['segment']==i]
    mn_donors = seg[seg['year'].isin([2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count year over year growth per segment, across all years:
----------------------------------------
$0 to $100,000: 234.76% mean; -3.85% median
$100,000 to $1,000,000: 248.57% mean; 0.00% median
$1,000,000 to $10,000,000: 262.93% mean; -1.68% median
$10,000,000+: 537.36% mean; -2.35% median

Average donor count year over year growth per segment, 2021, 2022, 2023:
----------------------------------------
$0 to $100,000: 199.64% mean; -6.41% median
$100,000 to $1,000,000: 272.74% mean; 1.31% median
$1,000,000 to $10,000,000: 176.33% mean; -1.19% median
$10,000,000+: 459.66% mean; -2.08% median

Average donor count year over year growth per segment, 2023:
----------------------------------------
$0 to $100,000: 218.50% mean; -5.19% median
$100,000 to $1,000,000: 183.85% mean; 0.00% median
$1,000,000 to $10,000,000: 149.32% mean; -1.83% median
$10,000,000+: 767.81% mean; -1.28% median


In [1364]:
print("Average donor count year over year growth per segment, 2021, 2022, 2023, orgs w/ 3+ years records:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg = all_growth_donors_3years[all_growth_donors_3years['segment']==i]
    mn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].mean() * 100.
    mdn_donors = seg[seg['year'].isin([2021, 2022, 2023])]['donors yoy'].median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_donors, mdn_donors))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_donors, mdn_donors))

Average donor count year over year growth per segment, 2021, 2022, 2023, orgs w/ 3+ years records:
----------------------------------------
$0 to $100,000: 79.19% mean; -10.06% median
$100,000 to $1,000,000: 66.63% mean; 0.00% median
$1,000,000 to $10,000,000: 69.98% mean; -2.06% median
$10,000,000+: 37.76% mean; -2.85% median


## 5. Average donor retention rate

Using data built in `/research/donor retention/retention data build.ipynb` and stored in a local CSV at `/research/donor retention/org_year.csv`

In [1365]:
retention = pd.read_csv("../donor retention/org_year.csv")

In [1366]:
retention.head(3)

Unnamed: 0,org,year,donors_unique,donors_retained,new_donors,new_donors_retained,new_donors_last_year,recurring_donors,retention,new_donor_retention
0,430204,2017,3010,82,3010,82,615,166,0.027243,0.133333
1,430204,2018,4350,358,4350,336,2928,233,0.082299,0.114754
2,430204,2019,4896,688,4896,563,3966,155,0.140523,0.141957


In [1367]:
def tag_retention_segment(org_id):
    try:
        this_org = orgs[org_id==orgs['org']].sort_values('year', ascending=True)
        vol = this_org['volume'].iloc[-1]
        for i in range(len(SEGMENTS)):
            try:
                if vol >= SEGMENTS[i][0] and vol < SEGMENTS[i][1]:
                    return i
            except:
                if vol >= SEGMENTS[i]:
                    return i
    except:
        return 10

retention['rank'] = retention['org'].apply(tag_retention_segment)

In [1368]:
retention['rank'].value_counts()

0    22420
1     4491
2      271
3       38
Name: rank, dtype: int64

In [1369]:
print("All orgs, all years")
print(retention[['retention', 'new_donor_retention']].agg(['mean', 'median']).reset_index())

print()
print("All orgs, 2023")
print(retention[retention['year']==2023][['retention', 'new_donor_retention']].agg(['mean', 'median']).reset_index())

All orgs, all years
    index  retention  new_donor_retention
0    mean   0.308741             0.192450
1  median   0.250000             0.156766

All orgs, 2023
    index  retention  new_donor_retention
0    mean   0.337041             0.182302
1  median   0.272727             0.153846


In [1370]:
print("Average donor retention by fundraising performance segment:")
print("-"*60)
for i in range(len(SEGMENTS)):
    mean_ret = retention[(retention['rank']==i)]['retention'].mean() * 100.
    median_ret = retention[(retention['rank']==i)]['retention'].median() * 100.
    try:
        print("${:,} - ${:,}: {:.2f}% mean; {:.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mean_ret, median_ret))
    except:
        print("${:,}+: {:.2f}% mean; {:.2f}% median".format(SEGMENTS[i], mean_ret, median_ret))

Average donor retention by fundraising performance segment:
------------------------------------------------------------
$0 - $100,000: 31.45% mean; 24.82% median
$100,000 - $1,000,000: 27.99% mean; 25.88% median
$1,000,000 - $10,000,000: 31.08% mean; 29.63% median
$10,000,000+: 33.54% mean; 31.41% median


## 6. Average first time donor retention rate

In [1371]:
print("Average new donor retention by fundraising performance segment:")
print("-"*60)
for i in range(len(SEGMENTS)):
    mean_ret = retention[(retention['rank']==i)]['new_donor_retention'].mean() * 100.
    median_ret = retention[(retention['rank']==i)]['new_donor_retention'].median() * 100.
    try:
        print("${:,} - ${:,}: {:.2f}% mean; {:.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mean_ret, median_ret))
    except:
        print("${:,}+: {:.2f}% mean; {:.2f}% median".format(SEGMENTS[i], mean_ret, median_ret))

Average new donor retention by fundraising performance segment:
------------------------------------------------------------
$0 - $100,000: 18.27% mean; 14.29% median
$100,000 - $1,000,000: 23.31% mean; 20.37% median
$1,000,000 - $10,000,000: 26.45% mean; 24.55% median
$10,000,000+: 38.11% mean; 37.43% median


## 7. Constituent growth

Data queried from GBQ containing new constituents per organization per year, iterated by org to calculate year over year change percentage and cumulative sum. Calculations performed on the resulting dataset with varying timeframes and filters.

Data file from Jaime Luong in `bloomerang_orgs.constituents.csv`

Data updated from Jaime in `rev-segment-data/2024-12-9.jaime.Constituent Growth.csv`

In [1453]:
constituents = pd.read_csv("rev-segment-data/2024-12-9.jaime.Constituent Growth.csv")

In [1454]:
constituents['org'] = constituents['DatabaseName']
constituents.drop(['DatabaseName'], axis=1, inplace=True)

In [1455]:
print("{:,} rows".format(len(constituents)))
print("{} to {}".format(constituents['constituent_created_year'].min(), constituents['constituent_created_year'].max()))
print("{:,} orgs".format(len(constituents['org'].unique())))

52,569 rows
2018 to 2024
15,268 orgs


In [1456]:
constituents = constituents.merge(bloom_orgs[['org', 'is top 20']], on='org').dropna().drop_duplicates()

In [1457]:
print("{:,} rows".format(len(constituents)))
print("{} to {}".format(constituents['constituent_created_year'].min(), constituents['constituent_created_year'].max()))
print("{:,} orgs".format(len(constituents['org'].unique())))

51,910 rows
2018 to 2024
14,768 orgs


In [1458]:
constituents.tail(3)

Unnamed: 0,created_year,constituent_created_year,segment,new_constituents,year_number,org,is top 20
310215,2014,2024,3) $1M - $10M,1569,7,zumix,True
310222,2023,2023,2) $100K - $999K,753,1,zyep,False
310229,2023,2024,2) $100K - $999K,59,2,zyep,False


### baseline growth

In [1459]:
orgs_year_data = None
for org in constituents['org'].unique():
    _df = constituents[constituents['org']==org].copy().sort_values('constituent_created_year', ascending=True)
    _df['const_growth'] = _df['new_constituents'].pct_change()
    
    org_created_year = _df['created_year'].iloc[0]
    cols = ['org', 'constituent_created_year', 'const_growth', 'segment', 'is top 20']
    
    if int(org_created_year) <= 2016:
        # 2018 is 2 years after org creation, all years should be valid
        orgs_year_data = pd.concat([orgs_year_data, _df[cols].dropna()])
    else:
        orgs_year_data = pd.concat([orgs_year_data, _df[cols].dropna().iloc[1:]])

In [1460]:
def extract_segment_num(s):
    return int(s[0]) - 1

orgs_year_data['segment_tag'] = orgs_year_data['segment'].apply(extract_segment_num)
orgs_year_data['segment_tag'].value_counts()

2    10816
1    10659
3     2926
0     2161
Name: segment_tag, dtype: int64

In [1461]:
print("{:,} entries".format(len(orgs_year_data)))
print("{:,} de-duped entries".format(len(orgs_year_data.drop_duplicates())))

26,562 entries
26,562 de-duped entries


In [1462]:
print("Constituents growth rate:")
print()
print("All orgs, all years:")
print("-"*40)
mn = orgs_year_data['const_growth'].mean()
mdn = orgs_year_data['const_growth'].median()
print("Mean: {:,.2f}%; median: {:,.2f}%".format(mn * 100., mdn * 100.))
print()

print("All orgs, 2021, 2022, 2023:")
print("-"*40)
mn = orgs_year_data[orgs_year_data['constituent_created_year'].isin([2021, 2022, 2023])]['const_growth'].mean()
mdn = orgs_year_data[orgs_year_data['constituent_created_year'].isin([2021, 2022, 2023])]['const_growth'].median()
print("Mean: {:,.2f}%; median: {:,.2f}%".format(mn * 100., mdn * 100.))

print()
print("Group by org:")
print("All orgs, all years:")
print("-"*40)
mn = orgs_year_data.groupby('org')['const_growth'].median().mean()
mdn = orgs_year_data.groupby('org')['const_growth'].median().median()
print("Mean: {:,.2f}%; median: {:,.2f}%".format(mn * 100., mdn * 100.))
print()

print("All orgs, 2021, 2022, 2023:")
print("-"*40)
mn = orgs_year_data[orgs_year_data['constituent_created_year'].isin([2021, 2022, 2023])].groupby('org')['const_growth'].median().mean()
mdn = orgs_year_data[orgs_year_data['constituent_created_year'].isin([2021, 2022, 2023])].groupby('org')['const_growth'].median().median()
print("Mean: {:,.2f}%; median: {:,.2f}%".format(mn * 100., mdn * 100.))


Constituents growth rate:

All orgs, all years:
----------------------------------------
Mean: 134.82%; median: -15.31%

All orgs, 2021, 2022, 2023:
----------------------------------------
Mean: 134.79%; median: -10.46%

Group by org:
All orgs, all years:
----------------------------------------
Mean: 70.12%; median: -14.79%

All orgs, 2021, 2022, 2023:
----------------------------------------
Mean: 73.67%; median: -9.09%


In [1463]:
print()
print("Average year over year constituent growth, by revenue segments:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg_growth = orgs_year_data[(orgs_year_data['segment_tag']==i)]
    mn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_growth, mdn_growth))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_growth, mdn_growth))

print()
print("Average year over year constituent growth for 2021, 2022, 2023, by revenue segments:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg_growth = orgs_year_data[(orgs_year_data['constituent_created_year'].isin([2021, 2022, 2023]))&(orgs_year_data['segment_tag']==i)]
    mn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_growth, mdn_growth))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_growth, mdn_growth))


Average year over year constituent growth, by revenue segments:
----------------------------------------
$0 to $100,000: 162.47% mean; -26.49% median
$100,000 to $1,000,000: 66.38% mean; -15.47% median
$1,000,000 to $10,000,000: 68.15% mean; -12.50% median
$10,000,000+: 2.32% mean; -14.48% median

Average year over year constituent growth for 2021, 2022, 2023, by revenue segments:
----------------------------------------
$0 to $100,000: 271.39% mean; -25.23% median
$100,000 to $1,000,000: 68.63% mean; -8.85% median
$1,000,000 to $10,000,000: 43.76% mean; -6.63% median
$10,000,000+: 15.65% mean; -7.67% median


In [1476]:
print("Baseline constituent growth for top 20%, all years:")
print(orgs_year_data.groupby(['org', 'is top 20'])['const_growth'].median().reset_index().groupby('is top 20')['const_growth'].agg(['mean', 'median']).reset_index())

print()
print("Baseline constituent growth for top 20%, 2021, 2022, 2023:")
orgs_year_data[orgs_year_data['constituent_created_year'].isin([2021, 2022, 2023])].groupby(['org', 'is top 20'])['const_growth'].median().reset_index().groupby('is top 20')['const_growth'].agg(['mean', 'median']).reset_index()

Baseline constituent growth for top 20%, all years:
   is top 20      mean    median
0      False  0.854278 -0.173469
1       True  0.256790 -0.096331

Baseline constituent growth for top 20%, 2021, 2022, 2023:


Unnamed: 0,is top 20,mean,median
0,False,0.907852,-0.115385
1,True,0.27832,-0.041237


### cumulative sum growth

In [1483]:
orgs_cumsum_data = None
for org in constituents['org'].unique():
    _df = constituents[constituents['org']==org].copy().sort_values('constituent_created_year', ascending=True)
    _df['const_cumsum'] = _df['new_constituents'].cumsum()
    _df['const_growth'] = _df['const_cumsum'].pct_change()
    
    org_created_year = _df['created_year'].iloc[0]
    cols = ['org', 'constituent_created_year', 'const_growth', 'segment', 'is top 20']
    _df.dropna(inplace=True)
    
    if len(_df) > 0:
        if int(org_created_year) <= 2016:
            # 2018 is 2 years after org creation, all years should be valid
            orgs_cumsum_data = pd.concat([orgs_cumsum_data, _df[cols].dropna()])
        else:
            orgs_cumsum_data = pd.concat([orgs_cumsum_data, _df[cols].dropna().iloc[1:]])

In [1484]:
orgs_cumsum_data['segment_tag'] = orgs_cumsum_data['segment'].apply(extract_segment_num)
orgs_cumsum_data['segment_tag'].value_counts()

2    10816
1    10659
3     2926
0     2161
Name: segment_tag, dtype: int64

In [1485]:
print("Constituents cumsum growth rate:")
print()
print("All orgs, all years:")
print("-"*40)
mn = orgs_cumsum_data['const_growth'].mean()
mdn = orgs_cumsum_data['const_growth'].median()
print("Mean: {:,.2f}%; median: {:,.2f}%".format(mn * 100., mdn * 100.))
print()

print("All orgs, 2021, 2022, 2023:")
print("-"*40)
mn = orgs_cumsum_data[orgs_cumsum_data['constituent_created_year'].isin([2021, 2022, 2023])]['const_growth'].mean()
mdn = orgs_cumsum_data[orgs_cumsum_data['constituent_created_year'].isin([2021, 2022, 2023])]['const_growth'].median()
print("Mean: {:,.2f}%; median: {:,.2f}%".format(mn * 100., mdn * 100.))

Constituents cumsum growth rate:

All orgs, all years:
----------------------------------------
Mean: 42.14%; median: 12.18%

All orgs, 2021, 2022, 2023:
----------------------------------------
Mean: 29.34%; median: 12.14%


In [1486]:
print()
print("Average year over year constituent cumsum growth, by revenue segments:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg_growth = orgs_cumsum_data[(orgs_cumsum_data['segment_tag']==i)]
    mn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_growth, mdn_growth))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_growth, mdn_growth))

print()
print("Average year over year constituent cumsum growth for 2021, 2022, 2023, by revenue segments:")
print("-"*40)
for i in range(len(SEGMENTS)):
    seg_growth = orgs_cumsum_data[(orgs_cumsum_data['constituent_created_year'].isin([2021, 2022, 2023]))&(orgs_cumsum_data['segment_tag']==i)]
    mn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().mean() * 100.
    mdn_growth = seg_growth.groupby('org')['const_growth'].median().replace([np.inf, -np.inf], np.nan).dropna().median() * 100.
    try:
        print("${:,} to ${:,}: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i][0], SEGMENTS[i][1], mn_growth, mdn_growth))
    except:
        print("${:,}+: {:,.2f}% mean; {:,.2f}% median".format(SEGMENTS[i], mn_growth, mdn_growth))


Average year over year constituent cumsum growth, by revenue segments:
----------------------------------------
$0 to $100,000: 40.04% mean; 13.07% median
$100,000 to $1,000,000: 25.18% mean; 12.61% median
$1,000,000 to $10,000,000: 21.96% mean; 11.16% median
$10,000,000+: 15.94% mean; 8.66% median

Average year over year constituent cumsum growth for 2021, 2022, 2023, by revenue segments:
----------------------------------------
$0 to $100,000: 44.22% mean; 10.76% median
$100,000 to $1,000,000: 23.48% mean; 12.78% median
$1,000,000 to $10,000,000: 27.56% mean; 11.85% median
$10,000,000+: 15.06% mean; 8.77% median


In [1487]:
print("Baseline constituent growth for top 20%, all years:")
print(orgs_cumsum_data.groupby(['org', 'is top 20'])['const_growth'].median().reset_index().groupby('is top 20')['const_growth'].agg(['mean', 'median']).reset_index())

print()
print("Baseline constituent growth for top 20%, 2021, 2022, 2023:")
orgs_cumsum_data[orgs_cumsum_data['constituent_created_year'].isin([2021, 2022, 2023])].groupby(['org', 'is top 20'])['const_growth'].median().reset_index().groupby('is top 20')['const_growth'].agg(['mean', 'median']).reset_index()

Baseline constituent growth for top 20%, all years:
   is top 20      mean    median
0      False  0.264873  0.125265
1       True  0.173840  0.096707

Baseline constituent growth for top 20%, 2021, 2022, 2023:


Unnamed: 0,is top 20,mean,median
0,False,0.273583,0.122347
1,True,0.234013,0.110714
