<a href="https://colab.research.google.com/github/johndenver122/Programming-Notes/blob/main/Exploring_the_Bootcamp_Business_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exploring the Coding Bootcamp Business Model

This is an exploration of the coding bootcamp business model, based on the steps in the student lifecycle.

It uses [@lethain's](https://lethain.com/systems-thinking/) python library [systems](https://github.com/lethain/systems) to simulate the model. I find that it's a more natural tool than a spreadsheet for unit economics modeling, but if you're familiar with unit economics modeling in spreadsheets, this should feel like home.

If you've never seen python, this library, or a unit economics model before, never fear! I explain each step - it'll just be a little more to learn ;)

Thanks for checking it out, drop me a note if you have thoughts or feedback!

[Rob](rob.co.bb/about)

## How to use this notebook

You _could_ just passively read this, but [you shouldn't!](https://andymatuschak.org/books/) 

In order to learn, you have to actively engage. Plus, it's more fun!

Some ideas:
* 🔎 Run the cells and inspect their output step by step (instead of just reading the output)
* 🖨 Copy your own version of the document, and change the models themselves - simplify, or add complexity.
* 📈 Change the assumptions (the numbers in the models), and see how it affects the outcomes.
* 🔢 If you know the details of some particular bootcamp's model, use those numbers - and see if they match up to the actual numbers in your financials
* 📆 Use this to project a bootcamp's numbers for an upcoming year
* 🚌 Model other kinds of schools or businesses
* 📝 Write down your thoughts and share them, or talk about it with a friend!

## Imports and setup

Loads in the pieces we need to run the rest of the notebook.

In [None]:
!pip install systems
from systems.parse import parse
from systems.viz import as_dot
from IPython.display import HTML
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import NumeralTickFormatter
import pprint
output_notebook()

Collecting systems
  Downloading systems-0.1.0-py3-none-any.whl (16 kB)
Installing collected packages: systems
Successfully installed systems-0.1.0


# Simplified bootcamp student lifecycle

The business model we're building starts with the bones of the student lifecycle. 

**Prospective** students learn about the school. Some of those prospective students enroll and become **students**. Students take the program, hopefully learn a ton of awesome skills, then **graduate**. Graduates start their job search, and eventually get **placed**.

We'll start with this simplified version of the bootcamp student lifecycle model, then dig into each stage in greater depth.

If you start playing with the syntax and want to know more about how it works, you can read about it here: https://github.com/lethain/systems

In [None]:
simple_spec = """

[World]       > Prospectives    @ 1000
Prospectives  > Students        @ 0.1
Students      > Graduates       @ 0.85
Graduates     > Placed          @ 0.85

"""
simple_model = parse(simple_spec)

## What's this syntax mean?

This lets us define a model of the world, with _flows_ between different _stocks_. In this case, at every tick, there are:

* 1000 new prospective students
* 10 percent enroll
* 85 percent graduate
* 85 percent find a job

The `>` means a _flow_ from one _stock_ to another, at the _rate_ indicated by the part after the `@`. So,

```
Students      > Graduates       @ 0.85
```

means that students turn into graduates at a rate of 85% per tick.
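
To make the flow arithmetic concrete, here's a plain-Python sketch of one cohort moving through the four stages. It doesn't use the systems library at all. The `convert` helper is a made-up name for illustration, and rounding down to whole people is an assumption picked to match the whole-number outputs in the simulation table.

```python
def convert(count, percent):
    """Move `percent` of `count` to the next stage, rounding down to whole people."""
    return count * percent // 100

prospectives = 1000                    # [World]      > Prospectives @ 1000
students = convert(prospectives, 10)   # Prospectives > Students     @ 0.1
graduates = convert(students, 85)      # Students     > Graduates    @ 0.85
placed = convert(graduates, 85)        # Graduates    > Placed       @ 0.85

print(prospectives, students, graduates, placed)  # 1000 100 85 72
```

Integer percentages keep the sketch free of floating-point surprises; the library itself takes decimal rates like `0.85`.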

_Note: This is a pretend bootcamp. I just made the numbers up._

### What's a _tick_?

The systems library lets us simulate this model. We can run it for some number of rounds, and see how many students are in each lifecycle stage for each round.

In this simplified model, each round is one 'length of bootcamp'. We'll call it 3 months, since that's a nice simple number for the model. 

In the next model down, I added a "Months" column to make the chart a little easier to follow.

## Simulating the bootcamp for 15 'ticks'

Each tick is 3 months, which is the length of our made-up program.


In [None]:
simple_results = simple_model.run(rounds=15)
simple_rendered = simple_model.render_html(simple_results)
HTML(simple_rendered)

Round,Prospectives,Students,Graduates,Placed
0,0,0,0,0
1,1000,0,0,0
2,1000,100,0,0
3,1000,100,85,0
4,1000,100,85,72
5,1000,100,85,144
6,1000,100,85,216
7,1000,100,85,288
8,1000,100,85,360
9,1000,100,85,432


## A quick bootcamp metrics check

The imaginary school has placed 864 out of 1020 graduates, after enrolling 1300 students. 

100 students are currently enrolled, and 85 students are still counted as job-seeking.

If the school was reporting their outcomes, they might claim that 85% of graduates get placed in a job, even though only 66% of students they've ever enrolled have actually been placed in a job.
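
Both percentages can be reproduced by hand from the per-cohort numbers. This is a back-of-the-envelope sketch, not library output, and the cohort counts are an assumption about which cohorts are being counted after 15 rounds (the currently enrolled cohort and the current job seekers are left out).

```python
rounds = 15
students_per_cohort, graduates_per_cohort, placed_per_cohort = 100, 85, 72

# Students first show up in round 2, graduates in round 3, placements in round 4.
cohorts_with_placement_results = rounds - 3   # 12 cohorts have finished their job search
cohorts_past_the_program = rounds - 2         # 13 cohorts, excluding the one enrolled now

placed = placed_per_cohort * cohorts_with_placement_results                 # 864
graduates_counted = graduates_per_cohort * cohorts_with_placement_results   # 1020
enrolled = students_per_cohort * cohorts_past_the_program                   # 1300

print(f"placed / graduates: {placed / graduates_counted:.0%}")  # 85%
print(f"placed / enrolled:  {placed / enrolled:.0%}")           # 66%
```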

This gap makes explaining bootcamp placement numbers hard! We'll get more into the complications of placement numbers later on.

We also shouldn't ignore the 195 students who dropped out or the 156 graduates who didn't get placed within a tick. Dropping out of a bootcamp or getting stuck in your job search can be a deeply frustrating and disempowering experience. Many of those students go on to find success in another program, find a job after the placement period, or find a different career path. Still, the emotional toll is not to be ignored.

# Adding in costs and revenue

Turning the student journey into a business model means adding in cost and revenue at each step.

Just like our student lifecycle, we're going to start with a simplified version of the cost and revenue and then add more factors.

In [None]:
simple_costs_spec = """

[time]        > Months          @ 3

[World]       > Prospectives    @ 1000
Prospectives  > Students        @ 0.1
Students      > Graduates       @ 0.85
Graduates     > Placed          @ 0.85
Placed        > [Job]           @ 1.0

[money]       > Expenses        @ (CAC * Students)
[money]       > Expenses        @ (COGS * Students)
[money]       > Revenue         @ (Tuition * Placed)

CAC(1000)
COGS(5000)
Tuition(10000)
"""
simple_costs_model = parse(simple_costs_spec)
simple_costs_results = simple_costs_model.run(rounds=15)
simple_costs_rendered = simple_costs_model.render_html(simple_costs_results)
HTML(simple_costs_rendered)

Round,Months,Prospectives,Students,Graduates,Placed,Expenses,Revenue,CAC,COGS,Tuition
0,0,0,0,0,0,0,0,1000,5000,10000
1,3,1000,0,0,0,0,0,1000,5000,10000
2,6,1000,100,0,0,0,0,1000,5000,10000
3,9,1000,100,85,0,600000,0,1000,5000,10000
4,12,1000,100,85,72,1200000,0,1000,5000,10000
5,15,1000,100,85,72,1800000,720000,1000,5000,10000
6,18,1000,100,85,72,2400000,1440000,1000,5000,10000
7,21,1000,100,85,72,3000000,2160000,1000,5000,10000
8,24,1000,100,85,72,3600000,2880000,1000,5000,10000
9,27,1000,100,85,72,4200000,3600000,1000,5000,10000


## Explaining this model

### CAC and COGS

**CAC**: **C**ost of **A**cquiring a **C**ustomer

**COGS**: **C**ost **O**f **G**oods **S**old. 

These terms are ubiquitous in financial models, so I'll use them here. They do feel slightly icky in this context - we're talking about students and teachers!

Once you know what the terms stand for, they're easier to interpret. 

**CAC** - How much do you spend on marketing to get a new student?

**COGS** - How much does it cost to teach each student and help them get a job?

I picked `$1000` for our imaginary bootcamp CAC, `$5000` for COGS, and `$10000` for tuition. As we flesh out the model, we'll dig into what makes up those numbers.

Most of the bigger, trendier, or more established bootcamps offer either an Income Share Agreement (ISA), where the student pays a portion of their salary after they get a job, or have a tuition-refund guarantee for students who don't get a job. For this pretend bootcamp, I count revenue based on the number of placements - not students or graduates. To make the code easier, the model also counts the tuition as 'paid' the tick after the student is placed - in reality, the payments would really only start then, and continue for the term of the student's ISA or loan.

## Imaginary bootcamp gross margin

We've picked the numbers for our simplified bootcamp somewhat favorably - the school makes `$10,000` per placement, at a cost of `$5000` - a gross profit of `$5000` per student. A nice `50%` margin! It's not an `80%` software margin, but it's not retail or food.

Well - not quite. This is what the margin would be if the school graduated 100% of students, and placed 100% of graduates.

The school still has to cover the costs for the students who don't graduate and don't get placed - even though they don't pay tuition. If the school places `72%` of students who start$^2$, they end up paying for about `1.4` students per placement$^3$.

Gross profit per placement is actually `$10000 - ($5000 * 1.4)`, or `$3000`. `30%` isn't terrible, but it's not as good as `50%`!

In case the jargon is new - gross margin is calculated _before_ you count other expenses like CAC, overhead, or R&D.

Another caveat: we'll count CAC as the cost to _enroll_ a student. That means it will get this same `1.4` multiplier. That's slightly different from how most businesses count CAC. Later on, we'll get into what costs go into CAC.

2. `85% graduation rate * 85% placement rate`
3. `100 / 72`, rounded
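
The margin arithmetic can be sketched in a few lines. This is only an illustration of the calculation described above. It uses the unrounded rates, so the results land slightly off the rounded `1.4` and `30%` figures in the text.

```python
tuition = 10_000
cogs_per_student = 5_000
graduation_rate = 0.85
placement_rate = 0.85

placed_fraction = graduation_rate * placement_rate   # share of starting students who get placed
students_per_placement = 1 / placed_fraction         # students taught for each placement

naive_margin = (tuition - cogs_per_student) / tuition
gross_profit_per_placement = tuition - cogs_per_student * students_per_placement
margin_per_placement = gross_profit_per_placement / tuition

print(f"students per placement: {students_per_placement:.2f}")
print(f"naive margin: {naive_margin:.0%}")
print(f"margin per placement: {margin_per_placement:.0%}")
```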

## Time to positive cash

Let's look at the model output again:

In [None]:
HTML(simple_costs_rendered)

Round,Months,Prospectives,Students,Graduates,Placed,Expenses,Revenue,CAC,COGS,Tuition
0,0,0,0,0,0,0,0,1000,5000,10000
1,3,1000,0,0,0,0,0,1000,5000,10000
2,6,1000,100,0,0,0,0,1000,5000,10000
3,9,1000,100,85,0,600000,0,1000,5000,10000
4,12,1000,100,85,72,1200000,0,1000,5000,10000
5,15,1000,100,85,72,1800000,720000,1000,5000,10000
6,18,1000,100,85,72,2400000,1440000,1000,5000,10000
7,21,1000,100,85,72,3000000,2160000,1000,5000,10000
8,24,1000,100,85,72,3600000,2880000,1000,5000,10000
9,27,1000,100,85,72,4200000,3600000,1000,5000,10000


For the first year, the bootcamp loses money _fast_. It's lost `$1.2 million` after year 1.

The school is cashflow-positive starting in round 5, when students start paying back their tuition. (Remember, we're leaving out a bunch of other costs, so this is really 'unit-economics-cashflow-positive').

For the first 3 years, this imaginary bootcamp has lost money!

It's only in month 45 that the school is actually cash positive.

Remember - this still counts the money as coming in a lump sum the tick after the student is placed. For schools that collect based on an ISA, the money will come in slower.
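
The month-45 figure can be checked without the library. The sketch below hard-codes the per-tick expenses and revenue from the simplified model, and takes the rounds where each first shows up from the table above. `cash_at` is a made-up helper name for illustration.

```python
months_per_tick = 3
expenses_per_tick = (1_000 + 5_000) * 100   # (CAC + COGS) * 100 students
revenue_per_tick = 10_000 * 72              # tuition * 72 placements

first_expense_round = 3   # first round with non-zero Expenses in the table
first_revenue_round = 5   # first round with non-zero Revenue in the table

def cash_at(round_number):
    """Cumulative revenue minus cumulative expenses at the end of a round."""
    expense_ticks = max(0, round_number - first_expense_round + 1)
    revenue_ticks = max(0, round_number - first_revenue_round + 1)
    return revenue_ticks * revenue_per_tick - expense_ticks * expenses_per_tick

lowest_balance = min(cash_at(r) for r in range(16))
first_positive_round = next(r for r in range(16) if cash_at(r) > 0)

print(lowest_balance)                           # -1200000
print(first_positive_round * months_per_tick)   # 45
```

The balance bottoms out at round 4, touches zero at round 14 (month 42), and only turns positive one tick later.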

Let's visualize the imaginary bootcamp bank account:

In [None]:
x = [row["Months"] for row in simple_costs_results]
y = [row["Revenue"] - row["Expenses"] for row in simple_costs_results]
p = figure(title="Cash", width=600, height=400, y_axis_type="linear")
p.xaxis[0].axis_label = 'Months'
p.yaxis[0].axis_label = 'Dollars'
p.yaxis[0].formatter = NumeralTickFormatter(format="$0")
p.line(x, y)
show(p)

## This is why bootcamps raise money

🤑🤑🤑

That cool million-point-two has to come from somewhere.

Some bootcamps raise venture capital money, selling a piece of their company. Some charge tuition up front, putting the burden on students to take out loans if they don't have the cash. Some finance their contracts with students, essentially taking out a loan based on the assumption that students will eventually pay back their tuition.

We're not going to dig into how bootcamps make their bank accounts work. This chart is mostly to point to the reason why they need _something_ in place to solve this problem.

# Funnels and Conversion Rates

Let's dig into some of the numbers in the model.

Each of the numbers is an assumption about our bootcamp. We can break those assumptions into smaller pieces, spell out what goes into them, and write down formulas for what's going on.

We'll start with the enrollment, graduation, and placement rates on the right side of the `@`s. Later on, we'll dive into the factors that make up the costs.

## Enrollment

We picked 1000 as the number of prospective students, and 10% as the conversion rate from prospective student to enrollment. 

Let's explode those numbers into a more complicated model - the bootcamp marketing and admissions funnel.

### Marketing and Admissions Funnel

Sales and marketing folks love their funnels.

If you look at a bar graph of marketing stages, with the stages laid out chronologically down the y-axis and counts of customers as centered bars along the x-axis, it looks like an inverted triangle.

![sales funnel](https://upload.wikimedia.org/wikipedia/commons/b/b3/The_Purchase_Funnel.jpg)

Each stage in the funnel usually has some associated steps that the marketing or sales team can do to increase conversions to the next step in the funnel.

Bootcamps are similar. The difference is that they apply filtering steps to their funnel. Students apply to the bootcamp, and sometimes have interviews to get in.

The original model had

```
[World]       > Prospectives    @ 1000
Prospectives  > Students        @ 0.1
```

Let's add some steps.

People in the world first become **aware** of bootcamps at some rate (e.g. click on an ad, or visit the landing page). Some of those **aware** people, upon being marketed to (or so the story goes) become **interested** in the bootcamp. Some of those interested prospects **apply** to the bootcamp. Those applicants become **interviews** at some rate, as some applications are rejected, and some applicants don't show up for interviews. Interviewees are **admitted** at some rate, and some of the admitted students decide to **enroll**.

Different bootcamps may break this down into different steps, but we'll call it close enough.

Let's pick assumptions so that we still end up with results that are close to the original numbers.

In [None]:
awareness_rate = 10000
interested_rate = 0.1
# 1000 new Prospectives

application_rate = 0.5
reject_rate = 0.3 # So, 7/10 applicants get an interview
no_show_rate = 0.1 # 1/10 interviewees doesn't show up
interview_rate = (1 - reject_rate) * (1 - no_show_rate)
interview_pass_rate = 0.5 # half of interviewees pass
enrollment_rate = 0.65 # 65% of admitted students enroll

# overall conversion from Prospective 
overall_conversion_rate = application_rate * interview_rate * interview_pass_rate * enrollment_rate

marketing_admissions_spec = f"""
[World]     > Aware         @ {awareness_rate}
Aware       > Interested    @ {interested_rate}
Interested  > Applicants    @ {application_rate}
Applicants  > Interviews    @ {interview_rate}
Interviews  > Admitted      @ {interview_pass_rate}
Admitted    > Students      @ {enrollment_rate}
"""
marketing_admissions_model = parse(marketing_admissions_spec)
print(overall_conversion_rate) 

0.10237500000000001


I picked the numbers so that we got close to our original 1000 prospects and `10%` enrollment, but these rates are probably within reason for bootcamps.

We end up with 102 students enrolled each tick instead of the 100 from the original model.
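
Why 102 and not 102.375? The stage-by-stage numbers are consistent with the simulation moving whole people and rounding down at every step. Here's a sketch of that cascade. The rounding-down rule is an assumption inferred from the outputs, and `Fraction` is used only to keep float noise out of the example.

```python
from fractions import Fraction
from math import floor

# Stage-to-stage rates from the funnel, written as exact fractions.
rates = [
    ("Interested", Fraction(1, 10)),
    ("Applicants", Fraction(1, 2)),
    ("Interviews", Fraction(7, 10) * Fraction(9, 10)),  # (1 - reject_rate) * (1 - no_show_rate)
    ("Admitted", Fraction(1, 2)),
    ("Students", Fraction(65, 100)),
]

count = 10_000  # newly Aware people per tick
for stage, rate in rates:
    count = floor(count * rate)  # assumption: partial people are dropped at each step
    print(f"{stage:<11}{count}")
```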

In [None]:
marketing_admissions_results = marketing_admissions_model.run(rounds=8)
marketing_admissions_rendered = marketing_admissions_model.render_html(marketing_admissions_results)
HTML(marketing_admissions_rendered)

Round,Aware,Interested,Applicants,Interviews,Admitted,Students
0,0,0,0,0,0,0
1,10000,0,0,0,0,0
2,10000,1000,0,0,0,0
3,10000,1000,500,0,0,0
4,10000,1000,500,315,0,0
5,10000,1000,500,315,157,0
6,10000,1000,500,315,157,102
7,10000,1000,500,315,157,204
8,10000,1000,500,315,157,306


Note that round 6 here is when we get our first students. That was round 2 of the earlier model. The ticks probably don't make as much sense here; this is just illustrative.

## Graduation Rate

Let's pick apart the next number we made up - the 85% graduation rate.

In the simplified model:

```
Students      > Graduates       @ 0.85
```

Schools don't randomly graduate 85% of their students. Programs see students drop out if they have life events that prevent them from continuing, or if they don't pass an assessment. Some students decide that the program isn't for them, and drop out. Online, some students fade out over time, moving through the curriculum slower and slower, and finally stopping forward progress.

For our model, let's split the program into some number of _modules_, and say that for each module, there's a chance of dropping out, and a chance of not passing a gating assessment.

For most programs, these rates are not the same for every module. Some programs may see lots of students decide early on that the program isn't for them. Some programs may have a difficult assessment in a particular module.

For the simplicity of the code, we're going to say that all the modules in the bootcamp see the same rate of dropoff and pass rate.

In [None]:
num_modules = 4
dropout_chance = .01
module_pass_rate = .97
graduation_rate = (module_pass_rate - dropout_chance) ** num_modules

graduation_spec = f"""
[Prospects] > Students   @ 102
Students      > Graduates       @ {graduation_rate}
"""
graduation_model = parse(graduation_spec)
print(f"graduation_rate: {graduation_rate}")

graduation_results = graduation_model.run(rounds=4)
graduation_rendered = graduation_model.render_html(graduation_results)
HTML(graduation_rendered)

graduation_rate: 0.8493465599999999


Round,Students,Graduates
0,0,0
1,102,0
2,102,86
3,102,172
4,102,258


We get close to our earlier `85%` assumption if we let our bootcamp have 4 modules, each with a `1%` chance of dropping out and a `97%` chance of passing the end-of-module assessment.

Note: I had to drop into python for these, since the systems library can't do exponents.
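
Because the per-module rate gets compounded, the number of modules matters a lot. This small sketch reuses the same formula and the same made-up rates to show how the graduation rate moves as modules are added or removed. Wrapping it in a `graduation_rate` function is just for illustration.

```python
dropout_chance = 0.01
module_pass_rate = 0.97

def graduation_rate(num_modules):
    """Chance of clearing every module, using the same formula as the cell above."""
    return (module_pass_rate - dropout_chance) ** num_modules

for n in range(1, 9):
    print(f"{n} modules: {graduation_rate(n):.1%}")
```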

## The Job Search

```
Graduates     > Placed          @ 0.85
```

If only it were so simple.

After students graduate, they take some time to polish their resume and portfolio projects, find companies with job openings that fit their skills and interests, and start applying to jobs.

Some bootcamps have a career services, employer partnerships, or job placement team that helps students connect with employers and land a job.

Other bootcamps leave this step entirely up to the student.

Some bootcamps partner with external services for all or part of this career services and job hunt stage.

There are lots of different ways to adjust job search variables. I'm picking ones that are easy to model.

In [None]:
job_app_rate = 20 # how many applications a student submits per tick
phone_screen_rate = 0.1 # fraction of submitted applications that lead to a passed phone screen
onsite_rate = 0.3 # fraction of passed phone screens that lead to a passed onsite
accept_rate = 0.95 # fraction of passed onsites that lead to an accepted offer
placement_rate = job_app_rate * phone_screen_rate * onsite_rate * accept_rate
print(placement_rate)

0.57


By tweaking these numbers, we can find rates that get students jobs after a certain number of applications. It's probably more random than this - students are different in their interests, skills, and how they interview.

To make this slightly closer to the real world, grads do not stop job searching after one tick. Most 3-month bootcamps count placements in the 6 months following graduation, so we'll count placements for two ticks.

In [None]:
placement_rate = job_app_rate * phone_screen_rate * onsite_rate * accept_rate
placement_spec = f"""
[Students]     > Graduates(86)    @ 86
Graduates      > Searchers        @ Conversion(1 - {placement_rate})
Searchers      > NotPlaced        @ Conversion(1 - {placement_rate})
[AllSearchers] > Placed           @ (Graduates + Searchers) * {placement_rate}
"""
placement_model = parse(placement_spec)
placement_results = placement_model.run(rounds=5)
placement_rendered = placement_model.render_html(placement_results)
HTML(placement_rendered)

Round,Graduates,Searchers,NotPlaced,Placed
0,86,0,0,0.0
1,86,36,0,49.02
2,86,36,15,118.56
3,86,36,30,188.1
4,86,36,45,257.64
5,86,36,60,327.17999999999995


We end up near our overall placement rate of `85%` (about `81%` over two ticks) by picking numbers so that the placement rate _per tick_ is `57%`. That means that a lot of job searching grads will end up taking more than the length of the bootcamp to find a job.
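
As a rough check on the two-tick window, the overall placement rate can be worked out by hand. This sketch treats the per-tick rate as the chance that any one searcher lands a job that tick, and it ignores the rounding the simulation applies to `Searchers`, so it won't match the table exactly.

```python
placement_rate_per_tick = 20 * 0.1 * 0.3 * 0.95   # 0.57, same product as above

still_searching_after_one_tick = 1 - placement_rate_per_tick
placed_within_two_ticks = 1 - still_searching_after_one_tick ** 2

print(f"{placed_within_two_ticks:.1%}")
```

The result comes out a little under the `85%` used in the simplified model.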

# Costs of a bootcamp

Now let's take a closer look at what goes into those CAC and COGS numbers.

Note: We're still going to leave out a bunch of things that a real bootcamp would have in its financials. Overhead, curriculum development, engineering and product, the list goes on.

## What costs go into CAC?

Let's take the marketing and admissions funnel steps that we defined above and attach cost numbers to each step.

- Digital Marketing spend
  - Paid Search
  - Paid Social
- Marketing overhead
- Admissions application review
- Admissions interviews

Real accountants and finance folks might not put all of this stuff into CAC (especially marketing overhead). In my experience, these costs behave like CAC costs do - they tend to scale with revenue. I'm lumping them in.


Picking numbers does make our CAC calculation more complicated.

Each of these costs is the cost to get a prospective student to the next step in the funnel.

You end up paying 
```
price / (product of subsequent steps' conversion rates)
```
for each of these steps.
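
Here's that formula worked through for the whole funnel, as a sketch. The conversion rates are the funnel assumptions from earlier, and the prices are the same per-person costs used in the model (dollars per person sitting in each stage). Because this skips the whole-person rounding the simulation does, the totals will differ from the simulated ones by a few dollars.

```python
# Conversion rate out of each funnel stage (same assumptions as the funnel model).
conversion = {
    "Aware": 0.1,                          # -> Interested
    "Interested": 0.5,                     # -> Applicants
    "Applicants": (1 - 0.3) * (1 - 0.1),   # -> Interviews
    "Interviews": 0.5,                     # -> Admitted
    "Admitted": 0.65,                      # -> Students
}
# Price paid per person in each stage, in dollars.
price = {"Aware": 5, "Interested": 15, "Applicants": 10, "Interviews": 40, "Admitted": 200}

stages = list(conversion)
cost_per_enrollment = {}
for i, stage in enumerate(stages):
    chance_of_enrolling = 1.0
    for later in stages[i:]:
        chance_of_enrolling *= conversion[later]  # product of the remaining conversion rates
    cost_per_enrollment[stage] = price[stage] / chance_of_enrolling

for stage, cost in cost_per_enrollment.items():
    print(f"{stage:<11} ${cost:,.0f}")
print(f"{'total':<11} ${sum(cost_per_enrollment.values()):,.0f}")
```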

In [None]:
paid_social = 5
paid_search = 15
application_review = 10
admissions_interview = 40
marketing_admissions_overhead = 200

cac_spec = f"""
[World]     > Aware         @ {awareness_rate}
Aware       > Interested    @ {interested_rate}
Interested  > Applicants    @ {application_rate}
Applicants  > Interviews    @ {interview_rate}
Interviews  > Admitted      @ {interview_pass_rate}
Admitted    > Students      @ {enrollment_rate}

[money] > PaidSocial        @ Aware * {paid_social}
[money] > PaidSearch        @ Interested * {paid_search}
[money] > Application       @ Applicants * {application_review}
[money] > Interview         @ Interviews * {admissions_interview}
[money] > Overhead          @ Admitted * {marketing_admissions_overhead}
"""

cac_model = parse(cac_spec)
cac_results = cac_model.run(rounds=7)
cac_rendered = cac_model.render_html(cac_results)
HTML(cac_rendered)

Round,Aware,Interested,Applicants,Interviews,Admitted,Students,PaidSocial,PaidSearch,Application,Interview,Overhead
0,0,0,0,0,0,0,0,0,0,0,0
1,10000,0,0,0,0,0,0,0,0,0,0
2,10000,1000,0,0,0,0,50000,0,0,0,0
3,10000,1000,500,0,0,0,100000,15000,0,0,0
4,10000,1000,500,315,0,0,150000,30000,5000,0,0
5,10000,1000,500,315,157,0,200000,45000,10000,12600,0
6,10000,1000,500,315,157,102,250000,60000,15000,25200,31400
7,10000,1000,500,315,157,204,300000,75000,20000,37800,62800


In [None]:
def get_per_tick(name, results):
  return results[-1][name] - results[-2][name]

cac = [
    ("PaidSocial", get_per_tick("PaidSocial", cac_results) / get_per_tick("Students", cac_results)),
    ("PaidSearch", get_per_tick("PaidSearch", cac_results) / get_per_tick("Students", cac_results)),
    ("Application", get_per_tick("Application", cac_results) / get_per_tick("Students", cac_results)),
    ("Interview", get_per_tick("Interview", cac_results) / get_per_tick("Students", cac_results)),
    ("Overhead", get_per_tick("Overhead", cac_results) / get_per_tick("Students", cac_results)),
]

pprint.pprint(cac)
total = sum([y for (x,y) in cac])
print(f"\ntotal: {total}")

[('PaidSocial', 490.19607843137254),
 ('PaidSearch', 147.05882352941177),
 ('Application', 49.01960784313726),
 ('Interview', 123.52941176470588),
 ('Overhead', 307.84313725490193)]

total: 1117.6470588235293


So, on a per-student basis, this works out to about

* `$490` on paid social
* `$147` on paid search
* `$49` on reviewing applications
* `$124` on interviewing
* `$308` on marketing and admissions overhead

For a total CAC of `$1118` per enrollment.

I think these are reasonable guesses for spend at each stage - if you have better numbers, try yours! 

We're also in the ballpark of the $1000 we estimated in our naive model.

**Wait!**

There's still that pesky `1.4` multiplier we've got to deal with. If our imaginary bootcamp wants to get _paid_, we have to graduate and place students - not just enroll them.

The CAC on a _per-placement_ basis comes out to `$1564`.

## What costs go into COGS?

For bootcamp cost modeling, it's really important to figure out how much it costs to teach students to code.

Like most businesses, staff is a big cost.

Bootcamps tend to employ two kinds of instructional staff - **instructors** and **coaches**. (The names are different at different bootcamps, but the roles are usually similar). Instructors are more senior, often having some experience in the industry. Coaches are frequently recent graduates of the program.

In-person bootcamps also have to pay for the costs of running their **space**. Since this cost goes up when schools have more students, I'm including it in COGS, though it might 'properly' be called OPEX or something.

Also included in COGS are the costs for placing students in jobs. For the model, this is going to boil down to staff costs for **employer partnerships** and **career coaches**.

COGS also ought to include any other miscellaneous costs that go up with every student - snacks, swag, graduation certificates, textbooks, licenses, hosting fees for websites, etc. I'm dumping all this into a little **other** variable and calling it a day.

I'm making informed guesses about salaries, but in reality there's a wide range. Actual salaries vary by region, experience, and other factors.

In [None]:
# Instructional staff
student_instructor_ratio = 1/30
student_coach_ratio = 1/10
instructor_annual = 125000
coach_annual = 70000

# Space
space_cost_per_student = 400

# Career Services
employer_partner_ratio = 1/150
career_coach_ratio = 1/30
employer_partner_annual = 70000
career_coach_annual = 55000

# Miscellaneous
other = 50
ticks_per_year = 4

cogs_spec = f"""
[Prospective] > Students        @ 100
Students      > Graduates       @ 0.85
Graduates     > Placed        @ 0.85

[money]       > Instructors      @ (Students * {student_instructor_ratio} * {instructor_annual} / {ticks_per_year})
[money]       > Coaches          @ (Students * {student_coach_ratio} * {coach_annual} / {ticks_per_year})
[money]       > Space            @ (Students * {space_cost_per_student})
[money]       > Other            @ (Students * {other})
[money]       > EmployerPartners @ (Graduates * {employer_partner_ratio} * {employer_partner_annual} / {ticks_per_year})
[money]       > CareerCoaches    @ (Graduates * {career_coach_ratio} * {career_coach_annual} / {ticks_per_year})
"""

cogs_model = parse(cogs_spec)
cogs_results = cogs_model.run(rounds=7)
cogs_rendered = cogs_model.render_html(cogs_results)
HTML(cogs_rendered)

Round,Students,Graduates,Placed,Instructors,Coaches,Space,Other,EmployerPartners,CareerCoaches
0,0,0,0,0.0,0.0,0,0,0.0,0.0
1,100,0,0,0.0,0.0,0,0,0.0,0.0
2,100,85,0,104166.66666666669,175000.0,40000,5000,0.0,0.0
3,100,85,72,208333.3333333333,350000.0,80000,10000,9916.666666666666,38958.333333333336
4,100,85,144,312500.0,525000.0,120000,15000,19833.33333333333,77916.66666666667
5,100,85,216,416666.6666666667,700000.0,160000,20000,29750.0,116875.0
6,100,85,288,520833.3333333334,875000.0,200000,25000,39666.66666666666,155833.33333333334
7,100,85,360,625000.0,1050000.0,240000,30000,49583.33333333333,194791.6666666667


This gives us the total spending per item for each tick, which we can use to find the COGS per placement.

Why per placement?

Our pretend bootcamp only gets paid when students get placed, at least in our unit economics model.

Let's calculate.


In [None]:
def get_per_tick(name, results):
  return results[-1][name] - results[-2][name]

cogs = [
    ("Instructors", get_per_tick("Instructors", cogs_results) / get_per_tick("Placed", cogs_results)),
    ("Coaches", get_per_tick("Coaches", cogs_results) / get_per_tick("Placed", cogs_results)),
    ("Space", get_per_tick("Space", cogs_results) / get_per_tick("Placed", cogs_results)),
    ("Other", get_per_tick("Other", cogs_results) / get_per_tick("Placed", cogs_results)),
    ("CareerCoaches", get_per_tick("CareerCoaches", cogs_results) / get_per_tick("Placed", cogs_results)),
    ("EmployerPartners", get_per_tick("EmployerPartners", cogs_results) / get_per_tick("Placed", cogs_results)),
]

pprint.pprint(cogs)
total = sum([y for (x,y) in cogs])
print(f"\ntotal: {total}")

[('Instructors', 1446.7592592592587),
 ('Coaches', 2430.5555555555557),
 ('Space', 555.5555555555555),
 ('Other', 69.44444444444444),
 ('CareerCoaches', 541.087962962963),
 ('EmployerPartners', 137.73148148148144)]

total: 5181.134259259259


So, our COGS on a per-placement basis are

* `$1447` - Instructors
* `$2431` - Coaches
* `$556` - Space
* `$69` - Other
* `$541` - Career Coaching
* `$138` - Employer Partnerships

For a total of  `$5181`.

I picked the numbers to get us to a COGS close to our original `5000`, but I think they're probably close to what some real bootcamps have.

Tweaking the variables can be illustrative for understanding what drives costs.

If you change the number of students per instructor, or the salary of coaches, how does that impact the cost per placement?
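
One way to poke at that question without re-running the simulation is to fold the COGS arithmetic into a single function. This is a sketch under the same assumptions as the cell above. `cogs_per_placement` and its parameter names are invented here for illustration, and ratios are written as "students per instructor" instead of the `1/30` form.

```python
import math

def cogs_per_placement(
    students=100, graduation_rate=0.85, placement_rate=0.85,
    students_per_instructor=30, instructor_annual=125_000,
    students_per_coach=10, coach_annual=70_000,
    space_per_student=400, other_per_student=50,
    graduates_per_employer_partner=150, employer_partner_annual=70_000,
    graduates_per_career_coach=30, career_coach_annual=55_000,
    ticks_per_year=4,
):
    """Per-tick teaching and career-services cost divided by placements per tick."""
    # Round down to whole people; the tiny epsilon guards against float noise.
    graduates = math.floor(students * graduation_rate + 1e-9)
    placed = math.floor(graduates * placement_rate + 1e-9)

    per_student = (
        instructor_annual / students_per_instructor / ticks_per_year
        + coach_annual / students_per_coach / ticks_per_year
        + space_per_student
        + other_per_student
    )
    per_graduate = (
        employer_partner_annual / graduates_per_employer_partner / ticks_per_year
        + career_coach_annual / graduates_per_career_coach / ticks_per_year
    )
    return (students * per_student + graduates * per_graduate) / placed

baseline = cogs_per_placement()
smaller_classes = cogs_per_placement(students_per_instructor=15)
print(f"baseline:                   ${baseline:,.0f}")
print(f"15 students per instructor: ${smaller_classes:,.0f}")
```

With the defaults this reproduces the roughly `$5181` total; halving the class size per instructor adds the instructor cost a second time.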

Below, there's a complete model, where you can adjust pairs of values like `(student_instructor_ratio, module_pass_rate)` to test a hypothesis like "Increasing the number of instructors will help more students pass their module assessments" and see the impact on costs and revenue.

# Pulling it all together

The full bootcamp business model, with all the glorious complications that we've added in.

There are a ton of variables you can adjust in the model - the default values here are the ones we picked above, but you can plug in values from actual or hypothetical bootcamps to see how they do.

Have fun!

## Big, Hairy Bootcamp Model


In [None]:
# Marketing and Admissions

awareness_rate = 10000 # number of new 'aware' people every tick
interested_rate = 0.1  # % converted from 'aware' to 'interested'
application_rate = 0.5 # % of 'interested' who apply
reject_rate = 0.3 # % apps that don't get an interview
no_show_rate = 0.1 # % interviewees that don't show up
interview_rate = (1 - reject_rate) * (1 - no_show_rate)
interview_pass_rate = 0.5
enrollment_rate = 0.65 # % of admitted students that enroll

# Cost to convert to the next funnel step

paid_social = 5 # cost of each new 'aware'
paid_search = 15 # cost to convert 'aware' -> 'interested'
application_review = 10 # cost to review an application
admissions_interview = 40 # cost to schedule, run, and evaluate an admissions interview
marketing_admissions_overhead = 200 # overhead per admitted student

# Program

num_modules = 4
dropout_chance = .01
module_pass_rate = .97
graduation_rate = (module_pass_rate - dropout_chance) ** num_modules

# Program Costs

student_instructor_ratio = 1/30
student_coach_ratio = 1/10
instructor_annual = 125000
coach_annual = 70000

space_cost_per_student = 400
other_costs = 50

# Job Search

job_app_rate = 20 # how many applications a student submits per tick
phone_screen_rate = 0.1 # fraction of submitted applications that lead to a passed phone screen
onsite_rate = 0.3 # fraction of passed phone screens that lead to a passed onsite
accept_rate = 0.95 # fraction of passed onsites that lead to an accepted offer
placement_rate = job_app_rate * phone_screen_rate * onsite_rate * accept_rate

# Career Services Costs

employer_partner_ratio = 1/150
career_coach_ratio = 1/30
employer_partner_annual = 70000
career_coach_annual = 55000

# Tuition
tuition = 10000

# Time
ticks_per_year = 4
months_per_tick = 3

whole_bootcamp_spec = f"""
[time]      > Month         @ {months_per_tick}

[World]     > Aware         @ {awareness_rate}
Aware       > Interested    @ {interested_rate}
Interested  > Applicants    @ {application_rate}
Applicants  > Interviews    @ {interview_rate}
Interviews  > Admitted      @ {interview_pass_rate}
Admitted    > Students      @ {enrollment_rate}

Students       > Graduates    @ {graduation_rate}
Graduates      > Searchers        @ Conversion(1 - {placement_rate})
Searchers      > NotPlaced        @ Conversion(1 - {placement_rate})
[AllSearchers] > Placed           @ (Graduates + Searchers) * {placement_rate}

[money] > PaidSocial        @ Aware * {paid_social}
[money] > PaidSearch        @ Interested * {paid_search}
[money] > Application       @ Applicants * {application_review}
[money] > Interview         @ Interviews * {admissions_interview}
[money] > Overhead          @ Admitted * {marketing_admissions_overhead}

[money]       > Instructors      @ (Students * {student_instructor_ratio} * {instructor_annual} / {ticks_per_year})
[money]       > Coaches          @ (Students * {student_coach_ratio} * {coach_annual} / {ticks_per_year})
[money]       > Space            @ (Students * {space_cost_per_student})
[money]       > Other            @ (Students * {other_costs})
[money]       > EmployerPartners @ (Graduates * {employer_partner_ratio} * {employer_partner_annual} / {ticks_per_year})
[money]       > CareerCoaches    @ (Graduates * {career_coach_ratio} * {career_coach_annual} / {ticks_per_year})

[money]       > Revenue         @ ({tuition} * Placed)
"""

whole_bootcamp_model = parse(whole_bootcamp_spec)
whole_bootcamp_results = whole_bootcamp_model.run(rounds=20)
whole_bootcamp_rendered = whole_bootcamp_model.render_html(whole_bootcamp_results)
HTML(whole_bootcamp_rendered)

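| Round | Month | Aware | Interested | Applicants | Interviews | Admitted | Students | Graduates | Searchers | NotPlaced | Placed | PaidSocial | PaidSearch | Application | Interview | Overhead | Instructors | Coaches | Space | Other | EmployerPartners | CareerCoaches | Revenue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 1 | 3 | 10000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 2 | 6 | 10000 | 1000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 50000 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 3 | 9 | 10000 | 1000 | 500 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 100000 | 15000 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 4 | 12 | 10000 | 1000 | 500 | 315 | 0 | 0 | 0 | 0 | 0 | 0.0 | 150000 | 30000 | 5000 | 0 | 0 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 5 | 15 | 10000 | 1000 | 500 | 315 | 157 | 0 | 0 | 0 | 0 | 0.0 | 200000 | 45000 | 10000 | 12600 | 0 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 6 | 18 | 10000 | 1000 | 500 | 315 | 157 | 102 | 0 | 0 | 0 | 0.0 | 250000 | 60000 | 15000 | 25200 | 31400 | 0.0 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 |
| 7 | 21 | 10000 | 1000 | 500 | 315 | 157 | 102 | 86 | 0 | 0 | 0.0 | 300000 | 75000 | 20000 | 37800 | 62800 | 106250.0 | 178500.00000000003 | 40800 | 5100 | 0.0 | 0.0 | 0.0 |
| 8 | 24 | 10000 | 1000 | 500 | 315 | 157 | 102 | 86 | 36 | 0 | 49.02 | 350000 | 90000 | 25000 | 50400 | 94200 | 212500.0 | 357000.00000000006 | 81600 | 10200 | 10033.333333333334 | 39416.66666666666 | 0.0 |
| 9 | 27 | 10000 | 1000 | 500 | 315 | 157 | 102 | 86 | 36 | 15 | 118.56 | 400000 | 105000 | 30000 | 63000 | 125600 | 318750.0 | 535500.0000000001 | 122400 | 15300 | 20066.666666666668 | 78833.33333333333 | 490200.0 |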


**Placed count**

The "Placed" count in the table is a little off, since it misses some fractional students. I adjust it in the printed stats, including the tuition collected, since that's based on placements.

Technical explanation: the systems library doesn't handle multiple conversions from a single stock, since it applies flows in order, and a conversion zeros out the stock it was converting from. That's why `Placed` is fed from a separate `[AllSearchers]` stock in the spec, instead of being a second conversion out of `Graduates` and `Searchers`.
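
To see where the fractional students go, here's a rough sketch in plain Python (not the systems library). It assumes conversions round down to whole students, which is what the 86 to 36 step in the table suggests; `split` is a helper name made up for illustration.

```python
import math

placement_rate = 0.57          # 20 * 0.1 * 0.3 * 0.95, as in the model
graduates, searchers = 86, 36  # steady-state stocks from the results table

def split(stock, rate):
    """Return (whole students moved on by the conversion, students credited to Placed).

    ASSUMPTION: the conversion rounds down to whole students.
    """
    moved_on = math.floor(stock * (1 - rate))
    placed = stock * rate
    return moved_on, placed

to_searchers, placed_from_graduates = split(graduates, placement_rate)
to_not_placed, placed_from_searchers = split(searchers, placement_rate)

# Students who left a stock but were counted neither as moved on nor as placed
lost_per_tick = (graduates - to_searchers - placed_from_graduates) + (
    searchers - to_not_placed - placed_from_searchers
)

print(f"Graduates -> Searchers: {to_searchers}, Searchers -> NotPlaced: {to_not_placed}")
print(f"students missing from Placed each tick: {lost_per_tick:.2f}")
```

The shortfall is small per tick, but it adds up over a run, which is why the printed stats derive placements from graduates instead of reading the `Placed` stock directly.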

## Totals, Results, Summary

In [None]:
## Formatting the results

def get_steady_state(name, results):
  return results[-1][name]

def get_per_tick(name, results):
  return results[-1][name] - results[-2][name]

def get_total(name, results):
  # 'row' avoids shadowing the built-in round()
  return sum(row[name] for row in results)

total_students = get_total("Students", whole_bootcamp_results)
total_graduates = get_total("Graduates", whole_bootcamp_results)
total_placed = whole_bootcamp_results[-1]["Placed"]
total_not_placed = whole_bootcamp_results[-1]["NotPlaced"]
current_students = whole_bootcamp_results[-1]["Students"] 
still_searching = whole_bootcamp_results[-1]["Graduates"] + whole_bootcamp_results[-1]["Searchers"]

print("")
print(f"total students: {total_students}")
print(f"current_students:  {current_students}")
print(f"dropped out: {total_students - total_graduates - current_students}")
print(f"total graduates: {total_graduates}")
print(f"graduation_rate: {graduation_rate:.1%}")

# print(f"Rounding errors: {total_placed + total_not_placed + still_searching} should equal {total_graduates}")
# print(f"total placed is actually: {total_graduates - (total_not_placed + still_searching)}")
total_placed = total_graduates - (total_not_placed + still_searching)

print(f"total placed: {total_placed}")
print(f"total not placed: {total_not_placed}")
print(f"still searching: {still_searching}")

grad_placement_rate = total_placed / (total_graduates - still_searching)
success_rate = total_placed / (total_students - current_students - still_searching)
print("")
print(f"placement rate (per tick): {placement_rate:.1%}")
print(f"placement rate: {grad_placement_rate:.1%}")
print(f"prospective success rate (placed / total students): {success_rate:.1%}")

print("\nCAC")
whole_bootcamp_cac = [
    ("PaidSocial", get_per_tick("PaidSocial", whole_bootcamp_results) / get_steady_state("Students", whole_bootcamp_results)),
    ("PaidSearch", get_per_tick("PaidSearch", whole_bootcamp_results) / get_steady_state("Students", whole_bootcamp_results)),
    ("Application", get_per_tick("Application", whole_bootcamp_results) / get_steady_state("Students", whole_bootcamp_results)),
    ("Interview", get_per_tick("Interview", whole_bootcamp_results) / get_steady_state("Students", whole_bootcamp_results)),
    ("Overhead", get_per_tick("Overhead", whole_bootcamp_results) / get_steady_state("Students", whole_bootcamp_results)),
]

for name, cost in whole_bootcamp_cac:
  print(f"{name}: ${cost:.2f} (${cost / placement_rate:.2f} per placement)")
whole_bootcamp_cac_total = sum([y for (x,y) in whole_bootcamp_cac])
print(f"\nmarketing funnel spend per student per tick:\n ${whole_bootcamp_cac_total:.2f}")
print(f"marketing funnel, per placement:\n ${whole_bootcamp_cac_total / placement_rate:.2f}")
# counted from 'interested' on
overall_conversion_rate = application_rate * interview_rate * interview_pass_rate * enrollment_rate
print(f"overall funnel conversion rate: {overall_conversion_rate:.1%}\n") 

whole_bootcamp_cogs = [
    ("Instructors", get_per_tick("Instructors", whole_bootcamp_results) / get_per_tick("Placed", whole_bootcamp_results)),
    ("Coaches", get_per_tick("Coaches", whole_bootcamp_results) / get_per_tick("Placed", whole_bootcamp_results)),
    ("Space", get_per_tick("Space", whole_bootcamp_results) / get_per_tick("Placed", whole_bootcamp_results)),
    ("Other", get_per_tick("Other", whole_bootcamp_results) / get_per_tick("Placed", whole_bootcamp_results)),
    ("CareerCoaches", get_per_tick("CareerCoaches", whole_bootcamp_results) / get_per_tick("Placed", whole_bootcamp_results)),
    ("EmployerPartners", get_per_tick("EmployerPartners", whole_bootcamp_results) / get_per_tick("Placed", whole_bootcamp_results)),
]

print("COGS:")
for name, cost in whole_bootcamp_cogs:
  print(f"{name}: ${cost:.2f}")
whole_bootcamp_cogs_total = sum([y for (x,y) in whole_bootcamp_cogs])
print(f"\nTotal COGS per placement: ${whole_bootcamp_cogs_total:.2f}\n")

cost_centers = ["PaidSocial", "PaidSearch", "Application", "Interview", "Overhead", "Instructors", "Coaches", "Space", "Other", "EmployerPartners", "CareerCoaches"]
expenses_total = sum([whole_bootcamp_results[-1][name] for name in cost_centers])
print(f"Expenses (CAC and COGS): ${expenses_total}")
tuition_collected = total_placed * tuition
print(f"Tuition Revenue: ${tuition_collected}")




total students: 1530
current_students:  102
dropped out: 224
total graduates: 1204
graduation_rate: 84.9%
total placed: 902
total not placed: 180
still searching: 122

placement rate (per tick): 57.0%
placement rate: 83.4%
prospective success rate (placed / total students): 69.1%

CAC
PaidSocial: $490.20 ($859.99 per placement)
PaidSearch: $147.06 ($258.00 per placement)
Application: $49.02 ($86.00 per placement)
Interview: $123.53 ($216.72 per placement)
Overhead: $307.84 ($540.08 per placement)

marketing funnel spend per student per tick:
 $1117.65
marketing funnel, per placement:
 $1960.78
overall funnel conversion rate: 10.2%

COGS:
Instructors: $1527.90
Coaches: $2566.87
Space: $586.71
Other: $73.34
CareerCoaches: $566.82
EmployerPartners: $144.28

Total COGS per placement: $5465.92

Expenses (CAC and COGS): $7249550.0
Tuition Revenue: $9020000


## What did we learn?

* Bootcamps take a long time to make money, and the margins (at least for this `$10,000`-tuition bootcamp) aren't amazing.

* With the assumptions of this model, the biggest costs are instructional staff, then marketing, then career services and space. 

* For a prospective student, the chances of graduating and getting a job are not the same as the placement rate of the bootcamp. Bootcamps usually advertise the placement rate based on 'eligible students' - students who graduate and actively look for a job. If you're considering a bootcamp, you might care instead about your overall chances of success. With the numbers in this model, the placement rate is `83.4%`, but the prospective success rate is only `69.1%`.

* Charging for job placements means that COGS and CAC end up with multipliers based on the graduation and job placement rates (namely, the inverse of the product of the graduation and placement rates). Maybe this is an obvious result, but seeing the multiplier on the numbers made it more real to me.

* Small per-click costs early in the marketing funnel get greatly magnified by the dropoff in the rest of the funnel. The `$5` to make a new person aware of the bootcamp turned into `$490` per enrolled student, or `$860` per placement by the summary's calculation.

* The earlier in the funnel a cost sits, the more it gets magnified by the conversion rates of the later funnel steps. Late-stage costs like employer partnerships have a _much_ smaller multiplier than early-stage costs like marketing.
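
The magnification in the last two points can be checked by hand. This sketch takes the steady-state stock sizes from the results table and the per-step costs from the assumptions, and works out how much each step's cost grows once it's spread over the students who actually enroll. The dictionary names are just for illustration.

```python
# Steady-state stock sizes per tick, read from the results table
funnel = {"Aware": 10000, "Interested": 1000, "Applicants": 500, "Interviews": 315, "Admitted": 157}
students = 102

# Per-unit cost attached to each funnel stage, from the model's assumptions
unit_cost = {"Aware": 5, "Interested": 15, "Applicants": 10, "Interviews": 40, "Admitted": 200}

multiplier = {}
per_student_cost = {}
for stage, count in funnel.items():
    # How many people at this stage it takes to produce one enrolled student
    multiplier[stage] = count / students
    per_student_cost[stage] = unit_cost[stage] * multiplier[stage]
    print(f"{stage:<11} x{multiplier[stage]:6.2f}  ${unit_cost[stage]:>3} each -> ${per_student_cost[stage]:7.2f} per enrolled student")
```

The multiplier shrinks at every step down the funnel, so a dollar spent at the `Aware` step costs far more per student than a dollar spent at the `Admitted` step.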

## Real Bootcamp numbers

If you adjust the numbers to match the publicly available numbers for some actual bootcamps, you can try to see how they're doing.

Some of the variables don't make sense for some bootcamps. For example, online bootcamps spend `$0` on space for students. Similarly, if you want to model a bootcamp that accepts all students without an interview, set the interview pass rate to `100%` and the interview cost to `$0`.

Some structural changes require changing not just the numbers, but the formulas. If you wanted to model a bootcamp where every student who enrolled paid (instead of only placed students), you'd have to change the formula to multiply by `Students` instead of `Placed`. Since we built this up piece by piece, this kind of change shouldn't be too hard!
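
As a rough sketch of that change, here's the revenue per tick under both payment models, using the steady-state numbers from the results table. The variable names are mine, and the last line shows what the modified spec line might look like; I haven't re-run the full model with it.

```python
tuition = 10000
placement_rate = 0.57

# Steady-state stocks per tick, read from the results table
students, graduates, searchers = 102, 86, 36

# Pay-on-placement: tuition is only collected for students placed this tick
placed_per_tick = (graduates + searchers) * placement_rate
revenue_on_placement = tuition * placed_per_tick

# Pay-on-enrollment: every enrolled student pays
revenue_on_enrollment = tuition * students

print(f"per tick, paid on placement:  ${revenue_on_placement:,.0f}")
print(f"per tick, paid on enrollment: ${revenue_on_enrollment:,.0f}")

# The corresponding spec line, with Students swapped in for Placed
enrollment_revenue_line = f"[money]       > Revenue         @ ({tuition} * Students)"
print(enrollment_revenue_line)
```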

Some sources for bootcamp numbers:

- https://cirr.org/
- https://careerkarma.com/blog/bootcamp-market-report-2020/
- https://www.coursereport.com/reports/coding-bootcamp-market-size-research-2019
- https://www.switchup.org/rankings/coding-bootcamp-survey
- https://www.classcentral.com/report/bootcamps-and-isas/

Some sources for ad cost numbers:

- https://www.wordstream.com/blog/ws/2017/02/28/facebook-advertising-benchmarks (suggests an education "Cost per Action" of `$7.85` on Facebook and similar advertisers)
- https://www.webfx.com/blog/marketing/much-cost-advertise-google-adwords/ (Google Ads cost per click)

# More to explore

While this model generates insights, it leaves so much out! 

Here's a partial list of things-left-out.

## Relationships between variables

Lots of the variables are treated as independent in this model, but aren't really independent in a real school. If you fire all your coaches, your graduation rate won't stay the same. Setting the coach ratio to zero won't have that effect automatically in this model, so you could generate wacky results.

You can form hypotheses about how variables are related, and make connected changes.

Hypotheses to play with:

* spending on marketing affects how many people become aware, interested, apply, and enroll.
* changing tuition affects marketing conversion numbers
* paying teachers more makes the graduation numbers go up
* changing the employer partnerships ratios affects placement numbers

Try these experiments out in the notebook - it's free to run the model!
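
Here's one way a connected change might look. This is a purely hypothetical relationship (I made up the linear shape and the `sensitivity` number): the dropout chance rises as the coach ratio falls below the model's baseline, and the graduation rate follows from it using the same formula as the model.

```python
num_modules = 4
module_pass_rate = 0.97
base_coach_ratio = 1 / 10   # baseline from the model
base_dropout_chance = 0.01  # baseline from the model

def dropout_chance_for(coach_ratio, sensitivity=0.05):
    """HYPOTHETICAL: dropout chance rises linearly as coaching falls below baseline."""
    # 0.0 at (or above) the baseline ratio, 1.0 with no coaches at all
    shortfall = max(0.0, base_coach_ratio - coach_ratio) / base_coach_ratio
    return base_dropout_chance + sensitivity * shortfall

def graduation_rate_for(coach_ratio):
    """Same graduation formula as the model, with a coach-dependent dropout chance."""
    return (module_pass_rate - dropout_chance_for(coach_ratio)) ** num_modules

for ratio in (1 / 10, 1 / 20, 0):
    print(f"coach ratio {ratio:.2f}: graduation rate {graduation_rate_for(ratio):.1%}")
```

You could then feed `graduation_rate_for(student_coach_ratio)` into the spec in place of the fixed `graduation_rate`, so that cutting coaches saves money and costs graduates at the same time.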

## ISAs, Loans, and Tuition

Repayment isn't actually a flat tuition fee. When students enter an ISA, they pay a percentage of their salary for the length of the ISA. 

For other tuition payment models, students might have a payment plan, the loan provider might have a structured deal with the school, or there might be other factors.

The model here doesn't factor ISAs into tuition repayment, but you might try adding them if you're interested in how that affects things!
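
If you want a starting point, here's a minimal sketch of an ISA repayment calculation. The salaries, income share, term, and the optional payment cap are all made-up numbers for illustration, not terms from any real bootcamp.

```python
def isa_repayment(annual_salary, income_share, term_months, payment_cap=None):
    """Total repaid over the ISA term: a share of monthly salary, optionally capped.

    All parameters are hypothetical inputs, not values from the model above.
    """
    total = annual_salary / 12 * income_share * term_months
    if payment_cap is not None:
        total = min(total, payment_cap)
    return total

flat_tuition = 10000  # the flat tuition used in the model

scenarios = {
    "lower salary": isa_repayment(50000, 0.10, 24),
    "higher salary": isa_repayment(90000, 0.10, 24),
    "higher salary, capped": isa_repayment(90000, 0.10, 24, payment_cap=15000),
}

for label, repaid in scenarios.items():
    print(f"{label}: ${repaid:,.0f} repaid vs ${flat_tuition:,} flat tuition")
```

Under an ISA, revenue per placement depends on the salary of the job the student lands, so you'd swap the flat `tuition` in the revenue formula for an expected repayment.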

## Overhead and Management costs

We left out all the overhead and management costs, product development, etc.

It would be interesting to try to include them, though potentially hard.

An (admittedly work-intensive and slightly creepy) strategy for doing it might be to pull a bootcamp's org chart from LinkedIn, guesstimate salaries, and add it all up.

Depending on the bootcamp, overhead could be as large as or larger than all the other costs, or only a fraction.

## Costs aren't smooth

In this model, we treat things as continuous when they are really discrete, since that's easier to model. In reality, many elements of the cost model move in discrete jumps - you can't hire half a teacher, and you can't place a fractional student.
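
To get a feel for the difference, this sketch compares the smooth staffing cost used in the model with a version that rounds headcount up to whole hires. The function and its `whole_hires` flag are my own names; the student count, ratios, and salaries come from the model.

```python
import math

students = 102       # steady-state students from the results table
ticks_per_year = 4

def staff_cost_per_tick(students, students_per_staff, annual_salary, whole_hires):
    """Staffing cost for one tick, with either fractional or whole-person headcount."""
    headcount = students / students_per_staff
    if whole_hires:
        headcount = math.ceil(headcount)  # you can't hire half a teacher
    return headcount * annual_salary / ticks_per_year

for role, per_staff, salary in [("Instructors", 30, 125000), ("Coaches", 10, 70000)]:
    smooth = staff_cost_per_tick(students, per_staff, salary, whole_hires=False)
    lumpy = staff_cost_per_tick(students, per_staff, salary, whole_hires=True)
    print(f"{role}: ${smooth:,.0f} smooth vs ${lumpy:,.0f} with whole hires")
```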

## Marketing Funnel

I picked ad costs and conversion numbers mostly to make the model simple. If you talk to someone who actually runs one of these funnels, it's going to look _very_ different from what I have here.

Some differences:

* multiple channels, with different prices
* mix of organic and paid interest
* uncertainty of attribution
* multiple interactions needed to convert between each 'step' in this model
* very different costs and conversion rates

## Student aptitude or prior experience

One reason bootcamps have an application process is to try to pick students who they think will be a good fit for their program. This is some combination of prior experience, attitude, grit, and aptitude. This is probably pretty complicated, and if bootcamps could accurately tell who would succeed in their bootcamp, they might increase their graduation and job placement rates.

This is hard to model, and it has justice and fairness implications. If bootcamps just let in people who are likely to get a job afterwards because of their other credentials, the bootcamp is just acting as a filter.

Bootcamps likely have false positives (letting through someone who won't succeed) and false negatives (turning away someone who would have) at every active filtering step - their application, their interviews, and their in-program gating assessments. Driving down false positives would presumably help the conversion rates for later stages in the model. Driving down false negatives would let more people through, which is better for those students and for overall revenue.

Assessments are hard to build well.

## Employer Partnerships

Some number of students will be placed through employer partnerships.

Like everything about this model, this is an oversimplified version of how employer partnerships work. 

I've left employer partnerships out of the overall placements model, but if you want to play with it, you can.

In [None]:
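cold_reach_out_to_employers = 100 # employers contacted cold
response_rate = 0.2 # fraction of cold-contacted employers that respond
employer_inbound = 5 # employers that reach out on their own
match_rate = 0.5 # fraction of engaged employers that match with a student
positions_per_company = 2 # positions filled per matched company
partnership_placements = (cold_reach_out_to_employers * response_rate + employer_inbound) * match_rate * positions_per_company
print(partnership_placements)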

25.0


## Brand

Potential effects of brand on the model:
- attract better teachers, cheaper (improvement to graduation rate, decrease in COGS)
- attract students, cheaper (increase marketing effectiveness, increase enrollment conversion, decrease CAC)
- attract employers (improved placement rate, lower placement costs)

It might be possible to model this as a scalar factor that gets multiplied into these. For bootcamps whose brand is actually a negative factor, the value could be less than 1.

Different factors might affect brand - the active investment in brand marketing activities by the school, ratings and reviews on websites, PR and media coverage, the school's outcomes numbers, and more.
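
The scalar brand factor described above could be sketched like this. The helper names and the example factors of `1.1` and `0.8` are hypothetical; the base rates and ad cost come from the model.

```python
def rate_with_brand(rate, brand_factor):
    """Scale a conversion rate by the brand factor, keeping it a valid probability."""
    return min(1.0, max(0.0, rate * brand_factor))

def cost_with_brand(cost, brand_factor):
    """A stronger brand makes each conversion cheaper; a weaker one makes it pricier."""
    return cost / brand_factor

enrollment_rate = 0.65
placement_rate = 0.57
paid_social = 5

for brand_factor in (1.1, 1.0, 0.8):  # hypothetical strong, neutral, and weak brands
    print(
        f"brand {brand_factor}: "
        f"enrollment {rate_with_brand(enrollment_rate, brand_factor):.1%}, "
        f"placement {rate_with_brand(placement_rate, brand_factor):.1%}, "
        f"paid social ${cost_with_brand(paid_social, brand_factor):.2f}"
    )
```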

## Alumni

Having a large alumni network affects the ability to recruit new students, teachers, and place students into jobs.

It might work like the brand factor, but act as a function of the number of students who ever graduated.
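
One possible shape for that function, as a sketch: a boost that starts at `1.0` with no alumni and levels off as the network grows. The saturating form and both parameters (`max_boost`, `half_point`) are assumptions I picked for illustration.

```python
def alumni_factor(total_graduates, max_boost=0.2, half_point=1000):
    """HYPOTHETICAL saturating boost: 1.0 with no alumni, approaching 1 + max_boost.

    half_point is the number of graduates at which half of the maximum boost is reached.
    """
    return 1 + max_boost * total_graduates / (total_graduates + half_point)

for graduates_so_far in (0, 100, 1204, 10000):
    print(f"{graduates_so_far:>6} graduates -> factor {alumni_factor(graduates_so_far):.3f}")
```

The result could be multiplied into the same rates as the brand factor, with the running total of `Graduates` as its input.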

## Competition

Bootcamps don't exist in a bubble. When one bootcamp advertises, prospects become aware not only of that bootcamp, but of bootcamps as an educational option for them. While the industry is still fairly new, and most growth for any bootcamp comes from educating new prospective students about bootcamps, we're gradually moving towards a world of bootcamps competing with each other for students. 

The marketing and admissions funnel here doesn't say anything about a fixed pool of people to advertise to, or competing with other bootcamps for those students. I don't know if there's a good natural way to model how competition would work - maybe some combination of brand, marketing, and alumni.
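
The fixed-pool part, at least, is easy to sketch. Here the top of the funnel draws from a finite pool of prospects instead of an infinite `[World]`. The pool size is a made-up number, and the reach per tick matches the `10000` new 'aware' people per tick in the results table.

```python
def aware_per_tick(pool_size, reach_per_tick, ticks):
    """New 'aware' prospects each tick when advertising to a finite pool."""
    remaining = pool_size
    reached = []
    for _ in range(ticks):
        # You can't reach more people than are left in the pool
        new_aware = min(reach_per_tick, remaining)
        reached.append(new_aware)
        remaining -= new_aware
    return reached

# Hypothetical pool of 45,000 prospects
schedule = aware_per_tick(pool_size=45000, reach_per_tick=10000, ticks=6)
print(schedule)
```

A competitor could be modeled as a second draw on the same pool each tick, which would drain it faster for everyone.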

## Other business lines

The model presented here is based on what I think of as the traditional code bootcamp model. That is, full time, fixed length, with students that are career-seeking.

Companies that run this kind of bootcamp also make money other ways. Selling enterprise training is the most notable, but other business lines include running coworking spaces, charging employers for placements, selling curriculum, selling education software, and building other software (sometimes using students as freelancers, in effect). 

These are all left out of this analysis, but could be included.