# Table of Contents
 <p><div class="lev1"><a href="#Notation-and-Definition-of-Variables"><span class="toc-item-num">1 - </span>Notation and Definition of Variables</a></div><div class="lev1"><a href="#Gender-Diversity-Evolution-Model"><span class="toc-item-num">2 - </span>Gender Diversity Evolution Model</a></div><div class="lev2"><a href="#Department-composition"><span class="toc-item-num">2.1 - </span>Department composition</a></div><div class="lev2"><a href="#Department-size-and-rank-targets"><span class="toc-item-num">2.2 - </span>Department size and rank targets</a></div><div class="lev2"><a href="#Attrition-process"><span class="toc-item-num">2.3 - </span>Attrition process</a></div><div class="lev2"><a href="#Promotion-process"><span class="toc-item-num">2.4 - </span>Promotion process</a></div><div class="lev2"><a href="#Hiring-process"><span class="toc-item-num">2.5 - </span>Hiring process</a></div><div class="lev3"><a href="#Number-of-vacancies-at-each-rank"><span class="toc-item-num">2.5.1 - </span>Number of vacancies at each rank</a></div><div class="lev3"><a href="#Multinomial-Hiring-Process"><span class="toc-item-num">2.5.2 - </span>Multinomial Hiring Process</a></div><div class="lev3"><a href="#Professor-Demand-functions"><span class="toc-item-num">2.5.3 - </span>Professor Demand functions</a></div><div class="lev2"><a href="#Simulation"><span class="toc-item-num">2.6 - </span>Simulation</a></div>

# Notation and Definition of Variables


This is a list of some of the mathematical notation used in the equations below.

| Variable     | Description                      | Range                                           |
|--------------|----------------------------------|-------------------------------------------------|
| $R$ | the set of professor ranks: Assistant, Associate, Full, denoted numerically    | $\{1,2,3\}$ |
| $r$| rank of a particular group of professors   | $r \in R$                                       |
| $N$| total number of professors in the department| $N \in \mathbb Z^+$                            |
| $n^r$| number of professors at a given rank     | $n^r \in \mathbb Z^+$                           |
| $G$  | the set of professor genders             | $\{m, f\}$                                      |
| $g$  | the gender of a particular professor     | $g \in G$                                       |
| $q^r$ | target percentage of professors at rank $r$ | $q^r \in [0,1]$ |
| $T^N$ | target number for the size of the department | $T^N \in \mathbb Z^+$ |
| $\bar{\alpha}^r_g$ | the long-term hiring rate for a professor at rank $r$ and gender $g$ | $\bar{\alpha}^r_g \in [0,1]$ |
| $\beta$ | the sensitivity of the hiring rate to the gap between the rank target and actual values | $\beta \in \mathbb R^+$ |
| $\gamma$ | the sensitivity of the hiring rate to the gap between the target and actual department size | $\gamma \in \mathbb R^+$ |
| $v^r$ | number of vacancies at rank $r$ | $v^r \in \mathbb Z^+$ |
| $v^N$ | total number of vacancies in the department | $v^N \in \mathbb Z^+$ |
| $h^r_g$ | number of professors hired at rank $r$ and gender $g$ | $h^r_g \in \mathbb Z^+$ |




# Gender Diversity Evolution Model

## Department composition
Let's lay a very simple foundation for the model by setting some definitions. An academic department is made up of both male and female professors. Each professor has a rank contingent on the professor's position in the promotion ladder. The most junior rank is the Assistant professor who is a tenure-track candidate but who has not yet achieved tenure. The second rank is the Associate professor who has achieved tenure, but who does not yet have the publication/teaching/committee record to achieve the level of full professor. The final rank is the Full professor which is the top of the promotion ladder. 

To formalize these basic definitions, we define $n^r$ as the number of professors at a given rank. That number is equal to the sum of male $n^r_m$ and female $n^r_f$ professors at that rank, in the department. 

$$
\begin{equation}
n^r = n^r_f + n^r_m
\end{equation}
$$
The total size of the department $N$ is equal to the sum of the number of professors across all ranks.

$$
\begin{equation}
N = n^1 + n^2 + n^3 = \sum_{i=1}^{3} n^i
\end{equation}
$$

Note that because the department size and number of professors at each rank will change from timestep to timestep, we include the time variable as a parameter. As a convention we will omit the time parameter unless its inclusion is important to the clarity of the argument. As an example, the department size equation above is more accurately described by:

$$
\begin{equation}
N(t) = n^1(t) + n^2(t) + n^3(t) = \sum_{i=1}^{3} n^i(t)
\end{equation}
$$
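As a minimal sketch of these definitions, the head counts can be stored by rank and gender and summed. The dictionary layout, the function names, and the numbers below are illustrative assumptions, not part of the model specification.

```python
# Hypothetical head counts n^r_g, keyed by (rank, gender).
# Rank 1 = Assistant, 2 = Associate, 3 = Full.
counts = {
    (1, "f"): 4, (1, "m"): 6,
    (2, "f"): 3, (2, "m"): 7,
    (3, "f"): 5, (3, "m"): 15,
}

def rank_size(counts, r):
    """n^r = n^r_f + n^r_m"""
    return counts[(r, "f")] + counts[(r, "m")]

def department_size(counts):
    """N = n^1 + n^2 + n^3"""
    return sum(rank_size(counts, r) for r in (1, 2, 3))

print([rank_size(counts, r) for r in (1, 2, 3)])  # [10, 10, 20]
print(department_size(counts))                    # 40
```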


## Department size and rank targets

The university administration, in concert with each department's leadership, sets targets for the overall department size as well as the number of professors at each rank. First we define the target size of the entire department, $T^N \in \mathbb Z^+$ with $T^N \ll \infty$.

Because the department size may fluctuate from year to year due to faculty hiring or attrition, it is convenient to express the target number of professors at each rank (Assistant, Associate, and Full) as shares of the target department size. These shares are represented by $q^r \in [0,1]$. For example, a department may set the target shares at 20% Assistant professors, 15% Associate professors, and 65% Full professors. The professor shares are constrained by the requirement:

$$
\begin{equation}
\sum_{r=1}^{3} q^r = 1
\end{equation}
$$

Based upon these pre-determined shares, the target number of professors in the department at a given rank is:

$$
\begin{equation}
\text{target number of professors at rank } r = \lfloor q^r\cdot T^N \rfloor
\label{numprofrank}
\end{equation}
$$


Ideally equation (\ref{numprofrank}) would always produce an integer value so that we would not have to deal with fractional professors. To guarantee that the target number of professors at any rank is always an integer and that the sum of the rank targets is less than or equal to the target department size, we employ the floor function $\lfloor \cdot \rfloor$, which rounds down to the nearest integer value. Because each floored term is no larger than $q^r \cdot T^N$ and the shares sum to 1, the floor function ensures:

$$
\begin{equation}
\sum_{r = 1}^3 \lfloor q^r\cdot T^N \rfloor \leq T^N
\end{equation}
$$

As an example, consider a department with a target department size $T^N$ of 100. A possible share for full professors, $q^3$, may be 70%. Thus the target number of full professors in the department would be $\lfloor 0.70 \cdot 100 \rfloor = 70$. While the number of professors at each rank as well as the department size may vary from year to year, the target percentage $q^r$ will remain fixed.
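The rank targets can be sketched in a few lines. The function name `rank_targets` is hypothetical, and the shares are the 20% / 15% / 65% example from above. Using `Fraction` for the shares is an implementation choice made here so that floating-point rounding cannot push a product just below an integer before the floor is taken.

```python
from fractions import Fraction
from math import floor

def rank_targets(shares, target_size):
    """Return floor(q^r * T^N) for every rank r."""
    if sum(shares.values()) != 1:
        raise ValueError("rank shares q^r must sum to 1")
    return {r: floor(q * target_size) for r, q in shares.items()}

# Example shares from the text: 20% Assistant, 15% Associate, 65% Full.
shares = {1: Fraction(20, 100), 2: Fraction(15, 100), 3: Fraction(65, 100)}

targets = rank_targets(shares, 100)
print(targets)  # {1: 20, 2: 15, 3: 65}

# When q^r * T^N is not an integer the targets sum to less than T^N.
small = rank_targets(shares, 33)
print(small, sum(small.values()))  # {1: 6, 2: 4, 3: 21} 31
```

The second call shows the inequality in action: with $T^N = 33$ the floored targets add up to 31, which is below the department target.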

One assumption is that the department size target changes very slowly over time and hence we treat it as fixed for the purposes of our model. For longer term simulations, this assumption is not necessarily valid. The model is flexible enough to include a growth rate for the department size over time, which would be a simple extension of this model. 

In any given year the actual department size and the actual shares of professors by rank will differ from the target. We measure the percentage deviation from target with the following formula for overall department size. 

$$
\begin{equation}
\text{Deviation from Target department size at time t} = 1 - \frac{N(t)}{T^N}
\end{equation}
$$

The deviation is positive if the department size is less than the target and negative if the department size exceeds the target. The sign of the deviation indicates whether the deviation should inflate or deflate the hiring probability in the next timestep. The deviation from target for an individual professor rank is similarly measured as:

$$
\begin{equation}
\text{Deviation from department professor rank target at time } t = q^r - \frac{n^r(t)}{T^N}
\label{devprofrank}
\end{equation}
$$

The denominator of the second term is $T^N$, meaning that we compare the target share $q^r$ with the actual number of professors at that rank expressed as a share of the target department size. Some might suggest using $N(t)$ as the denominator instead, which would compare the target share with the actual relative share $\frac{n^r(t)}{N(t)}$. The choice of denominator depends on whether maintaining relative balance in professor shares is more or less important than pursuing the department size targets. This is still an open question.

Note that the rank and department size targets are gender agnostic.
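The two deviation measures can be sketched as small functions. The function names and the numbers are illustrative assumptions. Both functions divide by the target size $T^N$, as in the formulas above.

```python
def size_deviation(n_total, target_size):
    """1 - N(t) / T^N: positive below target, negative above target."""
    return 1 - n_total / target_size

def rank_deviation(q_r, n_r, target_size):
    """q^r - n^r(t) / T^N: positive when the rank is short of its target."""
    return q_r - n_r / target_size

# Hypothetical department with T^N = 100.
under = size_deviation(90, 100)            # about  0.10 (department too small)
over = size_deviation(110, 100)            # about -0.10 (department too large)
full_gap = rank_deviation(0.65, 60, 100)   # about  0.05 (five Full professors short)
print(under, over, full_gap)
```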


## Attrition process

Each year a number of faculty at each rank will leave due to retirement, alternative job opportunities, not receiving tenure, etc. The attrition rates are fixed over time but differ by rank and gender. We expect the attrition rate to be greatest for full professors, largely because of retirements, with lower attrition rates for associate and assistant professors.

The attrition rate for a rank and gender is denoted $a^r_g \in [0,1]$. Note that the attrition rate is a fixed constant and does not vary with time, meaning $a^r_g(t) = a^r_g$.

Within a given year, the number of attritions for a particular rank and gender group is:

$$
\begin{equation}
\text{Number of professor attritions for gender } g \text{, rank } r \text{, at time } t = a^r_g \cdot n^r_g(t)  
\end{equation}
$$

For example, the number of attritions of female assistant professors in a given year would be:

$$
\begin{equation}
\text{Number of female assistant professor attritions at time t} = a^1_f\cdot n^1_f(t)
\end{equation}
$$

Thus the total number of attritions at a given rank is equal to the sum of female and male attrition at that level.

$$
\begin{equation}
\text{Total number of attritions at rank r in year t } = a^r_f\cdot n^r_f(t) + a^r_m\cdot n^r_m(t)
\end{equation}
$$
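A short sketch of the attrition arithmetic follows. The rates and head counts are made up for illustration. Note that $a^r_g \cdot n^r_g(t)$ is generally not a whole number; the text does not say how fractional attritions are handled, so the sketch simply reports the raw product.

```python
GENDERS = ("f", "m")

# Hypothetical attrition rates a^r_g and head counts n^r_g(t).
attrition_rate = {(1, "f"): 0.10, (1, "m"): 0.10,
                  (2, "f"): 0.05, (2, "m"): 0.05,
                  (3, "f"): 0.25, (3, "m"): 0.25}
counts = {(1, "f"): 10, (1, "m"): 10,
          (2, "f"): 20, (2, "m"): 20,
          (3, "f"): 8, (3, "m"): 12}

def attritions(rate, counts, r, g):
    """a^r_g * n^r_g(t) for one rank and gender group."""
    return rate[(r, g)] * counts[(r, g)]

def rank_attritions(rate, counts, r):
    """a^r_f * n^r_f(t) + a^r_m * n^r_m(t)."""
    return sum(attritions(rate, counts, r, g) for g in GENDERS)

female_assistant = attritions(attrition_rate, counts, 1, "f")  # 0.10 * 10
full_total = rank_attritions(attrition_rate, counts, 3)        # 0.25 * 8 + 0.25 * 12
print(female_assistant, full_total)
```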


## Promotion process

Each year a portion of the faculty at the Assistant and Associate professor levels are promoted to a higher level. The promotion rate is also fixed and invariant to time. The promotion rate is denoted by $p^r_g \in [0,1]$, where $r$ represents the current rank of the professor and $g$ the gender. Thus $p^1_f$ represents the promotion rate for a female Assistant professor to an Associate professor.

$$
\begin{equation}
\text{Number of promotions from rank r to rank r+1, for gender g, at time t} = p^r_g \cdot n^r_g(t)   
\end{equation}
$$

As promotions occur and faculty are reshuffled between ranks, there is a corresponding effect upon the amount of deviation between target and actual faculty numbers at each rank. In other words, the deviation in equation (\ref{devprofrank}) must be recomputed with the new number of professors at each rank in the subsequent timestep. This change in deviation will influence the demand to hire at each rank.

The promotion rates for male and female professors are based upon department level data. However, setting a common promotion rate does not seem to change the underlying dynamics.
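The reshuffling described above can be sketched as follows. The function name `promote`, the rates, and the head counts are hypothetical. Rounding the number of promotions down with `floor` is an assumption made here to keep head counts integral; the text does not specify a rounding rule.

```python
from math import floor

GENDERS = ("f", "m")

def promote(counts, promo_rate):
    """Move floor(p^r_g * n^r_g) professors from rank r to rank r + 1, r = 1, 2."""
    new = dict(counts)
    for r in (1, 2):
        for g in GENDERS:
            # Promotions are computed from the start-of-year counts.
            moved = floor(promo_rate[(r, g)] * counts[(r, g)])
            new[(r, g)] -= moved
            new[(r + 1, g)] += moved
    return new

counts = {(1, "f"): 10, (1, "m"): 10,
          (2, "f"): 8, (2, "m"): 12,
          (3, "f"): 5, (3, "m"): 15}
promo_rate = {(1, "f"): 0.2, (1, "m"): 0.2,
              (2, "f"): 0.25, (2, "m"): 0.25}

after = promote(counts, promo_rate)
print(after)
# {(1, 'f'): 8, (1, 'm'): 8, (2, 'f'): 8, (2, 'm'): 11, (3, 'f'): 7, (3, 'm'): 18}
```

Promotions leave the department size unchanged but shift head counts upward, so the Assistant rank moves further below its target while the Full rank moves further above it.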

## Hiring process

### Number of vacancies at each rank
Each year the department makes decisions to hire additional faculty. Some of these hires fill vacancies created by attrition while other hires may arise from chances to add highly qualified individuals to the department.

The two questions surrounding hiring are who to hire (meaning rank and gender) and how many to hire (meaning the number of vacancies). I address the second question first. As mentioned above, each department has both an actual and a target number of faculty at each rank as well as for the department overall. The number of vacancies open for hiring ($v^r$) at a given rank is equal to the difference between the rank target, $\lfloor q^r\cdot T^N \rfloor$, and the actual number of professors at that rank, $n^r$.

$$
\begin{equation}
v^r = \max\{\lfloor q^r \cdot T^N \rfloor - n^r(t), 0\}
\end{equation}
$$

Including the $\max$ function ensures that the number of vacancies at all levels is non-negative. Even if a particular level is above its target size, this does not generate negative vacancies. Instead, the department simply has less incentive to hire at that level. The total number of vacancies in the department ($v^N$) is then equal to:

$$
\begin{equation}
v^N = \sum^3_{r=1} \max\{\lfloor q^r\cdot T^N \rfloor- n^r(t), 0\}
\end{equation}
$$
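The vacancy calculation can be sketched as below. The function name `vacancies` and the head counts are illustrative assumptions, and `Fraction` shares are used only to avoid floating-point surprises inside the floor.

```python
from fractions import Fraction
from math import floor

def vacancies(shares, target_size, rank_counts):
    """v^r = max(floor(q^r * T^N) - n^r(t), 0) for every rank r."""
    return {r: max(floor(shares[r] * target_size) - rank_counts[r], 0)
            for r in shares}

shares = {1: Fraction(20, 100), 2: Fraction(15, 100), 3: Fraction(65, 100)}
rank_counts = {1: 18, 2: 17, 3: 60}   # hypothetical n^r(t)

v = vacancies(shares, 100, rank_counts)
v_total = sum(v.values())             # v^N
print(v, v_total)  # {1: 2, 2: 0, 3: 5} 7
```

The Associate rank is two professors over its target of 15, yet it contributes zero vacancies rather than minus two.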

### Multinomial Hiring Process
We model hiring as a multinomial probabilistic process in which all of the vacancies are filled by a mix of assistant, associate, and full professors, or are left unfilled, according to a probability for each outcome. The multinomial distribution generalizes the binomial distribution. Formally, "it models the probability of counts for rolling a k-sided dice n times" [Wikipedia: https://en.wikipedia.org/wiki/Multinomial_distribution]. For example, suppose we have a fair six-sided die, so that each value between 1 and 6 has probability $\frac{1}{6}$. If we roll this die once, we might get a vector such as [0,0,1,0,0,0], meaning that the roll yielded a 3. Each roll may yield a different result, and the next roll might yield a 1, with corresponding vector [1,0,0,0,0,0]. When we sum the number of times we obtained each value after 10 rolls, we could obtain a vector like [1,1,3,2,2,1]. The multinomial distribution gives the probability of each such vector of category counts after $n$ rolls.

In our model there is a probability over the hiring of professors at each level at each time step. These probabilities vary with time, depending on professor demand, as I explain below. The probability of hiring a professor at rank $r$, gender $g$ at time $t$ is $p^r_g(t)$. This time-dependent hiring probability is distinct from the fixed promotion rate $p^r_g$ defined in the promotion process, even though both use the letter $p$. Using these hiring probabilities, we can generate a random draw from the multinomial distribution. The rather ugly looking multinomial probability mass function for professor hiring is:

$$
\begin{equation}
\begin{split}
& P\big(h_m^1(t), \ldots, h_m^3(t), h_f^1(t), \ldots, h_f^3(t);\ v^N(t), p_m^1(t), \ldots, p_m^3(t), p_f^1(t), \ldots, p_f^3(t)\big) = \\
& \frac{v^N(t)!}{h^1_m(t)!\cdot h^2_m(t)!\cdot h^3_m(t)! \cdot h^1_f(t)!\cdot h^2_f(t)! \cdot h^3_f(t)!} \times p_m^1(t)^{h_m^1(t)} \cdots p_m^3(t)^{h_m^3(t)} \times p_f^1(t)^{h_f^1(t)} \cdots p_f^3(t)^{h_f^3(t)}
\end{split}
\label{multinomial}
\end{equation}
$$

While expression (\ref{multinomial}) might look ugly, it is simply the standard multinomial distribution with $v^N(t)$ trials and one category for each combination of rank and gender.
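A minimal sketch of the hiring draw and of the standard multinomial probability mass function, using only the Python standard library. The names `draw_hires` and `multinomial_pmf`, the seed, and the probabilities are illustrative assumptions. One vacancy at a time is assigned to a rank and gender group, which is equivalent to a single multinomial draw with $v^N$ trials.

```python
import random
from collections import Counter
from math import factorial, prod

GROUPS = [(r, g) for g in ("m", "f") for r in (1, 2, 3)]

def draw_hires(n_vacancies, probs, rng):
    """Draw hire counts h^r_g by assigning each vacancy to one group."""
    weights = [probs[k] for k in GROUPS]
    tally = Counter(rng.choices(GROUPS, weights=weights, k=n_vacancies))
    return {k: tally[k] for k in GROUPS}

def multinomial_pmf(hires, probs):
    """Probability of one specific vector of hire counts."""
    coef = factorial(sum(hires.values()))
    for h in hires.values():
        coef //= factorial(h)          # stays an exact integer at each step
    return coef * prod(probs[k] ** hires[k] for k in GROUPS)

# Hypothetical hiring probabilities p^r_g(t); they sum to 1.
probs = {(1, "m"): 0.3, (2, "m"): 0.1, (3, "m"): 0.1,
         (1, "f"): 0.3, (2, "f"): 0.1, (3, "f"): 0.1}

hires = draw_hires(7, probs, random.Random(42))
print(hires)
print(sum(hires.values()))  # 7

# Probability of hiring exactly one male and one female Assistant professor
# when there are two vacancies: 2!/(1! * 1!) * 0.3 * 0.3.
one_each = {k: 0 for k in GROUPS}
one_each[(1, "m")] = 1
one_each[(1, "f")] = 1
print(multinomial_pmf(one_each, probs))  # about 0.18
```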

### Professor Demand functions

The hiring probability for each professor group $p^r_g(t)$ is associated with demand for professors at that level. Most new hires occur at the Assistant professor level. These professors have the lowest salaries by virtue of their junior position, and also because they have not yet achieved tenure. Hiring Associate or Full professors, while possible, is less frequent due to the higher cost. One challenge that departments face is that over time the actual number of professors at all ranks may deviate from the target proportions. While promotions between ranks may alleviate some of these deviations, the promotion of professors from the Assistant rank to the Associate rank is understandably slow due to the awarding of tenure. As deviations between actual and target numbers accumulate at the higher ranks, there may be enough pressure to hire more faculty at higher ranks as well. Hiring at higher ranks is especially likely when one-time shocks, like a high number of retirements, create sudden shortfalls at a given rank.

The relative demand for professors at each rank is modelled with a demand function $\lambda^r_g(t)$. The higher the value of the demand function, the greater the demand for hiring faculty at that rank and gender. To convert the demand function into hiring probabilities $p^r_g$, we normalize the demand values so that they sum to 1 with the following formula:

$$
\begin{equation}
p^r_g(t+1) = \frac{\lambda^r_g(t)}{\sum_{g' \in G}\sum_{r' = 1}^{3} \lambda^{r'}_{g'}(t)}
\end{equation}
$$
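The normalization can be sketched as below. The function name and the demand values are hypothetical. The sketch assumes every demand value is non-negative and that at least one is positive; the text does not say what should happen otherwise, so the function raises an error in that case.

```python
def hiring_probabilities(demand):
    """Divide each demand value lambda^r_g(t) by the total over all groups."""
    if any(v < 0 for v in demand.values()):
        raise ValueError("negative demand cannot be turned into a probability")
    total = sum(demand.values())
    if total <= 0:
        raise ValueError("at least one demand value must be positive")
    return {k: v / total for k, v in demand.items()}

# Hypothetical demand values keyed by (rank, gender); they sum to 2.0.
demand = {(1, "m"): 0.6, (2, "m"): 0.2, (3, "m"): 0.1,
          (1, "f"): 0.6, (2, "f"): 0.3, (3, "f"): 0.2}

p = hiring_probabilities(demand)
print(p[(1, "f")])       # about 0.3
print(sum(p.values()))   # about 1.0
```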

The demand function for professors (\ref{demand}) includes three components. First, there is an average demand for that professor group, estimated from historical data. This base rate is $\bar{\alpha}^r_g \in [0,1]$. The second component, $\beta\cdot h(\cdot)$, is an amplifying/dampening factor that increases/decreases demand depending on the degree of deviation between the target number of professors at the rank in question and the current number. If the department is below target, then this term will increase hiring demand. If the department is above target in this category, then this term will dampen demand. The final term, $\gamma \cdot g(\cdot)$, is an inflation/deflation factor related to how far the department is from its target size. The same logic that applies to the second term applies to this final term in the equation. The constants $\beta,\ \gamma$ define the sensitivity of the functions $h(\cdot), \ g(\cdot)$ to deviations from the target values. Here $h$ and $g$ denote functions, not the hire counts $h^r_g$ or the gender index $g$.

$$
\begin{equation}
\lambda^r_g(t) = \bar{\alpha}^r_g + \beta \cdot h\left(q^r - \frac{n^r(t)}{T^N}\right) + \gamma \cdot g\left(1 - \frac{N(t)}{T^N}\right)
\label{demand}
\end{equation}
$$

We defined the deviation between target and actual faculty for both the second and third terms in percentage terms so that we would have a continuous pattern of variation. If we compared target versus actual faculty numbers, then these would be integer differences and discontinuous. 

The functions $h(\cdot), \ g(\cdot)$ can be any increasing function that is positive for positive deviations, negative for negative deviations, and zero when the deviation is zero. In this initial model we employ a simple cubic function:

$$
\begin{equation}
\begin{split}
h &:= \left(q^r - \frac{n^r(t)}{T^N}\right)^3 \\
g &:= \left(1 - \frac{N(t)}{T^N}\right)^3
\end{split}
\end{equation}
$$

These cubic functions behave as in the image below. As either $\beta$ or $\gamma$ increases, the plot migrates from the red line toward the inner blue line and then the green line. Thus as $\beta$ or $\gamma$ increases, the hiring urgency defined by $h,g$ grows at a faster rate.

<img src="images/desmos-graph.png" alt="Drawing" style="width: 400px;"/>
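The demand function with cubic adjustment terms can be sketched as follows. The function names, parameter names, and all numeric values are illustrative assumptions. Both deviations are measured against the target size $T^N$, as in the demand equation.

```python
def cubic(x):
    """Sign-preserving adjustment: positive below target, negative above."""
    return x ** 3

def demand(alpha_bar, beta, gamma, q_r, n_r, n_total, target_size,
           h=cubic, g=cubic):
    """lambda^r_g(t) = alpha_bar + beta * h(rank gap) + gamma * g(size gap)."""
    rank_gap = q_r - n_r / target_size        # q^r - n^r(t) / T^N
    size_gap = 1 - n_total / target_size      # 1 - N(t) / T^N
    return alpha_bar + beta * h(rank_gap) + gamma * g(size_gap)

# Hypothetical Assistant rank with q^1 = 0.2, T^N = 100, base rate 0.3.
below = demand(0.3, 10, 10, q_r=0.2, n_r=10, n_total=90, target_size=100)
on_target = demand(0.3, 10, 10, q_r=0.2, n_r=20, n_total=100, target_size=100)
above = demand(0.3, 10, 10, q_r=0.2, n_r=30, n_total=110, target_size=100)
print(below, on_target, above)  # about 0.32, 0.30, 0.28
```

Because the cubic is very flat near zero, a ten-point gap moves demand by only 0.01 per term even with $\beta = \gamma = 10$, while larger gaps are amplified much more strongly.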

## Simulation

Based upon the equations above, we begin with a set of given values, including the department size targets, the number of professors at each rank and gender, and the average values for the demand function, $\bar{\alpha}^r_g$. The simulation will first execute attritions and promotions at all three levels, and then generate a set of hires at each level. These hires will be added to the department size numbers in the subsequent timestep.

This simulation may run over as many timesteps or years as the user may wish. 