# Table of Contents
 <p><div class="lev1 toc-item"><a href="#Notation-and-Definition-of-Variables" data-toc-modified-id="Notation-and-Definition-of-Variables-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Notation and Definition of Variables</a></div><div class="lev1 toc-item"><a href="#Gender-Diversity-Evolution-Model" data-toc-modified-id="Gender-Diversity-Evolution-Model-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Gender Diversity Evolution Model</a></div><div class="lev2 toc-item"><a href="#Department-composition" data-toc-modified-id="Department-composition-21"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Department composition</a></div><div class="lev2 toc-item"><a href="#Department-size-and-rank-targets" data-toc-modified-id="Department-size-and-rank-targets-22"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Department size and rank targets</a></div><div class="lev2 toc-item"><a href="#Attrition-process" data-toc-modified-id="Attrition-process-23"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Attrition process</a></div><div class="lev2 toc-item"><a href="#Promotion-process" data-toc-modified-id="Promotion-process-24"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Promotion process</a></div><div class="lev2 toc-item"><a href="#Hiring-process" data-toc-modified-id="Hiring-process-25"><span class="toc-item-num">2.5&nbsp;&nbsp;</span>Hiring process</a></div><div class="lev3 toc-item"><a href="#Number-of-vacancies-at-each-rank" data-toc-modified-id="Number-of-vacancies-at-each-rank-251"><span class="toc-item-num">2.5.1&nbsp;&nbsp;</span>Number of vacancies at each rank</a></div><div class="lev3 toc-item"><a href="#Multinomial-Hiring-Process" data-toc-modified-id="Multinomial-Hiring-Process-252"><span class="toc-item-num">2.5.2&nbsp;&nbsp;</span>Multinomial Hiring Process</a></div><div class="lev3 toc-item"><a href="#Professor-Demand-functions" data-toc-modified-id="Professor-Demand-functions-253"><span class="toc-item-num">2.5.3&nbsp;&nbsp;</span>Professor Demand 
functions</a></div><div class="lev2 toc-item"><a href="#Simulation" data-toc-modified-id="Simulation-26"><span class="toc-item-num">2.6&nbsp;&nbsp;</span>Simulation</a></div>

# Notation and Definition of Variables


This is a list of some of the mathematical notation used in the equations below.

| Variable     | Description                      | Range                                           |
|--------------|----------------------------------|-------------------------------------------------|
| $R$ | the set of professor ranks (Assistant, Associate, Full), denoted numerically | $\{1,2,3\}$ |
| $r$| rank of a particular group of professors   | $r \in R$                                       |
| $N$| total number of professors in the department| $N \in \mathbb Z^+$                            |
| $n^r$| number of professors at a given rank     | $n^r \in \mathbb Z^+$                           |
| $G$  | the set of professor genders             | $\{m, f\}$                                      |
| $g$  | the gender of a particular professor     | $g \in G$                                       |
| $q^r$ | Target percentage of professors at rank r |$q^r \in [0,1]$                               |
| $T^N$ | Target number for the size of the department | $T^N$ $\in \mathbb Z^+$                      |
| $\bar{\alpha}^r_g$ | The long-term hiring rate for a professor at rank $r$ and gender $g$| $\bar{\alpha}^r_g$   $\in \lbrack 0,1 \rbrack$                                                                                          |
| $\beta$ | the sensitivity of the hiring rate to the gap between the rank target and actual values | $\beta \in \mathbb R^+$                                                                                        |
| $\gamma$ | the sensitivity of the hiring rate to the gap between the target and actual department size| $\gamma \in \mathbb R^+$                                                                            |
| $v^r$| number of vacancies at rank r                                    | $v^r \in \mathbb Z^+$   |
| $v^N$| total number of vacancies in the department                      | $v^N$ $\in \mathbb Z^+$   |
| $h^r_g$| number of professors hired at rank $r$ and gender $g$  |$h^r_g \in \mathbb Z^+$|




# Gender Diversity Evolution Model

## Department composition
Let's lay a very simple foundation for the model by setting some definitions. An academic department is made up of both male and female professors. Each professor has a rank contingent on the professor's position in the promotion ladder. The most junior rank is the Assistant professor who is a tenure-track candidate but who has not yet achieved tenure. The second rank is the Associate professor who has achieved tenure, but who does not yet have the publication/teaching/committee record to achieve the level of full professor. The final rank is the Full professor which is the top of the promotion ladder. 

To formalize these basic definitions, we define $n^r$ as the number of professors at a given rank. That number is equal to the sum of male $n^r_m$ and female $n^r_f$ professors at that rank, in the department. 

$$
\begin{equation}
n^r = n^r_f + n^r_m 
\end{equation}
$$

The total size of the department $N$ is equal to the sum of the number of professors at each rank.

$$
\begin{equation}
N = n^1 + n^2 + n^3 = \sum_{i=1}^{3} n^i
\end{equation}
$$

Note that because the department size and the number of professors at each rank change from timestep to timestep, each of these quantities is a function of time. As a convention, we omit the time parameter unless its inclusion is important to the clarity of the argument. As an example, the department size equation above is more accurately described by:

$$
\begin{equation}
N(t) = n^1(t) + n^2(t) + n^3(t) = \sum_{i=1}^{3} n^i(t)
\end{equation}
$$
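
As a small illustration of the two definitions above, the following sketch stores head counts by rank and gender and derives $n^r$ and $N$ from them. The counts and the function names `rank_total` and `department_size` are hypothetical and chosen only for this example.

```python
# Hypothetical head counts keyed by (rank, gender).
# Ranks: 1 = Assistant, 2 = Associate, 3 = Full.
counts = {
    (1, "f"): 4, (1, "m"): 6,
    (2, "f"): 3, (2, "m"): 7,
    (3, "f"): 5, (3, "m"): 15,
}


def rank_total(counts, r):
    """n^r = n^r_f + n^r_m"""
    return counts[(r, "f")] + counts[(r, "m")]


def department_size(counts):
    """N = n^1 + n^2 + n^3"""
    return sum(rank_total(counts, r) for r in (1, 2, 3))


print(rank_total(counts, 1), department_size(counts))
```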


## Department size and rank targets

The university administration, in concert with each department's leadership, sets a target for the overall department size. First we define the target size of the entire department $T^N \in \mathbb Z^+$. 

Because the department size may fluctuate from year to year due to faculty hiring or attrition, it is convenient to express the number of professors at each rank (Assistant, Associate, and Full) as shares of the target department size. These shares are represented by $q^r \in [0,1]$. For example, a department may be composed of 20% Assistant professors, 15% Associate professors, and 65% Full professors in a given year. Due to professor churn, the professor shares will also change from year to year. The professor shares are constrained by the requirement:

$$
\begin{equation}
\sum_{r=1}^{3} q^r(t) = 1
\end{equation}
$$

To calculate the target number of professors at a particular rank at a given time, simply multiply the current professor share by the target department size: $q^r \cdot T^N$. To prevent this equation from providing fractional numbers of professors, we use a floor function ($\lfloor \rfloor$) to round the values down to the nearest integer.  

$$
\begin{equation}
n^r \approx \lfloor q^r\cdot T^N \rfloor
\end{equation}
$$

As an example, consider a department with a target department size $T^N$ of 100. A possible professor share for full professors $q^3$ may be 70%. Thus the target number of full professors in the department would be $0.70 \cdot 100 = 70$. 

One assumption is that the department size target changes very slowly over time and hence we treat it as fixed for the purposes of our model. For longer term simulations, this assumption is not necessarily valid. The model is flexible enough to include a growth rate for the department size over time, which would be a simple extension of this model. 

In any given year the actual department size will differ from the target. We measure the percentage deviation from target with the following formula for overall department size. 

$$
\begin{equation}
\text{Deviation from Target department size at time t} = 1 - \frac{N(t)}{T^N}
\end{equation}
$$

The deviation is positive if the department size is less than the target and negative if the department size exceeds the target. The sign of the deviation indicates whether the deviation should inflate or deflate the hiring probability in the next timestep. 
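
The rank targets and the size deviation can be computed directly from the formulas in this section. The sketch below is only illustrative: the function names `target_counts` and `size_deviation` are hypothetical, and the shares are the example values from the text.

```python
import math


def target_counts(shares, target_size):
    """Target head count per rank: floor(q^r * T^N)."""
    return {r: math.floor(q * target_size) for r, q in shares.items()}


def size_deviation(current_size, target_size):
    """Percentage deviation from the target size: 1 - N(t) / T^N."""
    return 1 - current_size / target_size


# Example shares from the text: 20% Assistant, 15% Associate, 65% Full.
shares = {1: 0.20, 2: 0.15, 3: 0.65}
targets = target_counts(shares, 100)

print(targets)
print(size_deviation(90, 100))   # positive: department is under target
print(size_deviation(110, 100))  # negative: department is over target
```

Because the shares are stored as floating-point numbers, a product that should be a whole number can land just below it before the floor is applied; exact fractions would avoid that edge case.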

## Attrition process

Each year a number of faculty at each rank will leave due to retirement, alternative job opportunities, not receiving tenure, etc. The attrition rates are fixed over time, but they differ by rank and gender. We expect the attrition rate to be greatest for full professors, with lower attrition rates for associate and assistant professors.

The attrition rate for a rank and gender is denoted $a^r_g \in [0,1]$. Note that the attrition rate is a fixed constant and does not vary with time, meaning $a^r_g(t) = a^r_g$.

Within a given year, the number of attritions for a particular rank and gender group is:

$$
\begin{equation}
\text{Number of professor attritions at gender g, rank r, in time t} = a^r_g \cdot n^r_g(t)  
\end{equation}
$$

For example, the number of attritions of female assistant professors in a given year would be:

$$
\begin{equation}
\text{Number of female assistant professor attritions at time t} = a^1_f\cdot n^1_f(t)
\end{equation}
$$

Thus the total number of attritions at a given rank is equal to the sum of female and male attrition at that level.

$$
\begin{equation}
\text{Total number of attritions at rank r in year t } = a^r_f\cdot n^r_f(t) + a^r_m\cdot n^r_m(t)
\end{equation}
$$
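
The attrition formulas translate into a few lines of code. This is a minimal sketch with made-up rates and counts; the names `expected_attrition` and `rank_attrition` are hypothetical. Note that $a^r_g \cdot n^r_g(t)$ is generally not a whole number, and the text above does not say how it is converted into an integer number of departures, so the sketch simply reports the expected value.

```python
def expected_attrition(rates, counts):
    """Expected departures per (rank, gender) group: a^r_g * n^r_g(t)."""
    return {key: rates[key] * counts[key] for key in counts}


def rank_attrition(rates, counts, r):
    """Expected departures at rank r, summed over female and male professors."""
    return rates[(r, "f")] * counts[(r, "f")] + rates[(r, "m")] * counts[(r, "m")]


# Illustrative values only (not taken from any data set).
rates = {(1, "f"): 0.05, (1, "m"): 0.04, (3, "f"): 0.10, (3, "m"): 0.08}
counts = {(1, "f"): 20, (1, "m"): 25, (3, "f"): 10, (3, "m"): 50}

print(expected_attrition(rates, counts))
print(rank_attrition(rates, counts, 3))
```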


## Promotion process

Each year a portion of the faculty at the Assistant and Associate professor levels are promoted to a higher level. The promotion rate is also fixed and invariant to time. The promotion rate is denoted by $p^r_g \in [0,1]$, where $r$ represents the current rank of the professor and $g$ the gender. Thus $p^1_f$ represents the promotion rate for a female Assistant professor to an Associate professor.

$$
\begin{equation}
\text{Number of promotions from rank r to rank r+1, for gender g, at time t} = p^r_g \cdot n^r_g(t)   
\end{equation}
$$
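
To make the flow between ranks concrete, the sketch below moves the expected number of promoted professors from rank $r$ to rank $r+1$. It is an illustration under two assumptions: promotions are treated as expected (fractional) values, and all promotions in a year are computed from the head counts at the start of that year. The function name `apply_promotions` and the numbers are hypothetical.

```python
def apply_promotions(counts, promo_rates):
    """Move p^r_g * n^r_g(t) professors from rank r to rank r + 1.

    Full professors (rank 3) are at the top of the ladder and are not promoted.
    """
    new = dict(counts)
    for (r, g), n in counts.items():
        if r < 3:
            moved = promo_rates[(r, g)] * n
            new[(r, g)] -= moved
            new[(r + 1, g)] += moved
    return new


# Illustrative values only.
counts = {(r, g): 10 for r in (1, 2, 3) for g in ("f", "m")}
promo_rates = {(1, "f"): 0.1, (1, "m"): 0.1, (2, "f"): 0.2, (2, "m"): 0.2}

after = apply_promotions(counts, promo_rates)
print(after)
```

Promotions only move professors between ranks, so the department size is unchanged by this step.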

The promotion rates for male and female professors are based upon department-level data. However, setting a common promotion rate does not seem to change the underlying dynamics.

## Hiring process

### Number of vacancies at each rank
Each year the department makes decisions to hire additional faculty. Some of these hires fill vacancies created by attrition while other hires may arise from chances to add highly qualified individuals to the department.

The two questions surrounding hiring are who to hire (meaning rank and gender) and how many to hire (meaning the number of vacancies). I address the second question first. The number of full-time-equivalent (FTE) vacancies in a department is based upon the difference between the current number of professors and the target number of professors. If the department is under its target size, then the department will choose to add faculty until the current number equals the target number. Of course, there is no guarantee that all vacancies will be filled within a single year, as the availability of suitable candidates and other factors may limit the total number of hires. Formally, the number of vacancies in the department is:

$$
\begin{equation}
v^N(t) = \sum_{r=1}^{3} \max\{\lfloor q^r\cdot T^N \rfloor - n^r(t), 0\}
\end{equation}
$$

Including the $\max$ function ensures that the number of vacancies is always non-negative. Even if a rank is above its target size, this does not generate negative vacancies (meaning layoffs). Instead, the department simply stops hiring until the head count falls below the target again. 
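
The vacancy formula can be checked with a short sketch. The function name `vacancies` and the head counts are hypothetical; the shares are the example values used earlier in this document.

```python
import math


def vacancies(shares, target_size, rank_counts):
    """v^N = sum over ranks of max(floor(q^r * T^N) - n^r(t), 0)."""
    return sum(
        max(math.floor(shares[r] * target_size) - rank_counts[r], 0)
        for r in shares
    )


shares = {1: 0.20, 2: 0.15, 3: 0.65}

# Ranks 1 and 3 are each 5 below target; rank 2 is 5 above target.
print(vacancies(shares, 100, {1: 15, 2: 20, 3: 60}))

# Every rank at or above its target: no vacancies and no layoffs.
print(vacancies(shares, 100, {1: 25, 2: 15, 3: 70}))
```

In the first call the surplus at rank 2 does not offset the shortfalls at ranks 1 and 3, because the $\max$ is applied rank by rank before summing.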


### Multinomial Hiring Process
We model hiring as a multinomial probabilistic process where each vacancy is filled by an assistant, associate, or full professor, or is left unfilled, based upon a probability for each outcome. The multinomial distribution generalizes the binomial distribution. Formally the definition of the multinomial distribution is that "it models the probability of counts for rolling a k-sided dice n times." [Wikipedia: https://en.wikipedia.org/wiki/Multinomial_distribution]. For example, suppose we have a die with six sides and a $\frac{1}{6}$ probability of getting any value between 1 and 6. If we rolled this die once, we might get a vector such as [0,0,1,0,0,0], meaning that the die roll yielded a 3. Each roll may yield a different result, and the next roll might yield a value of 1 with corresponding vector [1,0,0,0,0,0]. When we sum the number of times we obtained each value after 10 rolls, we could obtain a vector like [1,1,3,2,2,1]. This final vector of counts after $n$ rolls is what the multinomial distribution describes, namely the probability distribution over this final set of category counts. 

In our model there is a probability over the hiring of professors at each level at each time step. These probabilities vary with time, depending on professor demand, as I explain below. The probability of hiring a professor at rank $r$ and gender $g$ at time $t$ is $p^r_g(t)$; the time argument distinguishes it from the fixed promotion rate $p^r_g$ defined earlier. Using these hiring probabilities, we can generate a random draw from the multinomial distribution. The rather ugly looking multinomial probability function for professor hiring is:

$$
\begin{equation}
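P\left(h^1_m(t), h^1_f(t), \dots, h^3_f(t), h^{none}(t);\; v^N(t), p^1_m(t), p^1_f(t), \dots, p^3_f(t), p^{none}(t)\right) = 
\frac{v^N(t)!}{\left(\prod_{r=1}^{3}\prod_{g \in G} h^r_g(t)!\right)\cdot h^{none}(t)!} \times \left(\prod_{r=1}^{3}\prod_{g \in G} p^r_g(t)^{h^r_g(t)}\right) \times p^{none}(t)^{h^{none}(t)}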
\end{equation}
\label{multinomial}
$$

Here $h^{none}$ is the number of vacancies left unfilled and $p^{none}$ is the probability of not hiring anyone for a given vacancy. While expression (\ref{multinomial}) might look ugly, it is simply the standard multinomial distribution with the relevant model parameters.

Note also that when every rank is at or above its target, $v^N(t)$ is equal to zero and no hiring occurs in that year.
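
A draw from this hiring process can be sketched with the Python standard library alone. The example below is an illustration, not the model's actual implementation: the function names `draw_hires` and `multinomial_pmf` are hypothetical, and so are the hiring probabilities, including the value of $p^{none}$.

```python
import math
import random
from collections import Counter


def draw_hires(n_vacancies, probs, rng):
    """One multinomial draw: assign each vacancy to a category, then count."""
    categories = list(probs)
    weights = [probs[c] for c in categories]
    draws = rng.choices(categories, weights=weights, k=n_vacancies)
    tally = Counter(draws)
    return {c: tally[c] for c in categories}


def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` under the multinomial model."""
    coefficient = math.factorial(sum(counts.values()))
    probability = 1.0
    for category, h in counts.items():
        coefficient //= math.factorial(h)
        probability *= probs[category] ** h
    return coefficient * probability


# Hypothetical hiring probabilities for (rank, gender) groups plus "none".
probs = {
    (1, "f"): 0.25, (1, "m"): 0.25,
    (2, "f"): 0.10, (2, "m"): 0.10,
    (3, "f"): 0.05, (3, "m"): 0.05,
    "none": 0.20,
}

rng = random.Random(42)  # fixed seed so the draw is reproducible
hires = draw_hires(10, probs, rng)
print(hires)
print(multinomial_pmf(hires, probs))
```

Drawing each vacancy independently and then tallying the outcomes is equivalent to a single draw from the multinomial distribution, which is why no numerical library is needed here.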

### Professor Demand functions

The hiring probability for each professor group $p^r_g(t)$ is associated with demand for professors at that level. Most new hires occur at the Assistant professor level. These professors have the lowest salaries by virtue of their junior position, and also because they have not yet achieved tenure. Hiring Associate or Full professors, while possible, is less frequent due to the higher cost. One challenge that departments may face is that over time the distribution of professors across ranks may vary due to retirements of Full professors and hires of Assistant professors. While promotions from junior to senior ranks may alleviate some of these distributional changes, the promotion of professors from the Assistant rank to the Associate rank is understandably slow because it requires the awarding of tenure. Our model privileges hiring at the Assistant professor level, but does not preclude hiring at higher levels. Hiring at higher ranks is more likely when a large number of vacancies occur, for example after a high number of retirements in a given year. 

The relative demand for professors at each rank is modelled with a demand function $\lambda^r_g(t)$. The higher the value of the demand function, the greater the demand for hiring faculty at that rank and gender. To convert the demand values into hiring probabilities $p^r_g$, we normalize them so that they sum to 1 with the following formula:

$$
\begin{equation}
p^r_g(t+1) = \frac{\lambda^r_g(t)}{\sum_{g' \in G}\sum_{r' = 1}^{3} \lambda^{r'}_{g'}(t)}
\end{equation}
$$
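
The normalization step can be sketched as follows. The function name `hiring_probabilities` and the demand values are hypothetical. The sketch assumes every demand value is non-negative and that the total is positive; the text does not say how negative demand values should be handled, so the function raises an error rather than guessing.

```python
def hiring_probabilities(demand):
    """Normalize demand values lambda^r_g(t) so that they sum to 1."""
    if any(value < 0 for value in demand.values()):
        raise ValueError("demand values must be non-negative")
    total = sum(demand.values())
    if total <= 0:
        raise ValueError("total demand must be positive")
    return {key: value / total for key, value in demand.items()}


# Illustrative demand values per (rank, gender) group.
demand = {
    (1, "f"): 0.4, (1, "m"): 0.4,
    (2, "f"): 0.3, (2, "m"): 0.3,
    (3, "f"): 0.3, (3, "m"): 0.3,
}

probabilities = hiring_probabilities(demand)
print(probabilities)
```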

The demand function for professors (\ref{demand}) includes two components. First, there is an average demand for that professor group, estimated from historical data: the base rate $\bar{\alpha}^r_g \in [0,1]$. The second term, $\gamma_g^r \cdot f(\cdot)$, is an inflation/deflation factor related to how far the department is from its target size. If the department is below target, demand for professors increases; if it is above target, demand decreases. The constant $\gamma_g^r$ indicates how strongly a shortfall of professors shifts the demand for hiring. Note that $\gamma_g^r$ may differ for each professor group. 

$$
\begin{equation}
\lambda^r_g(t) = \bar{\alpha}^r_g + \gamma_g^r \cdot f\left(1 - \frac{N(t)}{T^N}\right)
\end{equation}
\label{demand}
$$

In the second term, we define the deviation between target and actual faculty as a percentage deviation so that it follows a continuous pattern of variation. If we instead compared target and actual faculty numbers directly, the differences would be integers and therefore discontinuous. 

The function $f(\cdot)$ can be any function that is positive when the deviation is above zero and negative when the deviation is below zero. In this initial model we employ a simple cubic function:

$$
\begin{equation}
f\left(1 - \frac{N(t)}{T^N}\right) := \left(1 - \frac{N(t)}{T^N}\right)^3
\end{equation}
$$
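
The demand function with its cubic term can be sketched in a few lines. The names `cubic` and `demand` and all numeric values are hypothetical, chosen only to show how the sign of the deviation raises or lowers demand relative to the base rate.

```python
def cubic(x):
    """f(x) = x^3: keeps the sign of the deviation and is small near zero."""
    return x ** 3


def demand(alpha_bar, gamma, current_size, target_size, f=cubic):
    """lambda^r_g(t) = alpha_bar^r_g + gamma^r_g * f(1 - N(t) / T^N)."""
    deviation = 1 - current_size / target_size
    return alpha_bar + gamma * f(deviation)


base_rate, sensitivity = 0.3, 2.0  # illustrative values

print(demand(base_rate, sensitivity, 100, 100))  # at target: base rate only
print(demand(base_rate, sensitivity, 50, 100))   # under target: demand rises
print(demand(base_rate, sensitivity, 150, 100))  # over target: demand falls
```

With the cubic form, small deviations barely move demand away from the base rate, while large deviations have a much stronger effect.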

This cubic function behaves as in the image below. As $\gamma$ increases the plot migrates from the red line, toward the inner blue and then green line. Thus as $\gamma$ increases, the hiring urgency defined by $f$ grows at a faster rate.    

<img src="images/desmos-graph.png" alt="Drawing" style="width: 400px;"/>

## Simulation

Based upon the equations above, we begin with a set of given values, including the department size targets, the number of professors at each rank and gender, and the average values for the demand function $\bar{\alpha}^r_g$. The simulation will first execute attritions and promotions at all three levels, and then generate a set of hires at each level. These hires will be added to the department size numbers in the subsequent timestep. 

This simulation may run over as many timesteps or years as the user may wish. 
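
The sequence of steps described above can be put together in a compact sketch. It is not the model's actual implementation and rests on several assumptions that the text leaves open: attrition and promotion are drawn independently for each professor, demand is computed from the head count after attrition and promotion, negative demand values are clipped to zero, and the option of leaving a vacancy unfilled enters as a fixed weight (`none_weight`). All names and parameter values are hypothetical.

```python
import math
import random

RANKS = (1, 2, 3)      # 1 = Assistant, 2 = Associate, 3 = Full
GENDERS = ("f", "m")


def binomial(rng, n, p):
    """Number of successes in n independent trials with probability p."""
    return sum(rng.random() < p for _ in range(n))


def step(counts, params, rng):
    """Advance the department by one timestep and return the new head counts."""
    new = dict(counts)
    # 1. Attrition (assumption: each professor leaves independently).
    for key, n in counts.items():
        new[key] -= binomial(rng, n, params["attrition"][key])
    # 2. Promotion of the remaining professors from rank r to rank r + 1.
    remaining = dict(new)
    for (r, g), n in remaining.items():
        if r < 3:
            moved = binomial(rng, n, params["promotion"][(r, g)])
            new[(r, g)] -= moved
            new[(r + 1, g)] += moved
    # 3. Vacancies: sum over ranks of max(floor(q^r * T^N) - n^r, 0).
    target = params["target_size"]
    rank_n = {r: sum(new[(r, g)] for g in GENDERS) for r in RANKS}
    vacancies = sum(
        max(math.floor(params["shares"][r] * target) - rank_n[r], 0) for r in RANKS
    )
    # 4. Demand per group, clipped at zero (assumption).
    deviation = 1 - sum(rank_n.values()) / target
    demand = {
        key: max(params["alpha"][key] + params["gamma"][key] * deviation ** 3, 0.0)
        for key in new
    }
    # 5. Multinomial hiring: one weighted draw per vacancy.
    outcomes = list(demand) + ["none"]
    weights = [demand[key] for key in demand] + [params["none_weight"]]
    if vacancies > 0 and sum(weights) > 0:
        for outcome in rng.choices(outcomes, weights=weights, k=vacancies):
            if outcome != "none":
                new[outcome] += 1
    return new


def simulate(counts, params, years, seed=0):
    """Run the model for a number of years and return the yearly head counts."""
    rng = random.Random(seed)
    history = [dict(counts)]
    for _ in range(years):
        history.append(step(history[-1], params, rng))
    return history


groups = [(r, g) for r in RANKS for g in GENDERS]
params = {
    "target_size": 40,
    "shares": {1: 0.5, 2: 0.25, 3: 0.25},
    "attrition": {key: 0.05 for key in groups},
    "promotion": {key: 0.10 for key in groups},
    "alpha": {key: 0.1 for key in groups},
    "gamma": {key: 1.0 for key in groups},
    "none_weight": 0.05,
}
history = simulate({key: 5 for key in groups}, params, years=20)
print([sum(year.values()) for year in history])
```

Passing a different `seed` gives a different realization of the same process, so repeated runs can be used to look at the spread of outcomes rather than a single trajectory.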