# Bootstrap Analysis Code for SBM

James Yu, 10 June 2023

In [1]:
using FileIO, Statistics
NUMBER_OF_TYPES = 4
NUMBER_OF_SINKS = 4;

The bootstrap output is stored as a serialized `jld2` file. This allows complicated Julia data structures to be packed neatly into storable file objects.
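As a rough illustration of how a Julia dictionary can be packed into a `jld2` file and read back, the sketch below round-trips a small synthetic dictionary. It assumes the JLD2 package is installed alongside FileIO. The keys `"rates"` and `"labels"` and the temporary file path are made up for this example and are not part of the bootstrap output.

```julia
using FileIO, JLD2  # assumption: JLD2 is installed so FileIO can handle .jld2 files

# Hypothetical toy data; the key names are illustrative only.
demo = Dict{String, Any}(
    "rates"  => rand(2, 2, 3),
    "labels" => ["Dept A", "Dept B"],
)

path = joinpath(mktempdir(), "demo.jld2")
save(path, demo)        # each dictionary key is written as a named entry
restored = load(path)   # reads the entries back into a Dict

println(restored["labels"])
println(restored["rates"] == demo["rates"])
```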

In [2]:
bootstrap_output = load(".estimates/bootstrap_output.jld2")

Dict{String, Any} with 4 entries:
  "placement_rates"  => [1.52909 0.108108 0.0134031 0.00151095; 0.642959 0.1227…
  "type_allocation"  => [3.0 3.0 … 4.0 3.0; 3.0 3.0 … 3.0 3.0; … ; 8.0 8.0 … 8.…
  "placement_counts" => [552.0 76.0 41.0 6.0; 452.0 168.0 75.0 10.0; … ; 473.0 …
  "dept_labels"      => ["Aalto University" "Aalto University" … "Aalto Univers…

The file holds four components: the Poisson placement rates (`placement_rates`), the total placements between departments (`placement_counts`), the type allocation of each department (`type_allocation`), and the associated department names (`dept_labels`).
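To make the layout of these four components concrete, the sketch below builds a synthetic stand-in with the same keys and prints the size of each entry. The dimensions (8 rows, 4 columns, 40 replicates, 5 departments) are assumptions, partly inferred from the printed outputs in this notebook. The values are random and carry no meaning.

```julia
# Synthetic stand-in for the loaded dictionary; all sizes are assumptions.
n_rows, n_sinks, n_depts, n_reps = 8, 4, 5, 40

toy_output = Dict{String, Any}(
    "placement_rates"  => rand(n_rows, n_sinks, n_reps),
    "placement_counts" => floor.(100 .* rand(n_rows, n_sinks, n_reps)),
    "type_allocation"  => Float64.(rand(1:4, n_depts, n_reps)),
    # One label per department, repeated across the replicate columns.
    "dept_labels"      => repeat(["Dept $i" for i in 1:n_depts], 1, n_reps),
)

for key in sort(collect(keys(toy_output)))
    println(rpad(key, 18), size(toy_output[key]))
end
```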

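Given the raw data, we can compute some summary statistics. The third dimension of each array indexes the bootstrap replicates, so reducing over `dims = 3` summarizes each cell across replicates: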

In [3]:
mean(bootstrap_output["placement_rates"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 1.41671    0.105752   0.0120811   0.00131644
 0.651465   0.125555   0.0129025   0.00114738
 0.248567   0.101097   0.0140545   0.001216
 0.020912   0.0161137  0.00255044  0.00133763
 0.333165   0.0767553  0.012534    0.00592656
 0.190183   0.0602189  0.00747603  0.00171182
 0.212533   0.0433876  0.004804    0.000711695
 0.0267521  0.0158021  0.00373293  0.000806544
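The output above keeps a trailing dimension of length 1 because `mean` with `dims` preserves the reduced dimension. A minimal sketch on a synthetic 2×2×3 array shows how `dropdims` turns the result into a plain matrix; the array `A` here is invented for illustration.

```julia
using Statistics

# Synthetic 2×2×3 array: the third dimension plays the role of replicates.
A = cat([1.0 2.0; 3.0 4.0], [2.0 3.0; 4.0 5.0], [3.0 4.0; 5.0 6.0]; dims = 3)

m = mean(A, dims = 3)        # size (2, 2, 1): the reduced dimension is kept
m2 = dropdims(m, dims = 3)   # size (2, 2): plain matrix of per-cell means

println(size(m))
println(m2)
```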

In [4]:
var(bootstrap_output["placement_rates"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 0.0312463    0.000926512  5.75302e-6  6.96893e-7
 0.00925306   0.000638534  8.63567e-6  1.13437e-7
 0.00104914   0.00026346   5.79274e-6  4.86205e-8
 3.96454e-5   1.40708e-5   2.06011e-7  4.59629e-8
 0.00130591   0.000188562  9.51921e-6  1.56291e-6
 0.000172925  8.62864e-5   1.63213e-6  1.11377e-7
 0.000241541  5.57546e-5   7.73326e-7  3.48873e-8
 1.97791e-6   3.188e-6     1.64944e-7  7.42827e-9
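Note that `var` in Julia's `Statistics` module computes the sample variance by default, dividing by n - 1. The short sketch below, on a made-up vector, contrasts this with the uncorrected version and checks the relationship to `std`.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]               # synthetic values, mean 2.5

v_sample = var(x)                      # divides by n - 1
v_pop    = var(x, corrected = false)   # divides by n
s_sample = std(x)                      # square root of the sample variance

println(v_sample)
println(v_pop)
println(s_sample ≈ sqrt(v_sample))
```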

The same summaries apply to the total placements, this time reporting the standard deviation rather than the variance:

In [5]:
mean(bootstrap_output["placement_counts"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 519.675   79.875   34.425    5.475
 500.55   204.175   76.825   10.025
 706.85   604.2    311.7     39.275
  87.275  139.725   82.6     62.9
 133.55    64.775   39.225   26.925
 375.15   248.725  114.725   38.125
 467.825  199.975   82.275   17.775
 431.175  531.375  465.475  146.375

In [6]:
std(bootstrap_output["placement_counts"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 73.2076  13.1893    6.20954   3.39674
 52.2562  37.3726   13.6586    3.02543
 58.286   57.1598   40.3766    7.26067
 29.9512  28.9438   16.0796   13.2622
 11.1814   9.88002   9.12446   5.91559
 30.8026  26.124    17.6969    7.69011
 36.9024  25.4059   13.7542    4.9948
 59.2487  37.5627   43.656    20.9978
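Beyond means and standard deviations, the replicate dimension can also be summarized with percentile intervals. The sketch below is one possible way to do this and is not part of the original analysis: `percentile_interval` is a hypothetical helper, and the input is a synthetic single-cell array.

```julia
using Statistics

# Hypothetical helper: percentile interval along the third (replicate) dimension.
function percentile_interval(A::AbstractArray{<:Real, 3}; level = 0.95)
    alpha = (1 - level) / 2
    lower = mapslices(x -> quantile(x, alpha), A; dims = 3)
    upper = mapslices(x -> quantile(x, 1 - alpha), A; dims = 3)
    return dropdims(lower, dims = 3), dropdims(upper, dims = 3)
end

# Synthetic data: one cell with five replicate values 1, 2, 3, 4, 5.
A = reshape([1.0, 2.0, 3.0, 4.0, 5.0], 1, 1, 5)
lower, upper = percentile_interval(A; level = 0.5)

println(lower[1, 1], " ", upper[1, 1])
```

Applied to the real data, the same call would take `bootstrap_output["placement_counts"]` as its argument, assuming the third dimension indexes replicates.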

We are also interested in how often each department is assigned to each type across the bootstrap replicates. To analyze this, we group departments by their rounded average type and, for each department, tabulate the number of replicates in which it was placed in each type.
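The tabulation step for a single department can be sketched in isolation. `tabulate_types` is a hypothetical helper introduced only for this example, and the assignment vector is synthetic.

```julia
# Hypothetical helper: count how many times each type 1..n_types appears
# in one department's vector of bootstrap assignments.
function tabulate_types(assignments::AbstractVector{<:Real}, n_types::Integer)
    counts = zeros(Int, n_types)
    for a in assignments
        counts[Int(a)] += 1   # assignments are stored as floats such as 1.0, 2.0
    end
    return counts
end

# Synthetic example: 34 replicates in type 1 and 6 replicates in type 2.
assignments = vcat(fill(1.0, 34), fill(2.0, 6))
counts = tabulate_types(assignments, 4)
average_type = sum(assignments) / length(assignments)

println(counts)
println(average_type)
```

A department with this pattern would be listed under type 1, since its average type rounds to 1.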

In [7]:
# Average type of each department across bootstrap replicates (columns).
average_type_allocation = mean(bootstrap_output["type_allocation"], dims = 2)

for sorted_type in 1:NUMBER_OF_TYPES
    counter = 0
    inst_hold = []
    println("TYPE $sorted_type:")
    # Collect the departments whose rounded average type equals sorted_type.
    for (i, assign_type) in enumerate(Int.(round.(average_type_allocation)))
        if sorted_type == assign_type
            push!(inst_hold, (i, bootstrap_output["dept_labels"][i, 1]))
            counter += 1
        end
    end
    # Report each department in alphabetical order.
    for (i, dept) in sort(inst_hold, by = x->x[2])
        # Count how many replicates placed this department in each type.
        frequency = bootstrap_output["type_allocation"][i, :]
        counter_table = zeros(Int64, NUMBER_OF_TYPES)
        for element in frequency
            counter_table[Int(element)] += 1
        end
        print(dept)
        print(": Avg Type $(average_type_allocation[i]) (")
        # Print only the types that occur, separated by commas.
        seen = false
        for (t, num) in enumerate(counter_table)
            if num != 0
                if seen print(", ") end
                seen = true
                print("Type $t: $(num)x")
            end
        end
        println(")")
    end
    println("Total Institutions: $counter")
    println()
end

TYPE 1:
Boston University: Avg Type 1.15 (Type 1: 34x, Type 2: 6x)
Columbia University: Avg Type 1.0 (Type 1: 40x)
Duke University: Avg Type 1.0 (Type 1: 40x)
Harvard University: Avg Type 1.0 (Type 1: 40x)
London School of Economics and Political Science: Avg Type 1.125 (Type 1: 35x, Type 2: 5x)
Massachusetts Institute of Technology: Avg Type 1.175 (Type 1: 33x, Type 2: 7x)
New York University: Avg Type 1.0 (Type 1: 40x)
Northwestern University: Avg Type 1.0 (Type 1: 40x)
Princeton University: Avg Type 1.0 (Type 1: 40x)
Stanford University: Avg Type 1.0 (Type 1: 40x)
University of California Los Angeles (UCLA): Avg Type 1.0 (Type 1: 40x)
University of California, Berkeley: Avg Type 1.0 (Type 1: 40x)
University of Chicago: Avg Type 1.0 (Type 1: 40x)
University of Maryland: Avg Type 1.0 (Type 1: 40x)
University of Michigan: Avg Type 1.0 (Type 1: 40x)
University of Pennsylvania: Avg Type 1.0 (Type 1: 40x)
University of Wisconsin, Madison: Avg Type 1.0 (Type 1: 40x)
Yale University: Avg Ty