# Bootstrap Analysis Code for SBM

James Yu, 10 June 2023

In [1]:
using FileIO, Statistics
NUMBER_OF_TYPES = 4
NUMBER_OF_SINKS = 4;

The bootstrap output is stored as a serialized `jld2` file. This format packs complicated Julia data structures, such as dictionaries of multidimensional arrays, into a single file on disk.
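As a rough illustration of how such a file can be written and read back, the sketch below round-trips a small dictionary through a temporary `jld2` file. The names `toy_output`, `toy_rates`, `toy_labels`, and `toy_path` are hypothetical and are not part of the actual bootstrap output; the sketch also assumes the JLD2 package is installed alongside FileIO.

```julia
using FileIO  # reading and writing .jld2 files also requires the JLD2 package to be installed

# Hypothetical toy data standing in for a bootstrap output dictionary.
toy_output = Dict(
    "toy_rates"  => reshape(collect(1.0:12.0), 2, 2, 3),  # 2×2 matrix for each of 3 samples
    "toy_labels" => ["Dept A", "Dept B"],
)

# Write to a temporary location so that no real file is overwritten.
toy_path = joinpath(mktempdir(), "toy_output.jld2")
save(toy_path, toy_output)

# load returns a dictionary keyed by the stored names.
restored = load(toy_path)
```

The restored dictionary should contain the same arrays that were saved, which is what makes the format convenient for passing results between the estimation and analysis steps.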

In [2]:
bootstrap_output = load(".estimates/bootstrap_output.jld2")

Dict{String, Any} with 4 entries:
  "placement_rates"  => [1.25207 0.0786713 0.0109428 0.000239234; 0.487762 0.07…
  "type_allocation"  => [1.0 1.0 … 1.0 1.0; 4.0 4.0 … 4.0 4.0; … ; 8.0 8.0 … 8.…
  "placement_counts" => [606.0 90.0 39.0 1.0; 558.0 208.0 65.0 13.0; … ; 506.0 …
  "dept_labels"      => ["Columbia University" "Columbia University" … "Columbi…

The file holds four components: the Poisson placement rates (`placement_rates`), the total placements between departments (`placement_counts`), the type allocation of each department (`type_allocation`), and the corresponding department names (`dept_labels`).

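From these arrays we can compute summary statistics. Taking the mean and variance of the placement rates along the third dimension (`dims = 3`) collapses the bootstrap samples into a single value per cell: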

In [4]:
mean(bootstrap_output["placement_rates"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 1.34603    0.110287   0.0118673   0.00135527
 0.576721   0.111215   0.0111147   0.00139103
 0.246182   0.0947503  0.0125098   0.00115544
 0.0170196  0.0138622  0.00249879  0.00118616
 0.349849   0.071794   0.0135029   0.0052838
 0.18117    0.0551953  0.00745912  0.00179296
 0.209035   0.0399636  0.00457567  0.000676247
 0.0265566  0.0150036  0.00345671  0.000840428

In [5]:
var(bootstrap_output["placement_rates"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 0.0357659    0.00207511   3.5916e-6   1.51625e-6
 0.0110953    0.00107924   8.67546e-6  3.69581e-7
 0.00154718   0.000424547  8.39806e-6  5.6894e-8
 4.6309e-5    1.96978e-5   3.3682e-7   2.97026e-8
 0.00136872   0.000272191  4.17841e-6  9.44873e-7
 0.000124051  0.000142728  1.32766e-6  1.36861e-7
 0.000325863  7.31036e-5   8.40776e-7  5.6247e-8
 6.1505e-7    4.35281e-6   2.57746e-7  7.52552e-9
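To make the shape of these results easier to read, here is a minimal sketch on a hypothetical toy array (`toy_rates`, not the real data). It shows that reducing along `dims = 3` keeps a trailing singleton dimension, which `dropdims` can remove, and that the square root of the variance gives a standard deviation for each cell.

```julia
using Statistics

# Hypothetical toy data: 2 rows × 2 columns × 3 bootstrap samples.
toy_rates = cat([1.0 2.0; 3.0 4.0],
                [2.0 2.0; 3.0 6.0],
                [3.0 2.0; 3.0 8.0]; dims = 3)

# Reducing over the third dimension leaves a 2×2×1 array.
toy_mean = mean(toy_rates, dims = 3)
toy_var  = var(toy_rates, dims = 3)   # var uses the n - 1 denominator by default

# dropdims removes the singleton dimension, giving ordinary 2×2 matrices.
toy_mean_matrix = dropdims(toy_mean, dims = 3)
toy_sd_matrix   = sqrt.(dropdims(toy_var, dims = 3))
```

Here cell (1, 1) takes the values 1, 2, 3 across samples, so its mean is 2 and its variance is 1, while cells that never change have variance 0.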

We can summarize the total placements in the same way, this time reporting the standard deviation rather than the variance:

In [7]:
mean(bootstrap_output["placement_counts"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 551.5   89.4   37.0    5.7
 485.9  190.6   70.2   12.3
 761.3  603.2  294.5   37.2
  73.1  118.7   80.6   52.1
 148.3   62.2   43.6   23.0
 378.5  234.6  118.0   38.5
 485.5  189.9   80.8   16.3
 452.4  523.1  444.3  146.7

In [8]:
std(bootstrap_output["placement_counts"], dims = 3)

8×4×1 Array{Float64, 3}:
[:, :, 1] =
 75.7397  17.0046    5.03322   4.80856
 57.9836  27.1752    6.25033   5.90762
 35.4371  49.012    43.1618    8.24352
 35.0411  31.1129   20.7429   14.2474
 11.5089   7.45058   4.59952   3.8873
 35.8554  20.8444   12.9013    8.61846
 26.0352  19.0114   13.5384    6.16532
 56.6219  32.6716   48.9059   22.101
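Beyond means and standard deviations, one might also want quantiles of each cell across bootstrap samples. This is not part of the analysis above; the sketch below only illustrates how it could be done with `quantile` from `Statistics`, using a hypothetical toy array `toy_placements` and a hypothetical helper `cell_quantile`.

```julia
using Statistics

# Hypothetical toy counts: 1 row × 2 columns × 5 bootstrap samples.
toy_placements = cat([10.0 1.0], [20.0 1.0], [30.0 1.0],
                     [40.0 1.0], [50.0 1.0]; dims = 3)

# Quantile of each cell's values across samples, returned as an ordinary matrix.
cell_quantile(A, p) = [quantile(A[i, j, :], p) for i in axes(A, 1), j in axes(A, 2)]

toy_lower = cell_quantile(toy_placements, 0.25)
toy_upper = cell_quantile(toy_placements, 0.75)
```

With only a handful of samples such quantiles are coarse, so they are best read as a rough indication of spread rather than as precise interval endpoints.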

We are also interested in how often each department is assigned to a given type. To analyze this, we build a table that counts, for each department, the number of bootstrap samples in which it lands in each type. Cell In [10] groups departments by their rounded average type and prints these counts.
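The core tallying step can be sketched in isolation on a toy matrix before reading the full cell. Everything named `toy_*`, as well as the helper `tally_types`, is hypothetical and only illustrates the idea; rows are departments and columns are bootstrap samples, matching how `type_allocation` is indexed.

```julia
using Statistics

TOY_NUMBER_OF_TYPES = 3  # hypothetical; the notebook itself uses NUMBER_OF_TYPES = 4

# Hypothetical allocations: 3 departments (rows) × 4 bootstrap samples (columns).
toy_allocation = [1.0 1.0 1.0 1.0;
                  1.0 2.0 1.0 2.0;
                  3.0 3.0 2.0 3.0]

# Count how many samples place one department in each type.
function tally_types(row, number_of_types)
    counts = zeros(Int, number_of_types)
    for t in row
        counts[Int(t)] += 1
    end
    return counts
end

toy_tallies = [tally_types(toy_allocation[i, :], TOY_NUMBER_OF_TYPES)
               for i in axes(toy_allocation, 1)]

# Average type per department, and the type obtained by rounding that average.
toy_average = vec(mean(toy_allocation, dims = 2))
toy_rounded = Int.(round.(toy_average))

# Most frequent type per department (ties resolve to the lowest type).
toy_modal = [argmax(c) for c in toy_tallies]
```

The second toy department is split evenly between types 1 and 2, so its average is 1.5. Rounding places it in type 2, whereas the most frequent type, with ties broken toward the lower type, is 1. Grouping by rounded average, as the notebook does, can therefore differ from grouping by modal type when allocations are split.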

In [10]:
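# Average type of each department across bootstrap samples (one row per department).
average_type_allocation = mean(bootstrap_output["type_allocation"], dims = 2)

for sorted_type in 1:NUMBER_OF_TYPES
    counter = 0
    inst_hold = []
    println("TYPE $sorted_type:")
    # Collect the departments whose rounded average type equals sorted_type.
    # Note: round sends exact ties to the even integer (1.5 -> 2, 2.5 -> 2).
    for (i, assign_type) in enumerate(Int.(round.(average_type_allocation)))
        if sorted_type == assign_type
            # Labels repeat across columns, so the first column is enough.
            push!(inst_hold, (i, bootstrap_output["dept_labels"][i, 1]))
            counter += 1
        end
    end
    # Print the departments alphabetically, each with its per-type counts.
    for (i, dept) in sort(inst_hold, by = x->x[2])
        frequency = bootstrap_output["type_allocation"][i, :]
        # counter_table[t] = number of bootstrap samples assigning this department to type t.
        counter_table = zeros(Int64, NUMBER_OF_TYPES)
        for element in frequency
            counter_table[Int(element)] += 1
        end
        print(dept)
        print(": Avg Type $(average_type_allocation[i]) (")
        # Print only the types that occur, separated by commas.
        seen = false
        for (t, num) in enumerate(counter_table)
            if num != 0
                if seen print(", ") end
                seen = true
                print("Type $t: $(num)x")
            end
        end
        println(")")
    end
    println("Total Institutions: $counter")
    println()
end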

TYPE 1:
Boston University: Avg Type 1.0 (Type 1: 10x)
Columbia University: Avg Type 1.0 (Type 1: 10x)
Duke University: Avg Type 1.0 (Type 1: 10x)
Harvard University: Avg Type 1.0 (Type 1: 10x)
London School of Economics and Political Science: Avg Type 1.1 (Type 1: 9x, Type 2: 1x)
Massachusetts Institute of Technology: Avg Type 1.2 (Type 1: 8x, Type 2: 2x)
New York University: Avg Type 1.0 (Type 1: 10x)
Northwestern University: Avg Type 1.0 (Type 1: 10x)
Ohio State University: Avg Type 1.4 (Type 1: 6x, Type 2: 4x)
Princeton University: Avg Type 1.0 (Type 1: 10x)
Stanford University: Avg Type 1.0 (Type 1: 10x)
University of California Los Angeles (UCLA): Avg Type 1.0 (Type 1: 10x)
University of California, Berkeley: Avg Type 1.0 (Type 1: 10x)
University of Chicago: Avg Type 1.0 (Type 1: 10x)
University of Maryland: Avg Type 1.0 (Type 1: 10x)
University of Michigan: Avg Type 1.0 (Type 1: 10x)
University of Pennsylvania: Avg Type 1.0 (Type 1: 10x)
University of Wisconsin, Madison: Avg Type