## Emerging HSI school stats (physical sciences)

In [1]:
import pandas as pd
import numpy as np
import glob
import os
import matplotlib.pyplot as plt
import completions_module as cmodule

This notebook tabulates `Physical Sciences` degree completions at emerging Hispanic-Serving Institutions (HSIs) for 2009-2019. Each table ranks the top 30 institutions.

- Table 1: Bachelor's / Hispanic + Latinx (total sum for year range)
- Table 2: Bachelor's / Hispanic + Latinx (annual mean for year range)
- Table 3: Bachelor's / Grand total (total sum for year range)
- Table 4: Bachelor's / Grand total (annual mean for year range)

- Table 5: Master's / Hispanic + Latinx (total sum for year range)
- Table 6: Master's / Hispanic + Latinx (annual mean for year range)
- Table 7: Master's / Grand total (total sum for year range)
- Table 8: Master's / Grand total (annual mean for year range)

- Table 9: Doctorate / Hispanic + Latinx (total sum for year range)
- Table 10: Doctorate / Hispanic + Latinx (annual mean for year range)
- Table 11: Doctorate / Grand total (total sum for year range)
- Table 12: Doctorate / Grand total (annual mean for year range)
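
The twelve tables differ only in degree level, count column, and aggregation, so the ranking step can be sketched as one small helper. This is a minimal sketch, not part of `completions_module`: the name `top_institutions` and the toy data are hypothetical, and the degree-level filter is assumed to have been applied before the call.

```python
import pandas as pd

def top_institutions(df, column, how="sum", n=30):
    """Rank institutions by the sum or mean of one count column (hypothetical helper)."""
    grouped = df.groupby("institution name")[column]
    ranked = grouped.sum() if how == "sum" else grouped.mean()
    return ranked.sort_values(ascending=False).head(n)

# Toy data, made up for illustration; not taken from the completions files.
toy = pd.DataFrame({
    "institution name": ["A", "A", "B", "B", "C"],
    "Hispanic or Latino total": [3, 5, 10, 2, 1],
    "Grand total": [30, 50, 40, 20, 10],
})

# One table per (column, aggregation) pair for a single degree level.
tables = {
    (column, how): top_institutions(toy, column, how)
    for column in ("Hispanic or Latino total", "Grand total")
    for how in ("sum", "mean")
}
print(tables[("Hispanic or Latino total", "sum")])
```

Repeating the dictionary comprehension for each degree level would yield all twelve tables.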


_Author: Maria J. Molina (molina@ucar.edu)_

**First, let's grab the emerging HSI universities that are in the UCAR-provided list**

In [2]:
# Emerging HSI list (HACU, 2019-20)
hsi = pd.read_excel('../data/HACU-2019-20-Emerging-Hispanic-Institutions.xlsx', engine='openpyxl')
hsi_uni_list = hsi['Name'].unique()

# Completions data for the institutions in the UCAR-provided list
all_files = sorted(glob.glob("../data/completions_*.csv"))
df = cmodule.open_and_concat(all_files)
ucar_uni_list = df['institution name'].unique()

# Keep only the institution names that also appear in the emerging HSI list
ucar_list_hsi = ucar_uni_list[np.isin(ucar_uni_list, hsi_uni_list)]

In [3]:
df_ucar_hsi = df.loc[df['institution name'].isin(ucar_list_hsi)]
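
The name matching above is an exact string comparison, so an institution spelled differently in the two sources would be dropped without warning. The sketch below repeats the same `np.isin` and `.isin` steps on made-up names and also lists the HSI names that found no match, which may help when auditing the join. All names and values here are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for the completions data and the HSI name list.
completions = pd.DataFrame({
    "institution name": ["Univ A", "Univ B", "Univ C", "Univ A"],
    "Grand total": [10, 20, 30, 40],
})
hsi_names = pd.Series(["Univ A", "Univ C", "Univ D"]).unique()

# Same two steps as the cells above: match names, then filter rows.
all_names = completions["institution name"].unique()
matched = all_names[np.isin(all_names, hsi_names)]
subset = completions.loc[completions["institution name"].isin(matched)]

# HSI names with no exact match in the completions data.
unmatched = sorted(set(hsi_names) - set(all_names))
print(unmatched)
```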

**Now, let's grab the emerging HSI universities that are not in the UCAR-provided list**

In [5]:
all_files = glob.glob("../data/emerginghsi_completions_*.csv")  # emerging HSI list
all_files = sorted(all_files)
df_hsi = cmodule.open_and_concat(all_files)
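
The source of `cmodule.open_and_concat` is not shown in this notebook. As a rough illustration only, a function with that role could read each CSV and stack the rows; the name `open_and_concat_sketch` and the in-memory CSV text below are assumptions, and the real implementation may differ.

```python
import io
import pandas as pd

def open_and_concat_sketch(sources):
    """Read each CSV source and stack the rows into one DataFrame (hypothetical)."""
    frames = [pd.read_csv(src) for src in sources]
    return pd.concat(frames, ignore_index=True)

# In-memory CSVs standing in for the per-year files; values are made up.
csv_2009 = io.StringIO("institution name,year,Grand total\nUniv A,2009,10\n")
csv_2010 = io.StringIO(
    "institution name,year,Grand total\nUniv A,2010,12\nUniv B,2010,7\n"
)

combined = open_and_concat_sketch([csv_2009, csv_2010])
print(combined)
```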

**Finally, concatenate the two sets of universities**

In [6]:
all_hsi_unis = pd.concat([df_ucar_hsi, df_hsi])
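
`pd.concat` stacks rows without checking for overlap. If the two inputs are disjoint by construction, nothing more is needed. If an institution's rows could appear in both, the sums below would count them twice. A minimal check, using made-up frames and a hypothetical `year` column, might look like this:

```python
import pandas as pd

# Made-up frames; "Univ B" deliberately appears in both.
ucar_part = pd.DataFrame({
    "institution name": ["Univ A", "Univ B"],
    "year": [2009, 2009],
    "Grand total": [10, 20],
})
hsi_part = pd.DataFrame({
    "institution name": ["Univ B", "Univ C"],
    "year": [2009, 2009],
    "Grand total": [20, 30],
})

combined = pd.concat([ucar_part, hsi_part], ignore_index=True)

# Count fully identical rows, then drop them.
n_duplicates = int(combined.duplicated().sum())
deduplicated = combined.drop_duplicates(ignore_index=True)
print(n_duplicates, len(deduplicated))
```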

### Top 30 US institutions: Hispanic/Latinx bachelor's degree recipients in the Physical Sciences (2009-2019)

In [7]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
The University of Texas at Austin                          510
University of Florida                                      369
Texas A & M University-College Station                     356
University of California-San Diego                         341
University of California-Berkeley                          260
University of California-Los Angeles                       249
University of California-Davis                             209
University of South Florida-Main Campus                    180
Florida State University                                   175
Arizona State University-Tempe                             149
Massachusetts Institute of Technology                      132
Stanford University                                        115
University of Miami                                        107
Sam Houston State University                               105
California Polytechnic State University-San Luis Obispo     86
Northern Illinois University          

In [8]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
The University of Texas at Austin                          46.363636
Texas A & M University-College Station                     18.736842
University of Florida                                      16.772727
University of California-San Diego                         15.500000
University of South Florida-Main Campus                    12.000000
University of California-Los Angeles                       11.857143
University of California-Berkeley                          11.818182
Arizona State University-Tempe                             11.461538
University of California-Davis                              9.500000
Florida State University                                    8.333333
The University of Texas at Dallas                           6.818182
Massachusetts Institute of Technology                       6.285714
California Institute of Technology                          5.818182
Stanford University                                         5.227273
University of Mia
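
Dividing the totals by the means above gives 510 / 46.363636 = 11 rows for The University of Texas at Austin but 356 / 18.736842 = 19 rows for Texas A & M University-College Station. This suggests that `.mean()` averages over records, and that some institutions contribute more than one record per year. If a strict per-year average is wanted, one option is to sum within each year first and then average across years. The sketch below assumes a hypothetical `year` column and made-up values; the actual column name in the completions files may differ.

```python
import pandas as pd

# Made-up records: "Univ A" has two rows in 2009 (e.g. two programs).
records = pd.DataFrame({
    "institution name": ["Univ A", "Univ A", "Univ A", "Univ B", "Univ B"],
    "year": [2009, 2009, 2010, 2009, 2010],  # hypothetical column name
    "Hispanic or Latino total": [4, 6, 8, 5, 7],
})
col = "Hispanic or Latino total"

# Mean over rows, as in the cell above.
per_row_mean = records.groupby("institution name")[col].mean()

# Mean over years: total per institution and year, then average the years.
yearly = records.groupby(["institution name", "year"])[col].sum()
annual_mean = yearly.groupby(level="institution name").mean()

print(per_row_mean)
print(annual_mean)
```

The two results agree whenever every institution has exactly one row per year.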

### Top 30 US institutions: Grand total bachelor's degree recipients in the Physical Sciences (2009-2019)

In [9]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Grand total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of California-Berkeley                          3337
University of California-San Diego                         3143
The University of Texas at Austin                          2848
University of Florida                                      2170
Texas A & M University-College Station                     2165
University of California-Los Angeles                       1825
University of California-Davis                             1672
Massachusetts Institute of Technology                      1241
Florida State University                                   1213
University of South Florida-Main Campus                    1169
Arizona State University-Tempe                             1097
Stanford University                                         904
Northern Illinois University                                860
California Polytechnic State University-San Luis Obispo     841
SUNY at Albany                                              771
Washington State Univer

In [10]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Grand total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
The University of Texas at Austin                          258.909091
University of California-Berkeley                          151.681818
University of California-San Diego                         142.863636
Texas A & M University-College Station                     113.947368
University of Florida                                       98.636364
University of California-Los Angeles                        86.904762
Arizona State University-Tempe                              84.384615
University of South Florida-Main Campus                     77.933333
University of California-Davis                              76.000000
Washington State University                                 65.818182
Massachusetts Institute of Technology                       59.095238
Florida State University                                    57.761905
The University of Texas at Dallas                           56.909091
California Institute of Technology                          51.727273
Nor

### Top 30 US institutions: Hispanic/Latinx master's degree recipients in the Physical Sciences (2009-2019)

In [11]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of California-San Diego         86
Johns Hopkins University                   64
University of California-Los Angeles       63
Texas A & M University-College Station     62
Stanford University                        57
University of California-Berkeley          42
Florida State University                   36
The University of Texas at Austin          36
Rice University                            22
University of California-Davis             20
University of South Florida-Main Campus    18
Texas A & M University-Commerce            17
The University of Texas at Dallas          17
University of Florida                      16
Portland State University                  11
Governors State University                 11
DePaul University                          10
SUNY at Albany                             10
University of Miami                         9
Illinois Institute of Technology            9
Massachusetts Institute of Technology       8
Arizona State Uni

In [12]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of California-San Diego         7.818182
Johns Hopkins University                   5.818182
University of California-Los Angeles       5.727273
Texas A & M University-College Station     5.636364
Stanford University                        5.181818
University of California-Berkeley          3.818182
Florida State University                   3.272727
The University of Texas at Austin          3.272727
Rice University                            1.692308
University of California-Davis             1.666667
University of South Florida-Main Campus    1.636364
Texas A & M University-Commerce            1.545455
The University of Texas at Dallas          1.545455
University of Florida                      1.454545
Portland State University                  1.000000
SUNY at Albany                             0.909091
University of Miami                        0.818182
Illinois Institute of Technology           0.818182
Governors State University                 0.78

### Top 30 US institutions: Grand total master's degree recipients in the Physical Sciences (2009-2019)

In [13]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Grand total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
Johns Hopkins University                         1361
University of California-San Diego               1109
University of California-Los Angeles              807
University of California-Berkeley                 770
Texas A & M University-College Station            741
Florida State University                          716
Rice University                                   676
The University of Texas at Austin                 611
Stanford University                               607
University of Florida                             418
University of California-Davis                    397
University of South Florida-Main Campus           368
The University of Texas at Dallas                 350
California Institute of Technology                338
Lamar University                                  329
New York University                               283
SUNY at Albany                                    276
Illinois Institute of Technology                  265
Fairleigh D

In [14]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Grand total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
Johns Hopkins University                         123.727273
University of California-San Diego               100.818182
University of California-Los Angeles              73.363636
University of California-Berkeley                 70.000000
Texas A & M University-College Station            67.363636
Florida State University                          65.090909
The University of Texas at Austin                 55.545455
Stanford University                               55.181818
Rice University                                   52.000000
University of Florida                             38.000000
University of South Florida-Main Campus           33.454545
University of California-Davis                    33.083333
The University of Texas at Dallas                 31.818182
California Institute of Technology                30.727273
Lamar University                                  29.909091
New York University                               25.727273
SUNY at Albany         

### Top 30 US institutions: Hispanic/Latinx doctorate degree recipients in the Physical Sciences (2009-2019)

In [15]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of California-Berkeley          63
University of California-San Diego         49
University of California-Los Angeles       46
The University of Texas at Austin          43
Texas A & M University-College Station     39
Massachusetts Institute of Technology      37
Stanford University                        36
University of California-Davis             33
University of Florida                      28
University of South Florida-Main Campus    26
California Institute of Technology         25
Rice University                            24
University of Miami                        18
University of Southern California          16
Arizona State University-Tempe             15
The University of Texas at Dallas          13
Johns Hopkins University                    9
Florida State University                    8
Washington State University                 8
New York University                         7
SUNY at Albany                              6
New Jersey Instit

In [16]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of California-Berkeley          5.727273
University of California-San Diego         4.454545
University of California-Los Angeles       4.181818
The University of Texas at Austin          3.909091
Texas A & M University-College Station     3.545455
Massachusetts Institute of Technology      3.363636
Stanford University                        3.272727
University of California-Davis             3.000000
University of Florida                      2.545455
University of South Florida-Main Campus    2.363636
California Institute of Technology         2.272727
Rice University                            2.181818
University of Miami                        1.636364
Arizona State University-Tempe             1.363636
The University of Texas at Dallas          1.181818
Johns Hopkins University                   0.818182
University of Southern California          0.800000
Florida State University                   0.727273
Washington State University                0.72

### Top 30 US institutions: Grand total doctorate degree recipients in the Physical Sciences (2009-2019)

In [17]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Grand total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of California-Berkeley          1390
Stanford University                        1178
Massachusetts Institute of Technology      1097
The University of Texas at Austin           996
University of California-San Diego          942
Texas A & M University-College Station      881
California Institute of Technology          807
University of California-Los Angeles        785
University of Florida                       756
University of California-Davis              667
Florida State University                    597
University of Southern California           501
Rice University                             495
Arizona State University-Tempe              479
Johns Hopkins University                    420
University of South Florida-Main Campus     413
New York University                         280
University of Miami                         266
The University of Texas at Dallas           228
Washington State University                 226
SUNY at Albany         

In [18]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.physical_sciences(df)
df.groupby('institution name')['Grand total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of California-Berkeley          126.363636
Stanford University                        107.090909
Massachusetts Institute of Technology       99.727273
The University of Texas at Austin           90.545455
University of California-San Diego          85.636364
Texas A & M University-College Station      80.090909
California Institute of Technology          73.363636
University of California-Los Angeles        71.363636
University of Florida                       68.727273
University of California-Davis              60.636364
Florida State University                    54.272727
Rice University                             45.000000
Arizona State University-Tempe              43.545455
Johns Hopkins University                    38.181818
University of South Florida-Main Campus     37.545455
New York University                         25.454545
University of Southern California           25.050000
University of Miami                         24.181818
The Univers