## HSI School stats (atmospheric sciences)

In [1]:
import pandas as pd
import numpy as np
import glob
import os
import matplotlib.pyplot as plt
import completions_module as cmodule

This notebook tabulates `Atmospheric Sciences` degree completions at Hispanic-Serving Institutions (HSIs) for 2009-2019.

- Table 1: Bachelors / Hispanic + Latinx (total sum for year range)
- Table 2: Bachelors / Hispanic + Latinx (annual mean for year range)
- Table 3: Bachelors / Grand total (total sum for year range)
- Table 4: Bachelors / Grand total (annual mean for year range)

- Table 5: Masters / Hispanic + Latinx (total sum for year range)
- Table 6: Masters / Hispanic + Latinx (annual mean for year range)
- Table 7: Masters / Grand total (total sum for year range)
- Table 8: Masters / Grand total (annual mean for year range)

- Table 9: Doctorate / Hispanic + Latinx (total sum for year range)
- Table 10: Doctorate / Hispanic + Latinx (annual mean for year range)
- Table 11: Doctorate / Grand total (total sum for year range)
- Table 12: Doctorate / Grand total (annual mean for year range)


_Author: Maria J. Molina (molina@ucar.edu)_

**First, let's grab the HSI universities that are in the UCAR-provided list**

In [2]:
hsi = pd.read_excel('../data/HACU-2019-20-Hispanic-Serving-Institutions.xlsx', engine='openpyxl')
hsi_uni_list = hsi['Name'].unique()

all_files = glob.glob("../data/completions_*.csv")
all_files = sorted(all_files)
df = cmodule.open_and_concat(all_files)
ucar_uni_list = df['institution name'].unique()

ucar_list_hsi = ucar_uni_list[np.isin(ucar_uni_list, hsi_uni_list)]

In [3]:
df_ucar_hsi = df.loc[df['institution name'].isin(ucar_list_hsi)]
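The `np.isin` filter above matches institution names as exact strings, so spelling differences between the two sources (case, spacing around `&`) would silently drop a school. The sketch below illustrates one possible way to compare names after light normalization. The `normalize_name` helper, its rules, and the toy name arrays are assumptions for illustration; they are not part of `completions_module` or either data file.

```python
import numpy as np

def normalize_name(name):
    """Lowercase, collapse whitespace, and unify '&' spacing (hypothetical rules)."""
    cleaned = " ".join(str(name).lower().split())
    return cleaned.replace(" & ", "&")

# Toy stand-ins for hsi_uni_list and ucar_uni_list (made-up spellings).
hsi_names = np.array([
    "Texas A&M University-Corpus Christi",
    "University Of Arizona",
    "San Jose State University",
])
ucar_names = np.array([
    "Texas A & M University-Corpus Christi",
    "University of Arizona",
    "San Jose State University",
    "Some Other University",
])

# Exact matching, as in the cell above: only identical strings survive.
exact = ucar_names[np.isin(ucar_names, hsi_names)]

# Matching on normalized names keeps the original spelling from ucar_names.
hsi_norm = {normalize_name(n) for n in hsi_names}
normalized = np.array([n for n in ucar_names if normalize_name(n) in hsi_norm])

print(len(exact), "exact matches;", len(normalized), "normalized matches")
```

Comparing the two results is a quick way to spot institutions lost to spelling differences before they are filtered out of `df`.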

**Now, let's grab the HSI universities that are not in the UCAR-provided list**

In [4]:
all_files = glob.glob("../data/hsi_completions_*.csv")  # HSI list
all_files = sorted(all_files)
df_hsi = cmodule.open_and_concat(all_files)

**Finally, concatenate the two sets of universities**

In [5]:
all_hsi_unis = pd.concat([df_ucar_hsi, df_hsi])
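Because `pd.concat` simply stacks rows, an institution present in both inputs would be counted twice in every sum below. A minimal sketch of an overlap check follows, using toy frames in place of `df_ucar_hsi` and `df_hsi`. The `year` column and all values are made up for illustration, and the real files may use different column names.

```python
import pandas as pd

# Toy stand-ins for df_ucar_hsi and df_hsi; the 'year' column is hypothetical.
ucar_part = pd.DataFrame({
    "institution name": ["University of Arizona", "San Jose State University"],
    "year": [2019, 2019],
    "Grand total": [5, 3],
})
hsi_part = pd.DataFrame({
    "institution name": ["San Jose State University", "University of Houston"],
    "year": [2019, 2019],
    "Grand total": [3, 2],
})

# Institutions that appear in both inputs.
overlap = sorted(set(ucar_part["institution name"]) & set(hsi_part["institution name"]))

# Stack the rows, then drop rows that are identical in every column.
combined = pd.concat([ucar_part, hsi_part], ignore_index=True)
deduplicated = combined.drop_duplicates()

print("overlap:", overlap)
print(len(combined), "rows before,", len(deduplicated), "rows after de-duplication")
```

An empty `overlap` list would confirm that the two sources are disjoint, in which case no de-duplication is needed.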

### Top 30 US institutions: Hispanic/Latinx bachelor's degree recipients in the Atmospheric Sciences (2009-2019)

In [6]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
# select the column before aggregating so non-numeric columns are never summed
df.groupby('institution name')['Hispanic or Latino total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of the Incarnate Word           23
San Jose State University                  12
Metropolitan State University of Denver     3
University of Arizona                       0
Texas A & M University-Corpus Christi       0
Howard Payne University                     0
CUNY City College                           0
Name: Hispanic or Latino total, dtype: int64
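The group, aggregate, sort, and truncate pattern in the cell above recurs for every table in this notebook. One way to factor it out is sketched here. `top_institutions` is a hypothetical helper, not a function in `completions_module`, and the toy frame uses invented values.

```python
import pandas as pd

def top_institutions(frame, column, how="sum", n=30):
    """Aggregate `column` per institution and return the n largest values.

    `how` is any aggregation name accepted by pandas, e.g. "sum" or "mean".
    """
    grouped = frame.groupby("institution name")[column]
    return grouped.agg(how).sort_values(ascending=False).head(n)

# Toy records with invented values, standing in for a filtered completions table.
toy = pd.DataFrame({
    "institution name": ["A", "A", "B", "C", "C"],
    "Grand total": [2, 3, 10, 0, 0],
})

totals = top_institutions(toy, "Grand total", how="sum")
means = top_institutions(toy, "Grand total", how="mean")
print(totals)
print(means)
```

Selecting the column before calling `agg` keeps the aggregation limited to the one numeric column of interest.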

In [7]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
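# select the column before aggregating; averaging non-numeric columns raises a TypeError in pandas >= 2.0
df.groupby('institution name')['Hispanic or Latino total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)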

institution name
University of the Incarnate Word           1.769231
San Jose State University                  1.090909
Metropolitan State University of Denver    0.230769
University of Arizona                      0.000000
Texas A & M University-Corpus Christi      0.000000
Howard Payne University                    0.000000
CUNY City College                          0.000000
Name: Hispanic or Latino total, dtype: float64
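Note that `groupby(...).mean()` averages over the rows present for each institution. That equals an annual mean only if every institution has exactly one row per year; for example, 23 degrees with a mean of 1.769231 implies 13 rows rather than 11 years. The sketch below shows how a row mean can differ from a per-year mean, assuming a `year` column exists. That column name, the year range, and the values are assumptions for illustration only.

```python
import pandas as pd

col = "Hispanic or Latino total"

# Toy records: two rows in 2018 (e.g. two program codes) and one row in 2019.
records = pd.DataFrame({
    "institution name": ["X University"] * 3,
    "year": [2018, 2018, 2019],  # hypothetical column name
    col: [2, 2, 5],
})

# Mean over rows, as computed in the cell above.
row_mean = records.groupby("institution name")[col].mean()

# Sum within each year first, then average the yearly totals.
per_year = records.groupby(["institution name", "year"])[col].sum()
annual_mean = per_year.groupby(level="institution name").mean()

# Optionally count years with no records as zero over a fixed range.
years = list(range(2016, 2020))
full_range_mean = (
    per_year.unstack("year")
    .reindex(columns=years)
    .fillna(0)
    .mean(axis=1)
)

print(row_mean, annual_mean, full_range_mean, sep="\n")
```

Which denominator is appropriate depends on how the completions files are structured, so the means reported in this notebook are best read alongside the matching totals.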

### Top 30 US institutions: Grand total bachelor's degree recipients in the Atmospheric Sciences (2009-2019)

In [9]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Grand total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
Metropolitan State University of Denver    82
San Jose State University                  65
University of the Incarnate Word           44
University of Arizona                       0
Texas A & M University-Corpus Christi       0
Howard Payne University                     0
CUNY City College                           0
Name: Grand total, dtype: int64

In [10]:
df = cmodule.bs_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Grand total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
Metropolitan State University of Denver    6.307692
San Jose State University                  5.909091
University of the Incarnate Word           3.384615
University of Arizona                      0.000000
Texas A & M University-Corpus Christi      0.000000
Howard Payne University                    0.000000
CUNY City College                          0.000000
Name: Grand total, dtype: float64

### Top 30 US institutions: Hispanic/Latinx master's degree recipients in the Atmospheric Sciences (2009-2019)

In [11]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of Arizona        5
San Jose State University    4
University of Houston        1
Texas Tech University        1
Name: Hispanic or Latino total, dtype: int64

In [12]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of Arizona        0.454545
San Jose State University    0.400000
University of Houston        0.100000
Texas Tech University        0.090909
Name: Hispanic or Latino total, dtype: float64

### Top 30 US institutions: Grand total master's degree recipients in the Atmospheric Sciences (2009-2019)

In [13]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Grand total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of Arizona        61
Texas Tech University        50
San Jose State University    33
University of Houston        13
Name: Grand total, dtype: int64

In [14]:
df = cmodule.ms_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Grand total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of Arizona        5.545455
Texas Tech University        4.545455
San Jose State University    3.300000
University of Houston        1.300000
Name: Grand total, dtype: float64

### Top 30 US institutions: Hispanic/Latinx doctoral degree recipients in the Atmospheric Sciences (2009-2019)

In [15]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of Houston    0
University of Arizona    0
Name: Hispanic or Latino total, dtype: int64

In [16]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Hispanic or Latino total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of Houston    0.0
University of Arizona    0.0
Name: Hispanic or Latino total, dtype: float64

### Top 30 US institutions: Grand total doctoral degree recipients in the Atmospheric Sciences (2009-2019)

In [17]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Grand total'].sum().sort_values(ascending=False).head(30)  # total sum (2009-2019)

institution name
University of Arizona    28
University of Houston    20
Name: Grand total, dtype: int64

In [18]:
df = cmodule.phd_degrees(all_hsi_unis)
df = cmodule.atmospheric_sciences(df)
df.groupby('institution name')['Grand total'].mean().sort_values(ascending=False).head(30)  # annual mean (2009-2019)

institution name
University of Arizona    2.800000
University of Houston    2.222222
Name: Grand total, dtype: float64