In [48]:
import plotly.io as pio

pio.renderers.default = "vscode+jupyterlab+notebook_connected"

Ricardo Lombera

Professor Feldman

Computing in Context

6 December 2024

# Project 3

## Unemployment Rate vs. Suicide Rate - A Study on Machismo and Masculinity

The topic of mental health is frequently dismissed in households throughout the United States. But just because an issue isn't discussed doesn't mean that it does not exist. For men, discussions of mental health and emotions are frequently discouraged. In households with families of color, the situation can be exacerbated by heavily ingrained *machismo*, or toxic masculinity. I want to research this issue further, focusing on one main aspect of male expectations: being a provider.

In this research project, I want to bring forward two hypotheses. First, I hypothesize that men have a higher suicide rate than women. I would also like to look further and see whether men of color have higher suicide rates than White men. Second, I want to compare this data to the long-term unemployment rate for men. I hypothesize that there is a correlation between **long-term** unemployment and suicide rate, because men are pressured to provide for their families. If men do not have a job, they might feel as though they are not fulfilling their duties as men to take care of their families.

Data on [Long-Term Unemployment](https://www.epi.org/data/#?subject=ltunemp) was taken from the Economic Policy Institute, and data on [Death by Suicide](https://catalog.data.gov/dataset/death-rates-for-suicide-by-sex-race-hispanic-origin-and-age-united-states-020c1) was taken from the Centers for Disease Control and Prevention, part of the US Department of Health and Human Services.


Dataset(s) to be used: [Long-Term Unemployment](https://www.epi.org/data/#?subject=ltunemp), [Death by Suicide](https://catalog.data.gov/dataset/death-rates-for-suicide-by-sex-race-hispanic-origin-and-age-united-states-020c1)

Analysis question: Do suicide rates in men correlate with long-term unemployment rates?

Columns that will (likely) be used:

Date

Race and Gender

Suicide Rate

Unemployment Rate

(If you’re using multiple datasets) Columns to be used to merge/join them:

[Long-Term Unemployment] [Race and Gender]

[Suicide Rate] [Date]

Hypothesis: Suicide rates and unemployment rates have similar trends (peaks and valleys) for men

Site URL: https://ricardo-lombera-computing-in-context-fall-2024.readthedocs.io/en/latest/

## Part 1: Suicide Rate Data

In [49]:
import pandas as pd
import plotly.express as px

In [50]:
raw_suicide_rate = pd.read_csv('https://data.cdc.gov/api/views/9j2v-jamp/rows.csv?accessType=DOWNLOAD')
raw_suicide_rate.head()

Unnamed: 0,INDICATOR,UNIT,UNIT_NUM,STUB_NAME,STUB_NAME_NUM,STUB_LABEL,STUB_LABEL_NUM,YEAR,YEAR_NUM,AGE,AGE_NUM,ESTIMATE,FLAG
0,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",1,Total,0,All persons,0.0,1950,1,All ages,0.0,13.2,
1,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",1,Total,0,All persons,0.0,1960,2,All ages,0.0,12.5,
2,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",1,Total,0,All persons,0.0,1970,3,All ages,0.0,13.1,
3,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",1,Total,0,All persons,0.0,1980,4,All ages,0.0,12.2,
4,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",1,Total,0,All persons,0.0,1981,5,All ages,0.0,12.3,


Based on an initial view of the data using `.head()`, I moved forward with only the year, race, and estimate data. I used `UNIT` and `AGE` to filter the data, since I was interested in each group as a whole rather than in specific age groups.

In [51]:
shortened_suicide_rate_data = raw_suicide_rate[['INDICATOR', 'UNIT', 'STUB_LABEL', 'YEAR', 'AGE', 'ESTIMATE']]
shortened_suicide_rate_data

Unnamed: 0,INDICATOR,UNIT,STUB_LABEL,YEAR,AGE,ESTIMATE
0,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1950,All ages,13.2
1,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1960,All ages,12.5
2,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1970,All ages,13.1
3,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1980,All ages,12.2
4,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1981,All ages,12.3
...,...,...,...,...,...,...
6385,Death rates for suicide,"Deaths per 100,000 resident population, crude",Female: Not Hispanic or Latino: Black or Afric...,2018,65 years and over,1.3
6386,Death rates for suicide,"Deaths per 100,000 resident population, crude",Female: Hispanic or Latino: All races: 15-24 y...,2018,15-24 years,4.1
6387,Death rates for suicide,"Deaths per 100,000 resident population, crude",Female: Hispanic or Latino: All races: 25-44 y...,2018,25-44 years,4.4
6388,Death rates for suicide,"Deaths per 100,000 resident population, crude",Female: Hispanic or Latino: All races: 45-64 y...,2018,45-64 years,3.2


In [52]:
only_age_adjusted = shortened_suicide_rate_data[shortened_suicide_rate_data['UNIT'] == 'Deaths per 100,000 resident population, age-adjusted']
only_age_adjusted

Unnamed: 0,INDICATOR,UNIT,STUB_LABEL,YEAR,AGE,ESTIMATE
0,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1950,All ages,13.2
1,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1960,All ages,12.5
2,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1970,All ages,13.1
3,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1980,All ages,12.2
4,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1981,All ages,12.3
...,...,...,...,...,...,...
809,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Not Hispanic or Latino: Black or Afric...,2018,All ages,2.9
810,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Hispanic or Latino: All races,2018,All ages,2.8
811,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Not Hispanic or Latino: American India...,2018,All ages,11.1
812,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Not Hispanic or Latino: Asian,2018,All ages,3.8


In [53]:
only_all_ages = only_age_adjusted[only_age_adjusted['AGE'] == 'All ages']
only_all_ages

Unnamed: 0,INDICATOR,UNIT,STUB_LABEL,YEAR,AGE,ESTIMATE
0,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1950,All ages,13.2
1,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1960,All ages,12.5
2,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1970,All ages,13.1
3,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1980,All ages,12.2
4,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",All persons,1981,All ages,12.3
...,...,...,...,...,...,...
809,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Not Hispanic or Latino: Black or Afric...,2018,All ages,2.9
810,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Hispanic or Latino: All races,2018,All ages,2.8
811,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Not Hispanic or Latino: American India...,2018,All ages,11.1
812,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Not Hispanic or Latino: Asian,2018,All ages,3.8
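The two filtering cells above could also be written as a single step. The following is a minimal sketch on a toy frame, assuming the same `UNIT` and `AGE` column names as the CDC file; the rows and the `AGE_ADJUSTED` constant are illustrative, not taken from the dataset.

```python
import pandas as pd

# Toy rows that mimic the CDC columns used above (values are illustrative only).
toy = pd.DataFrame({
    "UNIT": [
        "Deaths per 100,000 resident population, age-adjusted",
        "Deaths per 100,000 resident population, crude",
        "Deaths per 100,000 resident population, crude",
    ],
    "STUB_LABEL": ["Male: White", "Male: White", "Male: White: 15-24 years"],
    "AGE": ["All ages", "All ages", "15-24 years"],
    "ESTIMATE": [22.3, 21.0, 10.0],
})

# Hypothetical constant holding the unit string to keep.
AGE_ADJUSTED = "Deaths per 100,000 resident population, age-adjusted"

# Combine both conditions with & so the frame is filtered in one pass.
mask = (toy["UNIT"] == AGE_ADJUSTED) & (toy["AGE"] == "All ages")
filtered = toy.loc[mask]
print(filtered)
```

Keeping the conditions in one mask makes it easier to see at a glance which rows survive the filter.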


In [54]:
only_all_ages.STUB_LABEL.unique()

array(['All persons', 'Male', 'Female', 'Male: White',
       'Male: Black or African American',
       'Male: American Indian or Alaska Native',
       'Male: Asian or Pacific Islander', 'Female: White',
       'Female: Black or African American',
       'Female: American Indian or Alaska Native',
       'Female: Asian or Pacific Islander',
       'Male: Not Hispanic or Latino: White',
       'Male: Not Hispanic or Latino: Black or African American',
       'Male: Hispanic or Latino: All races',
       'Male: Not Hispanic or Latino: American Indian or Alaska Native',
       'Male: Not Hispanic or Latino: Asian or Pacific Islander',
       'Female: Not Hispanic or Latino: White',
       'Female: Not Hispanic or Latino: Black or African American',
       'Female: Hispanic or Latino: All races',
       'Female: Not Hispanic or Latino: American Indian or Alaska Native',
       'Female: Not Hispanic or Latino: Asian or Pacific Islander',
       'Male: Not Hispanic or Latino: Asian',
      

In [55]:
only_white_black_latino_suicides = only_all_ages[only_all_ages['STUB_LABEL'].isin(['Male: White', 'Male: Black or African American', 'Female: White', 'Female: Black or African American', 'Male: Hispanic or Latino: All races', 'Female: Hispanic or Latino: All races'])]
only_white_black_latino_suicides

Unnamed: 0,INDICATOR,UNIT,STUB_LABEL,YEAR,AGE,ESTIMATE
126,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: White,1950,All ages,22.3
127,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: White,1960,All ages,21.1
128,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: White,1970,All ages,20.8
129,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: White,1980,All ages,20.9
130,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: White,1981,All ages,20.9
...,...,...,...,...,...,...
797,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: Black or African American,2018,All ages,11.8
799,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: White,2018,All ages,7.0
800,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Female: Black or African American,2018,All ages,2.7
804,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Male: Hispanic or Latino: All races,2018,All ages,12.1


In [56]:
only_white_black_latino_suicides.STUB_LABEL.unique()

array(['Male: White', 'Male: Black or African American', 'Female: White',
       'Female: Black or African American',
       'Male: Hispanic or Latino: All races',
       'Female: Hispanic or Latino: All races'], dtype=object)

To clarify the race and ethnicity data, I renamed the values to be more straightforward, formatted as `"Race Gender"` (for example, `"White Men"`).

In [57]:
def race_label(row):
    if row['STUB_LABEL']=='Male: White':
        return 'White Men'
    elif row['STUB_LABEL']=='Male: Black or African American':
        return 'Black Men'
    elif row['STUB_LABEL']=='Female: White':
        return 'White Women'
    elif row['STUB_LABEL']=='Female: Black or African American':
        return 'Black Women'
    elif row['STUB_LABEL']=='Male: Hispanic or Latino: All races':
        return 'Hispanic Men'
    elif row['STUB_LABEL']=='Female: Hispanic or Latino: All races':
        return 'Hispanic Women'
    else:
        return 'Invalid Race'

In [58]:
only_white_black_latino_suicides['STUB_LABEL'] = only_white_black_latino_suicides.apply(race_label, axis=1)



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
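The message above is pandas' `SettingWithCopyWarning`, raised because the frame being modified was created by filtering another frame. One common way to avoid it is to take an explicit copy at the point of filtering. The sketch below uses a toy frame; the `ESTIMATE_ROUNDED` column is a hypothetical name used only for illustration.

```python
import pandas as pd

# Toy frame standing in for the filtered CDC data (values are illustrative only).
toy = pd.DataFrame({
    "STUB_LABEL": ["Male: White", "Female: White", "Male"],
    "ESTIMATE": [22.3, 5.5, 18.0],
})

# .copy() makes the subset an independent DataFrame, so later column
# assignments are unambiguous and do not touch (or warn about) the parent.
subset = toy[toy["STUB_LABEL"].isin(["Male: White", "Female: White"])].copy()

# Assigning a new column now modifies only the copy.
subset["ESTIMATE_ROUNDED"] = subset["ESTIMATE"].round()
print(subset)
```

The warning does not stop the cell from running, but copying up front removes the ambiguity about which object is being changed.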



In [59]:
only_white_black_latino_suicides.STUB_LABEL.unique()

array(['White Men', 'Black Men', 'White Women', 'Black Women',
       'Hispanic Men', 'Hispanic Women'], dtype=object)
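As an alternative to the row-wise function, the same relabeling could be expressed as a dictionary lookup with `Series.map()`. This is a sketch under the assumption that the six labels above are the only ones of interest; the `LABELS` name and the toy series are illustrative.

```python
import pandas as pd

# Mapping from the CDC labels to the shorter "Race Gender" labels.
LABELS = {
    "Male: White": "White Men",
    "Male: Black or African American": "Black Men",
    "Female: White": "White Women",
    "Female: Black or African American": "Black Women",
    "Male: Hispanic or Latino: All races": "Hispanic Men",
    "Female: Hispanic or Latino: All races": "Hispanic Women",
}

# Toy series with one label that is not in the mapping.
labels = pd.Series([
    "Male: White",
    "Female: Hispanic or Latino: All races",
    "Male: Asian",
])

# map() returns NaN for labels missing from the dictionary;
# fillna() then marks them the same way the function's else branch does.
renamed = labels.map(LABELS).fillna("Invalid Race")
print(renamed.tolist())
```

A dictionary keeps the label pairs in one place, which can make it easier to add or change a category later.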

To ensure that I only had specific race categories, I ran `.sample()` several times to see if any 'invalid race categories' appeared.

In [60]:
only_white_black_latino_suicides.sample()

Unnamed: 0,INDICATOR,UNIT,STUB_LABEL,YEAR,AGE,ESTIMATE
184,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Black Men,1993,All ages,12.9
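Because `.sample()` returns a random row, repeated sampling can miss a stray label. A deterministic check is sketched below on a toy frame; the `EXPECTED` set is an assumption based on the six labels returned by `race_label`.

```python
import pandas as pd

# The six labels that race_label is expected to produce.
EXPECTED = {
    "White Men", "Black Men", "Hispanic Men",
    "White Women", "Black Women", "Hispanic Women",
}

# Toy frame standing in for the relabeled data (illustrative only).
toy = pd.DataFrame({"STUB_LABEL": ["White Men", "Black Men", "Hispanic Women"]})

# Any label outside the expected set shows up here.
unexpected = set(toy["STUB_LABEL"]) - EXPECTED

# Direct test for the fallback label used by race_label.
has_invalid = (toy["STUB_LABEL"] == "Invalid Race").any()

print(unexpected, has_invalid)
```

If `unexpected` is empty and `has_invalid` is false, every row carries one of the intended categories.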


In [61]:
only_white_black_latino_suicides.rename(columns = {'INDICATOR': 'Cause of Death', 'UNIT': 'Unit','STUB_LABEL': 'Race', 'YEAR': 'Year','AGE': 'Age', 'ESTIMATE': 'Suicide Rate'}, inplace=True)
only_white_black_latino_suicides



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Unnamed: 0,Cause of Death,Unit,Race,Year,Age,Suicide Rate
126,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",White Men,1950,All ages,22.3
127,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",White Men,1960,All ages,21.1
128,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",White Men,1970,All ages,20.8
129,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",White Men,1980,All ages,20.9
130,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",White Men,1981,All ages,20.9
...,...,...,...,...,...,...
797,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Black Men,2018,All ages,11.8
799,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",White Women,2018,All ages,7.0
800,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Black Women,2018,All ages,2.7
804,Death rates for suicide,"Deaths per 100,000 resident population, age-ad...",Hispanic Men,2018,All ages,12.1


With my initial cleaning now complete, I dropped `'Unit'` because that information was no longer relevant.

In [62]:
only_white_black_latino_suicides.drop(columns=['Unit'], inplace=True)
only_white_black_latino_suicides



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Unnamed: 0,Cause of Death,Race,Year,Age,Suicide Rate
126,Death rates for suicide,White Men,1950,All ages,22.3
127,Death rates for suicide,White Men,1960,All ages,21.1
128,Death rates for suicide,White Men,1970,All ages,20.8
129,Death rates for suicide,White Men,1980,All ages,20.9
130,Death rates for suicide,White Men,1981,All ages,20.9
...,...,...,...,...,...
797,Death rates for suicide,Black Men,2018,All ages,11.8
799,Death rates for suicide,White Women,2018,All ages,7.0
800,Death rates for suicide,Black Women,2018,All ages,2.7
804,Death rates for suicide,Hispanic Men,2018,All ages,12.1


For easier graphing, I used `to_datetime()` to convert the year data to a proper date.

In [63]:
only_white_black_latino_suicides["Year"] = pd.to_datetime(only_white_black_latino_suicides["Year"], format="%Y")
only_white_black_latino_suicides



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Unnamed: 0,Cause of Death,Race,Year,Age,Suicide Rate
126,Death rates for suicide,White Men,1950-01-01,All ages,22.3
127,Death rates for suicide,White Men,1960-01-01,All ages,21.1
128,Death rates for suicide,White Men,1970-01-01,All ages,20.8
129,Death rates for suicide,White Men,1980-01-01,All ages,20.9
130,Death rates for suicide,White Men,1981-01-01,All ages,20.9
...,...,...,...,...,...
797,Death rates for suicide,Black Men,2018-01-01,All ages,11.8
799,Death rates for suicide,White Women,2018-01-01,All ages,7.0
800,Death rates for suicide,Black Women,2018-01-01,All ages,2.7
804,Death rates for suicide,Hispanic Men,2018-01-01,All ages,12.1


In [64]:
only_white_black_latino_suicides.dtypes

Cause of Death            object
Race                      object
Year              datetime64[ns]
Age                       object
Suicide Rate             float64
dtype: object

Having dropped the unit column and converted the `"Year"` column to datetime, I renamed it to `Date` to assist in future merging.

In [65]:
only_white_black_latino_suicides.rename(columns={'Year': 'Date'}, inplace=True)
only_white_black_latino_suicides



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Unnamed: 0,Cause of Death,Race,Date,Age,Suicide Rate
126,Death rates for suicide,White Men,1950-01-01,All ages,22.3
127,Death rates for suicide,White Men,1960-01-01,All ages,21.1
128,Death rates for suicide,White Men,1970-01-01,All ages,20.8
129,Death rates for suicide,White Men,1980-01-01,All ages,20.9
130,Death rates for suicide,White Men,1981-01-01,All ages,20.9
...,...,...,...,...,...
797,Death rates for suicide,Black Men,2018-01-01,All ages,11.8
799,Death rates for suicide,White Women,2018-01-01,All ages,7.0
800,Death rates for suicide,Black Women,2018-01-01,All ages,2.7
804,Death rates for suicide,Hispanic Men,2018-01-01,All ages,12.1


In [66]:
only_white_black_latino_suicides.dtypes

Cause of Death            object
Race                      object
Date              datetime64[ns]
Age                       object
Suicide Rate             float64
dtype: object

Having confirmed that all columns had the correct data types, I plotted the suicide rate data. On first inspection, I realized that data on Hispanic men and women were not included until 1985. This gap may reflect [developments in the 1960s, 1970s, and 1980s](https://www.history.com/news/hispanic-latino-latinx-chicano-background) through which Latin Americans came to be identified as Hispanic, a category distinct from the White American population.

Secondly, White men have the highest suicide rate, at [more than twice the suicide rate of the general American population](https://yaleglobalhealthreview.com/2017/05/14/white-male-suicide-the-exception-to-privelege/). Published reporting, consistent with the data presented here, states that Black men have a suicide rate about [one-third that of White men](https://www.nytimes.com/2020/12/30/upshot/suicide-demographic-differences.html). One possible explanation is that deaths of Black Americans are less likely to be coded as suicides, because Black Americans are less likely to report mental health issues or leave physical notes. Deaths among White men more generally have also increased because of drug overdoses and alcohol, [known as deaths of despair](https://www.nytimes.com/2015/11/03/health/death-rates-rising-for-middle-aged-white-americans-study-finds.html).

Lastly, Black and Hispanic men have similar suicide rates, as do Black and Hispanic women. One reason may be that people can be both racially Black and ethnically Hispanic. Culture may also play a role in the lower suicide rates, because both populations have strong religious backgrounds that discourage suicide.

In [67]:
only_white_black_latino_suicides_fig = px.line(
    only_white_black_latino_suicides,
    x="Date",
    y="Suicide Rate",
    color="Race",
    title="Suicide Rate by Race and Gender",
)
only_white_black_latino_suicides_fig.show()

## Part 2: Unemployment Data

In [68]:
long_term_unemployment = pd.read_csv('EPI Data Library - Long-term unemployment.csv')
long_term_unemployment

Unnamed: 0,Date,All,Women,Men,Black,Hispanic,White,Black Women,Black Men,Hispanic Women,Hispanic Men,White Women,White Men
0,Sep-2024,0.6%,0.6%,0.7%,1.1%,0.7%,0.5%,1.1%,1.2%,0.7%,0.6%,0.4%,0.6%
1,Aug-2024,0.6%,0.6%,0.7%,1.1%,0.7%,0.5%,1.0%,1.2%,0.7%,0.6%,0.4%,0.6%
2,Jul-2024,0.6%,0.6%,0.7%,1.1%,0.7%,0.5%,1.0%,1.2%,0.7%,0.6%,0.4%,0.6%
3,Jun-2024,0.6%,0.6%,0.7%,1.1%,0.6%,0.5%,1.0%,1.2%,0.7%,0.6%,0.4%,0.6%
4,May-2024,0.6%,0.5%,0.7%,1.1%,0.6%,0.5%,1.0%,1.2%,0.6%,0.6%,0.4%,0.5%
...,...,...,...,...,...,...,...,...,...,...,...,...,...
545,Apr-1979,0.4%,0.4%,0.4%,1.0%,0.4%,0.3%,0.8%,1.1%,0.4%,0.4%,0.3%,0.3%
546,Mar-1979,0.4%,0.4%,0.4%,1.0%,0.5%,0.3%,0.8%,1.1%,0.5%,0.5%,0.3%,0.3%
547,Feb-1979,0.4%,0.4%,0.4%,1.0%,0.5%,0.3%,0.9%,1.1%,0.5%,0.6%,0.3%,0.4%
548,Jan-1979,0.4%,0.4%,0.4%,1.0%,0.6%,0.3%,0.9%,1.1%,0.5%,0.6%,0.3%,0.4%


My first hypothesis was correct: men have higher suicide rates than women. However, White men, not men of color, had the highest suicide rate. Now I want to test my second hypothesis, that suicide rates for men correlate with rates of long-term unemployment, this time focusing only on data for men.

In [69]:
shortened_unemployment_data = long_term_unemployment[['Date', 'Black Men', 'Hispanic Men', 'White Men']]
shortened_unemployment_data

Unnamed: 0,Date,Black Men,Hispanic Men,White Men
0,Sep-2024,1.2%,0.6%,0.6%
1,Aug-2024,1.2%,0.6%,0.6%
2,Jul-2024,1.2%,0.6%,0.6%
3,Jun-2024,1.2%,0.6%,0.6%
4,May-2024,1.2%,0.6%,0.5%
...,...,...,...,...
545,Apr-1979,1.1%,0.4%,0.3%
546,Mar-1979,1.1%,0.5%,0.3%
547,Feb-1979,1.1%,0.6%,0.4%
548,Jan-1979,1.1%,0.6%,0.4%


In [70]:
shortened_unemployment_data["Date"] = pd.to_datetime(long_term_unemployment["Date"], format="%b-%Y")
shortened_unemployment_data



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Unnamed: 0,Date,Black Men,Hispanic Men,White Men
0,2024-09-01,1.2%,0.6%,0.6%
1,2024-08-01,1.2%,0.6%,0.6%
2,2024-07-01,1.2%,0.6%,0.6%
3,2024-06-01,1.2%,0.6%,0.6%
4,2024-05-01,1.2%,0.6%,0.5%
...,...,...,...,...
545,1979-04-01,1.1%,0.4%,0.3%
546,1979-03-01,1.1%,0.5%,0.3%
547,1979-02-01,1.1%,0.6%,0.4%
548,1979-01-01,1.1%,0.6%,0.4%


To create a clearer comparison, I needed the labels and format to match the suicide rate data produced above. I used [`.melt()`](https://pandas.pydata.org/docs/reference/api/pandas.melt.html) to unpivot the dataframe from one column per group to one row per date and group.

In [71]:
reshaped_unemployment_data = pd.melt(shortened_unemployment_data, id_vars=['Date'], var_name='Race', value_name='Unemployment')
reshaped_unemployment_data['Unemployment'] = reshaped_unemployment_data['Unemployment'].str.replace('%', '').astype(float) / 100
pd.DataFrame(reshaped_unemployment_data)

Unnamed: 0,Date,Race,Unemployment
0,2024-09-01,Black Men,0.012
1,2024-08-01,Black Men,0.012
2,2024-07-01,Black Men,0.012
3,2024-06-01,Black Men,0.012
4,2024-05-01,Black Men,0.012
...,...,...,...
1645,1979-04-01,White Men,0.003
1646,1979-03-01,White Men,0.003
1647,1979-02-01,White Men,0.004
1648,1979-01-01,White Men,0.004
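To make the reshaping step concrete, here is a small sketch of the same idea on two months of the data shown above. It assumes the wide layout of the EPI file (one column per group, percentages stored as strings); the variable names `wide` and `long` are illustrative.

```python
import pandas as pd

# Two months from the table above, in the original wide layout.
wide = pd.DataFrame({
    "Date": pd.to_datetime(["Sep-2024", "Aug-2024"], format="%b-%Y"),
    "Black Men": ["1.2%", "1.2%"],
    "White Men": ["0.6%", "0.6%"],
})

# melt() keeps Date as an identifier and stacks the group columns into
# a Race column (the old column names) and an Unemployment column (the values).
long = wide.melt(id_vars="Date", var_name="Race", value_name="Unemployment")

# Strip the percent sign and convert to a fraction between 0 and 1.
long["Unemployment"] = long["Unemployment"].str.rstrip("%").astype(float) / 100
print(long)
```

Each wide row becomes one long row per group, so two months and two groups yield four rows.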


To match the suicide rate data, I needed to resample the unemployment data from monthly to yearly. I followed the same steps, in the same order, for each race category:
1) Separate the data by race category.
2) Resample using `'YE'` and reset the index to bring `Date` back as a column.
3) Confirm that the columns are `Date` and `Unemployment`, with datetime and float types, respectively.
4) Add `Race` back as a column.
5) Confirm that the columns are `Date`, `Unemployment`, and `Race`, with datetime, float, and object types, respectively.
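The steps above are carried out group by group in the cells that follow. As a compact alternative, the yearly averages for all groups could be computed in one pass. Note that this sketch swaps in a plain `groupby` on the calendar year instead of `.resample('YE')`, and it uses made-up toy values, so it illustrates the idea rather than reproducing the notebook's output.

```python
import pandas as pd

# Toy monthly data in the long format produced by melt (values are made up).
monthly = pd.DataFrame({
    "Date": pd.to_datetime([
        "2023-01-01", "2023-02-01", "2024-01-01",
        "2023-01-01", "2023-02-01", "2024-01-01",
    ]),
    "Race": ["Black Men"] * 3 + ["White Men"] * 3,
    "Unemployment": [0.010, 0.012, 0.012, 0.004, 0.006, 0.006],
})

# Average the monthly values within each (Race, calendar year) pair.
# Race stays a column throughout, so no separate re-labeling step is needed.
yearly = (
    monthly
    .assign(Year=monthly["Date"].dt.year)
    .groupby(["Race", "Year"], as_index=False)["Unemployment"]
    .mean()
)
print(yearly)
```

Using an integer `Year` column also sidesteps the question of whether a year is labeled by its first or last day, which matters when two datasets are later joined.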

In [72]:
unemployment_black_men = reshaped_unemployment_data[reshaped_unemployment_data['Race'] == 'Black Men']
unemployment_black_men

Unnamed: 0,Date,Race,Unemployment
0,2024-09-01,Black Men,0.012
1,2024-08-01,Black Men,0.012
2,2024-07-01,Black Men,0.012
3,2024-06-01,Black Men,0.012
4,2024-05-01,Black Men,0.012
...,...,...,...
545,1979-04-01,Black Men,0.011
546,1979-03-01,Black Men,0.011
547,1979-02-01,Black Men,0.011
548,1979-01-01,Black Men,0.011


In [73]:
unemployment_black_men.set_index('Date', inplace=True)
yearly_unemployment_black_men = unemployment_black_men.resample('YE').mean(numeric_only=True).reset_index()
pd.DataFrame(yearly_unemployment_black_men)

Unnamed: 0,Date,Unemployment
0,1978-12-31,0.011
1,1979-12-31,0.01075
2,1980-12-31,0.011083
3,1981-12-31,0.01975
4,1982-12-31,0.026
5,1983-12-31,0.042417
6,1984-12-31,0.036833
7,1985-12-31,0.026167
8,1986-12-31,0.022417
9,1987-12-31,0.020667


In [74]:
yearly_unemployment_black_men.columns

Index(['Date', 'Unemployment'], dtype='object')

In [75]:
yearly_unemployment_black_men.dtypes

Date            datetime64[ns]
Unemployment           float64
dtype: object

In [76]:
yearly_unemployment_black_men['Race'] = 'Black Men'
yearly_unemployment_black_men.head()

Unnamed: 0,Date,Unemployment,Race
0,1978-12-31,0.011,Black Men
1,1979-12-31,0.01075,Black Men
2,1980-12-31,0.011083,Black Men
3,1981-12-31,0.01975,Black Men
4,1982-12-31,0.026,Black Men


In [77]:
yearly_unemployment_black_men.columns

Index(['Date', 'Unemployment', 'Race'], dtype='object')

In [78]:
yearly_unemployment_black_men.dtypes

Date            datetime64[ns]
Unemployment           float64
Race                    object
dtype: object

In [79]:
unemployment_hispanic_men = reshaped_unemployment_data[reshaped_unemployment_data['Race'] == 'Hispanic Men']
unemployment_hispanic_men

Unnamed: 0,Date,Race,Unemployment
550,2024-09-01,Hispanic Men,0.006
551,2024-08-01,Hispanic Men,0.006
552,2024-07-01,Hispanic Men,0.006
553,2024-06-01,Hispanic Men,0.006
554,2024-05-01,Hispanic Men,0.006
...,...,...,...
1095,1979-04-01,Hispanic Men,0.004
1096,1979-03-01,Hispanic Men,0.005
1097,1979-02-01,Hispanic Men,0.006
1098,1979-01-01,Hispanic Men,0.006


In [80]:
unemployment_hispanic_men.set_index('Date', inplace=True)
yearly_unemployment_hispanic_men = unemployment_hispanic_men.resample('YE').mean(numeric_only=True).reset_index()
pd.DataFrame(yearly_unemployment_hispanic_men)

Unnamed: 0,Date,Unemployment
0,1978-12-31,0.006
1,1979-12-31,0.004417
2,1980-12-31,0.00525
3,1981-12-31,0.008333
4,1982-12-31,0.010667
5,1983-12-31,0.0195
6,1984-12-31,0.015583
7,1985-12-31,0.0125
8,1986-12-31,0.012083
9,1987-12-31,0.010333


In [81]:
yearly_unemployment_hispanic_men.columns

Index(['Date', 'Unemployment'], dtype='object')

In [82]:
yearly_unemployment_hispanic_men.dtypes

Date            datetime64[ns]
Unemployment           float64
dtype: object

In [83]:
yearly_unemployment_hispanic_men['Race'] = 'Hispanic Men'
yearly_unemployment_hispanic_men.head()

Unnamed: 0,Date,Unemployment,Race
0,1978-12-31,0.006,Hispanic Men
1,1979-12-31,0.004417,Hispanic Men
2,1980-12-31,0.00525,Hispanic Men
3,1981-12-31,0.008333,Hispanic Men
4,1982-12-31,0.010667,Hispanic Men


In [84]:
unemployment_white_men = reshaped_unemployment_data[reshaped_unemployment_data['Race'] == 'White Men']
unemployment_white_men

Unnamed: 0,Date,Race,Unemployment
1100,2024-09-01,White Men,0.006
1101,2024-08-01,White Men,0.006
1102,2024-07-01,White Men,0.006
1103,2024-06-01,White Men,0.006
1104,2024-05-01,White Men,0.005
...,...,...,...
1645,1979-04-01,White Men,0.003
1646,1979-03-01,White Men,0.003
1647,1979-02-01,White Men,0.004
1648,1979-01-01,White Men,0.004


In [85]:
unemployment_white_men.set_index('Date', inplace=True)
yearly_unemployment_white_men = unemployment_white_men.resample('YE').mean(numeric_only=True).reset_index()
pd.DataFrame(yearly_unemployment_white_men)

Unnamed: 0,Date,Unemployment
0,1978-12-31,0.004
1,1979-12-31,0.003167
2,1980-12-31,0.003667
3,1981-12-31,0.0065
4,1982-12-31,0.008333
5,1983-12-31,0.016167
6,1984-12-31,0.014167
7,1985-12-31,0.009
8,1986-12-31,0.008
9,1987-12-31,0.007667


In [86]:
yearly_unemployment_white_men['Race'] = 'White Men'
yearly_unemployment_white_men.head()

Unnamed: 0,Date,Unemployment,Race
0,1978-12-31,0.004,White Men
1,1979-12-31,0.003167,White Men
2,1980-12-31,0.003667,White Men
3,1981-12-31,0.0065,White Men
4,1982-12-31,0.008333,White Men


To create a line graph, I needed to put the data back together, so I concatenated the three yearly dataframes using `pd.concat()`.

In [87]:
yearly_race_rates_concatenated = pd.concat([yearly_unemployment_black_men, yearly_unemployment_hispanic_men, yearly_unemployment_white_men], axis=0)
yearly_race_rates_concatenated

Unnamed: 0,Date,Unemployment,Race
0,1978-12-31,0.011000,Black Men
1,1979-12-31,0.010750,Black Men
2,1980-12-31,0.011083,Black Men
3,1981-12-31,0.019750,Black Men
4,1982-12-31,0.026000,Black Men
...,...,...,...
42,2020-12-31,0.006167,White Men
43,2021-12-31,0.014417,White Men
44,2022-12-31,0.009083,White Men
45,2023-12-31,0.004917,White Men


The resulting line graph reverses the ordering seen in the suicide rate data: Black men had the highest rate of long-term unemployment, followed by Hispanic men and then White men.

In [88]:
yearly_race_rates_concatenated_fig = px.line(
    yearly_race_rates_concatenated,
    x="Date",
    y="Unemployment",
    color="Race",
    title="Unemployment by Race and Gender",
)
yearly_race_rates_concatenated_fig.show()

## Comparing the Data

Here are the resulting graphs again:

In [89]:
only_white_black_latino_suicides_fig.show()

In [90]:
yearly_race_rates_concatenated_fig.show()

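To test my hypothesis, I looked at peaks in the suicide rate to see whether they matched peaks in the long-term unemployment rate. I picked one peak for each group:
1) 1986 - White Men: Not a match
2) 1989 - Hispanic Men: Not a match
3) 1989 - Black Men: Not a match

None of these suicide-rate peaks coincides with a peak in long-term unemployment, so this comparison does not support my second hypothesis.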
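The comparison above is visual. A more direct check of a correlation hypothesis would be to join the two datasets and compute a correlation coefficient for each group. The sketch below shows one way this might be done. The numbers are made up purely for illustration and are not results from this project. Because the suicide data is dated January 1 of each year and the resampled unemployment data is dated December 31, the join here is on a derived `Year` column (an assumption of this sketch) rather than on `Date`.

```python
import pandas as pd

# Made-up yearly values for one group, shaped like the two cleaned datasets.
suicide = pd.DataFrame({
    "Date": pd.to_datetime(["2001-01-01", "2002-01-01", "2003-01-01", "2004-01-01"]),
    "Race": ["White Men"] * 4,
    "Suicide Rate": [20.0, 21.0, 19.0, 22.0],
})
unemployment = pd.DataFrame({
    "Date": pd.to_datetime(["2001-12-31", "2002-12-31", "2003-12-31", "2004-12-31"]),
    "Race": ["White Men"] * 4,
    "Unemployment": [0.004, 0.006, 0.003, 0.007],
})

# The Date values never coincide, so derive a shared Year key for the join.
suicide["Year"] = suicide["Date"].dt.year
unemployment["Year"] = unemployment["Date"].dt.year

# Inner join keeps only the (Year, Race) pairs present in both datasets.
merged = suicide.merge(
    unemployment, on=["Year", "Race"], suffixes=("_suicide", "_unemployment")
)

# Pearson correlation between the two rates, computed separately per group.
correlations = {
    race: group["Suicide Rate"].corr(group["Unemployment"])
    for race, group in merged.groupby("Race")
}
print(correlations)
```

A coefficient near 1 or -1 would indicate a strong linear relationship and a value near 0 a weak one, although with yearly data and few points any such number should be read cautiously.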


## Conclusion

Complex topics such as suicide and unemployment cannot be attributed to a single cause. Through this research, I learned about several factors that play a role in raising or lowering suicide rates and unemployment rates, such as gun control, geographic location, and substance abuse. As the famous saying goes, correlation does not mean causation.

I also want to acknowledge that, while White men have one of the highest suicide rates, Native American and Alaska Native populations have the highest suicide rate, at [21.8 per 100,000 people](https://www.cdc.gov/nchs/data/hestat/suicide/rates_1999_2017.htm). This project focused only on Black, Hispanic, and White men because the long-term unemployment data included only these three racial and ethnic categories.

Looking to the Future: In 2024, the Department of Health and Human Services released a [National Strategy for Suicide Prevention](https://www.hhs.gov/programs/prevention-and-wellness/mental-health-substance-abuse/national-strategy-suicide-prevention/index.html) to address these issues. The four-tier plan is as follows:
1) Community-Based Suicide Prevention
2) Treatment and Crisis Services
3) Surveillance, Quality Improvement, and Research
4) Health Equity in Suicide Prevention

This plan could not only support suicide prevention but also produce more accurate suicide rate data that show how this issue affects every community. However, it is now up to the incoming presidential administration to carry out this approach.