# What accommodations best prevent suicide?
The dataset contains age-standardised suicide rates for 2015 and 2016 by country, as well as counts of resources and people, such as mental health units in hospitals, outpatient facilities, and psychiatrists working in the sector. Age is already handled, so this project's job is to adjust for everything else, so to speak. It focusses on 2016, which is the more recent year and has far fewer missing figures.

There will of course be plenty of other factors not in the dataset, such as income levels and (for the *reported* numbers, at least) how much suicide is stigmatised. So 1) whether a country's rate is high or low for its region is as relevant as how it compares to the world average, and 2) remember this is just about the factors presented, not a complete picture.
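
To make point 1 concrete, here is a minimal sketch of comparing each country to its own region as well as to the world. The numbers are invented, and `region` is an assumed column name; nothing in this notebook confirms the dataset has one.

```python
import pandas as pd

# Toy numbers for illustration only; `region` is a hypothetical column.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "region": ["North", "North", "North", "South", "South", "South"],
    "suicide_rate": [20.0, 15.0, 10.0, 6.0, 4.0, 2.0],
})

# Difference from the world average (unweighted mean over countries).
toy["vs_world"] = toy["suicide_rate"] - toy["suicide_rate"].mean()

# Difference from the average of the country's own region.
regional_mean = toy.groupby("region")["suicide_rate"].transform("mean")
toy["vs_region"] = toy["suicide_rate"] - regional_mean

print(toy)
```

In this toy table, country C is slightly above the world average (by 0.5) but 5 below its regional average, so the two comparisons can tell different stories.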

Warning: I will treat this subject with levity because I'm sad.

In [25]:
import pandas as pd
import numpy as np
import plotly.express as px

In [26]:
df = pd.read_csv("suicide_dataset-11.csv")
# Turn Yes/No values into booleans.
df = df.replace({'Yes': True, 'No': False})
# Split by year. The "g" frames keep the per-sex rows; the others keep the combined "Both" rows.
df2015 = df[df["year"] == 2015]
df2015g = df2015[df2015["sex"] != "Both"]
df2015 = df2015[df2015["sex"] == "Both"]
df2016 = df[df["year"] == 2016]
df2016g = df2016[df2016["sex"] != "Both"]
df2016 = df2016[df2016["sex"] == "Both"]

To start, here are the rates by country for 2016 (the 2015 numbers aren't much different):

In [27]:
fig = px.choropleth(df2016, projection="winkel tripel", locations="iso", color="suicide_rate", color_continuous_scale=px.colors.sequential.Bluered)
fig.show()

If it looks like there are more countries with lower rates, it's because there are. Here's a breakdown for 2016:

In [28]:
fig = px.histogram(df2016, x="suicide_rate")
fig.show()
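
A quick way to put a number on "more countries with lower rates" is to compare the mean with the median. This is a sketch on invented rates, not the dataset's values: when a few high rates pull the mean up, most countries end up below it.

```python
import numpy as np

# Invented rates with a long right tail, for illustration only.
rates = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 12.0, 30.0])

mean_rate = rates.mean()
median_rate = np.median(rates)

# Fraction of countries whose rate is below the mean.
share_below_mean = np.mean(rates < mean_rate)

print(f"mean = {mean_rate}, median = {median_rate}, share below mean = {share_below_mean}")
```

Here the mean (8.75) sits well above the median (5.5), and three quarters of the toy countries fall below the mean.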

## Time for a bunch of linear regressions!

In [45]:
fig = px.scatter(df2016, x="mental_hospitals_per_100k", y="suicide_rate", trendline="ols")
fig.show()
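
For reference, `trendline="ols"` draws an ordinary least squares line (as far as I know, Plotly Express hands the fit to statsmodels). The sketch below does the same kind of fit by hand with NumPy on made-up points, so the slope and R² that show up when hovering over a trend line have something to be compared with. The data here are invented.

```python
import numpy as np

# Made-up points, for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Degree-1 least squares fit: y is approximated by slope * x + intercept.
slope, intercept = np.polyfit(x, y, 1)

# R squared: the share of the variance in y that the line accounts for.
predicted = slope * x + intercept
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.2f}")
```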

### Time for a bunch of linear regressions with arbitrary outlier cutoffs!

I'll save you all the other pre-filtering scatterplots.
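
Why the cutoffs matter: a single far-out point can flip the sign of an OLS slope. Here is a minimal sketch on invented numbers, with a hypothetical helper `slope_below` that fits only the points under a cutoff.

```python
import numpy as np

def slope_below(x, y, cutoff):
    """OLS slope of y on x using only the points with x < cutoff."""
    keep = x < cutoff
    slope, _intercept = np.polyfit(x[keep], y[keep], 1)
    return slope

# Invented data: five points rising together, plus one distant outlier.
x = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 3.0])
y = np.array([4.0, 6.0, 8.0, 10.0, 12.0, 2.0])

for cutoff in (0.5, 5.0):
    print(f"x < {cutoff}: slope = {slope_below(x, y, cutoff):.2f}")
```

With the outlier excluded the slope is steeply positive; with it included the slope turns negative. Neither cutoff is "right", which is the problem with picking them by eye.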

In [48]:
fig = px.scatter(df2016[df2016["mental_hospitals_per_100k"] < 0.5], x="mental_hospitals_per_100k", y="suicide_rate", trendline="ols")
fig.show()

This seems like the most basic statistic there is, and the positive trend is worrying. Even if you look at that tempting river and cut it down to <0.1, the data make an annoying L shape. But that's okay. This is a good time to mention that there are basically three "genres of accommodation" in the dataset:

* Facilities (per 100 000 people: mental hospitals, *beds in* mental hospitals, mental health units in general hospitals, *beds for mental health in* general hospitals, mental health outpatient facilities, mental health day treatment facilities, community residential facilities, and *beds in* community residential facilities)
* People (working in the mental health sector, per 100 000 people: psychiatrists, nurses, social workers, and psychologists)
* Government (fraction of government expenditure on mental health which was on mental hospitals, whether the country has a standalone law for mental health, when this law was enacted, whether the country has a standalone policy or plan for mental health, and when this policy was published)
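
The genres can also be written down as a mapping from genre to column names, which makes it possible to scan correlations in one loop rather than one chart at a time. This is only a sketch: the column names are a subset of the ones used in this notebook's cells, but the values in `toy` are invented.

```python
import pandas as pd

# Subset of the notebook's column names, grouped by genre.
genres = {
    "Facilities": ["mental_hospitals_per_100k", "comres_beds_per_100k"],
    "People": ["psychologists_per_100k"],
    "Government": ["hospital_budget_pct"],
}

# Invented values, for illustration only.
toy = pd.DataFrame({
    "suicide_rate": [5.0, 10.0, 15.0, 20.0],
    "mental_hospitals_per_100k": [0.1, 0.2, 0.3, 0.4],
    "comres_beds_per_100k": [8.0, 6.0, 4.0, 2.0],
    "psychologists_per_100k": [1.0, 3.0, 2.0, 4.0],
    "hospital_budget_pct": [1.0, 2.0, 4.0, 3.0],
})

# Pearson correlation of each column with the suicide rate.
correlations = {}
for genre, columns in genres.items():
    for column in columns:
        correlations[column] = toy[column].corr(toy["suicide_rate"])
        print(f"{genre:<10} {column:<28} r = {correlations[column]:+.2f}")
```

A correlation table like this says nothing about outliers or L shapes, so it complements the scatterplots rather than replacing them.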

Surely the number of beds is a better measure than the number of facilities, right?

In [31]:
fig = px.scatter(df2016[df2016["mental_h_beds_per_100k"] < 50], x="mental_h_beds_per_100k", y="suicide_rate", trendline="ols")
fig.show()

It's another positive trend. A milder one, but still. And there's an even more tempting place than last time to cut off the already cut-off data, which would make the figure look even more useless. I know this is the opposite of data, but here's an anecdote I don't remember the details of: I saw someone say they lied to a doctor or psychologist of some kind to avoid being put in a mental hospital, because it wouldn't have been good for them. Seems like they aren't the only one. Or countries with a bigger problem tend to have more accommodations, in which case this entire analysis is useless, but let's not make that our base assumption.
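
That last possibility is worth a quick simulation. In the sketch below, everything is made up: an unobserved "need" drives both the number of beds and the suicide rate, and each bed is assumed to *lower* the rate a little. The fitted slope still comes out positive, which is the pattern in the charts here. It doesn't show that this is what's happening in the real data, only that a positive trend can't rule it out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Unobserved underlying need (invented for this simulation).
need = rng.normal(10.0, 3.0, n)

# Assumption: countries with more need provide more beds.
beds = 2.0 * need + rng.normal(0.0, 2.0, n)

# Assumption: each extra bed lowers the rate a little (true effect is negative).
true_effect = -0.2
rate = need + true_effect * beds + rng.normal(0.0, 1.0, n)

# Naive cross-sectional fit of rate on beds, as in the scatterplots.
slope, intercept = np.polyfit(beds, rate, 1)
print(f"true effect per bed: {true_effect}, fitted slope: {slope:.2f}")
```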

In [32]:
fig = px.scatter(df2016[df2016["general_h_beds_per_100k"] < 25], x="general_h_beds_per_100k", y="suicide_rate", trendline="ols")
fig.show()

With beds in general hospitals, the trend you see very much depends on where you cut the data off. Now let's complete the triad of bed statistics:

In [33]:
fig = px.scatter(df2016[df2016["comres_beds_per_100k"] < 15], x="comres_beds_per_100k", y="suicide_rate", trendline="ols")
fig.show()

A negative trend! I had to look up "community residential facility", and different areas can't seem to agree on the exact definition, so I don't even know what this means. It's also another one where cutting it off again gives you an L shape. But hooray!

To complete the Facilities category I've made up, we'll look at outpatient and day treatment facilities. It could be argued that these are similar enough to make a good combined statistic, but some countries have data on one and not the other in a given year.
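
For what it's worth, here is a sketch of what a combined statistic would run into. The column names are the ones used in this notebook; the values are invented. pandas' `min_count` argument controls how many non-missing values a row needs before the sum is reported.

```python
import numpy as np
import pandas as pd

cols = ["outpatient_facilities_per_100k", "day_treatment_facilities_per_100k"]

# Invented values, for illustration only; NaN marks a missing figure.
toy = pd.DataFrame({
    cols[0]: [1.0, 2.0, np.nan, np.nan],
    cols[1]: [0.5, np.nan, 0.3, np.nan],
})

# Strict: only rows with both figures get a combined value.
strict = toy[cols].sum(axis=1, min_count=2)

# Lenient: a missing figure counts as zero if the other one is present.
lenient = toy[cols].sum(axis=1, min_count=1)

print(pd.DataFrame({"strict": strict, "lenient": lenient}))
```

The strict version throws away every row with a gap, and the lenient version quietly undercounts those rows, so neither is obviously better than keeping the two columns separate.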

In [34]:
fig = px.scatter(df2016[df2016["outpatient_facilities_per_100k"] < 5], x="outpatient_facilities_per_100k", y="suicide_rate", trendline="ols")
fig.show()

It's a good thing the gridline at 10 is there, or I wouldn't be able to tell which direction the trend line is going.

In [35]:
fig = px.scatter(df2016[df2016["day_treatment_facilities_per_100k"] < 2], x="day_treatment_facilities_per_100k", y="suicide_rate", trendline="ols")
fig.show()

Conclusion: mental hospitals and anything related to them are a waste of resources.

### Scatterplots, part 2: Surely psychologists are useful?
But first, psychiatrists.

In [36]:
fig = px.scatter(df2016[df2016["psychiatrists_per_100k"] < 2], x="psychiatrists_per_100k", y="suicide_rate", trendline="ols")
fig.show()

It's kind of sad when you see an R² of .065 and think "well, it's something".
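
To put that number in perspective: for a straight-line fit with one predictor, R² is the square of the Pearson correlation, so the arithmetic is short.

```python
import math

r_squared = 0.065  # the value quoted above

# Size of the correlation (the sign is whatever direction the line slopes).
abs_r = math.sqrt(r_squared)

# Share of the variance in suicide rates the line does not account for.
unexplained = 1 - r_squared

print(f"|r| = {abs_r:.3f}, variance left unexplained = {unexplained:.1%}")
```

That works out to a correlation of about 0.25 in absolute value, with 93.5% of the variance left unexplained.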

In [37]:
fig = px.scatter(df2016[df2016["nurses_per_100k"] < 20], x="nurses_per_100k", y="suicide_rate", trendline="ols")
fig.show()

I'm not seeing much. It makes sense: nurses aren't as strongly associated with suicide prevention as a few other occupations are.

In [38]:
fig = px.scatter(df2016[df2016["social_workers_per_100k"] < 1], x="social_workers_per_100k", y="suicide_rate", trendline="ols")
fig.show()

More than most of them, but still not much.

In [39]:
fig = px.scatter(df2016[df2016["psychologists_per_100k"] < 4], x="psychologists_per_100k", y="suicide_rate", trendline="ols")
fig.show()

If you look at the points rather than just the trend line, you'll see this is actually a different chart from the social workers one. They have their similarities, but they have their differences.

Conclusion: don't pull all psychological education funding. The therapy sector does a little bit, maybe.

### Scatterplots, part 3: Government

In [40]:
fig = px.scatter(df2016[df2016["hospital_budget_pct"] < 5], x="hospital_budget_pct", y="suicide_rate", trendline="ols")
fig.show()

Well, we already knew what good all those mental hospitals are.

In [41]:
fig = px.histogram(df2016, x="standalone_law", y="suicide_rate", barmode="group", histfunc="avg")
fig.show()

Uhhhh.

In [42]:
fig = px.histogram(df2016, x="standalone_policy", y="suicide_rate", barmode="group", histfunc="avg")
fig.show()

Oh, look, having a plan makes it worse. Or, again, those are the countries that need it. Remember that this whole document could have the wrong idea.
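
One more caveat with bars of averages: they hide how many countries sit behind each bar. Here is a sketch on invented numbers of getting the mean and the group size side by side; `standalone_policy` is the notebook's column name, the values are made up.

```python
import pandas as pd

# Invented values, for illustration only.
toy = pd.DataFrame({
    "standalone_policy": [True, True, True, True, False, False],
    "suicide_rate": [12.0, 10.0, 8.0, 14.0, 6.0, 10.0],
})

# Average rate and number of countries for each answer.
summary = toy.groupby("standalone_policy")["suicide_rate"].agg(["mean", "count"])
print(summary)
```

In this toy table the "no policy" average rests on just two countries, which is the kind of thing worth checking before reading much into the gap.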

## Conclusion
If you're considering suicide, talk to a professional. It's the only thing that does anything.