# CPSC 368 - Databases in Data Science
### Mental Health Quality Discrepancies Between Men and Women in Tech
By Chloe Zandberg and Olivia Lam

### How do different genders in the tech industry describe whether they would disclose their mental health to a potential employer?

This question asks whether women and men working in the technology industry differ in how comfortable they are disclosing a mental health condition to a potential employer. Whether or not a discrepancy is found, the question probes a possible stigma around mental health in an industry where aptitude and technical skill are highly valued and a mental health condition can be seen as a barrier to both.

In [23]:
import sqlite3
import pandas as pd
import altair as alt

In [24]:
# Connect to the database so we can run our query and pull the desired data
connection = sqlite3.connect('../database/my_database.db')

q2_query = """
SELECT u.gender, m.mh_in_interview, COUNT(*) AS count_responses
FROM Users u
JOIN Mental_health m ON u.userID = m.userID
GROUP BY u.gender, m.mh_in_interview;
"""

# Store it as a dataframe so we can save as a csv
oli_df = pd.read_sql_query(q2_query, connection)
oli_df.to_csv('chloe-imp.csv', index=False)

# Close the connection
connection.close()

print("CSV file has been saved to your local directory.")

CSV file has been saved to your local directory.
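To illustrate what the join and `GROUP BY` above return, here is a minimal sketch that runs the same query against an in-memory SQLite database. The table definitions and the four rows are hypothetical stand-ins (the real schema in `my_database.db` may differ), and the `ORDER BY` is added only so the output order is predictable.

```python
import sqlite3

# In-memory database with a hypothetical, simplified schema (an assumption,
# not the real layout of my_database.db).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (userID INTEGER PRIMARY KEY, gender TEXT);
CREATE TABLE Mental_health (userID INTEGER, mh_in_interview TEXT);
INSERT INTO Users VALUES (1, 'Female'), (2, 'Female'), (3, 'Male'), (4, 'Male');
INSERT INTO Mental_health VALUES (1, 'No'), (2, 'No'), (3, 'Maybe'), (4, 'No');
""")

# Same query shape as above: one row per (gender, answer) pair with its count.
query = """
SELECT u.gender, m.mh_in_interview, COUNT(*) AS count_responses
FROM Users u
JOIN Mental_health m ON u.userID = m.userID
GROUP BY u.gender, m.mh_in_interview
ORDER BY u.gender, m.mh_in_interview;
"""
rows = conn.execute(query).fetchall()
conn.close()

print(rows)
# [('Female', 'No', 2), ('Male', 'Maybe', 1), ('Male', 'No', 1)]
```

Combinations with no respondents (for example, Female and Yes in this toy data) do not appear in the result at all, rather than appearing with a count of 0.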


In [35]:
df = pd.read_csv('chloe-imp.csv')
df

| | gender | mh_in_interview | count_responses |
|---|---|---|---|
| 0 | Female | Maybe | 63 |
| 1 | Female | No | 419 |
| 2 | Female | Yes | 7 |
| 3 | Male | Maybe | 301 |
| 4 | Male | No | 937 |
| 5 | Male | Yes | 73 |


### Creating visualization

I will use Altair, a Python visualization library, to build the chart that answers our question. Because I am comparing proportions across genders, I will use a stacked bar chart normalized so that each bar sums to 1.
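Since the two genders have very different numbers of respondents, it may help to see the proportions as numbers before plotting them. The sketch below recomputes them with pandas from the counts in the query output above; the `counts`, `totals`, and `proportion` names are introduced here for illustration only and are not used elsewhere in the notebook.

```python
import pandas as pd

# Counts copied from the query output shown earlier in the notebook.
counts = pd.DataFrame({
    "gender": ["Female"] * 3 + ["Male"] * 3,
    "mh_in_interview": ["Maybe", "No", "Yes"] * 2,
    "count_responses": [63, 419, 7, 301, 937, 73],
})

# Total responses per gender, repeated on every row of that gender.
totals = counts.groupby("gender")["count_responses"].transform("sum")

# Share of each answer within its gender (each gender's shares sum to 1).
counts["proportion"] = (counts["count_responses"] / totals).round(3)

print(counts)
```

With these counts there are 489 female and 1311 male responses, so "Yes" accounts for roughly 1.4% of female responses (7 of 489) and roughly 5.6% of male responses (73 of 1311).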

In [44]:
mh_order = ['Yes', 'Maybe', 'No']

chart = alt.Chart(
    df,
    title=alt.Title(
        'How do men and women in the tech industry describe whether they would disclose their mental health to a potential employer?',
        subtitle='Mental health condition disclosure in an interview by gender for tech employees, 2014 to 2019'
    )
).mark_bar().encode(
    y=alt.Y('gender:N').title('Gender of respondent'),
    # Normalize each bar so segment widths are proportions within a gender
    x=alt.X('count_responses:Q').stack('normalize').title('Proportion of responses'),
    # The custom order applies to the categorical color channel, not the quantitative x channel
    color=alt.Color('mh_in_interview:O', sort=mh_order).legend(title=['Would they disclose their', 'mental health condition', 'to a potential employer?'])
)
chart.properties(width=500, height=300)