## About the dataset

The data we work with come from OSMI (Open Sourcing Mental Illness), a non-profit organisation dedicated to raising awareness of mental illness in the tech community.
Since 2014, OSMI has run a "Mental Health in Tech" survey every year and kindly makes the data freely available on its website: https://osmihelp.org/research

For this session we use the 2016 dataset because, with 1433 participants and 63 items, it is the largest one. Before we look at the data, we need to import a few packages.

### Step 1: Import the relevant packages

pandas is the most important Python package for importing, preprocessing, and exploring datasets.

With "as" we define an alias, so that in the code we only need to write pd instead of pandas to use a function from this package. This is not strictly necessary, but very convenient.

We can also import only specific names from a package by specifying them with "from" (here, for example, only the "Path" class is imported from the "pathlib" package).

In [1]:
import pandas as pd 
import numpy as np
import helpers

import seaborn as sns
import matplotlib.pyplot as plt

# Display plots inside the notebook
%matplotlib inline

from pathlib import Path

# Display all dataframe columns in outputs (it has 63 columns, which is wider than the notebook)
# This sets it up to display with a horizontal scroll instead of hiding the middle columns
pd.set_option('display.max_columns', 100) 

### Step 2: Read in the dataset

First we need to read in the dataset (just as in R). We use the read_csv function for this. It is important that you give the correct path to the dataset.
Unfortunately, Windows differs here from most other operating systems (a short digression on this: https://docs.microsoft.com/de-de/archive/blogs/larryosterman/why-is-the-dos-path-character).

To avoid errors, we use the Path class from the pathlib package: it follows the path conventions of the operating system the code runs on and formats the path accordingly.

In [2]:
data_folder = Path("data_tech/") 

file_to_open = data_folder / "OSMI_2016.csv" # file_to_open now holds the path to our .csv file

In [3]:
df_2016 = pd.read_csv(file_to_open) # read the dataset and store it in the variable df_2016
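As a minimal sketch of why pathlib helps here: the "/" operator joins path components, and the separator that ends up in the path depends on the path flavour. `PurePosixPath` and `PureWindowsPath` are not used in this notebook; they appear below only so that both renderings can be shown on any machine.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Joining with "/" works the same way on every operating system.
posix_path = PurePosixPath("data_tech") / "OSMI_2016.csv"
windows_path = PureWindowsPath("data_tech") / "OSMI_2016.csv"

print(posix_path)    # data_tech/OSMI_2016.csv
print(windows_path)  # data_tech\OSMI_2016.csv

# Path picks the flavour that matches the machine running the code.
file_to_open = Path("data_tech") / "OSMI_2016.csv"
print(file_to_open.name)  # OSMI_2016.csv
```

Because the code never contains a hard-coded separator, the same cell should run unchanged on Windows, macOS, and Linux.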

### Step 3: Exploratory Data Analysis (EDA): exploring and understanding the data

In [4]:
df_2016.shape # how many rows and columns does our dataset have? (rows, columns)

(1433, 63)

In [5]:
df_2016.head(5) # show the first 5 rows of the dataset

Unnamed: 0,Are you self-employed?,How many employees does your company or organization have?,Is your employer primarily a tech company/organization?,Is your primary role within your company related to tech/IT?,Does your employer provide mental health benefits as part of healthcare coverage?,Do you know the options for mental health care available under your employer-provided coverage?,"Has your employer ever formally discussed mental health (for example, as part of a wellness campaign or other official communication)?",Does your employer offer resources to learn more about mental health concerns and options for seeking help?,Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources provided by your employer?,"If a mental health issue prompted you to request a medical leave from work, asking for that leave would be:",Do you think that discussing a mental health disorder with your employer would have negative consequences?,Do you think that discussing a physical health issue with your employer would have negative consequences?,Would you feel comfortable discussing a mental health disorder with your coworkers?,Would you feel comfortable discussing a mental health disorder with your direct supervisor(s)?,Do you feel that your employer takes mental health as seriously as physical health?,Have you heard of or observed negative consequences for co-workers who have been open about mental health issues in your workplace?,Do you have medical coverage which includes treatment of mental health issues?,Do you know local or online resources to seek help for a mental health disorder?,"If you have been diagnosed or treated for a mental health disorder, do you ever reveal this to clients or business contacts?","If you have revealed a mental health issue to a client or business contact, do you believe this has impacted you negatively?","If you have been diagnosed or treated for a mental health disorder, do you ever reveal 
this to coworkers or employees?","If you have revealed a mental health issue to a coworker or employee, do you believe this has impacted you negatively?",Do you believe your productivity is ever affected by a mental health issue?,"If yes, what percentage of your work time (time performing primary or secondary job functions) is affected by a mental health issue?",Do you have previous employers?,Have your previous employers provided mental health benefits?,Were you aware of the options for mental health care provided by your previous employers?,Did your previous employers ever formally discuss mental health (as part of a wellness campaign or other official communication)?,Did your previous employers provide resources to learn more about mental health issues and how to seek help?,Was your anonymity protected if you chose to take advantage of mental health or substance abuse treatment resources with previous employers?,Do you think that discussing a mental health disorder with previous employers would have negative consequences?,Do you think that discussing a physical health issue with previous employers would have negative consequences?,Would you have been willing to discuss a mental health issue with your previous co-workers?,Would you have been willing to discuss a mental health issue with your direct supervisor(s)?,Did you feel that your previous employers took mental health as seriously as physical health?,Did you hear of or observe negative consequences for co-workers with mental health issues in your previous workplaces?,Would you be willing to bring up a physical health issue with a potential employer in an interview?,Why or why not?,Would you bring up a mental health issue with a potential employer in an interview?,Why or why not?.1,Do you feel that being identified as a person with a mental health issue would hurt your career?,Do you think that team members/co-workers would view you more negatively if they knew you suffered from a mental health issue?,How 
willing would you be to share with friends and family that you have a mental illness?,Have you observed or experienced an unsupportive or badly handled response to a mental health issue in your current or previous workplace?,Have your observations of how another individual who discussed a mental health disorder made you less likely to reveal a mental health issue yourself in your current workplace?,Do you have a family history of mental illness?,Have you had a mental health disorder in the past?,Do you currently have a mental health disorder?,"If yes, what condition(s) have you been diagnosed with?","If maybe, what condition(s) do you believe you have?",Have you been diagnosed with a mental health condition by a medical professional?,"If so, what condition(s) were you diagnosed with?",Have you ever sought treatment for a mental health issue from a mental health professional?,"If you have a mental health issue, do you feel that it interferes with your work when being treated effectively?","If you have a mental health issue, do you feel that it interferes with your work when NOT being treated effectively?",What is your age?,What is your gender?,What country do you live in?,What US state or territory do you live in?,What country do you work in?,What US state or territory do you work in?,Which of the following best describes your work position?,Do you work remotely?
0,0,26-100,1.0,,Not eligible for coverage / N/A,,No,No,I don't know,Very easy,No,No,Maybe,Yes,I don't know,No,,,,,,,,,1,"No, none did",N/A (not currently aware),I don't know,None did,I don't know,Some of them,None of them,Some of my previous employers,Some of my previous employers,I don't know,None of them,Maybe,,Maybe,,Maybe,"No, I don't think they would",Somewhat open,No,,No,Yes,No,,,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",0,Not applicable to me,Not applicable to me,39,Male,United Kingdom,,United Kingdom,,Back-end Developer,Sometimes
1,0,6-25,1.0,,No,Yes,Yes,Yes,Yes,Somewhat easy,No,No,Maybe,Yes,Yes,No,,,,,,,,,1,"Yes, they all did",I was aware of some,None did,Some did,"Yes, always",None of them,None of them,"No, at none of my previous employers",Some of my previous employers,Some did,None of them,Maybe,It would depend on the health issue. If there ...,No,While mental health has become a more prominen...,"No, I don't think it would","No, I don't think they would",Somewhat open,No,,Yes,Yes,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",1,Rarely,Sometimes,29,male,United States of America,Illinois,United States of America,Illinois,Back-end Developer|Front-end Developer,Never
2,0,6-25,1.0,,No,,No,No,I don't know,Neither easy nor difficult,Maybe,No,Maybe,Maybe,I don't know,No,,,,,,,,,1,"No, none did",N/A (not currently aware),None did,Some did,I don't know,I don't know,Some of them,Some of my previous employers,I don't know,I don't know,Some of them,Yes,"They would provable need to know, to Judge if ...",Yes,"Stigma, mainly.",Maybe,Maybe,Somewhat open,Maybe/Not sure,Yes,No,Maybe,No,,,No,,1,Not applicable to me,Not applicable to me,38,Male,United Kingdom,,United Kingdom,,Back-end Developer,Always
3,1,,,,,,,,,,,,,,,,1.0,"Yes, I know several","Sometimes, if it comes up",I'm not sure,"Sometimes, if it comes up",I'm not sure,Yes,1-25%,1,Some did,N/A (not currently aware),None did,None did,I don't know,Some of them,Some of them,Some of my previous employers,Some of my previous employers,I don't know,Some of them,Yes,"old back injury, doesn't cause me many issues ...",Maybe,would not if I was not 100% sure that the disc...,"Yes, I think it would",Maybe,Neutral,No,,No,Yes,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",1,Sometimes,Sometimes,43,male,United Kingdom,,United Kingdom,,Supervisor/Team Lead,Sometimes
4,0,6-25,0.0,1.0,Yes,Yes,No,No,No,Neither easy nor difficult,Yes,Maybe,Maybe,No,No,No,,,,,,,,,1,I don't know,N/A (not currently aware),Some did,None did,I don't know,Some of them,Some of them,"No, at none of my previous employers",Some of my previous employers,Some did,Some of them,Maybe,Depending on the interview stage and whether I...,No,I don't know,"Yes, I think it would",Maybe,Somewhat open,"Yes, I experienced",Yes,Yes,Yes,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",1,Sometimes,Sometimes,43,Female,United States of America,Illinois,United States of America,Illinois,Executive Leadership|Supervisor/Team Lead|Dev ...,Sometimes


In [6]:
df_2016.tail(3) # show the last 3 rows of the dataset

Unnamed: 0,Are you self-employed?,How many employees does your company or organization have?,Is your employer primarily a tech company/organization?,Is your primary role within your company related to tech/IT?,Does your employer provide mental health benefits as part of healthcare coverage?,Do you know the options for mental health care available under your employer-provided coverage?,"Has your employer ever formally discussed mental health (for example, as part of a wellness campaign or other official communication)?",Does your employer offer resources to learn more about mental health concerns and options for seeking help?,Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources provided by your employer?,"If a mental health issue prompted you to request a medical leave from work, asking for that leave would be:",Do you think that discussing a mental health disorder with your employer would have negative consequences?,Do you think that discussing a physical health issue with your employer would have negative consequences?,Would you feel comfortable discussing a mental health disorder with your coworkers?,Would you feel comfortable discussing a mental health disorder with your direct supervisor(s)?,Do you feel that your employer takes mental health as seriously as physical health?,Have you heard of or observed negative consequences for co-workers who have been open about mental health issues in your workplace?,Do you have medical coverage which includes treatment of mental health issues?,Do you know local or online resources to seek help for a mental health disorder?,"If you have been diagnosed or treated for a mental health disorder, do you ever reveal this to clients or business contacts?","If you have revealed a mental health issue to a client or business contact, do you believe this has impacted you negatively?","If you have been diagnosed or treated for a mental health disorder, do you ever reveal 
this to coworkers or employees?","If you have revealed a mental health issue to a coworker or employee, do you believe this has impacted you negatively?",Do you believe your productivity is ever affected by a mental health issue?,"If yes, what percentage of your work time (time performing primary or secondary job functions) is affected by a mental health issue?",Do you have previous employers?,Have your previous employers provided mental health benefits?,Were you aware of the options for mental health care provided by your previous employers?,Did your previous employers ever formally discuss mental health (as part of a wellness campaign or other official communication)?,Did your previous employers provide resources to learn more about mental health issues and how to seek help?,Was your anonymity protected if you chose to take advantage of mental health or substance abuse treatment resources with previous employers?,Do you think that discussing a mental health disorder with previous employers would have negative consequences?,Do you think that discussing a physical health issue with previous employers would have negative consequences?,Would you have been willing to discuss a mental health issue with your previous co-workers?,Would you have been willing to discuss a mental health issue with your direct supervisor(s)?,Did you feel that your previous employers took mental health as seriously as physical health?,Did you hear of or observe negative consequences for co-workers with mental health issues in your previous workplaces?,Would you be willing to bring up a physical health issue with a potential employer in an interview?,Why or why not?,Would you bring up a mental health issue with a potential employer in an interview?,Why or why not?.1,Do you feel that being identified as a person with a mental health issue would hurt your career?,Do you think that team members/co-workers would view you more negatively if they knew you suffered from a mental health issue?,How 
willing would you be to share with friends and family that you have a mental illness?,Have you observed or experienced an unsupportive or badly handled response to a mental health issue in your current or previous workplace?,Have your observations of how another individual who discussed a mental health disorder made you less likely to reveal a mental health issue yourself in your current workplace?,Do you have a family history of mental illness?,Have you had a mental health disorder in the past?,Do you currently have a mental health disorder?,"If yes, what condition(s) have you been diagnosed with?","If maybe, what condition(s) do you believe you have?",Have you been diagnosed with a mental health condition by a medical professional?,"If so, what condition(s) were you diagnosed with?",Have you ever sought treatment for a mental health issue from a mental health professional?,"If you have a mental health issue, do you feel that it interferes with your work when being treated effectively?","If you have a mental health issue, do you feel that it interferes with your work when NOT being treated effectively?",What is your age?,What is your gender?,What country do you live in?,What US state or territory do you live in?,What country do you work in?,What US state or territory do you work in?,Which of the following best describes your work position?,Do you work remotely?
1430,0,100-500,1.0,,Yes,Yes,Yes,Yes,I don't know,Somewhat difficult,Maybe,Maybe,Yes,Yes,I don't know,Yes,,,,,,,,,1,Some did,I was aware of some,None did,Some did,Sometimes,"Yes, all of them",Some of them,Some of my previous employers,Some of my previous employers,None did,Some of them,Maybe,Fear that doing so would cause the employer to...,No,Fear that the employer would consider addition...,"Yes, it has","No, I don't think they would",Somewhat open,"Yes, I observed",Yes,Yes,Yes,Maybe,,"Anxiety Disorder (Generalized, Social, Phobia,...",Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",1,Rarely,Sometimes,52,Male,United States of America,Georgia,United States of America,Georgia,Back-end Developer,Sometimes
1431,0,100-500,0.0,1.0,I don't know,I am not sure,No,Yes,I don't know,Somewhat difficult,Maybe,No,Maybe,Yes,No,No,,,,,,,,,1,"No, none did",N/A (not currently aware),None did,None did,I don't know,"Yes, all of them",None of them,"No, at none of my previous employers","No, at none of my previous employers",None did,None of them,Maybe,Stigma with some diseases,No,Feels like I'm making a mountain out of a mole...,"No, I don't think it would","No, I don't think they would",Somewhat open,"Yes, I experienced",Maybe,Yes,Maybe,Yes,"Anxiety Disorder (Generalized, Social, Phobia,...",,Yes,"Mood Disorder (Depression, Bipolar Disorder, etc)",0,Sometimes,Often,30,Female,United States of America,Nebraska,United States of America,Nebraska,DevOps/SysAdmin,Sometimes
1432,0,100-500,1.0,,Yes,No,No,No,I don't know,Very difficult,Maybe,No,Maybe,Maybe,No,No,,,,,,,,,0,,,,,,,,,,,,Maybe,,No,,"Yes, I think it would","No, I don't think they would",Somewhat open,Maybe/Not sure,No,I don't know,Yes,Yes,Obsessive-Compulsive Disorder|Eating Disorder ...,,No,,0,Not applicable to me,Often,25,non-binary,Canada,,Canada,,Other,Sometimes


The **.info()** method shows the data types of the columns and how many non-null (i.e. non-missing) values each column has.

In [7]:
df_2016.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1433 entries, 0 to 1432
Data columns (total 63 columns):
Are you self-employed?                                                                                                                                                              1433 non-null int64
How many employees does your company or organization have?                                                                                                                          1146 non-null object
Is your employer primarily a tech company/organization?                                                                                                                             1146 non-null float64
Is your primary role within your company related to tech/IT?                                                                                                                        263 non-null float64
Does your employer provide mental health benefits as part of healthcare coverage?        

The most common data type here is "object". To be able to do calculations with the data, we still need to transform the items of type object into numeric values. In the next step we should also take a closer look at the missing values.
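A minimal sketch of two common ways to turn text answers into numbers. The toy column and the numeric codes are assumptions made for illustration; they are not the recoding used later in this notebook.

```python
import pandas as pd

# Toy column imitating a survey item with text answers (not the real data).
answers = pd.Series(["Yes", "No", "Maybe", None, "Yes"], name="toy_item")

# Option 1: explicit mapping; the numeric codes are an arbitrary choice.
answer_codes = {"No": 0, "Maybe": 1, "Yes": 2}
as_numbers = answers.map(answer_codes)

# Option 2: let pandas assign integer codes (categories are sorted
# alphabetically); missing values receive the code -1.
as_category_codes = answers.astype("category").cat.codes

print(as_numbers.tolist())
print(as_category_codes.tolist())
```

An explicit mapping is more work but keeps the order of the answers meaningful, whereas automatic category codes simply follow alphabetical order.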

In [13]:
missing_df = helpers.missing_values_table(df_2016)
missing_df

The dataset has 63 columns.
There are 44 columns that have missing values.


Unnamed: 0,Missing Values,% of Total Values
"If you have revealed a mental health issue to a client or business contact, do you believe this has impacted you negatively?",1289,90.0
"If yes, what percentage of your work time (time performing primary or secondary job functions) is affected by a mental health issue?",1229,85.8
Is your primary role within your company related to tech/IT?,1170,81.6
Do you have medical coverage which includes treatment of mental health issues?,1146,80.0
Do you believe your productivity is ever affected by a mental health issue?,1146,80.0
"If you have revealed a mental health issue to a coworker or employee, do you believe this has impacted you negatively?",1146,80.0
"If you have been diagnosed or treated for a mental health disorder, do you ever reveal this to coworkers or employees?",1146,80.0
"If you have been diagnosed or treated for a mental health disorder, do you ever reveal this to clients or business contacts?",1146,80.0
Do you know local or online resources to seek help for a mental health disorder?,1146,80.0
"If maybe, what condition(s) do you believe you have?",1111,77.5


Of the 63 columns, 44 have missing values, which is quite a lot. The share of missing values ranges from 0.2% to 90%. A rough rule of thumb says that columns with more than 50% missing values should be excluded entirely. Another to-do for step 4 is therefore cleaning up the missing values.
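The helpers module is not shown in this notebook, so the following is only a sketch of how such a missing-values table might be built and how the 50% rule of thumb could be applied. The function name `missing_table` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def missing_table(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for helpers.missing_values_table."""
    missing_counts = frame.isna().sum()
    missing_percent = (100 * missing_counts / len(frame)).round(1)
    table = pd.DataFrame(
        {"Missing Values": missing_counts, "% of Total Values": missing_percent}
    )
    # Keep only columns that actually have missing values, worst first.
    table = table[table["Missing Values"] > 0]
    return table.sort_values("% of Total Values", ascending=False)


# Toy data (not the OSMI survey).
toy = pd.DataFrame(
    {
        "complete": [1, 2, 3, 4],
        "mostly_missing": [np.nan, np.nan, np.nan, 1.0],
        "one_missing": ["a", None, "c", "d"],
    }
)

overview = missing_table(toy)
print(overview)

# Apply the 50% rule of thumb: drop columns above the threshold.
too_sparse = overview.index[overview["% of Total Values"] > 50]
toy_reduced = toy.drop(columns=too_sparse)
print(toy_reduced.columns.tolist())
```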

In [34]:
for column in df_2016.columns.tolist(): 
    print([column, df_2016[column].nunique()])

['Are you self-employed?', 2]
['How many employees does your company or organization have?', 6]
['Is your employer primarily a tech company/organization?', 2]
['Is your primary role within your company related to tech/IT?', 2]
['Does your employer provide mental health benefits as part of healthcare coverage?', 4]
['Do you know the options for mental health care available under your employer-provided coverage?', 3]
['Has your employer ever formally discussed mental health (for example, as part of a wellness campaign or other official communication)?', 3]
['Does your employer offer resources to learn more about mental health concerns and options for seeking help?', 3]
['Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources provided by your employer?', 3]
['If a mental health issue prompted you to request a medical leave from work, asking for that leave would be:', 6]
['Do you think that discussing a mental health disorder wit
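As a side note, the same counts can be obtained without an explicit loop, since `DataFrame.nunique()` returns one value per column. A small sketch on toy data (the column names are made up):

```python
import pandas as pd

# Toy data (not the OSMI survey).
toy = pd.DataFrame(
    {
        "employed": [0, 1, 0, 1],
        "company_size": ["1-5", "6-25", None, "6-25"],
    }
)

# Number of distinct non-missing values per column.
unique_counts = toy.nunique()
print(unique_counts)
```

By default, missing values are not counted as a separate value.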

#### Checking the age and gender columns

The cell below uses df, the DataFrame with the renamed columns that is created in step 4.1; DEM1 is the age column.

In [45]:
df.DEM1.describe()

count    1433.000000
mean       34.286113
std        11.290931
min         3.000000
25%        28.000000
50%        33.000000
75%        39.000000
max       323.000000
Name: DEM1, dtype: float64

**To-dos for step 4:**
- Rename columns
- Handle missing values
- Transform gender and age
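The summary of DEM1 above shows a minimum of 3 and a maximum of 323, which cannot be real ages of survey participants. One possible way to handle this is sketched below. The bounds of 18 and 75 and the median fill are assumptions for illustration, not the transformation this notebook settles on.

```python
import pandas as pd

# Toy ages including the implausible extremes seen above (3 and 323).
ages = pd.Series([39, 29, 3, 323, 43], name="DEM1", dtype="float64")

# Assumed plausible range for working adults; adjust as needed.
MIN_AGE, MAX_AGE = 18, 75

# Values outside the range become NaN, the rest are kept
# (between() includes both bounds by default).
cleaned = ages.where(ages.between(MIN_AGE, MAX_AGE))

# One possible follow-up: fill the gaps with the median of the valid ages.
imputed = cleaned.fillna(cleaned.median())
print(imputed.tolist())
```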

### Step 4: Preparing the data for analysis

#### 4.1 Renaming columns

In [9]:
rename_dict = {'Are you self-employed?': "EMP1", 
               'How many employees does your company or organization have?': "EMP2", 
              'Is your employer primarily a tech company/organization?': "EMP3", 
              'Is your primary role within your company related to tech/IT?': "EMP4",
              'Does your employer provide mental health benefits as part of healthcare coverage?':"MENT1",
              'Do you know the options for mental health care available under your employer-provided coverage?': "MENT2",
              'Has your employer ever formally discussed mental health (for example, as part of a wellness campaign or other official communication)?': "MENT3",
               'Does your employer offer resources to learn more about mental health concerns and options for seeking help?': "MENT4",
               'Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources provided by your employer?': "MENT5",
               'If a mental health issue prompted you to request a medical leave from work, asking for that leave would be:': "MENT6",
               'Do you think that discussing a mental health disorder with your employer would have negative consequences?': "MENT7",
               'Do you think that discussing a physical health issue with your employer would have negative consequences?': "MENT8",
               'Would you feel comfortable discussing a mental health disorder with your coworkers?': "MENT9",
               'Would you feel comfortable discussing a mental health disorder with your direct supervisor(s)?': "MENT10",
               'Do you feel that your employer takes mental health as seriously as physical health?': "MENT11",
               'Have you heard of or observed negative consequences for co-workers who have been open about mental health issues in your workplace?': "MENT12",
                'Do you have medical coverage which includes treatment of mental health issues?': "MENT13",
               'Do you know local or online resources to seek help for a mental health disorder?': "MENT14",
               'If you have been diagnosed or treated for a mental health disorder, do you ever reveal this to clients or business contacts?': "MENT15",
               'If you have revealed a mental health issue to a client or business contact, do you believe this has impacted you negatively?': "MENT16",
               'If you have been diagnosed or treated for a mental health disorder, do you ever reveal this to coworkers or employees?': "MENT17",
               'If you have revealed a mental health issue to a coworker or employee, do you believe this has impacted you negatively?': "MENT18",
               'Do you believe your productivity is ever affected by a mental health issue?': "MENT19",
               'If yes, what percentage of your work time (time performing primary or secondary job functions) is affected by a mental health issue?': "MENT20",
               'Do you have previous employers?': "PREMP1",
               'Have your previous employers provided mental health benefits?': "PREMP2",
               'Were you aware of the options for mental health care provided by your previous employers?': "PREMP3",
               'Did your previous employers ever formally discuss mental health (as part of a wellness campaign or other official communication)?': "PREMP4",
               'Did your previous employers provide resources to learn more about mental health issues and how to seek help?': "PREMP5",
               'Was your anonymity protected if you chose to take advantage of mental health or substance abuse treatment resources with previous employers?': "PREMP6",
               'Do you think that discussing a mental health disorder with previous employers would have negative consequences?': "PREMP7",
               'Do you think that discussing a physical health issue with previous employers would have negative consequences?': "PREMP8",
               'Would you have been willing to discuss a mental health issue with your previous co-workers?': "PREMP9",
               'Would you have been willing to discuss a mental health issue with your direct supervisor(s)?': "PREMP10",
               'Did you feel that your previous employers took mental health as seriously as physical health?': "PREMP11",
               'Did you hear of or observe negative consequences for co-workers with mental health issues in your previous workplaces?': "PREMP12",
               'Would you be willing to bring up a physical health issue with a potential employer in an interview?': "OPEN1",
               'Why or why not?': "OPEN2",
               'Would you bring up a mental health issue with a potential employer in an interview?': "OPEN3",
               'Why or why not?.1': "OPEN4",
               'Do you feel that being identified as a person with a mental health issue would hurt your career?': "OPEN5",
               'Do you think that team members/co-workers would view you more negatively if they knew you suffered from a mental health issue?': "OPEN6",
               'How willing would you be to share with friends and family that you have a mental illness?': "OPEN7",
               'Have you observed or experienced an unsupportive or badly handled response to a mental health issue in your current or previous workplace?': "OPEN8",
               'Have your observations of how another individual who discussed a mental health disorder made you less likely to reveal a mental health issue yourself in your current workplace?': "OPEN9",
               'Do you have a family history of mental illness?': "DIAG1",
               'Have you had a mental health disorder in the past?': "DIAG2",
               'Do you currently have a mental health disorder?': "DIAG3",
               'If yes, what condition(s) have you been diagnosed with?': "DIAG4",
               'If maybe, what condition(s) do you believe you have?': "DIAG5",
               'Have you been diagnosed with a mental health condition by a medical professional?': "DIAG6",
               'If so, what condition(s) were you diagnosed with?': "DIAG7",
               'Have you ever sought treatment for a mental health issue from a mental health professional?': "DIAG8",
               'If you have a mental health issue, do you feel that it interferes with your work when being treated effectively?': "DIAG9",
               'If you have a mental health issue, do you feel that it interferes with your work when NOT being treated effectively?': "DIAG10",
               'What is your age?': "DEM1", 
               'What is your gender?': "DEM2",
               'What country do you live in?': "DEM3",
               'What US state or territory do you live in?': "DEM4",
               'What country do you work in?': "DEM5",
               'What US state or territory do you work in?': "DEM6",
               'Which of the following best describes your work position?': "DEM7",
               'Do you work remotely?': "DEM8"
              }


In [10]:
df = df_2016.rename(columns = rename_dict, errors = "raise")

In [11]:
helpers.search_question(rename_dict, "DIAG1")

'Do you have a family history of mental illness?'
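The implementation of helpers.search_question is not shown here. Judging from the output above, it looks up the original question for a short column name; a minimal sketch of such a reverse lookup (assumed behaviour, with a shortened dictionary) could look like this:

```python
from typing import Dict


def search_question(mapping: Dict[str, str], short_name: str) -> str:
    """Return the original question text for a short column name (assumed behaviour)."""
    # Invert the dictionary: short name -> full question.
    reverse = {short: question for question, short in mapping.items()}
    return reverse[short_name]


# Two entries taken from rename_dict, enough for a demonstration.
toy_rename_dict = {
    "Do you have a family history of mental illness?": "DIAG1",
    "What is your age?": "DEM1",
}

print(search_question(toy_rename_dict, "DIAG1"))
```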

#### Recategorizing the gender variable

In [35]:
df['DEM2'] = df["DEM2"].str.lower()

In [36]:
df.DEM2.unique()

array(['male', 'male ', 'female', 'm', 'i identify as female.', 'female ',
       'bigender', 'non-binary', 'female assigned at birth ', 'f',
       'woman', 'man', 'fm', 'cis female ', 'transitioned, m2f',
       'genderfluid (born female)', 'other/transfeminine',
       'female or multi-gender femme', 'female/woman', 'cis male',
       'male.', 'androgynous', 'male 9:1 female, roughly', nan,
       'male (cis)', 'other', 'nb masculine', 'cisgender female',
       'sex is male', 'none of your business', 'genderqueer', 'human',
       'genderfluid', 'enby', 'malr', 'genderqueer woman', 'mtf', 'queer',
       'agender', 'dude', 'fluid',
       "i'm a man why didn't you make this a drop down question. you should of asked sex? and i would of answered yes please. seriously how much text can this take? ",
       'mail', 'm|', 'male/genderqueer', 'fem', 'nonbinary',
       'female (props for making this a freeform field, though)',
       ' female', 'unicorn', 'male (trans, ftm)', 'cis-woman'

In [41]:
# Raw answers (already lower-cased) grouped into three categories
male = ['male', 'male ', 'm', 'man', 'cis male', 'male.', 'male 9:1 female, roughly', 'male (cis)', 'nb masculine', 'sex is male', 'malr',
        'dude', "i'm a man why didn't you make this a drop down question. you should of asked sex? and i would of answered yes please. seriously how much text can this take? ",
        'mail', 'm|', 'cisdude', 'cis man']
female = ['female', 'i identify as female.', 'female ', 'female assigned at birth ', 'f',
          'woman', 'fm', 'cis female ', 'female or multi-gender femme', 'female/woman', 'cisgender female', 'fem',
          'female (props for making this a freeform field, though)', ' female', 'cis-woman']
other = ['bigender', 'female-bodied; no feelings about gender', 'non-binary', 'transitioned, m2f', 'genderfluid (born female)', 'other/transfeminine', 'androgynous',
         'other', 'none of your business', 'genderflux demi-girl', 'genderqueer', 'human', 'genderfluid', 'enby', 'genderqueer woman',
         'mtf', 'queer', 'agender', 'fluid', 'male/genderqueer', 'nonbinary', 'unicorn', 'male (trans, ftm)',
         'afab', 'transgender woman']

In [42]:
df['DEM2'] = df['DEM2'].apply(lambda x:"male" if x in male else x)
df['DEM2'] = df['DEM2'].apply(lambda x:"female" if x in female else x)
df['DEM2'] = df['DEM2'].apply(lambda x:"other" if x in other else x)

In [43]:
df.DEM2.unique()

array(['male', 'female', 'other', nan], dtype=object)
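The three apply calls above can also be expressed as a single dictionary lookup. The sketch below uses shortened, illustrative lists and toy answers; it is an alternative formulation, not the code used in this notebook.

```python
import pandas as pd

# Short illustrative lists; the notebook's full lists would go here.
male = ["male", "m", "man"]
female = ["female", "f", "woman"]
other = ["non-binary", "genderqueer"]

# Build one lookup table: raw answer -> category.
category_of = {}
for label, raw_values in (("male", male), ("female", female), ("other", other)):
    for raw in raw_values:
        category_of[raw] = label

# Toy answers (not the real data).
toy_gender = pd.Series(["Male ", "f", "Genderqueer", None, "woman"])

# Normalise case and surrounding whitespace, then translate via the lookup.
normalised = toy_gender.str.lower().str.strip()
recoded = normalised.map(category_of)
print(recoded.tolist())
```

One difference worth keeping in mind: `map` turns every answer that is not in the lookup into a missing value, whereas the apply calls above leave unlisted answers unchanged.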

### Drop and Clean Columns