# Analyzing Relationships with Machine Learning

By: Oscar Ko

This notebook analyzes the How Couples Meet and Stay Together 2017 (HCMST 2017) dataset on relationships from Stanford:

https://data.stanford.edu/hcmst2017

---

# Imports and Data

In [2]:
import numpy as np
import pandas as pd

import warnings
warnings.filterwarnings('ignore')


imported_data = pd.read_stata("data/HCMST 2017 fresh sample for public sharing draft v1.1.dta")

imported_data.shape

(3510, 285)

In [3]:
imported_data.info(verbose=True, show_counts=True)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3510 entries, 0 to 3509
Data columns (total 285 columns):
 #    Column                            Non-Null Count  Dtype   
---   ------                            --------------  -----   
 0    CaseID                            3510 non-null   int16   
 1    CASEID_NEW                        3510 non-null   int32   
 2    qflag                             3510 non-null   category
 3    weight1                           2994 non-null   float64 
 4    weight1_freqwt                    2994 non-null   float32 
 5    weight2                           551 non-null    float64 
 6    weight1a                          3110 non-null   float64 
 7    weight1a_freqwt                   3110 non-null   float32 
 8    weight_combo                      3510 non-null   float32 
 9    weight_combo_freqwt               3510 non-null   float32 
 10   duration                          3510 non-null   int16   
 11   speed_flag                        3510 no
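The non-null counts above vary a lot between columns (for example, `weight2` has only 551 non-null values). A minimal sketch of summarizing missing values per column follows. The `toy` frame and the `missing_summary` helper are hypothetical stand-ins for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for imported_data; the values are made up for illustration.
toy = pd.DataFrame({
    "CaseID": [1, 2, 3, 4],
    "weight1": [0.8, np.nan, 1.2, 1.0],
    "weight2": [np.nan, np.nan, np.nan, 0.5],
})

def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and share of missing values per column, largest first."""
    n_missing = frame.isna().sum()
    summary = pd.DataFrame({
        "n_missing": n_missing,
        "share_missing": n_missing / len(frame),
    })
    return summary.sort_values("n_missing", ascending=False)

summary = missing_summary(toy)
print(summary)
```

Calling `missing_summary(imported_data)` in the notebook should give the same kind of table for all 285 columns.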

# Select Specific Features to Keep

- Some features are redundant.
    - For example, some are just recodings of each other.
- Some features overlap, and one version carries more complete information than the other.
    - Q4 and w6_q4 both record the gender of the subject's partner, but only w6_q4 covers both couples that are still together and couples that have broken up.
    - w6_q4 is therefore kept and Q4 is not used.
- Some features may or may not turn out to be useful, so they are left commented out for now.
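One way to check a claim like "w6_q4 covers more respondents than Q4" is to compare non-null counts for the two columns. The sketch below uses made-up toy data, and `compare_coverage` is a hypothetical helper. In the real dataset the two columns may use different value labels, so the disagreement count is only meaningful after the labels are harmonized.

```python
import numpy as np
import pandas as pd

# Toy stand-in: "Q4" is missing for some subjects, "w6_q4" is answered by all.
# The values are made up for illustration.
toy = pd.DataFrame({
    "Q4": ["Male", np.nan, "Female", np.nan],
    "w6_q4": ["Male", "Female", "Female", "Male"],
})

def compare_coverage(frame, col_a, col_b):
    """Count non-null answers per column and rows where both are answered but differ."""
    # Cast to object so categorical columns with different categories can be compared.
    a = frame[col_a].astype("object")
    b = frame[col_b].astype("object")
    both = a.notna() & b.notna()
    disagree = both & (a != b)
    return {
        col_a: int(a.notna().sum()),
        col_b: int(b.notna().sum()),
        "disagreements": int(disagree.sum()),
    }

report = compare_coverage(toy, "Q4", "w6_q4")
print(report)
```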

In [4]:
features = [
    
    "CASEID_NEW", # ID
    "w6_sex_frequency", # sexFrequency
    "ppp20072", # attendReligiousServiceFreq
    "pphhsize", # household size
    "pphouse", # type of house
    "ppincimp", # household income
    "ppmsacat", # metro area
    "pprent", # own, rent, other
    "ppwork", # employment status
    "w6_q15a1_truncated", # subject grew up in US?
    "w6_q15a4_truncated", # subject's living country when met partner
    "w6_q16", # how many relatives subject sees per month?
    "w6_q17", # how many times has subject been married?
    "w6_q23", # Who earned more (in 2016 or when last together)
    "interracial_5cat", # based on w6_subject_race and w6_q6b
    "w6_q32", # did you use an Internet service to meet partner?
    "age_when_met", # age when met in years,=ppage-(2017- w6_q21a_year)
    "w6_q4", # partner gender
    "partyid7", # subject's political party
    "w6_q12", # partner's political party
    "ppgender", # subject gender
    "S1", # isMarried
    "ppage", # subject age
    "w6_q9", # partner's age in 2017
    "subject_yrsed", # RECODE of ppeduc (Education (Highest Degree Received))
    "partner_yrsed", # RECODE of w6_q10 (partner's educational attainment)
    "subject_mother_yrsed", # RECODE of w6_q14 (Subject's mother's educational attainment)
    "partner_mother_yrsed", # RECODE of w6_q11 (partner's mother's Education)
    "w6_subject_race", # based on single races Race_x
    "w6_q6b", # partner's race
    "PPREG4", # region
    "w6_same_sex_couple_gender", # same sex couple specific (0=straight, 1=gay, 2=lesbian)
    "w6_attraction", # what gender(s) subject attracted to?
    "w6_q19", # couple living together?
    "w6_q34", # how would you describe the quality of your relationship with partner?
    "w6_identity_all", # subject sexual identity
    "PPT01", # household member age (number of babies in household ages 0-1)
    "PPT25", # household member age (number of toddlers in household ages 2-5)
    "PPT612", # household member age (number of children in household ages 6-12)
    "PPT1317", # household member age (number of teens in household ages 13-17)
    "PPT18OV", # household member age (number of adults in household ages 18+)
    
    # Year/Month of Relationship Stages  ---------------------------- 
    
#     "w6_q21a_year", # year subject first met partner
#     "w6_q21a_month", # month subject first met partner
#     "w6_q21b_year", # year subject began romantic relationship w partner
#     "w6_q21b_month", # month subject began romantic relationship w partner
#     "w6_q21c_year", # year subject first lived with partner
#     "w6_q21c_month", # month subject first lived with partner
#     "w6_q21d_year", # year subject married partner
#     "w6_q21d_month", # month subject married partner
#     "w6_q21e_year", # year of breakup
#     "w6_q21e_month", # month of breakup
#     "w6_q21f_year", # year partner died
#     "w6_q21f_month", # month partner died
    
    # (Fractions) Year/Month of Relationship Stages ---------------
    
    "year_fraction_met", # w6_q21a_year+((w6_q21a_month-0.5)/12)
    "year_fraction_relstart", # w6_q21b_year+((w6_q21b_month-0.5)/12)
    "time_from_met_to_rel", # year_fraction_relstart-year_fraction_met
    "year_fraction_first_cohab", # w6_q21c_year+((w6_q21c_month-0.5)/12)
    "time_from_rel_to_cohab", # year_fraction_first_cohab-year_fraction_relstart, neg reset to zero
    
    
    # Met in person -----------------------------------------------
    
#     "w6_q25", # did subject and partner attend same H.S.
#     "w6_q26", # did subject and partner attend same college
#     "w6_q27", # did subject and partner grow up in same city or town
#     "w6_q28", # did subject's parents know partner's parents before subject knew partner?
#     "w6_friend_connect_1_all", # subject knew partner's friends before meeting partner
#     "w6_friend_connect_2_all", # partner knew subject's friends before meeting subject
#     "w6_friend_connect_3_all", # subject's friends knew partner's friends before subject and partner met
#     "w6_friend_connect_4_all", # no prior connection between subject's friends and partner's friends
    
#     "hcm2017q24_R_cowork", # Respondent's coworker: intermediary or partner
#     "hcm2017q24_R_friend", # Respondent's friend: intermediary
#     "hcm2017q24_R_family", # Respondent's family: intermediary
#     "hcm2017q24_R_sig_other", # Respondent's (current or past) Significant Other: Intermediary
#     "hcm2017q24_R_neighbor", # Respondent's residential neighbor: intermediary or Partner
#     "hcm2017q24_P_cowork", # Partner's coworker: Intermediary or Respondent
#     "hcm2017q24_P_friend", # Partner's Friend: Intermediary
#     "hcm2017q24_P_family", # Partner's Family: Intermediary
#     "hcm2017q24_P_sig_other", # Partner's (current or past) Significant Other: Intermediary
#     "hcm2017q24_P_neighbor", # Partner's residential neighbor: Intermediary or Respondent
    
    "hcm2017q24_met_through_family", # 1 if R_family or P_family =1
    "hcm2017q24_met_through_friend", # 1 if R_friend or P_friend=1
    "hcm2017q24_met_through_as_nghbrs", # 1 if R_neighbor or P_neighbor=1
    "hcm2017q24_met_as_through_cowork", # 1 if R_cowork or P_cowork=1

    "hcm2017q24_school", # met in primary or secondary school
    "hcm2017q24_college", # met in college
    "hcm2017q24_mil", # met during military service
    "hcm2017q24_church", # met in or through church or religious organization
    "hcm2017q24_vol_org", # met through voluntary organization (non-church)
    "hcm2017q24_customer", # customer-client relationship
    "hcm2017q24_bar_restaurant", # bar, restaurant, or other public social gathering place
    "hcm2017q24_party", # private party
    
    "hcm2017q24_public", # met in public place
    "hcm2017q24_blind_date", # met on blind date
    "hcm2017q24_vacation", # met while on vacation
    "hcm2017q24_single_serve_nonint", # non internet single service
    "hcm2017q24_business_trip", # met while on business trip
    "hcm2017q24_work_neighbors", # met as work neighbors
    
    # Met online / dating app -----------------------------------------------
    
    "hcm2017q24_internet_other", # Internet, not otherwise classified
    "hcm2017q24_internet_dating", # met through Internet dating or phone app
    "hcm2017q24_internet_soc_network", # met through internet social networking
    
    "hcm2017q24_internet_game", # met through online gaming
    "hcm2017q24_internet_chat", # met through Internet chat
    "hcm2017q24_internet_org", # met through Internet site not mainly dedicated to dating

    "hcm2017q24_met_online", # met online, all kinds


    # to be filtered ---------------------
    
    "qflag", # DOV: Qualification Flag - Remove 2
    "speed_flag", # Respondents who completed survey in under 2 min - Remove under 2
    "S3", # Ever had a boyfriend or a girlfriend - Remove "No"
    "w6_took_the_survey", # Whether subject took the survey or was excluded
    "partnership_status", # Filter out 4 (never had) married, partner, ex
    
    
    # to be recoded ----------------------
    
    "ppeduc", # subject education
    "w6_q10", # partner's education 
    "w6_q11", # partner's mother's education
    "w6_q14", # subject's mother's education
    
    "ppethm", # subject is Hispanic -- convert to binary

]

df = imported_data[features].copy()

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

df.info(verbose=True)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3510 entries, 0 to 3509
Data columns (total 81 columns):
 #   Column                            Non-Null Count  Dtype   
---  ------                            --------------  -----   
 0   CASEID_NEW                        3510 non-null   int32   
 1   w6_sex_frequency                  2856 non-null   category
 2   ppp20072                          3394 non-null   category
 3   pphhsize                          3510 non-null   category
 4   pphouse                           3510 non-null   category
 5   ppincimp                          3510 non-null   category
 6   ppmsacat                          3510 non-null   category
 7   pprent                            3510 non-null   category
 8   ppwork                            3510 non-null   category
 9   w6_q15a1_truncated                3394 non-null   category
 10  w6_q15a4_truncated                3394 non-null   category
 11  w6_q16                            3367 non-null   float6

# Filtering

Remove respondents who:
- did not qualify for the survey
- completed the survey in under 2 minutes
- have never had a partner
- were excluded from the survey

In [5]:
isQualified = df["qflag"] == "Qualified"
overTwoMin = df["speed_flag"] == "Completed survey in over 2 minutes"
hasOrHadPartner = df["S3"] != "No"
tookTheSurvey = df["w6_took_the_survey"] == "took the survey"

print("Original dataset had", len(df), "rows.")

df_filtered = df[isQualified & overTwoMin & hasOrHadPartner & tookTheSurvey].copy()

print("Filtered dataset has", len(df_filtered), "rows.")

Original dataset had 3510 rows.
Filtered dataset has 3391 rows.
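The combined filter reports only the total number of rows removed. A small sketch of counting how many rows each condition removes on its own is shown below. The `toy` frame and the "Not qualified" label are assumptions made up for illustration and may not match the dataset's actual value labels.

```python
import pandas as pd

# Toy stand-in for df; the "Not qualified" label is hypothetical.
toy = pd.DataFrame({
    "qflag": ["Qualified", "Qualified", "Not qualified", "Qualified"],
    "S3": ["Yes", "No", "Yes", "Yes"],
})

# One named boolean mask per filter condition.
masks = {
    "isQualified": toy["qflag"] == "Qualified",
    "hasOrHadPartner": toy["S3"] != "No",
}

# Rows that each condition would remove if applied on its own.
removed_by = {name: int((~mask).sum()) for name, mask in masks.items()}
print(removed_by)

# A row is kept only if every condition holds.
keep = pd.concat(masks, axis=1).all(axis=1)
toy_filtered = toy[keep].copy()
print("Filtered toy dataset has", len(toy_filtered), "rows.")
```

Note that a comparison such as `!= "No"` is also true for missing values, so rows with a missing `S3` would be kept by that mask.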


# Renamed Dataframe

- Create a copy of df_filtered, drop the columns that were only needed for filtering, and rename the remaining columns.

In [6]:
# make a copy of the filtered dataframe

df_renamed = df_filtered.copy()

cols_to_keepOut = ["qflag", 
                   "speed_flag", 
                   "S3", 
                   "w6_took_the_survey", 
                   "partnership_status"]

df_renamed.drop(cols_to_keepOut, axis=1, inplace=True)

cols_to_rename = {
    "CASEID_NEW": "ID",
    "w6_sex_frequency": "sexFrequency",
    "ppp20072": "attendReligiousServiceFreq",
    "pphhsize": "householdSize",
    "pphouse": "houseType",
    "ppincimp": "householdIncome",
    "ppmsacat": "isMetroArea",
    "pprent": "ownHouseRentOther", # own, rent, other
    "ppwork": "employmentStatus", # employment status
    "w6_q15a1_truncated": "subjectGrewUpInUS", # subject grew up in US?
    "w6_q15a4_truncated": "subjectCountryWhenMetPartner", # subject's living country when met partner
    "w6_q16": "numRelativesSeePerMonth", # how many relatives subject sees per month?
    "w6_q17": "numOfTimesMarried", # how many times has subject been married?
    "w6_q23": "whoEarnedMore", # Who earned more (in 2016 or when last together)
    "interracial_5cat": "interracial", # based on w6_subject_race and w6_q6b
    "w6_q32": "usedAnInternetServiceToMeet", # did you use an Internet service to meet partner?
    "age_when_met": "ageWhenMet", # age when met in years,=ppage-(2017- w6_q21a_year)
    "w6_q4": "partnerGender", # partner gender
    "partyid7": "subjectPoliticalParty", # subject's political party
    "w6_q12": "partnerPoliticalParty", # partner's political party
    "ppgender": "subjectGender", # subject gender
    "S1": "isMarried", # isMarried
    "ppage": "subjectAge", # subject age
    "w6_q9": "partnerAge", # partner's age in 2017
    "subject_yrsed": "subjectEduc_years", # RECODE of ppeduc (Education (Highest Degree Received))
    "partner_yrsed": "partnerEduc_years", # RECODE of w6_q10 (partner's educational attainment)
    "subject_mother_yrsed": "subjectMotherEduc_years", # RECODE of w6_q14 (Subject's mother's educational attainment)
    "partner_mother_yrsed": "partnerMotherEduc_years", # RECODE of w6_q11 (partner's mother's Education)
    "w6_subject_race": "subjectRace", # based on single races Race_x
    "w6_q6b": "partnerRace", # partner's race
    "PPREG4": "region", # region
    "w6_same_sex_couple_gender": "straightGayLesbian", # same sex couple specific (0=straight, 1=gay, 2=lesbian)
    "w6_attraction": "genderSubjectAttractedTo", 
        # what gender(s) subject attracted to? (opposite gender, opposite but, both, same but, same gender)
    "w6_q19": "isLivingTogether", # couple living together?
    "w6_q34": "relationshipQuality", # how would you describe the quality of your relationship with partner?
    "w6_identity_all": "subjectSexualIdentity", # subject sexual identity
    "PPT01": "numOfHouseMembersAges0to1", # household member age (number of babies in household ages 0-1)
    "PPT25": "numOfHouseMembersAges2to5", # household member age (number of toddlers in household ages 2-5)
    "PPT612": "numOfHouseMembersAges6to12", # household member age (number of children in household ages 6-12)
    "PPT1317": "numOfHouseMembersAges13to17", # household member age (number of teens in household ages 13-17)
    "PPT18OV": "numOfHouseMembersAges18toOver", # household member age (number of adults in household ages 18+)
    
    # (Fractions) Year/Month of Relationship Stages ---------------
    
    "year_fraction_met": "met_YearFraction", # w6_q21a_year+((w6_q21a_month-0.5)/12)
    "year_fraction_relstart": "shipStart_YearFraction", # w6_q21b_year+((w6_q21b_month-0.5)/12)
    "time_from_met_to_rel": "met_to_shipStart_diff", # year_fraction_relstart-year_fraction_met
    "year_fraction_first_cohab": "moveIn_YearFraction", # w6_q21c_year+((w6_q21c_month-0.5)/12)
    "time_from_rel_to_cohab": "shipStart_to_moveIn_YearFraction", # year_fraction_first_cohab-year_fraction_relstart, neg reset to zero
    
    
    # met in person (specific) -----------------------------------------------------
    
    "hcm2017q24_met_through_family": "metThru_family", # 1 if R_family or P_family =1
    "hcm2017q24_met_through_friend": "metThru_friend", # 1 if R_friend or P_friend=1
    "hcm2017q24_met_through_as_nghbrs": "metThru_orAs_neighbors", # 1 if R_neighbor or P_neighbor=1
    "hcm2017q24_met_as_through_cowork": "metAs_coworkers", # 1 if R_cowork or P_cowork=1
    
    "hcm2017q24_school": "metIn_school", # met in primary or secondary school
    "hcm2017q24_college": "metIn_college", # met in college
    "hcm2017q24_mil": "metIn_military", # met during military service
    "hcm2017q24_church": "metIn_church", # met in or through church or religious organization
    "hcm2017q24_vol_org": "metIn_voluntaryOrg", # met through voluntary organization (non-church)
    "hcm2017q24_customer": "metAs_customerAndClient", # customer-client relationship
    "hcm2017q24_bar_restaurant": "metIn_restaurantOrBar", # restaurant, or other public social gathering place
    "hcm2017q24_party": "metIn_privateParty", # private party
    
    "hcm2017q24_public": "metIn_public", # met in public place
    "hcm2017q24_blind_date": "metOn_blindDate", # met on blind date
    "hcm2017q24_vacation": "metOn_vacation", # met while on vacation
    "hcm2017q24_single_serve_nonint": "metThru_notInternetDatingService", # non internet single service
    "hcm2017q24_business_trip": "metOn_businessTrip", # met while on business trip
    "hcm2017q24_work_neighbors": "metAs_workNeighbors", # met as work neighbors
    
    
    # Met online / dating app -----------------------------------------------
    
    
    "hcm2017q24_internet_dating": "metOnline_datingSiteOrApp", # met through Internet dating or phone app
    "hcm2017q24_internet_soc_network": "metOnline_socialNetwork", # met through internet social networking
    
    "hcm2017q24_internet_game": "metOnline_gaming", # met through online gaming
    "hcm2017q24_internet_chat": "metOnline_chat", # met through Internet chat
    "hcm2017q24_internet_org": "metOnline_nonDatingSite", # met through Internet site not mainly dedicated to dating
    
    "hcm2017q24_internet_other": "metOnline_other", # Internet, not otherwise classified

    "hcm2017q24_met_online": "metOnline_all", # met online, all kinds


}

    
df_renamed.rename(columns=cols_to_rename, inplace=True)

df_renamed.info(verbose=True)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3391 entries, 0 to 3509
Data columns (total 76 columns):
 #   Column                            Non-Null Count  Dtype   
---  ------                            --------------  -----   
 0   ID                                3391 non-null   int32   
 1   sexFrequency                      2853 non-null   category
 2   attendReligiousServiceFreq        3391 non-null   category
 3   householdSize                     3391 non-null   category
 4   houseType                         3391 non-null   category
 5   householdIncome                   3391 non-null   category
 6   isMetroArea                       3391 non-null   category
 7   ownHouseRentOther                 3391 non-null   category
 8   employmentStatus                  3391 non-null   category
 9   subjectGrewUpInUS                 3391 non-null   category
 10  subjectCountryWhenMetPartner      3391 non-null   category
 11  numRelativesSeePerMonth           3365 non-null   float6
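By default, `DataFrame.rename` silently ignores mapping keys that do not match any column, so a typo in a long mapping can go unnoticed. The sketch below shows two ways to catch this on a toy frame. The key `"w6_q99"` is a deliberately invalid, hypothetical name used only to trigger the check.

```python
import pandas as pd

# Toy stand-in for df_renamed with two of the original column names.
toy = pd.DataFrame({"ppage": [30, 55], "w6_q9": [26.0, 52.0]})

# "w6_q99" is a hypothetical typo that matches no column.
mapping = {"ppage": "subjectAge", "w6_q9": "partnerAge", "w6_q99": "typoExample"}

# Option 1: list the mapping keys that are not present in the columns.
missing_keys = [old for old in mapping if old not in toy.columns]
print("Keys not found in columns:", missing_keys)

# Option 2: ask rename to raise a KeyError on unknown keys.
try:
    toy.rename(columns=mapping, errors="raise")
    raised = False
except KeyError:
    raised = True

# With the default errors="ignore", the unknown key is skipped silently.
renamed = toy.rename(columns=mapping)
print(list(renamed.columns))
```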

# Recode categorical education features

In [7]:
print(list(df_renamed["ppeduc"].unique()))
print("\n")
print(list(df_renamed["w6_q10"].unique()))

['Associate degree', 'Masters degree', '12th grade NO DIPLOMA', 'Bachelors degree', 'HIGH SCHOOL GRADUATE - high school DIPLOMA or the equivalent (GED)', 'Professional or Doctorate degree', 'Some college, no degree', '9th grade', '10th grade', '11th grade', '7th or 8th grade', '5th or 6th grade', '1st, 2nd, 3rd, or 4th grade', 'No formal education']


['HS graduate or GED', 'Master\x92s degree', 'Associate degree', 'Bachelor\x92s degree', 'Some college, no degree', 'Professional or Doctorate degree', '12th grade no diploma', '11th grade', '10th grade', '7th or 8th grade', 'No formal education', '9th grade', 'Refused', '5th or 6th grade', '1st-4th grade']


In [8]:
education_recodings = {"No formal education": "1-Less than high school",
                       "1st-4th grade": "1-Less than high school",
                       "1st, 2nd, 3rd, or 4th grade": "1-Less than high school",
                       "5th or 6th grade": "1-Less than high school",
                       "7th or 8th grade": "1-Less than high school",
                       "9th grade": "1-Less than high school",
                       "10th grade": "1-Less than high school",
                       "11th grade": "1-Less than high school",
                       "12th grade no diploma": "1-Less than high school",
                       "12th grade NO DIPLOMA": "1-Less than high school",
                       "HS graduate or GED": "2-High school",
                       "HIGH SCHOOL GRADUATE - high school DIPLOMA or the equivalent (GED)": "2-High school",
                       "Some college, no degree": "3-Some college",
                       "Associate degree": "3-Some college",
                       "Bachelors degree": "4-Bachelor's degree",
                       "Bachelor\x92s degree": "4-Bachelor's degree",
                       "Master\x92s degree": "5-Professional or Graduate degree",
                       "Masters degree": "5-Professional or Graduate degree",
                       "Professional or Doctorate degree": "5-Professional or Graduate degree"}


# Assign back instead of replace(inplace=True) on a column selection, which newer pandas may not apply to df_renamed.
# ppeduc: subject, w6_q10: partner, w6_q14: subject's mother, w6_q11: partner's mother
for col in ["ppeduc", "w6_q10", "w6_q14", "w6_q11"]:
    df_renamed[col] = df_renamed[col].replace(education_recodings)

In [9]:
print(list(df_renamed["ppeduc"].unique()))
print("\n")
print(list(df_renamed["w6_q10"].unique()))

['3-Some college', '5-Professional or Graduate degree', '1-Less than high school', "4-Bachelor's degree", '2-High school']


['2-High school', '5-Professional or Graduate degree', '3-Some college', "4-Bachelor's degree", '1-Less than high school', 'Refused']
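The output above still contains 'Refused' for the partner's education, because that value has no entry in `education_recodings`. A minimal sketch of listing such leftover values is given below. `unmapped_values` is a hypothetical helper, and the mapping shown is only a two-entry subset used for illustration.

```python
import pandas as pd

# Two-entry subset of the recoding dictionary, for illustration only.
mapping = {
    "HS graduate or GED": "2-High school",
    "Associate degree": "3-Some college",
}

# Toy stand-in for an education column, including a missing value.
toy = pd.Series(["HS graduate or GED", "Refused", "Associate degree", None])

def unmapped_values(series, mapping):
    """Return observed values that are neither a mapping key nor a mapping target."""
    observed = set(series.dropna().unique())
    return sorted(observed - set(mapping) - set(mapping.values()))

leftover = unmapped_values(toy, mapping)
print(leftover)
```

Whether 'Refused' should then be treated as missing or kept as its own category is a modeling decision that this sketch does not make.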


### Rename education columns

In [10]:

cols_to_rename = {
    "ppeduc": "subjectEduc_cat",
    "w6_q10": "partnerEduc_cat",
    "w6_q14": "subjectMotherEduc_cat",
    "w6_q11": "partnerMotherEduc_cat"
}

    
df_renamed.rename(columns=cols_to_rename, inplace=True)

df_renamed.info(verbose=True)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3391 entries, 0 to 3509
Data columns (total 76 columns):
 #   Column                            Non-Null Count  Dtype   
---  ------                            --------------  -----   
 0   ID                                3391 non-null   int32   
 1   sexFrequency                      2853 non-null   category
 2   attendReligiousServiceFreq        3391 non-null   category
 3   householdSize                     3391 non-null   category
 4   houseType                         3391 non-null   category
 5   householdIncome                   3391 non-null   category
 6   isMetroArea                       3391 non-null   category
 7   ownHouseRentOther                 3391 non-null   category
 8   employmentStatus                  3391 non-null   category
 9   subjectGrewUpInUS                 3391 non-null   category
 10  subjectCountryWhenMetPartner      3391 non-null   category
 11  numRelativesSeePerMonth           3365 non-null   float6

# Recode and create binary isHispanic feature

In [11]:
print(list(df_renamed["ppethm"].unique()))

['White, Non-Hispanic', 'Hispanic', 'Black, Non-Hispanic', '2+ Races, Non-Hispanic', 'Other, Non-Hispanic']


In [12]:
df_renamed["isHispanic"] = df_renamed["ppethm"] == "Hispanic"

df_renamed.drop("ppethm", axis=1, inplace=True)

df_renamed["isHispanic"].unique()

array([False,  True])
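The `==` comparison returns False for missing values, so a respondent with no recorded ethnicity would be labeled as not Hispanic. The list of unique `ppethm` values printed earlier contains no missing entries, so this should not affect the notebook, but the sketch below shows the behavior on made-up toy data along with one way to keep missing values missing.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the ppethm column with one missing value (made up).
toy = pd.Series(["Hispanic", "White, Non-Hispanic", np.nan], name="ppethm")

# Plain comparison: the missing entry becomes False.
plain = toy == "Hispanic"
print(plain.tolist())

# Keep the missing entry as missing instead of False.
# The result has object dtype because it mixes booleans and NaN.
nullable = plain.where(toy.notna())
print(nullable.isna().tolist())
```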

In [13]:
df_renamed.info(verbose=True)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3391 entries, 0 to 3509
Data columns (total 76 columns):
 #   Column                            Non-Null Count  Dtype   
---  ------                            --------------  -----   
 0   ID                                3391 non-null   int32   
 1   sexFrequency                      2853 non-null   category
 2   attendReligiousServiceFreq        3391 non-null   category
 3   householdSize                     3391 non-null   category
 4   houseType                         3391 non-null   category
 5   householdIncome                   3391 non-null   category
 6   isMetroArea                       3391 non-null   category
 7   ownHouseRentOther                 3391 non-null   category
 8   employmentStatus                  3391 non-null   category
 9   subjectGrewUpInUS                 3391 non-null   category
 10  subjectCountryWhenMetPartner      3391 non-null   category
 11  numRelativesSeePerMonth           3365 non-null   float6

In [16]:
# optionally alphabetize column names

df_renamed.sort_index(axis=1).head(3)

Unnamed: 0,ID,ageWhenMet,attendReligiousServiceFreq,employmentStatus,genderSubjectAttractedTo,houseType,householdIncome,householdSize,interracial,isHispanic,isLivingTogether,isMarried,isMetroArea,metAs_coworkers,metAs_customerAndClient,metAs_workNeighbors,metIn_church,metIn_college,metIn_military,metIn_privateParty,metIn_public,metIn_restaurantOrBar,metIn_school,metIn_voluntaryOrg,metOn_blindDate,metOn_businessTrip,metOn_vacation,metOnline_all,metOnline_chat,metOnline_datingSiteOrApp,metOnline_gaming,metOnline_nonDatingSite,metOnline_other,metOnline_socialNetwork,metThru_family,metThru_friend,metThru_notInternetDatingService,metThru_orAs_neighbors,met_YearFraction,met_to_shipStart_diff,moveIn_YearFraction,numOfHouseMembersAges0to1,numOfHouseMembersAges13to17,numOfHouseMembersAges18toOver,numOfHouseMembersAges2to5,numOfHouseMembersAges6to12,numOfTimesMarried,numRelativesSeePerMonth,ownHouseRentOther,partnerAge,partnerEduc_cat,partnerEduc_years,partnerGender,partnerMotherEduc_cat,partnerMotherEduc_years,partnerPoliticalParty,partnerRace,region,relationshipQuality,sexFrequency,shipStart_YearFraction,shipStart_to_moveIn_YearFraction,straightGayLesbian,subjectAge,subjectCountryWhenMetPartner,subjectEduc_cat,subjectEduc_years,subjectGender,subjectGrewUpInUS,subjectMotherEduc_cat,subjectMotherEduc_years,subjectPoliticalParty,subjectRace,subjectSexualIdentity,usedAnInternetServiceToMeet,whoEarnedMore
0,2014039,30.0,Never,Working - as a paid employee,sexually attracted to men and women equally,A one-family house detached from any other house,"$40,000 to $49,999",1,no,False,,"No, I am not Married",Metro,no,no,no,no,no,no,no,no,no,no,no,no,no,no,yes,no,yes,no,no,no,no,no,no,no,no,2017.208374,0.0,,0,0,1,0,0,1.0,1.0,Owned or being bought by you or someone in you...,26.0,2-High school,12.0,[Partner Name] is Male,2-High school,12.0,Leans Republican,White,Northeast,,,2017.208374,,gay male couple,30,United States,3-Some college,14.0,Male,United States,3-Some college,14.0,Leans Democrat,White,bisexual,"Yes, an Internet dating or matchmaking site (l...",I earned more
1,2019003,21.0,Never,Not working - other,sexually attracted only to opposite gender,A one-family house detached from any other house,"$150,000 to $174,999",4,no,False,Yes,"Yes, I am Married",Metro,yes,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no,1983.375,12.25,1996.125,0,2,2,0,0,1.0,1.0,Owned or being bought by you or someone in you...,52.0,5-Professional or Graduate degree,17.0,[Partner Name] is Male,2-High school,12.0,Leans Republican,White,Midwest,Excellent,Once a month or less,1995.625,0.5,hetero couple,55,United States,5-Professional or Graduate degree,17.0,Female,United States,4-Bachelor's degree,16.0,Not Strong Republican,White,heterosexual or straight,"No, I did NOT meet [Partner Name] through the ...",[Partner Name] earned more
2,2145527,36.0,Once or twice a month,Working - as a paid employee,sexually attracted only to opposite gender,A one-family house detached from any other house,"$200,000 to $249,999",5,no,False,Yes,"Yes, I am Married",Metro,no,no,no,no,no,no,no,no,yes,no,no,no,no,no,yes,no,no,no,no,yes,no,no,no,no,no,2006.041626,0.416748,2006.541626,0,0,2,1,2,1.0,0.0,Owned or being bought by you or someone in you...,45.0,3-Some college,14.0,[Partner Name] is Female,1-Less than high school,9.0,Leans Democrat,White,South,Good,2 to 3 times a month,2006.458374,0.083252,hetero couple,47,"Another country, please specify",5-Professional or Graduate degree,17.0,Male,"Another country, please specify",1-Less than high school,7.5,Leans Democrat,White,heterosexual or straight,"Yes, an Internet dating or matchmaking site (l...",I earned more
