# INFOSCI 2950: Final Project Phase III Submission

*Madelyn Leon, Lincy Chen, and Jessica Kuang*

---
## INTRODUCTION
After seeing the rise in anti-Asian hate crimes this past year, our group was curious about the factors contributing to it. As we searched for information, however, the broader topic of the well-being of Asian Americans in the US piqued our interest. More specifically, we wanted to focus on the quality of life of Asian Americans. We found a dataset published on data.austintexas.gov that includes information on the health, finances, community support, and identity of Asian Americans in Texas. This source, titled “Final Report of the Asian American Quality of Life,” inspired our research questions: “How can we define Asian-American communities in Texas in terms of health, happiness, and financial security?” and “How do health, finances, community support (or the lack thereof), and identities affect a Texan Asian American's quality of life?” Through our research, we found evidence that specific factors such as income, English fluency, and retirement are associated with the quality of life of Texan Asian Americans.

---
## DATA DESCRIPTION
In terms of the data, each observation is a survey participant, and the attributes we use are:
'income', 'retired', 'us_born', 'english_speaking', 'english_difficulties', 'ethnicity', 'age', 'regular_exercise', 'healthy_diet', 'heart_disease', 'drinking', 'smoking', 'cancer', 'health_insurance', 'physical_check-up', 'quality_of_life', 'religion', 'gender', 'close_friends', 'discrimination_', 'duration_of_residency', 'household_size', 'education_completed'
This dataset was created to gather specific data on factors like ethnicity, income, health, and community. Its purpose is to give a better sense of the Asian Americans who reside in Texas and of their quality of life. The creation of this dataset was funded by the local Austin government, and the data was collected by local volunteers at city-wide events over the span of three years. The documentation process may have influenced what data was observed and recorded. Some fields were left empty for certain participants, which could reflect how the data was collected (i.e., participants weren't required to fill everything out, or weren't monitored). Little preprocessing was done beyond entering the results into spreadsheets. The participants were aware that the data was being collected and knew that its purpose was to better understand the well-being of Asian Americans in Austin, Texas. The raw source can be found here.


---
## Data Limitations

The limited number of quantitative variables will make it difficult to generate traditional-looking scatterplots and visualize possible relationships between variables. Because of this, we foresee barriers in conducting assessments that predict based on existing trends. When addressing demographics, the lack of records for previous years will be an obstacle to comparing whether factors measuring the quality of life of Asian-Americans progressed or regressed. One real-world impact of this limitation is that Asian-Americans will be less able to identify existing quality-of-life indicators and practices that the city of Austin should continue or cease based on patterns in the data. This will restrict the specificity of our insights about emerging patterns. The data is also limited to representing the attitudes of Asian-Americans living in Texas. Their environment may differ from that of Asian-Americans in more or less urban areas, so the data cannot encapsulate the attitudes of all Asian-Americans. The results are therefore less broadly applicable, since we do not have a random sample of all Asian-Americans. The dataset's record-keeping also faltered where respondents were able to leave answers blank. This resulted in NaNs, which affect the meaning of the results.
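As a rough illustration of how the blank answers mentioned above could be audited before any values are filled in, the sketch below computes the share of missing entries per column. The small `toy` frame and its values are made up for illustration and are not rows from the survey.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the survey data; the values are illustrative only.
toy = pd.DataFrame({
    "income": ["$0 - $9,999", np.nan, "$70,000 and over", np.nan],
    "age": [23.0, 60.0, np.nan, 41.0],
    "ethnicity": ["Chinese", "Korean", "Vietnamese", "Filipino"],
})

# Fraction of missing answers in each column, largest first.
missing_share = toy.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Columns with a large missing share are the ones where the choice of fill value is most likely to influence the results.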

Certain ethnic groups are more heavily represented than others, which may skew results when extrapolating quality-of-life measurements to the attitudes of Asian-American populations as a whole. For example, out of all the religious groups described, Protestant Asian-Americans reported experiencing the highest amount of discrimination. However, it would be inaccurate to assume that Protestantism motivates racists to engage in discriminatory behavior towards Asian-Americans. Some response variables will be affected by confounding variables such as cultural differences among sub-ethnic groups (e.g., some ethnic groups have leaner diets, which would affect conclusions drawn when studying Asian-American health). In this case, the higher rates of discrimination can also be explained by the fact that the data shows Korean-Americans are primarily of Protestant faith. Given that they were also the ethnic group reporting the highest amounts of discrimination, this shows that the data can only go so far in explaining why certain qualities might cause others.
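One way to check for the kind of overlap between ethnicity and religion described above is a cross-tabulation. The following is a minimal sketch on a made-up `toy` frame (the rows are hypothetical, not survey records), using `pd.crosstab` with row-wise normalization.

```python
import pandas as pd

# Hypothetical respondents; values are illustrative only.
toy = pd.DataFrame({
    "ethnicity": ["Korean", "Korean", "Korean", "Chinese", "Chinese", "Vietnamese"],
    "religion": ["Protestant", "Protestant", "Catholic", "Buddhist", "Protestant", "Buddhist"],
})

# Share of each religion within each ethnicity (each row sums to 1).
table = pd.crosstab(toy["ethnicity"], toy["religion"], normalize="index")
print(table)
```

If one religion accounts for most of one ethnicity, differences between religions in a later model may partly reflect differences between ethnicities.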

---
## Research Questions

**Main Goal: How can we predict an Asian-American's quality of life based on their health, income, identities, and experiences with discrimination? How do health, finances, community support (or the lack thereof), and identities affect a Texan Asian American's quality of life?**

We are going to perform a multiple linear regression where the outcome variable is quality of life. The predictor variables are 'income', 'retired', 'regular_exercise', 'healthy_diet', 'cancer', 'heart_disease', 'physical_checkup', 'close_friends', 'discrimination', 'household_size', 'education_completed', 'english_difficuties' (spelled this way in our dataframe), and 'english_speaking'. This addresses our research question "How do health, finances, community support (or the lack thereof), and identities affect a Texan Asian American's quality of life?" because it analyzes the relationship between quality of life and all of the features that we believe would determine one's quality of life. We expect the $r^2$ value to be at least 0.5 because we believe that these features are determinants of whether a person has a higher quality of life.


For our second analysis, we will use a logistic regression to try to predict the quality of life of hypothetical people based on income, retired, english_speaking, regular_exercise, healthy_diet, close_friends, education_completed, physical_checkup, discrimination, ethnicity, and religion. We chose these variables by looking at the correlation heatmap and comparing each variable to quality of life; we included the variables that showed a decent correlation. We expect the result to be fairly accurate at predicting one's quality of life.
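The selection step described above (comparing each variable's correlation with quality of life) could be sketched as follows. The data here is synthetic, generated only so the example runs; the column names mirror ours, but the numbers are not survey results.

```python
import numpy as np
import pandas as pd

# Synthetic data: quality_of_life is built to depend on income but not on household_size.
rng = np.random.default_rng(0)
n = 200
income = rng.integers(0, 9, size=n)
toy = pd.DataFrame({
    "income": income,
    "household_size": rng.integers(1, 7, size=n),
    "quality_of_life": 5 + 0.4 * income + rng.normal(size=n),
})

# Absolute correlation of every predictor with the outcome, strongest first.
corr_with_target = (
    toy.corr()["quality_of_life"]
    .drop("quality_of_life")
    .abs()
    .sort_values(ascending=False)
)
print(corr_with_target)
```

Ranking by absolute correlation only captures linear, one-variable-at-a-time relationships, so it is a screening aid rather than a full justification for including a predictor.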

Questions for Grader: We want to use ethnicity in our logistic regression, but we are not sure how to encode our current data, which consists of strings, as 0s and 1s. Also, is there a way to encapsulate all the ethnicities without creating too many columns?
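One possible approach to the question above is one-hot encoding with `pd.get_dummies`, optionally grouping rare categories into a single "Other" level to limit the number of columns. This is a sketch on a made-up `toy` frame; `min_count` is a hypothetical threshold chosen only for the example.

```python
import pandas as pd

# Toy frame; the values are illustrative, not taken from the survey.
toy = pd.DataFrame({
    "ethnicity": ["Chinese", "Korean", "Vietnamese", "Chinese", "Filipino"],
    "income": [4, 2, 1, 8, 5],
})

# One 0/1 indicator column per category. drop_first=True leaves out the first
# category (alphabetically) so that it acts as the reference level.
encoded = pd.get_dummies(toy, columns=["ethnicity"], drop_first=True, dtype=int)

# Optional: lump categories seen fewer than min_count times into "Other"
# before encoding, which reduces the number of indicator columns.
min_count = 2  # hypothetical threshold
counts = toy["ethnicity"].value_counts()
rare = counts[counts < min_count].index
lumped = toy["ethnicity"].where(~toy["ethnicity"].isin(rare), "Other")
lumped_dummies = pd.get_dummies(lumped, prefix="ethnicity", dtype=int)

print(encoded)
print(lumped_dummies)
```

Dropping one level keeps the indicator columns from being perfectly collinear, which matters most for unregularized linear models.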

## Data Collection and Cleaning
### Data Collection
1. Go to data [landing page](https://data.austintexas.gov/City-Government/Final-Report-of-the-Asian-American-Quality-of-Life/hc5t-p62z). 
2. Click on Export > CSV.
3. Download publicly available `Final_Report_of_the_Asian_American_Quality_of_Life__AAQoL_.csv` into desired directory.

### Data Cleaning
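1. Store raw data into a preliminary dataframe, `df`
2. Convert column names into snake_case
3. Copy `df` into `asian` and apply the new column names
4. Keep only the columns of interest and rename `discrimination_` and `physical_check-up`
5. Fill in missing values and encode categorical responses as numbers (labeled Step 5, 6, 7 in the code)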

In [1]:
## load libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
sns.set_style('white')
from sklearn.linear_model import LinearRegression,LogisticRegression
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score

In [2]:
## Step 1
df = pd.read_csv('Final_Report_of_the_Asian_American_Quality_of_Life__AAQoL_.csv')

In [3]:
## Step 2
new_colnames = [i.lower() for i in df.columns]
new_colnames = [i.replace(" ","_") for i in new_colnames]

In [4]:
## Step 3
asian = df.copy()
asian.columns = new_colnames

In [5]:
## Step 4
asian = asian[['income', 'retired', 'us_born', 'english_speaking', 'english_difficulties', 'ethnicity','age', 'regular_exercise', 'healthy_diet', 'heart_disease', 'drinking', 'smoking',
              'health_insurance', 'physical_check-up', 'quality_of_life', 'religion', 'gender', 'close_friends', 'discrimination_', 'duration_of_residency', 'household_size',
              'education_completed', 'cancer']]
# Additional improvements to asian
asian = asian.rename(columns = {'discrimination_':'discrimination', 'physical_check-up':'physical_checkup'})

In [6]:
## Step 5, 6, 7

# Columns where NaNs are replaced with 0.
# For household_size, 0 can only mean "missing": a real household size
# cannot be 0 because participants are counting themselves.
zero_fill_cols = ['english_speaking', 'english_difficulties', 'retired', 'us_born',
                  'health_insurance', 'physical_checkup', 'regular_exercise',
                  'healthy_diet', 'heart_disease', 'discrimination',
                  'household_size', 'income', 'cancer']
fill_values = {col: 0 for col in zero_fill_cols}

# Columns with other replacement values
fill_values.update({
    'quality_of_life': 5.0,
    'duration_of_residency': -1,
    'education_completed': -1,
    'gender': 'Unknown',
    'ethnicity': 'Unknown',
    'religion': 'Unknown',
    'age': 40.0,            # median age
    'close_friends': 3.0,   # median number of close friends
})

# Replace NaNs column by column with the values above
asian = asian.fillna(value=fill_values)

# Encode categorical survey responses as numbers.
# Each result is assigned back to its column instead of calling
# .replace(..., inplace=True) on a selected column, which newer pandas
# versions may not apply to `asian` itself.

# english_speaking column represented by floats
asian['english_speaking'] = asian['english_speaking'].replace(
    {'Not at all': 1, 'Not well': 2, 'Well': 3, 'Very well': 4}).astype(float)

# english_difficulties column mapped to numbers
asian['english_difficulties'] = asian['english_difficulties'].replace(
    {'Not at all': 1, 'Not much': 2, 'Much': 3, 'Very much': 4})
# NOTE: the column name on the left is misspelled ('english_difficuties'), so this
# adds a new float column; the regression below refers to it by the same spelling.
asian['english_difficuties'] = asian['english_difficulties'].astype(float)

# retired column represented by floats
asian['retired'] = asian['retired'].replace({'Retired': 1}).astype(float)

# us_born column represented by floats
asian['us_born'] = asian['us_born'].replace({'No': 0, 'Yes': 1}).astype(float)

# health_insurance column: 'Yes' becomes the string '1' (not cast to float)
asian['health_insurance'] = asian['health_insurance'].replace({'Yes': '1'})

# physical_checkup column represented by floats
asian['physical_checkup'] = asian['physical_checkup'].replace({'Yes': '1'}).astype(float)

# income brackets represented by integers 1-8 (0 is the fill value for missing income)
asian['income'] = asian['income'].replace(
    {'$0 - $9,999': 1, '$10,000 - $19,999': 2, '$20,000 - $29,999': 3, '$30,000 - $39,999': 4,
     '$40,000 - $49,999': 5, '$50,000 - $59,999': 6, '$60,000 - $69,999': 7, '$70,000 and over': 8})

In [7]:
asian.head()

Unnamed: 0,income,retired,us_born,english_speaking,english_difficulties,ethnicity,age,regular_exercise,healthy_diet,heart_disease,...,quality_of_life,religion,gender,close_friends,discrimination,duration_of_residency,household_size,education_completed,cancer,english_difficuties
0,0,0.0,0.0,0.0,0,Vietnamese,40.0,0.0,0.0,0.0,...,5.0,Unknown,Unknown,3.0,0.0,-1.0,0.0,-1.0,0.0,0.0
1,4,1.0,0.0,1.0,2,Chinese,60.0,0.0,0.0,0.0,...,5.0,Buddhist,Male,3.0,0.0,0.5,6.0,13.0,0.0,2.0
2,1,0.0,0.0,3.0,3,Chinese,23.0,0.0,1.0,0.0,...,8.0,Buddhist,Female,4.0,0.0,11.0,3.0,16.0,0.0,3.0
3,0,1.0,0.0,2.0,0,Chinese,73.0,1.0,1.0,0.0,...,5.0,Protestant,Female,3.0,0.0,50.0,1.0,13.0,0.0,0.0
4,0,0.0,0.0,3.0,4,Asian Indian,29.0,0.0,0.0,0.0,...,5.0,Hindu,Male,3.0,0.0,7.0,1.0,17.0,0.0,4.0


In [8]:
# for i in ['income','retired','ethnicity', 'age', 'regular_exercise', 'healthy_diet',
#           'heart_disease', 'physical_checkup', 'religion', 'close_friends', 'discrimination',
#           'duration_of_residency', 'household_size', 'education_completed','english_difficuties']:
#     sns.regplot(x = i, y = 'quality_of_life', data= asian, scatter_kws={'alpha':0.05})
#     plt.xlabel(i)
#     plt.ylabel('Quality of Life')
#     plt.show()

In [9]:
# cut_ethnicity = {'Asian Indian': 1, 'Filipino': 2, ' Chinese': 3, 'Korean': 4, 'Other': 5, 'Unknown': 6, 'Vietnamese': 7}

# cut_religion = {'Buddhist': 1, 'Protestant': 2, 'Hindu': 3, 'Muslim': 4, 'Catholic': 5, 'Other': 6, "None": 7, 'Unknown': 8}

# asian['ethnicity'] = asian['ethnicity'].map(cut_ethnicity)
# asian['religion'] = asian['religion'].map(cut_religion)

# #  = cat
# #  = dog


# asian.head(30)

In [10]:
# 1st analysis
multi_model = LinearRegression()
properties = ['income','retired', 'regular_exercise', 'healthy_diet', 'cancer',
          'heart_disease', 'physical_checkup', 'close_friends', 'discrimination',
         'household_size', 'education_completed','english_difficuties', 'english_speaking']
multi_model.fit(asian[properties], asian['quality_of_life'])

coefs= multi_model.coef_

for i in range(len(properties)):
    print('Coefficient for', properties[i], ':', round(coefs[i], 2))

score = multi_model.score(asian[properties], asian['quality_of_life']) 

print('new r^2: {:.2f}'.format(score))

Coefficient for income : 0.09
Coefficient for retired : -0.21
Coefficient for regular_exercise : 0.38
Coefficient for healthy_diet : 0.32
Coefficient for cancer : -0.0
Coefficient for heart_disease : 0.0
Coefficient for physical_checkup : 0.21
Coefficient for close_friends : 0.15
Coefficient for discrimination : -0.38
Coefficient for household_size : 0.04
Coefficient for education_completed : 0.0
Coefficient for english_difficuties : -0.12
Coefficient for english_speaking : 0.47
new r^2: 0.23


In [11]:
# 

# ## initialize a KMeans object
# clustering = KMeans(n_clusters = 7, random_state = 10)

# ## execute the KMeans algorithm on the close_friends and quality_of_life columns
# clustering.fit(asian[['close_friends', 'quality_of_life']])

# clustering.cluster_centers_

In [12]:
# cluster_labels = clustering.labels_
# print('First 5 elements of the cluster_labels array {}'.format(cluster_labels[:5]))

# asian['cluster_label'] = cluster_labels
# asian.head()

# sns.scatterplot(x= asian['close_friends'],
#                 y = asian['quality_of_life'], 
#                 hue = asian['ethnicity'], 
#                 alpha = 0.6, 
#                 style = asian['cluster_label'],
#                 s = 80)

# plt.xlabel('Number of Close Friends')
# plt.ylabel('Quality of Life')
# plt.show()

In [13]:
# analysis 2: ratings of 5 or lower are coded 0 (unhappy), ratings above 5 are coded 1 (happy)

# .copy() gives an independent frame, which avoids pandas' SettingWithCopyWarning
# on the assignment below
logistic_data = asian[['quality_of_life']].copy()

# for i in logistic_data['quality_of_life']:
#     if i <= 5:
#         i = 0
#     else:
#         i = 1
        
logistic_data['quality_of_life'] = logistic_data['quality_of_life'].apply(lambda x: 1 if x > 5 else 0)
        
columns = asian[['income', 'retired', 'english_speaking', 'regular_exercise', 'healthy_diet', 'quality_of_life', 
           'close_friends', 'education_completed', 'physical_checkup', 'discrimination', 'ethnicity', 'religion']]

# would explain with intuition as well as how we chose these based on the correlation



# input_cols = list(columns.columns)

# input_cols.remove('quality_of_life')

input_cols = list(columns.columns)

input_cols.remove('quality_of_life')

# x = logistic_data.drop(['quality_of_life'], axis = 1)

# logistic_data[input_cols].values

for col in input_cols:   # get dummies for each predictor variable and add to df
    logistic_data = pd.concat(
        [logistic_data, 
            pd.get_dummies(columns[col], prefix=col)]
        , axis='columns')   
    
predictors = list(logistic_data.columns)
predictors.remove('quality_of_life')

target_model = LogisticRegression().fit(logistic_data[predictors].values, logistic_data['quality_of_life'])

for i, predictor in enumerate(predictors):
    print(f'{target_model.coef_[0, i]:.3f}\t{predictor}')
    
logistic_data.head()

-0.508	income_0
-0.633	income_1
0.115	income_2
-0.244	income_3
0.110	income_4
-0.655	income_5
0.412	income_6
0.571	income_7
0.831	income_8
0.160	retired_0.0
-0.160	retired_1.0
-0.457	english_speaking_0.0
-0.743	english_speaking_1.0
-0.308	english_speaking_2.0
0.549	english_speaking_3.0
0.958	english_speaking_4.0
-0.179	regular_exercise_0.0
0.179	regular_exercise_1.0
-0.293	healthy_diet_0.0
0.292	healthy_diet_1.0
-0.590	close_friends_0.0
-0.162	close_friends_1.0
-0.305	close_friends_2.0
0.049	close_friends_3.0
0.127	close_friends_4.0
0.881	close_friends_5.0
-0.191	education_completed_-1.0
0.382	education_completed_0.0
0.124	education_completed_2.0
-0.621	education_completed_3.0
0.221	education_completed_4.0
-0.076	education_completed_5.0
-0.421	education_completed_6.0
-0.374	education_completed_7.0
-0.711	education_completed_8.0
-0.452	education_completed_9.0
0.391	education_completed_10.0
0.559	education_completed_11.0
-0.173	education_completed_12.0
-0.286	education_completed_13.0
-0.


Unnamed: 0,quality_of_life,income_0,income_1,income_2,income_3,income_4,income_5,income_6,income_7,income_8,...,ethnicity_Unknown,ethnicity_Vietnamese,religion_Buddhist,religion_Catholic,religion_Hindu,religion_Muslim,religion_None,religion_Other,religion_Protestant,religion_Unknown
0,0,1,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,1
1,0,0,0,0,0,1,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
2,1,0,1,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
3,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
4,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0


In [14]:
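np.random.seed(2950)
# NOTE: the target passed here is asian['quality_of_life'], the raw rating,
# not the binarized logistic_data['quality_of_life'] that target_model was fit on,
# so the accuracy printed below is for predicting the exact rating.
scores = cross_val_score(
    LogisticRegression(), 
    logistic_data[predictors], 
    asian['quality_of_life'], 
    cv=5)

print(f'Mean cross-validated accuracy: {scores.mean():.3f}')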

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

Mean cross-validated accuracy: 0.251



In [15]:
print(logistic_data['ethnicity_Unknown'])

0       0
1       0
2       0
3       0
4       0
       ..
2604    0
2605    0
2606    0
2607    0
2608    0
Name: ethnicity_Unknown, Length: 2609, dtype: uint8


In [16]:
logistic_data.head()

Unnamed: 0,quality_of_life,income_0,income_1,income_2,income_3,income_4,income_5,income_6,income_7,income_8,...,ethnicity_Unknown,ethnicity_Vietnamese,religion_Buddhist,religion_Catholic,religion_Hindu,religion_Muslim,religion_None,religion_Other,religion_Protestant,religion_Unknown
0,0,1,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,1
1,0,0,0,0,0,1,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
2,1,0,1,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
3,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
4,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0


In [17]:
print(input_cols)

['income', 'retired', 'english_speaking', 'regular_exercise', 'healthy_diet', 'close_friends', 'education_completed', 'physical_checkup', 'discrimination', 'ethnicity', 'religion']


---
## Evaluation of significance
After performing a multiple linear regression analysis, we found an $r^2$ value of 0.23. We had hypothesized that our $r^2$ value would be at least 0.5, so the model explains less than we expected. We don't consider this a high enough $r^2$ value to call the model a strong fit. Interpreted directly, the $r^2$ value says that around 23% of the variation in quality of life is explained by the predictor variables that we included.
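For reference, the $r^2$ value discussed above is the coefficient of determination, which is the quantity scikit-learn's `LinearRegression.score` reports. The symbols below are our own notation: $y_i$ is the observed quality of life rating for participant $i$, $\hat{y}_i$ is the model's prediction, and $\bar{y}$ is the mean observed rating.

$$
r^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$

A value of 0.23 therefore means the model's squared prediction error is 23% smaller than the error from always predicting the mean rating.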

To evaluate the logistic model we created, we calculated the mean cross-validated accuracy using 5-fold cross-validation; we found this value to be 0.251. This is a relatively low cross-validation score: on average, the model classified about 25.1% of held-out observations correctly. Note that the cross-validation call scores predictions against the raw `quality_of_life` ratings rather than the binary happy/unhappy label, which is a harder task than the two-class problem the model above was fit on.
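To put a cross-validated accuracy in context, it can help to score the binary happy/unhappy label and compare against a majority-class baseline. The sketch below does this on synthetic data (the features `X` and the `rating` variable are generated, not drawn from the survey), using `DummyClassifier` as the baseline and a larger `max_iter` as an assumption to help the solver converge.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic features and a 0-10 rating that depends on the first feature.
rng = np.random.default_rng(2950)
n = 400
X = rng.normal(size=(n, 3))
rating = np.clip(np.round(5.5 + 1.5 * X[:, 0] + rng.normal(size=n)), 0, 10)

# Binarize the same way as our second analysis: above 5 is 1, otherwise 0.
y_binary = (rating > 5).astype(int)

# 5-fold cross-validated accuracy for the model and for a baseline that
# always predicts the most frequent class.
model_acc = cross_val_score(
    LogisticRegression(max_iter=1000), X, y_binary, cv=5).mean()
baseline_acc = cross_val_score(
    DummyClassifier(strategy="most_frequent"), X, y_binary, cv=5).mean()

print(f"model accuracy:    {model_acc:.3f}")
print(f"baseline accuracy: {baseline_acc:.3f}")
```

A model is only informative to the extent that its accuracy exceeds the baseline, so reporting both numbers makes a score easier to interpret.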

---

## INTERPRETATION AND CONCLUSIONS
### MULTIPLE LINEAR REGRESSION ANALYSIS
Our ${r^2}$ is 0.23. This means that 23% of the variation in quality of life can be explained by our set of explanatory variables. These explanatory variables include income level, retirement status, regular exercise, a healthy diet, physical check-ups, cancer, heart disease, number of close friends, experiences of discrimination, household size, education completed, and English-speaking ability and difficulty.

Each coefficient below is read holding the other predictors constant. For every unit increase in income level, quality of life tends to increase by 0.09 on average. Since this value is positive, Asian-Americans living in Austin, TX with a higher income tend to rate their quality of life higher.

For retired Asian-Americans, quality of life tends to be 0.21 lower on average. Since this value is negative, retired Asian-Americans living in Austin, TX tend to rate their quality of life lower.

For Asian-Americans who exercise regularly, quality of life tends to be 0.38 higher on average. Since this value is positive, Asian-Americans living in Austin, TX with a physically active lifestyle tend to rate their quality of life higher.

For Asian-Americans who keep a healthy diet, quality of life tends to be 0.32 higher on average. Since this value is positive, Asian-Americans living in Austin, TX with a healthy diet tend to rate their quality of life higher.

For Asian-Americans who reported having physical check-ups, quality of life tends to be 0.21 higher on average. Since this value is positive, Asian-Americans living in Austin, TX who get physical check-ups tend to rate their quality of life higher.

For every unit increase in the number of close friends, quality of life tends to increase by 0.15 on average. Since this value is positive, Asian-Americans living in Austin, TX with more close friends tend to rate their quality of life higher.

For Asian-Americans who have experienced discrimination, quality of life tends to be 0.38 lower on average. Since this value is negative, Asian-Americans living in Austin, TX who have experienced discrimination tend to rate their quality of life lower.

For every additional person in the household, quality of life tends to increase by 0.04 on average. Since this value is positive, Asian-Americans living in Austin, TX with more household members tend to rate their quality of life slightly higher.

For every unit increase in difficulty speaking English, quality of life tends to decrease by 0.12 on average. Since this value is negative, Asian-Americans living in Austin, TX who have more difficulty speaking English tend to give a lower quality of life rating.

For every unit increase in English speaking capability, quality of life tends to increase by 0.47 on average. Since this value is positive, Asian-Americans living in Austin, TX with higher English fluency tend to rate their quality of life higher; this is the largest positive coefficient in the model.

### LOGISTIC REGRESSION
Unhappy - quality of life rating of 5 or lower (coded 0)
Happy - quality of life rating higher than 5 (coded 1)
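The interpretations in this section read each logistic regression coefficient as an odds ratio, which is the exponential of the coefficient: a negative coefficient gives an odds ratio below 1, and a positive coefficient gives one above 1. The sketch below applies this to a few coefficients copied from the model output above; the dictionary name `coefs` is our own.

```python
import numpy as np

# A few coefficients copied from the printed logistic regression output.
coefs = {
    "income_0": -0.508,
    "income_8": 0.831,
    "english_speaking_4.0": 0.958,
    "retired_1.0": -0.160,
}

# The odds ratio for an indicator variable is exp(coefficient).
odds_ratios = {name: float(np.exp(beta)) for name, beta in coefs.items()}

for name, ratio in odds_ratios.items():
    direction = "above" if ratio > 1 else "below"
    print(f"{name}: odds ratio {ratio:.2f} ({direction} 1)")
```

An odds ratio below 1 corresponds to lower odds of a "happy" rating for that group, and an odds ratio above 1 to higher odds.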

Asian-Americans in Austin, TX who did not report their income had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$0 to \$9,999 income bracket had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$10,000 to \$19,999 income bracket had lower odds of being unhappy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$20,000 to \$29,999 income bracket had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$30,000 to \$39,999 income bracket had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$40,000 to \$49,999 income bracket had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$50,000 to \$59,999 income bracket had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$60,000 to \$69,999 income bracket had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who earned within the \$70,000 and above income bracket had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who are not retired had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who rated their English speaking capabilities higher than 2 had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who regularly exercise had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who go to physical check-ups had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who maintained a healthy diet had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX with more than 2 close friends had higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX who have experienced some form of discrimination had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Chinese-Americans in Austin, TX, compared with non-Chinese respondents, tend to have higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Indian-Americans in Austin, TX, compared with non-Indian respondents, tend to have higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Filipino-Americans in Austin, TX, compared with non-Filipino respondents, tend to have higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Korean-Americans in Austin, TX, compared with non-Korean respondents, tend to have higher odds of being happy, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Vietnamese-Americans in Austin, TX, compared with non-Vietnamese respondents, had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX belonging to smaller, unreported ethnicities had higher odds of being unhappy, because the odds ratio of a higher quality of life rating is less than 1 for this group. This may be because their cultural communities are smaller than those of other Asian ethnicities.

Asian-Americans in Austin, TX of Buddhist faith tend to have higher odds of being happy than non-Buddhists, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX of Catholic faith tend to have higher odds of being happy than non-Catholics, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX of Hindu faith tend to have higher odds of being happy than non-Hindus, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX of Muslim faith tend to have higher odds of being happy than non-Muslims, because the odds ratio of a higher quality of life rating is greater than 1 for this group.

Asian-Americans in Austin, TX of no faith tend to have lower odds of being happy than religious respondents, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX of Protestant faith tend to have lower odds of being happy than non-Protestants, because the odds ratio of a higher quality of life rating is less than 1 for this group.

Asian-Americans in Austin, TX who did not report a faith tend to have lower odds of being happy than those who reported their beliefs, because the odds ratio of a higher quality of life rating is less than 1 for this group.
