# Tesco Creative Extension Project

In [53]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [187]:
# Load the Tesco grocery data and the ward-level datasets
ward_tesco = pd.read_csv('data/tesco/year_osward_grocery.csv')
ward_crime = pd.read_csv('data/crime_wards.csv', header=2)
ward_demographics = pd.read_csv('data/demographics_ward.csv', header=2)
ward_education = pd.read_csv('data/education_ward.csv', header=2)
ward_environment = pd.read_csv('data/environment_ward.csv', header=2)
ward_property = pd.read_csv('data/property_wards.csv', header=2)
ward_total_wellbeing = pd.read_csv('data/total_stats_ward.csv')  # serves (loosely) as a validation set

In [188]:
# Drop rows with missing values; for demographics, drop incomplete columns instead (axis=1)
ward_crime = ward_crime.dropna()
ward_demographics = ward_demographics.dropna(axis=1)
ward_education = ward_education.dropna()
ward_property = ward_property.dropna()
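
Before dropping anything, it can help to see how much each variant of `dropna` would remove. The following is a minimal sketch on a made-up table; the column names `rate_2014`, `rate_2015` and `notes` are hypothetical and only stand in for the real ward columns.

```python
import numpy as np
import pandas as pd

# Toy stand-in for one of the ward tables (all values are made up).
toy = pd.DataFrame({
    "New Code": ["E05000026", "E05000027", "E05000028"],
    "rate_2014": [10.0, np.nan, 12.0],
    "rate_2015": [11.0, 9.0, 13.0],
    "notes": [np.nan, np.nan, np.nan],  # an entirely empty column
})

# Share of missing values per column.
missing_share = toy.isna().mean()

# dropna() removes every row that has at least one NaN.
rows_kept = toy.dropna().shape[0]
# dropna(axis=1) removes every column that has at least one NaN.
cols_kept = toy.dropna(axis=1).shape[1]

print(missing_share)
print(f"rows kept: {rows_kept} of {len(toy)}")
print(f"columns kept: {cols_kept} of {toy.shape[1]}")
```

In this toy case a single empty column is enough for the row-wise `dropna()` to discard every ward, which is the kind of surprise worth checking for on the real files.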

We need to create 11 indicators, plus one based on the food data. Here is the list:
* Housing
* Income
* Jobs
* Community
* Education
* Environment
* Civic Engagement
* Health
* Life Satisfaction
* Safety
* Work-Life Balance
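
The notebook does not yet fix how the raw measures become comparable scores. One possible approach, sketched here under the assumption of min-max scaling and equal weights, is to rescale each measure to the range 0 to 1 and then average. The helper `min_max` and the toy columns `employment_rate` and `crime_rate` are hypothetical.

```python
import pandas as pd

def min_max(series: pd.Series, higher_is_better: bool = True) -> pd.Series:
    """Rescale a series to [0, 1]; flip it when lower raw values are better."""
    span = series.max() - series.min()
    if span == 0:
        # All wards identical: return a neutral score instead of dividing by zero.
        return pd.Series(0.5, index=series.index)
    scaled = (series - series.min()) / span
    return scaled if higher_is_better else 1 - scaled

# Made-up values for three wards.
toy = pd.DataFrame(
    {"employment_rate": [60.0, 70.0, 80.0], "crime_rate": [50.0, 100.0, 150.0]},
    index=["ward_a", "ward_b", "ward_c"],
)

scores = pd.DataFrame({
    "jobs": min_max(toy["employment_rate"]),
    "safety": min_max(toy["crime_rate"], higher_is_better=False),
})
# Equal-weight average across indicators (an assumption, not a fixed choice).
scores["overall"] = scores.mean(axis=1)
print(scores)
```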

To create our indicators, we need the underlying measures for each category. Let's go step by step:

**Housing**:
- average number of rooms per person, which gives an idea of how densely packed living conditions are
- access to an indoor private flushing toilet
- a measure of housing expenditure: the ratio of housing costs to household gross adjusted disposable income

In [194]:
ward_tesco.columns

Index(['area_id', 'weight', 'weight_perc2.5', 'weight_perc25', 'weight_perc50',
       'weight_perc75', 'weight_perc97.5', 'weight_std', 'weight_ci95',
       'volume',
       ...
       'man_day', 'population', 'male', 'female', 'age_0_17', 'age_18_64',
       'age_65+', 'avg_age', 'area_sq_km', 'people_per_sq_km'],
      dtype='object', length=202)

In [190]:
# We want a dataset with only the latest data available (the Tesco data is from 2015)
column_names = ["New Code","Names","All Household spaces - 2011 Census",
                               "Household composition - 2011 Census All Households",
                               "Household composition - 2011 Census Couple household with dependent children",
                               "Household composition - 2011 Census Couple household without dependent children",
                               "Household composition - 2011 Census Lone parent household",
                               "Household composition - 2011 Census One person household",
                               "Household composition - 2011 Census Other multi person household",
                               "Accomodation Type - 2011 Census Whole house or bungalow: Detached",
                               "Accomodation Type - 2011 Census Whole house or bungalow: Semi-detached",
                               "Accomodation Type - 2011 Census Whole house or bungalow: Terraced",
                               "Accomodation Type - 2011 Census Flat, maisonette or apartment"]
housing = pd.DataFrame(data=ward_demographics, 
                       columns=column_names)
housing.rename(columns= {"New Code":"area_id"}, inplace=True)
column_names[0] = "area_id"
# Also keep population, area (sq km), and density from the Tesco data:
column_names.append("population")
column_names.append("area_sq_km")
column_names.append("people_per_sq_km")

In [191]:
housing = housing.merge(ward_tesco, on='area_id', how='inner')

In [192]:
housing = housing[column_names] 
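
The inner merge above silently drops any ward that is missing from either table. A quick way to check for that, sketched here on toy data, is a left merge with `indicator=True`, which adds a `_merge` column, plus `validate="one_to_one"`, which raises if `area_id` is duplicated on either side. The second and third ward IDs and the second population value are made up.

```python
import pandas as pd

# Toy stand-ins for the demographics and Tesco tables.
left = pd.DataFrame({
    "area_id": ["E05000026", "E05000027", "E05999999"],  # last ID is hypothetical
    "Names": ["Abbey", "Ward B", "Ward C"],
})
right = pd.DataFrame({
    "area_id": ["E05000026", "E05000027"],
    "population": [14370.0, 10000.0],
})

# Keep all rows from the left table and record where each row came from.
check = left.merge(right, on="area_id", how="left",
                   indicator=True, validate="one_to_one")

# Wards present in the left table but absent from the right one.
unmatched = check.loc[check["_merge"] == "left_only", "area_id"].tolist()
print(unmatched)
```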

In [193]:
housing.loc[housing['area_id'] == "E05000026"]

| | area_id | Names | All Household spaces - 2011 Census | Household composition - 2011 Census All Households | Household composition - 2011 Census Couple household with dependent children | Household composition - 2011 Census Couple household without dependent children | Household composition - 2011 Census Lone parent household | Household composition - 2011 Census One person household | Household composition - 2011 Census Other multi person household | Accomodation Type - 2011 Census Whole house or bungalow: Detached | Accomodation Type - 2011 Census Whole house or bungalow: Semi-detached | Accomodation Type - 2011 Census Whole house or bungalow: Terraced | Accomodation Type - 2011 Census Flat, maisonette or apartment | population | area_sq_km | people_per_sq_km |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | E05000026 | Abbey | 4753 | 4572 | 953 | 714 | 648 | 1163 | 1094 | 183 | 341 | 1076 | 3153 | 14370.0 | 1.26 | 11404.761905 |
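
The table has no rooms-per-person column, so a crowding proxy has to be derived from what is available. As a rough sketch, and purely as an assumption about which proxies are useful, we could compute persons per household and the share of flats among household spaces. The derived column names `persons_per_household` and `flat_share` are hypothetical; the input values are the Abbey row shown above.

```python
import pandas as pd

hh_col = "Household composition - 2011 Census All Households"
spaces_col = "All Household spaces - 2011 Census"
flat_col = "Accomodation Type - 2011 Census Flat, maisonette or apartment"

# Single-row stand-in for `housing`, using the Abbey values from the output above.
toy_housing = pd.DataFrame({
    "area_id": ["E05000026"],
    hh_col: [4572],
    spaces_col: [4753],
    flat_col: [3153],
    "population": [14370.0],
})

# Proxy for crowding: residents per household (not the same as rooms per person).
toy_housing["persons_per_household"] = toy_housing["population"] / toy_housing[hh_col]
# Share of household spaces that are flats, maisonettes or apartments.
toy_housing["flat_share"] = toy_housing[flat_col] / toy_housing[spaces_col]

print(toy_housing[["area_id", "persons_per_household", "flat_share"]])
```

Note that the population comes from the Tesco data while the household counts come from the 2011 Census, so the ratio mixes two reference years.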


**Income**
Link: http://www.oecdbetterlifeindex.org/topics/income/
* income
* household net wealth: average total value of household assets (savings, stocks) minus liabilities (loans)
* household net adjusted disposable income

In [173]:
#code here

**Jobs**
Link: http://www.oecdbetterlifeindex.org/topics/jobs/
* job security: expected loss of earnings when someone becomes unemployed
* personal earnings
* long-term unemployment rate (people who have been unemployed and actively searching for a job for 12 months or more)
* employment rate

In [174]:
#code here

**Community**
Link: http://www.oecdbetterlifeindex.org/topics/community/
* community
* quality of support network: how much you can rely on friends. We should perhaps replace this with an indicator of community diversity, for example ethnic-group diversity and religious diversity
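
If we go with a diversity indicator, one candidate is the Simpson diversity index, 1 minus the sum of squared group shares, which is 0 when everyone belongs to one group and grows as groups become more evenly mixed. This is only a sketch of that option; the helper `simpson_diversity` and the columns `group_a` and `group_b` are hypothetical placeholders for the real ethnic-group or religion counts.

```python
import pandas as pd

def simpson_diversity(counts: pd.DataFrame) -> pd.Series:
    """Return 1 - sum(p_i^2) per row, where p_i is each group's share of the row total."""
    shares = counts.div(counts.sum(axis=1), axis=0)
    return 1 - (shares ** 2).sum(axis=1)

# Made-up group counts for two wards.
toy = pd.DataFrame(
    {"group_a": [100, 50], "group_b": [0, 50]},
    index=["ward_a", "ward_b"],
)

diversity = simpson_diversity(toy)
print(diversity)
```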

In [175]:
#code here

**Education**
Link: http://www.oecdbetterlifeindex.org/topics/education/
* years in education
* student skills: average student performance, here GCSE results or a similar measure
* educational attainment: percentage of people aged 25 to 64 with at least an upper-secondary education

In [177]:
#code here

**Environment**
Link: http://www.oecdbetterlifeindex.org/topics/environment/
* water quality
* air pollution, measured as PM2.5
* in addition: access to parks and green spaces

In [178]:
#code here

**Civic Engagement**
Link: http://www.oecdbetterlifeindex.org/topics/civic-engagement/
* voter turnout in the latest elections
* stakeholder engagement for developing regulations (might be hard to measure)

In [179]:
#code here

**Health**
Link: http://www.oecdbetterlifeindex.org/topics/health/
* life expectancy
* self-reported health (hard to measure)
* we could also include ambulance data or a similar measure

In [180]:
#code here

**Life Satisfaction**
Link: http://www.oecdbetterlifeindex.org/topics/life-satisfaction/
* life satisfaction: how satisfied are you with your life? We have data for that.

In [181]:
#code here

**Safety**
Link: http://www.oecdbetterlifeindex.org/topics/safety/
* homicide rate
* feeling safe walking alone at night (self-reported); we can use burglaries or a similar crime measure as a proxy
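
Raw crime counts are not comparable between wards of different sizes, so a proxy such as burglaries would need to be expressed as a rate. A minimal sketch, assuming a per-1,000-residents rate and using a hypothetical `burglary_count` column with made-up values:

```python
import pandas as pd

# Made-up counts and populations for two wards.
toy = pd.DataFrame({
    "area_id": ["ward_a", "ward_b"],
    "burglary_count": [30, 90],
    "population": [15000.0, 10000.0],
})

# Burglaries per 1,000 residents.
toy["burglary_rate_per_1000"] = 1000 * toy["burglary_count"] / toy["population"]
print(toy[["area_id", "burglary_rate_per_1000"]])
```

Here `ward_b` has three times the count of `ward_a` but 4.5 times the rate, because its population is smaller.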

In [182]:
# code here

**Work-Life Balance**
Link: http://www.oecdbetterlifeindex.org/topics/work-life-balance/
* time devoted to leisure and personal care
* employees working very long hours

In [183]:
#code here