# Summary Of Questions And Answers

### 1. How many people live in Providence?

There are **179494 residents** living in the city of Providence in Rhode Island.

### 2. How many people who are at least 65 years old live in Providence?

There are **2738 people** who are at least 65 years old and live in the city of Providence.

### 3. How many people who are at least 65 years old live within one mile of Classical High School in Providence (called CLASSICAL in the data set)?

There are **533 people** who are at least 65 years old and live within 1 mile of Classical High School in Providence.

### 4. What is the mean distance a resident of Providence who is at least 65 years old lives from their nearest school?

The mean distance a resident of Providence who is at least 65 years old lives from their nearest school is about **0.325 miles**.


In [1]:
import numpy as np
import pandas as pd
from dfply import *
from plotnine import *

In [2]:
population = pd.read_csv("population_by_blockgroup.csv")
distances = pd.read_csv("distance_from_school_to_blockgroup_in_miles.csv")
codebook = pd.read_csv("census_codebook.csv")

In [4]:
population.sample(7)

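|  | state | county | tract | blockgroup | city | GEOID10 | B01001_001E | B01001_002E | B01001_003E | B01001_004E | ... | B01001_040E | B01001_041E | B01001_042E | B01001_043E | B01001_044E | B01001_045E | B01001_046E | B01001_047E | B01001_048E | B01001_049E |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 791 | 44 | 7 | 12000 | 4 | NORTH PROVIDENCE | 440070120004 | 602 | 301 | 0 | 15 | ... | 15 | 0 | 17 | 33 | 0 | 16 | 0 | 0 | 0 | 87 |
| 273 | 44 | 9 | 50600 | 3 | RICHMOND | 440090506003 | 2167 | 1144 | 40 | 46 | ... | 205 | 65 | 52 | 88 | 15 | 0 | 49 | 16 | 0 | 15 |
| 288 | 44 | 7 | 14600 | 1 | CRANSTON | 440070146001 | 1171 | 634 | 17 | 36 | ... | 35 | 44 | 16 | 0 | 0 | 0 | 59 | 15 | 0 | 15 |
| 511 | 44 | 7 | 11800 | 5 | NORTH PROVIDENCE | 440070118005 | 1484 | 751 | 58 | 14 | ... | 34 | 112 | 36 | 38 | 0 | 0 | 0 | 14 | 0 | 25 |
| 126 | 44 | 7 | 700 | 3 | PROVIDENCE | 440070007003 | 413 | 208 | 25 | 0 | ... | 0 | 14 | 15 | 0 | 12 | 0 | 39 | 6 | 13 | 9 |
| 614 | 44 | 7 | 12401 | 4 | JOHNSTON | 440070124014 | 1553 | 941 | 57 | 45 | ... | 69 | 63 | 0 | 36 | 15 | 0 | 12 | 33 | 29 | 0 |
| 476 | 44 | 7 | 15500 | 2 | PAWTUCKET | 440070155002 | 1248 | 758 | 37 | 27 | ... | 10 | 14 | 9 | 21 | 0 | 0 | 10 | 11 | 0 | 43 |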


In [8]:
distances.head(8)

|  | GEOID10 | school_short_name | distance |
|---|---|---|---|
| 0 | 440010301001 | BAILEY | 5.330176 |
| 1 | 440010301001 | CARNEVALE | 8.040659 |
| 2 | 440010301001 | D’ABATE | 7.418675 |
| 3 | 440010301001 | FEINSTEIN AT BROAD | 4.181948 |
| 4 | 440010301001 | FEINSTEIN AT SACKETT | 5.093149 |
| 5 | 440010301001 | FOGARTY | 5.097623 |
| 6 | 440010301001 | FORTES | 5.882528 |
| 7 | 440010301001 | GREGORIAN | 5.360098 |


In [6]:
codebook.head(10)

|  | variable_name | sex | min_age | max_age |
|---|---|---|---|---|
| 0 | B01001_001E | total | 0 | 255 |
| 1 | B01001_002E | male | 0 | 255 |
| 2 | B01001_003E | male | 0 | 4 |
| 3 | B01001_004E | male | 5 | 9 |
| 4 | B01001_005E | male | 10 | 14 |
| 5 | B01001_006E | male | 15 | 17 |
| 6 | B01001_007E | male | 18 | 19 |
| 7 | B01001_008E | male | 20 | 20 |
| 8 | B01001_009E | male | 21 | 21 |
| 9 | B01001_010E | male | 22 | 24 |


I see a max_age of 255, which cannot be a real age. Because it appears on the rows that cover all ages (B01001_001E and B01001_002E), it is probably a placeholder for "no upper limit" rather than a data input error.
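
If 255 really is a placeholder for an open-ended bracket (an assumption; the codebook shown here does not say so), one way to make that explicit is to blank it out before doing any arithmetic on max_age. A minimal sketch in plain pandas, using a made-up three-row codebook and a hypothetical max_age_clean column:

```python
import pandas as pd

# Toy stand-in for census_codebook.csv (illustrative rows only).
toy_codebook = pd.DataFrame({
    "variable_name": ["B01001_001E", "B01001_003E", "B01001_004E"],
    "sex": ["total", "male", "male"],
    "min_age": [0, 0, 5],
    "max_age": [255, 4, 9],
})

# Assumption: 255 means "no upper limit", so replace it with a missing value.
OPEN_ENDED = 255
toy_codebook["max_age_clean"] = toy_codebook["max_age"].mask(
    toy_codebook["max_age"] == OPEN_ENDED
)
print(toy_codebook)
```

The original max_age column is left untouched, so nothing downstream that relies on the raw value changes.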

In [7]:
codebook.variable_name.unique()

array(['B01001_001E', 'B01001_002E', 'B01001_003E', 'B01001_004E',
       'B01001_005E', 'B01001_006E', 'B01001_007E', 'B01001_008E',
       'B01001_009E', 'B01001_010E', 'B01001_011E', 'B01001_012E',
       'B01001_013E', 'B01001_014E', 'B01001_015E', 'B01001_016E',
       'B01001_017E', 'B01001_018E', 'B01001_019E', 'B01001_020E',
       'B01001_021E', 'B01001_022E', 'B01001_023E', 'B01001_024E',
       'B01001_025E', 'B01001_026E', 'B01001_027E', 'B01001_028E',
       'B01001_029E', 'B01001_030E', 'B01001_031E', 'B01001_032E',
       'B01001_033E', 'B01001_034E', 'B01001_035E', 'B01001_036E',
       'B01001_037E', 'B01001_038E', 'B01001_039E', 'B01001_040E',
       'B01001_041E', 'B01001_042E', 'B01001_043E', 'B01001_044E',
       'B01001_045E', 'B01001_046E', 'B01001_047E', 'B01001_048E',
       'B01001_049E'], dtype=object)

These unique values differ only in the characters after the underscore \_; everything before the underscore is the same.
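
That observation can be checked mechanically by splitting each name at the underscore. The sketch below uses plain pandas on three names copied from the output above; the group names table, number and suffix are labels chosen here for illustration, not terms from the data set.

```python
import pandas as pd

names = pd.Series(["B01001_001E", "B01001_020E", "B01001_049E"])

# Split each name into the part before the underscore, the running
# number after it, and the trailing letter.
parts = names.str.extract(
    r"^(?P<table>[A-Z]\d+)_(?P<number>\d+)(?P<suffix>[A-Z])$"
)
parts["number"] = parts["number"].astype(int)
print(parts)

# True when every name shares the same prefix before the underscore.
same_prefix = parts["table"].nunique() == 1
print(same_prefix)
```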

# 1. How many people live in Providence?

In [11]:

(
    codebook >>
    mask(X.sex == "total")
)

|  | variable_name | sex | min_age | max_age |
|---|---|---|---|---|
| 0 | B01001_001E | total | 0 | 255 |


The variable B01001_001E counts everybody, of all sexes and all ages.

In [24]:
(
    population >>

    # Keep only the block groups in the city of Providence
    mask(X.city == "PROVIDENCE") >>
    group_by(X.city) >>

    # B01001_001E is the total count across all sexes and ages
    summarize(totalPopulation = X.B01001_001E.sum())
)

|  | city | totalPopulation |
|---|---|---|
| 0 | PROVIDENCE | 179494 |


There are 179494 residents living in the city of Providence in Rhode Island.

# 2. How many people who are at least 65 years old live in Providence?

In [21]:
(
    codebook >>
    mask((X.min_age == 65))
)

|  | variable_name | sex | min_age | max_age |
|---|---|---|---|---|
| 19 | B01001_020E | male | 65 | 66 |
| 43 | B01001_044E | female | 65 | 66 |


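The variables whose age bracket starts at exactly 65 are B01001_020E (male) and B01001_044E (female). As max_age shows, both cover ages 65 to 66 only. Brackets that start above 65 are not matched by the min_age == 65 filter, so a complete count of residents aged 65 or older would also need those columns; the totals below use these two variables only.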
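
A filter on min_age >= 65 would pick up every bracket from 65 upward instead of only the one that starts at 65. The sketch below shows the idea in plain pandas rather than dfply, on toy data: the rows, counts and the names senior_columns and senior_total are made up for illustration, and nothing here recomputes the notebook's result.

```python
import pandas as pd

# Toy codebook: illustrative rows only, not the real file.
toy_codebook = pd.DataFrame({
    "variable_name": ["B01001_001E", "B01001_019E", "B01001_020E", "B01001_021E"],
    "sex": ["total", "male", "male", "male"],
    "min_age": [0, 62, 65, 67],
    "max_age": [255, 64, 66, 69],
})

# Toy population table with one row per block group (made-up counts).
toy_population = pd.DataFrame({
    "city": ["PROVIDENCE", "PROVIDENCE", "CRANSTON"],
    "B01001_001E": [100, 200, 300],
    "B01001_019E": [5, 6, 7],
    "B01001_020E": [1, 2, 3],
    "B01001_021E": [10, 20, 30],
})

# Every non-total variable whose bracket starts at 65 or later.
senior_columns = toy_codebook.loc[
    (toy_codebook["min_age"] >= 65) & (toy_codebook["sex"] != "total"),
    "variable_name",
].tolist()

# Sum those columns over the Providence block groups only.
in_providence = toy_population["city"] == "PROVIDENCE"
senior_total = int(toy_population.loc[in_providence, senior_columns].to_numpy().sum())
print(senior_columns, senior_total)
```

Driving the column list from the codebook avoids hard-coding variable names, at the cost of trusting that min_age is filled in correctly for every row.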

In [36]:
(
    population >>

    # Keep only the block groups in the city of Providence
    mask(X.city == "PROVIDENCE") >>

    # Group by city to prepare to aggregate
    group_by(X.city) >>

    # Add the male (B01001_020E) and female (B01001_044E) counts
    summarize(totalPopulation = X.B01001_020E.sum() + X.B01001_044E.sum())
)

|  | city | totalPopulation |
|---|---|---|
| 0 | PROVIDENCE | 2738 |


There are 2738 people who are at least 65 years old and live in the city of Providence.

# 3. How many people who are at least 65 years old live within one mile of Classical High School in Providence (called CLASSICAL in the data set)?

In [46]:
# All block groups that lie within 1 mile of Classical High School
nearbyBlocks = (
    distances >>
    mask((X.school_short_name == "CLASSICAL") & (X.distance <= 1))
).GEOID10.unique()
nearbyBlocks

array([440070003001, 440070003002, 440070003003, 440070003005,
       440070004001, 440070004002, 440070006002, 440070007001,
       440070007002, 440070007003, 440070008001, 440070008002,
       440070008003, 440070009001, 440070009002, 440070010001,
       440070010002, 440070011001, 440070011002, 440070011003,
       440070012001, 440070012002, 440070012003, 440070013001,
       440070013002, 440070013003, 440070013004, 440070014001,
       440070014002, 440070014003, 440070019001, 440070025002],
      dtype=int64)

In [63]:
# Flag whether each block group is within 1 mile of Classical High School
population["nearClassical"] = population.GEOID10.isin(nearbyBlocks)

(
    population >>
    mask(X.nearClassical, X.city == "PROVIDENCE") >>
    group_by(X.city) >>
    summarize(totalElderPopulation = X.B01001_020E.sum() + X.B01001_044E.sum())
)

|  | city | totalElderPopulation |
|---|---|---|
| 0 | PROVIDENCE | 533 |


There are 533 people who are at least 65 years old and live within 1 mile of Classical High School in Providence.
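
An alternative to adding a flag column is to join the filtered distance rows onto the population table, so that only block groups near the school survive the merge. This is a sketch in plain pandas on toy data; the GEOID10 values, distances and counts are invented, and seniors_near is a name chosen here for illustration.

```python
import pandas as pd

# Toy stand-ins for the two tables (values are made up).
toy_distances = pd.DataFrame({
    "GEOID10": [1, 1, 2, 2, 3, 3],
    "school_short_name": ["CLASSICAL", "BAILEY"] * 3,
    "distance": [0.4, 2.0, 0.9, 0.3, 1.5, 0.2],
})
toy_population = pd.DataFrame({
    "GEOID10": [1, 2, 3],
    "city": ["PROVIDENCE"] * 3,
    "B01001_020E": [4, 5, 6],
    "B01001_044E": [1, 2, 3],
})

# Block groups within 1 mile of CLASSICAL, one row per GEOID10.
near = (
    toy_distances
    .query("school_short_name == 'CLASSICAL' and distance <= 1")[["GEOID10"]]
    .drop_duplicates()
)

# The inner join keeps only the population rows whose GEOID10 is in `near`.
joined = toy_population.merge(near, on="GEOID10", how="inner")
seniors_near = int((joined["B01001_020E"] + joined["B01001_044E"]).sum())
print(seniors_near)
```

The merge leaves the population table unmodified, which can be convenient when the same table is reused for later questions.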

# 4. What is the mean distance a resident of Providence who is at least 65 years old lives from their nearest school?

In [93]:
(
    population >>

    # Want: block groups in Providence that have at least 1 resident,
    # male or female, who is at least 65 years old
    mask((X.city == "PROVIDENCE") & ((X.B01001_020E > 0) | (X.B01001_044E > 0))) >>
    select(X.city, X.GEOID10, X.B01001_020E, X.B01001_044E)
)

|  | city | GEOID10 | B01001_020E | B01001_044E |
|---|---|---|---|---|
| 3 | PROVIDENCE | 440070023005 | 0 | 6 |
| 9 | PROVIDENCE | 440070027001 | 21 | 24 |
| 11 | PROVIDENCE | 440070023003 | 0 | 9 |
| 17 | PROVIDENCE | 440070013002 | 23 | 0 |
| 20 | PROVIDENCE | 440070031003 | 0 | 42 |
| ... | ... | ... | ... | ... |
| 766 | PROVIDENCE | 440070009002 | 13 | 0 |
| 769 | PROVIDENCE | 440070005002 | 0 | 29 |
| 789 | PROVIDENCE | 440070022003 | 11 | 9 |
| 801 | PROVIDENCE | 440070001023 | 17 | 18 |


In [104]:
# All block groups in Providence that have at least 1 resident,
# male or female, who is at least 65 years old
blocksInProvidence = (
    population >>

    # Keep block groups in Providence with a nonzero count in
    # B01001_020E (male) or B01001_044E (female)
    mask((X.city == "PROVIDENCE") & ((X.B01001_020E > 0) | (X.B01001_044E > 0)))
).GEOID10.unique()
len(blocksInProvidence)

99

There are 99 block groups in Providence that have at least 1 resident, male or female, who is at least 65 years old.

In [119]:
# Flag the distance rows whose block group is in the list built above
distances["isBlocksInProvidenceOld"] = distances.GEOID10.isin(blocksInProvidence)

(
    distances >>

    # Keep the block groups that are in Providence and have at least 1 older resident
    mask(X.isBlocksInProvidenceOld) >>
    group_by(X.GEOID10) >>
    # Distance from each block group to its nearest school
    summarize(shortestDistance = X.distance.min()) >>
    ungroup() >>
    # Unweighted mean of those nearest-school distances
    summarize(avgShortestDistance = X.shortestDistance.mean())
)

|  | avgShortestDistance |
|---|---|
| 0 | 0.324755 |


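The mean distance a resident of Providence who is at least 65 years old lives from their nearest school is about 0.325 miles. This is an average over block groups in which each block group counts once, regardless of how many such residents live there.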
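
The calculation above gives every block group the same weight. If the quantity wanted is the mean per resident, one option is to weight each block group's nearest-school distance by the number of older residents it contains. The sketch below does this in plain pandas and NumPy on toy data; the numbers and the seniors column are invented for illustration, and the real figure is not recomputed here.

```python
import numpy as np
import pandas as pd

# Toy distances: two block groups, two schools each (made-up values).
toy_distances = pd.DataFrame({
    "GEOID10": [1, 1, 2, 2],
    "school_short_name": ["A", "B", "A", "B"],
    "distance": [0.2, 0.5, 1.0, 0.8],
})
# Hypothetical count of residents aged 65+ per block group.
toy_population = pd.DataFrame({
    "GEOID10": [1, 2],
    "seniors": [30, 10],
})

# Nearest-school distance for each block group.
nearest = toy_distances.groupby("GEOID10", as_index=False)["distance"].min()
merged = nearest.merge(toy_population, on="GEOID10")

# Each block group counts once.
unweighted = merged["distance"].mean()
# Each block group counts in proportion to its older residents.
weighted = np.average(merged["distance"], weights=merged["seniors"])
print(unweighted, weighted)
```

In this toy case the weighted mean is lower than the unweighted one because the block group with more older residents sits closer to a school; on real data the two can differ in either direction.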