We got an email from [edsci.feedback@census.gov] that told us we can find 2000 and 2010 PUMAs as column in the 2008-2012 5 year estimates. This notebook will try this approach for PUMA 4001

In [1]:
from ingest.PUMS_request import make_GET_request
url_2000_PUMAs = 'https://api.census.gov/data/2012/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&PUMA00=4001&ucgid=0400000US36'
PUMS_2000_PUMAs = make_GET_request(url_2000_PUMAs, 'test PUMA00 column')

Exception: error making GET request for test PUMA00 column: There was an error while running your query.  We've logged the error and we'll correct it ASAP.  Sorry for the inconvenience.

In [None]:
PUMS_2000_PUMAs['PWGTP'] = PUMS_2000_PUMAs['PWGTP'].astype(int)
old_geos_count = PUMS_2000_PUMAs['PWGTP'].sum()
old_geos_count

113734

Around 114k people from 2012 5-year estimates in NYS have PUMA00 of 4001 

In [None]:
url_2010_PUMAs = 'https://api.census.gov/data/2012/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&PUMA10=4001&ucgid=0400000US36'
PUMS_2010_PUMAs = make_GET_request(url_2010_PUMAs, 'test PUMA10 column')

In [None]:
PUMS_2010_PUMAs['PWGTP'] = PUMS_2010_PUMAs['PWGTP'].astype(int)
new_geos_count = PUMS_2010_PUMAs['PWGTP'].sum()
new_geos_count


29825

Around 30k people from 2012 5-year estimates in NYS have PUMA10 of 4001

I would assume that 1/5th of all records in 2012 5-year estimates come from 2012, does this hold up?

In [None]:
new_geos_count/(old_geos_count+new_geos_count)

0.20775430310882634

Yea looks good

Next compare to 2015-2019 5 year estimates 

In [None]:
url_2019_PUMAs = 'https://api.census.gov/data/2019/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&ucgid=7950000US3604001'
PUMS_2019_PUMAs = make_GET_request(url_2019_PUMAs, 'use existing query for comparison')
PUMS_2019_PUMAs['PWGTP'] = PUMS_2019_PUMAs['PWGTP'].astype(int)

In [None]:
estimates_2012_total= old_geos_count+new_geos_count
print(f'records in 2019 5-year estimates PUMA 4001 {PUMS_2019_PUMAs["PWGTP"].sum()}')
print(f'records in 2012 5-year estimates PUMA 4001 {estimates_2012_total}')
print(f'{round((PUMS_2019_PUMAs["PWGTP"].sum()-estimates_2012_total)/(estimates_2012_total), 3)*100} % increase 2012-2019')

records in 2019 5-year estimates PUMA 4001 162630
records in 2012 5-year estimates PUMA 4001 143559
13.3 % increase 2012-2019


Close enough to pass smell test to me.

### Check Additional PUMAs for Sanity Check 

We want to take a look at a few different PUMAs for the 2000 and 2010 PUMAs and see if they might differ from expected values (using 3702)

In [None]:
#### Check different PUMA's - try 3702 

url_2000_PUMAs_3702 = 'https://api.census.gov/data/2012/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&PUMA00=3702&ucgid=0400000US36'
PUMS_2000_PUMAs_3702 = make_GET_request(url_2000_PUMAs_3702, 'test PUMA00 column')


In [None]:
PUMS_2000_PUMAs_3702.head()

Unnamed: 0,SERIALNO,SPORDER,PWGTP,PUMA00,ST
0,2008000012871,1,15,3702,36
1,2008000012871,2,22,3702,36
2,2008000012871,3,19,3702,36
3,2008000012871,4,15,3702,36
4,2008000023128,1,14,3702,36


In [None]:
PUMS_2000_PUMAs_3702['PWGTP'] = PUMS_2000_PUMAs_3702['PWGTP'].astype(int)
old_geos_count_3702 = PUMS_2000_PUMAs_3702['PWGTP'].sum()
old_geos_count_3702

115273

In [None]:
url_2010_PUMAs_3702 = 'https://api.census.gov/data/2012/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&PUMA10=3702&ucgid=0400000US36'
PUMS_2010_PUMAs_3702 = make_GET_request(url_2010_PUMAs, 'test PUMA10 column')

In [None]:
PUMS_2010_PUMAs_3702['PWGTP'] = PUMS_2010_PUMAs_3702['PWGTP'].astype(int)
new_geos_count_3702 = PUMS_2010_PUMAs_3702['PWGTP'].sum()
new_geos_count_3702

29825

In [None]:
new_geos_count_3702/(old_geos_count_3702+new_geos_count_3702)

0.20555073122992737

In [None]:
url_2019_PUMAs_3702 = 'https://api.census.gov/data/2019/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&ucgid=7950000US3603702'
PUMS_2019_PUMAs_3702 = make_GET_request(url_2019_PUMAs_3702, 'use existing query for comparison')
PUMS_2019_PUMAs_3702['PWGTP'] = PUMS_2019_PUMAs_3702['PWGTP'].astype(int)

In [None]:
PUMS_2019_PUMAs_3702.head()

Unnamed: 0,SERIALNO,SPORDER,PWGTP,PUMA,ST
0,2015000001904,1,20,3702,36
1,2015000001904,2,25,3702,36
2,2015000006256,1,32,3702,36
3,2015000006256,2,32,3702,36
4,2015000007115,1,18,3702,36


In [None]:
estimates_2012_total_3702 = old_geos_count_3702+new_geos_count_3702
print(f'records in 2019 5-year estimates PUMA 3702 {PUMS_2019_PUMAs_3702["PWGTP"].sum()}')
print(f'records in 2012 5-year estimates PUMA 3702 {estimates_2012_total_3702}')
print(f'{round((PUMS_2019_PUMAs_3702["PWGTP"].sum()-estimates_2012_total_3702)/(estimates_2012_total_3702), 3)*100} % increase 2012-2019')

records in 2019 5-year estimates PUMA 3702 151184
records in 2012 5-year estimates PUMA 3702 145098
4.2 % increase 2012-2019


In [None]:
#### Check a puma in Queens - 4101

url_2000_PUMAs_4101 = 'https://api.census.gov/data/2012/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&PUMA00=4101&ucgid=0400000US36'
PUMS_2000_PUMAs_4101 = make_GET_request(url_2000_PUMAs_4101, 'test PUMA00 column')


In [None]:
PUMS_2000_PUMAs_4101.head()

NameError: name 'PUMS_2000_PUMAs_4101' is not defined

Check multiple pumas

In [None]:
url_2000_PUMAs = 'https://api.census.gov/data/2012/acs/acs5/pums?get=SERIALNO,SPORDER,PWGTP&PUMA00&PUMA10=3702,3703,3704&ucgid=0400000US36'
PUMS_2000_PUMAs_bx = make_GET_request(url_2000_PUMAs, 'test PUMA00 column')

In [None]:
PUMS_2000_PUMAs_bx.head(20)

Unnamed: 0,SERIALNO,SPORDER,PWGTP,PUMA00,PUMA10,ST
0,2012000000168,1,23,-9,3702,36
1,2012000000168,2,43,-9,3702,36
2,2012000000317,1,11,-9,3704,36
3,2012000007396,1,1,-9,3703,36
4,2012000007721,1,7,-9,3704,36
5,2012000012852,1,37,-9,3702,36
6,2012000012852,2,39,-9,3702,36
7,2012000012852,3,40,-9,3702,36
8,2012000012852,4,34,-9,3702,36
9,2012000012852,5,34,-9,3702,36


In [None]:
#### Pick up tomorrow - make sure to push changes to ipnyb and double check population of the puma by summing weights 




#### j