percentage estimate set to 100 #15

SPTKL · 2019-12-18T19:11:11Z

With a as (
select variable, count(*) as count from pff_social."Y2013-2017-old" 
where p = 100
GROUP BY variable),
b as (select variable, count(*) as count from pff_social."Y2013-2017" 
where p = 100
GROUP BY variable)
SELECT a.variable, a.count as old_count, b.count as new_count 
FROM a,b
WHERE a.variable = b.variable;

From this query we can see that for some variables we have fairly different results for the number of records with 100 as percentage estimate.

Need to figure out if we need to set P=100 for certain variables by definition, if so, how to identify those variables

SPTKL · 2019-12-18T19:14:44Z

Also note that for many records, we have P > 100, if we use the current defined way to identify the total, e.g. CNI1864_1
for SI28

>>> lst = ['1400000US36085020700', '1400000US36085021300', '1400000US36085024700', '1400000US36085025100']
>>> df.loc[df.GEO_ID.isin(lst), ['DP02_0074E', 'DP02_0001E']]
938      3521.0      1787.0
940      1917.0      1012.0
941      4294.0      2133.0
954      3192.0      1566.0

it's clear that for each of these tracts, DP02_0074E > DP02_0001E, so do we have a special calculation rule here that we need to apply?

EricaMaurer · 2019-12-30T20:22:11Z

anything labeled a base in the data dictionary will be set to 100 in the percent column.

SPTKL closed this as completed Jan 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

percentage estimate set to 100 #15

percentage estimate set to 100 #15

SPTKL commented Dec 18, 2019

SPTKL commented Dec 18, 2019 •

edited

Loading

EricaMaurer commented Dec 30, 2019

percentage estimate set to 100 #15

percentage estimate set to 100 #15

Comments

SPTKL commented Dec 18, 2019

SPTKL commented Dec 18, 2019 • edited Loading

EricaMaurer commented Dec 30, 2019

SPTKL commented Dec 18, 2019 •

edited

Loading