This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

WIP - Fixing ctpp_censustract_variables #33

Draft
wants to merge 4 commits into
base: main
from


@SPTKL SPTKL commented Jul 22, 2020

  • Adds documentation to the CTPP tables.
  • There are two candidate methods here (compare c451869...3479587):
  • c451869: slightly simpler, but if a census tract doesn't have a certain mode (1-18), that row will be missing from the output.
  • 3479587: more verbose, but if a census tract doesn't have a certain mode, this commit populates it with value = 0, moe = 0. This is a one-to-one replication of the Python method, with exactly the same number of rows in the output.
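
The difference between the two commits can be sketched in Python. This is only an illustration of the behavior described in the bullets, not the code in either commit: the function names, the row layout, and the sample numbers are all made up for the example.

```python
# Illustrative sketch only: names, row layout, and sample values are hypothetical.
ALL_MODES = range(1, 19)  # the 18 modes (1-18) mentioned above


def sparse_rows(tract_rows):
    """Behavior described for c451869: keep only modes present in the raw data."""
    return [dict(row) for row in tract_rows]


def zero_filled_rows(geoid, tract_rows):
    """Behavior described for 3479587: emit all 18 modes, filling gaps with 0s."""
    by_mode = {row["mode"]: row for row in tract_rows}
    filled = []
    for mode in ALL_MODES:
        if mode in by_mode:
            filled.append(dict(by_mode[mode]))
        else:
            # Missing mode: populate value = 0 and moe = 0.
            filled.append({"geoid": geoid, "mode": mode, "value": 0, "moe": 0})
    return filled


# Made-up raw data for one tract that reports only two of the 18 modes.
raw = [
    {"geoid": "tract_A", "mode": 1, "value": 300, "moe": 40},
    {"geoid": "tract_A", "mode": 2, "value": 25, "moe": 12},
]

sparse = sparse_rows(raw)
filled = zero_filled_rows("tract_A", raw)
print(len(sparse), len(filled))
```

With the zero-filled variant every tract always yields 18 rows, which is what keeps the row count identical to the earlier output.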

It also seems that the previous Python-based method has issues calculating MOEs. The query below lists the rows where the two versions disagree:

select
	a.geoid, a.variable,
	a.value, b.value as val_new,
	a.moe, b.moe as moe_new
from ctpp_censustract_variables."2012_2016" a
join ctpp_censustract_variables."2020-07-22" b
	on a.geoid = b.geoid
	and a.variable = b.variable
where a.moe != b.moe


(moe) trans_auto_total
	= sqrt(
		trans_auto_2 ** 2 +
		trans_auto_3 ** 2 +
		trans_auto_4 ** 2 +
		trans_auto_5_or_6 ** 2 +
		trans_auto_7_or_more ** 2 +
		trans_auto_solo ** 2)
	= sqrt(126**2 + 50**2 +
		16**2 + 13**2 +
		0**2 + 422**2)
	= sqrt(196885)
	≈ 444

The previous calculation yields 11, which appears to be wrong.
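
To double-check the arithmetic, here is a minimal Python sketch of the same root-sum-of-squares aggregation. The helper name `aggregate_moe` is made up for this example; the six component MOEs are the ones quoted in the calculation above.

```python
import math


def aggregate_moe(moes):
    """Combine component MOEs as the square root of the sum of their squares."""
    return math.sqrt(sum(m ** 2 for m in moes))


# Component MOEs, in the order listed in the calculation above.
component_moes = {
    "trans_auto_2": 126,
    "trans_auto_3": 50,
    "trans_auto_4": 16,
    "trans_auto_5_or_6": 13,
    "trans_auto_7_or_more": 0,
    "trans_auto_solo": 422,
}

total_moe = aggregate_moe(component_moes.values())
print(round(total_moe))  # 444
```

Because the largest component alone is 422, the combined MOE cannot be smaller than 422, so a result of 11 cannot come from this formula.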

@mgraber mgraber marked this pull request as draft July 22, 2020 19:52
@SPTKL SPTKL marked this pull request as ready for review July 23, 2020 13:44
@SPTKL SPTKL requested a review from mgraber July 23, 2020 13:44
@SPTKL SPTKL changed the title from "WIP - python to ct variables" to "Fixing ctpp_censustract_variables" Jul 23, 2020
recipes/ctpp_censustract_variables/build.sql (outdated)
FROM RECODE a,
-- pivot jsonb to columns (key -> field name, value -> field value),
-- this step is needed because not all tracts have all modes of travel
-- NULLs will be filled with 0s and calculated as 0s
Contributor

Do we know why certain tracts have missing data? In survey data, 0 typically means that people were surveyed and none responded "yes." The raw data already contains 0 estimates, which makes me think the missing records would have been included if they were truly 0. Is it possible to fill NULLs with 0 only for calculating the sum, and not in the final output table?

Contributor Author

The raw data doesn't have all 18 modes for every census tract. We should probably ask labs for comments, because removing the 0 records might break the front end.
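
One way to picture the option raised in this thread (count missing modes as 0 in the sum, but leave them empty in the output) is the Python sketch below. It is not taken from build.sql; the helper name, the variable names, and the numbers are hypothetical.

```python
def sum_treating_missing_as_zero(values):
    """Sum estimates, counting missing (None) entries as 0 for the total only."""
    return sum(0 if v is None else v for v in values)


# Made-up estimates for one tract: mode_2 is absent from the raw data,
# while mode_3 is a genuine 0 estimate.
estimates = {"mode_1": 120, "mode_2": None, "mode_3": 0}

total = sum_treating_missing_as_zero(estimates.values())

# The output rows keep None for the missing mode instead of replacing it with 0.
output_rows = list(estimates.items()) + [("total", total)]
print(output_rows)
```

In this sketch a missing mode and a surveyed 0 stay distinguishable in the output, which is the distinction the reviewer is asking about. Whether the front end tolerates the empty values is the open question noted above.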

Comment on lines +184 to +193
SELECT
a.geoid,
a.value::numeric::integer as value,
b.moe::numeric::integer as moe,
a.variable
FROM VAL a
JOIN MOE b
ON a.geoid = b.geoid
AND a.variable = b.variable
);
Contributor

How about:

SELECT
    a.geoid,
    CASE
        WHEN a.value::numeric = 0 AND b.moe::numeric = 0
            THEN NULL
        ELSE a.value::numeric::integer
    END as value,
    CASE
        WHEN a.value::numeric = 0 AND b.moe::numeric = 0
            THEN NULL
        ELSE b.moe::numeric::integer
    END as moe,
    a.variable
FROM VAL a
JOIN MOE b
ON a.geoid = b.geoid
AND a.variable = b.variable
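
The effect of the suggested CASE expressions can be sketched in Python as follows. The function name and sample pairs are made up, and the inputs are assumed to be integers already, so the SQL casts are not modeled here.

```python
def null_if_both_zero(value, moe):
    """Mirror the suggested CASE logic: a (0, 0) pair becomes (None, None)."""
    if value == 0 and moe == 0:
        return None, None
    return value, moe


# Made-up (value, moe) pairs for illustration.
pairs = [(422, 50), (0, 0), (0, 13), (16, 0)]
results = [null_if_both_zero(value, moe) for value, moe in pairs]
print(results)
```

Only rows where both the value and the MOE are 0 are turned into NULLs. If the raw data contains a genuine 0 estimate that also has a 0 MOE, this rule would blank it out as well, which may be worth checking against the raw tables.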

@SPTKL SPTKL changed the title from "Fixing ctpp_censustract_variables" to "WIP - Fixing ctpp_censustract_variables" Jul 23, 2020
@SPTKL SPTKL marked this pull request as draft July 23, 2020 16:49
2 participants