Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle "NULL" value on SA/SB aggregate data page #5178

Closed
1 task
Tracked by #137
hcaofec opened this issue Apr 25, 2022 · 10 comments · Fixed by #5235
Closed
1 task
Tracked by #137

Handle "NULL" value on SA/SB aggregate data page #5178

hcaofec opened this issue Apr 25, 2022 · 10 comments · Fixed by #5235
Assignees
Milestone

Comments

@hcaofec
Copy link
Contributor

hcaofec commented Apr 25, 2022

Summary

In order to use AWS DMS to sync data, the NULL value in several columns across schedule A and schedule B aggregate tables has been changed. Where the aggregate item is null previously, It is replaced with a literal value 'NULL' (or ' ' in the case of state since it is a varchar(2) column) now.

Review and possible change need to be done to make sure this backend change is handled correctly in CMS.

Reference backend ticket: #5110

What we're after:
When calling the below endpoints, make sure the literal value "NULL" is handled correctly.

/schedules/schedule_a/by_employer/
https://api.open.fec.gov/v1/schedules/schedule_a/by_employer/?sort_null_only=false&employer=NULL&per_page=20&sort_hide_null=false&api_key=DEMO_KEY&page=1&sort_nulls_last=false

/schedules/schedule_a/by_occupation/
https://api.open.fec.gov/v1/schedules/schedule_a/by_occupation/?sort_null_only=false&per_page=20&sort_hide_null=false&occupation=NULL&api_key=DEMO_KEY&page=1&sort_nulls_last=false

/schedules/schedule_a/by_zip/
https://api.open.fec.gov/v1/schedules/schedule_a/by_zip/?sort_null_only=false&per_page=20&sort_hide_null=false&api_key=DEMO_KEY&zip=NULL&page=1&sort_nulls_last=false

/schedules/schedule_b/by_recipient/
https://api.open.fec.gov/v1/schedules/schedule_b/by_recipient/?sort_null_only=false&recipient_name=NULL&per_page=20&sort_hide_null=false&api_key=DEMO_KEY&page=1&sort_nulls_last=false

https://www.fec.gov/data/committee/C00213512/?tab=raising
image

https://www.fec.gov/data/committee/C00103861/?tab=spending
image

Completion criteria

  • "NULL" value is handled.
@hcaofec hcaofec added this to the PI 17 innovation milestone Apr 25, 2022
@hcaofec hcaofec changed the title NULL value are handled on SA/SB aggregate data page NULL value is handled on SA/SB aggregate data page Apr 25, 2022
@hcaofec hcaofec changed the title NULL value is handled on SA/SB aggregate data page Handle "NULL" value on SA/SB aggregate data page Apr 25, 2022
@JonellaCulmer
Copy link
Contributor

JonellaCulmer commented Apr 26, 2022

@hcaofec @patphongs Recommendation for NULL language replacement: "Not reported"

@patphongs
Copy link
Member

@JonellaCulmer @PaulClark2 While pairing with @hcaofec, we found out that the filters that were modified here are free text fields. So a filer could potentially put anything. Therefore, we see variations of "Not Reported", "Not Reportable", etc in the database. Updating the null values here to be labeled as "Not Reported" will further conflate the two different things. So I'm not exactly sure what to call nulls now. Thoughts?

@patphongs
Copy link
Member

Here's two examples that @hcaofec found of committees that have scenarios of NOT REPORTED, N/A, and NULL.

https://www.fec.gov/data/committee/C00393843/?cycle=2004&tab=raising

"cmte_id" "cycle" "employer" "total"
"C00393843" 2004 "NOT REPORTED" $762.00
"C00393843" 2004 "N/A" $250.00
"C00393843" 2004 "NULL" $3,750.00

https://www.fec.gov/data/committee/C00493825/?tab=raising&cycle=2016

"cmte_id" "cycle" "employer" "total"
"C00493825" 2016 "NOT REPORTED" $3,000.00
"C00493825" 2016 "N/A" $310,120.01
"C00493825" 2016 "NULL" $202,467.38

@patphongs patphongs modified the milestones: Sprint 18.2, Sprint 18.3 May 16, 2022
@patphongs
Copy link
Member

patphongs commented May 16, 2022

@JonellaCulmer @PaulClark2 I have shifted this over to 18.2 so we can figure out how to label nulls

@patphongs patphongs modified the milestones: Sprint 18.3, Sprint 18.2 May 16, 2022
@patphongs
Copy link
Member

@JonellaCulmer @PaulClark2 As a follow-up to our conversation today, I'm going to suggest that we replace the string "NULL" with the word "(blank)" in parentheses. My thinking is that forms are less likely to have blank with parentheses in their forms.

@hcaofec Could you also search our data to see if there's any instances in which someone has actually inserted the text "(blank)" in our data? If there isn't any, I feel like this would be a good replacement on the front end.

@hcaofec
Copy link
Contributor Author

hcaofec commented May 17, 2022

There are some rows with "blank" in the tables, only few with "(blank)".
@patphongs @JonellaCulmer

For employer:
image

@hcaofec
Copy link
Contributor Author

hcaofec commented May 17, 2022

For occupation:
image

@patphongs
Copy link
Member

Thank you @hcaofec for doing the look up for us.

@JonellaCulmer @PaulClark2 I have made a PR in to update the text for NULL. Right now it's being rendered as (BLANK), but I could change it to whatever you think may be best.

@patphongs
Copy link
Member

After discussing with @bmathesonFEC and @JonellaCulmer, we decided to make the text more specific: "(COMMITTEE DID NOT PROVIDE AN [COLUMN_NAME])". This has been implemented in the PR and ready for review. cc @PaulClark2

@PaulClark2
Copy link
Contributor

Sounds good. Thank you all for your work on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants