Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backfill real_efile.sa7 so 3 tsvector columns are populated #4093

Closed
1 task done
jason-upchurch opened this issue Nov 20, 2019 · 6 comments
Closed
1 task done

backfill real_efile.sa7 so 3 tsvector columns are populated #4093

jason-upchurch opened this issue Nov 20, 2019 · 6 comments

Comments

@jason-upchurch
Copy link
Contributor

jason-upchurch commented Nov 20, 2019

Summary

real_efile.sa7 has 3 TSVECTOR columns that are all empty. It does not appear any triggers generated the input for these columns. Issue #3834 corrects this going forward, however historical data needs to be backfilled.

[HIGH-LEVEL] Completion criteria

  • real_efile.sa7 data is backfilled for contributor_name_text, contributor_employer_text, and contributor_occupation_text columns.
@jason-upchurch
Copy link
Contributor Author

New entries appear to be populated, but still need to backfill

@jason-upchurch
Copy link
Contributor Author

Moving to 11.2 for the time being.

@fecjjeng
Copy link
Contributor

Table real_efile.sa7 contains the past 4 months data (up to 5 month the most) since we did a monthly cleanup at first Saturday of each month to only keep the most recent 4 months worth of data. The trigger had been introduced to the 3 cloud database on
DEV: 2019-12-02
STG: 2019-12-03
PRD: 2019-12-10
Any new data added to real_efile.sa7 will have the correct data.
On 1/12/2020, the earliest data in database is inserted on 9/1/2019 (create_dt). Therefore data inserted between 9/1/2019 to the date the trigger was added need to be updated to populate the columns.

@fecjjeng
Copy link
Contributor

A function were created to back-fill the columns. It is based on the create_dt on real_efile.reps. Appropriate repid in the corresponding time period is used to drive the update. After the updates were done, this particular function had been removed from the database to avoid confusion in the future since it is not part of the "official" objects needed for website project.

@fecjjeng
Copy link
Contributor

It had been verified that autovacuum had been invoked by database engine after the updates finished. This indicated that the statistics of the table had been updated to reflect the updated data.

@fecjjeng
Copy link
Contributor

Verified that all records had been backfilled:
select count(*) from real_efile.sa7 where contributor_name_text is null;
All three database had returned 0 rows with null contributor_name_text.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants