Income Limits Automation: Implement running queries and writing to S3 for VES Data Export #16217
Comments
@dsasser I took a swing at editing the other ticket #15203, as well as creating this one. Please feel free to edit/alter as needed. Thanks! cc @jilladams
**Status Update 12/7/2023**

Running into a blocking issue: executing the new workflow in a self-hosted runner (i.e., behind the TIC) throws an error when trying to download the node dependency node-oracledb. From this workflow run:
I also tested a ruby-based action, and received a similar error, ostensibly having the same root cause: From this workflow run:
Eric ran into this recently with AP. Next steps appear to be escalating to VA Gateway Operations.
**From scrum:**

- The TIC (owned by VA Gateway Operations) doesn't allow the Node / Ruby versions that GitHub Actions uses by default, which blocks the HTTPS TLS handshake required to talk to Oracle (what the VES DB is in).
- A lower-than-latest Node version might work. A lower Ruby version might work. Experimental. This won't affect the big-picture versions used anywhere else in the code; the versions in question here are isolated to the runner that's running the GitHub Actions job.
- If this doesn't work, we're back to not being able to use a GitHub Action. In that case, we would be blocked until VA Gateway Operations can make network changes to the TIC rules to allow legacy TLS operations, OR we would need to use a different system than GitHub Actions.
- Possibility in that case: we could potentially run this code in Drupal (like the Forms DB migration, or as Police data theoretically will someday), but just to fetch the data and send it to S3. We would not want to store these 200,000 records in Drupal. If we get blocked on GitHub Actions this sprint, we can talk about this idea in more detail during the code freeze sprint. It would almost certainly require CMS Collab Cycle review.
**Status Update 12/12/23**

Eric via Slack:
While VA is working on the issue, Eric has been trying alternatives, such as using the rbenv installer (which failed). Next we will try setting up the runner to execute inside a container.

**Update:** Running inside a container was successful! Next we need to resolve how secrets are stored/used, and permissions for the S3 bucket, but at this moment we are unblocked.
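For illustration, a containerized job in a GitHub Actions workflow might look roughly like the sketch below. This is a hypothetical fragment, not the actual workflow: the job name, image tag, secret names, and script path are all assumptions.

```yaml
# Hypothetical sketch only. Running the job inside a container pins the
# Ruby/OpenSSL stack independently of the self-hosted runner's host OS.
jobs:
  ves-data-export:
    runs-on: self-hosted
    container:
      image: ruby:3.2        # assumed image; pins Ruby + OpenSSL versions
    steps:
      - uses: actions/checkout@v4
      - name: Run export
        env:
          VES_DB_USER: ${{ secrets.VES_DB_USER }}          # assumed secret names
          VES_DB_PASSWORD: ${{ secrets.VES_DB_PASSWORD }}
        run: ./scripts/export.sh                           # hypothetical script
```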
**Mid-Sprint Update 12/13/23**

🟡 This ticket is at risk of not being completed by sprint close. As mentioned in the comments above, we had several issues getting the self-hosted runner unblocked to install Ruby/Node. Eric solved that for us yesterday, thankfully, so we are moving forward. There is a fair bit of work remaining, including:
**Update 12/14/2023**

We have successfully connected to the VES database within the workflow, and queried all 5 tables! Eric worked his magic and we are unblocked and rolling again on this work. What remains:
The above represents between 3 and 5 points of work.
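The table export described above used node-oracledb in the workflow; purely as an illustration of the same step, it could also be expressed as a generated SQL*Plus script (SQL*Plus 12.2+ supports CSV markup). Everything here is a sketch: the connection variables are assumptions, and the table name is one of the five listed in this ticket.

```shell
# Hypothetical sketch: build a SQL*Plus script that spools one table to CSV.
# The thread's actual workflow used node-oracledb; this is an alternative form.
make_export_sql() {
  table="$1"
  printf 'SET MARKUP CSV ON\nSET FEEDBACK OFF\nSPOOL %s.csv\nSELECT * FROM %s;\nSPOOL OFF\nEXIT\n' \
    "$table" "$table"
}

# In a real job this would be piped into SQL*Plus, e.g. (assumed variables):
#   make_export_sql std_zipcode | sqlplus -S "$VES_USER/$VES_PASS@$VES_TNS"
make_export_sql std_zipcode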
I checked a few other GHA runners, and it looks like the base image already comes with the AWS CLI installed:
The aws/configure-aws-credentials action claims to make the credentials available to CLI calls:
If you're hitting some errors using the AWS CLI, I can take a look.

Edit: OK, I see where this is breaking down:
Using the absolute path might work
@olivereri I tried setting the path to the AWS CLI to /usr/bin/aws but I'm getting:

I'm not sure whether this runner just doesn't have the CLI installed, or it is located somewhere mysterious, but I'm going to move forward by installing the CLI manually, which I have previously tested successfully.
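A small sketch of the "locate or fall back to manual install" step being discussed. The candidate paths and the suggested install commands (the official AWS CLI v2 bundled installer) are assumptions about the runner, not facts from the thread.

```shell
# Hypothetical sketch: find the AWS CLI, suggesting a manual install if absent.
find_aws() {
  # Prefer whatever is on PATH, then check common install locations.
  command -v aws 2>/dev/null && return 0
  for candidate in /usr/bin/aws /usr/local/bin/aws; do
    if [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

if AWS_BIN="$(find_aws)"; then
  echo "AWS CLI found at: $AWS_BIN"
else
  echo "AWS CLI not found; manual install needed, e.g.:"
  echo '  curl -sSL https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip -o awscliv2.zip'
  echo '  unzip -q awscliv2.zip && sudo ./aws/install'
fi
```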
**Update 12/18/23**

We are successfully uploading CSVs to S3! Here is what remains:
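The upload step could be sketched as a loop over the five tables. The bucket name matches the S3 file locations in the ticket description; the local export directory and the `DRY_RUN` guard are illustrative assumptions, not the actual workflow code.

```shell
# Hypothetical sketch of the S3 upload step for the five VES exports.
BUCKET="s3://sitewide-public-websites-income-limits-data"
EXPORT_DIR="${EXPORT_DIR:-./exports}"   # assumed local path
DRY_RUN="${DRY_RUN:-1}"                 # set DRY_RUN=0 in the real job to upload

upload_all() {
  for table in std_zipcode std_state std_incomethreshold std_gmtthresholds std_county; do
    src="$EXPORT_DIR/$table.csv"
    dest="$BUCKET/$table.csv"
    if [ "$DRY_RUN" = "1" ]; then
      echo "would upload $src -> $dest"
    else
      aws s3 cp "$src" "$dest"
    fi
  done
}

upload_all
```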
**End of Sprint Update 12/19/23**
**Update**

Shape of the data looks good, with one change that will need to be addressed in the Income Limits API code: the date format changed from
@FranECross some AC notes
Based on Daniel's notes about final status / ACs, I'm marking this done and closing. @dsasser you referenced a final cutover ticket, and I didn't see that it exists already, so I stubbed out what I think you're saying. It could use a look when you get a chance, from you first and then from @FranECross: #16512
Description
Finalize implementation of VES data exports to S3 by leveraging work on the linked ticket.
User story
AS AN Income limits owner
I WANT the app to be updated automatically, regularly, with new VES zipcode data
SO THAT Veterans receive the most up-to-date information in the UI.
Engineering notes / background
S3 file locations:
https://sitewide-public-websites-income-limits-data.s3-us-gov-west-1.amazonaws.com/std_zipcode.csv
https://sitewide-public-websites-income-limits-data.s3-us-gov-west-1.amazonaws.com/std_state.csv
https://sitewide-public-websites-income-limits-data.s3-us-gov-west-1.amazonaws.com/std_incomethreshold.csv
https://sitewide-public-websites-income-limits-data.s3-us-gov-west-1.amazonaws.com/std_gmtthresholds.csv
https://sitewide-public-websites-income-limits-data.s3-us-gov-west-1.amazonaws.com/std_county.csv
Starter Github Workflow:
Analytics considerations
Quality / testing notes
We need to consider monitoring / alarms for failed exports.
Acceptance criteria