This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Continue to overwrite the bill as long as it's Estimated #8

Open
wants to merge 1 commit into
base: master


kevinburke

Currently we mark a bill as "finalized" when a new month's bill
appears in S3, but AWS continues to update a bill for several days
after the month is over. The correct way to check for a "finalized"
bill is to see whether a row's invoice_id holds a number or is marked
as "Estimated". As long as the bill is estimated, we want to overwrite
it with new data.

We can't do this earlier on because that would require peeking at the
S3 data, so overwrite the data in the line_items table as long as it's
not a finalized bill.

Some rows appear in the spreadsheet for a given month as "totals" of
the other rows. These rows do not have an invoice_id (it's empty),
hence the check for '' in the WHERE clause.
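
For readers who want to see the shape of the check, here is a minimal sketch in Python with SQLite. It is not the code in this PR. The `statement_month` and `cost` columns and the `is_finalized` and `import_month` helpers are hypothetical names chosen for illustration; only `line_items`, `invoice_id`, the "Estimated" marker, and the empty-string totals rows come from the description above.

```python
import sqlite3


def is_finalized(conn, month):
    """Return True if any row for the month carries a real invoice number.

    Rows marked 'Estimated' are not final, and totals rows have an empty
    invoice_id, so both are excluded in the WHERE clause.
    """
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM line_items "
        "WHERE statement_month = ? "
        "AND invoice_id != 'Estimated' AND invoice_id != ''",
        (month,),
    ).fetchone()
    return count > 0


def import_month(conn, month, rows):
    """Overwrite a month's rows unless the stored bill is already finalized.

    `rows` is an iterable of (invoice_id, cost) pairs (hypothetical shape).
    Returns True if the month was overwritten, False if it was left alone.
    """
    if is_finalized(conn, month):
        return False
    with conn:  # delete and insert commit together, or not at all
        conn.execute(
            "DELETE FROM line_items WHERE statement_month = ?", (month,)
        )
        conn.executemany(
            "INSERT INTO line_items (statement_month, invoice_id, cost) "
            "VALUES (?, ?, ?)",
            [(month, invoice_id, cost) for invoice_id, cost in rows],
        )
    return True
```

Under these assumptions, repeated imports keep replacing the month while every row is "Estimated" or empty, and stop once a numbered invoice has been stored.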

Finally, we previously cached a stale staged DBR: when Amazon updated
the DBR in the original bucket, we'd still use the stale one from the
staging bucket for the `line_items` table. Fix this by fetching the
staged and the original CSV simultaneously, and only using the
staged CSV if it has the same modified date as the CSV in the original
bucket.
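
The freshness check can be sketched roughly as follows. This is an illustration in Python, not the project's code: `DBRObject`, `choose_dbr`, and the two fetcher callables are hypothetical, and it assumes the staged copy records the modified date of the original CSV it was derived from.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class DBRObject:
    """Hypothetical handle for a DBR CSV and the modified date it reflects."""
    key: str
    last_modified: datetime


def choose_dbr(
    fetch_original: Callable[[], DBRObject],
    fetch_staged: Callable[[], Optional[DBRObject]],
) -> DBRObject:
    """Fetch both copies concurrently and return the one that is safe to load.

    The staged CSV is used only when its modified date matches the original's;
    otherwise the staged copy is stale (or missing) and the original wins.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        original_future = pool.submit(fetch_original)
        staged_future = pool.submit(fetch_staged)
        original = original_future.result()
        staged = staged_future.result()

    if staged is not None and staged.last_modified == original.last_modified:
        return staged
    return original
```

In a real deployment the two callables would wrap whatever S3 client the project uses; they are left abstract here so the comparison logic stands on its own.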
@kevinburke
Author

I just pushed an additional change. Previously we cached a stale staged DBR: when Amazon updated the DBR in the original bucket, we'd still use the stale one from the staging bucket for the line_items table. The fix is to fetch the staged and the original CSV simultaneously, and to use the staged CSV only if it has the same modified date as the CSV in the original bucket.

Again, this indicates that if you are using this project as-is, you are probably missing records and updates that Amazon publishes between the end of a month and the time it sends you an invoice, which can be several days later.

@kevinburke
Author

(This work was sponsored by Segment)
