Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COST-1771,1587,1677: GCP Add partition_time and fix cross over data for final cost. #3098

Merged
merged 28 commits into from
Sep 16, 2021

Conversation

myersCody
Copy link
Contributor

@myersCody myersCody commented Aug 30, 2021

Pull Request Summary

The overall purpose of this pull request is to fix COST-1677: Total cost for project does not match with total shown through Google console. After investigating why the cost is off it came down to two major problem. Which are highlighted in COST-1587 & COST-1771.

Since we were updating cost calculations already, we decided to modify our cost output to include credits (cost-1836) so that cost management will correctly reflect how much customers are actually paying.

Problem One: COST-1771: BigQuery including invoice month is removing cross over data

https://issues.redhat.com/browse/COST-1771
GCP is different from the other providers we support because the usage dates on their cost data does not always follow the month in which that cost data is billed.

For example, cost data for August 1st can contain rows that are billed for both July and August. I have been calling this scenario "cross over data".

In order to fix this issue we needed to remove the invoice month from our bigquery query. Additionally we need to adjust our time scope filters for gcp specifically, since invoice data can live outside of the usage dates of the month we needed to filter by the invoice month instead of dates for time scopes -1 & -2. Additionally, when summarizing the data we needed to include the invoice month.

Problem Two: COST-1587: Query GCP data on partition time

https://issues.redhat.com/browse/COST-1587
This issue was originally opened to cut down on the BigQuery cost in querying customer accounts. However, it was also a necessary step in order to correct some of the total cost problem we were seeing in GCP. When looking into cross over data described above we couldn't just query by the usage dates to get all the data in the month. However, it looks like GCP partitions based off the invoice month, so switching to the partition time allowed us to collect all of the GCP data.

Problem Three: COST-1836: Not handling discounts within our cost calculations

https://issues.redhat.com/browse/COST-1836
The final GCP issue that was address in this pull request is that we were not handling credits in our cost calculations. We were collecting the data in bigquery, but it was never passed up to the filter map to be used. Now if you go to a GCP endpoint you will see a credits section in the return.

Example:

"cost": {
                                        "raw": {
                                            "value": 0.1338525,
                                            "units": "USD"
                                        },
                                        "markup": {
                                            "value": 0.0,
                                            "units": "USD"
                                        },
                                        "usage": {
                                            "value": 0.0,
                                            "units": "USD"
                                        },
                                        "credit": {
                                            "value": 0.0,
                                            "units": "USD"
                                        },
                                        "total": {
                                            "value": 0.1338525,
                                            "units": "USD"
                                        }

Smoke Tests

Jenkins Job

  • We have some smoke test failures related to comparing: start_date=08-01-2021&usage_end=08-31-2021 and time_scope_filter of the previous month (-2). However, part of my work for COST-1771 was to deal with the cross over data problem. One annoying thing we run into with GCP is the usage_date doesn't match the invoice month. My solution to the problem was to not use the start & end date filters when the time scope was -1 or -2, and just use the invoice month filter of 202108 which returns all cost data associated with that month.

The Problem:

SELECT usage_start, usage_end FROM reporting_gcpcostentrylineitem_daily_summary WHERE usage_end > '2021-08-31' AND invoice_month='202108' GROUP BY usage_start, usage_end;
 usage_start | usage_end
-------------+------------
 2021-08-31  | 2021-09-01

When we use the the start & end date filters we are essentially doing usage_end<=2021-08-31 so we are skipping all the rows that have a usage_end of 2021-09-01. So, the last day of data is excluding data that is present when we use the time_scope_value filter.

As a result we have opened this Jira issue to think about some of the complexities regarding how to handle cross over data when using the start & end date filter on the gcp endpoint. In particular we need to think through how to handle this for multiple group bys and for the cost explorer.

Since these test failures are result of the processing changes we made to include cross over data in the daily summary tables (Whereas before we weren't grabbing cross over data from Bigquery). We have decided to address it as a new issue rather than a regression. Therefore, Luke and I have decided to skip those smoke tests until the jira issue above has been completed.

@myersCody myersCody self-assigned this Aug 30, 2021
@codecov
Copy link

codecov bot commented Sep 1, 2021

Codecov Report

Merging #3098 (adabf8f) into main (4e9912b) will decrease coverage by 0.0%.
The diff coverage is 82.9%.

@@           Coverage Diff           @@
##            main   #3098     +/-   ##
=======================================
- Coverage   94.1%   94.1%   -0.0%     
=======================================
  Files        310     310             
  Lines      26068   26110     +42     
  Branches    2950    2956      +6     
=======================================
+ Hits       24531   24569     +38     
- Misses       969     973      +4     
  Partials     568     568             

@myersCody myersCody added the django migration Change requires a Django database migration label Sep 7, 2021
@myersCody myersCody marked this pull request as ready for review September 7, 2021 18:52
@myersCody myersCody added the smoke-tests pr_check will build the image and run minimal required smokes label Sep 7, 2021
@myersCody
Copy link
Contributor Author

/retest

@myersCody myersCody removed the smoke-tests pr_check will build the image and run minimal required smokes label Sep 7, 2021
@myersCody myersCody removed the django migration Change requires a Django database migration label Sep 9, 2021
@myersCody
Copy link
Contributor Author

The migrations for this pull request have been moved over here: #3122

@myersCody myersCody added the smoke-tests pr_check will build the image and run minimal required smokes label Sep 13, 2021
@myersCody myersCody added smoke-tests pr_check will build the image and run minimal required smokes and removed smoke-tests pr_check will build the image and run minimal required smokes labels Sep 13, 2021
@myersCody
Copy link
Contributor Author

/retest

@myersCody myersCody removed the smoke-tests pr_check will build the image and run minimal required smokes label Sep 13, 2021
@adberglund adberglund requested a review from a team September 13, 2021 14:27
@myersCody myersCody added the smoke-tests pr_check will build the image and run minimal required smokes label Sep 13, 2021
@myersCody
Copy link
Contributor Author

/retest

@myersCody myersCody removed the smoke-tests pr_check will build the image and run minimal required smokes label Sep 13, 2021
@myersCody myersCody added smoke-tests pr_check will build the image and run minimal required smokes and removed smoke-tests pr_check will build the image and run minimal required smokes labels Sep 14, 2021
@myersCody myersCody removed the smoke-tests pr_check will build the image and run minimal required smokes label Sep 15, 2021
@myersCody myersCody added the smoke-tests pr_check will build the image and run minimal required smokes label Sep 15, 2021
@myersCody
Copy link
Contributor Author

/retest

@myersCody myersCody removed the smoke-tests pr_check will build the image and run minimal required smokes label Sep 15, 2021
@myersCody myersCody added the smoke-tests pr_check will build the image and run minimal required smokes label Sep 15, 2021
@myersCody
Copy link
Contributor Author

/retest

1 similar comment
@lcouzens
Copy link
Contributor

/retest

@myersCody myersCody removed the smoke-tests pr_check will build the image and run minimal required smokes label Sep 16, 2021
Copy link
Contributor

@dccurtis dccurtis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work here Cody!

@myersCody myersCody merged commit 9bd8631 into main Sep 16, 2021
@myersCody myersCody deleted the issues/cost-1677_1771_1587 branch September 16, 2021 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants