ELV-170 ProportionalAssetResourceCosts for PVs #2059
Conversation
Force-pushed from 5f376bb to 631317e
pkg/kubecost/allocation.go
@@ -2115,6 +2120,21 @@ func deriveProportionalAssetResourceCosts(options *AllocationAggregationOptions,
 	}
 }

 for name, pvAlloc := range alloc.PVs {
 	// log.Infof("Inserting PARC: %+v", ProportionalAssetResourceCost{
can toss this
Weird, I deleted this, and force pushed, but it's still here in GH 🤔
Ok weird. Just changed nothing, amended the commit, and force pushed. And now it's good.
//
// E.g. avg(kube_pod_container_status_running{}) by (pod, namespace)[1h:1m]
// with time=01:00:00 will return, for a pod running the entire time,
// 61 timestamps where the first is 00:00:00 and the last is 01:00:00.
s := time.Unix(int64(result.Values[0].Timestamp), 0).UTC() |
I think I still have a misunderstanding of this method of calculating the window. Here's a concrete example:
I plugged the query `avg(kube_pod_container_status_running{pod="alan-prom-cost-analyzer-5b6d76dc6c-6m8v6"}) by (pod, namespace)[1h:5m]`
into GKE dev 1 Prometheus at 12:13 PM GMT, 07/28, and got the following results:
1 @1690542900
1 @1690543200
1 @1690543500
1 @1690543800
1 @1690544100
1 @1690544400
1 @1690544700
1 @1690545000
1 @1690545300
1 @1690545600
1 @1690545900
1 @1690546200
The first entry translates to Friday, July 28, 2023 11:15:00 AM; the last, to Friday, July 28, 2023 12:10:00 PM.
Because of the resolution, the results make it look like the pod lived for 55 minutes of the 1-hour query range, when in reality it was running for all 60 minutes. As I recall, this causes inconsistencies in some calculations, since they conclude the pod died after 55 minutes.
This behavior is the motivation for the things you see in my original PR, where I was playing around with interpolation to try to add back the missing 5 minutes. I agree that blind interpolation, as you commented on in my PR, would cause an issue if the pod really had only run for 55 minutes and then been killed.
Interested in your thoughts; perhaps this is a non-issue?
Happy to discuss, because we've waffled on this a few times. I believe the resolution (i.e. `:5m`) will snap you to 5m intervals, so if you run it at `time=...13:00` you're going to miss those first 3 minutes, getting back only timestamps divisible by 5m, or 300s. But I believe (and my testing supported this) that if you always run the queries with `time=00:00:00`, which we now do, you will always get the first and last timestamp.
Using that same query you provided, here's what I see for 0m, 1-4m, and 5m:
- at 00:00 we get 13 data points, with the earliest and latest each at 0m
- at 00:01-00:04 we get 12 data points, losing the previous earliest and keeping 5m-0m
- at 00:05 we get 13 data points again, gaining a new latest, so that earliest and latest are both at 5m
Hope that's clarifying. But please push back if you cannot reproduce this, or have a different interpretation than mine.
(A short history, too: we used to not use the `time=...` param, and tried to use `offset` to go back. That was never perfect, and so we usually saw missing data and tried to interpolate it. But then sometimes we would get all the data! Then we realized our error and began using the more precise `time=...T00:00:00Z` method.)
((A short feature pitch: as long as Kubecost is running, we should listen for pod UP and DOWN events, and just record the exact right numbers for that, IMO. This is not a good use of Prometheus querying; it's expensive and error-prone. Rather, we should measure start, end, and arguably properties, requests, and labels once, by listening to the k8s API events, and then only use Prometheus for things that genuinely require `_over_time` queries, like usage data.))
Putting a listener on would be super neat, and certainly lighter weight; an interesting idea!
So, when I query exactly at the start of an hour, the behavior does match your results: timestamps look good and match the resolution exactly. I note that you say we now query at exact hourly intervals, which would indeed fix this problem. Do we enforce this requirement anywhere? Playing around with this, I am able to issue an API call like `curl 'http://localhost:9003/allocation?window=2023-07-30T00:03:00Z,2023-07-30T01:00:00Z&aggregate=namespace' -s`, which isn't on the hour. Maybe I am making mountains out of molehills here; I guess I am just concerned that if API users aren't hitting exactly on the hour (or a multiple of the resolution), their data might be inaccurate.
This is a good question, and worthy of some unpacking. In short, it's not a problem for ETL users, but you are correct that it would be a problem for "compute on demand" users.
First, let's consider the query you provided. The default resolution is 1m, so we do in fact get the correct numbers for that query (i.e. 57m between start and end), because the window is evenly divisible by the resolution and the start falls on a multiple of 1m. (See http://eks.dev1.niko.kubecost.xyz:9090/model/allocation?window=2023-07-30T00:03:00Z,2023-07-30T01:00:00Z&aggregate=namespace) But if the resolution had been 5m, or if the query had started at a non-zero-seconds time (e.g. 00:03:27), then we'd have lost data for those 27 seconds.
But now, let's consider what happens if you do the same kind of query but ask for more than 1h of data, e.g. http://eks.dev1.niko.kubecost.xyz:9090/model/allocation?window=2023-07-30T00:03:00Z,2023-07-30T02:00:00Z&aggregate=namespace — now, in fact, we start using ETL data, which is locked into hourly windows (00:00-00:00) or daily windows (00:00:00-00:00:00). And those ETL queries are, by nature, set to safe query times.
So 99.9% of queries are safe because they hit the ETL data, which is safe. And of the remaining 0.1% of queries, we will still be accurate when the resolution is precisely 1m and the query starts and ends on a time with 0s; I figure that's a good chunk of them. And when we are "wrong" here, we're probably only missing a small amount of data, within the error bounds.
That said, I think a nice solution would be to identify custom "on-demand compute" queries that are likely to result in data being dropped, and to send the user a warning.
Does that make sense? What do you think?
I think this makes sense, and it was sort of my takeaway as well. My only pushback would be that OpenCost users don't have ETL. But for Azure folks, as long as we ensure they are using a high enough resolution, then yeah, the few seconds they might miss wouldn't materially affect the readings.
Maybe a warning someday would be good, but low priority.
…ates from Prometheus results, fixing mocking and test cases for PARCS Signed-off-by: Niko Kovacevic <nikovacevic@gmail.com>
Force-pushed from 631317e to d72e80e
Also, Niko, on an unrelated note: did you see this ask from artur @ AKS? #1998 (comment)
I did! I don't know if that's a priority or just a request, so I'll let @cliffcolvin or whomever handle that one. (I mentioned it to him last week.) Can loop back later. (Frankly, I don't even know if we have access to that data...)
What does this PR change?
Does this PR relate to any other PRs?
How will this PR impact users?
Does this PR address any GitHub or Zendesk issues?
How was this PR tested?
Does this PR require changes to documentation?
Have you labeled this PR and its corresponding Issue as "next release" if it should be part of the next OpenCost release? If not, why not?