New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update PV and LB to use Unmounted #1363
Conversation
@@ -30,7 +30,7 @@ const SharedSuffix = "__shared__" | |||
// UnallocatedSuffix indicates an unallocated allocation property | |||
const UnallocatedSuffix = "__unallocated__" | |||
|
|||
// UnmountedSuffix indicated allocation to an unmounted PV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the big changes in this PR is that an Unmounted allocation can now have LB cost for LBs for which no pod has their selector for the given window
@@ -2724,7 +2708,6 @@ func (p Pod) AppendContainer(container string) { | |||
// TODO:CLEANUP add PersistentVolumeClaims field to type Allocation? | |||
type PVC struct { | |||
Bytes float64 `json:"bytes"` | |||
Count int `json:"count"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
count is replaced by the coefficient calculated with intervals and the number of allocations in a pod
pkg/costmodel/allocation.go
Outdated
gib := pvc.Bytes / 1024 / 1024 / 1024 | ||
cost := pvc.Volume.CostPerGiBHour * gib * hrs | ||
pvcPodWindowMap[thisPVCKey][thisPodKey] = kubecost.NewWindow(&s, &e) | ||
} | ||
} | ||
} | ||
|
||
// Scale PV cost by PVC sharing coefficient. | ||
if coeffComponents, ok := sharedPVCCostCoefficientMap[pvcKey][podKey]; ok { | ||
cost *= getCoefficientFromComponents(coeffComponents) | ||
} else { | ||
log.Warnf("CostModel.ComputeAllocation: allocation %s and PVC %s have relation but no coeff", alloc.Name, pvc.Name) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where the total PVC cost is being multiplied by Coefficients which could sum to greater than 100%
673232d
to
3a63202
Compare
Name: pvc.Volume.Name, | ||
} | ||
alloc.PVs[pvKey] = &kubecost.PVAllocation{ | ||
ByteHours: pvc.Bytes * hrs / count, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not multiplying by coef here can result in reconciliation giving more cost to this allocation then it should actually get.
3a63202
to
6cbded8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good, though I will admit that this is a challenging review. Is there any way I can help with additional testing to make sure that this is all rock solid?
case string(AuditClusterEquality): | ||
return AuditClusterEquality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice to have this to rely on -- I'm assuming that uncommenting this means that it passes now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It passes our tests, and on my cluster, given everything that I have seen in the Costmodel in the last couple weeks I don't think it will pass in all situations but I think it is a good audit have in that it lets us know our short comings on this invariant.
@@ -155,7 +155,7 @@ func ClusterDisks(client prometheus.Client, provider cloud.Provider, start, end | |||
ctx := prom.NewNamedContext(client, prom.ClusterContextName) | |||
queryPVCost := fmt.Sprintf(`avg(avg_over_time(pv_hourly_cost[%s])) by (%s, persistentvolume,provider_id)`, durStr, env.GetPromClusterLabel()) | |||
queryPVSize := fmt.Sprintf(`avg(avg_over_time(kube_persistentvolume_capacity_bytes[%s])) by (%s, persistentvolume)`, durStr, env.GetPromClusterLabel()) | |||
queryActiveMins := fmt.Sprintf(`count(pv_hourly_cost) by (%s, persistentvolume)[%s:%dm]`, env.GetPromClusterLabel(), durStr, minsPerResolution) | |||
queryActiveMins := fmt.Sprintf(`avg(kube_persistentvolume_capacity_bytes) by (%s, persistentvolume)[%s:%dm]`, env.GetPromClusterLabel(), durStr, minsPerResolution) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any risks to swapping this? And what exactly are the benefits? I can't think of any, in either direction, but want to check in the interest of being super careful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in regards to https://github.com/kubecost/kubecost-cost-model/pull/927 I suspect that the use of pv_hourly_cost for our active mins query is causing this log when kubecost is down and not emitting the metric but KSM still is. I might be totally off base on this, but I think this change is low risk and may address the problem
for _, pod := range podMap { | ||
for _, alloc := range pod.Allocations { | ||
cluster := alloc.Properties.Cluster | ||
nodeName := alloc.Properties.Node | ||
namespace := alloc.Properties.Namespace | ||
pod := alloc.Properties.Pod | ||
podName := alloc.Properties.Pod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
pkg/costmodel/allocation.go
Outdated
return err | ||
} | ||
// LB describes the start and end time of a Load Balancer along with cost | ||
type LB struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but we might not want to export this. Seems reasonable for internal use.
6cbded8
to
d01b648
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think I'm on board with the idea here: allocating PV cost to Allocations based on the amount of time the Allocation attached the PV via PVC. I sort of bias towards Niko's (?) idea of allocating based on weighted cost, but at the end of the day I think having proper aggregation equality at the end of the day is more important.
I like the refactoring, too. Hopefully it'll be easier to navigate computeAllocation()
logic in the future!
I too would have advised against the interval approach if I knew about it in the start, it seemed like a waste to throw it out at this point rather than fix it. The accuracy to complexity trade off probably isn't worth it though. |
Yep, totally makes sense! |
Signed-off-by: Sean Holcomb <seanholcomb@gmail.com>
d01b648
to
0fabcbf
Compare
What does this PR change?
This PR fixes a few issues with PV Ingestion from Prom data with the goal of achieving Cluster Equality. Cluster Equality in this context means that queries on Allocations and Assets grouped by Cluster would have totals that are equal. Cluster Equality does depend on post ingestion processes such as Reconciliation and shared tenancy costs, however the way certain values were being ingestion had down stream effects those processes. The main focus here has been ensuring that PV and LB costs that were not being set to Allocations are now being applied to an Unmounted Allocation for a cluster. This Unmount Allocation now has its PVs and Services populated which allows it to be properly reconciled.
This PR fixes a bug with the interval calculations on PV ingestion. The intervals are a great tool for determining the proportion of a PV's cost should go to pods which are attached to it. This is especially true in Windows when pods do not exist for the entire interval. The bug is that we were determining the time coef (precent of time that an interval) was being calculate off the pods window rather than off the PV window. For example given:
In this example the PV runs for an hour and Pod1 runs for the first half hour and Pod2 for the second half hour.
With the bug these coefs:
This says that 100% of the time POD1 was running it can be attributed 100% of PV's cost, however this coefficient is multiplied with the total cost of the PVC which means 200% of the cost is distributed. To address this we calculate Time with the PVC window rather tan the POD window.
This Issue ties in with the larger issue of Unmounted resources because there was no handling of the case where no Pods were running at the same time a PVC.
Does this PR relate to any other PRs?
How will this PR impact users?
Does this PR address any GitHub or Zendesk issues?
How was this PR tested?
Does this PR require changes to documentation?
Have you labeled this PR and its corresponding Issue as "next release" if it should be part of the next Opencost release? If not, why not?