Out of cluster costs - Azure #414
Comments
Hi @Zenmodai! This is currently on our roadmap. We're actively discussing prioritization on this feature now. Reach out to us at team@kubecost.com if you want to discuss timeline/priority.
Hi @dwbrown2 Maybe you can give an update here as well. When is this feature currently planned in the roadmap? I am sure others who use Azure would be interested in this as well.
@Zenmodai this effort has been scoped but is not yet under development. We will share an update when it's completed, but please feel free to reach out to our team if this is a high priority for you.
I've filed opencost/opencost#450 to track in the other repo.
Any news about this? @dwbrown2 @AjayTripathy
Current expectation is that this work will be started in our next sprint!
Happy to be an evaluator for this if you need anyone to take a look :D
Thanks, William! We currently have @Sean-Holcomb looking at this now. Working to find the right integration points with the Azure team. Let us know if anyone has suggested contacts!
This feature just launched in Beta in our v1.74 release! Initial documentation here. Reach out to us at team@kubecost.com if you want to learn more or if we can help in any way.
@Sean-Holcomb trailing comma left here: https://github.com/kubecost/cost-analyzer-helm-chart/blob/master/cost-analyzer/templates/azure-storage-config-secret.yaml#L16 (JSON is not well-formed). |
@dwbrown2 I configured the integration and Kubecost is able to gather costs from the subscription of the cluster. During parsing, however, I noticed a very high number (many thousands) of error log messages like these:
@pierluigilenoci What did you do for the integration? I set up the export of the cost report and created the secret, added the secret name to the deployment, but Kubecost is not showing the out-of-cluster Azure costs in the dashboard. There are also no log messages regarding the reading of the reports, or any errors indicating that something is wrong. Did you do something else?
Hmm... strange, I followed the same guides, but nothing seems to happen regarding this new feature.
Same here.
@evertonmc @Zenmodai Were you able to verify the creation of the first export file on your Azure Storage account? Additionally, do you still see the banner at the top of your Assets page about OOC not being set up?
@pierluigilenoci From what I can tell you have some nested JSON in your tags column; can you confirm, and if possible give me an example?
@pierluigilenoci It doesn't actually seem to be nested JSON but malformed JSON. I was only able to replicate the exact error you are getting with a trailing ':' on one of my rows' tags following the closing bracket. If this is not the case for you, an example of a failing row would be instructive.
@Sean-Holcomb
InvoiceSectionName/DepartmentName seems to be our company name for the invoice. I am not sure why it is trying to read the InvoiceSectionName/DepartmentName instead of the UsageDateTime column here. So it seems that the first column is always used. Even after manually moving the UsageDateTime column to the first position, another parse error is thrown.
Is there something wrong in how the columns are parsed? Or is there still something wrong with our reports? Maybe you can provide an example command with parameters.
@Sean-Holcomb I assume you're talking about the export CSV file.
I am not able to distinguish which lines are generating the problem because the logs give no indication about it. The only JSON inside the report is like this:
What else can I do to help you debug the problem?
@pierluigilenoci Given your error message we can rule out the nested JSON issue; I am specifically talking about the JSON in the "Tags" column of your CSV. The JSON you provided is well-formed, so it shouldn't be causing an issue. Specifically, I am looking for rows that do not have a "MeterCategory" of "Virtual Machines" or "Storage". To replicate the error that you showed, I had to add JSON to the "Tags" column which looked like this. A final note: this error is non-fatal to the row being looked at; it just prevents the cost from receiving labels that are generated from tags that you have set. That being said, the costs of these rows are still showing up in your report.
@andreb89 The CSV parser should be agnostic to column ordering. It creates a map of the headers to column numbers, so when it looks for "UsageDateTime" it is looking for the column with that string in the header. That being said, does your CSV have any additional rows at the top of the file? Additionally, the algorithm grabs the most recent CSV in each month folder. It sounds like you tried uploading a modified file to the folder; is that correct?
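The header-to-column map Sean describes can be sketched roughly like this (function and variable names are illustrative, not the actual Kubecost source):

```go
package main

import "fmt"

// buildHeaderMap maps each header name to its column index, so that field
// lookups work regardless of the column ordering in the export CSV.
func buildHeaderMap(header []string) map[string]int {
	m := make(map[string]int, len(header))
	for i, name := range header {
		m[name] = i
	}
	return m
}

func main() {
	// Hypothetical export rows, using the column names from this thread.
	header := []string{"MeterCategory", "UsageDateTime", "PreTaxCost"}
	record := []string{"Virtual Machines", "2021-03-10", "1.23"}

	headerMap := buildHeaderMap(header)
	fmt.Println(record[headerMap["UsageDateTime"]]) // 2021-03-10
}
```

Note that this scheme only works if the header strings match exactly, which is why the localized German headers below break the lookup.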
@Sean-Holcomb The top of our exported CSV looks as follows: So initially we already got the parse error, so next I tried to manually adjust the CSV, but that didn't really help, since then I got another parse error. So something in general seems to not work here. The unedited CSV export does not work and results in a parse error:
@Sean-Holcomb In addition to the bug @pierluigilenoci already mentioned before, I found another problem.
When you specify a secret for the Azure storage name (azureStorageSecretName), but also use serviceKeySecretName or createServiceKeySecret, then in cost-analyzer-deployment-template.yaml (lines 259 and 263) you are mounting the /var/secrets path twice, which results in the following deployment error:
@andreb89 Okay, that makes a lot of sense. Unfortunately it looks like both the naming and the row ordering are not going to be consistent between locales and Azure account types, so that does pose a challenge. As for the error message, it just seems like it is unsuccessful in finding the UsageDateTime column, which is strange because you listed that column as having the same name. The error is being generated by this
Here AzureLayout is "2006-01-02", record is a []string for the CSV row, and headerMap is a map[string]int which matches column name to column number. The specific column headers that I am using as of right now are "MeterCategory", "UsageDateTime", "InstanceId", "AdditionalInfo", "Tags", "PreTaxCost", "SubscriptionGuid", "ConsumedService" and "ResourceGroup". Please see if using these exact headers helps, and I will think about a long-term solution for this issue. "So initially we already got the parse error, so next I tried to manually adjust the CSV, but it didn't really help, since then I got another parse error." As for your second comment, that is meant as an alternate method of configuration that is still a WIP; thank you for pointing it out though. For now just stick with the one configuration outlined here. Please let me know if you are seeing any other error messages or have any other insights you think I should know based on what I have told you here.
@andreb89 "Name der Abteilung (DepartmentName)": did you add the text in parentheses, or is that how it shows up in the header?
@Sean-Holcomb This is how it shows up. The export job was created with the Azure CLI. If I create the export with the portal, then I do not have the German column headers, but then I cannot create exports which include the UsageDateTime; the different types of reports I can create from the portal do not have this column.
@andreb89 Ok, that is great news; I will start working on a fix for you.
@Sean-Holcomb I will share the complete CSV via Slack.
@Sean-Holcomb I tried what you suggested and manually removed all header columns which you currently aren't using, and also edited the ones you are using to fit your header titles, e.g. "Abonnement-GUID (SubscriptionGuid)" => "SubscriptionGuid". Afterwards there were no further parse errors for the header titles themselves; rather, I am now getting the same error as mentioned before: An example of a tag value is: But now we know that the locales of the headers and the ordering are currently the problem for our default generated reports. After manually changing the report headers, at least I can now see the data in the Kubecost asset dashboard.
@andreb89 Currently looking at @pierluigilenoci's CSV that he sent on Slack and seeing that none of the strings in his tags column have {} around the outside, which seems to be the thing causing the JSON parser issues. Is that how your tags column looks also?
@Sean-Holcomb Yes, our tags column looks the same. All look like this:
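Given a Tags column that holds key/value pairs without the surrounding braces, a tolerant parser along these lines would handle both forms. This is only a sketch of the idea, not the fix that was actually merged:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseTags tolerates Azure export rows whose Tags column omits the
// surrounding braces, e.g. `"env": "prod","team": "infra"`, by wrapping
// the string in {} before unmarshalling.
func parseTags(raw string) (map[string]string, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return map[string]string{}, nil
	}
	if !strings.HasPrefix(raw, "{") {
		raw = "{" + raw + "}"
	}
	tags := map[string]string{}
	err := json.Unmarshal([]byte(raw), &tags)
	return tags, err
}

func main() {
	// Hypothetical tag value shaped like the ones described in this thread.
	tags, err := parseTags(`"env": "prod", "team": "infra"`)
	fmt.Println(tags, err)
}
```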
Fixes have been merged and will be in the next release. Please let me know if you have any additional feedback.
@Sean-Holcomb I found a new bug. #783
@Sean-Holcomb Now the CSV appears to be digested correctly. I'll send you a copy of the logs via Slack.
@Sean-Holcomb I found a new bug. Currently, if you create the azure-storage-config secret with the Helm chart, the secret is only added to the volumes (see lines 107-111 in cost-analyzer-deployment-template.yaml), but the secret is not mounted in the actual container deployment. On another note, maybe you could also add some more log messages for the Azure OOC feature that will show up in the cost-model container, e.g. when the storage is searched for CSVs, or when CSV reports are parsed, maybe with the filename, etc. Another point that I noticed in the Assets dashboard:
@andreb89 I have merged your suggested fix on the storage config, so the next version should support that method of configuration. I will look into adding more logging for the CSV processing as you have suggested. The ETL runs every 3 hours; if you want to trigger a rebuild you can use the endpoint. In terms of showing OOC data on days which do not have in-cluster data, that feature is currently unavailable, but it is a known behavior.
@Sean-Holcomb Thanks for the endpoint, it is a great help during testing. I hope you can clear this up; maybe there is actually still a bug here, since something doesn't add up.
Hi @andreb89 -- I believe we set the date ranges on our end to skip data less than 2 days old because it may be incomplete. Regarding what day is shown when, all data should be in UTC.
@AjayTripathy But that is strange behaviour. If I look at the costs, I wouldn't expect that the last 2 days are missing. I am not sure why the data from one day ago should not be complete, but even if it isn't, it should still be shown. Otherwise some of the time range options don't really make sense.
@andreb89 I believe Sean and Ajay did this because Azure can provide partial data during this 48-hour time window. @Sean-Holcomb I think it could be ok to display partial assets data during this window. We however would not want to do reconciliation on partial data, because this is likely to skew metrics heavily. Thoughts?
@dwbrown2 That is exactly right, and that is how it is currently functioning: OOC shows everything that is available in the most current report, and reconciliation excludes the partial data because of the skew that it causes. @andreb89 I think that you are experiencing two issues: a time shift on the in-cluster costs due to the data being displayed in UTC (you can see this in your weekend being offset in your earlier post), and, as you suggested, the Azure data might not be syncing properly with the k8s-generated data. If you could give me some more information, that might be helpful for solving this issue. For starters, what timezone are you in? If you are willing to provide one of the export CSVs and the time it was generated, along with the matching Assets page, that would be helpful too; feel free to reach out to me on Slack.
@Sean-Holcomb My timezone is CET (Central European Time). Currently I also use the parameter kubecostModel.utcOffset to set "+01:00" for the timezone. I hope this helps. Regarding the cost export, I don't think I can share it at the moment.
@dwbrown2 Could you please elaborate on why it takes 48 hours to display the complete Azure out-of-cluster costs? I haven't found any information about this. @Sean-Holcomb I just updated to the new version 1.76.0, and at least for me the problem regarding the time shift in the displayed cluster costs remains. I am not sure what the problem is. You mentioned that the data being displayed is in UTC; why should this be a problem and result in a time shift of the costs? I am using the parameter to offset UTC, and in the reports only days are specified. Is there something else that I have to configure regarding timezones? For further information, I added a more specific example. The following image should display the Asset costs from 8th March to 12th March (today), but it shows 7th - 11th. At the far right there is an empty column, which I assume should be for the 12th, but it is completely empty. So is something missing in the configuration, or is this behavior a bug?
@andreb89 In response to your first question: the 48-hour window is only for the adjustment column on in-cluster costs. Cost data takes a day to be exported, and additionally the costs for the most recent day can be incomplete, so we wait a full 48 hours before trying to use them to adjust Kubecost's in-cluster estimates. The reasoning here is that if the cost data for the day is incomplete, it will over-adjust prices for that day downward. For OOC costs there is no such window; the most recent costs in the exported cost CSV are displayed whenever Kubecost pulls in that data. Given your timezone, the offset you are seeing is definitely a bug; I have created an issue for you here: #816. If you feel like I have missed something, please add it in.
Now that we've confirmed the core functionality works, @Sean-Holcomb shall we close this in favor of #816?
@dwbrown2 Sounds good. @ all, please tag me in any additional Azure issues you create, and hopefully I can be helpful.
We are using Kubecost with an Azure Kubernetes Cluster and we would like to track the costs of Azure specific out of cluster resources (e.g. databases) with Kubecost as well.
I checked the documentation, but currently there is no mention of out of cluster cost allocation/tracking for Azure.
Is this not yet implemented/supported, or is just the documentation missing?