Azure Cost Management Package just throws first 1000 lines and not managing the paging from the cost management rest api #22528

Open
brasam opened this issue Feb 9, 2023 · 7 comments
Labels
Consumption - Billing All issues in cost management and Consumption API where billing/price related data is shown customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention This issue is responsible by Azure service team.

Comments

@brasam

brasam commented Feb 9, 2023

Package Name: azure-mgmt-costmanagement
Package Version: 3.0.0
Operating System: Windows
Python Version: 3.9.6
Describe the bug
The `usage` method in the Python SDK (https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/costmanagement/azure-mgmt-costmanagement/azure/mgmt/costmanagement/operations/_query_operations.py)
wraps this REST API endpoint: https://docs.microsoft.com/es-es/rest/api/cost-management/query/usage
When you call it, the Python API does not loop to fetch the next page using the nextLink attribute of the response, described here: https://docs.microsoft.com/es-es/rest/api/cost-management/query/usage#queryresult
As a result, each call returns at most 1000 rows, and the library offers no way to handle paging.

To Reproduce

from azure.mgmt.costmanagement import CostManagementClient
from azure.mgmt.costmanagement.models import QueryAggregation,QueryGrouping,QueryDataset,QueryDefinition,QueryTimePeriod,QueryFilter,QueryComparisonExpression
from azure.mgmt.resource import ResourceManagementClient
from azure.identity import DefaultAzureCredential
from IPython.display import display, HTML
from typing import ContextManager

import json
import pandas as pd
import datetime as dt
import calendar
import numpy as np

thedate = dt.datetime.combine(dt.date.today(), dt.time())
first = thedate.replace(day=1)
last = thedate.replace(day = calendar.monthrange(thedate.year, thedate.month)[1])

credential = DefaultAzureCredential()
subscription_id = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

scope = '/subscriptions/{}'.format(subscription_id)

client = ResourceManagementClient(credential, subscription_id)

cmgmtc = CostManagementClient(credential = credential)

"""
'ResourceGroup','ResourceGroupName','ResourceLocation',
'ConsumedService','ResourceType','ResourceId',
'MeterId','BillingMonth','MeterCategory',
'MeterSubcategory','Meter','AccountName',
'DepartmentName','SubscriptionId','SubscriptionName',
'ServiceName','ServiceTier','EnrollmentAccountName',
'BillingAccountId','ResourceGuid','BillingPeriod',
'InvoiceNumber','ChargeType','PublisherType',
'ReservationId','ReservationName','Frequency',
'PartNumber','CostAllocationRuleName','MarkupRuleName',
'PricingModel','BenefitId','BenefitName',''
"""

query_template = QueryDefinition(
    type="ActualCost",
    timeframe="ThisMonth",
    dataset=QueryDataset(
        granularity="Monthly",
        aggregation={
            "totalCost": QueryAggregation(name="Cost", function="Sum"),
            "totalCostUSD": QueryAggregation(name="CostUSD", function="Sum"),
        },
        grouping=[
            QueryGrouping(name="ResourceGroupName", type="Dimension"),
            QueryGrouping(name="ResourceId", type="Dimension"),
            QueryGrouping(name="ResourceType", type="Dimension"),
        ],
        filter=QueryFilter(
            dimensions=QueryComparisonExpression(
                name="ResourceGroupName",
                operator="In",
                values=["RESOURCE_GROUP"],
            )
        ),
    ),
)

replaced_query = query_template.deserialize(
    json.loads(
        json.dumps(query_template.serialize()).replace("RESOURCE_GROUP", "destination_rg")
    )
)

result = cmgmtc.query.usage( scope = scope, parameters = replaced_query)

data = pd.DataFrame(result.rows, columns = list(map(lambda col: col.name, result.columns)))

data_sorted = data.sort_values(by='CostUSD' ,ascending = False)

data_filtered = data_sorted

pd.set_option('display.max_rows', data_filtered.shape[0]+1)

display(HTML(data_filtered.to_html()))

Expected behavior
As a Python developer using this package, I would expect the result to be an iterable so that I can get all result pages, not just the first one.


@ghost ghost added needs-triage This is a new issue that needs to be triaged to the appropriate team. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Feb 9, 2023
@ghost ghost removed the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Feb 10, 2023
@navba-MSFT navba-MSFT added Service Attention This issue is responsible by Azure service team. needs-team-attention This issue needs attention from Azure service team or SDK team Consumption - Billing All issues in cost management and Consumption API where billing/price related data is shown and removed CXP Attention labels Feb 14, 2023
@navba-MSFT
Contributor

Adding Service team to look into this.

@davidvesp

Is there any update?

@g-raskar

@navba-MSFT any update from the service team?

@navba-MSFT
Contributor

@g-raskar I had a discussion with the Cost Management Query API product owners. Sharing the update here.

Most ARM APIs deal with creating, updating, getting, or deleting an Azure resource, so a GET response is usually a “value” array of objects that represent the resource information.
Example: [image]

In our case, we do not perform those operations on a resource; we simply fetch the aggregated cost of a resource based on what the customer specifies (and I am sure, as customers of the API, you already know that).

Hence, our response is unconventional:

[image]

So due to the above we cannot add the x-ms-pageable field in the REST API specs swagger. And since this is not available in the swagger, the SDK cannot provide this functionality.
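
To make the difference concrete, here is a minimal sketch of the two response shapes as Python dicts. The payloads are hypothetical and trimmed down. The `properties.columns` / `properties.rows` / `properties.nextLink` layout reflects my reading of the Query API documentation, and `rows_to_records` is only an illustrative helper, not part of the SDK.

```python
# Hypothetical, trimmed-down payloads; all values are made up for illustration.

# Conventional ARM list response: a "value" array of resource objects.
conventional_response = {
    "value": [
        {"id": "/subscriptions/xxx/resourceGroups/rg1", "name": "rg1"},
        {"id": "/subscriptions/xxx/resourceGroups/rg2", "name": "rg2"},
    ],
    "nextLink": None,
}

# Cost Management query response (assumed shape): columns and rows are
# separate arrays nested under "properties", next to nextLink.
query_response = {
    "properties": {
        "nextLink": None,
        "columns": [
            {"name": "Cost", "type": "Number"},
            {"name": "ResourceGroupName", "type": "String"},
        ],
        "rows": [[12.5, "rg1"], [3.0, "rg2"]],
    }
}


def rows_to_records(response):
    """Pair each row with the column names to get a list of dicts."""
    props = response["properties"]
    names = [col["name"] for col in props["columns"]]
    return [dict(zip(names, row)) for row in props["rows"]]


records = rows_to_records(query_response)
print(records)
```

Because the rows are not self-describing objects in a "value" array, each page has to be combined with the column list before it is useful.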

So in short, the ONLY workaround here is to invoke the Cost Management Query API, read the nextLink marker from the response, and keep requesting the next page until the returned nextLink is null. Samples are here and here. Hope this helps.
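
A rough sketch of that loop, written as a generator so the caller gets one iterable over all rows. This assumes each page is a dict shaped like `{"properties": {"rows": [...], "nextLink": ...}}`. `fetch_page` is a hypothetical callable you would supply (for example, one that POSTs the original query body to the nextLink URL with a bearer token); here it is faked with a dict so the example runs offline.

```python
def iter_all_rows(first_page, fetch_page):
    """Yield every row, following nextLink until it is empty.

    first_page: dict shaped like the Query API body ({"properties": {...}}).
    fetch_page: hypothetical callable that takes a nextLink URL and returns
                the next page as a dict of the same shape.
    """
    page = first_page
    while True:
        props = page.get("properties", {})
        yield from props.get("rows") or []
        next_link = props.get("nextLink")
        if not next_link:
            break
        page = fetch_page(next_link)


# Stand-in for real HTTP calls: maps a fake nextLink to the page it returns.
fake_pages = {
    "page2": {"properties": {"rows": [[3, "c"]], "nextLink": "page3"}},
    "page3": {"properties": {"rows": [[4, "d"]], "nextLink": None}},
}
first = {"properties": {"rows": [[1, "a"], [2, "b"]], "nextLink": "page2"}}

all_rows = list(iter_all_rows(first, fake_pages.__getitem__))
print(all_rows)
```

Keeping the HTTP call behind `fetch_page` means the paging logic can be tested without credentials and reused with whichever HTTP client you prefer.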

@navba-MSFT navba-MSFT removed the needs-team-attention This issue needs attention from Azure service team or SDK team label May 31, 2023
@g-raskar

g-raskar commented Jun 1, 2023

@navba-MSFT Thanks for the update!

@tikicoder

@navba-MSFT
I appreciate the post. This would seem like a good reason to update QueryDefinition or ForecastDefinition to take in a nextLink or even a skip token. So we as the devs would have to do some extra work. The whole reason to use the SDK is to handle certain things; why use the Cost Management SDK if we have to make REST API calls?

@tikicoder

In case someone comes here needing code and needs to stay in Python, here is some code:

`
import json
import time

from requests import post as request_post


# Wrapped in a function so the final `return` is valid. `client` is a
# CostManagementClient, `credential` is the credential it was built with,
# and `cost_filter` is the QueryDefinition for the query.
def query_usage_all_pages(client, credential, scope, cost_filter):
    # Run the Cost Management SDK like normal; this returns only the first page.
    cost_response = client.query.usage(scope=scope, parameters=cost_filter)
    next_link = getattr(cost_response, "next_link", None)

    # Follow-up requests re-send the same query body as JSON.
    body = cost_filter.serialize()
    if isinstance(body, dict):
        body = json.dumps(body)

    while next_link is not None and next_link.strip():
        token = credential.get_token("https://management.azure.com/.default").token
        cost_api_request = request_post(
            url=next_link,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
            data=body,
            timeout=10,
        )

        # Throttled: wait, then retry the same page.
        if cost_api_request.status_code == 429:
            time.sleep(10)
            continue
        cost_api_request.raise_for_status()

        cost_api_response = cost_api_request.json()
        next_link = cost_api_response["properties"].get("nextLink")
        cost_response.rows += cost_api_response["properties"].get("rows") or []

    return cost_response
`
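
The snippet above sleeps a fixed 10 seconds on HTTP 429 and retries without limit. As a possible refinement, here is a hedged sketch that caps the number of attempts and honors the standard `Retry-After` header when the response carries one. Whether the Cost Management service always sends that header is an assumption on my part. `post_with_retry` and `send` are hypothetical names, and `FakeResponse` exists only so the example runs without network access.

```python
import time


def post_with_retry(send, max_attempts=5, default_wait=10.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send: hypothetical zero-argument callable returning an object with
          .status_code and .headers (for example a requests.Response).
    """
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_attempts:
            break
        # Retry-After is a standard HTTP header; fall back to default_wait
        # if it is absent or not a plain number of seconds.
        try:
            wait = float(response.headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")


class FakeResponse:
    """Minimal stand-in for an HTTP response, for demonstration only."""

    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}


responses = iter([
    FakeResponse(429, {"Retry-After": "2"}),
    FakeResponse(429),
    FakeResponse(200),
])
waits = []  # record requested waits instead of actually sleeping
result = post_with_retry(lambda: next(responses), sleep=waits.append)
print(result.status_code, waits)
```

In the paging loop, `send` would be a small lambda wrapping the `request_post(...)` call for the current nextLink.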
