aws_cost_by_service_daily - date filtering parameters not sent to the API #2149
Comments
Hi @shaicoleman, the It's crafted to deliver cost and usage insights per service on a daily basis for the previous year. From what I gather, you're looking to obtain cost and usage details over a specific time frame. Regrettably, the I'm currently addressing this issue and will keep you posted once the table design is finalized. Thank you! |
Hello @shaicoleman, I've made some updates in the
Query result:
We'd greatly appreciate it if you would like to test the changes in the PR branch and share your feedback with us to ensure the code changes meet the requirements. Here are the steps to test the PR branch:
Thank you! |
I'm having issues testing that branch with the instructions above:
|
Hey, @shaicoleman, could you please try running the command |
I can't get it to run:
SELECT service, unblended_cost_amount, period_start, period_end
FROM aws_cost_by_service_daily
WHERE service = 'Tax' AND
search_start_time >= current_timestamp - interval '30d'
SELECT *
FROM aws_cost_usage
WHERE search_start_time >= current_timestamp - interval '30d'
SELECT *
FROM aws_cost_usage
WHERE search_start_time >= '2024-03-01'
|
Hi @shaicoleman, we haven't yet implemented support for the I've made additional updates in the Note: When querying the Here are the query results with different operators(
Please feel free to share your feedback. Thanks! |
It seems that there's an issue with empty cached results being returned when the same query is repeated with different date filters. For example, the first query returns results correctly:
select
period_start,
period_end,
search_start_time,
dimension_1 as account_id,
dimension_2 as service_name,
net_unblended_cost_amount::numeric::money
from
aws_cost_usage
where
granularity = 'MONTHLY'
and dimension_type_1 = 'LINKED_ACCOUNT'
and dimension_type_2 = 'SERVICE'
and search_start_time >= '2023-08-01'
But repeating the same query with a different date returns zero results and does not actually execute the query:
select
period_start,
period_end,
search_start_time,
dimension_1 as account_id,
dimension_2 as service_name,
net_unblended_cost_amount::numeric::money
from
aws_cost_usage
where
granularity = 'MONTHLY'
and dimension_type_1 = 'LINKED_ACCOUNT'
and dimension_type_2 = 'SERVICE'
and search_start_time >= '2023-09-01'
It would also be good to get that functionality into the other tables as well. |
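To illustrate the caching symptom described above, here is a minimal Python sketch of a cache key that changes whenever a qualifier value changes. This is not Steampipe's or the plugin's actual caching code; `cache_key` and its arguments are hypothetical names used only for illustration.

```python
import hashlib
import json


def cache_key(table: str, columns: list[str], quals: dict[str, str]) -> str:
    """Hypothetical cache key: any change in a qualifier value yields a new key."""
    payload = json.dumps(
        {"table": table, "columns": sorted(columns), "quals": quals},
        sort_keys=True,  # stable ordering so equal inputs hash identically
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


columns = ["period_start", "period_end", "net_unblended_cost_amount"]
august = cache_key(
    "aws_cost_usage", columns,
    {"granularity": "MONTHLY", "search_start_time": "2023-08-01"},
)
september = cache_key(
    "aws_cost_usage", columns,
    {"granularity": "MONTHLY", "search_start_time": "2023-09-01"},
)
print(august != september)
```

If the date filter were left out of the key, the second query would be served from the first query's cache entry, which would be consistent with the behaviour reported above.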
Hi @shaicoleman, thank you for testing the changes. Regarding the issue where repeating the same query with a different date returns zero results, I have pushed a fix to the same branch ( Additionally, I've expanded the functionality to include Could you please pull the latest changes from the Let me know if you encounter any further issues. Thanks! |
Hi, the following query is not filtered correctly:
SELECT service, unblended_cost_amount, period_start, period_end
FROM aws_cost_by_service_daily
WHERE search_start_time >= current_timestamp - interval '30d' AND service = 'Tax'
It requests the default one-year period, e.g. {"TimePeriod":{"End":"2024-04-24","Start":"2023-04-24"}}
It also requests unnecessary metrics that aren't part of the query: {"Metrics":["BlendedCost","UnblendedCost","NetUnblendedCost","AmortizedCost","NetAmortizedCost","UsageQuantity","NormalizedUsageAmount"]}
It would be good if it requested only the necessary metrics. Also, consider making the |
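As a rough sketch of what "only request the necessary metrics" could look like, the Python below maps the columns a query selects to the metric names seen in the request above. It is not the plugin's code. Only `unblended_cost_amount` and `net_unblended_cost_amount` appear in this thread; the other column names, the `metrics_for` helper, and the single-metric fallback are assumptions.

```python
# Assumed column-to-metric mapping; only the two *unblended* column names
# are taken from the thread, the rest are guesses for illustration.
COLUMN_TO_METRIC = {
    "blended_cost_amount": "BlendedCost",
    "unblended_cost_amount": "UnblendedCost",
    "net_unblended_cost_amount": "NetUnblendedCost",
    "amortized_cost_amount": "AmortizedCost",
    "net_amortized_cost_amount": "NetAmortizedCost",
    "usage_quantity_amount": "UsageQuantity",
    "normalized_usage_amount": "NormalizedUsageAmount",
}


def metrics_for(selected_columns: list[str]) -> list[str]:
    """Return only the metrics needed by the selected columns, in order."""
    metrics: list[str] = []
    for column in selected_columns:
        metric = COLUMN_TO_METRIC.get(column)
        if metric is not None and metric not in metrics:
            metrics.append(metric)
    # Assumption: fall back to one metric when no cost column is selected.
    return metrics or ["UnblendedCost"]


print(metrics_for(["service", "unblended_cost_amount", "period_start", "period_end"]))
```

For the query above, this would request a single metric instead of all seven.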
@shaicoleman, thank you for your insightful feedback. I've made the following updates to the branch
Apologies, that was my oversight. I neglected to push a commit from my local machine. Now, all tables should be able to accept a custom time range based on the query parameter.
Excellent suggestion! I've implemented it, and the results are impressive. I've added an optional string type column called For example, the query I would appreciate it if you could pull the latest changes from the Note: The Thanks again for your valuable input! |
Looks good to me! Thanks |
Hello, @shaicoleman, just to let you know, we've finalized the table design and implementation, and I've pushed the updates to the branch Changes details:
Query performance outcomes:
It would be great if you could pull the latest changes from the branch Thanks! |
It doesn't seem to calculate correctly the
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_start >= current_timestamp - interval '32d' AND
period_end BETWEEN current_timestamp - interval '31d' AND current_timestamp - interval '1d' AND
unblended_cost_amount <> 0
It sends the following The error message suggests it might accept a timestamp, but I haven't tried it, and the documentation says otherwise.
Also note: the start date is inclusive, but the end date is exclusive. |
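The inclusive-start, exclusive-end note can be sketched in Python as follows. This assumes the API takes date-only `YYYY-MM-DD` values, as the comment above says the documentation states; `to_time_period` is a hypothetical helper, not part of the plugin.

```python
from datetime import datetime, timedelta


def to_time_period(start: datetime, end: datetime) -> dict[str, str]:
    """Turn inclusive timestamp bounds into a date-only period with an exclusive End."""
    start_date = start.date()
    # End is exclusive, so move one day past the day that `end` falls on
    # to keep that day in the requested range.
    end_date = end.date() + timedelta(days=1)
    return {"Start": start_date.isoformat(), "End": end_date.isoformat()}


period = to_time_period(datetime(2024, 3, 1, 15, 30), datetime(2024, 3, 31, 8, 0))
print(period)
```

Truncating the timestamps to dates widens the request slightly; any finer filtering on the timestamp would still have to happen on the returned rows.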
Hi @shaicoleman, apologies for the late response. I have pushed another change to the I have tested the following query parameter combinations:
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_end >= current_timestamp - interval '32d' AND
unblended_cost_amount <> 0
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_end >= current_timestamp - interval '32d' AND
unblended_cost_amount <> 0
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_start >= current_timestamp - interval '32d' AND
period_end <= current_timestamp - interval '25d' AND
unblended_cost_amount <> 0
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_start <= current_timestamp - interval '32d' AND
period_end >= current_timestamp - interval '35d' AND
unblended_cost_amount <> 0
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_start <= current_timestamp - interval '32d' AND
unblended_cost_amount <> 0
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_start >= current_timestamp - interval '32d' AND
unblended_cost_amount <> 0
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'DAILY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_start >= current_timestamp - interval '32d' AND
period_end BETWEEN current_timestamp - interval '31d' AND current_timestamp - interval '1d' AND
unblended_cost_amount <> 0
Could you please give it another try by pulling the latest changes, and let me know if I missed any edge cases?
Yes, we pass the input exactly as given in the query parameter, and the results are displayed based on the query parameter conditions. We are not modifying any of the rows. Could you please elaborate, in particular on which query shows the deviation? Thanks |
@ParthaI, this works for most cases (including my use case), but it fails in some edge cases. For example, this query falls back to the default one-year period and so misses some expected results:
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'MONTHLY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
(period_end >= current_timestamp - interval '32d' OR period_end >= current_timestamp - interval '500d') AND
unblended_cost_amount <> 0
The ideal solution would be to handle the common cases automatically, but keep fields for a manual override as before. I'm not sure it will always be possible to detect correctly whether a query is a simple case or not (OR conditions, CTEs, unions, subqueries, etc.). Explicitly setting the fields as before is the most reliable way to do it. |
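A minimal Python sketch of the "handle simple cases, refuse ambiguous ones" idea follows. The tuple-based WHERE representation and the `derive_bounds` helper are invented for illustration and do not reflect how Steampipe hands qualifiers to a plugin. Only two comparison shapes are treated as simple here.

```python
TIME_COLUMNS = {"period_start", "period_end"}


def mentions_time(node) -> bool:
    """True if a comparison on a time column appears anywhere under this node."""
    if node[0] == "cmp":
        return node[1] in TIME_COLUMNS
    return any(mentions_time(child) for child in node[1])


def derive_bounds(where):
    """Return (start, end) from top-level AND comparisons, or raise if ambiguous."""
    conjuncts = where[1] if where[0] == "AND" else [where]
    start = end = None
    for node in conjuncts:
        if node[0] == "cmp":
            _, column, op, value = node
            if column == "period_start" and op == ">=":
                start = value
            elif column == "period_end" and op == "<=":
                end = value
            elif column in TIME_COLUMNS:
                raise ValueError("unsupported time comparison; set the range explicitly")
        elif mentions_time(node):
            # Time filter inside OR or another nested expression: do not guess.
            raise ValueError("time filter inside a nested expression; set the range explicitly")
    return start, end


simple = ("AND", [
    ("cmp", "granularity", "=", "DAILY"),
    ("cmp", "period_start", ">=", "2024-03-01"),
    ("cmp", "period_end", "<=", "2024-03-31"),
])
print(derive_bounds(simple))
```

The point of the design is that anything the sketch cannot classify as simple fails loudly instead of silently falling back to a default range.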
Thanks for your feedback, @shaicoleman. I will take another look. |
One thing that would be nice to keep from the current solution is the ability to specify timestamps instead of just dates |
Hello, @shaicoleman. I appreciate your suggestions. I have been comparing the query results from the Query:
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'MONTHLY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
(period_end >= current_timestamp - interval '32d' OR period_end >= current_timestamp - interval '500d') AND
unblended_cost_amount <> 0
Plugin Environment:
There is no difference in the results from either branch.
I have also validated the results and did not see any deviation between them.
Could you please let me know if you noticed any mismatch in your case?
I believe that returning an error, rather than falling back to an arbitrary date range, would be a breaking change for users. Instead, let the API return the results and perform the filtering at the Steampipe query level based on the query parameters (this approach looks good to me).
Great idea. According to our current code, if the granularity is set to Thank you! |
I checked the date range that was sent to the API. If the date range is wrong, there is no way the results can be correct.
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'MONTHLY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
(period_end >= current_timestamp - interval '1d' OR period_end >= current_timestamp - interval '2d') AND
unblended_cost_amount <> 0
The example above requests a whole year instead of 2 days. If the granularity is daily, that can be very expensive, so it doesn't fix the issue it was supposed to solve in the first place. I understand not wanting to break compatibility, but the current behaviour can silently cause excessive costs, silently truncate data, and behave inconsistently. This justifies breaking compatibility for the small number of users who are using this table. Giving a clear error that can be easily fixed is much better than the silent bugs above, which required a lot of my time and yours to figure out. If breaking compatibility is not something you're willing to consider, then at least allow to optionally set the It could also print a warning to stderr (e.g. WARNING: future versions will require explicitly setting the search_start_time and search_end_time, defaulting to a year period) |
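The warning-plus-default behaviour suggested above could look roughly like this Python sketch. `resolve_range` and `one_year_before` are hypothetical helpers, not plugin functions; the warning text is the one proposed in the comment, and the one-year default mirrors the `TimePeriod` seen earlier in this thread.

```python
import sys
from datetime import date

WARNING = (
    "WARNING: future versions will require explicitly setting the "
    "search_start_time and search_end_time, defaulting to a year period"
)


def one_year_before(d: date) -> date:
    """Same calendar day one year earlier (29 February maps to 28 February)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def resolve_range(search_start_time=None, search_end_time=None, today=None):
    """Use explicit bounds when given; otherwise warn on stderr and default."""
    today = today or date.today()
    if search_start_time is None or search_end_time is None:
        print(WARNING, file=sys.stderr)
    end = search_end_time or today
    start = search_start_time or one_year_before(end)
    return start, end


print(resolve_range(today=date(2024, 4, 24)))
print(resolve_range(date(2024, 4, 22), date(2024, 4, 24)))
```

With explicit bounds nothing is printed to stderr and the requested range is used unchanged, so existing queries keep working while the silent fallback becomes visible.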
Hi @shaicoleman, I hope this message finds you well. I have reviewed your requirements from a costing perspective and would like to address the following points:
For the query you mentioned, I have made a slight modification so that the API call is based on the query parameter while achieving the same result as the original query:
WITH period_end_1d AS (
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'MONTHLY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_end >= current_timestamp - interval '1d' AND
unblended_cost_amount <> 0
),
period_end_2d AS (
SELECT dimension_1 AS service, dimension_2 AS usage_type, unblended_cost_amount, period_start, period_end
FROM aws_cost_usage
WHERE granularity = 'MONTHLY' AND
dimension_type_1 = 'SERVICE' AND
dimension_type_2 = 'USAGE_TYPE' AND
period_end >= current_timestamp - interval '2d' AND
unblended_cost_amount <> 0
)
SELECT * FROM period_end_1d
UNION
SELECT * FROM period_end_2d;
Thank you for your attention to these details. I look forward to your feedback. |
Hi @ParthaI , I don't really have anything more to add to what I've already said. In my opinion, having the possibility of silently truncating data or silently causing unforeseen expenses is a huge dealbreaker, and is much worse than breaking compatibility and having some redundancy in the query. There is no easy way to know if a query will trigger the correct behaviour or not without tracing the API calls. This means adding the @misraved , maybe give your feedback on what approach you prefer, and we'll go with that. |
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days. |
Any update on this? |
Sorry for the delay on this issue @shaicoleman!! I will review and release the fix by the end of this week 👍. Thank you for your patience. |
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days. |
For the following query:
Generates the following filter request:
This requests data for a whole year, instead of just for the requested period, and thus causes many unnecessary API requests which each cost $0.01.
Steampipe v0.22.0
turbot/aws v0.132.0