How to track query progress? #11
Comments
You're welcome! Thanks for the feedback, glad it's helpful. If you're interested in contributing or sharing ideas on implementation, I'm open to it. Looking back over the codebase today, I do feel the internals could do with a refactor, and I've started hacking out an idea for that on a branch, but I don't think there's any need to make user-facing changes.
I hadn't thought much about interactive UX, in all honesty. It definitely seems like a useful feature that would be easier to implement inside the library than to bolt on afterwards. I'm happy to have a go at implementing it when I can find some spare time, although since changing employer I no longer have access to a large organization to test with, and the highest number of accounts I was dealing with was only around 250.
If I remember correctly, the main bottleneck is the org master account being rate limited on sts:AssumeRole at the AWS API level. After that, I'd expect the constraint to be how many thread workers a machine is happy with. I wonder if moving to lazy loading and yielding where possible would add a bit more performance. On the first pass I intentionally split "assume roles" and "do work" into two separate steps.
I'd be happy to help out. I know enough Python to be dangerous and I've published a few experimental boto clients to make it easier to query across pages and across regions.
I'm happy to help you test it in my personal organization to get the right ergonomics and programmability. It should be possible to create a test AWS organization with many accounts. The initial quota for members is 4, but there's no published hard limit. CloudFormation stack sets could be used to populate them with arbitrary resources for the sake of having something to query. The biggest problem would be closing the member accounts afterwards. There's no API for that!
I've hit throttling errors from the Organizations DescribeAccounts API. I don't have a reliable way to reproduce them. The call usually works again on the second attempt.
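Since the call tends to succeed on a second attempt, a retry with backoff is the usual workaround. Here is a minimal sketch of that idea using only the standard library. `ThrottlingError` and `call_with_retries` are hypothetical names for illustration; with boto3 the throttling would surface as a `botocore.exceptions.ClientError`, and botocore's own retry modes, configured through `Config(retries=...)`, may already be enough.

```python
import random
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling error raised by an API client."""


def call_with_retries(func, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying with exponential backoff and jitter on throttling."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ThrottlingError:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the error.
                raise
            # Sleep for a random time up to base_delay * 2^(attempt - 1) seconds.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

The random jitter is there so that many workers throttled at the same moment do not all retry at the same moment.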
I had a suspicion it worked like that! As soon as the credentials for an assumed role are obtained, it would be possible to start querying the given account. I'm not sure how it would work in practice. I have never programmed with asyncio, but I have had success with multiprocessing.Pool for similar tasks.
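To make the idea concrete, here is a rough sketch of running "assume role" and "do work" in the same worker task and yielding results lazily as they complete. It uses `concurrent.futures` threads rather than asyncio or multiprocessing.Pool, and `assume_role` and `do_work` are hypothetical stand-ins, not botocove internals.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def assume_role(account_id):
    """Hypothetical stand-in for an sts:AssumeRole call; returns fake credentials."""
    return {"account_id": account_id, "token": f"token-{account_id}"}


def do_work(credentials):
    """Hypothetical stand-in for the per-account query."""
    return f"result for {credentials['account_id']}"


def assume_then_work(account_id):
    # Both steps run in the same worker, so the query starts as soon as
    # this account's credentials are available.
    return account_id, do_work(assume_role(account_id))


def query_accounts(account_ids, max_workers=20):
    """Yield (account_id, result) pairs lazily, in completion order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(assume_then_work, a) for a in account_ids]
        for future in as_completed(futures):
            yield future.result()
```

Because `query_accounts` is a generator, a caller can start processing or displaying results while other accounts are still being queried.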
Copied from #14 (comment):
Glad to hear it! I'll have a look after creating a PR for #14.
Looks pretty! I'll give you some feedback on my next use of the tool :-)
I love this! It's just what I need to get visual feedback on the progress. I've tried it in a smaller org of about 50 accounts. When this gets published, I hope I'll have the chance to use it on something bigger. I'd also like to use this in a tool I've built that uses botocove to collect inventory: https://github.com/iainelder/aws-org-inventory/ The API calls to list resources can take a long time to run in accounts with a lot of resources.
Released in 1.4.0 - #19
Thanks! I'll try it out when I get a moment. |
First let me thank you for this tool. It's a game changer! botocove is the best tool I know for ad-hoc analysis across an organization.
Currently I'm working with two organizations that each have on the order of 500 to 1000 accounts.
Across such large organizations, botocove takes hundreds of seconds to return a result. Anecdotally, depending on network conditions, I can wait between 120 and 300 seconds to get a result.
That's still good enough for interactive use, but it would be helpful to get some kind of "loading bar"-style feedback to know how long I should expect to wait.
I've considered adding a counter to the function wrapped by botocove. I've not tried it yet, but I guess it would work. I would need to run botocove in a second thread to be able to check the counter value.
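A rough, untested-against-botocove sketch of that counter idea is below. `run_everywhere` is a hypothetical stand-in for the library call that runs a function once per account; it is not botocove's API, and `CountingWrapper` and `watch_progress` are illustrative names too.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingWrapper:
    """Wrap a function and count, thread-safely, how many calls have finished."""

    def __init__(self, func):
        self._func = func
        self._lock = threading.Lock()
        self.completed = 0

    def __call__(self, *args, **kwargs):
        try:
            return self._func(*args, **kwargs)
        finally:
            # Count the call whether it returned or raised.
            with self._lock:
                self.completed += 1


def run_everywhere(func, account_ids):
    """Hypothetical stand-in for the library call that runs func per account."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(func, account_ids))


def watch_progress(func, account_ids, report=print, poll_seconds=0.1):
    """Run the query in a second thread and report the counter while waiting."""
    counted = CountingWrapper(func)
    results = []
    worker = threading.Thread(
        target=lambda: results.extend(run_everywhere(counted, account_ids))
    )
    worker.start()
    while worker.is_alive():
        report(f"{counted.completed}/{len(account_ids)} accounts done")
        worker.join(timeout=poll_seconds)
    return results
```

The second thread exists only so the main thread stays free to read `counted.completed` while the blocking call runs.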
Another solution could be to make botocove return immediately and run in the background. It would return an object with a blocking call to get the result and other calls to get the number of queries in progress, the number completed, the number remaining, and so on.
Is that something you have already considered?
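For illustration, the background-object idea could look roughly like this. It is only a sketch built on `concurrent.futures`, and `BackgroundQuery` with its attribute names is hypothetical, not anything botocove provides.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class BackgroundQuery:
    """Hypothetical handle for a per-account query running in the background."""

    def __init__(self, func, account_ids, max_workers=20):
        pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {a: pool.submit(func, a) for a in account_ids}
        # Return immediately; already submitted work still runs to completion.
        pool.shutdown(wait=False)

    @property
    def total(self):
        return len(self._futures)

    @property
    def in_progress(self):
        return sum(f.running() for f in self._futures.values())

    @property
    def completed(self):
        return sum(f.done() for f in self._futures.values())

    @property
    def remaining(self):
        return self.total - self.completed

    def result(self, timeout=None):
        """Block until every account has finished, then return all results."""
        _, not_done = wait(list(self._futures.values()), timeout=timeout)
        if not_done:
            raise TimeoutError(f"{len(not_done)} queries still running")
        # Re-raises the first exception if any per-account call failed.
        return {a: f.result() for a, f in self._futures.items()}
```

A caller would construct the object, poll `completed` and `remaining` to draw a loading bar, and call `result()` when it wants the full answer.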