-
Notifications
You must be signed in to change notification settings - Fork 26
feat!: Support table_concurrency and resource_concurrency #268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: Kemal <223029+disq@users.noreply.github.com>
hermanschaaf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great 👍 I left a few comments / thoughts.
Not for this PR, but I suppose we could add another option later to control concurrency for child tables at well. It wouldn't be a global pool (could get deadlock), but we could say, for example, that every table is only allowed to spin up 10 goroutines for its children. That way you'd have some control and children can be processed concurrently, but you can't run into a deadlock situation. Goroutines and memory usage can theoretically grow unbounded though.
There are other more complicated options too. Having parent tables relinquish control over their goroutine before waiting on child tables to complete should also help avoid deadlock. But it gets real complicated fast; I like the approach in this PR.
Thanks! I don't think the 10 per child will work as then we can't bound goroutines (this is what we had with the list/detail resolvers where we just spawn up as many goroutines as we wanted) which is the whole point otherwise we can just have unlimited go routines and that's it but that we know can cause trouble. |
🤖 I have created a release *beep* *boop* --- ## [0.13.0](v0.12.10...v0.13.0) (2022-10-10) ### ⚠ BREAKING CHANGES * Support table_concurrency and resource_concurrency (#268) ### Features * Support table_concurrency and resource_concurrency ([#268](#268)) ([7717d6f](7717d6f)) ### Bug Fixes * Add custom log reader implementation to fix hang on long log lines ([#263](#263)) ([f8ca238](f8ca238)) * DeleteStale feature ([#269](#269)) ([837c5f3](837c5f3)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
This makes #268 backwards-compatible with prior versions, as support for `concurrency` is retained for now.
This adds documentation around the new concurrency options being added in the next release. Related PRs: - cloudquery/plugin-sdk#268 - cloudquery/plugin-sdk#271 Closes #2609
Summary
Closes cloudquery/cloudquery#2313 cloudquery/cloudquery#159 #264
This renames
concurrencytotable_concurrency(this has the same behaviour as previously but the name was too generic).Also this introduce a new option
resource_concurrency- this limits the number of go routines that resolve a specific resource (useful when a lot of column resolvers that have api calls andPreResourceResolver).Importe Note: both options are only for top-level concurrency control i.e we only spawn go-routines for parent tables and for resources of parent table. The reason for that is that there is no concurrency model that can work with one variable for recursive calls otherwise it will get deadlock.
So right now the SDK will support only for one level - in the future if we want to support additional level we can do
table_concurrency_1andresource_concurrency_1but I think for now we can skip this.