Async Code V0 #26324
* Set up initial supervisor and worker thread scaffolding. Set up WorkerConfig to move flush over.
* Checkpoint: before moving staging operation interface.
* Uber merge with Charles.
…byte into cgardens/async-destination
…byte into cgardens/async-destination
* memory manager
* bring back close
* use constant
…byte into cgardens/async-destination
- fix reclaiming memory from queues
- suggest which knobs to turn next
import org.apache.commons.io.FileUtils;

@Slf4j
public class BufferManager {
This class can be further split and tested. Will do so in a follow-up PR.
PTAL @ryankfu @cgardens, this is the first of a set of PRs to get our code from #26086 into master. With this first PR, I want to get the smallest reasonable set into master, with a focus on documenting the interfaces. I was hoping we could do this more piecemeal, but there is really no way, since all these classes are somewhat related. Charles, I know you were hoping to comment on and rewrite the queue bits. Feel free to do so here or in a follow-up PR. My preference is a follow-up PR to keep things small. The queue bits are well tested, so I don't believe having this in master is 'that bad'.
I am happy to take naming suggestions.
Maybe AsyncDataUploader? As suggested by ChatGPT 😛
Decent idea. My thought was to keep this in line with the interface name of DestinationFlushFunction. Let's iterate on this next week.
I am happy to take naming suggestions.
I'm happy with this V0 merge: since it doesn't include any of the StagingConsumerFactory changes, it doesn't affect Snowflake directly. Agree that comments and further breaking out of the classes can be in a follow-up PR.
It does feel a little weird that there are Postgres/MySQL changes in here, but those look like spotlessJava linting.
I am merging this in as it doesn't affect any actual connectors.
Follow up to #26324 - here we split up the BufferManager and add tests and comments.
- Split up the buffer manager class into BufferManager, BufferEnqueue and BufferDequeue.
- Move all buffer-related code to the buffers package.
- Rename test classes to match this split.
- Add javadocs and tests as part of this split.
- Simplify the BufferDequeue interface to return a set of streams representing the buffered streams instead of the underlying map of buffers. This lets us keep the memory queue package private.
- All getYMethods now return Optionals for better error handling. This would have resulted in NPEs previously.
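The split described in the list above can be illustrated with a minimal sketch. This is not the code from the follow-up PR: the names `Enqueue`, `Dequeue`, `Buffers`, `getQueueSize` and `take` are hypothetical, and records are simplified to strings. It only shows the two ideas named in the list, a read side that exposes a set of stream names rather than the underlying map, and getters that return `Optional` instead of `null`.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; none of these names come from the PR itself.
final class BufferSketch {

  /** Write side: the only way to put records into the buffers. */
  interface Enqueue {
    void addRecord(String stream, String record);
  }

  /** Read side: exposes stream names and Optional-returning getters, never the map. */
  interface Dequeue {
    Set<String> getBufferedStreams();

    Optional<Integer> getQueueSize(String stream);

    Optional<String> take(String stream);
  }

  /** Owns the per-stream queues; the map itself stays private. */
  static final class Buffers implements Enqueue, Dequeue {
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    @Override
    public void addRecord(String stream, String record) {
      queues.computeIfAbsent(stream, k -> new ConcurrentLinkedQueue<>()).add(record);
    }

    @Override
    public Set<String> getBufferedStreams() {
      // Immutable snapshot, so callers cannot reach the queues through it.
      return Set.copyOf(queues.keySet());
    }

    @Override
    public Optional<Integer> getQueueSize(String stream) {
      // Empty Optional for an unknown stream instead of a NullPointerException.
      return Optional.ofNullable(queues.get(stream)).map(Queue::size);
    }

    @Override
    public Optional<String> take(String stream) {
      // Empty when the stream is unknown or its queue is drained.
      return Optional.ofNullable(queues.get(stream)).map(Queue::poll);
    }
  }

  public static void main(String[] args) {
    Buffers buffers = new Buffers();
    buffers.addRecord("users", "record-1");
    System.out.println(buffers.getBufferedStreams());
    System.out.println(buffers.getQueueSize("orders").isPresent());
  }
}
```

A caller that asks about a stream with no buffer gets an empty `Optional` and has to handle that case explicitly, which is the error-handling improvement the last bullet refers to.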
Follow up after #26324. Introduce the AsyncStreamConsumer. After this, one more PR to add the Staging Consumer changes in.
Split out the smallest set of reasonable changes from #26086. My goal was to split out the interface, as well as show how the interface is meant to be used. Follow up PRs:
- Split out classes from BufferManager and add more tests there.
- Add in the AsyncConsumer with tests.
- Add in the StagingConsumer factory.
What
Split out the smallest set of reasonable changes from #26086.
My goal was to split out the interface, as well as show how the interface is meant to be used.
Follow up PRs:
- Split out classes from BufferManager and add more tests there.
- Add in the AsyncConsumer with tests.
- Add in the StagingConsumer factory.
How
Split out the classes and add javadocs.
From original branch:
- StreamDestinationFlusher to DestinationFlushFunction.
- UploadWorkers to FlushWorkers.
- Did not touch any tests.
- Some formatting changes.
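To make the relationship between a flush function and the workers that call it more concrete, here is a rough sketch. It is an assumption-laden illustration, not the interface from this PR: `FlushFunction`, `Workers`, `submit` and the string-based batch are all hypothetical stand-ins, and the real DestinationFlushFunction and FlushWorkers may look quite different.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and signatures are illustrative only.
final class FlushSketch {

  /** Stand-in for a destination flush function: writes one batch for one stream. */
  interface FlushFunction {
    void flush(String stream, List<String> batch) throws Exception;
  }

  /** Stand-in for flush workers: runs flush calls on a fixed thread pool. */
  static final class Workers implements AutoCloseable {
    private final ExecutorService pool;
    private final FlushFunction flusher;

    Workers(FlushFunction flusher, int threads) {
      this.flusher = flusher;
      this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Hands a batch to the pool; the caller does not wait for the flush. */
    void submit(String stream, List<String> batch) {
      pool.execute(() -> {
        try {
          flusher.flush(stream, batch);
        } catch (Exception e) {
          // A real implementation would need to surface this failure to the sync.
          throw new RuntimeException(e);
        }
      });
    }

    /** Stops accepting work and waits up to 10 seconds for in-flight flushes. */
    @Override
    public void close() {
      pool.shutdown();
      try {
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
          pool.shutdownNow();
        }
      } catch (InterruptedException e) {
        pool.shutdownNow();
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    List<String> flushed = new CopyOnWriteArrayList<>();
    try (Workers workers = new Workers((stream, batch) -> flushed.add(stream + ":" + batch.size()), 2)) {
      workers.submit("users", List.of("a", "b"));
    }
    System.out.println(flushed);
  }
}
```

The point of keeping the flush function as a small interface is that each destination only supplies the write logic, while threading and shutdown stay in one shared place.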
Recommended reading order
🚨 User Impact 🚨
No user impact since all of this is in a new module.