Async Code V0 #26324

Merged
davinchia merged 42 commits into master from davinchia/async-code-0
May 20, 2023

Conversation

davinchia
Contributor

@davinchia davinchia commented May 19, 2023

What

Split out the smallest set of reasonable changes from #26086.

My goal was to split out the interface, as well as show how the interface is meant to be used.

Follow up PRs:

  • Split out classes from BufferManager and add more tests there.
  • Add in the AsyncConsumer with tests.
  • Add in the StagingConsumer factory.

How

Split out the classes and add javadocs.

From original branch:

  • Renamed StreamDestinationFlusher to DestinationFlushFunction.
  • Renamed UploadWorkers to FlushWorkers.
  • Split the MemoryManager out into its own class.
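
To make the renamed pieces concrete, here is a minimal sketch of how a flush-function interface and a pool of flush workers might fit together. The names SketchFlushFunction, SketchFlushWorkers, flush and optimalBatchSizeBytes, the use of plain String records, and the pool size are all assumptions for illustration; they are not the signatures introduced in this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

// Hypothetical stand-in for DestinationFlushFunction; method names are assumed.
interface SketchFlushFunction {
  // Write one batch of records belonging to a single stream to the destination.
  void flush(String streamName, Stream<String> records) throws Exception;

  // Hint for how many bytes a worker should try to hand over per flush call.
  long optimalBatchSizeBytes();
}

// Hypothetical stand-in for FlushWorkers: runs flush calls on a small thread pool.
class SketchFlushWorkers implements AutoCloseable {
  private final ExecutorService pool = Executors.newFixedThreadPool(2);
  private final SketchFlushFunction flusher;

  SketchFlushWorkers(SketchFlushFunction flusher) {
    this.flusher = flusher;
  }

  // Hand one batch to the pool; the flush itself happens asynchronously.
  void submit(String streamName, List<String> batch) {
    pool.execute(() -> {
      try {
        flusher.flush(streamName, batch.stream());
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    });
  }

  // Stop accepting work and wait for in-flight flushes to finish.
  @Override
  public void close() throws InterruptedException {
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
  }
}

class SketchFlushDemo {
  // Flushes three batches and returns the number of records written per stream.
  static Map<String, Integer> run() throws Exception {
    Map<String, Integer> written = new ConcurrentHashMap<>();
    SketchFlushFunction fn = new SketchFlushFunction() {
      @Override
      public void flush(String streamName, Stream<String> records) {
        written.merge(streamName, (int) records.count(), Integer::sum);
      }

      @Override
      public long optimalBatchSizeBytes() {
        return 1024 * 1024;
      }
    };
    try (SketchFlushWorkers workers = new SketchFlushWorkers(fn)) {
      workers.submit("users", List.of("a", "b"));
      workers.submit("users", List.of("c"));
      workers.submit("orders", List.of("x"));
    }
    return written;
  }
}
```

The point of the split in this sketch is that the destination only implements the flush function, while threading and shutdown live in the worker class.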

Did not touch any tests.

Some formatting changes.

Recommended reading order

  • DestinationFlushFunction.java and FlushWorkers.java for the javadocs.
  • Everything else is as is.

🚨 User Impact 🚨

No user impact since all of this is in a new module.

Pre-merge Actions

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Connector version is set to 0.0.1
    • Dockerfile has version 0.0.1
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog with an entry for the initial version. See changelog example
    • docs/integrations/README.md

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Unit & integration tests added

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.

Connector Generator
  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

cgardens and others added 30 commits May 15, 2023 13:07
* Set up initial supervisor and worker thread scaffolding. Set up WorkerConfig to move flush over.

* Checkpoint: before moving staging operation interface.

* Uber merge with Charles.
* memory manager

* bring back close

* use constant
- fix reclaiming memory from queues
- suggest which knobs to turn next
import org.apache.commons.io.FileUtils;

@Slf4j
public class BufferManager {
Contributor Author

this class can be further split and tested. will do so in a follow up PR

@davinchia davinchia marked this pull request as ready for review May 19, 2023 23:57
@davinchia davinchia requested review from a team as code owners May 19, 2023 23:57
@davinchia davinchia requested review from cgardens and ryankfu and removed request for a team May 20, 2023 00:02
@davinchia
Contributor Author

PTAL @ryankfu @cgardens, this is the first of a set of PRs to get our code from #26086 into master.

With this first PR, I want to get the smallest reasonable set into master, with a focus on documenting the interfaces.

I was hoping we could do this more piecemeal. However, there is really no way to, since all these classes are somewhat related.

Charles, I know you were hoping to comment and rewrite the queue bits. Feel free to do so here or in a follow-up PR. My preference is a follow-up PR to keep things small. The queue bits are well tested so I don't believe having this in master is 'that bad'.

Contributor Author

I am happy to take naming suggestions.

Contributor

Maybe AsyncDataUploader? As suggested by ChatGPT 😛

Contributor Author

Decent idea. My thought was to keep this in line with the interface name of DestinationFlushFunction. Let's iterate on this next week.

Contributor Author

I am happy to take naming suggestions.

Contributor

@ryankfu ryankfu left a comment

I'm happy with this V0 merge. Since it doesn't include any of the StagingConsumerFactory changes, it doesn't affect Snowflake directly. Agree that comments and further breaking out of the classes can go in a follow-up PR.

It does feel a little weird that there are Postgres/MySQL changes in here, but those look like spotlessJava linting.

@davinchia davinchia merged commit 8bfbef2 into master May 20, 2023
6 of 7 checks passed
@davinchia davinchia deleted the davinchia/async-code-0 branch May 20, 2023 20:41
@davinchia
Contributor Author

I am merging this in as it doesn't affect any actual connectors.

davinchia added a commit that referenced this pull request May 22, 2023
Follow up to #26324 - here we split up the BufferManager and add tests and comments.

- Split up the BufferManager class into BufferManager, BufferEnqueue and BufferDequeue.
- Move all buffer-related code to the buffers package.
- Rename test classes to match this split.
- Add javadocs and tests as part of this split.
- Simplify the BufferDequeue interface to return a set of streams representing the buffered streams instead of the underlying map of buffers. This lets us keep the memory queue package private.
- All getY methods now return Optionals for better error handling. Previously, a missing value would have resulted in an NPE.
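
The last two points can be sketched as follows. This is a simplified illustration, not the BufferDequeue from the follow-up commit: the class name SketchBufferDequeue, the methods add, getBufferedStreams and getQueueSize, and the use of String stream names are all assumed for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified dequeue view over per-stream in-memory buffers.
class SketchBufferDequeue {
  // The map of queues stays private so callers cannot depend on the queue type.
  private final Map<String, LinkedBlockingQueue<String>> buffers = new ConcurrentHashMap<>();

  // Test helper for this sketch: buffer one record for a stream.
  void add(String stream, String record) {
    buffers.computeIfAbsent(stream, s -> new LinkedBlockingQueue<>()).add(record);
  }

  // Expose only the names of the buffered streams, as an immutable snapshot.
  Set<String> getBufferedStreams() {
    return Set.copyOf(buffers.keySet());
  }

  // Return Optional.empty() for an unknown stream instead of null,
  // so a missing buffer cannot surface as a NullPointerException at the call site.
  Optional<Integer> getQueueSize(String stream) {
    return Optional.ofNullable(buffers.get(stream)).map(q -> q.size());
  }
}
```

A caller can then write `dequeue.getQueueSize("users").orElse(0)` and has to decide explicitly what a missing stream means.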
davinchia added a commit that referenced this pull request May 22, 2023
Follow up after #26324.

Introduce the AsyncStreamConsumer.

After this, one more PR to add the Staging Consumer changes in.
nguyenaiden pushed a commit that referenced this pull request May 25, 2023
nguyenaiden pushed a commit that referenced this pull request May 25, 2023
nguyenaiden pushed a commit that referenced this pull request May 25, 2023
marcosmarxm pushed a commit to natalia-miinto/airbyte that referenced this pull request Jun 8, 2023
marcosmarxm pushed a commit to natalia-miinto/airbyte that referenced this pull request Jun 8, 2023
marcosmarxm pushed a commit to natalia-miinto/airbyte that referenced this pull request Jun 8, 2023
4 participants