feat(elasticsearch): allow bulk delete #8424

Merged 2 commits on Jul 31, 2023

david-leifker
Collaborator

Batch writes to Elasticsearch generally contain only create/upsert requests, while deletes are typically issued in isolation. This PR adds an option to combine delete requests (including delete_by_query) with upserts in the same batch. Used responsibly, this works fine; however, in the past, delete and modify requests affecting the same document were observed in the same batch of requests. That conflict caused inconsistencies due to failed bulk requests. In general, this is probably useful only in extreme cases, and temporarily, for bulk delete operations.
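To illustrate the conflict described above, here is a minimal sketch of one way a caller could keep a delete and a modify for the same document out of the same batch. It is not the code in this PR and does not use any Elasticsearch client; `BulkOp` and `split_conflict_free` are hypothetical names introduced only for illustration. It also assumes every operation targets a known document ID, so it would not cover delete_by_query, whose affected documents are not known up front.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class BulkOp:
    """Hypothetical stand-in for one bulk item: an action plus a document ID."""
    action: str  # "upsert" or "delete"
    doc_id: str


def split_conflict_free(ops: List[BulkOp]) -> List[List[BulkOp]]:
    """Split ops, in order, into batches where no document has both a
    delete and any other operation in the same batch."""
    batches: List[List[BulkOp]] = []
    current: List[BulkOp] = []
    seen: Dict[str, Set[str]] = {}  # doc_id -> actions already in `current`

    for op in ops:
        actions = seen.get(op.doc_id, set())
        # Conflict: the document is already in this batch and either the
        # new op or an earlier op on it is a delete.
        conflict = bool(actions) and (op.action == "delete" or "delete" in actions)
        if conflict:
            # Close the current batch and start a new one with this op.
            batches.append(current)
            current = []
            seen = {}
        current.append(op)
        seen.setdefault(op.doc_id, set()).add(op.action)

    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    ops = [
        BulkOp("upsert", "a"),
        BulkOp("upsert", "b"),
        BulkOp("delete", "a"),  # conflicts with the earlier upsert of "a"
        BulkOp("upsert", "c"),
        BulkOp("upsert", "a"),  # conflicts with the delete of "a"
    ]
    for i, batch in enumerate(split_conflict_free(ops)):
        print(i, [(op.action, op.doc_id) for op in batch])
```

Repeated upserts of the same document stay in one batch in this sketch; only a delete paired with another operation on the same ID forces a split. Whether that is the right boundary depends on how the bulk processor in use orders and retries requests.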

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions bot added the product (PR or Issue related to the DataHub UI/UX) and devops (PR or Issue related to DataHub backend & deployment) labels on Jul 14, 2023
@anshbansal anshbansal merged commit b77b4e2 into master Jul 31, 2023
36 checks passed
@anshbansal anshbansal deleted the allow-batch-delete-option branch July 31, 2023 04:35
spadhi7 added a commit to spadhi7/datahub that referenced this pull request Aug 29, 2023
* tag 'v0.10.5': (222 commits)
  fix(test): increase siblings.js test stability (datahub-project#8542)
  feat(search): Allow aggregating on facets that are not explicitly part of default filter set (datahub-project#8540)
  fix(ui) Make multiple small updates to new search and browse (datahub-project#8524)
  feat(presto-on-hive): allow v1 fieldpaths in the presto-on-hive source (datahub-project#8474)
  feat(cli): Adds ability to upload recipes to DataHub's UI (datahub-project#8317)
  feat(browseV2): add browseV2 logic to system update (datahub-project#8506)
  fix(ingest/json-schema): convert non-string enums to strings (datahub-project#8479)
  feat(ingestion/tableau): support column level lineage for custom sql (datahub-project#8466)
  test(ingest): test case statements with sql parser (datahub-project#8437)
  feat(ingest/vertica): performance improvement and bug fixes (datahub-project#8328)
  ci: reduce git fetch depth (datahub-project#8473)
  fix(ingest): remove duplication of tags (datahub-project#8532)
  docs: small update to homepage (datahub-project#8483)
  fix(ingest): pin boto3-stubs in CI (datahub-project#8527)
  feat(siblings): hiding non-existant siblings in FE (datahub-project#8528)
  fix(ingest/build): Fix sagemaker mypy and flake8 issues (datahub-project#8530)
  feat(metrics): add metrics for aspect write and bytes (datahub-project#8526)
  feat(elasticsearch): allow bulk delete (datahub-project#8424)
  fix(ui): use locale lowercase when filtering columns of an entity in the lineage (datahub-project#8213)
  fix(auth): ignore case when comparing http headers (datahub-project#8356)
  ...
Labels: devops (PR or Issue related to DataHub backend & deployment), product (PR or Issue related to the DataHub UI/UX)

2 participants