[INLONG-6151][Manager] Add data cleansing task and optimize query indexes #6168

Merged 13 commits into apache:master on Oct 20, 2022

@woofyzhao woofyzhao commented Oct 13, 2022

Motivation

  • By default, InLong Manager only performs logical deletion.
  • When there is a large number of offline periodic tasks, the logically deleted data may need to be purged to reduce storage size.
  • We could use external scripts with a crontab job for this, but that introduces considerable maintenance overhead. Besides, in a cloud environment a crontab job (CronJob) might not be convenient.

Modification

  • Provide an optional periodic task that purges the logically deleted data, so that no external scripts or extra configuration need to be maintained.
  • Optimize some table indexes to speed up queries.
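
To illustrate the first point, here is a minimal sketch of what an optional periodic purge task could look like. It is not the code in this PR: the class names, the in-memory store standing in for a database table, and the `enabled`, `retention`, `interval`, and `batchSize` parameters are all hypothetical and chosen only for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PurgeTaskSketch {

    /** Hypothetical row; deleted == true means the row was only logically deleted. */
    static final class GroupRecord {
        final String groupId;
        final boolean deleted;
        final Instant modifyTime;

        GroupRecord(String groupId, boolean deleted, Instant modifyTime) {
            this.groupId = groupId;
            this.deleted = deleted;
            this.modifyTime = modifyTime;
        }
    }

    /** In-memory stand-in for a table (assumption; a real task would issue SQL deletes). */
    static final class GroupStore {
        private final List<GroupRecord> rows = new ArrayList<>();

        synchronized void add(GroupRecord record) {
            rows.add(record);
        }

        synchronized int size() {
            return rows.size();
        }

        /** Physically removes up to batchSize rows that were logically deleted before the cutoff. */
        synchronized int purgeDeletedBefore(Instant cutoff, int batchSize) {
            int removed = 0;
            Iterator<GroupRecord> it = rows.iterator();
            while (it.hasNext() && removed < batchSize) {
                GroupRecord record = it.next();
                if (record.deleted && record.modifyTime.isBefore(cutoff)) {
                    it.remove();
                    removed++;
                }
            }
            return removed;
        }
    }

    /** Starts the purge task only when enabled; returns null when the feature is switched off. */
    static ScheduledExecutorService startIfEnabled(boolean enabled, GroupStore store,
            Duration retention, Duration interval, int batchSize) {
        if (!enabled) {
            return null;
        }
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(() -> {
            try {
                store.purgeDeletedBefore(Instant.now().minus(retention), batchSize);
            } catch (RuntimeException e) {
                // Swallow the error so one failed run does not cancel later runs.
                System.err.println("purge run failed: " + e.getMessage());
            }
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
        return executor;
    }

    public static void main(String[] args) throws InterruptedException {
        GroupStore store = new GroupStore();
        Instant now = Instant.now();
        store.add(new GroupRecord("old-deleted", true, now.minus(Duration.ofDays(30))));
        store.add(new GroupRecord("recent-deleted", true, now.minus(Duration.ofDays(1))));
        store.add(new GroupRecord("old-active", false, now.minus(Duration.ofDays(30))));

        ScheduledExecutorService executor =
                startIfEnabled(true, store, Duration.ofDays(7), Duration.ofHours(1), 100);
        Thread.sleep(200); // give the first run time to finish
        executor.shutdownNow();
        System.out.println("rows remaining: " + store.size());
    }
}
```

Keeping the task behind a switch and inside the manager process is what removes the need for an external script or CronJob. The retention window and batch size in the sketch are assumptions meant to keep each run bounded.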

@woofyzhao woofyzhao changed the title [Manager][INLONG-6151] Add optional data cleansing to control storage size [INLONG-6151][Manager] Add optional data cleansing to control storage size Oct 13, 2022
@woofyzhao woofyzhao changed the title [INLONG-6151][Manager] Add optional data cleansing to control storage size [INLONG-6151][Manager] Add optional data cleansing task to control storage size, optimize query indexes Oct 14, 2022
@healchow healchow changed the title [INLONG-6151][Manager] Add optional data cleansing task to control storage size, optimize query indexes [INLONG-6151][Manager] Add data cleansing task and optimize query indexes Oct 19, 2022
@healchow healchow force-pushed the INLONG-6151 branch 2 times, most recently from 5071fb0 to 74f1ac9 Compare October 19, 2022 06:49
@healchow healchow merged commit eb24f31 into apache:master Oct 20, 2022
Yizhou-Yang added a commit to Yizhou-Yang/inlong-yyz that referenced this pull request Oct 24, 2022
* feature-master: (24 commits)
  [INLONG-6256][Sort] Support debezium-json format with schema parse for DebeziumJsonDynamicSchemaFormat (apache#6259)
  [INLONG-6236][CVE] Unified the hive is 3.1.3 after fixed CVE-2021-34538 (apache#6262)
  [INLONG-6251][Agent] Fix the ConcurrentModification error in the unit test (apache#6252)
  [INLONG-6174][Sort] MySql connector support meta data with debezium format  (apache#6210)
  [INLONG-6243][Sort] Support custom name for Sort job (apache#6244)
  [INLONG-6239][InLong] Add inlongctl in the root directory (apache#6240)
  [INLONG-6236][CVE] Fix the CVE-2022-42003 for jackson-databind (apache#6237)
  [INLONG-6241][Docker] Add manager and audit database name configuration (apache#6242)
  [INLONG-6220][Manager] Support query cluster nodes by the manager client (apache#6221)
  [INLONG-6234][DataProxy] Adjust the source report information acquisition source (apache#6235)
  [INLONG-6151][Manager] Add data cleansing task and optimize query indexes (apache#6168)
  [INLONG-6224][Sort] Import schema and data parsing ability for DynamicSchemaFormat (apache#6225)
  [INLONG-6188] join supports multiple fields (apache#6189)
  [INLONG-6207][Agent] Optimize the unit test for TestSQLServerReader (apache#6208)
  [INLONG-6231][CVE] Upgrade org.apache.hive:hive-exec to 3.1.3 (apache#6199)
  [INLONG-6222][Manager] Fix parse source status failure in Command tools (apache#6223)
  [INLONG-6152][Sort] MySQL connector support filtering kinds of row data (apache#6173)
  [INLONG-6228][DataProxy] Modify the default values of the dataproxy-tube.conf file (apache#6229)
  [INLONG-6209][Manager] Clean and reuse code for DataNode (apache#6211)
  [INLONG-6194][Agent] Support parsing metrics for different components (apache#6195)
  ...
Development

Successfully merging this pull request may close these issues.

[Feature][Manager] Add logically deleted group purging
3 participants