-
Notifications
You must be signed in to change notification settings - Fork 78
feat(optimizer): [3/N] Analyzer #533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
209 commits
Select commit
Hold shift + click to select a range
2119555
feat(optimizer): add data model — schema, entities, DTOs, converters
mkuchenbecker 3c93d52
fix: address PR review feedback on optimizer data model
mkuchenbecker d419eb3
feat(optimizer): add repositories and repository tests
mkuchenbecker 7ff3b43
fix: consolidate repo methods — single find with optional filters
mkuchenbecker f7f6812
feat(optimizer): add REST service layer, controllers, and shared module
mkuchenbecker ef3260f
fix: update service impl to use consolidated find methods
mkuchenbecker ac1da01
feat(optimizer): add apps/optimizer shared module with find-only repos
mkuchenbecker be353ca
Merge mkuchenb/optimizer-1 into optimizer-2
mkuchenbecker 02a5ab3
fix: remove orphan fields from CompleteOperationRequest
mkuchenbecker 5c78c8f
Merge mkuchenb/optimizer-0 into optimizer-1
mkuchenbecker 2ddc445
Merge mkuchenb/optimizer-1 into optimizer-2
mkuchenbecker 01466c7
feat(optimizer): add service-layer integration tests
mkuchenbecker ff07fde
fix: assert stats history delta values in upsert test
mkuchenbecker c0802cb
feat(optimizer): add analyzer app — continuous table operation schedu…
mkuchenbecker 63b0768
fix: address PR review feedback on optimizer-3 analyzer
mkuchenbecker 1cbe556
Merge branch 'main' into mkuchenb/optimizer-0
mkuchenbecker 231e1a1
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker ff448a0
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 936f3f3
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker f82d1b3
fix(optimizer): address PR #527 review feedback
mkuchenbecker e907a31
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker a109f02
fix(optimizer): propagate optimizer-0 renames into repos and tests
mkuchenbecker a1ef430
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker df01c26
fix(optimizer): propagate optimizer-0 renames into service + controller
mkuchenbecker 67538f8
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 11dd115
fix(optimizer): propagate optimizer-0 renames into apps/optimizer + a…
mkuchenbecker 027fccd
fix(optimizer): add databaseName + tableName to apps/optimizer histor…
mkuchenbecker e0a49da
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 4f4158d
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 79753f1
fix(optimizer): index table_operations_history on (database_name, tab…
mkuchenbecker ae610ae
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 85a432f
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker dceef97
feat(optimizer): unify REST prefix to /v1/optimizer; add name-based h…
mkuchenbecker 7170599
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker bf04488
fix(optimizer): align apps/optimizer entities with services schema
mkuchenbecker f2ac002
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 5f6fa3b
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 8054586
refactor(optimizer-analyzer): delete unused AnalyzerConfig
mkuchenbecker 5af5f14
refactor(optimizer-analyzer): remove circuit breaker, defer with TODO
mkuchenbecker 62f426a
feat(optimizer): add findLatestPerTable to history repo
mkuchenbecker e13a31b
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 442f1e4
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker c4f194a
perf(optimizer-analyzer): use findLatestPerTable for history lookup
mkuchenbecker 6da624a
refactor(optimizer-analyzer): typed OperationType/Status, polish cade…
mkuchenbecker 52ba858
refactor(optimizer-analyzer): rename OrphanFilesDeletionAnalyzer → Ca…
mkuchenbecker beedad8
fix(optimizer-analyzer): update class name inside renamed files
mkuchenbecker 3483b25
perf(optimizer): index table_operations_history for findLatestPerTable
mkuchenbecker 9748548
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker e4a1ad1
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker f663537
docs(optimizer-analyzer): add scale roadmap as block comment
mkuchenbecker d7e3a65
docs(optimizer-analyzer): move scale roadmap to BDP-102182
mkuchenbecker 0293009
feat(optimizer): add findDistinctDatabaseNames to TableStatsRepository
mkuchenbecker 0efab45
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 6fa885d
refactor(optimizer): Optional<T> for optional filter params in servic…
mkuchenbecker a5585f4
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker dd4faf2
refactor(optimizer-analyzer): address PR review — required op, per-db…
mkuchenbecker 91ba362
style(optimizer-analyzer): tighten AnalyzerRunner.analyze body
mkuchenbecker eba1392
feat(optimizer): promote internal model types to shared apps/optimizer
mkuchenbecker 10ed1bb
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 35bcd38
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 0dbe3d9
refactor(optimizer-analyzer): import shared model + explanatory comment
mkuchenbecker e576593
refactor(optimizer): rename apps/optimizer entities + repos to plural…
mkuchenbecker 9f88a4a
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 6f98e1a
refactor(optimizer): consolidate entities/repos into apps/optimizer; …
mkuchenbecker d44783f
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker d90c26f
refactor(optimizer): move apps/optimizer module into services/optimizer
mkuchenbecker 62f33b7
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 17e280f
refactor(optimizer): drop apps/optimizer-data dep; simplify history API
mkuchenbecker a5df7e4
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker b0898e3
refactor(optimizer-analyzer): depend on :services:optimizer
mkuchenbecker 9a129a8
refactor(optimizer): align data model — rename HistoryStatus; String …
mkuchenbecker a8978a0
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker dfb9102
refactor(optimizer): realign entity shapes with optimizer-0
mkuchenbecker e3bf9e1
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker fb71bd9
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 681407e
feat(optimizer): add internal model layer
mkuchenbecker 2005bca
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker b689969
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 7ea8868
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker d7767e8
fix(optimizer-analyzer): rewrite AnalyzerRunnerTest to use entity bui…
mkuchenbecker e3fb777
perf(optimizer): index table_operations_history for findLatestPerTable
mkuchenbecker f89889d
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker beaaf88
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 4a8796c
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker d3e1726
refactor(optimizer): enforce layer boundaries in api/ + model/
mkuchenbecker db9513a
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 1d469a7
refactor(optimizer): remove db-layer types from optimizer-0
mkuchenbecker eee8eca
refactor(optimizer): remove DB schema + schema-init properties
mkuchenbecker 0567753
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 328e5b9
refactor(optimizer): scrub MySQL / JPA / datasource references
mkuchenbecker f7a5d20
refactor(optimizer): drop UpsertTableOperationsRequest
mkuchenbecker 2a532b5
refactor(optimizer): drop JobResult from the wire and internal model
mkuchenbecker 2e3a231
feat(optimizer): add debug echo fields to CompleteOperationRequest
mkuchenbecker db5eb29
refactor(optimizer): move application.properties out of optimizer-0
mkuchenbecker bbcf84a
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker ac3abc0
feat(optimizer): introduce db/ layer with per-layer types
mkuchenbecker e79eec7
refactor(optimizer): split TableStats envelope into snapshot + delta …
mkuchenbecker f955ded
fix(optimizer): drop CommitDeltaMetrics from TableStatsRow
mkuchenbecker 13987c1
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 969949d
refactor(optimizer): rewire service layer onto api/model/db mappers
mkuchenbecker 861b584
feat(optimizer): extend model layer for service-only types
mkuchenbecker 41d4c6d
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 69d9e8f
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker b60a3bf
feat(optimizer): extend ModelDbMapper for service-only types
mkuchenbecker eb6e3be
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker b80b2e5
refactor(optimizer): service layer returns only model/ types
mkuchenbecker ef453ca
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker ad11533
refactor(optimizer-analyzer): consume model/ types and ModelDbMapper
mkuchenbecker 25d98aa
feat(optimizer): restore batch CAS methods on TableOperationsRepository
mkuchenbecker 31fac5b
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 51dab67
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 188713d
docs(optimizer): comment every field on opt-0 api/ and model/ types
mkuchenbecker f060b5e
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 1119699
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 619df83
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 8d64273
refactor(optimizer): remove clusterId from SnapshotMetrics
mkuchenbecker ee7bcab
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker c1ad246
refactor(optimizer): comment every db/ field; drop clusterId and version
mkuchenbecker 72b431c
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 0b30130
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker c72aae8
refactor(optimizer): move api↔model conversion onto api types; delete…
mkuchenbecker 1fca287
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 8ae8777
refactor(optimizer): move model↔db conversion onto model types; delet…
mkuchenbecker b3aacff
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker bb8aa4d
refactor(optimizer): service + controllers use type to/from methods
mkuchenbecker 6a23755
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 95456be
refactor(optimizer-analyzer): use type to/from methods; drop ModelDbM…
mkuchenbecker af23d5e
fix(optimizer): make TableStats self-describing; route DTO conversion…
mkuchenbecker 3864e42
chore(optimizer): cascade self-describing TableStats from opt-0 to opt-1
mkuchenbecker 0a1125b
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker a6045b5
feat(optimizer): add TableStats↔TableStatsRow conversion on model
mkuchenbecker 4427de0
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker db5921e
refactor(optimizer): service stats methods take/return TableStats, no…
mkuchenbecker d82c17f
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 3aebf64
chore(optimizer): enable toBuilder on model.Table and model.TableOper…
mkuchenbecker bf30f86
chore(optimizer): cascade toBuilder annotations from opt-0 to opt-1
mkuchenbecker faba6d7
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 1d56fa6
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker b6c7f42
refactor(optimizer): drop fileCount enrichment from model.TableOperation
mkuchenbecker 177af95
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 487ac56
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 6ffc703
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 7f51360
chore(analyzer): add TODOs for scale tests and query-builder migration
mkuchenbecker 2b06c92
feat(repo): add findClaimedIds for transactional batch-claim verifica…
mkuchenbecker 5b5aae2
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker c862777
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 437a0ed
refactor(optimizer): add Dto suffix to all api/model classes (PR #527…
mkuchenbecker aabb51c
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker 928d537
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker eedf6d0
refactor(optimizer): update controllers for renamed api/model Dto types
mkuchenbecker 1166efb
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 4f98c22
refactor(optimizer): rename api.model package to api.spec (PR #527 re…
mkuchenbecker 2c26872
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker b849b7d
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 231efde
refactor(optimizer): update controller imports for api.model -> api.s…
mkuchenbecker a2580b1
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker f1c500b
refactor(optimizer): hard-fail in AnalyzerRunner if no analyzer is re…
mkuchenbecker f788ab6
fix(analyzer): point @EntityScan at the actual db package
mkuchenbecker b31decf
refactor(optimizer): move Dto suffix from api/spec to model
mkuchenbecker caf3294
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker c6a64bf
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 91e89ef
refactor(optimizer): update controller + service refs after Dto suffi…
mkuchenbecker 8a4251d
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker c305aa9
refactor(analyzer): update model refs after Dto suffix swap
mkuchenbecker 4e86569
feat(optimizer): propagate jobId through model + api conversions
mkuchenbecker cc8aa80
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker efcceea
feat(optimizer): propagate jobId through model ↔ db conversions
mkuchenbecker f85edd5
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker c00f201
chore(optimizer): rename OPTIMIZER_DB_USERNAME → OPTIMIZER_DB_USER
mkuchenbecker 6998a0f
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 1fe71f0
refactor(optimizer): rename CompleteOperationRequest → UpdateOperatio…
mkuchenbecker fb5e726
Merge branch 'mkuchenb/optimizer-0' into mkuchenb/optimizer-1
mkuchenbecker ad0c0f1
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 947bedf
refactor(optimizer): rename completeOperation → updateOperation
mkuchenbecker ce5745b
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker b96c388
Merge remote-tracking branch 'linkedin/main' into mkuchenb/optimizer-1
mkuchenbecker d65b511
refactor(optimizer-repo): unify find/updateBatch with Optional params
mkuchenbecker 78de390
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 49e43bc
refactor(optimizer-service): use Optional repo API + configurable limit
mkuchenbecker 210d5f0
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 040046e
refactor(analyzer): switch to Optional repo API + configurable limit
mkuchenbecker b69e09a
test(optimizer-repo): truncate Instant to micros for CI precision
mkuchenbecker a028a98
Merge branch 'mkuchenb/optimizer-1' into mkuchenb/optimizer-2
mkuchenbecker 2b6f67c
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 6eb6a1e
Merge remote-tracking branch 'linkedin/main' into mkuchenb/optimizer-2
mkuchenbecker de39a78
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker a89e037
feat(optimizer): require limit on list-API endpoints
mkuchenbecker 1e361af
feat(optimizer): basic error-code handling across controllers
mkuchenbecker a37169d
refactor(optimizer): simplify error handling per PR review
mkuchenbecker 6416c9d
refactor(optimizer): drop GlobalExceptionHandler + ApiError; use Spri…
mkuchenbecker bbef386
refactor(optimizer): revert UpdateOperationRequest doc edits
mkuchenbecker 02bbc5c
(wip) feat(optimizer): basic error-code handling across controllers (…
mkuchenbecker 266e2a7
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 144da72
Merge remote-tracking branch 'linkedin/mkuchenb/optimizer-2' into mku…
mkuchenbecker c74ad5f
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 6ef7964
docs(optimizer): add @ApiResponses to controllers for OpenAPI spec
mkuchenbecker 357d5c2
Merge branch 'mkuchenb/optimizer-2' into mkuchenb/optimizer-3
mkuchenbecker 08bebcb
Merge commit '78ba0abd' into mkuchenb/optimizer-3
mkuchenbecker 9abde3b
refactor(optimizer): inline analyzer.getOperationType().toDb() per re…
mkuchenbecker e7321d0
refactor(optimizer-analyzer): audit fixes per review
mkuchenbecker eae3a4a
refactor(optimizer-analyzer): nest under apps/optimizer/analyzer
mkuchenbecker 2477530
refactor(optimizer-analyzer): move from apps/ to services/optimizer/a…
mkuchenbecker f529a5c
refactor(optimizer-analyzer): rename property to analyzer.max-tables-…
mkuchenbecker 8e5e97c
style: spotless format on AnalyzerRunner
mkuchenbecker 87b9a75
fix(optimizer-analyzer): paginate per-DB scan to process all tables
mkuchenbecker 4b05cb2
style: spotless format on AnalyzerRunner
mkuchenbecker 21dbf34
Update services/optimizer/analyzer/src/main/java/com/linkedin/openhou…
mkuchenbecker adbff95
(wip) refactor(optimizer-analyzer): split AnalyzerApplication into ap…
mkuchenbecker 663564c
docs(optimizer-analyzer): strip BDP ticket references from javadoc
mkuchenbecker File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| plugins { | ||
| id 'openhouse.springboot-ext-conventions' | ||
| id 'org.springframework.boot' version '2.7.8' | ||
| } | ||
|
|
||
| // Deployable Spring Boot wrapper around the analyzer library. Holds AnalyzerApplication (the | ||
| // @SpringBootApplication entry point) and application.properties; the analysis logic lives in | ||
| // :services:optimizer:analyzer. | ||
| dependencies { | ||
| implementation project(':services:optimizer:analyzer') | ||
| implementation 'org.springframework.boot:spring-boot-starter:2.7.8' | ||
| implementation 'org.springframework.boot:spring-boot-starter-data-jpa:2.7.8' | ||
| runtimeOnly 'mysql:mysql-connector-java:8.0.33' | ||
| } |
29 changes: 29 additions & 0 deletions
29
...lyzerapp/src/main/java/com/linkedin/openhouse/optimizer/analyzer/AnalyzerApplication.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| package com.linkedin.openhouse.optimizer.analyzer; | ||
|
|
||
| import java.util.List; | ||
| import org.springframework.boot.CommandLineRunner; | ||
| import org.springframework.boot.SpringApplication; | ||
| import org.springframework.boot.autoconfigure.SpringBootApplication; | ||
| import org.springframework.boot.autoconfigure.domain.EntityScan; | ||
| import org.springframework.context.annotation.Bean; | ||
| import org.springframework.data.jpa.repository.config.EnableJpaRepositories; | ||
|
|
||
| /** Entry point for the Optimizer Analyzer application. */ | ||
| @SpringBootApplication | ||
| @EntityScan(basePackages = "com.linkedin.openhouse.optimizer.db") | ||
| @EnableJpaRepositories(basePackages = "com.linkedin.openhouse.optimizer.repository") | ||
| public class AnalyzerApplication { | ||
|
|
||
| public static void main(String[] args) { | ||
| SpringApplication.run(AnalyzerApplication.class, args); | ||
| } | ||
|
|
||
| /** | ||
| * Runs the analyzer once per registered {@link OperationAnalyzer} per process invocation. Each | ||
| * call is scoped to one operation type; the runner iterates databases internally. | ||
| */ | ||
| @Bean | ||
| public CommandLineRunner run(AnalyzerRunner runner, List<OperationAnalyzer> analyzers) { | ||
| return args -> analyzers.forEach(a -> runner.analyze(a.getOperationType())); | ||
| } | ||
| } | ||
8 changes: 8 additions & 0 deletions
8
apps/optimizer/analyzerapp/src/main/resources/application.properties
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| spring.application.name=openhouse-optimizer-analyzer | ||
| spring.main.web-application-type=none | ||
| spring.datasource.url=${OPTIMIZER_DB_URL:jdbc:h2:mem:analyzerdb;DB_CLOSE_DELAY=-1;MODE=MySQL} | ||
| spring.datasource.username=${OPTIMIZER_DB_USER:sa} | ||
| spring.datasource.password=${OPTIMIZER_DB_PASSWORD:} | ||
| spring.jpa.hibernate.ddl-auto=none | ||
| ofd.success-retry-hours=16 | ||
| ofd.failure-retry-hours=1 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| plugins { | ||
| id 'openhouse.springboot-ext-conventions' | ||
| id 'org.springframework.boot' version '2.7.8' | ||
| } | ||
|
|
||
| // Library jar — the @SpringBootApplication entry point lives in :apps:optimizer:analyzerapp. | ||
| // Disable bootJar so we don't try to assemble a runnable jar from a library that has no main | ||
| // class; keep jar enabled so consumers (the apps wrapper) get a normal library artifact. | ||
| bootJar { | ||
| enabled = false | ||
| } | ||
|
|
||
| jar { | ||
| enabled = true | ||
| archiveClassifier = '' | ||
| } | ||
|
|
||
| dependencies { | ||
| // api: the analyzer's public types (e.g. OperationAnalyzer's signature, OperationTypeDto) come | ||
| // from :services:optimizer, so consumers of this library see them on their compile classpath. | ||
| api project(':services:optimizer') | ||
| implementation 'org.springframework.boot:spring-boot-starter:2.7.8' | ||
| implementation 'org.springframework.boot:spring-boot-starter-webflux:2.7.8' | ||
| implementation 'org.springframework.boot:spring-boot-starter-data-jpa:2.7.8' | ||
| implementation 'org.springframework.boot:spring-boot-starter-aop:2.7.8' | ||
| runtimeOnly 'mysql:mysql-connector-java:8.0.33' | ||
| testImplementation 'org.springframework.boot:spring-boot-starter-test:2.7.8' | ||
| testImplementation 'com.squareup.okhttp3:mockwebserver:4.10.0' | ||
| testRuntimeOnly 'com.h2database:h2' | ||
| } | ||
|
|
||
| test { | ||
| useJUnitPlatform() | ||
| } |
172 changes: 172 additions & 0 deletions
172
...izer/analyzer/src/main/java/com/linkedin/openhouse/optimizer/analyzer/AnalyzerRunner.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,172 @@ | ||
| package com.linkedin.openhouse.optimizer.analyzer; | ||
|
|
||
| import com.linkedin.openhouse.optimizer.model.OperationTypeDto; | ||
| import com.linkedin.openhouse.optimizer.model.TableDto; | ||
| import com.linkedin.openhouse.optimizer.model.TableOperationDto; | ||
| import com.linkedin.openhouse.optimizer.model.TableOperationsHistoryDto; | ||
| import com.linkedin.openhouse.optimizer.repository.TableOperationsHistoryRepository; | ||
| import com.linkedin.openhouse.optimizer.repository.TableOperationsRepository; | ||
| import com.linkedin.openhouse.optimizer.repository.TableStatsRepository; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Optional; | ||
| import java.util.stream.Collectors; | ||
| import lombok.RequiredArgsConstructor; | ||
| import lombok.extern.slf4j.Slf4j; | ||
| import org.springframework.data.domain.Pageable; | ||
| import org.springframework.stereotype.Component; | ||
| import org.springframework.transaction.annotation.Transactional; | ||
|
|
||
| /** | ||
| * Core analysis loop. For one operation type per call, iterates databases and evaluates each table | ||
| * in a database against the matching {@link OperationAnalyzer}. | ||
| * | ||
| * <p>Both sides of the join — current operations and latest history per (table, type) — are loaded | ||
| * into maps once per database before the table loop. This is correct at small scale (≤~100k | ||
| * tables); past that the per-db query shape and projection need further tuning. | ||
| * | ||
| * <p>The per-db working-set upper bound is not yet empirically validated. | ||
| */ | ||
| @Slf4j | ||
| @Component | ||
| @RequiredArgsConstructor | ||
| public class AnalyzerRunner { | ||
|
|
||
| private final List<OperationAnalyzer> analyzers; | ||
| private final TableStatsRepository statsRepo; | ||
| private final TableOperationsRepository operationsRepo; | ||
| private final TableOperationsHistoryRepository historyRepo; | ||
|
|
||
| /** | ||
| * Run the analysis loop for {@code operationType} across all databases, with no filters. | ||
| * Equivalent to {@link #analyze(OperationTypeDto, Optional, Optional, Optional)} with all-empty | ||
| * filters. | ||
| */ | ||
| public void analyze(OperationTypeDto operationType) { | ||
| analyze(operationType, Optional.empty(), Optional.empty(), Optional.empty()); | ||
| } | ||
|
|
||
| /** | ||
| * Run the analysis loop for the given operation type, optionally scoped to a single database, | ||
| * table name, or table UUID. Iterates databases one at a time so the working set is bounded by | ||
| * tables-per-db, not tables-total. | ||
| */ | ||
| public void analyze( | ||
|
mkuchenbecker marked this conversation as resolved.
|
||
| OperationTypeDto operationType, | ||
| Optional<String> databaseName, | ||
| Optional<String> tableName, | ||
| Optional<String> tableUuid) { | ||
| OperationAnalyzer analyzer = | ||
| analyzers.stream() | ||
| .filter(a -> a.getOperationType() == operationType) | ||
| .findFirst() | ||
| .orElseThrow( | ||
| () -> | ||
| new IllegalStateException( | ||
| "No analyzer registered for operation type " + operationType)); | ||
| List<String> dbs = databaseName.map(List::of).orElseGet(statsRepo::findDistinctDatabaseNames); | ||
| log.info("Analyzing {} across {} database(s)", operationType, dbs.size()); | ||
| dbs.forEach(db -> analyzeDatabase(analyzer, db, tableName, tableUuid)); | ||
| log.info("Analysis complete for {}", operationType); | ||
| } | ||
|
|
||
| @Transactional | ||
| void analyzeDatabase( | ||
| OperationAnalyzer analyzer, | ||
| String databaseName, | ||
| Optional<String> tableName, | ||
| Optional<String> tableUuid) { | ||
|
|
||
| // Load the three join inputs unbounded for this database. Aligned page-by-page pagination on | ||
| // these maps would leave keys in one map's page mismatched with the others' — a table whose | ||
| // op/history happens to fall in a different page would be misread as "no current op / no | ||
| // history" and trigger duplicate scheduling. Correctness requires the maps to be complete | ||
| // relative to the tables being processed; the working set is bounded by tables-in-db, not by | ||
| // any per-cycle cap. | ||
| Map<String, TableOperationDto> currentOps = | ||
| operationsRepo | ||
| .find( | ||
| Optional.of(analyzer.getOperationType().toDb()), | ||
| Optional.empty(), | ||
| tableUuid, | ||
| Optional.of(databaseName), | ||
| tableName, | ||
| Optional.empty(), | ||
| Optional.empty(), | ||
| Pageable.unpaged()) | ||
| .stream() | ||
| .filter(e -> e.getTableUuid() != null) | ||
| .map(TableOperationDto::fromRow) | ||
| .collect( | ||
| Collectors.toMap( | ||
| TableOperationDto::getTableUuid, op -> op, TableOperationDto::mostRecent)); | ||
|
|
||
| Map<String, TableOperationsHistoryDto> latestHistory = | ||
| historyRepo.findLatest(analyzer.getOperationType().toDb(), Pageable.unpaged()).stream() | ||
| .filter(r -> r.getTableUuid() != null) | ||
| .map(TableOperationsHistoryDto::fromRow) | ||
| .collect( | ||
| Collectors.toMap( | ||
| TableOperationsHistoryDto::getTableUuid, | ||
| h -> h, | ||
| TableOperationsHistoryDto::after)); | ||
|
|
||
| List<TableDto> tables = | ||
| statsRepo.find(Optional.of(databaseName), tableName, tableUuid, Pageable.unpaged()).stream() | ||
| .filter(row -> row.getTableUuid() != null) | ||
| .map(TableDto::fromRow) | ||
| .collect(Collectors.toList()); | ||
|
|
||
| /* | ||
| * For each table in this database, decide whether to create a new PENDING operation. | ||
| * | ||
| * 1. Skip tables not opted in to this operation type. | ||
| * 2. Look up the table's current active operation (if any) and its most recent completed | ||
| * history entry from the maps loaded above. | ||
| * 3. Delegate the schedule-or-not decision to the analyzer's shouldSchedule — strategy | ||
| * encapsulates cadence, retry policy, and any future per-operation signals. | ||
| * 4. On true, persist a new PENDING operation. The scheduler picks it up on its next pass. | ||
| */ | ||
| int created = 0; | ||
| int failed = 0; | ||
| for (TableDto table : tables) { | ||
| if (!analyzer.isEnabled(table)) { | ||
| continue; | ||
| } | ||
| Optional<TableOperationDto> currentOp = | ||
| Optional.ofNullable(currentOps.get(table.getTableUuid())); | ||
| Optional<TableOperationsHistoryDto> entry = | ||
| Optional.ofNullable(latestHistory.get(table.getTableUuid())); | ||
| if (!analyzer.shouldSchedule(table, currentOp, entry)) { | ||
| continue; | ||
| } | ||
| try { | ||
| TableOperationDto op = TableOperationDto.pending(table, analyzer.getOperationType()); | ||
| operationsRepo.save(op.toRow()); | ||
| log.debug( | ||
| "Created PENDING {} operation for table {}.{}", | ||
| analyzer.getOperationType(), | ||
| table.getDatabaseName(), | ||
| table.getTableId()); | ||
| created++; | ||
| } catch (RuntimeException e) { | ||
| // One bad table should not abort the rest of the database. Log and continue; the next | ||
| // analyzer pass will retry for any table whose save failed here. | ||
| log.error( | ||
| "Failed to create PENDING {} operation for table {}.{}: {}", | ||
| analyzer.getOperationType(), | ||
| table.getDatabaseName(), | ||
| table.getTableId(), | ||
| e.toString(), | ||
| e); | ||
| failed++; | ||
| } | ||
| } | ||
| log.info( | ||
| "Finished analyzing Database {}: created {} PENDING {} operation(s) ({} failed)", | ||
| databaseName, | ||
| created, | ||
| analyzer.getOperationType(), | ||
| failed); | ||
| } | ||
| } | ||
83 changes: 83 additions & 0 deletions
83
...va/com/linkedin/openhouse/optimizer/analyzer/CadenceBasedOrphanFilesDeletionAnalyzer.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| package com.linkedin.openhouse.optimizer.analyzer; | ||
|
|
||
| import com.linkedin.openhouse.optimizer.model.OperationTypeDto; | ||
| import com.linkedin.openhouse.optimizer.model.TableDto; | ||
| import com.linkedin.openhouse.optimizer.model.TableOperationDto; | ||
| import com.linkedin.openhouse.optimizer.model.TableOperationsHistoryDto; | ||
| import java.time.Duration; | ||
| import java.util.Optional; | ||
| import org.springframework.beans.factory.annotation.Value; | ||
| import org.springframework.stereotype.Component; | ||
|
|
||
| /** | ||
| * Decides when to schedule an Orphan-Files-Deletion (OFD) run for a table. | ||
| * | ||
| * <p>OFD removes data files in the table's storage directory that are no longer referenced by any | ||
| * Iceberg snapshot — left-over output from failed writes, expired snapshots, or interrupted | ||
| * compactions. Running it too often wastes compute; running it too rarely lets orphan files | ||
| * accumulate and bloats storage cost. This analyzer balances the two on a per-table cadence. | ||
| * | ||
| * <h2>When OFD fires for a table</h2> | ||
| * | ||
| * All of the following must be true: | ||
| * | ||
| * <ol> | ||
| * <li><b>Opt-in.</b> The table sets {@code maintenance.optimizer.ofd.enabled=true} in its table | ||
| * properties. Without this flag, the analyzer ignores the table entirely. | ||
| * <li><b>No active operation already in flight.</b> If the table has a non-CANCELED operation row | ||
| * (PENDING, SCHEDULING, or SCHEDULED), the scheduler already owns it and the analyzer stays | ||
| * out. A CANCELED row does not block — it is treated as if no operation exists. | ||
| * <li><b>Cadence elapsed since the last completed run.</b> | ||
| * <ul> | ||
| * <li>If the table has <i>no</i> prior history, schedule immediately. | ||
| * <li>If the most recent history entry is {@code SUCCESS}, wait {@code | ||
| * ofd.success-retry-hours} (default 16h) after its {@code completedAt} before | ||
| * scheduling again. Set below 24h so that even when a run lands at an unlucky time of | ||
| * day, at least one re-evaluation is guaranteed within any rolling 24-hour window. | ||
| * <li>If the most recent history entry is {@code FAILED}, wait {@code | ||
| * ofd.failure-retry-hours} (default 1h) before retrying — shorter than the success | ||
| * interval so transient failures recover quickly. | ||
| * </ul> | ||
| * </ol> | ||
| * | ||
| * <p>The two retry intervals are configurable via {@code application.properties} and can be tuned | ||
| * per environment. The opt-in property is per-table and managed through the standard table- | ||
| * properties API. | ||
| */ | ||
| @Component | ||
| public class CadenceBasedOrphanFilesDeletionAnalyzer implements OperationAnalyzer { | ||
|
|
||
| static final String OFD_ENABLED_PROPERTY = "maintenance.optimizer.ofd.enabled"; | ||
|
|
||
| private final CadencePolicy cadencePolicy; | ||
|
|
||
| public CadenceBasedOrphanFilesDeletionAnalyzer( | ||
| @Value("${ofd.success-retry-hours:16}") long successRetryHours, | ||
| @Value("${ofd.failure-retry-hours:1}") long failureRetryHours) { | ||
| this.cadencePolicy = | ||
| new CadencePolicy(Duration.ofHours(successRetryHours), Duration.ofHours(failureRetryHours)); | ||
| } | ||
|
|
||
| /** Package-private for tests that supply a pre-built {@link CadencePolicy}. */ | ||
| CadenceBasedOrphanFilesDeletionAnalyzer(CadencePolicy cadencePolicy) { | ||
| this.cadencePolicy = cadencePolicy; | ||
| } | ||
|
|
||
| @Override | ||
| public OperationTypeDto getOperationType() { | ||
| return OperationTypeDto.ORPHAN_FILES_DELETION; | ||
| } | ||
|
|
||
| @Override | ||
| public boolean isEnabled(TableDto table) { | ||
| return "true".equals(table.getTableProperties().get(OFD_ENABLED_PROPERTY)); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean shouldSchedule( | ||
| TableDto table, | ||
| Optional<TableOperationDto> currentOp, | ||
| Optional<TableOperationsHistoryDto> latestHistory) { | ||
| return cadencePolicy.shouldSchedule(currentOp, latestHistory); | ||
| } | ||
| } |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.