feat: Blue green cache #1061

Merged
merged 8 commits into from Feb 27, 2023

@chubei chubei commented Feb 26, 2023

Closes #928

Removed clean on dozer startup.

Relevant logs:

...
2023-02-26T18:38:25.832862Z  INFO [pipeline] Inconsistent, resetting    
2023-02-26T18:38:25.841439Z  INFO [pipeline] Cache stocks writing to 82ffe650-9e40-4dea-b6d4-44503123994e while serving 6e7f4187-1e15-4214-ba79-300c0172c648    
2023-02-26T18:38:25.848192Z  INFO [pipeline] Cache stocks_meta writing to 542af2db-2096-4ef7-94d6-807592ed7077 while serving 04cd4326-4e5e-4837-8029-255660bbff07    
...  
2023-02-26T18:38:27.098957Z  INFO [api] Redirecting cache stocks to 82ffe650-9e40-4dea-b6d4-44503123994e    
2023-02-26T18:38:27.428381Z  INFO [api] Redirecting cache stocks_meta to 542af2db-2096-4ef7-94d6-807592ed7077    
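
The log lines above suggest an indirection from a stable cache name (`stocks`) to the instance currently being served, with the pipeline writing to a second instance until the API is redirected. A minimal sketch of that idea, assuming a simple in-memory table; `AliasTable`, `redirect`, and `resolve` are hypothetical names for illustration, not dozer's actual API:

```rust
use std::collections::HashMap;

/// Hypothetical alias table: maps a stable cache name to the id of the
/// cache instance that readers are currently served from.
#[derive(Default)]
struct AliasTable {
    serving: HashMap<String, String>,
}

impl AliasTable {
    /// Points `name` at `instance` and returns the instance that was
    /// served before, if any, so the caller can retire it.
    fn redirect(&mut self, name: &str, instance: &str) -> Option<String> {
        self.serving.insert(name.to_string(), instance.to_string())
    }

    /// Returns the instance id that reads for `name` should go to.
    fn resolve(&self, name: &str) -> Option<&str> {
        self.serving.get(name).map(String::as_str)
    }
}
```

Readers only ever call `resolve`, so a single `redirect` call moves all subsequent reads to the new instance while the old one remains intact until it is cleaned up.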

@chubei chubei requested a review from v3g42 February 26, 2023 18:39

chubei commented Feb 26, 2023

There are still two problems:

  1. The processor nodes don't have consistent names across dozer runs, so the pipeline is never considered consistent and always restarts the sources from scratch.
  2. Currently, the cache switch happens on the first SnapshottingDone message from any ancestor source. I'm not sure this is a good strategy.
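
One possible direction for the first problem is to derive each node's name from its definition rather than from anything that varies between runs. The thread does not say how names are generated today, so this is only a sketch under that assumption; `fnv1a64`, `stable_node_name`, and the `kind`/`definition`/`inputs` parameters are hypothetical. FNV-1a is used here because it is trivial to implement and its output does not depend on the Rust release, unlike the standard library's default hasher, whose algorithm is unspecified.

```rust
/// 64-bit FNV-1a hash. Implemented by hand so the result is identical
/// across runs, platforms, and compiler versions.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Builds a node name from the node's kind, its definition (for example
/// the SQL fragment it was compiled from), and the names of its inputs.
/// The same arguments always yield the same name.
fn stable_node_name(kind: &str, definition: &str, inputs: &[&str]) -> String {
    // Join the parts with NUL so ("ab", "c") and ("a", "bc") differ.
    let mut key = String::new();
    key.push_str(kind);
    key.push('\0');
    key.push_str(definition);
    for input in inputs {
        key.push('\0');
        key.push_str(input);
    }
    format!("{}_{:016x}", kind, fnv1a64(key.as_bytes()))
}
```

With names like these, two runs over an unchanged configuration would produce the same node names, while any change to a node's definition or inputs would produce a different name and still trigger a reset.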

@coveralls

Pull Request Test Coverage Report for Build 4276399061

  • 473 of 591 (80.03%) changed or added relevant lines in 46 files are covered.
  • 69 unchanged lines in 9 files lost coverage.
  • Overall coverage increased (+0.1%) to 72.831%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| dozer-api/src/errors.rs | 0 | 1 | 0.0% |
| dozer-api/src/grpc/internal/internal_pipeline_client.rs | 60 | 61 | 98.36% |
| dozer-api/src/grpc/shared_impl/mod.rs | 6 | 7 | 85.71% |
| dozer-cache/src/cache/lmdb/cache/checkpoint_database.rs | 54 | 55 | 98.18% |
| dozer-cache/src/cache/lmdb/cache_manager.rs | 18 | 19 | 94.74% |
| dozer-core/src/tests/dag_schemas.rs | 0 | 1 | 0.0% |
| dozer-ingestion/src/connectors/kafka/tests.rs | 0 | 1 | 0.0% |
| dozer-orchestrator/src/cli/repl/sql.rs | 0 | 1 | 0.0% |
| dozer-orchestrator/src/pipeline/streaming_sink.rs | 0 | 1 | 0.0% |
| dozer-sql/src/pipeline/aggregation/processor.rs | 0 | 1 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| dozer-api/src/grpc/shared_impl/mod.rs | 1 | 87.91% |
| dozer-api/src/grpc/typed/tests/fake_internal_pipeline_server.rs | 1 | 84.62% |
| dozer-core/src/builder_dag.rs | 1 | 88.35% |
| dozer-orchestrator/src/pipeline/sinks.rs | 2 | 75.8% |
| dozer-core/src/forwarder.rs | 5 | 96.86% |
| dozer-orchestrator/src/simple/orchestrator.rs | 7 | 63.72% |
| dozer-core/src/executor/receiver_loop.rs | 10 | 94.51% |
| dozer-api/src/grpc/types_helper.rs | 13 | 46.62% |
| dozer-ingestion/src/connectors/object_store/schema_helper.rs | 29 | 51.67% |

| Totals | |
|---|---|
| Change from base Build 4270817525: | 0.1% |
| Covered Lines: | 27769 |
| Relevant Lines: | 38128 |

💛 - Coveralls

@chubei chubei merged commit d7ede07 into getdozer:main Feb 27, 2023
@chubei chubei deleted the feat/bgcache branch February 27, 2023 15:22

v3g42 commented Feb 27, 2023

> There're still two problems:
>
>   1. The processor nodes don't have consistent names across dozer runs, so the pipeline is never consistent and always starts the sources from scratch.
>   2. Currently cache switch happens on the first SnapshottingDone message from whichever ancestor sources. I'm not sure if this is a good strategy.

I don't fully follow the implications of switching on the first SnapshottingDone. Ideally, the switch should happen once the versions in the new cache have caught up with those in the old one. I think Debezium describes this in full detail in some article.
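
To make the catch-up idea concrete, here is a rough sketch of what such a check could look like. It assumes each cache can report, per source, a monotonically increasing position (a version or sequence number) of the last change it has applied; that bookkeeping and the name `caught_up` are assumptions for illustration and are not taken from dozer or from Debezium.

```rust
use std::collections::HashMap;

/// Returns true once the new cache has, for every source known to the old
/// cache, applied changes up to at least the position the old cache reached.
/// Positions are hypothetical per-source sequence numbers.
fn caught_up(old: &HashMap<String, u64>, new: &HashMap<String, u64>) -> bool {
    old.iter().all(|(source, old_position)| {
        new.get(source)
            // A source the new cache has not seen yet is not caught up.
            .map_or(false, |new_position| new_position >= old_position)
    })
}
```

Under this criterion the switch would wait for the slowest source, instead of firing on whichever source finishes snapshotting first.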

Let's discuss this separately. For now, I think this is fair enough.

Also, in some cases users might want to explicitly trigger the switch.


chubei commented Feb 27, 2023

> Also, in some cases users might want to explicitly trigger the switch.

We can add a method to the internal gRPC server that creates the new alias and notifies the API server.
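
A rough sketch of what such a method could do, with the gRPC layer left out and a standard-library channel standing in for the notification to the API server. `InternalServer`, `switch_cache`, and `AliasRedirected` are hypothetical names; nothing here reflects the actual internal server interface.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{SendError, Sender};

/// Hypothetical notification delivered to the API server after a switch.
#[derive(Debug, Clone, PartialEq)]
struct AliasRedirected {
    alias: String,
    instance: String,
}

/// Stand-in for the internal server state: the alias table plus a channel
/// to whoever needs to hear about redirects.
struct InternalServer {
    aliases: HashMap<String, String>,
    api_notifier: Sender<AliasRedirected>,
}

impl InternalServer {
    /// Explicitly switches `alias` to `instance` and notifies the API side.
    /// Returns an error if the receiving end has been dropped; in that case
    /// the alias has already been updated, which a real implementation
    /// would need to handle.
    fn switch_cache(
        &mut self,
        alias: &str,
        instance: &str,
    ) -> Result<(), SendError<AliasRedirected>> {
        self.aliases.insert(alias.to_string(), instance.to_string());
        self.api_notifier.send(AliasRedirected {
            alias: alias.to_string(),
            instance: instance.to_string(),
        })
    }
}
```

The same method could serve both the automatic switch and a user-triggered one, so the two paths would not diverge.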

Successfully merging this pull request may close these issues.

Blue/Green cache