Split `ct_worker` into `ct_worker` and `generic_log_worker` #49

rozbb · 2025-06-12T05:01:35Z

This one looks like a lot but it's mostly just moving stuff around. I recommend reading commit-by-commit.

Some open problems:

* I had to remove config_roots from the sequencer's metrics, since the generic sequencer knows nothing about roots. Not sure what the best way to fix that is.
* We are now exporting the ObjectBackend async trait. Apparently exporting async traits triggers a warning:

warning: use of `async fn` in public traits is discouraged as auto trait bounds cannot be specified
   --> crates/generic_log_worker/src/lib.rs:379:5
    |
379 |     async fn upload(&self, key: &str, data: &[u8], opts: &UploadOptions) -> Result<()>;
    |     ^^^^^
    |
    = note: you can suppress this lint if you plan to use the trait only in your own code, or do not care about auto traits like `Send` on the `Future`
    = note: `#[warn(async_fn_in_trait)]` on by default
help: you can alternatively desugar to a normal `fn` that returns `impl Future` and add any desired bounds such as `Send`, but these cannot be relaxed without a breaking API change
    |
379 -     async fn upload(&self, key: &str, data: &[u8], opts: &UploadOptions) -> Result<()>;
379 +     fn upload(&self, key: &str, data: &[u8], opts: &UploadOptions) -> impl std::future::Future<Output = Result<()>> + Send;
    |
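For reference, the desugaring the compiler suggests could look roughly like the sketch below. `UploadOptions`, `Result`, `MemoryBackend`, and the `block_on` helper are hypothetical stand-ins written for this example, not the crate's actual definitions, and whether a `Send` bound is even satisfiable for futures in the Workers runtime is an open question this sketch does not answer. It needs Rust 1.75 or later.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the crate's real types (assumptions for this sketch).
pub type Result<T> = std::result::Result<T, String>;

#[derive(Default)]
pub struct UploadOptions {
    pub immutable: bool,
}

pub trait ObjectBackend {
    // Desugared form suggested by the compiler: a plain `fn` returning
    // `impl Future`, with `Send` spelled out as part of the public contract.
    fn upload(
        &self,
        key: &str,
        data: &[u8],
        opts: &UploadOptions,
    ) -> impl Future<Output = Result<()>> + Send;
}

// Toy in-memory backend, used only to exercise the trait.
#[derive(Default)]
pub struct MemoryBackend {
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl MemoryBackend {
    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.objects.lock().ok()?.get(key).cloned()
    }
}

impl ObjectBackend for MemoryBackend {
    // Implementors can still write `async fn`; the resulting future must be `Send`.
    async fn upload(&self, key: &str, data: &[u8], _opts: &UploadOptions) -> Result<()> {
        let mut objects = self.objects.lock().map_err(|e| e.to_string())?;
        objects.insert(key.to_owned(), data.to_vec());
        Ok(())
    }
}

// Compile-time check that a value is `Send`.
pub fn require_send<T: Send>(value: T) -> T {
    value
}

// Minimal busy-polling executor so the sketch runs without an async runtime.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The alternative named in the compiler note is to keep `async fn` and put `#[allow(async_fn_in_trait)]` on the trait, which avoids committing to `Send` in the public API.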

Also, I'm not sure if I made the right cut on the API. I think the contents of ct_worker/src/frontend_worker.rs could be more succinct. Curious to hear what you think!

Finally, the readme for the generic log sequencer is mostly a copy of the CT one. I could use help adding to it if you think it should have more.

lukevalenta

This is looking good! Leaving comments so far and I'll finish the review tomorrow.

crates/generic_log_worker/src/util.rs

lukevalenta · 2025-06-12T17:23:19Z

crates/generic_log_worker/src/sequencer_do.rs

+// The number of entries in the short-term deduplication cache.
+// This cache provides a secondary deduplication layer to bridge the gap in KV's eventual consistency.
+// It should hold at least <maximum-entries-per-second> x <kv-eventual-consistency-time (60s)> entries.
+const MEMORY_CACHE_SIZE: usize = 300_000;


Note to self: probably want to add this to the app/log config.
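As a rough illustration of the sizing rule in the comment above, the cache size could be derived from two config values instead of being hardcoded. The struct and field names here are hypothetical, not part of the PR. With the 60-second consistency window from the comment, the current value of 300,000 corresponds to 5,000 entries per second.

```rust
// Hypothetical config knobs for the short-term deduplication cache.
// The names are illustrative assumptions, not taken from the PR.
pub struct DedupCacheConfig {
    pub max_entries_per_second: usize,
    pub kv_consistency_window_secs: usize,
}

impl DedupCacheConfig {
    // Minimum number of cache entries needed to cover the window in which
    // KV may not yet reflect recent writes: rate x window.
    pub fn min_cache_size(&self) -> usize {
        self.max_entries_per_second
            .saturating_mul(self.kv_consistency_window_secs)
    }
}
```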

crates/generic_log_worker/src/sequencer_do.rs

crates/generic_log_worker/src/metrics.rs

crates/generic_log_worker/src/batcher_do.rs

crates/generic_log_worker/src/config.rs

crates/generic_log_worker/src/sequencer_do.rs

crates/ct_worker/src/batcher_do.rs

crates/ct_worker/src/sequencer_do.rs

rozbb · 2025-06-13T05:48:51Z

I pushed a commit that makes some of the changes you suggested. Let me know if I interpreted correctly. If so, I gotta do some cleanup.

It has two TODOs in there, though, that I could use feedback on:

* If GenericSequencer::new() errors, it's because it couldn't fetch the name from State. Does this ever happen?
* How precisely does naming work? Logs were previously lazily loaded on first fetch by name (in the request body). Is that request name necessarily always the durable object name? It seems like it's retrieved from the URL of end-user GET queries

* Move app config package back to ct_worker. The justification here is that each application is going to need to manage its own schema, and the AppConfig struct is directly used for parsing that config.
* Add SequencerConfig (renamed from LogConfig) and BatcherConfig with options that are relevant for generic sequencers and batchers.
* Rename Metrics -> SequencerMetrics.
* Construct the metrics registry in the app-specific code and pass it in to the Sequencer constructor. This allows apps to register their own custom metrics, if desired.
* Use `#[event(start)]` for initializing logging and panic handling. This is the first thing that runs when the wasm is loaded.
* Make the `new()` methods for GenericSequencer and GenericBatcher infallible, and handle the errors (by panicking, which is all we can really do) in the calling app.

lukevalenta · 2025-06-13T19:05:30Z

> If GenericSequencer::new() errors, it's because it couldn't fetch the name from State. Does this ever happen?

In my edits I made GenericSequencer::new() infallible, and just panic in StaticCTSequencer::new() if we don't know the name. The only way the 'name' property wouldn't be present is if someone updates the frontend worker to try to create a Sequencer with a random ID instead of a log name, which would be a programmer error.
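A minimal sketch of that split, assuming simplified types: `State` below is a hypothetical stand-in that only models the optional object name, and the struct bodies are placeholders rather than the real definitions. The generic constructor takes an already-resolved name and cannot fail; the app-specific wrapper is the one place that panics.

```rust
// Hypothetical stand-in for the Durable Object state. It only models the
// optional object name, which is absent for objects created from a random ID.
pub struct State {
    name: Option<String>,
}

impl State {
    pub fn named(name: &str) -> Self {
        Self { name: Some(name.to_owned()) }
    }

    pub fn unnamed() -> Self {
        Self { name: None }
    }

    pub fn name(&self) -> Option<&str> {
        self.name.as_deref()
    }
}

// Placeholder for the generic sequencer; the real struct holds much more.
pub struct GenericSequencer {
    name: String,
}

impl GenericSequencer {
    // Infallible: the caller has already resolved the log name.
    pub fn new(name: String) -> Self {
        Self { name }
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}

// App-specific wrapper: a missing name is a programmer error, so it panics.
pub struct StaticCTSequencer(pub GenericSequencer);

impl StaticCTSequencer {
    pub fn new(state: &State) -> Self {
        let name = state
            .name()
            .expect("sequencer must be created from a log name, not a random ID");
        Self(GenericSequencer::new(name.to_owned()))
    }
}
```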

> How precisely does naming work? Logs were previously lazily loaded on first fetch by name (in the request body). Is that request name necessarily always the durable object name? It seems like it's retrieved from the URL of end-user GET queries

The basic idea is that we have an allowlist of log names in the configuration, and every entrypoint must check `valid_log_name` to make sure that name is in the allowlist. If not, we should return a 404. Otherwise, the log name is always the name of the Durable Object (except for Batchers, which have the `_<shard_id>` suffix that needs to be stripped off to get the log name).
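The naming rule can be sketched as follows. The function signatures are hypothetical (only the name `valid_log_name` comes from the discussion), the allowlist is modeled as a plain slice, and the sketch assumes the shard ID is a decimal number.

```rust
// Hypothetical signature: checks a name against the configured allowlist.
pub fn valid_log_name(allowlist: &[&str], name: &str) -> bool {
    allowlist.contains(&name)
}

// Recovers the log name from a Batcher object name of the form
// `<log_name>_<shard_id>`. Splitting at the last underscore keeps log names
// that themselves contain underscores intact.
// Assumption: the shard ID is a non-empty decimal number.
pub fn log_name_from_batcher(object_name: &str) -> Option<&str> {
    let (log_name, shard_id) = object_name.rsplit_once('_')?;
    if !shard_id.is_empty() && shard_id.bytes().all(|b| b.is_ascii_digit()) {
        Some(log_name)
    } else {
        None
    }
}

// Entry-point check for a Batcher: returns the log name, or the HTTP status
// to respond with when the name is not on the allowlist.
pub fn resolve_batcher_log<'a>(
    allowlist: &[&str],
    object_name: &'a str,
) -> Result<&'a str, u16> {
    match log_name_from_batcher(object_name) {
        Some(name) if valid_log_name(allowlist, name) => Ok(name),
        _ => Err(404),
    }
}
```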

rozbb · 2025-06-14T02:13:56Z

Ah ok then. This makes a lot of sense. Agreed on your edits!

Michael Rosenberg added 15 commits June 10, 2025 23:30

Remove config roots

7a335f1

Start moving ct_worker to new_ct_worker; the former will be use-agnostic

982a95c

Remove CONFIG stuff from old ct_worker; move CT batcher stuff to new_…

d1f44a3

…ct_worker

Move all the deployment stuff to new_ct_worker

4060e73

Make config a module, not a crate

7563948

Move open_checkpoint from static_ct_api to tlog_tiles; remove static_…

793f322

…ct_api dep from old ct_worker

Add back the check that extension lines are empty; add the ability to…

acbc882

… specify extension in tlog_tiles signing

Move extension verification out of tlog_tile and into generic sequencer

f18c877

Rename ct_worker -> generic_log_worker

1f0746a

Rename new_ct_worker -> ct_worker

dd0aa77

Move ct_worker metadata back to where it belongs

db742ec

Prune unnecessary imports

084b494

Fix static_ct_api doctest

788cc9f

Add README and LICENSE to generic log worker

0e22e63

Remove unnecessary dependencies

ff1831e

lukevalenta reviewed Jun 12, 2025

Move GenericSequencer's log config fetching from initialize() to new()

c615430

lukevalenta added 4 commits June 13, 2025 13:57

Clean up deps

b4471b0

Remove ct_worker/src/ctlog.rs and move upload_issuers from frontend_w…

9f57477

…orker

bugfix: strip suffix from batcher object name to get log name

83fb9de

Fix generic_log_worker Cargo.toml metadata

af7f66f

lukevalenta merged commit 07e3c5c into cloudflare:main Jun 16, 2025
1 check passed
