@danhhz
Contributor

@danhhz danhhz commented Oct 6, 2023

This allows for faster iteration on mz-stash itself, as is happening in Pv2.

Conceptually, stash is only used in envd (by the catalog and by the storage controller), but it was being pulled in by all sorts of crates just for the objects protos. Separate out the protos (plus StashError) into a new mz-stash-types crate that is still depended on by lots of things, but changes less often.
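To make the effect of the split concrete, here is a minimal sketch that computes which crates are rebuilt when one crate is touched (the crate itself plus its transitive reverse dependencies). The two graphs are simplified, hypothetical stand-ins, not the real Materialize dependency graph; `shared-crate` is an invented name for any crate that only needs the protos.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Crate name -> direct dependencies.
type DepGraph = BTreeMap<&'static str, Vec<&'static str>>;

/// Returns every crate that is rebuilt when `changed` is touched:
/// `changed` itself plus all of its transitive reverse dependencies.
pub fn rebuild_set(deps: &DepGraph, changed: &str) -> BTreeSet<String> {
    let mut dirty = BTreeSet::from([changed.to_string()]);
    loop {
        let before = dirty.len();
        for (krate, krate_deps) in deps {
            // A crate is dirty as soon as any direct dependency is dirty.
            if krate_deps.iter().any(|dep| dirty.contains(*dep)) {
                dirty.insert(krate.to_string());
            }
        }
        // Fixed point reached: no new crates were marked dirty.
        if dirty.len() == before {
            return dirty;
        }
    }
}

/// Hypothetical graph before the split: everything reaches mz-stash.
pub fn before_split() -> DepGraph {
    BTreeMap::from([
        ("mz-stash", vec![]),
        ("shared-crate", vec!["mz-stash"]),
        ("mz-clusterd", vec!["shared-crate"]),
        ("mz-environmentd", vec!["mz-stash", "shared-crate"]),
    ])
}

/// Hypothetical graph after the split: only envd depends on mz-stash itself.
pub fn after_split() -> DepGraph {
    BTreeMap::from([
        ("mz-stash-types", vec![]),
        ("mz-stash", vec!["mz-stash-types"]),
        ("shared-crate", vec!["mz-stash-types"]),
        ("mz-clusterd", vec!["shared-crate"]),
        ("mz-environmentd", vec!["mz-stash", "shared-crate"]),
    ])
}
```

In this toy model, `rebuild_set(&before_split(), "mz-stash")` contains all four crates, while `rebuild_set(&after_split(), "mz-stash")` contains only `mz-stash` and `mz-environmentd`.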

The various old versions of objects.proto want to be next to it, so pull them into mz-stash-types, too. In an ideal world, we'd probably move the vX_to_vX+1 upgrades also, but they depend on actual stash things (e.g. Transaction), so leave them where they are.

Finally, any given metric can only be registered once (otherwise we get a panic), but we need two stashes, so also pull out the stash Metrics into mz-stash-types. Ideally, this would have stayed where it was. A possible alternative to this would be something like putting a type-safe cache of registered metrics objects on MetricsRegistry.
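As a rough sketch of that alternative, assuming nothing about the real `MetricsRegistry` API: the `Registry` type, `get_or_register`, `StashMetrics`, and the metric name below are all hypothetical. The idea is to key a cache by the metrics struct's type, so a second caller gets the already-registered object instead of triggering the duplicate-registration panic.

```rust
use std::any::{Any, TypeId};
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a metrics registry (not the real MetricsRegistry).
#[derive(Default)]
pub struct Registry {
    names: Mutex<HashSet<String>>,
    cache: Mutex<HashMap<TypeId, Arc<dyn Any + Send + Sync>>>,
}

impl Registry {
    /// Registers a metric name, panicking on duplicates (the behavior described above).
    pub fn register(&self, name: &str) {
        let inserted = self.names.lock().unwrap().insert(name.to_string());
        assert!(inserted, "metric {} registered twice", name);
    }

    /// Returns the cached metrics object of type `T`, building it on first use.
    /// Note: `build` runs while the cache lock is held, so it must not call
    /// `get_or_register` itself.
    pub fn get_or_register<T, F>(&self, build: F) -> Arc<T>
    where
        T: Any + Send + Sync,
        F: FnOnce(&Registry) -> T,
    {
        let mut cache = self.cache.lock().unwrap();
        let entry = cache
            .entry(TypeId::of::<T>())
            .or_insert_with(|| Arc::new(build(self)) as Arc<dyn Any + Send + Sync>);
        Arc::clone(entry)
            .downcast::<T>()
            .unwrap_or_else(|_| unreachable!("cache is keyed by TypeId"))
    }
}

/// Hypothetical metrics bundle; a real one would hold counters and histograms.
pub struct StashMetrics {
    pub name: String,
}

impl StashMetrics {
    pub fn register_into(registry: &Registry) -> Self {
        // Hypothetical metric name, for illustration only.
        registry.register("mz_stash_transactions");
        StashMetrics { name: "mz_stash_transactions".to_string() }
    }
}
```

With this shape, two stashes could each call `registry.get_or_register(StashMetrics::register_into)` and share one `Arc<StashMetrics>`, so the underlying metric is only registered once.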

Timing on my M1 laptop of bin/environmentd after invalidating stash went from 2m12s to 53s. A large part of this is likely that clusterd has dropped out of the rebuild entirely, including its link step.

touch src/stash/src/lib.rs && time ./bin/environmentd -- --nope

Deps before:

[Image: deps-stash-before]

Deps after:

[Image: deps-stash-after]

Motivation

  • This PR refactors existing code.

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@danhhz danhhz requested review from a team and jkosh44 October 6, 2023 16:38
@danhhz danhhz requested a review from a team as a code owner October 6, 2023 16:38
@danhhz danhhz requested a review from a team October 6, 2023 16:38
@danhhz danhhz requested a review from benesch as a code owner October 6, 2023 16:38
@shepherdlybot

shepherdlybot bot commented Oct 6, 2023

This PR has higher risk. Make sure to carefully review the file hotspots. In addition to having a knowledgeable reviewer, it may be useful to add observability and/or a feature flag.

| Risk Score | Probability Buggy | File Hotspots |
| --- | --- | --- |
| 🔴 81 / 100 | 61% | 1 |

Buggy File Hotspots:

| File | Percentile |
| --- | --- |
| ../src/postgres.rs | 99 |

}
};
objects_v29::RoleId { value }
#[allow(dead_code)]
Contributor Author

@jkosh44 It makes me a bit nervous that I messed something up for these to be unused. Is that expected?

Contributor

I'm not sure, it's possible I left these in by mistake and they weren't needed. Why are you changing all of these From impls into normal functions?

Contributor Author

Because the structs they're defined on now live in mz_stash_types, so Rust's orphan rule doesn't let us impl foreign traits like From on them here (mz_stash) anymore.
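A minimal sketch of the workaround follows. The two modules stand in for types that would live in another crate after the move; the `u64` field and the function name `role_id_v28_to_v29` are assumptions for illustration, since only the `objects_v29::RoleId { value }` shape is visible in the diff.

```rust
// Stand-ins for types that, after the move, would be defined in a different crate.
pub mod objects_v28 {
    #[derive(Debug, Clone, PartialEq)]
    pub struct RoleId {
        pub value: u64, // hypothetical field type
    }
}

pub mod objects_v29 {
    #[derive(Debug, Clone, PartialEq)]
    pub struct RoleId {
        pub value: u64, // hypothetical field type
    }
}

// If both modules came from another crate, the impl below would be rejected
// with error E0117, because neither `From` nor either `RoleId` is local:
//
//     impl From<objects_v28::RoleId> for objects_v29::RoleId { ... }
//
// A free function involves no trait impl, so it compiles in whichever crate
// holds the upgrade code.
pub fn role_id_v28_to_v29(old: objects_v28::RoleId) -> objects_v29::RoleId {
    objects_v29::RoleId { value: old.value }
}
```

Call sites then change from `old.into()` to an explicit `role_id_v28_to_v29(old)`.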

Contributor

Ah ok, I looked through and I don't see this used anywhere. I probably just added it out of instinct after adding the database id function above.

Contributor Author

Think the other ones are the same story? (There are a few)

Contributor

Yes, sorry the other ones too. I looked through the file and I use most of those types in the tests, so I probably mistakenly thought I needed these functions, but I don't.

@aljoscha
Contributor

aljoscha commented Oct 9, 2023

those before/after dependency graphs... 😂

@danhhz
Contributor Author

danhhz commented Oct 10, 2023

TFTR!

@danhhz danhhz enabled auto-merge October 10, 2023 16:11
@danhhz
Contributor Author

danhhz commented Oct 10, 2023

Oh lolol cargo check on just mz-stash is even more dramatic (because the protos all got moved out)

Before

$ touch src/stash/src/lib.rs && time cargo check -p mz-stash --lib
    Checking mz-stash v0.0.0 (/Users/dan/mz/materialize/src/stash)
    Finished dev [unoptimized + debuginfo] target(s) in 35.01s

real	0m35.141s

After

$ touch src/stash/src/lib.rs && time cargo check -p mz-stash --lib
    Checking mz-stash v0.0.0 (/Users/dan/mz/materialize/src/stash)
    Finished dev [unoptimized + debuginfo] target(s) in 2.00s

real	0m2.081s
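For a sense of scale, the timings quoted in this PR can be turned into speedup factors. This is a small sketch; `parse_time` and `speedup` are hypothetical helper names, and the parser only handles the `XmY.YYYs` form printed by the shell's `time`.

```rust
/// Parses a duration such as `0m35.141s` or `2m12s` into seconds.
pub fn parse_time(s: &str) -> Option<f64> {
    let (mins, rest) = s.trim().split_once('m')?;
    let secs = rest.strip_suffix('s')?;
    Some(mins.parse::<f64>().ok()? * 60.0 + secs.parse::<f64>().ok()?)
}

/// Ratio of the "before" time to the "after" time.
pub fn speedup(before: &str, after: &str) -> Option<f64> {
    Some(parse_time(before)? / parse_time(after)?)
}
```

With the numbers above, `speedup("0m35.141s", "0m2.081s")` comes out to roughly 16.9x for `cargo check -p mz-stash --lib`, and `speedup("2m12s", "0m53s")` to roughly 2.5x for `bin/environmentd`.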

@danhhz danhhz merged commit aa8df4f into MaterializeInc:main Oct 10, 2023
@danhhz danhhz deleted the deps_stash branch October 10, 2023 19:27