[Runtime Epoch Split] (7/n) Split runtime everywhere else (#8940)

This does the bulk of the remaining split, which is all a bit intertwined. Although there are a ton of changes, the transformations are simple for the most part. RuntimeWithEpochManagerAdapter was the previous combined trait. There were two implementations of that trait: NightshadeRuntime and KeyValueRuntime. Now, RuntimeWithEpochManagerAdapter is split into three: EpochManagerAdapter, ShardTracker, and RuntimeAdapter: - EpochManagerAdapter is a direct abstraction of EpochManagerHandle (which is basically a `Mutex<EpochManager>`). The reason why we cannot use EpochManagerHandle directly instead of a trait is because KeyValueRuntime cannot provide a concrete EpochManager implementation. It has a completely custom validator schedule that is not compatible at all. NightshadeRuntime, on the other hand, contains an embedded EpochManagerHandle. - ShardTracker is just a TrackedConfig plus an `Arc<dyn EpochManagerAdapter>`. It doesn't belong to EpochManagerAdapter because the TrackedConfig is not inherent to the protocol. ShardTracker is responsible for one thing: calculating whether this node itself cares or will care about a specific shard. For querying whether some *other* validator is supposed to track a shard, use the EpochManagerAdapter, not ShardTracker. - RuntimeAdapter is an abstraction of everything that has to do with the runtime and account storage, such as `get_trie_for_shard`. It does bleed a little bit into the EpochManagerAdapter responsibilities, more on that later. How does this split work for the two implementations: NightshadeRuntime and KeyValueRuntime? - NightshadeRuntime is split into EpochManagerHandle, ShardTracker, and NightshadeRuntime. So essentially, EpochManagerHandle is moved out (well, conceptually, but it still keeps a reference to it because some runtime methods depend on it - eventually this can be moved out as well and have the EpochManagerAdapter be passed in), ShardTracker is moved out, and NightshadeRuntime itself now only implements RuntimeAdapter. - When constructing the NightshadeRuntime, we must first construct the EpochManagerHandle. This is because we cannot have the NightshadeRuntime internally construct a different EpochManagerHandle, or else the two EpochManagers will not synchronize state, and things fail. - KeyValueRuntime is split into KeyValueEpochManager, and KeyValueRuntime. There's no ShardTracker specific to KeyValueXXX anymore (because it's a concrete type to begin with). KeyValueRuntime takes as a constructor argument KeyValueEpochManager; but here it's really just to help initialize some fields (num_shards, epoch_length, etc.). Now, we talk about how users of this split are modified accordingly: - Wherever a struct used to keep a field `runtime_adapter: Arc<dyn RuntimeWithEpochManagerAdapter>`, it is now split into three fields, `epoch_manager: Arc<dyn EpochManagerAdapter>`, `shard_tracker: ShardTracker`, and `runtime: Arc<dyn RuntimeAdapter>`, but we can omit some of them if they are not actually used. Usages of the old field are individually changed to use only one of the new fields. - Note that queries to `self.runtime_adapter.cares_about_shard(..., ..., ..., is_me: true)` is changed to `self.shard_tracker.care_about_shard(..., ..., ..., true)` whereas `self.runtime_adapter.cares_about_shard(..., ..., ..., is_me: false)` is changed to `self.epoch_manager.cares_about_shard_from_prev_block`, because the latter does not actually need the ShardTracker. - Wherever we used to convert a runtime_adapter into one of the subtraits, we now must construct the subtrait directly: - any `Arc<dyn RuntimeWithEpochManagerAdapter>` argument that was passed in is split into up to three arguments depending on which are actually used. - any direct construction of NightshadeRuntime is easily converted to separate constructions of EpochManager, NightshadeRuntime, and ShardTracker. - any direct construction of KeyValueRuntime is easily converted to separate constructions of KeyValueEpochManager, KeyValueRuntime, and ShardTracker. - The TestEnv is one big part of this refactoring. The TestEnvBuilder used to take a `.runtime_adapters(Vec<Arc<dyn RuntimeWithEpochManagerAdapter>>)`, but now we don't have the RuntimeWithEpochManagerAdapters anymore. We could split this into three, of course, but this would present a pretty bad inconvenience to any tests that call helpers like create_nightshade_runtimes(...). So, the logic is now the following: - There are now .stores(...), .epoch_managers(...), .shard_trackers(...), and .runtimes(...) overrides available. These must be overridden in an appropriate partial order (epoch_managers > stores, shard_trackers > epoch_managers, runtimes > epoch_managers, and also, stores > clients, epoch_managers > validators). - Any override that is not explicitly provided will be constructed with a default (where N is the number of clients): - stores will be default-constructed as N elements of `create_test_store()`. - epoch_managers will be default-constructed as N elements of `KeyValueEpochManager` using the stores we have (this is why the stores must be finalized, either default-constructed or overridden, first, thus the partial order) - shard_trackers will be default-constructed as N elements of `ShardTracker` using the epoch_managers we have, with empty tracking config. - runtimes will be default-constructed as N elements of `KeyValueRuntime` using the stores and epoch_managers we have. - In addition, there are helpers that allow overriding with something other than the default: - `.track_all_shards()`: constructs shard_trackers that use the epoch_managers we have but with AllShards tracking - `.real_epoch_managers()`: constructs epoch_managers as N `Arc<EpochManagerHandle>`s instead of KeyValueEpochManager - `.nightshade_runtimes()` (this is an extension method that lives in integration-tests, to avoid a dependency cycle): constructs runtimes as N `Arc<NightshadeRuntime>`s using the stores and epoch_managers we have. - As a result, most of the constructions in integration-tests have been converted to using only these helper functions, as opposed to constructing runtimes directly in the tests. Some tests outside of integration-tests still construct directly, which is OK as they used to do that as well; we just now have to make sure to override stores, epoch_managers, and runtimes together so they are consistent.
near · May 11, 2023 · e32db71 · e32db71
1 parent cfa58b6
commit e32db71
Show file tree

Hide file tree

Showing 89 changed files with 2,188 additions and 1,810 deletions.
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/chain/chain/src/blocks_delay_tracker.rs b/chain/chain/src/blocks_delay_tracker.rs
@@ -1,4 +1,5 @@
 use chrono::DateTime;
+use near_epoch_manager::EpochManagerAdapter;
 use near_primitives::block::{Block, Tip};
 use near_primitives::borsh::maybestd::collections::hash_map::Entry;
 use near_primitives::hash::CryptoHash;
@@ -14,7 +15,7 @@ use std::mem;
 use std::time::Instant;
 use tracing::error;
 
-use crate::{metrics, Chain, ChainStoreAccess, RuntimeWithEpochManagerAdapter};
+use crate::{metrics, Chain, ChainStoreAccess};
 
 const BLOCK_DELAY_TRACKING_COUNT: u64 = 50;
 
@@ -92,7 +93,7 @@ impl ChunkTrackingStats {
     fn to_chunk_processing_info(
         &self,
         chunk_hash: ChunkHash,
-        runtime_adapter: &dyn RuntimeWithEpochManagerAdapter,
+        epoch_manager: &dyn EpochManagerAdapter,
     ) -> ChunkProcessingInfo {
         let status = if self.completed_timestamp.is_some() {
             ChunkProcessingStatus::Completed
@@ -101,10 +102,10 @@ impl ChunkTrackingStats {
         } else {
             ChunkProcessingStatus::NeedToRequest
         };
-        let created_by = runtime_adapter
+        let created_by = epoch_manager
             .get_epoch_id_from_prev_block(&self.prev_block_hash)
             .and_then(|epoch_id| {
-                runtime_adapter.get_chunk_producer(&epoch_id, self.height_created, self.shard_id)
+                epoch_manager.get_chunk_producer(&epoch_id, self.height_created, self.shard_id)
             })
             .ok();
         let request_duration = if let Some(requested_timestamp) = self.requested_timestamp {
@@ -353,17 +354,17 @@ impl BlocksDelayTracker {
         block_height: BlockHeight,
         block_hash: &CryptoHash,
         chain: &Chain,
-        runtime_adapter: &dyn RuntimeWithEpochManagerAdapter,
+        epoch_manager: &dyn EpochManagerAdapter,
     ) -> Option<BlockProcessingInfo> {
         self.blocks.get(block_hash).map(|block_stats| {
             let chunks_info: Vec<_> = block_stats
                 .chunks
                 .iter()
                 .map(|chunk_hash| {
                     if let Some(chunk_hash) = chunk_hash {
-                        self.chunks.get(chunk_hash).map(|x| {
-                            x.to_chunk_processing_info(chunk_hash.clone(), runtime_adapter)
-                        })
+                        self.chunks
+                            .get(chunk_hash)
+                            .map(|x| x.to_chunk_processing_info(chunk_hash.clone(), epoch_manager))
                     } else {
                         None
                     }
@@ -451,7 +452,7 @@ impl Chain {
                             *height,
                             hash,
                             self,
-                            &*self.runtime_adapter,
+                            self.epoch_manager.as_ref(),
                         )
                     })
                     .collect::<Vec<_>>()
@@ -463,7 +464,8 @@ impl Chain {
             .iter()
             .flat_map(|(chunk_hash, _)| {
                 self.blocks_delay_tracker.chunks.get(chunk_hash).map(|chunk_stats| {
-                    chunk_stats.to_chunk_processing_info(chunk_hash.clone(), &*self.runtime_adapter)
+                    chunk_stats
+                        .to_chunk_processing_info(chunk_hash.clone(), self.epoch_manager.as_ref())
                 })
             })
             .collect::<Vec<_>>();