[SBK-251] Store the last successfully processed block for fast restart #33
Comments
Related to #18 |
Still not an expert on taskiq but learned a decent amount now. This is my understanding:
Neither of these is especially fitting for our use case. I think that leaves us with only one option: to implement our own state storage for Silverback. That comes with the bonus of ultimate flexibility, but I'm left wondering what other use cases it might have when thinking about the design. My initial thought was that we could use files in
@fubuloubu I could use your input and future-think on this. Is there other state we might be interested in storing? Other databases or storage options we might leverage for other things in Silverback? Do we want to offer options to users or go for just what we need right now? |
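For reference, the file-based idea mentioned above could look roughly like the following. This is a minimal sketch, not Silverback's implementation: the JSON layout, the `state_path` argument, and the helper names `save_state` and `load_state` are all hypothetical, and only the standard library is used.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_state(state_path: Path, block_number: int) -> None:
    """Atomically write the last processed block number to a JSON file."""
    state_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename it over the
    # target, so a crash mid-write never leaves a half-written state file.
    fd, tmp_name = tempfile.mkstemp(dir=state_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"block_number": block_number}, f)
    os.replace(tmp_name, state_path)


def load_state(state_path: Path) -> Optional[int]:
    """Return the stored block number, or None if nothing was saved yet."""
    try:
        return json.loads(state_path.read_text())["block_number"]
    except FileNotFoundError:
        return None
```

A plain file like this only covers a single local process; it does not answer the questions about shared or queryable state raised in the comment.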
After writing all that up, I'm kind of leaning towards Redis being required for persistence in Silverback. Maybe remove results backend settings and just offer up Redis connection settings to the user. Then we either have Redis to do what we want with, or the whole configuration is ephemeral. |
Hmm, so there's a few ways to look at this:
P.s. the data is read by the runner via the |
The results backend isn't going to cut it for us. We can't query it for what we want (e.g. last executed job). There's also a chance that other results backends are more friendly to this (e.g. if there's a postgres one with metadata cols). I'll look today. But I suspect we'll need to move to our own persistence layer.
Does this mean you're implementing the persistence layer or that you would leverage it as well? |
Here's some roughing I did of what I'm thinking, with some pseudo-code. We could have some kind of storage setup. There might be something like this already we can use on top of mongo or redis or whatever k/v storage we find most fitting.

    from abc import ABC, abstractmethod
    from datetime import datetime, timezone
    from typing import Any, Dict

    from pydantic import BaseModel
    from taskiq import TaskiqResult

    class BaseStorage(ABC):
        @abstractmethod
        def get(self, k: str) -> BaseModel:
            ...

        @abstractmethod
        def store(self, k: str, v: BaseModel):
            ...

A possible model of silverback instance state:

    class RunnerState(BaseModel):
        instance: str  # UUID, or tag, or deployment/network name?
        network: str
        block_number: int
        updated: datetime

A rough model of handler results:

    class HandlerResult(BaseModel):
        return_value: Any
        labels: Dict[str, str]
        execution_time: float
        network: str
        block_number: int
        instance: str

We'd still need something to handle relations. We need a way to query "give me all events for this contract" which this doesn't cover. Maybe we can tag results with block and event data that might allow us to fetch. Or we add a contract model that has a list of event IDs or something. Depends on the storage, really. Rough example of how it might work in the runner:

    def _checkpoint(self, block_number: int) -> int:
        """Set latest checkpoint block number"""
        # Only move the checkpoint forward, never backward
        if block_number > self.latest_block_number:
            logger.debug(f"Checkpoint block #{block_number}")
            self.latest_block_number = block_number
            self._storage.store(
                f"{self.instance}:runner_state",
                RunnerState(
                    instance=self.instance,
                    network=self.network,
                    block_number=block_number,
                    updated=datetime.now(timezone.utc),
                ),
            )
        return self.latest_block_number

    def _handle_result(self, task_type: str, result: TaskiqResult):
        # Pseudo-code: assumes the result carries block_number and that
        # contract_event is available in the runner's scope
        store_key: str
        if task_type == "block":
            store_key = f"{self.instance}:block:{result.block_number}:result"
        elif task_type == "event":
            store_key = f"{self.instance}:event:{result.block_number}:{contract_event.contract.address}:{contract_event.name}:result"
        else:
            raise ValueError(f"Unknown task type: {task_type}")
        self._storage.store(
            store_key,
            HandlerResult(
                instance=self.instance,
                network=self.network,
                block_number=result.block_number,
                execution_time=result.execution_time,
                labels=result.labels,
                return_value=result.return_value,
            ),
        )

@fubuloubu let me know if this sounds ok to you or if I'm barking up the wrong tree. |
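To try the checkpoint idea from the comment above without a database, a self-contained sketch might look like this. It makes several assumptions that are not in the thread: `InMemoryStorage` and the reduced `Runner` class are hypothetical, a stdlib dataclass stands in for the Pydantic `RunnerState` model, and the instance and network names are placeholders.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class RunnerState:
    # stdlib stand-in for the Pydantic model sketched in the thread
    instance: str
    network: str
    block_number: int
    updated: datetime


class BaseStorage(ABC):
    @abstractmethod
    def get(self, k: str) -> Optional[RunnerState]: ...

    @abstractmethod
    def store(self, k: str, v: RunnerState) -> None: ...


class InMemoryStorage(BaseStorage):
    """Hypothetical dict-backed storage; everything is lost on exit."""

    def __init__(self) -> None:
        self._data: Dict[str, RunnerState] = {}

    def get(self, k: str) -> Optional[RunnerState]:
        return self._data.get(k)

    def store(self, k: str, v: RunnerState) -> None:
        self._data[k] = v


class Runner:
    """Hypothetical runner reduced to the checkpoint logic only."""

    def __init__(self, instance: str, network: str, storage: BaseStorage) -> None:
        self.instance = instance
        self.network = network
        self._storage = storage
        self.latest_block_number = -1  # nothing processed yet

    def checkpoint(self, block_number: int) -> int:
        # Only move forward: late or out-of-order blocks are ignored
        if block_number > self.latest_block_number:
            self.latest_block_number = block_number
            self._storage.store(
                f"{self.instance}:runner_state",
                RunnerState(
                    instance=self.instance,
                    network=self.network,
                    block_number=block_number,
                    updated=datetime.now(timezone.utc),
                ),
            )
        return self.latest_block_number


storage = InMemoryStorage()
runner = Runner("example-instance", "example-network", storage)
runner.checkpoint(100)
runner.checkpoint(99)  # older block, leaves the checkpoint at 100
```

Swapping `InMemoryStorage` for a Redis- or Mongo-backed class would only require implementing the same two methods, which is the main appeal of the abstract base.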
Beanie looks interesting if we're itching to go the mongo route. Seems it fits well in with FastAPI and Pydantic. |
Raw draft PR up at #45 for early feedback. |
Overview
If the script restarts, you might want to know what the last successfully processed block was so you can continue from that point.
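The restart behaviour described here can be sketched as a small function. This is an illustration under assumptions, not the project's API: `resolve_start_block` is a hypothetical name, and starting from the chain head when no checkpoint exists is an assumed policy.

```python
from typing import Optional


def resolve_start_block(last_processed: Optional[int], chain_head: int) -> int:
    """Pick the block to resume from after a restart."""
    if last_processed is None:
        # No checkpoint stored: start at the current head (assumed policy)
        return chain_head
    # Resume with the block right after the last one fully processed
    return last_processed + 1


# With a checkpoint at 120 and the head at 125, blocks 121..125 are replayed
start = resolve_start_block(120, chain_head=125)
missed_blocks = list(range(start, 125 + 1))
```

Storing the last successfully processed block, rather than the last block seen, means a block whose handler failed is retried after the restart instead of being skipped.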
Specification
Would be useful for a pattern like this
Dependencies
Include links to any open issues that must be resolved before this feature can be implemented.
SBK-251