
feat: sync latest gens in-memory for instant invalidation #676

Merged · 10 commits merged into master from mem-store on Jun 29, 2022

Conversation

sborrazas (Contributor):

This is done using the new MemStore, which holds the in-memory database changes in the GbTree structure.
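
For illustration only, here is a minimal sketch of such a store, assuming one ordered :gb_trees tree per logical table; the module and function names are hypothetical, not the PR's actual MemStore API:

```
# Hypothetical sketch: one :gb_trees tree per table, held in a plain map.
defmodule MemStoreSketch do
  def new, do: %{}

  # Insert or overwrite a key in the given table's tree.
  def put(store, table, key, value) do
    tree = Map.get(store, table, :gb_trees.empty())
    Map.put(store, table, :gb_trees.enter(key, value, tree))
  end

  def get(store, table, key) do
    with {:ok, tree} <- Map.fetch(store, table),
         {:value, value} <- :gb_trees.lookup(key, tree) do
      {:ok, value}
    else
      _ -> :not_found
    end
  end

  # Ordered key traversal; reversing the ordered list gives the backwards pass.
  def keys(store, table, direction \\ :forward) do
    keys = store |> Map.get(table, :gb_trees.empty()) |> :gb_trees.keys()
    if direction == :backward, do: Enum.reverse(keys), else: keys
  end
end
```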

@sborrazas sborrazas requested review from thepiwo and jyeshe May 12, 2022 11:41
@sborrazas sborrazas self-assigned this May 12, 2022
Base automatically changed from db-store to master May 12, 2022 16:26
jyeshe (Member) left a comment:

Looks good, and the amount of unit tests helps with understanding the kinds of state transitions that are expected.

Looking from a different angle, there were some refactoring opportunities, and I would add one PoC test: I have never used persistent_term to write big chunks of data, so depending on the number and size of blocks that the MemStore will keep, it would be worth investigating whether there is any limit on persistent_term.put size/frequency.

With IEx I tried these two amounts of data:

1)

iex(1)> list = for i <- 1..100_000_000, do: i; nil
nil
iex(2)> :timer.tc(fn -> :persistent_term.put(State, %{list: list}) end)
literal_alloc: Cannot allocate 1600000088 bytes of memory (of type "literal").

2)

iex(1)> list = for i <- 1..50_000_000, do: i; nil
nil
iex(2)> :timer.tc(fn -> :persistent_term.put(State, %{list: list}) end)
{411536, :ok}

(1) is maybe fixable with VM args if needed; (2) took about half a second for this single write, so if the data per write/put is bigger than that, or parallel processing affects persistent_term performance, it could become a bottleneck, with writes taking more time than producing the data.

Comment on lines +5 to +6
Uses a GbTree implementation for fast access to the keys, plus, being able to
iterate over them both forwards and backwards.
jyeshe (Member) commented on May 16, 2022:

Without thinking about Hyperchains, it might be a good choice, since the data is being written with :persistent_term, which has the advantage over ETS of reads without memory copies. It also gives better control over when the data is available for reads (with ETS this would require a GenServer).
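
As an illustration of the trade-off (the key and table names below are made up, not the PR's code): :persistent_term reads return the term without copying it onto the reading process's heap, while every ETS lookup copies the stored term:

```
# Illustrative only; key and table names are invented.
:persistent_term.put({:demo, :state}, %{height: 123_456})
# Read without copying the term into the caller's heap:
%{height: height} = :persistent_term.get({:demo, :state})

# The ETS equivalent copies the stored term on every lookup:
table = :ets.new(:demo_state, [:set, :public, read_concurrency: true])
true = :ets.insert(table, {:state, %{height: 123_456}})
[{:state, %{height: ^height}}] = :ets.lookup(table, :state)
```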

lib/ae_mdw/db/mem_store.ex (resolved comment thread)
E.g.
```
handler = EventHandler.init()
{:ok, handler2} = EventHandler.process_event({:new_height, 123456}, handler)
```
jyeshe (Member):

A nitpick naming issue: stateful event handlers are state machines. This leads to thinking about them in those terms, so perhaps it's an opportunity to represent/model this with a GenStateMachine. That could help separate the state of the machine from db_state and mem_state, making it explicit which part of the whole system state matters for the sync event handling (for the transitions of the state machine). It looks like it would be feasible if the spawning is extracted out of the EventHandler so that, instead of receiving a custom spawner, the state machine signals that there is something to be spawned (with the potential benefit of spawning without a spawner abstraction, or at least of using the spawner only where it's instantiated).
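
A minimal sketch of what that modeling could look like with the gen_state_machine package (state names, events, and data fields here are hypothetical, not the module that ended up in the PR):

```
# Hypothetical sketch of the suggested GenStateMachine modeling.
defmodule SyncMachineSketch do
  use GenStateMachine, callback_mode: :handle_event_function

  def start_link(opts \\ []) do
    GenStateMachine.start_link(__MODULE__, :ok, opts)
  end

  # state_data holds only what the transitions need, not db/mem state.
  def init(:ok), do: {:ok, :idle, %{chain_height: nil}}

  def handle_event(:cast, {:new_height, height}, :idle, data) do
    {:next_state, :syncing, %{data | chain_height: height}}
  end

  def handle_event(:cast, :sync_done, :syncing, data) do
    {:next_state, :idle, data}
  end
end
```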

sborrazas (Contributor, Author):

@jyeshe I've now added the state machine implementation using GenStateMachine on 44115da

@sborrazas sborrazas requested a review from jyeshe May 24, 2022 22:05
jyeshe (Member) left a comment:

Code readability and maintainability were improved. Some refactorings and fixes are suggested regarding encapsulation and process monitoring.

@@ -1,181 +1,353 @@
defmodule AeMdw.Sync.Server do
jyeshe (Member):

This PR is an opportunity for a more meaningful name, since the behavior is being changed. When it's hard to find a name that describes the purpose of a module, it's commonly a sign that the module has non-cohesive responsibilities.

{:new_height, height()}
| {:db_done, gens_per_min()}
| {:mem_done, gens_per_min()}
| {:fork, height()}
jyeshe (Member):

Could you clean up or review the purpose of the :fork event? It doesn't make any transition described by the diagram above, and besides that it only reads and writes fork_height, which is not used in any other transition.

) do
gens_per_min = calculate_gens_per_min(gens_per_min, new_gens_per_min)
# CHECK FORK
jyeshe (Member):

Could you document why the fork notification from the Node is not sufficient and no longer used?


new_state
end)
end

defp spawn_with_monitor(fun) do
jyeshe (Member):

The monitor part seems to be missing... To make the code more structured in the Elixir way (in the sense that other devs already know what to expect), a Task.Supervisor.async_nolink could be used: it already monitors the task and, while the processing is very similar, it provides more conveniences than spawn in terms of introspection and debugging. It also documents what to do next when the spawned process finishes.
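
A quick sketch of the suggestion (the supervisor name and task body are illustrative): async_nolink starts the task under a Task.Supervisor and monitors it for the caller, so both the result and a crash arrive as regular messages:

```
# Assumes a Task.Supervisor started elsewhere under this (made-up) name.
task =
  Task.Supervisor.async_nolink(AeMdw.TaskSupervisor, fn ->
    :sync_work_result
  end)

# In the owning GenServer/:gen_statem these arrive via handle_info:
#   {ref, result} when ref == task.ref
#     -> success; then Process.demonitor(ref, [:flush])
#   {:DOWN, ref, :process, _pid, reason}
#     -> the task crashed with `reason`
```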

sborrazas (Contributor, Author):

Good idea! Done


def handle_info({:DOWN, ref, :process, _, reason}, %__MODULE__{sync_mon_ref: ref} = s) do
notify_operators(s.operators, reason)
jyeshe (Member):

I wonder if we could improve the notification feature rather than remove it. It might become helpful to be notified of unexpected behaviors with debugging info.

sborrazas (Contributor, Author):

I agree, but right now this functionality is of no use at all, writing files and sending emails with the mail command. I think we should work on an actually useful feature, in a module called ErrorNotifier or some other, more descriptive name.

jyeshe (Member):

okay

%__MODULE__{restarts: restarts} = state_data
)
when restarts < @max_restarts do
Log.info("Mem Sync.Server error: #{inspect(reason)}")
jyeshe (Member):

After monitoring, it's better to use symmetrical calls, including the demonitor.

sborrazas (Contributor, Author):

Done, implemented using Task.Supervisor as suggested

@spec handle_event(:internal, internal_event(), state(), state_data()) ::
:gen_statem.event_handler_result(state())
def handle_event(:cast, {:new_height, chain_height}, :initialized, state_data) do
actions = [{:next_event, :internal, :check_sync}]
jyeshe (Member):

A constant (module attribute) would be helpful for seeing which actions are passed in :next_state.

sborrazas (Contributor, Author):

Not sure what you mean by this; there's a typespec for all possible states and events. Can you elaborate, please?

jyeshe (Member):

It's a small detail and a matter of personal taste: for bigger values that are repeated, I usually do something like:
@check_sync_next_event [{:next_event, :internal, :check_sync}]

Comment on lines 106 to 107
db_state: db_state,
mem_state: mem_state,
jyeshe (Member):

Maybe there is a way to avoid exposing the whole database and in-memory state to the state machine, keeping in state_data only the state needed for the transitions.

sborrazas (Contributor, Author):

Same here

jyeshe (Member):

The goal would be to separate the state needed to control the sync (the state machine's state_data) from the state needed to execute the sync. But with the last refactor the uses of db_state and mem_state are better encapsulated 👍

I would only add a comment to the state machine notes explaining that it only needs to compare the heights of db_state and mem_state with chain_height in order to decide the next transition (to the syncing_db or syncing_mem state), roughly as sketched below.
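
For example, the decision could look like this hypothetical helper (the in-memory window size @mem_gens and the height semantics are assumptions, not the PR's code):

```
# Hypothetical illustration of the transition decision; not the PR's code.
# @mem_gens is the assumed number of generations kept in memory.
defp next_sync_state(db_height, mem_height, chain_height) do
  cond do
    db_height < chain_height - @mem_gens -> :syncing_db
    mem_height < chain_height -> :syncing_mem
    true -> :idle
  end
end
```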

@sborrazas sborrazas marked this pull request as ready for review June 10, 2022 13:11
@sborrazas sborrazas requested a review from jyeshe June 11, 2022 11:37
jyeshe (Member) left a comment:

Some thoughts:

  • Similarly to the controllers that were updated to read from in-memory state using the recent StatePlug, the async tasks would also need some way to be enqueued (written to and read from) in-memory state.
  • Fork handling is missing on Watcher (subscriber) and Server.
  • Since Server.gens_per_min() is called after each status controller request, it would be better to move gens_per_min out of the syncing server state in order to avoid :gens_per_min messages competing with :new_height in the Server mailbox.

sborrazas (Contributor, Author):

Some thoughts:

  • Similarly to the controllers that were updated to read from in-memory state using the recent StatePlug, the async tasks would also need some way to be enqueued (written to and read from) in-memory state.

Async tasks are not persisted in-memory because the time it takes to process them is unknown, and by the time they are done, the memory contents might have completely changed. I think this is why we made them async tasks in the first place.

  • Fork handling is missing on Watcher (subscriber) and Server.

For every new key block that appears in the chain (whether added as a new key block or by a fork), we re-process all of the latest generations, regardless of whether the change was caused by a fork. This is why there isn't really a need to determine whether it's a fork; we just handle both cases the same way.

  • Since Server.gens_per_min() is called after each status controller request, it would be better to move gens_per_min out of the syncing server state in order to avoid :gens_per_min messages competing with :new_height in the Server mailbox.

As per OTP guidelines, it's not recommended to have messages that take long to process, which is why all messages that arrive at Sync.Server are processed instantly, including :new_height and :gens_per_min.

@sborrazas sborrazas requested a review from jyeshe June 13, 2022 18:40
jyeshe (Member) commented on Jun 13, 2022:

Some thoughts:

  • Similarly to the controllers that were updated to read from in-memory state using the recent StatePlug, the async tasks would also need some way to be enqueued (written to and read from) in-memory state.

Async tasks are not persisted in-memory because the time it takes to process them is unknown, and by the time they are done, the memory contents might have completely changed. I think this is why we made them async tasks in the first place.

The purpose of the async tasks is to run them in parallel with the sync. After 32c2575, for 1.7.3, the async tasks are enqueued only after 8 heights. I was waiting for the in-memory handling to fix this delay in order to restore the previous behaviour of 1.7.2. Opening an issue, though.

  • Fork handling is missing on Watcher (subscriber) and Server.

For every new key block that appears in the chain (whether added as a new key block or by a fork), we re-process all of the latest generations, regardless of whether the change was caused by a fork. This is why there isn't really a need to determine whether it's a fork; we just handle both cases the same way.

What if the fork happens during the backwards processing of the first in-memory height? Shouldn't the data that was just committed really be invalidated? (: I am giving the example of one height of difference, but the fork might even be detected with a higher delay (after more heights).

  • Since Server.gens_per_min() is called after each status controller request, it would be better to move gens_per_min out of the syncing server state in order to avoid :gens_per_min messages competing with :new_height in the Server mailbox.

As per OTP guidelines, it's not recommended to have messages that take long to process, which is why all messages that arrive at Sync.Server are processed instantly, including :new_height and :gens_per_min.

It would be to avoid concurrency between multiple /status requests and any syncing process message, as GenServers can process only one message at a time while ETS allows multiple concurrent reads. The idea would be to reduce a low risk of bottleneck (considering a change or a condition that makes any message slower) to no risk.

jyeshe (Member) left a comment:

Note on required invalidation and optional :gens_per_minute improvement

jyeshe (Member) commented on Jun 13, 2022:

Note on required invalidation and optional :gens_per_minute improvement

An example/history of a sync and fork that can happen:

  1. in-memory sync of heights 80, 81, 82, 83, 84, 85, 86, 87
  2. heights 80 to 87 are committed (persisted to the database), and during the processing of 88 the chain detects that 87 doesn't belong to the main/rightful chain.

sborrazas (Contributor, Author):

@jyeshe I've now updated gens_per_min to be stored in a global persistent_term, readable from anywhere, as you requested.

Regarding the fork handling, as I mentioned previously, the entire in-memory storage is re-processed for every change in the chain, including forks and new blocks, so that's handled already.
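
A minimal sketch of the gens_per_min change (the key name and default are assumptions, not the actual code):

```
# Hedged sketch: a globally readable gens-per-minute figure, so /status
# requests don't go through the sync server's mailbox. Key name is made up.
defmodule GensPerMinSketch do
  @key {:ae_mdw, :gens_per_min}

  def put(gens_per_min), do: :persistent_term.put(@key, gens_per_min)

  # Defaults to 0 before the first write.
  def get, do: :persistent_term.get(@key, 0)
end
```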

@sborrazas sborrazas requested a review from jyeshe June 14, 2022 13:39
jyeshe (Member) left a comment:

Thanks for the refactor and for clarifying that a fork cannot happen on the Aeternity blockchain after 10 heights from the top. I was still thinking of:
https://cointelegraph.com.br/news/ethereum-beacon-chain-experiences-7-block-reorg-what-s-going-on

thepiwo (Collaborator) left a comment:

looks really good from what I can see

jyeshe (Member) left a comment:

Wait for resolution of:
#723 (comment)

Or an explicit requirement that allows this delay.

This is done using the new `MemStore` which updates the in-memory
database changes using the GbTree structure.
This allows more explicitness in switching states
jyeshe (Member) left a comment:

Pending points discussed are documented here: "run in-memory aex9 tasks asynchronously" #749

@sborrazas sborrazas merged commit af95379 into master Jun 29, 2022
@sborrazas sborrazas deleted the mem-store branch June 29, 2022 20:44