Indexer Refactor feature branch #869

frrist · 2022-02-16T18:06:52Z

See me run here: https://protocollabs.grafana.net/goto/dfhCT9Pnk?orgId=1

Related Issues

Re-architect Lily for horizontal: diff extraction, diff storage, diff processing. #867, while not fully addressed, this PR modifies lily so it can address this issue with follow on work
Lily's watch should tolerate re-orgs #864, This PR together with completion of Re-architect Lily for horizontal: diff extraction, diff storage, diff processing. #867 will fully address this. The key piece of work done here was making the indexer stateless. State-full indexing caused errors when reorgs occur, now that indexing is stateless reorgs can be processed without error. A single instance of lily is not performant enough to process all tasks during a reog, but if ran with a lighter task load, this implementation of lily addresses this.
Split Lily tasks further to allow more paralelism #861 the work done in Full indexer refactor #871 closes this.

Description

This PR is a feature branch as was compiled from the below PR:

#849

This PR implements the taskAPI which is passed to and used by the indexer and its tasks.

The taskAPI presents a smaller interface for easier testing of tasks, it's simpler to mock than than the entire LilyAPI.
The StateChangedActors method has been removed from the indexer and now lives in the taskAPI. I think this is a lot cleaner.
The GetExecutedAndBlockMessagesForTipset and GetMessageExecutionsForTipSet method have been memoized, this allows them to be called within the Processors instead of called in the indexer with results passed to the processors. The goal here is to allow Processors to call these methods separatly to increase readability and extensabiliy of the code without sacrificing performance.

#850

This PR aims to keep indexer behavior consistent with the current while landing some refactor work to improve code extensibility.

embed GetExecutedAndBlockMessagesForTipset in processors, preventing unnecessary (and expensive!) calls to the method when its output isn't used. And example of this would be running all actor tasks excluding the miner actor. Prior changes here ensure multiple calls to this method are only executed once, returning a cached value for subsequent calls. This change also makes the indexer's logic more concise.
embed GetMessageExecutions in processors, taking advantage of the memoized method pattern used in previous commit. This has the added bonus of removing the ProcessMessageExecution interface and fitting the messege execution task under the ProcessMessages interface.

#871

This PR performs a (large) refactor of the indexer used by lily to index and process that chain. I have done my best to make each commit reviewable on its own and encourage reviewers to go through them sequentially. Here are some high-level details of what's changed:

Miner diffing

The DiffSectors and DiffPreCommits methods moved to the DataSource API. The API memoizes calls to these methods, allowing separate miner extractors to call them as needed.

Actor task breakup

All actor extractors have are separated such that each extractor produces a single model. For example, there is no longer a single miner state extractor rather there are several: DeadlineInfoExtractor, InfoExtractor, FeeDebtExtractor, LockedFundsExtractor, PoStExtractor, PreCommitInfoExtractor, SectorInfoExtractor, V7SectorInfoExtractor, etc. Each extract a single model from a miner state, utilizing the memoized methods (mentioned above) when needed. Similarly true for the market, init, power, reward, verifiedreg, etc. actors.

Message task breakup

The Messages processors are separate in a similar way - there exists a processor for GasOutputs, DerivedGasOutputs, Messages, ParsedMessages, InternalMessages, InternalParsedMessages, Receipts, BlockMessages, etc. These processors, too, use the memoized methods available on the DataSource API when appropriate.

The `Indexer` package

Here is where the new chain indexing logic lives. Its made up of 3 parts (all of which have code comments that should be referred to for more detail):

StateProcessor: accepts a set of processor interfaces and executes them all in parallel, returning their execution results via a channel.
TipSetIndexer: accepts a list of tasks, instantiates the appropriate processors, and passes them to the StateProcessor. The TipSetIndexer consumes results from the StateProcessor and produces results containing task names, extracted models, and reports, returning them on a channel. If the TipSetIndexer is instructed to halt before receiving all expected results from the StateProcessors it pushes the appropriate SKIP reports into its output channel as well.
IndexManager: consumes results from the TipSetIndexer and persists them to storage. The IndexManager will signal if any reports it receives from the TipSetIndexer have the status SKIP or ERROR. The IndexManager may also be given a worker pool to run multiple TipSetIndexer in parallel.

Simiplifcation of Processors

There are now only 3 processors used to extract state from the chain:

TipSetProcessor: processes a singel tipset
TipSetsProcessor: processes a current and an executed tipset
ActorProcessor: processes a current and executed tipset and actors that changed during the executed tipset appliction.
Said processors are consumed and executed by the StateProcessor

`ReportProcessor`

This processor only returns a visormodel.ProcessingReport and writes null rounds to the visormodel.ProcessingReport table. It replaces logic that once lived in the consensus task.

Models as tasks

Lily now accepts a list of model names to extract rather than a list of tasks corresponding to sets of models. For example, the task actorstatesminer has been replaced with: miner_current_deadline_infos, miner_fee_debts, miner_infos, miner_locked_funds,miner_pre_commit_infos,miner_sector_deals, miner_sector_events, miner_sector_infos, miner_sector_posts
similar for all other actorstates* tasks. This gives more flexibility to the user for extracting data. And, together with the "Actor task breakup" change, allows for certaine extractors to be configured to run over specific actor versions (greatly simplifying the extraction of the MinerSectorInfos after the v7 actor upgrade)

TODO

Implement --task-exclude flag mentioned here: Full indexer refactor #871 (comment)
Address [Indexer Refactor] miner_pre_commit_infos, miner_sector_infos, miner_sector_deals, and miner_sector_events task failures #897
document new APIs (index method) and task arguments

codecov-commenter · 2022-02-16T18:16:16Z

Codecov Report

❌ Patch coverage is 19.35484% with 150 lines in your changes missing coverage. Please review.
✅ Project coverage is 37.0%. Comparing base (92a3010) to head (60bf867).
⚠️ Report is 260 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff            @@
##           master    #869     +/-   ##
========================================
+ Coverage    29.5%   37.0%   +7.4%     
========================================
  Files          39      25     -14     
  Lines        3726    1802   -1924     
========================================
- Hits         1102     667    -435     
+ Misses       2481    1060   -1421     
+ Partials      143      75     -68

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

placer14 · 2022-04-11T21:28:55Z

Some thoughts about task names. I don't mind a full set of task options but I would consider creating a set of helper aliases. As Ian mentioned, all could be good for explicitness (I would expect default of --tasks to be all anyway). I would also include alias for the all of the prior versions tasks mapped onto their appropriate tasks. Ex: actorstatesminer would enable all models which were processed under that task previously. We can remove them later in a major version release.

Ninjaedit: And, yes please... a tasks filter would be good.

placer14

Sticky approval. (Sorry for slow review turnaround on this.) Congrats on getting this shipped! 💪 🎉

placer14 · 2022-04-11T21:21:15Z

chain/watcher.go

+		c.cache.Reset()
+		// unregister the observer
+		// TODO https://github.com/filecoin-project/lotus/pull/8441
+		//c.api.Unregister(notifier)


This PR is merged now. Does this need to change?

Not yet, it will change when we upgrade to the lotus release with the change, so 1.15.2 or something. (Currently, this exits in lotus master, but we don't use master as our dep)

placer14 · 2022-04-11T21:22:46Z

chain/watcher.go

+	}:
+	default:
+		log.Errorw("head notifier event channel blocked dropping revert event", "to", to.Key().String(), "from", from.Key().String())
+	}


I see both Apply and Revert drop the event if the buffer is full (different behavior from before the change, too). How will this affect the work needed on those tipsets? If so, what is the intended recovery mechanism?

If it's helpful, consider adding this information to the newly-exposed arg.

We can either drop the event or block the observer module: we don't want to block by waiting to send on the event channel since it will prevent other registered observers from receiving apply and revert events (method hangs), which is all-around bad.

Blocking should never happen during normal operations since the watcher will queue work it can't currently perform: https://github.com/filecoin-project/lily/blob/indexer-refactor/chain/watcher.go#L176. But if a watch job blows up/fails/gets restarted, its observer will still be registered and receiving events https://github.com/filecoin-project/lily/blob/indexer-refactor/chain/watcher.go#L69. And may continue to receive events even after its unregistered: https://github.com/filecoin-project/lotus/blob/20bf46f3098b3c433231480dda98cbf66940a201/chain/events/observer.go#L259. So this change prevents "zombie observers" from blocking on Apply/Revert until they can be unregistered.

If I'm hearing correctly that this will prevent that work from being completed, it would probably be helpful to indicate a brief warning in that error that that param could be adjusted. And a good idea to include in the help text as well.

- memoize methods for reuse within processors

- merge message execution processor with message processor - use DataSource API to retrieve Internal, Executed, and Block messages from within their respective processors

* refactor: ActorStateChangeDiff contains addresses, add TipSet method to DataSource API * polish: add tracing to adt diff method - hamt diff - amt diff * refactor: export miner actor diffing logic - setup to make miner diffing a part of the DatSource API and memoize miner diff methods * refactor: DataSource API - memoize miner diff methods - add tracing - create datasource package * refactor: break up actorstate tasks per model extracted - each actor extractor is responsible for producing a single model - add tracing throughout actor tasks * refactor: break up message tasks per model extracted - each message processor is responsible for producing a single model - add tracing throughout tasks * refactor: block, msapprovals, economics, tasks consensus - add tracing - implement the TipSetsProcessor interface - implement the TipSetProcessor interface * refactor: imlement ReportProcessor task - only produces processing reports - replaces consensus task writing null rounds to processing reports table * refactor: rename GetExecutedAndBlockMessagesForTipset params - adopt current and exectured name pattern * refactor: chain indexer - implement TipSetIndexer - implement StateProcessor - implement IndexManager * refactor: update walk, watch, fill, and find to work with new indexer * refactor: update itests to work with new indexer tasks * feat: implement index command - index a single epoch by tipset or height * cleanup: remove unused structs and TODO comment * refactor: break up internal message task per model extracte * refactor: add miner sector info v7 task * refactor: remove dead code * refactor: replace general tasks with model specific tasks * refactor: record metrics * cleanup: fix lint * fix: don't initialize zero value singleflight.Group * polish: add metrics and env setters for caches * polish: use address and tipsets for miner diffing keys - add asKey() helper * cleanup: address nits on manager construction * fix: manager locking for fatal error * cleanup: state processor logs and interface comments * fix: GetExecutedAndBlockMessagesForTipset error message * fix: remove unused miner state extractor * cleanup: remove market state extractor - its been broken apart into deal_proposal and deal_state extractors * clanup: close backtick on code comment * cleanup: parseTipSetKey accepts a string

- fixes #897 - simply miner extractor logic - ensure snapped sector dealID's are recorded by SectorDealsExtractor

- else we panic

…ferent stateroots

* fix: HeadNotifer Apply & Revert don't block on full chan - else when used as an observer the calls may hang forver * fix: allow watcher to be restarted without stale state * polish: add cli flg to set watcher buffer size

frrist force-pushed the indexer-refactor branch from a5099ab to b7d5c30 Compare February 23, 2022 19:14

frrist force-pushed the indexer-refactor branch 2 times, most recently from 924aaab to 2b103a6 Compare March 8, 2022 18:52

frrist mentioned this pull request Mar 8, 2022

Full indexer refactor #871

Merged

frrist self-assigned this Mar 8, 2022

frrist added the kind/feature New feature request label Mar 8, 2022

frrist mentioned this pull request Mar 9, 2022

wip: implement full internal message tracking #795

Closed

frrist force-pushed the indexer-refactor branch from ed6c62f to 5464d39 Compare March 9, 2022 21:01

frrist marked this pull request as ready for review March 22, 2022 20:56

frrist requested review from iand, placer14 and hsanjuan March 22, 2022 20:56

frrist added the P1 P1: Must be resolved label Mar 22, 2022

placer14 approved these changes Apr 11, 2022

View reviewed changes

frrist added 11 commits April 12, 2022 09:46

feat: implement DataSource API

695b5d1

- memoize methods for reuse within processors

refactor: simplify indexer processors

b622737

- merge message execution processor with message processor - use DataSource API to retrieve Internal, Executed, and Block messages from within their respective processors

fix: loading miner actors when actor doesn't exist

5c58f76

- fixes #897 - simply miner extractor logic - ensure snapped sector dealID's are recorded by SectorDealsExtractor

fix: correctly init diff containers

7a92c3c

- else we panic

fix: prevent deadlock when persisting models with same height and dif…

324938a

…ferent stateroots

refactor: move manager async logic to watcher

164112b

Frrist/watcher state cleanup (#927)

a490b5f

* fix: HeadNotifer Apply & Revert don't block on full chan - else when used as an observer the calls may hang forver * fix: allow watcher to be restarted without stale state * polish: add cli flg to set watcher buffer size

fix: actor task gets new wiat group for each use

044271e

chore: code gen task and table-task names (#915)

81edeaa

fix after rebase

a72fdac

frrist force-pushed the indexer-refactor branch from a16afc6 to a72fdac Compare April 12, 2022 17:07

frrist added 2 commits April 12, 2022 10:35

fix: exporter lock defer unlock call

30e5a51

chore: adjust logging

75fa7b0

frrist added 4 commits April 12, 2022 15:16

chore: add reporter name to all logs in index pipeline

60377e7

fix: correct manager log line missing log key

72fe58a

fix scheduler log reporter

a254e72

fix manager log

60bf867

frrist merged commit dbbc764 into master Apr 13, 2022

frrist deleted the indexer-refactor branch April 13, 2022 18:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Indexer Refactor feature branch #869

Indexer Refactor feature branch #869

Uh oh!

frrist commented Feb 16, 2022 •

edited

Loading

Uh oh!

codecov-commenter commented Feb 16, 2022 •

edited

Loading

Uh oh!

placer14 commented Apr 11, 2022 •

edited

Loading

Uh oh!

placer14 left a comment

Uh oh!

placer14 Apr 11, 2022

Uh oh!

frrist Apr 12, 2022

Uh oh!

placer14 Apr 11, 2022

Uh oh!

placer14 Apr 11, 2022

Uh oh!

frrist Apr 12, 2022

Uh oh!

placer14 Apr 12, 2022

Uh oh!

Uh oh!

Indexer Refactor feature branch #869

Indexer Refactor feature branch #869

Uh oh!

Conversation

frrist commented Feb 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issues

Description

Miner diffing

Actor task breakup

Message task breakup

The Indexer package

Simiplifcation of Processors

ReportProcessor

Models as tasks

TODO

Uh oh!

codecov-commenter commented Feb 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

placer14 commented Apr 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

placer14 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

frrist commented Feb 16, 2022 •

edited

Loading

The `Indexer` package

`ReportProcessor`

codecov-commenter commented Feb 16, 2022 •

edited

Loading

placer14 commented Apr 11, 2022 •

edited

Loading