distributed tipset workers #886

frrist · 2022-03-04T03:00:36Z

Whats changed?

Mostly 42c3bb8, its parent is a name change, ignore it.
This PR makes process toward #867 and uses https://github.com/hibiken/asynq for distributing work.

Integrated & distributed indexers

The chain/index package now contains two sub-packages:

integrated: contains the same indexer code as before, used in the same ways as usual by its consumers
distributed: contains a "distributed" TipSetIndexer backed buy a Queue interface. When its TipSet method is called it enqueues the TipSet and returns.

Indexer interface

TipSetIndexer are now required to implement the Indexer.TipSet() method. The new interface requires a priority and list of tasks. The integrated indexer ignores these params, and the distributed indexer used them when queuing work.

Watcher, Walker, Indexer processes consumer new Indexer Interface

All three have been modified to consume the new interface. An optional queue field has been added to their configs. When the field is absent they behave as they normally would.
When the field is set they will load the (redis) queue config from the daemon config file and use the distributed indexer - enqueuing tipsets as they are walked, watched, or indexed. This allows all three commands to be used as a means of producing work for tipset-processors. Watch tasks are highest priority, then Walk, then Index.

`worker-start tipset-processor` command

When executed a tipset worker is submitted to the scheduler. The worker will wait for tipsets to appear in the queue. The worker will process items in strict priority.

What's next

I'm not sold on https://github.com/hibiken/asynq. I've tried to set things up here to make swaping queue backend easy-ish. I also intend to explore https://github.com/RichardKnop/machinery as it supports the AMPQ protocol (asynq does not)

- memoize methods for reuse within processors

- merge message execution processor with message processor - use DataSource API to retrieve Internal, Executed, and Block messages from within their respective processors

…to DataSource API

- hamt diff - amt diff

- setup to make miner diffing a part of the DatSource API and memoize miner diff methods

- memoize miner diff methods - add tracing - create datasource package

- each actor extractor is responsible for producing a single model - add tracing throughout actor tasks

- each message processor is responsible for producing a single model - add tracing throughout tasks

- add tracing - implement the TipSetsProcessor interface - implement the TipSetProcessor interface

- only produces processing reports - replaces consensus task writing null rounds to processing reports table

- adopt current and exectured name pattern

- implement TipSetIndexer - implement StateProcessor - implement IndexManager

- index a single epoch by tipset or height

- add asKey() helper

- its been broken apart into deal_proposal and deal_state extractors

codecov-commenter · 2022-03-04T03:06:23Z

Codecov Report

❌ Patch coverage is 22.72727% with 51 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.0%. Comparing base (be91d4d) to head (42c3bb8).
⚠️ Report is 277 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff            @@
##           master    #886     +/-   ##
========================================
+ Coverage    31.2%   41.0%   +9.7%     
========================================
  Files          39      25     -14     
  Lines        3865    1850   -2015     
========================================
- Hits         1209     759    -450     
+ Misses       2510    1015   -1495     
+ Partials      146      76     -70

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

frrist · 2022-03-04T03:30:16Z

config/config.go

 			},
 		},
 	}
+	cfg.Queue = QueueConfig{


@placer14 when you have some cycles can you drop some knowledge here:

This change appends the following to the daemon config:

[Queue] [Queue.Asynq] [Queue.Asynq.Asynq1] # env var: LOTUS_QUEUE_ASYNQ_ASYNQ1_NETWORK #Network = "tcp" # env var: LOTUS_QUEUE_ASYNQ_ASYNQ1_ADDR #Addr = "127.0.0.1:6379" # env var: LOTUS_QUEUE_ASYNQ_ASYNQ1_USERNAME #Username = "default" # env var: LOTUS_QUEUE_ASYNQ_ASYNQ1_PASSWORD #Password = "password" # env var: LOTUS_QUEUE_ASYNQ_ASYNQ1_DB #DB = 0 # env var: LOTUS_QUEUE_ASYNQ_ASYNQ1_POOLSIZE #PoolSize = 0

Will changes be needed here: https://github.com/filecoin-project/helm-charts/blob/master/charts/lily/values.yaml#L86 in order to deploy this? Or can I just append to it in a custom values file?

Yeah, I would add those values there, and then add logic in https://github.com/filecoin-project/helm-charts/blob/master/charts/lily/templates/configmap-daemonconfig.yaml which includes those values in the appropriate places.

FWIW, in the yaml, It'd probably look like this:

storage: postgresql: [] #- name: db #secretName: postgresql-secret #secretKey: url #schema: lily #applicationName: lily #poolSize: 20 #allowUpsert: false file: [] redis: [] #- name: queue #...: ...

Just a thought: I'd avoid unique naming in configurations. The library name is an internal concern which knows how to talk to redis. So perhaps redis is more descriptive of the target and less of a cognative leap? Since that's the destination these messages are delivered into, much like postgresql and file represent the destination the data is pushed into. WDYT?

Yeah, I would add those values there, and then add logic in https://github.com/filecoin-project/helm-charts/blob/master/charts/lily/templates/configmap-daemonconfig.yaml which includes those values in the appropriate places

How would I go about making a deployment with these changes? They aren't ready to land in master as this is still explorative work - Is there a way to use helm charts from a feature branch?

I'd avoid unique naming in configurations

Agreed. I don't intend for this to land in master as is - it's just a draft for exploring some ideas. I've opted to make the config names specific to the worker back-end since other libraries (like https://github.com/RichardKnop/machinery mentioned in the description) will more than likely require different configuration parameters.

So these changes would be put in the next version (say v0.8.0-rc1) which you could build and use that for a single deployment (without using it or changing it for other deployments). Existing deployments will continue to use the chart it's configured with until we update it.

Update helm-charts in your dev branch.

helm package ./charts/lily

helm install/upgrade <releasename> <path_to_new_lily_chart> -f <path_to_custom_values>

???

Profit.

frrist · 2022-04-13T18:35:15Z

will revisit now that indexer feature branch has merged to master

frrist added 30 commits March 2, 2022 13:57

feat: implement DataSource API

e1cbea7

- memoize methods for reuse within processors

refactor: simplify indexer processors

924aaab

- merge message execution processor with message processor - use DataSource API to retrieve Internal, Executed, and Block messages from within their respective processors

refactor: ActorStateChangeDiff contains addresses, add TipSet method …

4631f77

…to DataSource API

polish: add tracing to adt diff method

428899b

- hamt diff - amt diff

refactor: export miner actor diffing logic

58b5090

- setup to make miner diffing a part of the DatSource API and memoize miner diff methods

refactor: DataSource API

a527b23

- memoize miner diff methods - add tracing - create datasource package

refactor: break up actorstate tasks per model extracted

e5ac446

- each actor extractor is responsible for producing a single model - add tracing throughout actor tasks

refactor: break up message tasks per model extracted

e805b5d

- each message processor is responsible for producing a single model - add tracing throughout tasks

refactor: block, msapprovals, economics, tasks consensus

bf38799

- add tracing - implement the TipSetsProcessor interface - implement the TipSetProcessor interface

refactor: imlement ReportProcessor task

ef1832f

- only produces processing reports - replaces consensus task writing null rounds to processing reports table

refactor: rename GetExecutedAndBlockMessagesForTipset params

eec896f

- adopt current and exectured name pattern

refactor: chain indexer

299501b

- implement TipSetIndexer - implement StateProcessor - implement IndexManager

refactor: update walk, watch, fill, and find to work with new indexer

104b84c

refactor: update itests to work with new indexer tasks

8aeb23f

feat: implement index command

12c5c4d

- index a single epoch by tipset or height

cleanup: remove unused structs and TODO comment

2018178

refactor: break up internal message task per model extracte

b8e703e

refactor: add miner sector info v7 task

b1b2c0d

refactor: remove dead code

86f91c6

refactor: replace general tasks with model specific tasks

c8954d6

refactor: record metrics

42ed36b

cleanup: fix lint

0443b63

fix: don't initialize zero value singleflight.Group

5aeeed0

polish: add metrics and env setters for caches

09b3760

polish: use address and tipsets for miner diffing keys

dc1b43a

- add asKey() helper

cleanup: address nits on manager construction

7fa0767

fix: manager locking for fatal error

a125fc8

cleanup: state processor logs and interface comments

476f78f

fix: GetExecutedAndBlockMessagesForTipset error message

f6c95d3

fix: remove unused miner state extractor

ac9fa2d

frrist added 5 commits March 3, 2022 13:18

cleanup: remove market state extractor

7863a3d

- its been broken apart into deal_proposal and deal_state extractors

clanup: close backtick on code comment

94e8068

cleanup: parseTipSetKey accepts a string

e88c904

refactor: rename indexer to index

76dcac8

feat: implement distributed worker pattern using asynq

42c3bb8

frrist self-assigned this Mar 4, 2022

frrist commented Mar 4, 2022

View reviewed changes

frrist force-pushed the frrist/indexer-refactor-pt6 branch from e88c904 to 0c865a9 Compare March 8, 2022 18:55

Base automatically changed from frrist/indexer-refactor-pt6 to indexer-refactor March 8, 2022 19:04

frrist force-pushed the indexer-refactor branch from ed6c62f to 5464d39 Compare March 9, 2022 21:01

frrist force-pushed the indexer-refactor branch from a16afc6 to a72fdac Compare April 12, 2022 17:07

Base automatically changed from indexer-refactor to master April 13, 2022 18:29

frrist closed this Apr 13, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

distributed tipset workers #886

distributed tipset workers #886

Uh oh!

frrist commented Mar 4, 2022 •

edited

Loading

Uh oh!

codecov-commenter commented Mar 4, 2022 •

edited

Loading

Uh oh!

frrist Mar 4, 2022

Uh oh!

placer14 Mar 7, 2022 •

edited

Loading

Uh oh!

frrist Mar 7, 2022

Uh oh!

placer14 Mar 8, 2022

Uh oh!

frrist commented Apr 13, 2022

Uh oh!

Uh oh!

distributed tipset workers #886

distributed tipset workers #886

Uh oh!

Conversation

frrist commented Mar 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Whats changed?

Integrated & distributed indexers

Indexer interface

Watcher, Walker, Indexer processes consumer new Indexer Interface

worker-start tipset-processor command

What's next

Uh oh!

codecov-commenter commented Mar 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

frrist Mar 4, 2022

Choose a reason for hiding this comment

Uh oh!

placer14 Mar 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

frrist Mar 7, 2022

Choose a reason for hiding this comment

Uh oh!

placer14 Mar 8, 2022

Choose a reason for hiding this comment

Uh oh!

frrist commented Apr 13, 2022

Uh oh!

Uh oh!

frrist commented Mar 4, 2022 •

edited

Loading

`worker-start tipset-processor` command

codecov-commenter commented Mar 4, 2022 •

edited

Loading

placer14 Mar 7, 2022 •

edited

Loading