New scraper - pre-merge cleanup 2 #3244

Merged: 14 commits, Sep 27, 2023
23 changes: 23 additions & 0 deletions CHANGES.md
@@ -6,11 +6,34 @@ This file details changes made to TrueBlocks over time. See the [migration notes

## v1.1.0 (2023/10/01)

Added chifra config show
Updated trueBlocks.toml config file to include scraper configs
Updates block data structure to read withdrawals and adds them to the index
Adds withdrawals to blocks data structure, breaking change to block cache - needs migration
Forces v1.0.0-release into the TrueBlocks.toml file (needs a migration)
Added `--watch_list`, `--commands`, `--batch_size`, and `--run_count` options to `chifra monitors`.
Enables `chifra config edit`
Added `--run_count` to `chifra scrape` (for debugging purposes).
Changed names of some rarely used special blocks in `chifra when`. Breaking, but minimal impact.
Removes a number of previously deprecated options. `chifra abis --sol`, `chifra names --named`, `chifra names --to_custom`, `chifra transactions --trace`, and `chifra --log_level` for all commands.
Better error reporting when running against non-tracing nodes.
Fixed an issue with Content-Type in the server.
Fixed an issue where a user could hit Ctrl+C during caching and corrupt the database.
Fixed an issue where scraper was missing some smart contract addresses created during out of gas transactions.
Better error handling from the RPC.
Added a verbose mode to `chifra when` to include more specials and a description for each special block.
Now disallows running `chifra scrape` if the node is not a tracing archive node.
Near complete re-write of block scraper to fix a few bugs and prepare for writing v1.0.0 of the spec.
Add `--run_count` to both the `chifra scrape` and the `chifra monitors --watch`, hides both because they are intended for debugging only.
Added `--publisher` option to `chifra init` and `chifra chunks`.
- Removed `apiProvider` field from `Chain` data model as unused.
- Changes to `Config` data model to improve clarity and consistency:
  - changed `GetPathToRootConfig` to `PathToRootConfig`
  - changed `GetPathToCache` to `PathToCache`
  - changed `GetPathToIndex` to `PathToIndex`
- Removes `runOnce` from `chifra monitors` (in favor of hidden `runCount` - for `chifra scrape` as well).
- Adds `dryRun` to `chifra scrape`.
- Adds `diff` to `chifra chunks`.

## v1.0.0 (2023/08/20)

15 changes: 6 additions & 9 deletions docs/content/api/openapi.yaml
@@ -622,15 +622,6 @@ paths:
schema:
type: number
format: uint64
- name: runOnce
description: >
available with --watch option only, only run the monitor --watch commands once then quit
required: false
style: form
in: query
explode: true
schema:
type: boolean
- name: sleep
description: >
available with --watch option only, the number of seconds to sleep between runs
@@ -2796,6 +2787,12 @@ components:
type: array
items:
$ref: "#/components/schemas/hash"
description: "a possibly empty array of uncle hashes"
withdrawals:
type: array
items:
$ref: "#/components/schemas/withdrawal"
description: "a possibly empty array of withdrawals (post Shanghai)"
transaction:
description: "transaction data as returned from the RPC (with slight enhancements)"
type: object
3 changes: 1 addition & 2 deletions docs/content/chifra/accounts.md
@@ -243,15 +243,14 @@ Flags:
-a, --watchlist string available with --watch option only, a file containing the addresses to watch
-c, --commands string available with --watch option only, the file containing the list of commands to apply to each watched address
-b, --batch_size uint available with --watch option only, the number of monitors to process in each batch (default 8)
-r, --run_once available with --watch option only, only run the monitor --watch commands once then quit
-s, --sleep float available with --watch option only, the number of seconds to sleep between runs (default 14)
-x, --fmt string export format, one of [none|json*|txt|csv]
-v, --verbose enable verbose output
-h, --help display this help screen

Notes:
- An address must be either an ENS name or start with '0x' and be forty-two characters long.
- If no address is presented to the --clean command, all existing monitors are be cleaned.
- If no address is presented to the --clean command, all existing monitors will be cleaned.
- The --watch option requires two additional parameters to be specified: --watchlist and --commands.
- Addresses provided on the command line are ignored in --watch mode.
- Providing the value existing to the --watchlist monitors all existing monitor files (see --list).
24 changes: 13 additions & 11 deletions docs/content/chifra/admin.md
@@ -213,24 +213,25 @@ Links:

Each of the following additional configurable command line options are available.

**Configuration file:** `$CONFIG/$CHAIN/blockScrape.toml`
**Configuration group:** `[settings]`
**Configuration file:** `trueBlocks.toml`
**Configuration group:** `[scrape.<chain>]`

| Item | Type | Default | Description / Default |
| ------------------ | ------------ | ------------ | --------- |
| apps&lowbar;per&lowbar;chunk | uint64 | 200000 | the number of appearances to build into a chunk before consolidating it |
| snap&lowbar;to&lowbar;grid | uint64 | 100000 | an override to apps_per_chunk to snap-to-grid at every modulo of this value, this allows easier corrections to the index |
| first&lowbar;snap | uint64 | 0 | the first block at which snap_to_grid is enabled |
| unripe&lowbar;dist | uint64 | 28 | the distance (in blocks) from the front of the chain under which (inclusive) a block is considered unripe |
| channel&lowbar;count | uint64 | 20 | number of concurrent processing channels |
| allow&lowbar;missing | bool | true | do not report errors for blockchains that contain blocks with zero addresses |
| appsPerChunk | uint64 | 200000 | the number of appearances to build into a chunk before consolidating it |
| snapToGrid | uint64 | 100000 | an override to apps_per_chunk to snap-to-grid at every modulo of this value, this allows easier corrections to the index |
| firstSnap | uint64 | 0 | the first block at which snap_to_grid is enabled |
| unripeDist | uint64 | 28 | the distance (in blocks) from the front of the chain under which (inclusive) a block is considered unripe |
| channelCount | uint64 | 20 | number of concurrent processing channels |
| allowMissing | bool | true | do not report errors for blockchains that contain blocks with zero addresses |

Note that for Ethereum mainnet, the default values for appsPerChunk and firstSnap are 2,000,000 and 2,300,000 respectively. See the specification for a justification of these values.

These items may be set in three ways, each overriding the preceding method:

-- in the above configuration file under the `[settings]` group,
-- in the environment by exporting the configuration item as UPPER&lowbar;CASE, without underbars, and prepended with TB_SETTINGS&lowbar;, or
-- on the command line using the configuration item with leading dashes (i.e., `--name`).
-- in the above configuration file under the `[scrape.<chain>]` group,
-- in the environment by exporting the configuration item as UPPER&lowbar;CASE (with underbars removed) and prepended with TB_SCRAPE&lowbar;CHAIN&lowbar;, or
-- on the command line using the configuration item with leading dashes and in snake case (i.e., `--snake_case`).
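
The naming and precedence rules above can be sketched in Go. This is a minimal illustration, not the TrueBlocks implementation: the function names `envKey`, `flagName`, and `resolve` are hypothetical, and it assumes that `CHAIN` in the `TB_SCRAPE_CHAIN_` prefix stands for the upper-cased chain name (for example `mainnet`).

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"unicode"
)

// envKey builds the environment variable name for a scrape item:
// upper case, underbars removed, prefixed with TB_SCRAPE_<CHAIN>_.
// Substituting the chain name for CHAIN is an assumption.
func envKey(chain, item string) string {
	name := strings.ToUpper(strings.ReplaceAll(item, "_", ""))
	return "TB_SCRAPE_" + strings.ToUpper(chain) + "_" + name
}

// flagName converts a camelCase configuration item into the
// snake_case command line form, e.g. appsPerChunk -> --apps_per_chunk.
func flagName(item string) string {
	var sb strings.Builder
	sb.WriteString("--")
	for i, r := range item {
		if unicode.IsUpper(r) {
			if i > 0 {
				sb.WriteByte('_')
			}
			sb.WriteRune(unicode.ToLower(r))
		} else {
			sb.WriteRune(r)
		}
	}
	return sb.String()
}

// resolve applies the precedence described above: the config file value
// is overridden by the environment, which is overridden by the command line.
// cmdLine is a hypothetical map of already-parsed flags.
func resolve(fileVal, chain, item string, cmdLine map[string]string) string {
	val := fileVal
	if v, ok := os.LookupEnv(envKey(chain, item)); ok {
		val = v
	}
	if v, ok := cmdLine[flagName(item)]; ok {
		val = v
	}
	return val
}

func main() {
	os.Setenv(envKey("mainnet", "appsPerChunk"), "500000")
	fmt.Println(envKey("mainnet", "appsPerChunk"))
	fmt.Println(flagName("appsPerChunk"))
	// Environment overrides the file value.
	fmt.Println(resolve("200000", "mainnet", "appsPerChunk", nil))
	// Command line overrides both.
	fmt.Println(resolve("200000", "mainnet", "appsPerChunk",
		map[string]string{"--apps_per_chunk": "900000"}))
}
```

Keeping the name mapping in two small pure functions makes the three sources easy to test independently of the actual configuration loader.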

<!-- markdownlint-disable MD041 -->
### further information
@@ -314,6 +315,7 @@ Notes:
- The --first_block and --last_block options apply only to addresses, appearances, and index --belongs mode.
- The --pin option requires a locally running IPFS node or a pinning service API key.
- The --publish option requires a private key.
- The --publisher option is ignored with the --publish option since the sender of the transaction is recorded as the publisher.
```

Data models produced by this tool:
3 changes: 2 additions & 1 deletion docs/content/data-model/chaindata.md
@@ -45,7 +45,8 @@ Blocks consist of the following fields:
| date | a calculated field -- the date of the object | datetime |
| transactions | a possibly empty array of transactions or transaction hashes | [Transaction[]](/data-model/chaindata/#transaction) |
| baseFeePerGas | the base fee for this block | wei |
| uncles | | Hash |
| uncles | a possibly empty array of uncle hashes | Hash |
| withdrawals | a possibly empty array of withdrawals (post Shanghai) | [Withdrawal[]](/data-model/chaindata/#withdrawal) |

## Transaction

3 changes: 1 addition & 2 deletions docs/readmes/accounts-monitors.md
@@ -63,15 +63,14 @@ Flags:
-a, --watchlist string available with --watch option only, a file containing the addresses to watch
-c, --commands string available with --watch option only, the file containing the list of commands to apply to each watched address
-b, --batch_size uint available with --watch option only, the number of monitors to process in each batch (default 8)
-r, --run_once available with --watch option only, only run the monitor --watch commands once then quit
-s, --sleep float available with --watch option only, the number of seconds to sleep between runs (default 14)
-x, --fmt string export format, one of [none|json*|txt|csv]
-v, --verbose enable verbose output
-h, --help display this help screen

Notes:
- An address must be either an ENS name or start with '0x' and be forty-two characters long.
- If no address is presented to the --clean command, all existing monitors are be cleaned.
- If no address is presented to the --clean command, all existing monitors will be cleaned.
- The --watch option requires two additional parameters to be specified: --watchlist and --commands.
- Addresses provided on the command line are ignored in --watch mode.
- Providing the value existing to the --watchlist monitors all existing monitor files (see --list).
1 change: 1 addition & 0 deletions docs/readmes/admin-chunks.md
@@ -43,6 +43,7 @@ Notes:
- The --first_block and --last_block options apply only to addresses, appearances, and index --belongs mode.
- The --pin option requires a locally running IPFS node or a pinning service API key.
- The --publish option requires a private key.
- The --publisher option is ignored with the --publish option since the sender of the transaction is recorded as the publisher.
```

Data models produced by this tool:
23 changes: 12 additions & 11 deletions docs/readmes/admin-scrape.md
@@ -50,24 +50,25 @@ Links:

Each of the following additional configurable command line options are available.

**Configuration file:** `$CONFIG/$CHAIN/blockScrape.toml`
**Configuration group:** `[settings]`
**Configuration file:** `trueBlocks.toml`
**Configuration group:** `[scrape.<chain>]`

| Item | Type | Default | Description / Default |
| ------------------ | ------------ | ------------ | --------- |
| apps&lowbar;per&lowbar;chunk | uint64 | 200000 | the number of appearances to build into a chunk before consolidating it |
| snap&lowbar;to&lowbar;grid | uint64 | 100000 | an override to apps_per_chunk to snap-to-grid at every modulo of this value, this allows easier corrections to the index |
| first&lowbar;snap | uint64 | 0 | the first block at which snap_to_grid is enabled |
| unripe&lowbar;dist | uint64 | 28 | the distance (in blocks) from the front of the chain under which (inclusive) a block is considered unripe |
| channel&lowbar;count | uint64 | 20 | number of concurrent processing channels |
| allow&lowbar;missing | bool | true | do not report errors for blockchains that contain blocks with zero addresses |
| appsPerChunk | uint64 | 200000 | the number of appearances to build into a chunk before consolidating it |
| snapToGrid | uint64 | 100000 | an override to apps_per_chunk to snap-to-grid at every modulo of this value, this allows easier corrections to the index |
| firstSnap | uint64 | 0 | the first block at which snap_to_grid is enabled |
| unripeDist | uint64 | 28 | the distance (in blocks) from the front of the chain under which (inclusive) a block is considered unripe |
| channelCount | uint64 | 20 | number of concurrent processing channels |
| allowMissing | bool | true | do not report errors for blockchains that contain blocks with zero addresses |

Note that for Ethereum mainnet, the default values for appsPerChunk and firstSnap are 2,000,000 and 2,300,000 respectively. See the specification for a justification of these values.
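
The defaults in the table, together with the mainnet exception in the note above, could be expressed roughly as follows. This is a sketch only: the struct and function names are hypothetical, and the chain key `"mainnet"` is an assumption about how Ethereum mainnet is identified.

```go
package main

import "fmt"

// ScrapeSettings mirrors the items in the table above.
// The type is illustrative, not the TrueBlocks data model.
type ScrapeSettings struct {
	AppsPerChunk uint64
	SnapToGrid   uint64
	FirstSnap    uint64
	UnripeDist   uint64
	ChannelCount uint64
	AllowMissing bool
}

// defaultScrapeSettings returns the documented defaults, applying the
// larger appsPerChunk and firstSnap values for Ethereum mainnet.
func defaultScrapeSettings(chain string) ScrapeSettings {
	s := ScrapeSettings{
		AppsPerChunk: 200000,
		SnapToGrid:   100000,
		FirstSnap:    0,
		UnripeDist:   28,
		ChannelCount: 20,
		AllowMissing: true,
	}
	if chain == "mainnet" { // assumed chain key
		s.AppsPerChunk = 2000000
		s.FirstSnap = 2300000
	}
	return s
}

func main() {
	fmt.Printf("%+v\n", defaultScrapeSettings("mainnet"))
	fmt.Printf("%+v\n", defaultScrapeSettings("sepolia"))
}
```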

These items may be set in three ways, each overriding the preceding method:

-- in the above configuration file under the `[settings]` group,
-- in the environment by exporting the configuration item as UPPER&lowbar;CASE, without underbars, and prepended with TB_SETTINGS&lowbar;, or
-- on the command line using the configuration item with leading dashes (i.e., `--name`).
-- in the above configuration file under the `[scrape.<chain>]` group,
-- in the environment by exporting the configuration item as UPPER&lowbar;CASE (with underbars removed) and prepended with TB_SCRAPE&lowbar;CHAIN&lowbar;, or
-- on the command line using the configuration item with leading dashes and in snake case (i.e., `--snake_case`).

<!-- markdownlint-disable MD041 -->
### further information
6 changes: 6 additions & 0 deletions docs/templates/api/components.txt
@@ -424,6 +424,12 @@ components:
type: array
items:
$ref: "#/components/schemas/hash"
description: "a possibly empty array of uncle hashes"
withdrawals:
type: array
items:
$ref: "#/components/schemas/withdrawal"
description: "a possibly empty array of withdrawals (post Shanghai)"
transaction:
description: "transaction data as returned from the RPC (with slight enhancements)"
type: object
12 changes: 6 additions & 6 deletions docs/templates/readme-intros/admin-scrape.config.md
@@ -1,8 +1,8 @@
| Item | Type | Default | Description / Default |
| ------------------ | ------------ | ------------ | --------- |
| apps&lowbar;per&lowbar;chunk | uint64 | 200000 | the number of appearances to build into a chunk before consolidating it |
| snap&lowbar;to&lowbar;grid | uint64 | 100000 | an override to apps_per_chunk to snap-to-grid at every modulo of this value, this allows easier corrections to the index |
| first&lowbar;snap | uint64 | 0 | the first block at which snap_to_grid is enabled |
| unripe&lowbar;dist | uint64 | 28 | the distance (in blocks) from the front of the chain under which (inclusive) a block is considered unripe |
| channel&lowbar;count | uint64 | 20 | number of concurrent processing channels |
| allow&lowbar;missing | bool | true | do not report errors for blockchains that contain blocks with zero addresses |
| appsPerChunk | uint64 | 200000 | the number of appearances to build into a chunk before consolidating it |
| snapToGrid | uint64 | 100000 | an override to apps_per_chunk to snap-to-grid at every modulo of this value, this allows easier corrections to the index |
| firstSnap | uint64 | 0 | the first block at which snap_to_grid is enabled |
| unripeDist | uint64 | 28 | the distance (in blocks) from the front of the chain under which (inclusive) a block is considered unripe |
| channelCount | uint64 | 20 | number of concurrent processing channels |
| allowMissing | bool | true | do not report errors for blockchains that contain blocks with zero addresses |
1 change: 0 additions & 1 deletion sdk/python/src/_monitors.py
@@ -16,7 +16,6 @@
"watchlist": {"hotkey": "-a", "type": "flag"},
"commands": {"hotkey": "-c", "type": "flag"},
"batchSize": {"hotkey": "-b", "type": "flag"},
"runOnce": {"hotkey": "-r", "type": "switch"},
"sleep": {"hotkey": "-s", "type": "flag"},
"fmt": {"hotkey": "-x", "type": "flag"},
"verbose:": {"hotkey": "-v", "type": "switch"},
1 change: 0 additions & 1 deletion sdk/typescript/src/paths/monitors.ts
@@ -18,7 +18,6 @@ export function getMonitors(
watchlist?: string,
commands?: string,
batchSize?: uint64,
runOnce?: boolean,
sleep?: double,
chain: string,
noHeader?: boolean,
3 changes: 2 additions & 1 deletion sdk/typescript/src/types/block.ts
@@ -3,7 +3,7 @@
/*
* This file was generated with makeClass --sdk. Do not edit it.
*/
import { address, blknum, datetime, gas, hash, timestamp, Transaction, uint64, wei } from '.';
import { address, blknum, datetime, gas, hash, timestamp, Transaction, uint64, wei, Withdrawal } from '.';

export type Block = {
author: address
@@ -29,4 +29,5 @@ export type Block = {
transactions: Transaction[]
transactionsRoot: hash
uncles?: hash[]
withdrawals?: Withdrawal[]
}
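
To illustrate why `withdrawals` is optional on the block type, here is a rough Go sketch of decoding a block with and without the field. The struct names are hypothetical and are not TrueBlocks' types; the withdrawal field names (`index`, `validatorIndex`, `address`, `amount`) are assumed from EIP-4895 as returned by the RPC, with values kept as hex strings for simplicity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Withdrawal uses the EIP-4895 JSON field names (an assumption here;
// the TrueBlocks Withdrawal model may differ).
type Withdrawal struct {
	Index          string `json:"index"`
	ValidatorIndex string `json:"validatorIndex"`
	Address        string `json:"address"`
	Amount         string `json:"amount"`
}

// Block is a trimmed, illustrative block with the two optional arrays.
type Block struct {
	Number      string       `json:"number"`
	Uncles      []string     `json:"uncles,omitempty"`
	Withdrawals []Withdrawal `json:"withdrawals,omitempty"`
}

// decodeBlock parses a JSON block. A pre-Shanghai block carries no
// withdrawals key, so the slice is simply left nil.
func decodeBlock(raw string) (Block, error) {
	var b Block
	err := json.Unmarshal([]byte(raw), &b)
	return b, err
}

func main() {
	pre, _ := decodeBlock(`{"number":"0x1","uncles":[]}`)
	post, _ := decodeBlock(`{"number":"0x2","withdrawals":[{"index":"0x0","validatorIndex":"0x1","address":"0x0000000000000000000000000000000000000000","amount":"0x64"}]}`)
	fmt.Println(len(pre.Withdrawals), len(post.Withdrawals))
}
```

Because older cached blocks lack the field entirely, code reading the cache should treat a missing array and an empty array the same way.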
7 changes: 6 additions & 1 deletion src/apps/chifra/cmd/chunks.go
@@ -57,7 +57,8 @@ Notes:
- The --belongs option is only available in the index mode.
- The --first_block and --last_block options apply only to addresses, appearances, and index --belongs mode.
- The --pin option requires a locally running IPFS node or a pinning service API key.
- The --publish option requires a private key.`
- The --publish option requires a private key.
- The --publisher option is ignored with the --publish option since the sender of the transaction is recorded as the publisher.`

func init() {
var capabilities = caps.Default // Additional global caps for chifra chunks
@@ -69,16 +70,20 @@ func init() {
chunksCmd.Flags().BoolVarP(&chunksPkg.GetOptions().Check, "check", "c", false, "check the manifest, index, or blooms for internal consistency")
chunksCmd.Flags().BoolVarP(&chunksPkg.GetOptions().Pin, "pin", "i", false, "pin the manifest or each index chunk and bloom")
chunksCmd.Flags().BoolVarP(&chunksPkg.GetOptions().Publish, "publish", "p", false, "publish the manifest to the Unchained Index smart contract")
chunksCmd.Flags().StringVarP(&chunksPkg.GetOptions().Publisher, "publisher", "P", "trueblocks.eth", "for some query options, the publisher of the index (hidden)")
chunksCmd.Flags().Uint64VarP(&chunksPkg.GetOptions().Truncate, "truncate", "n", 0, "truncate the entire index at this block (requires a block identifier) (hidden)")
chunksCmd.Flags().BoolVarP(&chunksPkg.GetOptions().Remote, "remote", "r", false, "prior to processing, retrieve the manifest from the Unchained Index smart contract")
chunksCmd.Flags().StringSliceVarP(&chunksPkg.GetOptions().Belongs, "belongs", "b", nil, "in index mode only, checks the address(es) for inclusion in the given index chunk")
chunksCmd.Flags().BoolVarP(&chunksPkg.GetOptions().Diff, "diff", "f", false, "compare two index portions (see notes) (hidden)")
chunksCmd.Flags().Uint64VarP(&chunksPkg.GetOptions().FirstBlock, "first_block", "F", 0, "first block to process (inclusive)")
chunksCmd.Flags().Uint64VarP(&chunksPkg.GetOptions().LastBlock, "last_block", "L", 0, "last block to process (inclusive)")
chunksCmd.Flags().Uint64VarP(&chunksPkg.GetOptions().MaxAddrs, "max_addrs", "m", 0, "the max number of addresses to process in a given chunk")
chunksCmd.Flags().BoolVarP(&chunksPkg.GetOptions().Deep, "deep", "d", false, "if true, dig more deeply during checking (manifest only)")
chunksCmd.Flags().Float64VarP(&chunksPkg.GetOptions().Sleep, "sleep", "s", 0.0, "for --remote pinning only, seconds to sleep between API calls")
if os.Getenv("TEST_MODE") != "true" {
chunksCmd.Flags().MarkHidden("publisher")
chunksCmd.Flags().MarkHidden("truncate")
chunksCmd.Flags().MarkHidden("diff")
}
globals.InitGlobals(chunksCmd, &chunksPkg.GetOptions().Globals, capabilities)
