Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse indexed deals for data segment index #1495

Merged
merged 131 commits into from
Oct 9, 2023

Conversation

willscott
Copy link
Collaborator

@willscott willscott commented Jun 6, 2023

this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/ for new deals.

this is the same logic as in filecoin-project/lotus#10674

  • what branch should i be trying to merge this to?
  • is /extern/go-data-segment a reasonable additional submodule to add in so that the piece fixture in that repo can get used for a test?
  • test confirming expected index parsing of the data-segment fixture

jacobheun and others added 30 commits March 21, 2023 16:46
Internally this is still refered to as the piece directory

Co-authored-by: dirkmc <dirkmdev@gmail.com>
Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>
* add free check (#1315)

* chore: bump version to 1.6.1 (#1317)

* fix legacy deal verified status (#1324)

* fix: update go-unixfsnode enough to make sure unixfs-preload is available (#1323)

* release v1.6.2-rc1 (#1328)

* use full path (#1330)

* fix bug (#1332)

* use forks of graphsync, go-data-transfer and go-fil-markets (#1333)

* refactor: use forks of graphsync, go-data-transfer and go-fil-markets

* refactor: convert from data transfer v1 to v2 voucher type

* fix: index provider validation voucher type

* fix: pass index provider engine link system through to graphsync's transport configurer

* feat: use tagged version of boost-gfm

* fix: retrieval client imports

* feat: tagged version of lotus

* feat: require go 1.19

* lint: fix lint errors

* fix: itests

* fix: cbor-gen, docsgen

* fix: update CI lint version

* fix: lint

* fix: docgen

* fix: go mod tidy

* fix: protocol proxy TestOutboundForwarding

* fix: docsgen

* fix: update filecoin-ffi submodule

* fix: prometheus duplicate register panic

* fix: cleanup imports

* fix: legs voucher processing

* chore: release v1.6.2-rc2 (#1340)

* release v1.6.2-rc2

* fix test

* fix: flaky TestLibp2pCarServerNewTransferCancelsPreviousTransfer (#1350)

* fix: flaky TestDealCompletionOnProcessResumption (#1351)

* fix: occasional panic on shutdown (#1353)

* feat: query UI (#1352)

* log insert

* fix display error

* refactor code

* shorten status strings

* remove comment

* apply suggestion

* feat: add download block link to inspect page (#1312)

* fix(devnet): update golang and lotus default versions (#1354)

* fix(devnet): bump golang to 1.19

* chore(devnet): bump lotus default version

* chore(devnet): remove unused stable env

* booster-http: implement IPFS HTTP gateway (#1225)

* feat: implement http api gateway

* feat: use go-libipfs lib (instead of copying to extern)

* feat: bump booster-bitswap info minor version

* feat: http gateway metrics

* fix: TestHttpInfo

* feat: by default only serve blocks and CARs, with option to serve original files (jpg, mov etc)

* fix: correct link for download root block (#1355)

* feat: option to cleanup data for offline deals after add piece (#1341)

* chore: add support for multiple node.js versions in makefile (#1356)

* chore: release v.1.7.0-rc1 (#1357)

* release v.1.7.0-rc1

* fix version

* fix: dagstore initialize-all parameter (#1363)

* fix: show verifying commp state for offline deals (#1364)

* fix: boost run missing staging-area dir (#1368)

* merge(wip): main to lid

TODO: remoteblockstore needs to handle nil metrics

* fix: flaky TestNewHttpServer (#1372)

* feat: group agent version by binary name (#1369)

* fix: wrap stats in nil checks for now

we should probably revisit how stats are handled now that we have all 3 transports being tracked

* test(fix): incorrect test urls

---------

Co-authored-by: LexLuthr <88259624+LexLuthr@users.noreply.github.com>
Co-authored-by: Rod Vagg <rod@vagg.org>
Co-authored-by: dirkmc <dirkmdev@gmail.com>
* feat: support full addr config in boostd-data

* chore: fix linting for boostd-data

* feat: use addr instead of port for lid

chore: update devnet to work with lid setup

* chore: resolve feedback on lint changes
* fail deal if start epoch passed

* add suggestion

* test: add deal expiry on startup test

---------

Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
fix: add cache tests and dont announce cache state
fix: add unique index to sector state db
fix: sealed and unsealed sector state conflict
fix: ensure index provider wrapper starts after db migration has completed
* feat: yugabyte db impl

* feat: run yugabyte tests against a dockerized yugabyte

* fix: use out own yugabyte docker image

* fix: use yugabyte 2.17.2.0 docker image

* feat: piece doctor yugabyte impl

* fix: go mod tidy

* refactor: remove SetCarSize as its not longer being used

* refactor: remove functionality to mark index as errored (not being used)

* feat: implement delete commands

* refactor: consolidate test params

* feat: add lid yugabyte config

* fix: port map yugabyte postgres to standard port

* Fix yugabyte CI (#1433)

* fix: yugabyte tests in CI

* docker-compose.yml ; Dockerfile.test ; connect to `yugabyte` and not localhost

* add tag

* test lid

* make gen

* fixup

* move couchbase settings under build tag

---------

Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>

---------

Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>
* feat: script to migrate from couchbase to yugabyte

* fix: reduce batch size for yugabyte inserts
…#1444)

* feat: yugabyte db impl

* feat: run yugabyte tests against a dockerized yugabyte

* fix: use out own yugabyte docker image

* fix: use yugabyte 2.17.2.0 docker image

* feat: piece doctor yugabyte impl

* fix: go mod tidy

* refactor: remove SetCarSize as its not longer being used

* refactor: remove functionality to mark index as errored (not being used)

* feat: implement delete commands

* refactor: consolidate test params

* feat: add lid yugabyte config

* fix: port map yugabyte postgres to standard port

* Fix yugabyte CI (#1433)

* fix: yugabyte tests in CI

* docker-compose.yml ; Dockerfile.test ; connect to `yugabyte` and not localhost

* add tag

* test lid

* make gen

* fixup

* move couchbase settings under build tag

---------

Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>

* wip: service GetIndex returns channel of records instead of array

* feat: return channel from AddIndex and GetIndex

---------

Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>
* initial disaster recovery tool for LID

* wip

* do not block on individual error

* instantiate lid

* report

* catch signal

* fixup

* comment out sector already in progress

* fixup

* start containers with init: true

* record that we dont have an unsealed copy

* match deals with boost sqlite db and piece store

* fixup

* fixup

* use logger

* fixup

* disable stacktrace

* fixup

* extract piece store away from disaster recovery struct

* add more sanity checks

* compare IsUnsealed vs storage find

* improve safeIsUnseal

* fixup

* better logs

* expand repodir

* calc properly next offset

* fixup

* add sector id to logs

* incr offset

* break after finding expired deal

* more logs

* fewer logs

* better logs

* better error

* refactor

* refactor minerApi

* better logs

* add time around add index

* pd.Start
* feat: LID benchmarking tool

* fix: bench thread safety

* refactor: structured logging

* refactor: postgres bulk insert

* lid bench: Add foundationdb impl

* lid fdb: Fix Tx sizing, parallel chunk puts

* lid fdb: More efficient sample generation

* feat: array of piece count / blocks per piece (#1314)

* lid bench: print add rate

* lid bench: Add retry to postgres put (#1316)

* lid bench: Make cassandra put much more robust (#1318)

* instrumentation for bench tool (#1337)

* instrument postgres

* more instrumentation

* check for err getoffsetsize

* emit metrics every 10sec

* ignore errors

* add postgres-drop

* use directly tables

* fix: go mod tidy

* use INSERT INTO instead of tmp tables

* try to catch sig

* remove transaction commit

* fixup

* add postgres-init

* fixuop

* split create and init

* fixup

* remove if not exist

---------

Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>

* feat: batch insert queries for postgres

* feat: add flag to insert into postgres using tmp table

* refactor: merge changes from nonsense/lid-bench

* refactor: just use one database (dont create bench database)

* refactor: remove unused params

* refactor: command structure

* fix: cassandra - dont use batch insert for PayloadToPieces

* fix: create tables CQL

* fix: increase payload to pieces insert parallelism

* fix: use simple replication strategy

* feat: use yugabyte cassandra driver

* fix: remove bench binary

* update metrics endpoint

* fix random generated piece cid

* fixup

* fix: cassandra bitswap benchmark

* remove foundationdb

---------

Co-authored-by: Łukasz Magiera <magik6k@gmail.com>
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>
* fix timer.Reset and improve logs

* revert randomization

* piece doc: handle errors

* adjust piece check

* refactor unsealsectormanager

* refactor piece doctor

* add random ports

* ignore tests

* add version to boostd-data

* fix ctx in Start

* fix: add reader mock to fix tests

* fix: pass new piece directory to provider on test restart

* fix synchronisation

* note that panics are not propagated in tests

* carv1 panics piece directory

* print panics

* fix: use reader that supports Seek in piece reader mock

* fix: reset mock car reader on each invocation

* fix: TestOfflineDealDataCleanup

* add check for nil cancel func

* bump min check period for LevelDB to 5 minutes

* check if sector state mgr is initialised

* debug line for unflagging

* commenting out TestMultipleDealsConcurrent -- flaky test -- works locally

* add SectorStateUpdates pubsub

* add close for pubsub

* add mock sectorstatemgr

* add wrapper tests

* fixup

* cleanup

* cleanup

* better names

* t.Skip for test

* remove TODO above println for panic

* add unit tests for refreshState

* rename tests

* more cases

* more tests

* update description

* better comment

* better names and comments

---------

Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
* fix statx output string (#1451)

* fix: flaky TestMultipleDealsConcurrent (#1458)

* Add option to serve index provider ads over http (#1452)

* feat: option to serve index provider ads over http

* fix: config naming, hostname parsing

* fix: update docsgen

* fix: log announce address

* feat: add config for indexer direct announce urls

* refactor: always announce over pubsub

* fix: docsgen

* test: add test case for empty announce address hostname

* Add `boostd index announce-latest` command (#1456)

* feat: boostd index announce-latest

* feat: add announce-latest-http command

* fix: default direct announce url

* feat: update to index-provider v0.11.2

* Signal to index provider to skip announcements (#1457)

* fix: signal to index provider to skip announcements

* fix: ensure multihash lister skip error is of type ipld.ErrNotExists

---------

Co-authored-by: LexLuthr <lexluthr@protocol.ai>

* release v1.7.3-rc2 (#1460)

* fix: improve stalled retrieval cancellation (#1449)

* refactor stalled retrieval cancel

* add ctx with timeout

* implement suggestions

* update err wrapping

* fix: set short cancel timeout for unpaid retrievals only

---------

Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>

* feat: enable listen address for booster-http (#1461)

* enable listen address

* modify tests

* fix nil ptr (#1470)

* fix: incorrect check when import offline deal data using proposal CID (#1473)

* fix incorrect early check

* update error msg

* fix(server): properly cancel graphsync requests (#1475)

* set UI default listen address to localhost (#1476)

* feat: display msg params in the mpool UI (#1471)

* show msg params

* fix: mpool nil pointer

* fix width

---------

Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>

* Reset read deadline after reading deal proposal message (#1479)

* fix: reset read deadline after reading deal proposal message

* fix: increase client request deadline

* feat: Show elapsed epoch and PSD wait epochs in UI (#1480)

* show epochs

* fix devnet UI, use BlockdDelaySecs

* fix lint err

* Update gql/resolver.go

Co-authored-by: dirkmc <dirkmdev@gmail.com>

---------

Co-authored-by: dirkmc <dirkmdev@gmail.com>

* release v1.7.3-rc3 (#1481)

---------

Co-authored-by: LexLuthr <88259624+LexLuthr@users.noreply.github.com>
Co-authored-by: LexLuthr <lexluthr@protocol.ai>
Co-authored-by: Hannah Howard <hannah@hannahhoward.net>
* feat: update local index directory ui

* comment out wrench as docker doesnt build

* rearrange menu

* refactor: remove sectors list

---------

Co-authored-by: Anton Evangelatov <anton.evangelatov@gmail.com>
@LexLuthr
Copy link
Collaborator

LexLuthr commented Sep 8, 2023

Quick checklist for this PR.

@willscott willscott mentioned this pull request Sep 11, 2023
1 task
LexLuthr and others added 2 commits September 12, 2023 13:16
* add corshandler to IPFS gateway

* use cors lib

* go mod tidy

* refactor test, remove extra cors
@willscott
Copy link
Collaborator Author

@rvagg it looks like there's some nit with car offset propagation that didn't make it through the merge here properly - graphsync / retrievals look like they're not always getting expected readers

@rvagg
Copy link
Member

rvagg commented Sep 13, 2023

yeah, a bunch missing.

2 commits pushed, first one (e3b5841) fixes all that stuff; the second one (5932ce7) just adds some fixes to your data_segment_index_retrieval_test.go to make it pass — I know it's not what you're aiming for, hence the second commit you can unwind if you want, but aside from changing CIDs to make it pass, there's a couple of proper fixes in there, notably the false argument in RetrieveDirect telling it to not unpack the CAR UnixFS files, you just want to look at the CAR, and the second retrieval root check that was comparing against root 1.

@willscott
Copy link
Collaborator Author

@LexLuthr should we merge this branch to https://github.com/filecoin-project/boost/tree/feat/noncar-files ?

@LexLuthr
Copy link
Collaborator

@LexLuthr should we merge this branch to https://github.com/filecoin-project/boost/tree/feat/noncar-files ?

Yes. We should merge this branch to feat/non-carfiles once we are done with the checklist at #1495 (comment)

go.mod Show resolved Hide resolved
@willscott willscott changed the base branch from main to feat/noncar-files October 9, 2023 12:10
@LexLuthr LexLuthr merged commit 7b1f9c6 into feat/noncar-files Oct 9, 2023
22 checks passed
@LexLuthr LexLuthr deleted the piecedirectory/segmentindex branch October 9, 2023 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

10 participants