Optimise Contract Sizes #773

Merged
merged 3 commits into from
Nov 30, 2023
49 changes: 32 additions & 17 deletions stores/metadata.go
@@ -919,26 +919,42 @@ func (s *SQLStore) ContractSets(ctx context.Context) ([]string, error) {
 }

 func (s *SQLStore) ContractSizes(ctx context.Context) (map[types.FileContractID]api.ContractSize, error) {
-	rows := make([]struct {
+	type size struct {
 		Fcid     fileContractID `json:"fcid"`
 		Size     uint64         `json:"size"`
 		Prunable uint64         `json:"prunable"`
-	}, 0)
+	}

-	if err := s.db.
-		Raw(`
-SELECT fcid, MAX(c.size) as size, CASE WHEN MAX(c.size)>COUNT(cs.db_sector_id) * ? THEN MAX(c.size)-(COUNT(cs.db_sector_id) * ?) ELSE 0 END as prunable
-FROM contracts c
-LEFT JOIN contract_sectors cs ON cs.db_contract_id = c.id
-GROUP BY c.fcid
-	`, rhpv2.SectorSize, rhpv2.SectorSize).
-		Scan(&rows).
-		Error; err != nil {
+	var nullContracts []size
Member

`null` requires some insider knowledge of the query we had before. I'd recommend something like `unusedContracts` and `usedContracts` to indicate that these are just contracts without sectors.

Member Author

I updated the comments on the query to better explain what's happening. I'd like to keep `nullContracts` because, for me, it's the only name that really makes sense. I don't like `unusedContracts` or `emptyContracts` because if they're unused or empty, why would you be interested in their size field for pruning...

+	var dataContracts []size
+	if err := s.db.Transaction(func(tx *gorm.DB) error {
+		// first, we fetch all contracts without sectors and consider their
+		// entire size as prunable
+		if err := tx.
+			Raw(`
+SELECT c.fcid, c.size, c.size as prunable FROM contracts c WHERE NOT EXISTS (SELECT 1 FROM contract_sectors cs WHERE cs.db_contract_id = c.id)`).
+			Scan(&nullContracts).
+			Error; err != nil {
+			return err
+		}
+
+		// second, we fetch how much data can be pruned from all contracts that
+		// do have sectors, we take a two-step approach because it allows us to
+		// use an INNER JOIN on contract_sectors, drastically improving the
+		// performance of the query
+		return tx.
+			Raw(`
+SELECT fcid, contract_size as size, CASE WHEN contract_size > sector_size THEN contract_size - sector_size ELSE 0 END as prunable FROM (
+SELECT c.fcid, MAX(c.size) as contract_size, COUNT(cs.db_sector_id) * ? as sector_size FROM contracts c INNER JOIN contract_sectors cs ON cs.db_contract_id = c.id GROUP BY c.fcid
+) i`, rhpv2.SectorSize).
+			Scan(&dataContracts).
+			Error
+	}); err != nil {
 		return nil, err
 	}

 	sizes := make(map[types.FileContractID]api.ContractSize)
-	for _, row := range rows {
+	for _, row := range append(nullContracts, dataContracts...) {
 		if types.FileContractID(row.Fcid) == (types.FileContractID{}) {
 			return nil, errors.New("invalid file contract id")
 		}
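
The two-step approach in this hunk can be modelled in memory as a rough sketch. It is not the store's implementation: the `contract` and `contractSize` types and the `prunable` and `contractSizes` helpers are hypothetical, and `sectorSize` is assumed to be 4 MiB as a stand-in for `rhpv2.SectorSize`. It shows that contracts without sectors are treated as fully prunable, while contracts with sectors get the clamped difference from the `CASE` expression.

```go
package main

import "fmt"

// sectorSize is an assumption (4 MiB) standing in for rhpv2.SectorSize.
const sectorSize uint64 = 1 << 22

// contract is a hypothetical in-memory stand-in for a row in `contracts`
// plus the number of rows that reference it in `contract_sectors`.
type contract struct {
	ID      string
	Size    uint64
	Sectors uint64
}

// contractSize mirrors the size/prunable pair scanned by the queries.
type contractSize struct {
	Size     uint64
	Prunable uint64
}

// prunable mirrors the CASE expression: the subtraction only happens when
// it cannot underflow, otherwise the result is clamped to zero.
func prunable(size, sectors uint64) uint64 {
	used := sectors * sectorSize
	if size > used {
		return size - used
	}
	return 0
}

// contractSizes models the two queries and the merge of their results.
func contractSizes(contracts []contract) map[string]contractSize {
	sizes := make(map[string]contractSize, len(contracts))

	// step one: contracts without sectors are entirely prunable
	for _, c := range contracts {
		if c.Sectors == 0 {
			sizes[c.ID] = contractSize{Size: c.Size, Prunable: c.Size}
		}
	}

	// step two: contracts with sectors keep what their sectors occupy
	for _, c := range contracts {
		if c.Sectors > 0 {
			sizes[c.ID] = contractSize{Size: c.Size, Prunable: prunable(c.Size, c.Sectors)}
		}
	}
	return sizes
}

func main() {
	sizes := contractSizes([]contract{
		{ID: "empty", Size: 3 * sectorSize, Sectors: 0},
		{ID: "partial", Size: 3 * sectorSize, Sectors: 2},
		{ID: "full", Size: 2 * sectorSize, Sectors: 2},
	})
	for _, id := range []string{"empty", "partial", "full"} {
		fmt.Printf("%s: size=%d prunable=%d\n", id, sizes[id].Size, sizes[id].Prunable)
	}
}
```

The two sets are disjoint by construction, which is why the diff can concatenate `nullContracts` and `dataContracts` without deduplicating.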
@@ -962,11 +978,10 @@ func (s *SQLStore) ContractSize(ctx context.Context, id types.FileContractID) (a

 	if err := s.db.
 		Raw(`
-SELECT MAX(c.size) as size, CASE WHEN MAX(c.size)>(COUNT(cs.db_sector_id) * ?) THEN MAX(c.size)-(COUNT(cs.db_sector_id) * ?) ELSE 0 END as prunable
-FROM contracts c
-LEFT JOIN contract_sectors cs ON cs.db_contract_id = c.id
-WHERE c.fcid = ?
-	`, rhpv2.SectorSize, rhpv2.SectorSize, fileContractID(id)).
+SELECT contract_size as size, CASE WHEN contract_size > sector_size THEN contract_size - sector_size ELSE 0 END as prunable FROM (
Member

Is this new query faster? Or did you just refactor it?

Member Author

It's marginally faster. It avoids doing MAX and COUNT multiple times. I never really cared because I figured the database engine would optimise that for me.

+SELECT MAX(c.size) as contract_size, COUNT(cs.db_sector_id) * ? as sector_size FROM contracts c LEFT JOIN contract_sectors cs ON cs.db_contract_id = c.id WHERE c.fcid = ?
+) i
+	`, rhpv2.SectorSize, fileContractID(id)).
 		Take(&size).
 		Error; err != nil {
 		return api.ContractSize{}, err
2 changes: 1 addition & 1 deletion stores/metadata_test.go
@@ -3765,7 +3765,7 @@ func TestSlabHealthInvalidation(t *testing.T) {
 	}

 	// refresh health
-	now := time.Now().Round(time.Second)
+	now := time.Now()
 	if err := ss.RefreshHealth(context.Background()); err != nil {
 		t.Fatal(err)
 	}