Simplify types.ActivationTx #5789

Closed
wants to merge 18 commits into from

Contributor

@poszu poszu commented Mar 29, 2024

Motivation

Part of #5774. Once types.ActivationTx is no longer used for serialization (see #5784), it can be refactored without breaking backward compatibility.

Description

  • made Nipost, Nipost.Post, and Nipost.PostMetadata required in types.ActivationTx (they are still pointers in the wire type),
  • changed the nipostValidator interface methods to take Post, PostMetadata, and VRFPostIndex by value. These types are small, so there shouldn't be a performance regression,
  • removed type InnerActivationTx,
  • removed the optional NodeId to avoid a duplicated node ID. Its value is validated during parsing from the wire type,
  • exported the Validity, Received, and Golden fields (id and effectiveNumUnits are left for later as this PR is already big),
  • moved verification of Sequence into the code that parses from the wire.
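
To illustrate the parsing-time checks, here is a minimal sketch. The types `WireATXV1` and `ATX` and the function `ATXFromWireV1` are simplified, hypothetical stand-ins and not the actual code in this PR; only the three error conditions are taken from the checks that used to live in the handler.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for the real ID types.
type ATXID [32]byte
type NodeID [32]byte

var EmptyATXID ATXID

// WireATXV1 mimics a wire type: optional fields are pointers.
type WireATXV1 struct {
	SmesherID NodeID
	PrevATXID ATXID
	Sequence  uint64
	NodeID    *NodeID // expected only in an initial ATX
}

// ATX mimics a simplified domain type without the optional NodeID.
type ATX struct {
	SmesherID NodeID
	PrevATXID ATXID
	Sequence  uint64
}

// ATXFromWireV1 validates the optional NodeID and the Sequence while
// converting, so the domain type never carries an invalid combination.
func ATXFromWireV1(w *WireATXV1) (*ATX, error) {
	if w.PrevATXID == EmptyATXID {
		// Initial ATX: NodeID must be present and Sequence must be zero.
		if w.NodeID == nil {
			return nil, errors.New("no prev atx declared, but node id is missing")
		}
		if w.Sequence != 0 {
			return nil, errors.New("no prev atx declared, but sequence number not zero")
		}
	} else if w.NodeID != nil {
		// Non-initial ATX: NodeID must not be included.
		return nil, errors.New("prev atx declared, but node id is included")
	}
	return &ATX{SmesherID: w.SmesherID, PrevATXID: w.PrevATXID, Sequence: w.Sequence}, nil
}

func main() {
	id := NodeID{1}
	// Initial ATX without NodeID is rejected while parsing.
	_, err := ATXFromWireV1(&WireATXV1{SmesherID: id})
	fmt.Println(err)
	// Well-formed initial ATX is accepted.
	atx, err := ATXFromWireV1(&WireATXV1{SmesherID: id, NodeID: &id})
	fmt.Println(atx != nil, err)
}
```

With this shape, callers of the domain type never need to nil-check NodeID, because malformed combinations are rejected before a domain value exists.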

Test Plan

  • added tests for parsing an ATX from the V1 wire type.

TODO

  • Explain motivation or link existing issue(s)
  • Test changes and document test plan
  • Update documentation as needed
  • Update changelog as needed

@poszu poszu changed the base branch from develop to atx-wire-type March 29, 2024 13:00

codecov bot commented Mar 29, 2024

Codecov Report

Attention: Patch coverage is 96.07843%, with 4 lines in your changes missing coverage. Please review.

Project coverage is 80.3%. Comparing base (28480f5) to head (20f8b4f).

Files                       Patch %  Lines
sql/atxs/atxs.go            80.0%    1 Missing and 1 partial ⚠️
activation/validation.go    85.7%    1 Missing ⚠️
common/types/activation.go  97.8%    0 Missing and 1 partial ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           atx-wire-type   #5789   +/-   ##
=============================================
  Coverage           80.3%   80.3%           
=============================================
  Files                287     287           
  Lines              29782   29764   -18     
=============================================
- Hits               23934   23920   -14     
+ Misses              4210    4208    -2     
+ Partials            1638    1636    -2     

@poszu poszu marked this pull request as ready for review March 29, 2024 13:41
@@ -677,7 +670,6 @@ func (b *Builder) createAtx(
 		nipostState.NumUnits,
 		nonce,
 	)
-	atx.InnerActivationTx.NodeID = atxNodeID
Member

The NodeID is required to be set in an initial ATX; this cannot just be dropped here?

Contributor Author

Correct, but it must be set in the wire type wire.ActivationTxV1. This field doesn't need to be present in types.ActivationTx and hence it was removed. It's set in

if a.PrevATXID == EmptyATXID {
	atxV1.InnerActivationTxV1.NodeID = &atxV1.SmesherID
}
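
For context, a rough sketch of how that snippet could sit inside a domain-to-wire conversion. The struct layouts and the function name `ToWireV1` are simplified assumptions for illustration, not the real definitions; only the `if` block mirrors the snippet above.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real types.
type ATXID [32]byte
type NodeID [32]byte

var EmptyATXID ATXID

type InnerActivationTxV1 struct {
	PrevATXID ATXID
	NodeID    *NodeID // optional on the wire, set only for an initial ATX
}

type ActivationTxV1 struct {
	InnerActivationTxV1
	SmesherID NodeID
}

// ActivationTx is the domain type; it has no NodeID field of its own.
type ActivationTx struct {
	PrevATXID ATXID
	SmesherID NodeID
}

// ToWireV1 builds the wire representation. For an initial ATX (no previous
// ATX) the wire NodeID is derived from SmesherID, so the domain type does
// not need to store the same ID twice.
func ToWireV1(a *ActivationTx) *ActivationTxV1 {
	atxV1 := &ActivationTxV1{
		InnerActivationTxV1: InnerActivationTxV1{PrevATXID: a.PrevATXID},
		SmesherID:           a.SmesherID,
	}
	if a.PrevATXID == EmptyATXID {
		atxV1.InnerActivationTxV1.NodeID = &atxV1.SmesherID
	}
	return atxV1
}

func main() {
	initial := ToWireV1(&ActivationTx{SmesherID: NodeID{7}})
	next := ToWireV1(&ActivationTx{SmesherID: NodeID{7}, PrevATXID: ATXID{1}})
	fmt.Println(initial.NodeID != nil, next.NodeID != nil)
}
```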

@@ -677,7 +670,6 @@ func (b *Builder) createAtx(
 		nipostState.NumUnits,
 		nonce,
 	)
-	atx.InnerActivationTx.NodeID = atxNodeID
Member

Instead of createAtx creating a types.ActivationTx that is later converted to the wire type in broadcast, createAtx should only care about creating a wire type and returning it for broadcast to use.

@@ -148,9 +148,6 @@ func (h *Handler) processVerifiedATX(
}

func (h *Handler) SyntacticallyValidate(ctx context.Context, atx *types.ActivationTx) error {
Member

This function should not receive a types.ActivationTx but the respective wire type.

Contributor Author

Yes, see #5789 (comment)

Member

But the fact that this still uses types.ActivationTx and not the wire types is the reason why this PR is so big 😉 see #5789 (comment)

Comment on lines -167 to -169
if atx.InnerActivationTx.NodeID == nil {
	return errors.New("no prev atx declared, but node id is missing")
}
Member

This is not optional! Initial ATXs must contain a NodeId.

Comment on lines -179 to -181
if atx.Sequence != 0 {
	return errors.New("no prev atx declared, but sequence number not zero")
}
Member

V1 ATXs contain a sequence number and that number still needs to be verified to be monotonically increasing. We do not want to change validation behavior at all if we can avoid it.

Comment on lines -198 to -200
if atx.InnerActivationTx.NodeID != nil {
	return errors.New("prev atx declared, but node id is included")
}
Member

See my comment above

Comment on lines +560 to +563
atx, err := types.ActivationTxFromWireV1(&atxOnWire)
if err != nil {
	return nil, fmt.Errorf("%w: %w", errMalformedData, err)
}
Member

This conversion should happen after the ATX was validated and before it is stored in the DB. Validation for ATXv1 will differ from validation for ATXv2 - which also means we will have different methods for validation based on the version of ATX that is validated.

This doesn't need to be a handleV1 and a handleV2 method; it could just be one service per wire type version that we inject into the handler, each handling ATXs of its version.
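
A rough sketch of what such injection could look like. All names here (`VersionHandler`, `Handler.HandleATX`, `v1Handler`, the `store` callback) are hypothetical, and passing the version as a parameter is an assumption; how the version is actually determined is out of scope for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// ATX is a hypothetical, minimal domain type.
type ATX struct {
	Version int
}

// VersionHandler validates an encoded ATX of exactly one wire version and,
// only after validation succeeds, converts it to the domain type.
type VersionHandler interface {
	Process(raw []byte) (*ATX, error)
}

// Handler dispatches to the injected per-version services and stores the
// resulting domain value.
type Handler struct {
	versions map[int]VersionHandler
	store    func(*ATX) error
}

func (h *Handler) HandleATX(version int, raw []byte) error {
	v, ok := h.versions[version]
	if !ok {
		return fmt.Errorf("unsupported ATX version %d", version)
	}
	atx, err := v.Process(raw) // decode + validate + convert
	if err != nil {
		return fmt.Errorf("processing ATX v%d: %w", version, err)
	}
	return h.store(atx) // only validated domain values reach the DB
}

// v1Handler is a placeholder; real validation would check the NIPoST,
// the sequence number, and so on against the V1 wire type.
type v1Handler struct{}

func (v1Handler) Process(raw []byte) (*ATX, error) {
	if len(raw) == 0 {
		return nil, errors.New("empty ATX")
	}
	return &ATX{Version: 1}, nil
}

func main() {
	h := &Handler{
		versions: map[int]VersionHandler{1: v1Handler{}},
		store:    func(*ATX) error { return nil },
	}
	fmt.Println(h.HandleATX(1, []byte{0x01}))
	fmt.Println(h.HandleATX(2, []byte{0x01}))
}
```

Adding ATXv2 would then mean registering another `VersionHandler`, without touching the V1 validation path.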

Contributor Author

Yes, I agree, but this PR is focused on refactoring/simplifying types.ActivationTx, not overhauling everything. I'm trying to divide the gigantic work into separate, reviewable steps (PRs) that we can iterate on quickly, yet they are apparently already big and slow to review.

Member

@fasmat fasmat Apr 8, 2024

I would do it in smaller steps:

  1. Create the ATXv1 wire type and only use it in the activation.Handler
  2. Update sql/atx functions to not require scale encoding/decoding any more
    2a. This requires a DB migration that extracts info not currently present in the DB from the encoded ATXs
    2b. This can also be used to separate the atx blobs from the rest of the data into their own table
  3. Introduce new ATX domain type and start using it where currently types.ActivationTx is used
    3a. Not every usage needs to be replaced immediately; add a comment to types.ActivationTx that it is not supposed to be used any more and slowly migrate over time
    3b. Eventually combine atxsdata.ATX and new domain type into a single type
  4. Remove usages of types.ActivationHeader or also just add a comment that it is deprecated and should not be used any more but instead the new domain type.

Member

EDIT: updated comment with more info

Contributor Author

Step 2 is nearly impossible without first cutting down types.ActivationTx. It contains too much data, like Post, Poet membership proofs, etc. IMHO, we need to reduce the amount of information in types.ActivationTx first, before we fix the SQL queries to not parse the blob.

Comment on lines 682 to 705
t.Run("missing NodeID in initial atx", func(t *testing.T) {
	t.Parallel()

	atxHdlr := newTestHandler(t, goldenATXID)

	ctxID := posAtx.ID()
	challenge := types.NIPostChallenge{
		Sequence:       0,
		PrevATXID:      types.EmptyATXID,
		PublishEpoch:   currentLayer.GetEpoch(),
		PositioningATX: posAtx.ID(),
		CommitmentATX:  &ctxID,
	}
	nipost := types.NIPost{PostMetadata: &types.PostMetadata{}}
	atx := newAtx(challenge, &nipost, 100, types.GenerateAddress([]byte("aaaa")))
	atx.InitialPost = &types.Post{
		Nonce:   0,
		Indices: make([]byte, 10),
	}

	atxHdlr.mclock.EXPECT().CurrentLayer().Return(currentLayer)
	err := atxHdlr.SyntacticallyValidate(context.Background(), atx)
	require.ErrorContains(t, err, "node id is missing")
})
Member

This test should not be deleted. NodeID must be included in the initial ATX!

Contributor Author

This case is covered now by a different test.

Comment on lines +167 to +168
id                ATXID  // non-exported cache of the ATXID
effectiveNumUnits uint32 // the number of effective units in the ATX (minimum of this ATX and the previous ATX)
Member

ID should also be an exported field. effectiveNumUnits can be dropped and its value should be assigned to NumUnits.

The difference between the effectiveNumUnits and NumUnits value used to be that NumUnits is what the ATX contains as value - but effectiveNumUnits might differ from that if the node increased their PoST size. Example:

Epoch n:   ATX with 5 NumUnits  - effective NumUnits 5
Epoch n+1: ATX with 10 NumUnits - effective NumUnits 5
Epoch n+2: ATX with 10 NumUnits - effective NumUnits 10

The effectiveNumUnits value is always the minimum of the current and the previous ATX because the smesher will create a PoST in Epoch n+1 for 10 NumUnits but was only registered with 5 for the PoET round, so the data wasn't yet kept for at least one epoch.

In the Domain Type NumUnits can always be the effectiveNumUnits because all services besides the activation.Handler should only care about effectiveNumUnits.
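
The rule can be sketched in a few lines. The helper name `effectiveNumUnits` and its signature are hypothetical, and treating an ATX with no previous ATX as fully effective is an assumption made only so the example is complete.

```go
package main

import "fmt"

// effectiveNumUnits returns the minimum of the NumUnits declared in the
// previous ATX and in the current one. prev is nil when there is no
// previous ATX; returning the declared value in that case is an assumption.
func effectiveNumUnits(prev *uint32, declared uint32) uint32 {
	if prev != nil && *prev < declared {
		return *prev
	}
	return declared
}

func main() {
	// NumUnits declared in epochs n, n+1, n+2 from the example above.
	declared := []uint32{5, 10, 10}
	var prev *uint32
	for i := range declared {
		eff := effectiveNumUnits(prev, declared[i])
		fmt.Printf("epoch n+%d: declared %d, effective %d\n", i, declared[i], eff)
		prev = &declared[i]
	}
}
```

Computing this once, when the ATX is converted to the domain type, would let every other service read a single NumUnits value.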

Contributor Author

ID should also be an exported field.

Yes, I wrote in the description it should and why I didn't export it yet.

Good point about the effectiveNumUnits, I wasn't aware. I will refactor it separately.

Comment on lines +156 to +160
NIPostChallenge
Coinbase Address
NumUnits uint32

NIPost NIPost
Member

Does the Domain type for ActivationTx need to contain the full NIPostChallenge? I think just PrevATXID should be enough - all the other values are only needed to verify the NIPost which is only done by the activation.Handler and also doesn't need to be included into the domain type I believe.

Contributor Author

That's right! :) But it first requires refactoring the Handler, which I don't want to do in this PR (because of the PR size).

Member

Handler, in my opinion, shouldn't use the domain type, but the wire type. It validates that the received ATX is valid before converting it to the domain type and then storing it in the DB. This would require fewer code changes and would allow removing fields in types.ActivationTx immediately (if we want to keep that type around instead of creating a new domain type).

Contributor Author

Handler, in my opinion, shouldn't use the domain type, but the wire type. It validates that the received ATX is valid before converting it to the domain type and then storing it in the DB.

That's right! :)

This would require fewer code changes

I'm not sure about this. Removing fields like NipostChallenge (i.e. the POST proof) would require lots of changes in different places (other code, such as looking up a positioning ATX, depends on it) and I think it would end up as a bigger PR.

Please, let's not waste time arguing about the optimal way of introducing the changes. I'm sure there are plenty of ways and we could spend a day talking about them. I think we more or less agree on the outcome we want to achieve.

@@ -10,7 +10,7 @@ import (
 )
 
 type NIPostState struct {
-	*types.NIPost
+	types.NIPost
Member

Now with the context of this PR I believe that types.NIPost should not be part of the domain type types.ActivationTx but ONLY of the wire type. This also means that it does not belong here (no wire types in sql packages).

This doesn't mean that we can't have a separate type here that is identical to the current types.NIPost so we can encode it as bytes and store it in the DB. It just means that this type is ONLY used to preserve the NIPost and for nothing else: it is not included in types.ActivationTx, and for encoding, wire.ActivationTxV1 has its own (sub-)type.

@spacemesh-bors spacemesh-bors bot changed the base branch from atx-wire-type to develop April 15, 2024 18:16
@poszu poszu closed this Apr 16, 2024
@fasmat fasmat deleted the remove-optionals-from-ActivationTx branch August 29, 2024 20:36