Skip to content
This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Only include fixed length fields in events #3076

Closed
tevoinea opened this issue May 1, 2023 · 2 comments · Fixed by #3543
Closed

Only include fixed length fields in events #3076

tevoinea opened this issue May 1, 2023 · 2 comments · Fixed by #3543
Labels
enhancement New feature or request

Comments

@tevoinea
Copy link
Member

tevoinea commented May 1, 2023

As a continuation of #2937, we must slim down the data contained by our events. This will be a breaking change and that will be reflected in both the OneFuzz version number and the event version number.

Which fields we choose to keep will vary event to event but in general:

Data that is OK to keep

  • Guids
  • DateTimes
  • Enums
  • Numbers
  • Any models that exclusively contain fields of those types

Examples of data to remove

  • Open ended strings (ex: notification templates)
  • Unbounded lists (ex: call stacks)
  • Any models that include either of those types

AB#141950

@tevoinea tevoinea added the enhancement New feature or request label May 1, 2023
@tevoinea
Copy link
Member Author

tevoinea commented May 4, 2023

How do we assert that the information container in events will never be too large? We can discriminate based on types.

There's 3 scenarios to consider:

  1. Bounded types
  2. Bounded data in unbounded types
  3. Unbounded data in unbounded types

Using a custom serializer, we can skip fields that do not follow scenario 1 or 2.

Bounded types

We don't need to worry about these. It's unlikely we'll ever have so many of them that we start hitting size limits.

Examples: Guids, DateTimes, Enums, Numbers

Bounded data in unbounded types

public record EventScalesetStateUpdated(
    Guid ScalesetId,
    PoolName PoolName,
    ScalesetState State
) : BaseEvent();

ScalesetId and State are bounded in length. PoolName isn't directly bounded since it's just a type alias for String, BUT it's indirectly bounded because we use it as for an azure queue name which has size limits (up to 63 characters).

Suggestion: When the underlying type is unbounded in length but we know there are external constraints, we can create a type alias or wrapping type (like we have with Container) that enforces those constraints. Then, apply a type attribute that can be picked up by the custom serializer to include this field.

If we follow the suggestion, the serialized EventScalesetStateUpdated will not change. If we don't follow the suggestion, PoolName will be omitted.

Unbounded data in unbounded types

public record EventCrashReported(
    Report Report,
    Container Container,
    [property: JsonPropertyName("filename")] String FileName,
    TaskConfig? TaskConfig
) : BaseEvent()

public record Report(
    string? InputUrl,
    BlobRef? InputBlob,
    string Executable,
    string CrashType,
    string CrashSite,
    List<string> CallStack,
    string CallStackSha256,
    ...
);

In Report there are many fields that can and do take unbounded data like CallStack. There are some fields that only take bounded data like BlobRef. There are some fields like CrashType that could be bounded if we spent the time to scope all possible inputs.

In this case, only the fields that follow scenario 1 or 2 will be serialized. So the final serialized type will look like:

{
    "report": {
        "inputBlob": "https://example.com/some-blob-ref"
        /* no inputUrl, no executable, no call stack, no crashType, etc. */
    },
    "container": "some-container-name",
    "filename": "some-file-name",
    "taskConfig": {
        /* follows the same rules as report */
    }
}

@Porges
Copy link
Member

Porges commented May 9, 2023

I think we should allow any IValidatedString (in #3045 I'm adding a non-generic "base interface" to make this easier), with the assumption that any instances should enforce a maximum length limit. As you point out, PoolName should have a length enforced already.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants