Add GZip compression for WorkflowInstance Data field on EF Core #4876

Merged
merged 6 commits into from
Feb 4, 2024

sfmskywalker
Member

@sfmskywalker sfmskywalker commented Feb 3, 2024

This PR adds the ability to store workflow state in a compressed format for EF Core.

Future updates may include other persistence provider support such as Dapper and MongoDB.

Closes #4875

This update adds a compression feature for workflow state data to reduce storage needs. A compression strategy resolver and the None and GZip strategies have been implemented. Migration scripts were also updated, and superfluous files were removed.
This commit adds migration files for the MySql, SqlServer, Sqlite, and PostgreSql contexts in the Elsa project. The V3_1 files were generated automatically; their Up and Down methods still need to be filled in with the required changes.
This commit introduces two new fields to the WorkflowInstances table: "DataCompressionAlgorithm" and "DataFormat". These changes allow storing additional contextual information about the payloads of workflow instances and enhance future handling and processing of this data.
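As a rough illustration of the idea in the commits above (not Elsa's actual C# code), the following Python sketch stores the name of the compression algorithm next to the payload, so that a reader can resolve the matching strategy when loading a row. The `CODECS` table and the `resolve_codec`, `write_row`, and `read_row` helpers are hypothetical names introduced only for this example; the "Data" and "DataCompressionAlgorithm" keys mirror the column names mentioned in this PR.

```python
import gzip
import json
from typing import Callable, Dict, Tuple

# Hypothetical strategy table: algorithm name -> (compress, decompress).
# "None" passes bytes through unchanged; "GZip" uses the standard gzip module.
Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]
CODECS: Dict[str, Codec] = {
    "None": (lambda data: data, lambda data: data),
    "GZip": (gzip.compress, gzip.decompress),
}


def resolve_codec(name: str) -> Codec:
    """Look up a codec by the name stored alongside the payload."""
    try:
        return CODECS[name]
    except KeyError:
        raise ValueError(f"Unknown compression algorithm: {name!r}")


def write_row(state: dict, algorithm: str) -> dict:
    """Serialize state to JSON, compress it, and record which algorithm was used."""
    compress, _ = resolve_codec(algorithm)
    raw = json.dumps(state).encode("utf-8")
    return {"Data": compress(raw), "DataCompressionAlgorithm": algorithm}


def read_row(row: dict) -> dict:
    """Decompress using the algorithm recorded in the row, then parse the JSON."""
    _, decompress = resolve_codec(row["DataCompressionAlgorithm"])
    return json.loads(decompress(row["Data"]).decode("utf-8"))
```

Because the algorithm name travels with each row, rows written with different strategies can coexist in the same table, and the configured default can change without rewriting existing data.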
@sfmskywalker sfmskywalker added elsa 3 This issue is specific to Elsa 3 enhancement New feature or request migration steps labels Feb 3, 2024
@sfmskywalker
Member Author

I want to apply this also to the ActivityState field of the ActivityExecutionRecord table - for some customers, these fields can reach many megabytes (10MB for some scenarios that I noticed).
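To get a feel for why compression helps with large state fields, here is a small Python sketch that compares the raw and gzip-compressed size of a JSON payload. The sample data is made up for illustration and is not taken from a real Elsa workflow; actual ratios depend on the content of the state.

```python
import gzip
import json
from typing import Tuple


def compression_report(state: dict) -> Tuple[int, int]:
    """Return (raw_size, gzip_size) in bytes for the JSON form of state."""
    raw = json.dumps(state).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Made-up, repetitive payload standing in for serialized activity state.
sample = {
    "activities": [
        {"id": f"activity-{i}", "status": "Completed", "output": {"value": i}}
        for i in range(1000)
    ]
}

raw_size, gz_size = compression_report(sample)
print(f"raw: {raw_size} bytes, gzip: {gz_size} bytes")
```

Serialized state tends to repeat the same property names many times, which is the kind of redundancy that general-purpose compressors handle well.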

Deleted all migration files related to the Management module across MySql, SqlServer, Sqlite and PostgreSql data providers. These files included data compression algorithm and data format column additions in the WorkflowInstances table.
@sfmskywalker sfmskywalker added schema change This issue or PR changes a schema that may needs to be documented with the release notes and removed migration steps labels Feb 4, 2024
@lahma
Contributor

lahma commented Feb 4, 2024

RavenDB folks did some compression algorithm comparison recently against JSON data (zstd was a clear winner): ravendb/ravendb#17678

Removed the ICompressionStrategyResolver interface and its file, and added the ICompressionCodec and ICompressionCodecResolver interfaces. Updated the relevant classes for the new interfaces and added a Zstd class under Compression as a new compression method. Also modified WorkflowInstanceStore.cs, None.cs, EFCoreWorkflowInstanceStore.cs, GZip.cs, and ActivityExecutionLogStore.cs to use uniform compression terminology.
@sfmskywalker
Member Author

RavenDB folks did some compression algorithm comparison recently against JSON data: ravendb/ravendb#17678

Wow, ZSTD is not only significantly faster, but produces smaller output as well.
Definitely gonna add that as a strategy and try it out 👍🏻

@sfmskywalker sfmskywalker merged commit f1b85a1 into main Feb 4, 2024
2 checks passed
@sfmskywalker sfmskywalker deleted the issue(4875) branch February 4, 2024 11:19
@sfmskywalker sfmskywalker added this to the Elsa 3.1 milestone Feb 9, 2024
Labels
elsa 3 This issue is specific to Elsa 3 enhancement New feature or request schema change This issue or PR changes a schema that may needs to be documented with the release notes
Development

Successfully merging this pull request may close these issues.

Enable optional gzip compression when saving workflow instances
4 participants