File Based Storage Provider #9537

Draft
wants to merge 1 commit into
base: main
from

Conversation

@BernhardPollerspoeck BernhardPollerspoeck commented Jun 2, 2025

This pull request introduces a new file-based grain storage provider for Microsoft Orleans. The changes include adding a new project for the provider, implementing its core functionality, and providing documentation and configuration examples.

New File-Based Grain Storage Provider

Project Setup:

  • Added a new project Orleans.Persistence.FileStorage to the solution with the necessary project references and metadata (Orleans.sln, src/File/Orleans.Persistence.FileStorage/Orleans.Persistence.FileStorage.csproj).

Core Implementation:

  • Implemented the FileGrainStorage class, which provides methods for reading, writing, and clearing grain state using a file-based approach (src/File/Orleans.Persistence.FileStorage/FileGrainStorage.cs).
  • Added a factory class FileGrainStorageFactory to create instances of FileGrainStorage (src/File/Orleans.Persistence.FileStorage/FileGrainStorageFactory.cs).
  • Created FileGrainStorageOptions to configure the root directory and serializer for the storage provider (src/File/Orleans.Persistence.FileStorage/FileGrainStorageOptions.cs).
  • Added extension methods in FileSiloBuilderExtensions to simplify the configuration of the file storage provider in Orleans silo builders (src/File/Orleans.Persistence.FileStorage/FileSiloBuilderExtensions.cs).

Documentation and Examples:

  • Added a README.md file with an introduction, setup instructions, and examples for configuring and using the file storage provider (src/File/Orleans.Persistence.FileStorage/README.md).

@BernhardPollerspoeck
Author

@dotnet-policy-service agree

@shacal
shacal commented Jun 2, 2025

👍🏻

Directory.CreateDirectory(Path.GetDirectoryName(path)!);
}
var fileInfo = new FileInfo(path);
if (fileInfo.Exists && fileInfo.LastWriteTimeUtc.ToString(CultureInfo.InvariantCulture) != grainState.ETag)

File timestamps can be a pain (PITA) when the files live on file shares, ZFS, etc.

An option for SHA-256 or custom providers would allow different kinds of consistency checks.

Some filesystems have hashing built in, and some expose metadata/tags that would make this better suited for production.
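
A minimal sketch of the content-hash idea, assuming the serialized state is available as a byte array. `ComputeETag` and `IsStale` are hypothetical helper names and are not part of this PR:

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: derive the ETag from the stored bytes instead of
// from the file's LastWriteTimeUtc.
static string ComputeETag(byte[] storedData) =>
    Convert.ToHexString(SHA256.HashData(storedData));

// Hypothetical helper: a write conflicts when the bytes currently on disk
// no longer hash to the ETag the grain read earlier.
static bool IsStale(byte[] bytesOnDisk, string expectedETag) =>
    expectedETag != null && ComputeETag(bytesOnDisk) != expectedETag;
```

The trade-off is an extra read of the existing file before each write, so this may only pay off where timestamps cannot be trusted.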


public sealed class FileGrainStorageOptions : IStorageProviderSerializerOptions
{
#region properties

For the love of devs, do not use regions 👏

var storedData = options.GrainStorageSerializer.Serialize(grainState.State);
var fName = GetKeyString(stateName, grainId);
var path = Path.Combine(options.RootDirectory, fName!);
if (!Directory.Exists(path))

This synchronous call will be very costly on some file-sharing services.
I suggest attempting the write first and creating the folder only if that fails.

As written, this is more operations per write than is normally needed.
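
A rough sketch of that suggestion, assuming the write surfaces a missing folder as `DirectoryNotFoundException`. `WriteWithLazyDirectoryAsync` is a hypothetical helper name, not code from this PR:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: attempt the write first and create the directory
// only when the write fails because the directory is missing.
static async Task WriteWithLazyDirectoryAsync(string path, byte[] storedData)
{
    try
    {
        await File.WriteAllBytesAsync(path, storedData);
    }
    catch (DirectoryNotFoundException)
    {
        // CreateDirectory does nothing if another writer created it meanwhile.
        Directory.CreateDirectory(Path.GetDirectoryName(path)!);
        await File.WriteAllBytesAsync(path, storedData);
    }
}
```

In the common case where the folder already exists, this costs a single write with no extra existence check.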

Contributor

I mentioned this in Discord and wanted to note it here for anyone else looking at this PR. We did something similar, and getting the semantics right for a file system storage provider is tricky because:

  • File systems are implemented differently, and you have to be sure the subset of functionality you are using is robust across all of them - including when you add NFS or SMB into the mix too, what do they each guarantee in terms of data integrity?
  • Writes are not atomic: if your program or OS crashes midway through overwriting grain state, you are left in a non-deterministic, possibly corrupt state.
  • LastWriteTimeUtc may not be accurate - caching, lazy metadata writes could affect it - see https://learn.microsoft.com/en-us/dotnet/api/system.io.filesysteminfo.lastwritetimeutc?view=net-9.0#remarks
  • It's possible for two identical grains to be active during a split-brain scenario so you cannot rely on reading last write time then writing because that allows a race-condition.
  • On Linux (as of .NET 9.0) there are no truly asynchronous file operations - all the async ones are implemented as synchronous queued on the threadpool, so you need to be careful not to flood the threadpool with thread stalling sync work during grain activation storage reads.
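
One way to keep file reads from flooding the thread pool is to bound how many run at once. The sketch below is an illustration only: the limit of 8, the `ReadGatedAsync` name, and the choice to raise the minimum worker thread count are all assumptions, not code from this PR.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Assumed limit; tune it for the storage stack in use.
const int MaxConcurrentFileOps = 8;
var gate = new SemaphoreSlim(MaxConcurrentFileOps, MaxConcurrentFileOps);

// Assumption: give the thread pool headroom equal to the gate limit so that
// blocked file operations do not starve other work.
ThreadPool.GetMinThreads(out var workers, out var ioThreads);
ThreadPool.SetMinThreads(workers + MaxConcurrentFileOps, ioThreads);

// Hypothetical helper: at most MaxConcurrentFileOps reads are in flight;
// further callers wait asynchronously without holding a thread.
async Task<byte[]> ReadGatedAsync(string path)
{
    await gate.WaitAsync();
    try
    {
        return await File.ReadAllBytesAsync(path);
    }
    finally
    {
        gate.Release();
    }
}
```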

All these problems can be worked around, and I think it's important to do so because you are dealing with storage and people will trust it to reliably persist their precious data.

We went through several iterations for a log-based storage provider, and what we settled on was:

  • Use exclusive file locking (and we test that this works on the base path specified on initialization, because if not, all bets are off) - and handle the specific IOException HResult (which is different on Windows and *nix) raised when trying to access a locked file.
  • Append an xxHash to the contents of each file to ensure we can detect partial writes/integrity problems - this could maybe also serve as your ETag in this scenario.
  • Always write replacement contents (with an xxHash) to a new, deterministically named file and then overwrite the original file, ensuring exclusive access to both for the duration.
  • Always look for the above deterministically named new file when opening each original file, and resume the replacement operation if it exists.
  • Add a concurrency gate around the async-over-sync file operations and increase the threadpool size by the concurrency gate limit, to ensure the threadpool has adequate capacity for our file operations.
  • Ensure file handle lifetime is short if the number of concurrent handles might become a problem (some storage stacks have limits).
  • Make sure you are not going to run into inode exhaustion based on how you store your files; this seems to matter especially for EXT volumes.
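
The write-then-replace steps above could look roughly like the following. This is a sketch under stated assumptions, not the implementation described in the comment: SHA-256 from the base class library stands in for xxHash to avoid an extra package, a rename via `File.Move` stands in for "overwrite the original", and the `.new` suffix and all helper names are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

const int HashSize = 32; // SHA-256 digest length in bytes

// Hypothetical naming scheme for the deterministically named replacement file.
string PendingPath(string path) => path + ".new";

// Payload followed by its hash, so truncated or corrupt files are detectable.
byte[] Seal(byte[] payload)
{
    var hash = SHA256.HashData(payload);
    var sealedBytes = new byte[payload.Length + hash.Length];
    payload.CopyTo(sealedBytes, 0);
    hash.CopyTo(sealedBytes, payload.Length);
    return sealedBytes;
}

// Returns false when the trailing hash does not match the payload.
bool TryUnseal(byte[] sealedBytes, out byte[] payload)
{
    payload = Array.Empty<byte>();
    if (sealedBytes.Length < HashSize) return false;
    ReadOnlySpan<byte> body = sealedBytes.AsSpan(0, sealedBytes.Length - HashSize);
    ReadOnlySpan<byte> trailer = sealedBytes.AsSpan(sealedBytes.Length - HashSize);
    if (!SHA256.HashData(body).AsSpan().SequenceEqual(trailer)) return false;
    payload = body.ToArray();
    return true;
}

void WriteState(string path, byte[] payload)
{
    var pending = PendingPath(path);
    var sealedBytes = Seal(payload);
    // FileShare.None requests exclusive access while the replacement is written.
    using (var fs = new FileStream(pending, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        fs.Write(sealedBytes, 0, sealedBytes.Length);
        fs.Flush(flushToDisk: true);
    }
    File.Move(pending, path, overwrite: true);
}

bool TryReadState(string path, out byte[] payload)
{
    var pending = PendingPath(path);
    if (File.Exists(pending))
    {
        // A complete pending file means the writer stopped before the replace:
        // resume it. An incomplete one is discarded.
        if (TryUnseal(File.ReadAllBytes(pending), out _))
            File.Move(pending, path, overwrite: true);
        else
            File.Delete(pending);
    }
    payload = Array.Empty<byte>();
    return File.Exists(path) && TryUnseal(File.ReadAllBytes(path), out payload);
}
```

Whether the final rename is atomic depends on the file system and platform, so that property would need to be verified for each target, including NFS and SMB shares.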

With the above mitigated we have processed a few billion storage operations on the file system now - but we are still only using this for data that can be replaced.

@BernhardPollerspoeck
Copy link
Author

I need to find the time to learn more about your suggested changes. I am not that deep into the details you mention, and I don't want to submit changes I don't understand.
@shacal @willg1983


3 participants