
Allow disk-based persistence to be completely disabled #2228

Merged: 39 commits merged into Azure:main from joelverhagen:in-memory on Nov 15, 2023

Conversation

@joelverhagen (Contributor) commented Oct 15, 2023

PR Branch Destination

This adds a new --inMemoryPersistence option which makes the disk-based extent storage memory-based and configures the LokiJS persistence settings to use only memory.
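
To illustrate the intended usage, here's a minimal sketch of a CI-style run. It assumes Node's child_process and @azure/storage-blob, uses the default devstoreaccount1 endpoints, and the readiness wait is deliberately crude; nothing in it besides --inMemoryPersistence is specific to this PR:

```typescript
// Sketch: start Azurite with in-memory persistence, do a quick blob round trip, then stop it.
import { spawn } from "child_process";
import { BlobServiceClient } from "@azure/storage-blob";

const connectionString =
  "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;" +
  "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;" +
  "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;";

async function main(): Promise<void> {
  // Nothing is written to disk, so there is no --location folder to clean up afterwards.
  const azurite = spawn("npx", ["azurite", "--inMemoryPersistence", "--silent"], {
    stdio: "inherit",
    shell: true,
  });

  // Crude readiness wait; a real harness would poll the endpoint instead.
  await new Promise((resolve) => setTimeout(resolve, 3000));

  const service = BlobServiceClient.fromConnectionString(connectionString);
  const container = service.getContainerClient("ci-smoke-test");
  await container.create();
  await container.getBlockBlobClient("hello.txt").upload("hello", 5);

  // Stopping the process is the only cleanup needed; all state lives in memory.
  azurite.kill();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```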

I'm happy to break this up into smaller PRs if you'd like.

This addresses the following feature request (also by me): #2227

Summary of changes:

  • Add a design spec for this feature. I added a new docs structure mimicking how we do public specs on the NuGet team.
  • Introduce --inMemoryPersistence option and plumb it through the code
  • Add MemoryExtentStore as an alternative to FSExtentStore
  • Fix some flaky tests (flaky prior to my change, judging by the AzDO CI build history)
  • Refactor QueueServer and TableServer creation to a test factory, allowing centralized configuration.
  • Add steps to the ADO build pipeline to run all tests with --inMemoryPersistence right after the disk-based runs.

Always Add Test Cases

All test cases (except one) work with the --inMemoryPersistence option. The one failing test is skipped when --inMemoryPersistence is enabled; it makes sense that it fails, because it requires reading a checked-in legacy LokiJS disk persistence file.

Design spec

The rendered design spec is available here: https://github.com/joelverhagen/Azurite/blob/in-memory/docs/designs/2023-10-in-memory-persistence.md

For comments, it's available along with the rest of the PR here:
https://github.com/Azure/Azurite/pull/2228/files#diff-638244b70a10f551a0a6a66d691f8d88cf555fa0b9da1710e3d3bac119c05c39

Performance

| Method | Mean | Error | StdDev |
| --- | --- | --- | --- |
| InMemory_Blob | 1,860.0 us | 25.45 us | 21.25 us |
| InMemory_Queue | 1,562.3 us | 26.89 us | 25.15 us |
| InMemory_Table | 975.7 us | 8.65 us | 7.23 us |
| Disk_Blob | 5,505.8 us | 91.37 us | 89.74 us |
| Disk_Queue | 5,327.1 us | 104.47 us | 111.78 us |
| Disk_Table | 963.6 us | 11.34 us | 9.47 us |

Unsurprisingly, blob and queue are much faster without writing extents to disk.

@blueww (Member) commented Oct 16, 2023

@joelverhagen

Thanks for the contribution!
Would you please share the reason for this change?
In which scenarios would we want data cached in memory instead of on disk?

Also, Azurite not only supports saving data on disk; it also allows saving blob metadata in SQL. Will this change also disable saving data in SQL? (If so, the parameter name/description might need to change.)
Besides that, have you tested that all APIs of all services (B/T/Q) work with this change? If we plan to add this, we should have this kind of test environment covered in the pipeline https://github.com/Azure/Azurite/blob/main/azure-pipelines.yml
Is there any limitation (like blob size) for this change? If so, we should mention it in the README.md file.
Could you also share how Azurite handles running out of memory in this case, and what the typical error is?

@joelverhagen (Contributor, author) commented Oct 16, 2023

> Would you please share the reason for this change?

This is intended for CI testing scenarios where Azurite is started at the beginning of the pipeline, some tests are run, and then the CI pipeline ends.

Generally, I find myself using Azurite for cases where I don't care much if the storage is lost; it even annoys me to have to clean up leftover storage containers. Today, in my project NuGet/Insights (which I've used as a general test bed for Azurite as a side goal), I have a step that cleans up containers, queues, and tables based on a test name convention. This is workable, and necessary for tests against real Azure Storage, but for CI pipelines it would be nice to just stop the node process and know all clean-up is done.
https://github.com/NuGet/Insights/blob/5d0b5ed62a71b10d93d77f5135b4876a1b1bf4dc/test/Worker.Logic.Test/TestSupport/TestDataTest.cs#L87-L114
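
(For the curious, that cleanup step boils down to something like this: a minimal sketch using @azure/storage-blob and a hypothetical "t-" container prefix, not the actual NuGet/Insights code.)

```typescript
// Sketch: delete every container whose name follows a test naming convention.
import { BlobServiceClient } from "@azure/storage-blob";

export async function cleanUpTestContainers(
  connectionString: string,
  testPrefix: string // hypothetical convention, e.g. "t-"
): Promise<void> {
  const service = BlobServiceClient.fromConnectionString(connectionString);
  for await (const container of service.listContainers({ prefix: testPrefix })) {
    // Each delete is an extra HTTP round trip; with --inMemoryPersistence you can
    // instead just stop the Azurite process and skip all of this.
    await service.deleteContainer(container.name);
  }
}
```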

Additionally, I've found that some of my tests run surprisingly slowly, and it turned out to be related to how queue messages are persisted in Azurite. A disk write is done per enqueue and dequeue, which means there's a lot more IO than if LokiJS were used to persist message content, since Loki only writes periodically, and this is noticeable in the current perf difference between the Table and Queue write-then-read flows.
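
(If you want to see that cost yourself, here's a rough sketch of timing queue round trips with @azure/storage-queue against a locally running Azurite. Nothing in it is specific to this PR, and the numbers are only meaningful relative to each other.)

```typescript
// Sketch: time N enqueue/dequeue round trips to compare disk vs in-memory persistence.
import { QueueServiceClient } from "@azure/storage-queue";

async function timeQueueRoundTrips(connectionString: string, count: number): Promise<number> {
  const queue = QueueServiceClient.fromConnectionString(connectionString).getQueueClient("perf-test");
  await queue.createIfNotExists();

  const start = Date.now();
  for (let i = 0; i < count; i++) {
    await queue.sendMessage(`message ${i}`); // with disk persistence, each enqueue writes an extent
    const { receivedMessageItems } = await queue.receiveMessages();
    for (const m of receivedMessageItems) {
      await queue.deleteMessage(m.messageId, m.popReceipt); // plus bookkeeping per dequeue
    }
  }
  return Date.now() - start; // total milliseconds for `count` round trips
}
```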

> In which scenarios would we want data cached in memory instead of on disk?

In short, CI runs where we start Azurite, run tests, then (optionally) stop Azurite.

> Also, Azurite not only supports saving data on disk; it also allows saving blob metadata in SQL. Will this change also disable saving data in SQL? (If so, the parameter name/description might need to change.)

It should still work. --inMemoryPersistence does work with SQL, in which case extent metadata is stored in SQL and the extents themselves are stored in memory. A weird combination for sure, but the previous mode should work fine simply by not specifying --inMemoryPersistence. From what I can tell, the FSExtentStore is used for SQL as well, which is also a bit strange to me: it's essentially local extent storage and remote metadata storage. Anyway, happy to adjust the behavior however you'd like.

> Besides that, have you tested that all APIs of all services (B/T/Q) work with this change?

Yes, I did some test hacks to run all the tests I could with this change, but I'm not sure exactly how you'd like to get full coverage. Should we run all tests both with and without the --inMemoryPersistence flag? That would double the number of tests but give very good coverage. Or we could run a subset with --inMemoryPersistence? Some representative subset? Happy to follow a suggestion here, or I can give it a try on my own.

> Besides that, have you tested that all APIs of all services (B/T/Q) work with this change? If we plan to add this, we should have this kind of test environment covered in the pipeline https://github.com/Azure/Azurite/blob/main/azure-pipelines.yml

Great idea! I'll give that a try. Maybe this addresses the previous point.

> Is there any limitation (like blob size) for this change? If so, we should mention it in the README.md file.

I'll take this as a TODO. It's certainly related to the amount of memory available to the node process.
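
One option would be to track the total bytes held by the in-memory store and reject writes past a configurable budget, surfacing a client error instead of letting the process run out of memory. A rough sketch with hypothetical names, not necessarily what this PR ends up shipping:

```typescript
// Sketch: a byte budget for an in-memory chunk store (hypothetical names, not the PR's actual extent store).
class InMemoryChunkStore {
  private chunks = new Map<string, Uint8Array>();
  private totalBytes = 0;

  constructor(private readonly maxBytes: number) {}

  add(id: string, data: Uint8Array): void {
    if (this.totalBytes + data.byteLength > this.maxBytes) {
      // Surfacing this as a client error (e.g. HTTP 409) is friendlier than letting Node hit a real OOM.
      throw new Error(`Cannot add chunk. Size limit of ${this.maxBytes} bytes would be exceeded.`);
    }
    this.chunks.set(id, data);
    this.totalBytes += data.byteLength;
  }

  delete(id: string): void {
    const existing = this.chunks.get(id);
    if (existing !== undefined) {
      this.totalBytes -= existing.byteLength;
      this.chunks.delete(id);
    }
  }
}
```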

> Could you also share how Azurite handles running out of memory in this case, and what the typical error is?

I'll take this as a TODO. I am not sure how Node operates when it's low on memory. My guess is that big allocations can fail and trigger GC, and a severe memory shortage would cause the process to crash.

@joelverhagen (Contributor, author)

> Could you also share how Azurite handles running out of memory in this case, and what the typical error is?

> I'll take this as a TODO. I am not sure how Node operates when it's low on memory. My guess is that big allocations can fail and trigger GC, and a severe memory shortage would cause the process to crash.

I did some testing. The approach is surprisingly scalable: when physical memory is consumed, virtual memory is provided by the OS. I was not able to get any "out of memory" errors. I included a screenshot in the design doc.

joelverhagen marked this pull request as ready for review on October 17, 2023
@blueww (Member) commented Oct 18, 2023

@joelverhagen

Thanks for the contribution!
We may need some time to review/test the code change and the doc.

Will update you later.

@joelverhagen (Contributor, author)

Great, thanks @blueww! Let me know if there are any updates I should make or questions I can answer. I'm happy to jump on a Teams call too if you think that will help ease communication.

@blueww (Member) commented Oct 20, 2023

Hi @joelverhagen,

I have done some sanity checks with the PR:

  1. I tried to upload multiple 3GB blobs to Azurite running with "Azurite --inMemoryPersistence".
  • My machine is Windows 10 with 32GB memory and ~46GB virtual memory.
  • There was no OOM on my machine after I uploaded 16 * 3GB blobs; when I uploaded the 17th 3GB blob, the following error happened and Azurite exited. The behavior will likely differ between OSes.
  • It might not be good to eat almost all memory. Should we provide a way to limit memory usage in this case (like 1/2 of the currently available memory + virtual memory)?
<--- Last few GCs --->

[49684:00000229E11C69C0]   456817 ms: Mark-sweep (reduce) 195.0 (242.0) -> 195.0 (209.5) MB, 123.2 / 0.0 ms  (average mu = 0.033, current mu = 0.025) external memory pressure; GC in old space requested
[49684:00000229E11C69C0]   456893 ms: Mark-sweep (reduce) 195.0 (209.5) -> 195.0 (208.5) MB, 75.7 / 0.0 ms  (average mu = 0.019, current mu = 0.000) external memory pressure; GC in old space requested


<--- JS stacktrace --->

FATAL ERROR: Committing semi space failed. Allocation failed - JavaScript heap out of memory
 1: 00007FF6F8A4D51F node_api_throw_syntax_error+175743
 2: 00007FF6F89D2E66 v8::internal::wasm::WasmCode::safepoint_table_offset+59654
 3: 00007FF6F89D4B72 v8::internal::wasm::WasmCode::safepoint_table_offset+67090
 4: 00007FF6F947AAB4 v8::Isolate::ReportExternalAllocationLimitReached+116
 5: 00007FF6F9465E12 v8::Isolate::Exit+674
 6: 00007FF6F92E7C6C v8::internal::EmbedderStackStateScope::ExplicitScopeForTesting+124
 7: 00007FF6F92F20CC v8::internal::Heap::PageFlagsAreConsistent+3612
 8: 00007FF6F92E4707 v8::internal::Heap::CollectGarbage+2039
 9: 00007FF6F92FB0C3 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath+2099
10: 00007FF6F92FB96D v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath+93
11: 00007FF6F930B1A3 v8::internal::Factory::NewFillerObject+851
12: 00007FF6F8FFC285 v8::internal::DateCache::Weekday+1349
13: 00007FF6F9518051 v8::internal::SetupIsolateDelegate::SetupHeap+558193
14: 00007FF6F94CEA38 v8::internal::SetupIsolateDelegate::SetupHeap+257624
15: 00007FF67977349A
  2. After uploading 11 * 3GB blobs, when Azurite exits (with Ctrl+C), it takes > 2 minutes to release all the used memory and exit. Could we accelerate this? Or at least we should mention it in the doc so everyone knows about it. I'm also not sure whether anything bad will happen if Azurite is killed before it fully exits.

  3. The debug log seems to disappear: when running Azurite with "Azurite --inMemoryPersistence --debug debug.log", the debug log is not generated in the current folder.

  • Without "--inMemoryPersistence", the debug log is generated in the current folder.
  • If the full path of the debug log is specified, like "--debug c:\temp\debug.log", the debug log is also generated.
  4. To keep data consistent, we should not allow users to start Azurite with metadata saved to SQL (when the environment variable "AZURITE_DB" is set) or with the location parameter ("--location" or "-l") specified; or at least we should ignore the SQL and location settings with a warning message.

(Two review threads on README.md: outdated, resolved)
@joelverhagen (Contributor, author)

Hey @blueww! I believe I have addressed all of your comments. Please let me know if more is needed 😃

@blueww (Member) commented Nov 13, 2023

@joelverhagen

I have tested repeatedly uploading and deleting blobs with Azurite running with "--inMemoryPersistence", but the result is not what I expected: it looks like deleted blobs are not cleared from memory. Would you please help take a look?

Test Scenario:

Upload 10 * 100MB blobs, then delete all 10 blobs. Then upload 10 * 100MB blobs again, and delete them again, and so on.
I tried to repeat this 1000 times, but it failed after ~15 iterations.

Test Result:

After uploading/deleting 10 * 100MB blobs for ~15 iterations, I hit the following error:

Set-AzStorageBlobContent : Cannot add an extent chunk to the in-memory store. Size limit of 17030338560 bytes will be exceeded HTTP Status Code: 409 - HTTP Error Message: Cannot add an extent chunk to the in-memory store. Size limit of 17030338560 bytes will be exceeded
ErrorCode: MemoryExtentStoreAtSizeLimit
ErrorMessage: Cannot add an extent chunk to the in-memory store. Size limit of 17030338560 bytes will be exceeded
RequestId:53834723-e10f-48f9-aa9d-35330109a991
Time:2023-11-13T08:33:50.957Z
At line:6 char:9
+         Set-AzStorageBlobContent -Container $containerName -Blob "tes ...
+         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Set-AzStorageBlobContent], StorageException
    + FullyQualifiedErrorId : StorageException,Microsoft.WindowsAzure.Commands.Storage.Blob.SetAzureBlobContentCommand

The memory usage chart (I waited > 15 minutes after the error happened):
[screenshot: memory usage chart]

For your reference, here is my test script. I run it from PowerShell; the Az.Storage module needs to be installed first.

$ctx = New-AzStorageContext -ConnectionString "AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;DefaultEndpointsProtocol=http;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;TableEndpoint=http://127.0.0.1:10002/devstoreaccount1;"

$containerName = "test"
New-AzStorageContainer -Name $containerName -Context $ctx

for ($i =0 ; $i -le 1000; $i++ )
{
    Echo "create Blobs"
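    # testfile_102400K_0 is presumably a pre-created ~100MB (102,400 KB) test file; create it before running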
    for ($j=0 ; $j -le 10; $j++)
    {
        Set-AzStorageBlobContent -Container $containerName -Blob "testblob_$($i)_$($j)" -File C:\temp\testfile_102400K_0 -Context $ctx -Force
    }
    echo "Delete blobs"
    for ($j=0 ; $j -le 10; $j++)
    {
        Remove-AzStorageBlob -Container $containerName -Blob "testblob_$($i)_$($j)"  -Context $ctx -PassThru -Force 
    }
}

@joelverhagen (Contributor, author)

@blueww - the default blob GC interval is 10 minutes (`export const DEFAULT_GC_INTERVAL_MS = 10 * 60 * 1000;`). From your memory chart, I see growth in memory starting only 3-4 minutes after the process start. It is expected to hit this error if you reach the limit before the BlobGCManager runs. If you attempt the upload again more than 10 minutes after process start, you should see success. I see a spike in CPU right around the 10 minute mark, so I'm guessing the blob GC actually happened, but your uploads did not continue after you hit the error (around the 3-4 minute mark).

Regarding the memory consumption, I wasn't able to reproduce this behavior. I changed the GC period to 5 minutes to ease testing and set an extent limit of 5 GB.
[screenshot: memory usage during the repro attempt]
(I am not sure what tool you used, but it looks like a visualization of system perf counters.)

A debug log will shed light on this. You'll want to look for a line like this:

info: BlobGCManager:markSweep() Deleted unreferenced 650 extents, after excluding active write extents.

It's possible that Node isn't releasing the memory even after the blob GC occurs. I am not an expert in Node.js GC, but after a bit of reading it seems like it could be related to the default max-old-space-size value selected on your machine. It seems Node.js won't perform some of the more expensive GC steps if the current memory consumption is lower than this threshold, which may explain the behavior you're seeing. The part of the code that is impacted by this PR is the integration with the blob and queue GC. If you see a markSweep log line similar to the above, then it's working as I designed it. That being said, if you provide a debug log it might help me understand what you're seeing.
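
(Conceptually, the mark-sweep pass is just this; a simplified sketch with hypothetical names, not the actual BlobGCManager code:)

```typescript
// Sketch: mark-sweep over extents, deleting anything no blob (or in-flight write) still references.
interface ExtentStore {
  listExtentIds(): Promise<string[]>;
  deleteExtents(ids: string[]): Promise<void>;
}

interface MetadataStore {
  listReferencedExtentIds(): Promise<Set<string>>; // extents still referenced by blob metadata
}

async function markSweep(
  extents: ExtentStore,
  metadata: MetadataStore,
  activeWriteExtentIds: Set<string>
): Promise<number> {
  const referenced = await metadata.listReferencedExtentIds(); // mark
  const all = await extents.listExtentIds();
  const unreferenced = all.filter(
    (id) => !referenced.has(id) && !activeWriteExtentIds.has(id) // keep in-flight writes
  );
  await extents.deleteExtents(unreferenced); // sweep
  return unreferenced.length; // what the "Deleted unreferenced N extents" log line reports
}
```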

You could also run a similar test against the existing disk-based persistence. I tried with a ramdisk on Ubuntu WSL (steps: https://askubuntu.com/a/453755) and similarly encountered failures (HTTP 500 rather than 409, of course) at my 5 GB ramdisk limit. However, in that case the blob GC didn't find any extents at all. I tried many versions of Azurite (historical tags like v3.10.0) and saw the same condition:

info: BlobGCManager:markSweep() Got 0 extents.

So an alternate explanation is that there is an existing bug in the blob GC manager that you encountered. But we need a debug log to know.

@blueww (Member) commented Nov 14, 2023

@joelverhagen

Thanks for the investigation!

I used the Perfmon tool to chart memory/CPU.

I have tested the following 3 scenarios, and this time it looks like GC works in all of them.
And I do see the following debug log when GC happens:

info: BlobGCManager:markSweep() Deleted unreferenced 2106 extents, after excluding active write extents.

Besides that, to avoid customer confusion about why deleted blob content still occupies memory, do you think we should mention how GC works in the README.md section "Use in-memory storage"?

Scenario 1

  1. Upload 100 * 100MB blobs, then delete them.
  2. Wait for 10 minutes.
  3. Repeat steps 1 and 2.

[screenshot: memory usage for scenario 1]

Scenario 2

  1. Upload 200 * 100MB blobs (after uploading ~150 blobs, an OOM error occurs) and delete them.
  2. Wait 10 minutes.
  3. Repeat steps 1 and 2.

[screenshot: memory usage for scenario 2]

Scenario 3

  1. Upload 10 * 100MB blobs and delete them.
  2. Repeat step 1 for 20 times (after repeating ~15 times, an OOM error occurs).
  3. Wait 1000 seconds.
  4. Repeat steps 1-3.

We can see that "Private Bytes" doesn't drop immediately after GC, which is different from the other 2 scenarios. But it does drop at the beginning of the next round of blob uploads.

[screenshot: memory usage for scenario 3]

@joelverhagen (Contributor, author)

> Besides that, to avoid customer confusion about why deleted blob content still occupies memory, do you think we should mention how GC works in the README.md section "Use in-memory storage"?

Sure, I'll do that.

> We can see that "Private Bytes" doesn't drop immediately after GC, which is different from the other 2 scenarios. But it does drop at the beginning of the next round of blob uploads.

Generally, I've seen this behavior in several GC'd runtimes. In my head it's basically "GC sometimes does the hard work only when memory pressure is faced" (i.e. lazily), which in this case is when the next round of blobs comes in. But this is my gut check, not specific expertise on the Node GC.

@joelverhagen (Contributor, author)

I've updated the README with these details.

@blueww (Member) commented Nov 15, 2023

@joelverhagen

Thanks for the quick update!
I will do a final review, and we should be able to merge this PR soon.

blueww merged commit 813950d into Azure:main on Nov 15, 2023
35 checks passed
joelverhagen deleted the in-memory branch on November 15, 2023
@edwin-huber (Collaborator)

Fixes #1331
Fixes #1332
