
Allow disk-based persistence to be completely disabled #2228

Merged: 39 commits merged into Azure:main from joelverhagen:in-memory on Nov 15, 2023

Conversation

@joelverhagen (Contributor) commented Oct 15, 2023

PR Branch Destination

This adds a new --inMemoryPersistence option which makes the disk-based extent storage memory-based and configures the LokiJS persistence settings to use only memory.
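
To illustrate the intended usage, here's a minimal sketch of a CI-style run. It assumes Node's child_process and @azure/storage-blob, uses the default devstoreaccount1 endpoints, and the readiness wait is deliberately crude; nothing in it besides --inMemoryPersistence is specific to this PR:

```typescript
// Sketch: start Azurite with in-memory persistence, do a quick blob round trip, then stop it.
import { spawn } from "child_process";
import { BlobServiceClient } from "@azure/storage-blob";

const connectionString =
  "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;" +
  "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;" +
  "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;";

async function main(): Promise<void> {
  // Nothing is written to disk, so there is no --location folder to clean up afterwards.
  const azurite = spawn("npx", ["azurite", "--inMemoryPersistence", "--silent"], {
    stdio: "inherit",
    shell: true,
  });

  // Crude readiness wait; a real harness would poll the endpoint instead.
  await new Promise((resolve) => setTimeout(resolve, 3000));

  const service = BlobServiceClient.fromConnectionString(connectionString);
  const container = service.getContainerClient("ci-smoke-test");
  await container.create();
  await container.getBlockBlobClient("hello.txt").upload("hello", 5);

  // Stopping the process is the only cleanup needed; all state lives in memory.
  azurite.kill();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```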

I'm happy to break this up into smaller PRs if you'd like.

This addresses the following feature request (also by me): #2227

Summary of changes:

  • Add a design spec for this feature. I added a new docs structure mimicking how we do public specs on the NuGet team.
  • Introduce --inMemoryPersistence option and plumb it through the code
  • Add MemoryExtentStore as an alternative to FSExtentStore
  • Fix some flaky tests (flaky prior to my change, judging by the AzDO CI build history)
  • Refactor QueueServer and TableServer creation to a test factory, allowing centralized configuration.
  • Add steps to the ADO build pipeline to run all tests with --inMemoryPersistence right after the disk-based runs.

Always Add Test Cases

All test cases (except one) work with the --inMemoryPersistence option. The one failing test is skipped when --inMemoryPersistence is enabled; it makes sense that it fails, because it requires reading a checked-in legacy LokiJS disk persistence file.

Design spec

The rendered design spec is available here: https://github.com/joelverhagen/Azurite/blob/in-memory/docs/designs/2023-10-in-memory-persistence.md

For comments, it's available along with the rest of the PR here:
https://github.com/Azure/Azurite/pull/2228/files#diff-638244b70a10f551a0a6a66d691f8d88cf555fa0b9da1710e3d3bac119c05c39

Performance

| Method | Mean | Error | StdDev |
| --- | --- | --- | --- |
| InMemory_Blob | 1,860.0 us | 25.45 us | 21.25 us |
| InMemory_Queue | 1,562.3 us | 26.89 us | 25.15 us |
| InMemory_Table | 975.7 us | 8.65 us | 7.23 us |
| Disk_Blob | 5,505.8 us | 91.37 us | 89.74 us |
| Disk_Queue | 5,327.1 us | 104.47 us | 111.78 us |
| Disk_Table | 963.6 us | 11.34 us | 9.47 us |

Unsurprisingly, blob and queue are much faster without writing extents to disk.

@blueww (Member) commented Oct 16, 2023

@joelverhagen

Thanks for the contribution!
Would you please share the reason for this change?
In which scenarios would we want data cached in memory instead of on disk?

Also, Azurite not only supports saving data on disk; it also allows saving blob metadata in SQL. Will this change also disable saving data in SQL? (If so, the parameter name/description might need to change.)
Besides that, have you tested that all APIs of all services (B/T/Q) work with this change? If we plan to add this, we should have this kind of test environment covered in the pipeline https://github.com/Azure/Azurite/blob/main/azure-pipelines.yml
Is there any limitation (like blob size) for this change? If so, we should mention it in the README.md file.
Could you also share how Azurite handles running out of memory in this case, and what the typical error is?

@joelverhagen (Contributor, author) commented Oct 16, 2023

> Would you please share the reason for this change?

This is intended for CI testing scenarios where Azurite is started at the beginning of the pipeline, some tests are run, and then the CI pipeline ends.

Generally, I find myself using Azurite for cases where I don't care much if the storage is lost; it even annoys me to have to clean up leftover storage containers. Today, in my project NuGet/Insights (which I've used as a general test bed for Azurite as a side goal), I have a step that cleans up containers, queues, and tables based on a test name convention. This is workable, and necessary for tests against real Azure Storage, but for CI pipelines it would be nice to just stop the node process and know all clean-up is done.
https://github.com/NuGet/Insights/blob/5d0b5ed62a71b10d93d77f5135b4876a1b1bf4dc/test/Worker.Logic.Test/TestSupport/TestDataTest.cs#L87-L114
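
(For the curious, that cleanup step boils down to something like this: a minimal sketch using @azure/storage-blob and a hypothetical "t-" container prefix, not the actual NuGet/Insights code.)

```typescript
// Sketch: delete every container whose name follows a test naming convention.
import { BlobServiceClient } from "@azure/storage-blob";

export async function cleanUpTestContainers(
  connectionString: string,
  testPrefix: string // hypothetical convention, e.g. "t-"
): Promise<void> {
  const service = BlobServiceClient.fromConnectionString(connectionString);
  for await (const container of service.listContainers({ prefix: testPrefix })) {
    // Each delete is an extra HTTP round trip; with --inMemoryPersistence you can
    // instead just stop the Azurite process and skip all of this.
    await service.deleteContainer(container.name);
  }
}
```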

Additionally, I've found that some of my tests run surprisingly slowly, and it turned out to be related to how queue messages are persisted in Azurite. A disk write is done per enqueue and dequeue, which means there's a lot more IO than if LokiJS were used to persist message content, since Loki only writes periodically, and this is noticeable in the current perf difference between the Table and Queue write-then-read flows.
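
(If you want to see that cost yourself, here's a rough sketch of timing queue round trips with @azure/storage-queue against a locally running Azurite. Nothing in it is specific to this PR, and the numbers are only meaningful relative to each other.)

```typescript
// Sketch: time N enqueue/dequeue round trips to compare disk vs in-memory persistence.
import { QueueServiceClient } from "@azure/storage-queue";

async function timeQueueRoundTrips(connectionString: string, count: number): Promise<number> {
  const queue = QueueServiceClient.fromConnectionString(connectionString).getQueueClient("perf-test");
  await queue.createIfNotExists();

  const start = Date.now();
  for (let i = 0; i < count; i++) {
    await queue.sendMessage(`message ${i}`); // with disk persistence, each enqueue writes an extent
    const { receivedMessageItems } = await queue.receiveMessages();
    for (const m of receivedMessageItems) {
      await queue.deleteMessage(m.messageId, m.popReceipt); // plus bookkeeping per dequeue
    }
  }
  return Date.now() - start; // total milliseconds for `count` round trips
}
```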

> In which scenarios would we want data cached in memory instead of on disk?

In short, CI runs where we start Azurite, run tests, then (optionally) stop Azurite.

> Also, Azurite not only supports saving data on disk; it also allows saving blob metadata in SQL. Will this change also disable saving data in SQL? (If so, the parameter name/description might need to change.)

It should still work. --inMemoryPersistence does work with SQL, in which case extent metadata is stored in SQL and the extents themselves are stored in memory. A weird combination for sure, but the previous mode should work fine simply by not specifying --inMemoryPersistence. From what I can tell, the FSExtentStore is used for SQL as well, which is also a bit strange to me: it's essentially local extent storage and remote metadata storage. Anyway, happy to adjust the behavior however you'd like.

> Besides that, have you tested that all APIs of all services (B/T/Q) work with this change?

Yes, I did some test hacks to run all the tests I could with this change, but I'm not sure exactly how you'd like to get full coverage. Should we run all tests both with and without the --inMemoryPersistence flag? That would double the number of tests but give very good coverage. Or we could run a subset with --inMemoryPersistence? Some representative subset? Happy to follow a suggestion here, or I can give it a try on my own.

> Besides that, have you tested that all APIs of all services (B/T/Q) work with this change? If we plan to add this, we should have this kind of test environment covered in the pipeline https://github.com/Azure/Azurite/blob/main/azure-pipelines.yml

Great idea! I'll give that a try. Maybe this addresses the previous point.

> Is there any limitation (like blob size) for this change? If so, we should mention it in the README.md file.

I'll take this as a TODO. It's certainly related to the amount of memory available to the node process.
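
One option would be to track the total bytes held by the in-memory store and reject writes past a configurable budget, surfacing a client error instead of letting the process run out of memory. A rough sketch with hypothetical names, not necessarily what this PR ends up shipping:

```typescript
// Sketch: a byte budget for an in-memory chunk store (hypothetical names, not the PR's actual extent store).
class InMemoryChunkStore {
  private chunks = new Map<string, Uint8Array>();
  private totalBytes = 0;

  constructor(private readonly maxBytes: number) {}

  add(id: string, data: Uint8Array): void {
    if (this.totalBytes + data.byteLength > this.maxBytes) {
      // Surfacing this as a client error (e.g. HTTP 409) is friendlier than letting Node hit a real OOM.
      throw new Error(`Cannot add chunk. Size limit of ${this.maxBytes} bytes would be exceeded.`);
    }
    this.chunks.set(id, data);
    this.totalBytes += data.byteLength;
  }

  delete(id: string): void {
    const existing = this.chunks.get(id);
    if (existing !== undefined) {
      this.totalBytes -= existing.byteLength;
      this.chunks.delete(id);
    }
  }
}
```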

> Could you also share how Azurite handles running out of memory in this case, and what the typical error is?

I'll take this as a TODO. I am not sure how Node operates when it's low on memory. My guess is that big allocations can fail and trigger GC, and a severe memory shortage would cause the process to crash.

@joelverhagen (Contributor, author)

> Could you also share how Azurite handles running out of memory in this case, and what the typical error is?

> I'll take this as a TODO. I am not sure how Node operates when it's low on memory. My guess is that big allocations can fail and trigger GC, and a severe memory shortage would cause the process to crash.

I did some testing. The approach is surprisingly scalable: when physical memory is consumed, virtual memory is provided by the OS. I was not able to get any "out of memory" errors. I included a screenshot in the design doc.

joelverhagen marked this pull request as ready for review on October 17, 2023
@blueww (Member) commented Oct 18, 2023

@joelverhagen

Thanks for the contribution!
We may need some time to review/test the code change and the doc.

Will update you later.

@joelverhagen (Contributor, author)

Great, thanks @blueww! Let me know if there are any updates I should make or questions I can answer. I'm happy to jump on a Teams call too if you think that will help ease communication.

@blueww (Member) commented Oct 20, 2023

Hi @joelverhagen,

I have done some sanity checks with the PR:

  1. I tried to upload multiple 3GB blobs to Azurite running with "Azurite --inMemoryPersistence".
  • My machine is Windows 10 with 32GB memory and ~46GB virtual memory.
  • There was no OOM on my machine after I uploaded 16 * 3GB blobs; when I uploaded the 17th 3GB blob, the following error happened and Azurite exited. The behavior will likely differ between OSes.
  • It might not be good to eat almost all memory. Should we provide a way to limit memory usage in this case (like 1/2 of the currently available memory + virtual memory)?
<--- Last few GCs --->

[49684:00000229E11C69C0]   456817 ms: Mark-sweep (reduce) 195.0 (242.0) -> 195.0 (209.5) MB, 123.2 / 0.0 ms  (average mu = 0.033, current mu = 0.025) external memory pressure; GC in old space requested
[49684:00000229E11C69C0]   456893 ms: Mark-sweep (reduce) 195.0 (209.5) -> 195.0 (208.5) MB, 75.7 / 0.0 ms  (average mu = 0.019, current mu = 0.000) external memory pressure; GC in old space requested


<--- JS stacktrace --->

FATAL ERROR: Committing semi space failed. Allocation failed - JavaScript heap out of memory
 1: 00007FF6F8A4D51F node_api_throw_syntax_error+175743
 2: 00007FF6F89D2E66 v8::internal::wasm::WasmCode::safepoint_table_offset+59654
 3: 00007FF6F89D4B72 v8::internal::wasm::WasmCode::safepoint_table_offset+67090
 4: 00007FF6F947AAB4 v8::Isolate::ReportExternalAllocationLimitReached+116
 5: 00007FF6F9465E12 v8::Isolate::Exit+674
 6: 00007FF6F92E7C6C v8::internal::EmbedderStackStateScope::ExplicitScopeForTesting+124
 7: 00007FF6F92F20CC v8::internal::Heap::PageFlagsAreConsistent+3612
 8: 00007FF6F92E4707 v8::internal::Heap::CollectGarbage+2039
 9: 00007FF6F92FB0C3 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath+2099
10: 00007FF6F92FB96D v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath+93
11: 00007FF6F930B1A3 v8::internal::Factory::NewFillerObject+851
12: 00007FF6F8FFC285 v8::internal::DateCache::Weekday+1349
13: 00007FF6F9518051 v8::internal::SetupIsolateDelegate::SetupHeap+558193
14: 00007FF6F94CEA38 v8::internal::SetupIsolateDelegate::SetupHeap+257624
15: 00007FF67977349A
  2. After uploading 11 * 3GB blobs, when Azurite exits (with Ctrl+C), it takes > 2 minutes to release all the used memory and exit. Could we accelerate this? Or at least we should mention it in the doc so everyone knows about it. I'm also not sure whether anything bad will happen if Azurite is killed before it fully exits.

  3. The debug log seems to disappear: when running Azurite with "Azurite --inMemoryPersistence --debug debug.log", the debug log is not generated in the current folder.

  • Without "--inMemoryPersistence", the debug log is generated in the current folder.
  • If the full path of the debug log is specified, like "--debug c:\temp\debug.log", the debug log is also generated.
  4. To keep data consistent, we should not allow users to start Azurite with metadata saved to SQL (when the environment variable "AZURITE_DB" is set) or with the location parameter ("--location" or "-l") specified; or at least we should ignore the SQL and location settings with a warning message.

(Two review threads on README.md: outdated, resolved)
@joelverhagen (Contributor, author)

Hey @blueww! I believe I have addressed all of your comments. Please let me know if more is needed 😃

@blueww (Member) commented Nov 13, 2023

@joelverhagen

I have tested repeatedly uploading and deleting blobs with Azurite running with "--inMemoryPersistence", but the result is not what I expected: it looks like deleted blobs are not cleared from memory. Would you please help take a look?

Test Scenario:

Upload 10 * 100MB blobs, then delete all 10 blobs. Then upload 10 * 100MB blobs again, and delete them again, and so on.
I tried to repeat this 1000 times, but it failed after ~15 iterations.

Test Result:

After uploading/deleting 10 * 100MB blobs for ~15 iterations, I hit the following error:

Set-AzStorageBlobContent : Cannot add an extent chunk to the in-memory store. Size limit of 17030338560 bytes will be exceeded HTTP Status Code: 409 - HTTP Error Message: Cannot add an extent chunk to the in-memory store. Size limit of 17030338560 bytes will be exceeded
ErrorCode: MemoryExtentStoreAtSizeLimit
ErrorMessage: Cannot add an extent chunk to the in-memory store. Size limit of 17030338560 bytes will be exceeded
RequestId:53834723-e10f-48f9-aa9d-35330109a991
Time:2023-11-13T08:33:50.957Z
At line:6 char:9
+         Set-AzStorageBlobContent -Container $containerName -Blob "tes ...
+         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Set-AzStorageBlobContent], StorageException
    + FullyQualifiedErrorId : StorageException,Microsoft.WindowsAzure.Commands.Storage.Blob.SetAzureBlobContentCommand

The memory usage chart (I waited > 15 minutes after the error happened):
[screenshot: memory usage chart]

For your reference, here is my test script. I run it from PowerShell; the Az.Storage module needs to be installed first.

$ctx = New-AzStorageContext -ConnectionString "AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;DefaultEndpointsProtocol=http;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;TableEndpoint=http://127.0.0.1:10002/devstoreaccount1;"

$containerName = "test"
New-AzStorageContainer -Name $containerName -Context $ctx

for ($i =0 ; $i -le 1000; $i++ )
{
    Echo "create Blobs"
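    # testfile_102400K_0 is presumably a pre-created ~100MB (102,400 KB) test file; create it before running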
    for ($j=0 ; $j -le 10; $j++)
    {
        Set-AzStorageBlobContent -Container $containerName -Blob "testblob_$($i)_$($j)" -File C:\temp\testfile_102400K_0 -Context $ctx -Force
    }
    echo "Delete blobs"
    for ($j=0 ; $j -le 10; $j++)
    {
        Remove-AzStorageBlob -Container $containerName -Blob "testblob_$($i)_$($j)"  -Context $ctx -PassThru -Force 
    }
}

@joelverhagen (Contributor, author)

@blueww - the default blob GC interval is 10 minutes (`export const DEFAULT_GC_INTERVAL_MS = 10 * 60 * 1000;`). From your memory chart, I see growth in memory starting only 3-4 minutes after the process start. It is expected to hit this error if you reach the limit before the BlobGCManager runs. If you attempt the upload again more than 10 minutes after process start, you should see success. I see a spike in CPU right around the 10 minute mark, so I'm guessing the blob GC actually happened, but your uploads did not continue after you hit the error (around the 3-4 minute mark).

Regarding the memory consumption, I wasn't able to reproduce this behavior. I changed the GC period to 5 minutes to ease testing and set an extent limit of 5 GB.
[screenshot: memory usage during the repro attempt]
(I am not sure what tool you used, but it looks like a visualization of system perf counters.)

A debug log will shed light on this. You'll want to look for a line like this:

info: BlobGCManager:markSweep() Deleted unreferenced 650 extents, after excluding active write extents.

It's possible that Node isn't releasing the memory even after the blob GC occurs. I am not an expert in Node.js GC, but after a bit of reading it seems like it could be related to the default max-old-space-size value selected on your machine. It seems Node.js won't perform some of the more expensive GC steps if the current memory consumption is lower than this threshold, which may explain the behavior you're seeing. The part of the code that is impacted by this PR is the integration with the blob and queue GC. If you see a markSweep log line similar to the above, then it's working as I designed it. That being said, if you provide a debug log it might help me understand what you're seeing.
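
(Conceptually, the mark-sweep pass is just this; a simplified sketch with hypothetical names, not the actual BlobGCManager code:)

```typescript
// Sketch: mark-sweep over extents, deleting anything no blob (or in-flight write) still references.
interface ExtentStore {
  listExtentIds(): Promise<string[]>;
  deleteExtents(ids: string[]): Promise<void>;
}

interface MetadataStore {
  listReferencedExtentIds(): Promise<Set<string>>; // extents still referenced by blob metadata
}

async function markSweep(
  extents: ExtentStore,
  metadata: MetadataStore,
  activeWriteExtentIds: Set<string>
): Promise<number> {
  const referenced = await metadata.listReferencedExtentIds(); // mark
  const all = await extents.listExtentIds();
  const unreferenced = all.filter(
    (id) => !referenced.has(id) && !activeWriteExtentIds.has(id) // keep in-flight writes
  );
  await extents.deleteExtents(unreferenced); // sweep
  return unreferenced.length; // what the "Deleted unreferenced N extents" log line reports
}
```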

You could also run a similar test against the existing disk-based persistence. I tried with a ramdisk on Ubuntu WSL (steps: https://askubuntu.com/a/453755) and similarly encountered failures (HTTP 500 rather than 409, of course) at my 5 GB ramdisk limit. However, in that case the blob GC didn't find any extents at all. I tried many versions of Azurite (historical tags like v3.10.0) and saw the same condition:

info: BlobGCManager:markSweep() Got 0 extents.

So an alternate explanation is that there is an existing bug in the blob GC manager that you encountered. But we need a debug log to know.

@blueww (Member) commented Nov 14, 2023

@joelverhagen

Thanks for the investigation!

I used the Perfmon tool to chart memory/CPU.

I have tested the following 3 scenarios, and this time it looks like GC works in all of them.
And I do see the following debug log when GC happens:

info: BlobGCManager:markSweep() Deleted unreferenced 2106 extents, after excluding active write extents.

Besides that, to avoid customer confusion about why deleted blob content still occupies memory, do you think we should mention how GC works in the README.md section "Use in-memory storage"?

Scenario 1

  1. Upload 100 * 100MB blobs, then delete them.
  2. Wait for 10 minutes.
  3. Repeat steps 1 and 2.

[screenshot: memory usage for scenario 1]

Scenario 2

  1. Upload 200 * 100MB blobs (after uploading ~150 blobs, an OOM error occurs) and delete them.
  2. Wait 10 minutes.
  3. Repeat steps 1 and 2.

[screenshot: memory usage for scenario 2]

Scenario 3

  1. Upload 10 * 100MB blobs and delete them.
  2. Repeat step 1 for 20 times (after repeating ~15 times, an OOM error occurs).
  3. Wait 1000 seconds.
  4. Repeat steps 1-3.

We can see that "Private Bytes" doesn't drop immediately after GC, which is different from the other 2 scenarios. But it does drop at the beginning of the next round of blob uploads.

[screenshot: memory usage for scenario 3]

@joelverhagen (Contributor, author)

> Besides that, to avoid customer confusion about why deleted blob content still occupies memory, do you think we should mention how GC works in the README.md section "Use in-memory storage"?

Sure, I'll do that.

> We can see that "Private Bytes" doesn't drop immediately after GC, which is different from the other 2 scenarios. But it does drop at the beginning of the next round of blob uploads.

Generally, I've seen this behavior in several GC'd runtimes. In my head it's basically "GC sometimes does the hard work only when memory pressure is faced" (i.e. lazily), which in this case is when the next round of blobs comes in. But this is my gut check, not specific expertise on the Node GC.

@joelverhagen (Contributor, author)

I've updated the README with these details.

@blueww (Member) commented Nov 15, 2023

@joelverhagen

Thanks for the quick update!
I will do a final review, and we should be able to merge this PR soon.

blueww merged commit 813950d into Azure:main on Nov 15, 2023
35 checks passed
joelverhagen deleted the in-memory branch on November 15, 2023
@edwin-huber (Collaborator)

Fixes #1331
Fixes #1332
