Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade WindowsAzure.Storage to Microsoft.Azure.Cosmos.Table and Asure.Storage.Blobs #151

Merged
merged 2 commits into from Apr 13, 2021

Conversation

Arkatufus
Copy link
Contributor

No description provided.

true,
BlobListingDetails.Metadata, null, null, requestOptions, new OperationContext(), cts.Token);
}
var results = Container.GetBlobsAsync(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get blob info list IAsyncEnumerator


async Task<SelectedSnapshot> FilterAndFetch(BlobResultSegment segment)
{
if (!await pageEnumerator.MoveNextAsync())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're only interested on the first page, will not retrieve the whole list of blobs

async Task<SelectedSnapshot> FilterAndFetch(BlobResultSegment segment)
{
if (!await pageEnumerator.MoveNextAsync())
return null;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and if that fails, just return immediately

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

.Where(x => FilterBlobTimestamp(criteria, x));

var deleteTasks = new List<Task>();
await foreach (var blob in filtered.WithCancellation(cts.Token))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace the old paging code with the one provided by the SDK

/// <summary>
/// Provides helpers to validate resource names across the Microsoft Azure Storage Services.
/// </summary>
public static class NameValidator
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest Blob SDK doesn't have a name validator, so swiping it from the old SDK code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just left you one question about the blob public access settings, which you can address in a follow-on PR

@@ -5,6 +5,7 @@
<PropertyGroup>
<TargetFramework>$(NetStandardLibVersion)</TargetFramework>
<Description>Akka.Persistence support for Windows Azure Table storage and Azure blob storage.</Description>
<LangVersion>8.0</LangVersion>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}

return containerRef;
var response = await blobClient.CreateIfNotExistsAsync(
PublicAccessType.BlobContainer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this affect whether the blob is accessible to the public or is this just weird naming on the part of the Azure SDK?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the documentation:

Azure Storage supports optional anonymous public read access for containers and blobs. By default, anonymous access to your data is never permitted. Unless you explicitly enable anonymous access, all requests to a container and its blobs must be authorized. When you configure a container's public access level setting to permit anonymous access, clients can read data in that container without authorizing the request.

So the code allows access to the blobs and containers to anonymous connections.
Its there to replace the original code await containerRef.CreateIfNotExistsAsync(BlobContainerPublicAccessType.Container, new BlobRequestOptions(), op, cts.Token) which does the exact same thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we just merge a PR in a few days ago that explicitly disabled this (because disallowing anonymous read access to application data by default makes sense) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see that merge, can you link the PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was this issue here: #144

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like they never changed the default setting when auto-initialize = on, but it shouldn't be public by default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets address that in another PR

async Task<SelectedSnapshot> FilterAndFetch(BlobResultSegment segment)
{
if (!await pageEnumerator.MoveNextAsync())
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

/// <summary>
/// Provides helpers to validate resource names across the Microsoft Azure Storage Services.
/// </summary>
public static class NameValidator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

@Aaronontheweb
Copy link
Member

The runtime of the test suite increased from about 2.5 minutes on Linux to 3.25 minutes, 3.5 minutes on Linux to 5.25 minutes here. Is there anything this SDK changes, or the way we're using it, that would have made the queries run for much longer than previously?

@Aaronontheweb Aaronontheweb self-requested a review April 10, 2021 15:30
Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want some comments around the performance / runtime addressed before merging in

@Arkatufus
Copy link
Contributor Author

Since everything is handled by the API IAsyncEnumerator, I trully have no idea. The new API really limits what you can use to access the storage.

@Arkatufus
Copy link
Contributor Author

Can we pin-point the cause of the performance degradation?

@Aaronontheweb
Copy link
Member

Can we pin-point the cause of the performance degradation?

I'm just looking at the data from Azure Pipelines. Can you investigate?

@Arkatufus
Copy link
Contributor Author

I'll try and give it a look-see

@Arkatufus
Copy link
Contributor Author

Oh, the performance test only tests for journaling, which uses Azure Tables, which is upgraded to Microsoft.Azure.Cosmos.Table, which is a 1 to 1 replacement of Microsoft.WindowsAzure.Storage for Azure Tables.
None of our codes were changed, all that was needed was to replace the old namespace to the new one.
All performance degradation was caused by the underlying API.

There is a replacement for Microsoft.Azure.Cosmos.Table called Azure.Data.Tables, but it is still in beta state.

@Aaronontheweb
Copy link
Member

Ok, we'll see if anything comes up from user reports down the stretch.

@Aaronontheweb Aaronontheweb merged commit ddf9ca3 into petabridge:dev Apr 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants