
Added Azure SAS Token Authentication #812

Conversation

jonaskaltenbachz
Contributor

This pull request makes it possible to authenticate with Azure using a SAS token. For this, the env var AZURE_STORAGE_SAS_URL must be set.

If no AZURE_STORAGE_KEY is set, a SAS token is requested from the AZURE_STORAGE_SAS_URL. The SAS token is initially requested in the NewAzureService func. Additionally, the NewBlob func checks if the token is still valid and fetches a new one if necessary.

If AZURE_STORAGE_KEY is set, the key is used for authentication.
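
For illustration, here is a minimal sketch of such a validity check (the helper name sasTokenExpired is ours, not necessarily the PR's code): a SAS token carries its expiry in the `se` (signed expiry) query parameter, which can be compared against the current time before each use.

// Illustrative sketch; requires "net/url", "strings" and "time".
// Returns true if the SAS token's `se` parameter lies in the past.
func sasTokenExpired(sasToken string) bool {
    values, err := url.ParseQuery(strings.TrimPrefix(sasToken, "?"))
    if err != nil {
        return true // treat unparsable tokens as expired
    }
    expiry, err := time.Parse(time.RFC3339, values.Get("se"))
    if err != nil {
        return true
    }
    return time.Now().After(expiry)
}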

@Acconut
Member

Acconut commented Sep 19, 2022

Thank you for this PR, but I am not sure if this is a good fit inside tusd. Is it a common pattern in Azure deployments that the application continuously requests new SAS tokens? If so, does the Azure SDK offer this integration on its own? Then we would not have to maintain this logic inside tusd.

Maybe @omBratteng can also shed some more light on this.

@jonaskaltenbachz
Contributor Author

jonaskaltenbachz commented Sep 21, 2022

In our case, we have implemented an Azure Function that issues the SAS token, so when the token expires, tusd can fetch a new SAS token from the Azure Function.
tusd therefore does not need to maintain the SAS token provisioning itself. It only has to check before each upload whether the SAS token is still valid and, if not, call the AZURE_STORAGE_SAS_URL.

Does this answer your question?

@jonaskaltenbachz
Contributor Author

Hello,
I just wanted to ask if there is an update on this topic.

Best regards,
Jonas

@omBratteng
Contributor

Sorry, I have been quite swamped lately. I'll try to take a look at it this week.

@Acconut
Member

Acconut commented Oct 18, 2022

Therefore within tusd it is not necessary to maintain the SAS token provision.

What exactly does this mean? What would a SAS token provision be?

Basically, to me this looks like a feature that is only helpful for your own setup and might not be useful for anybody else, and I am concerned about adding functionality that only benefits a single user. I might be wrong, as I don't use Azure, but that is the reason for my hesitation and why I asked about alternative approaches here. I hope you understand.

@jonaskaltenbachz
Contributor Author

Thank you for your reply. I understand your questions. A shared access signature (SAS token) provides secure delegated access to resources in your storage account. With a SAS, you have granular control over how a client can access your data. The SAS tokens are provided by Azure (see: https://learn.microsoft.com/en-us/rest/api/storageservices/create-account-sas).
The URL that can be specified with the new env var AZURE_STORAGE_SAS_URL delivers this token to tusd.

Therefore the new feature can be used by anyone who wants to authenticate against the Azure storage account with a SAS and not with the account key.

@Acconut
Member

Acconut commented Oct 29, 2022

Therefore the new feature can be used by anyone who wants to authenticate against the Azure storage account with a SAS and not with the account key.

I see that, yes. It would just be nice to hear from another Azure-experienced person if this approach would be useful for others as well (maybe from @omBratteng if he has some time).

Additionally, the NewBlob func checks if the token is still valid and fetches a new one if necessary.

Does this mean that the check is only performed if we want to upload data from a PATCH request? Shouldn't this check be included in every API request to Azure? Otherwise we might issue a request with an outdated token.

@jonaskaltenbachz
Contributor Author

jonaskaltenbachz commented Nov 2, 2022

Does this mean that the check is only performed if we want to upload data from a PATCH request? Shouldn't this check be included in every API request to Azure? Otherwise we might issue a request with an outdated token.

Where are the other API requests to Azure? In our tests it always worked with an outdated token (because a new one was fetched).

@Acconut
Member

Acconut commented Nov 5, 2022

Ok, I see. NewBlob is called for NewUpload and GetUpload, so we should be fine indeed on that front! Please ignore my last comment.

@omBratteng
Contributor

I'm unsure if this would benefit anyone else. I am not familiar with anyone else solving it like this.

@jonaskaltenbachz
Contributor Author

Hello,

okay, I see your point. We have considered the following solution as an alternative:

Basically, for our solution we need a way to set the credentials to NewAnonymousCredential and attach a SAS token or presigned URL to the containerURL.

We could also inject SAS tokens via a new "NewBlockBlobURL" callback function, which is passed as a parameter from the CLI to the AzureService. The complete logic of how the SAS token or any other presigned URL is loaded is then controlled by this new function, which is defined outside of the tusd package.

The anonymous credentials would be used if no AZURE_STORAGE_KEY is available.

The following is a prototype of our new solution approach:

import (
    "context"
    "fmt"
    "net/url"
    "strings"

    "github.com/Azure/azure-storage-blob-go/azblob"
)

// NewBlockBlobUrlFunc builds the URL for a block blob. A custom
// implementation can attach a SAS token or other presigned URL.
type NewBlockBlobUrlFunc func(ctx context.Context, name string) (azblob.BlockBlobURL, error)

type azService struct {
    BlobAccessTier  azblob.AccessTierType
    ContainerURL    *azblob.ContainerURL
    ContainerName   string
    Endpoint        string
    NewBlockBlobURL NewBlockBlobUrlFunc // Custom method to build the blob URL
}

type AzConfig struct {
    AccountName         string
    AccountKey          string
    BlobAccessTier      string
    ContainerName       string
    ContainerAccessType string
    Endpoint            string
    NewBlockBlobURL     NewBlockBlobUrlFunc // Custom method to build the blob URL
}

func NewAzureService(config *AzConfig) (AzService, error) {
    var credential azblob.Credential
    var credError error

    if config.AccountKey != "" {
        credential, credError = azblob.NewSharedKeyCredential(config.AccountName, config.AccountKey)
        if credError != nil {
            return nil, credError
        }
    } else {
        // Use anonymous credentials if no AccountKey is set.
        credential = azblob.NewAnonymousCredential()
    }

    cURL, err := url.Parse(fmt.Sprintf("%s/%s", config.Endpoint, config.ContainerName))
    if err != nil {
        return nil, err
    }

    pipeline := azblob.NewPipeline(credential, azblob.PipelineOptions{})
    containerURL := azblob.NewContainerURL(*cURL, pipeline)

    // Fall back to the default URL builder if no custom callback was passed in.
    newBlockBlobUrl := config.NewBlockBlobURL
    if newBlockBlobUrl == nil {
        newBlockBlobUrl = func(ctx context.Context, name string) (azblob.BlockBlobURL, error) {
            return containerURL.NewBlockBlobURL(name), nil
        }
    }

    // ... (parse blobAccessTierType, create the container, etc., as before)

    return &azService{
        BlobAccessTier:  blobAccessTierType,
        ContainerURL:    &containerURL,
        ContainerName:   config.ContainerName,
        Endpoint:        config.Endpoint,
        NewBlockBlobURL: newBlockBlobUrl,
    }, nil
}

func (service *azService) NewBlob(ctx context.Context, name string) (AzBlob, error) {
    bb, err := service.NewBlockBlobURL(ctx, name)
    if err != nil {
        return nil, err
    }

    var fileBlob AzBlob
    if strings.HasSuffix(name, InfoBlobSuffix) {
        fileBlob = &InfoBlob{
            Blob: &bb,
        }
    } else {
        fileBlob = &BlockBlob{
            Blob:       &bb,
            Indexes:    []int{},
            AccessTier: service.BlobAccessTier,
        }
    }
    return fileBlob, nil
}

Is this a solution that you would support? If so, we would adjust our pull request.

Best regards,
Jonas

@omBratteng
Contributor

Seeing as I rarely have time for it, would you be open to creating a new PR where you replace the deprecated azure-storage-blob-go with the new azblob from the azure-sdk-for-go package?
I think it would have better support for SAS tokens.

@jonaskaltenbachz
Contributor Author

Hi @omBratteng and @Acconut

We took a look at the new azure-sdk-for-go package. Unfortunately, it does not solve our problem directly, since the SAS URL still has to be passed in from the outside. However, we can imagine the following solution:

  1. We replace the deprecated azure-storage-blob-go with the new azblob from the azure-sdk-for-go package.
  2. Previously, a containerURL was created in the azureservice and used for further communication with the Azure container.
  3. This will be removed and replaced with an instance of the new azblob.Client. See also: https://github.com/Azure/azure-sdk-for-go/tree/main/sdk/storage/azblob
  4. For authentication based on the account key, as it works now, this client is created in the NewAzureService function
    4.1. via the azblob.NewClientWithSharedKeyCredential function.
    4.2. See also: https://github.com/Azure/azure-sdk-for-go/blob/main/sdk/storage/azblob/examples_test.go
    4.3. Thus, nothing changes from the user's perspective in this case.
  5. However, if the authentication is to be based on SAS tokens, the azblob.NewClientWithNoCredential function must be used.
    5.1. See also: https://github.com/Azure/azure-sdk-for-go/blob/main/sdk/storage/azblob/sas/examples_test.go
    5.2. But here we have a challenge: the SAS URL must be passed to the azblob.NewClientWithNoCredential function, and the SAS URL needed here is dynamic and looks different for each user, so it cannot be statically defined in the azureservice.
    5.3. So here we come back to the solution we already presented on November 15. Our idea is to extend the azureservice with a NewBlockBlobUrlFunc parameter. This parameter is a callback function which generates the SAS URL.
    5.4. The azureservice uses this function to get a URL with a SAS token whenever a new client is created with the azblob.NewClientWithNoCredential function. This happens initially in the NewAzureService function and again whenever the SAS token has expired and the client must be renewed (see the sketch below).
    5.5. The callback function is passed as a parameter from the CLI to the AzureService. The complete logic of how the SAS token or any other presigned URL is loaded is then controlled by this new function, which is defined outside of the tusd package.

What do you think about these approaches? In principle, the described solution would also be possible with the old azure-storage-blob-go package (see the comment from November 15). But if we could address our need and contribute to the repo in general by switching to the new azure-sdk-for-go package, we'd be happy to do that too.
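
For illustration, here is a minimal sketch of steps 5.3–5.5 (the names FetchSASURLFunc and refreshClient are ours, not an existing API; azblob refers to github.com/Azure/azure-sdk-for-go/sdk/storage/azblob):

// Hypothetical callback that delivers a fresh presigned service URL,
// e.g. by calling an Azure Function behind AZURE_STORAGE_SAS_URL.
type FetchSASURLFunc func(ctx context.Context) (string, error)

// Rebuild the azblob client from a freshly fetched SAS URL. The SAS token
// embedded in the URL carries the authentication, so no credential object
// is needed.
func refreshClient(ctx context.Context, fetchSASURL FetchSASURLFunc) (*azblob.Client, error) {
    sasURL, err := fetchSASURL(ctx)
    if err != nil {
        return nil, err
    }
    return azblob.NewClientWithNoCredential(sasURL, nil)
}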

@jonaskaltenbachz
Contributor Author

jonaskaltenbachz commented Jan 4, 2023

Hi @Acconut
I just wanted to ask whether you have already found time to look at my last message.

@OlafHaalstra

Dear all,

As Microsoft states in their documentation: "With a SAS, you have granular control over how a client can access your data."

The real benefit of using a SAS, for me, would be to use it instead of the AZURE_STORAGE_KEY, since if that key gets compromised, an attacker has full control over the Azure Blob Storage resource. A SAS could be configured in such a way that it only allows for:

  1. Uploading & reading files, denying any modification of the resource
  2. Performing actions from a bound IP (in this case the static IP of the server that runs tusd)

Especially the first point is of real added value (see the sketch below); ideally you would not even allow reading, but this is necessary for tusd to work with the .info file. This could perhaps be circumvented in the future by:

  • Allowing the .info file to be stored on local disk, while the 'real' file is sent directly to Azure
  • Once the upload is complete, uploading the .info file as well so that all required metadata is stored on Azure
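
To make the first point concrete, here is a sketch (ours, with illustrative names) of how a token service could mint such a restricted SAS using the current azure-storage-blob-go SDK:

// Illustrative: mint a short-lived container SAS limited to read/write
// and bound to a single client IP; requires "net", "time" and azblob
// (github.com/Azure/azure-storage-blob-go/azblob).
func mintContainerSAS(cred *azblob.SharedKeyCredential, container, clientIP string) (string, error) {
    sasParams, err := azblob.BlobSASSignatureValues{
        Protocol:      azblob.SASProtocolHTTPS,
        ExpiryTime:    time.Now().UTC().Add(15 * time.Minute),
        ContainerName: container,
        // Read is required so tusd can access the .info blobs.
        Permissions: azblob.ContainerSASPermissions{Read: true, Write: true}.String(),
        // Bind the token to the server's static IP.
        IPRange: azblob.IPRange{Start: net.ParseIP(clientIP)},
    }.NewSASQueryParameters(cred)
    if err != nil {
        return "", err
    }
    return sasParams.Encode(), nil
}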

@jonaskaltenbachz
Contributor Author

Hello everyone,

The following picture describes the advantages that SAS-token-based authentication in tusd would have for us and others.

[Image: diagram of an edge device running tusd that obtains SAS tokens instead of storing an AZURE_STORAGE_KEY]

In this case there is no need to have an AZURE_STORAGE_KEY on the edge device. The SAS token is obtained via the NewBlockBlobUrlFunc parameter (or, as originally planned, via the AZURE_STORAGE_SAS_URL env var).

As @OlafHaalstra also mentioned, the SAS token can be limited to read and write permissions on the storage account (and, if desired, to a specific container in the storage account). Furthermore, the SAS token is only valid for a certain time and is bound to the IP address of the edge device.
All these points result in much higher security than embedding an AZURE_STORAGE_KEY in tusd on the edge device. Therefore I think that many users could benefit from this.

In general, this solution would also work for other cloud providers (e.g. AWS with presigned URLs).

@omBratteng
Contributor

@Acconut could this be a feature in tusd itself that each provider could hook onto, like a standard method of fetching temporary credentials before a new upload? Then each provider just hooks into that with its custom method of fetching credentials.

I am not familiar with the NewClientWithNoCredential method; by its name, it sounds "scary", as if anyone could upload.

And I agree that a SAS token is much more secure. We use something similar in our pipeline when downloading blobs/containers, where we generate a SAS token for the specific blob/container and then forward it to the downloader. But that is written in Python.

Ideally, the provider would be upgraded to use the new azblob from the azure-sdk-for-go package.
Then we could auth with:

  1. Connection string env variable with key and secret
  2. Account key and secret variable
  3. Your proposed method

Unsure which order would be preferred.

@quality-leftovers

quality-leftovers commented Jan 23, 2023

@omBratteng
The NewClientWithNoCredential method just constructs a client that does not automatically add any authentication information to requests made with it. Since there is no "anonymous" upload in Azure Blob Storage, this basically means the client can be used for:

  • downloading from blobs that allow anonymous access
  • performing upload requests using URLs where the authentication information (SAS token) was added "manually"

An auth provider would be nice, but I wonder what the signature/hook/callback would look like, since it is different for each store. Maybe it would also be possible to extend FileInfo with an additional Authentication or Settings map that could be set in the PreUploadCreateCallback when a file is POSTed. I am not sure how one would deal with HeadFile and PatchFile, since both call handler.DataStore.GetStorage, and the az and file stores already perform reads there (unlike s3store and gcsstore). Also, GetUploadInfo would need the auth info, and since it returns the FileInfo, that makes it a bit clumsy.

In general, I like the idea of having the ability to add SAS tokens and of the hook not being called from the store.

@Acconut
Member

Acconut commented Mar 26, 2023

Apologies for my delay. Azure is very much out of my comfort zone, so this is all very foreign to me. However, I no longer see any issue with adding SAS support as you described in #812 (comment), @jonaskaltenbachz. I think it would make sense to upgrade to the new azblob before adding the support, as you mentioned in your comment. Let me know if you are still interested in pursuing this :)

@Acconut could it be a feature in tusd itself, that each provider could hook onto. Like a standard method of fetching temporary auth before a new upload? Then the provider just hooks into that, with its custom methods of fetching credentials?

For S3 at least, I am not sure if there is something similar to SAS. As mentioned by others, I would be hesitant to define an interface because it seems to vary a lot across storage providers. So let's avoid a common interface for now.

@Acconut deleted the branch tus:v2 on September 6, 2023
@Acconut closed this on September 6, 2023