Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Azure Storage client to 4.0.0 #16084

Merged

Conversation

Projects
None yet
3 participants
@dadoonet
Copy link
Member

commented Jan 19, 2016

We are using 2.0.0 today but Azure team now recommends:

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>4.0.0</version>
</dependency>

This new version fix the timeout issues we have seen with azure storage although #15080 adds a timeout support.
Azure storage client 2.0.0 was not passing correctly this value when it was calling Azure services.

Note that the timeout is a server side timeout and not client side timeout.
It means that it will raise only a timeout when:

  • upload of blob is complete
  • if azure service is not able to process the blob (and store it) within a given time range.

In which case it will raise an exception which elasticsearch can deal with:

java.io.IOException
    at __randomizedtesting.SeedInfo.seed([91BC11AEF16E073F:6886FA5308FCE4D8]:0)
    at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
    at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.azure.storage.StorageException: Operation could not be completed within the specified time.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:89)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:175)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
    ... 9 more

The following code was used to test this against Azure platform:

public void testDumb() throws URISyntaxException, StorageException, IOException, InvalidKeyException {
    String connectionString = "MY-AZURE-STRING";

    CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
    CloudBlobClient client = storageAccount.createCloudBlobClient();
    client.getDefaultRequestOptions().setTimeoutIntervalInMs(1000);
    CloudBlobContainer container = client.getContainerReference("dumb");
    container.createIfNotExists();
    CloudBlockBlob blob = container.getBlockBlobReference("blob");

    File sourceFile = File.createTempFile("sourceFile", ".tmp");

    try {
        int fileSize = 10000000;

        byte[] buffer = new byte[fileSize];
        Random random = new Random();
        random.nextBytes(buffer);

        logger.info("Generate local file");
        FileOutputStream fos = new FileOutputStream(sourceFile);
        fos.write(buffer);
        fos.close();
        logger.info("End generate local file");

        FileInputStream fis = new FileInputStream(sourceFile);

        logger.info("Start uploading");
        blob.upload(fis, fileSize);
        logger.info("End uploading");

    }
    finally {
        if (sourceFile.exists()) {
            sourceFile.delete();
        }
    }
}

With 2.0.0, the above code was not raising any exception. With 4.0.0, the exception is now thrown correctly.

Closes #12567.

Release notes from 2.0.0:

  • Removed deprecated table AtomPub support.
  • Removed deprecated constructors which take service clients in favor of constructors which take credentials.
  • Added support for "Add" permissions on Blob SAS.
  • Added support for "Create" permissions on Blob and File SAS.
  • Added support for IP Restricted SAS and Protocol SAS.
  • Added support for Account SAS to all services.
  • Added support for Minute and Hour Metrics to FileServiceProperties and added support for File Metrics to CloudAnalyticsClient.
  • Removed deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
  • Removed deprecated Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
  • Fixed a bug in table where a select on a non-existent field resulted in a null reference exception if the corresponding field in the TableEntity was not nullable.
  • Fixed a bug in table where JsonParser was automatically closing the response stream before it was completely drained causing socket exhaustion.
  • Fixed a bug in StorageCredentialsAccountAndKey.updateKey(String) which prevented valid keys from being set.
  • Added CloudBlobContainer.listBlobs(final String, final boolean) method.
  • Fixed a bug in blob where using AccessConditions on block blob uploads larger than 64MB done with the upload* methods or block blob uploads done openOutputStream with would fail if the blob did not already exist.
  • Added support for setting a proxy per request. Proxy can be set on an OperationContext instance and will be used when that instance is passed to the request method.
  • Added support for SAS to the Azure File service.
  • Added support for Append Blob.
  • Added support for Access Control Lists (ACL) to File Shares.
  • Added support for getting and setting of CORS rules to File service.
  • Added support for ShareStats to File Shares.
  • Added support for copying an Azure File to another Azure File or a Block Blob asynchronously, and aborting Azure File copy operations asynchronously.
  • Added support for copying a Blob to an Azure File asynchronously.
  • Added support for setting a maximum quota property on a File Share.
  • Removed deprecated AuthenticationScheme and its getter and setter. In the future only SharedKey will be used.
  • Removed deprecated getter/setters for all request option properties on the service clients. Please use the default request options getter/setters instead.
  • Removed getSubDirectoryReference() for blob directories and file directories. Use getDirectoryReference() instead.
  • Removed getEntityClass() in TableQuery. Please use getClazzType() instead.
  • Added client-side verification for lease duration and break periods.
  • Deprecated the setters in table for timestamp as this property is only modifiable by the service.
  • Deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
  • Deprecated the Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
  • Deprecated constructors which take service clients in favor of constructors which take credentials.
  • Fixed a bug where the DateBackwardCompatibility flag was not applied if set on the CloudTableClient default request options.
  • Changed library behavior to retry all exceptions thrown when parsing a response object.
  • Changed behavior to stop removing query parameters passed in with the resource URI if that URI contains a SAS token. Some query parameters such as comp, restype, snapshot and api-version will still be removed.
  • Added support for logging StringToSign to SharedKey and SAS.
  • Added a connect timeout to prevent hangs when establishing the network connection.
  • Made performance enhancements to the BlobOutputStream class.
  • Fixed a bug where maximum execution time was ignored for file, queue, and table services.
  • Changed the socket timeout to be set to the service side timeout plus 5 minutes when maximum execution time is not set.
  • Changed the socket timeout to default to 5 minutes rather than infinite when neither service side timeout or maximum execution time are set.
  • Fixed a bug where MD5 was calculated for commitBlockList even though UseTransactionalMD5 was set to false.
  • Fixed a bug where selecting fields that did not exist returned an error rather than an EntityProperty with a null value.
  • Fixed a bug where table entities with a single quote in their partition or row key could be inserted but not operated on in any other way.
  • Fixed a bug for all listing API's where next() would sometimes throw an exception if hasNext() had not been called even if there were more elements to iterate on.
  • Added sequence number to the blob properties. This is populated for page blobs.
  • Creating a page blob sets its length property.
  • Added support for page blob sequence numbers and sequence number access conditions.
  • Fixed a bug in abort copy where the lease access condition was not sent to the service.
  • Fixed an issue in startCopyFromBlob where if the URI of the source blob contained certain non-ASCII characters they would not be encoded appropriately. This would result in Authorization failures.
  • Fixed a small performance issue in XML serialization.
  • Fixed a bug in BlobOutputStream and FileOutputStream where flush added data to a request pool rather than immediately committing it to the Azure service.
  • Refactored to remove the blob, queue, and file package dependency on table in the error handling code.
  • Added additional client-side logging for REST requests, responses, and errors.

Closes #15976.

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Jan 19, 2016

dadoonet added a commit to dadoonet/elasticsearch that referenced this pull request Jan 19, 2016

Replace server side timeout by client side timeout
This commit replaces server side timeout (which is BTW not correctly implemented in azure client 2.0.0 but fixed later elastic#16084) with a client side timeout.

As a consequence, for each request sent to azure, azure client will raise an exception after a given amount of time (timeout).

Closes elastic#12567

dadoonet added a commit that referenced this pull request Jan 19, 2016

Replace server side timeout by client side timeout
This commit replaces server side timeout (which is BTW not correctly implemented in azure client 2.0.0 but fixed later #16084) with a client side timeout.

As a consequence, for each request sent to azure, azure client will raise an exception after a given amount of time (timeout).

Closes #12567

Backport of #16087 in 2.x branch

dadoonet added a commit that referenced this pull request Jan 19, 2016

Replace server side timeout by client side timeout
This commit replaces server side timeout (which is BTW not correctly implemented in azure client 2.0.0 but fixed later #16084) with a client side timeout.

As a consequence, for each request sent to azure, azure client will raise an exception after a given amount of time (timeout).

Closes #12567

Backport of #16087 in 2.2 branch

dadoonet added a commit that referenced this pull request Jan 19, 2016

Replace server side timeout by client side timeout
This commit replaces server side timeout (which is BTW not correctly implemented in azure client 2.0.0 but fixed later #16084) with a client side timeout.

As a consequence, for each request sent to azure, azure client will raise an exception after a given amount of time (timeout).

Closes #12567

Backport of #16087 in 2.0 branch

dadoonet added a commit that referenced this pull request Jan 19, 2016

Replace server side timeout by client side timeout
This commit replaces server side timeout (which is BTW not correctly implemented in azure client 2.0.0 but fixed later #16084) with a client side timeout.

As a consequence, for each request sent to azure, azure client will raise an exception after a given amount of time (timeout).

Closes #12567

Backport of #16087 in 2.1 branch
@tlrx

This comment has been minimized.

Copy link
Member

commented Jan 20, 2016

Why did we set a default timeout of 5m? (see #14277). According to the documentation I'm wondering how it handles the case when a blob upload takes more than 5 m:

The server timeout interval begins at the time that the complete request has been received by the service, and the server begins processing the response. If the timeout interval elapses before the response is returned to the client, the operation times out.

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Jan 20, 2016

@tlrx So according to this thread #16087 (comment) we should revert my last change #16087 and use only setTimeoutIntervalInMs and merge the current PR.

Clock starts when the blob is finished uploaded.

Which means that it's the time for Azure to "process" the blob (store it basically - I don't know the internals).
In that case, a default value of 5m looks reasonable to me because for now we are only uploading small chunks of data (< 64mb): #12448

Then the global request timeout is set to 5m + Constants.DEFAULT_READ_TIMEOUT which is 5m. It means that after 10 minutes by default, elasticsearch will raise an exception (remember that we are talking about 64mb max here).

So it looks conservative to me and a good default.

We might want to revisit this default value once we will have implemented #12448.

Make sense?

@dadoonet dadoonet force-pushed the dadoonet:fix/15976-update-azure-storage-client branch Jan 20, 2016

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Jan 20, 2016

@tlrx I added a new commit.

@clintongormley clintongormley added v2.0.3 and removed v2.0.4 labels Jan 20, 2016

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Jan 22, 2016

@tlrx and I chatted about this today and we came to those conclusions:

  • elasticsearch should use the default timeouts set in the azure client. We should not set any default on our end
  • we should let user modify this default azure timeout if they really need to

To do that, we need to:

  • upgrade the azure client to 4.0.0 as it comes with a default socket timeout (5 minutes).
  • only set socket timeout with setMaximumExecutionTimeInMs to a value if the user explicitly defined one
  • update the documentation accordingly

I'll update the current PR based on this.

@dadoonet dadoonet force-pushed the dadoonet:fix/15976-update-azure-storage-client branch Jan 24, 2016

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Jan 24, 2016

@tlrx I added a new commit (and rebased BTW)

@dadoonet dadoonet force-pushed the dadoonet:fix/15976-update-azure-storage-client branch Feb 21, 2016

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Feb 21, 2016

@tlrx I rebased on master. Could you give a final look please?

@tlrx

View changes

docs/plugins/repository-azure.asciidoc Outdated
You can set the timeout to use when making any single request. It can be defined globally, per account or both.
Defaults to `5m`.
You can set the client side timeout to use when making any single request. It can be defined globally, per account or both.
It's not set by default which means that elasticsearch is using the default value set by the azure client.

This comment has been minimized.

Copy link
@tlrx

tlrx Feb 29, 2016

Member

Can you document the default value set by the Azure client?

This comment has been minimized.

Copy link
@dadoonet

dadoonet Feb 29, 2016

Author Member

I did not want to do it because if azure changes it at some point our documentation will become inconsistent... thoughts?

This comment has been minimized.

Copy link
@tlrx

tlrx Feb 29, 2016

Member

Shouldn't it be fixed for the current version of Azure client?

This comment has been minimized.

Copy link
@dadoonet

dadoonet Feb 29, 2016

Author Member

It is, but it would mean that each time we upgrade the client, we have to check if this value changed.
May be another option would be to link to Azure doc (if this info exists somewhere)? Looking for this ATM.

This comment has been minimized.

Copy link
@dadoonet

dadoonet Feb 29, 2016

Author Member

So I could link to the JavaDoc: http://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.html#setTimeoutIntervalInMs(java.lang.Integer)?
Note that for now, this value is null indicating no maximum time. I guess it will remain as is so may be you're right and I should just write this in our doc???

This comment has been minimized.

Copy link
@tlrx

tlrx Feb 29, 2016

Member

I think we should at least add a link to the Azure doc.

@tlrx

This comment has been minimized.

Copy link
Member

commented Feb 29, 2016

Left a minor comment, otherwise LGTM

Upgrade Azure Storage client to 4.0.0
We are using `2.0.0` today but Azure team now recommends:

```xml
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>4.0.0</version>
</dependency>
```

This new version fix the timeout issues we have seen with azure storage although #15080 adds a timeout support.
Azure storage client 2.0.0 was not passing correctly this value when it was calling Azure services.

Note that the timeout is a server side timeout and not client side timeout.
It means that it will raise only a timeout when:

* upload of blob is complete
* if azure service is not able to process the blob (and store it) within a given time range.

In which case it will raise an exception which elasticsearch can deal with:

```
java.io.IOException
    at __randomizedtesting.SeedInfo.seed([91BC11AEF16E073F:6886FA5308FCE4D8]:0)
    at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
    at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.azure.storage.StorageException: Operation could not be completed within the specified time.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:89)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:175)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
    ... 9 more
```

The following code was used to test this against Azure platform:

```java
public void testDumb() throws URISyntaxException, StorageException, IOException, InvalidKeyException {
    String connectionString = "MY-AZURE-STRING";

    CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
    CloudBlobClient client = storageAccount.createCloudBlobClient();
    client.getDefaultRequestOptions().setTimeoutIntervalInMs(1000);
    CloudBlobContainer container = client.getContainerReference("dumb");
    container.createIfNotExists();
    CloudBlockBlob blob = container.getBlockBlobReference("blob");

    File sourceFile = File.createTempFile("sourceFile", ".tmp");

    try {
        int fileSize = 10000000;

        byte[] buffer = new byte[fileSize];
        Random random = new Random();
        random.nextBytes(buffer);

        logger.info("Generate local file");
        FileOutputStream fos = new FileOutputStream(sourceFile);
        fos.write(buffer);
        fos.close();
        logger.info("End generate local file");

        FileInputStream fis = new FileInputStream(sourceFile);

        logger.info("Start uploading");
        blob.upload(fis, fileSize);
        logger.info("End uploading");

    }
    finally {
        if (sourceFile.exists()) {
            sourceFile.delete();
        }
    }
}
```

With 2.0.0, the above code was not raising any exception. With 4.0.0, the exception is now thrown correctly.

The default timeout is 5 minutes. See https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/core/Utility.java#L352-L375

Closes #12567.

Release notes from 2.0.0:

 * Removed deprecated table AtomPub support.
 * Removed deprecated constructors which take service clients in favor of constructors which take credentials.
 * Added support for "Add" permissions on Blob SAS.
 * Added support for "Create" permissions on Blob and File SAS.
 * Added support for IP Restricted SAS and Protocol SAS.
 * Added support for Account SAS to all services.
 * Added support for Minute and Hour Metrics to FileServiceProperties and added support for File Metrics to CloudAnalyticsClient.
 * Removed deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Removed deprecated Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.

 * Fixed a bug in table where a select on a non-existent field resulted in a null reference exception if the corresponding field in the TableEntity was not nullable.
 * Fixed a bug in table where JsonParser was automatically closing the response stream before it was completely drained causing socket exhaustion.
 * Fixed a bug in StorageCredentialsAccountAndKey.updateKey(String) which prevented valid keys from being set.
 * Added CloudBlobContainer.listBlobs(final String, final boolean) method.
 * Fixed a bug in blob where using AccessConditions on block blob uploads larger than 64MB done with the upload* methods or block blob uploads done openOutputStream with would fail if the blob did not already exist.
 * Added support for setting a proxy per request. Proxy can be set on an OperationContext instance and will be used when that instance is passed to the request method.

 * Added support for SAS to the Azure File service.
 * Added support for Append Blob.
 * Added support for Access Control Lists (ACL) to File Shares.
 * Added support for getting and setting of CORS rules to File service.
 * Added support for ShareStats to File Shares.
 * Added support for copying an Azure File to another Azure File or a Block Blob asynchronously, and aborting Azure File copy operations asynchronously.
 * Added support for copying a Blob to an Azure File asynchronously.
 * Added support for setting a maximum quota property on a File Share.
 * Removed deprecated AuthenticationScheme and its getter and setter. In the future only SharedKey will be used.
 * Removed deprecated getter/setters for all request option properties on the service clients. Please use the default request options getter/setters instead.
 * Removed getSubDirectoryReference() for blob directories and file directories. Use getDirectoryReference() instead.
 * Removed getEntityClass() in TableQuery. Please use getClazzType() instead.
 * Added client-side verification for lease duration and break periods.
 * Deprecated the setters in table for timestamp as this property is only modifiable by the service.
 * Deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Deprecated the Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
 * Deprecated constructors which take service clients in favor of constructors which take credentials.
 * Fixed a bug where the DateBackwardCompatibility flag was not applied if set on the CloudTableClient default request options.
 * Changed library behavior to retry all exceptions thrown when parsing a response object.
 * Changed behavior to stop removing query parameters passed in with the resource URI if that URI contains a SAS token. Some query parameters such as comp, restype, snapshot and api-version will still be removed.
 * Added support for logging StringToSign to SharedKey and SAS.
 * **Added a connect timeout to prevent hangs when establishing the network connection.**
 * **Made performance enhancements to the BlobOutputStream class.**

 * Fixed a bug where maximum execution time was ignored for file, queue, and table services.
 * **Changed the socket timeout to be set to the service side timeout plus 5 minutes when maximum execution time is not set.**
 * **Changed the socket timeout to default to 5 minutes rather than infinite when neither service side timeout or maximum execution time are set.**
 * Fixed a bug where MD5 was calculated for commitBlockList even though UseTransactionalMD5 was set to false.
 * Fixed a bug where selecting fields that did not exist returned an error rather than an EntityProperty with a null value.
 * Fixed a bug where table entities with a single quote in their partition or row key could be inserted but not operated on in any other way.

 * Fixed a bug for all listing API's where next() would sometimes throw an exception if hasNext() had not been called even if there were more elements to iterate on.
 * Added sequence number to the blob properties. This is populated for page blobs.
 * Creating a page blob sets its length property.
 * Added support for page blob sequence numbers and sequence number access conditions.
 * Fixed a bug in abort copy where the lease access condition was not sent to the service.
 * Fixed an issue in startCopyFromBlob where if the URI of the source blob contained certain non-ASCII characters they would not be encoded appropriately. This would result in Authorization failures.
 * Fixed a small performance issue in XML serialization.
 * Fixed a bug in BlobOutputStream and FileOutputStream where flush added data to a request pool rather than immediately committing it to the Azure service.
 * Refactored to remove the blob, queue, and file package dependency on table in the error handling code.
 * Added additional client-side logging for REST requests, responses, and errors.

Closes #15976.

@dadoonet dadoonet force-pushed the dadoonet:fix/15976-update-azure-storage-client branch to 7a42014 Feb 29, 2016

@dadoonet dadoonet merged commit 7a42014 into elastic:master Feb 29, 2016

1 check passed

CLA Commit author is a member of Elasticsearch
Details

dadoonet added a commit that referenced this pull request Feb 29, 2016

Upgrade Azure Storage client to 4.0.0
We are using `2.0.0` today but Azure team now recommends:

```xml
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>4.0.0</version>
</dependency>
```

This new version fix the timeout issues we have seen with azure storage although #15080 adds a timeout support.
Azure storage client 2.0.0 was not passing correctly this value when it was calling Azure services.

Note that the timeout is a server side timeout and not client side timeout.
It means that it will raise only a timeout when:

* upload of blob is complete
* if azure service is not able to process the blob (and store it) within a given time range.

In which case it will raise an exception which elasticsearch can deal with:

```
java.io.IOException
    at __randomizedtesting.SeedInfo.seed([91BC11AEF16E073F:6886FA5308FCE4D8]:0)
    at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
    at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.azure.storage.StorageException: Operation could not be completed within the specified time.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:89)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:175)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
    ... 9 more
```

The following code was used to test this against Azure platform:

```java
public void testDumb() throws URISyntaxException, StorageException, IOException, InvalidKeyException {
    String connectionString = "MY-AZURE-STRING";

    CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
    CloudBlobClient client = storageAccount.createCloudBlobClient();
    client.getDefaultRequestOptions().setTimeoutIntervalInMs(1000);
    CloudBlobContainer container = client.getContainerReference("dumb");
    container.createIfNotExists();
    CloudBlockBlob blob = container.getBlockBlobReference("blob");

    File sourceFile = File.createTempFile("sourceFile", ".tmp");

    try {
        int fileSize = 10000000;

        byte[] buffer = new byte[fileSize];
        Random random = new Random();
        random.nextBytes(buffer);

        logger.info("Generate local file");
        FileOutputStream fos = new FileOutputStream(sourceFile);
        fos.write(buffer);
        fos.close();
        logger.info("End generate local file");

        FileInputStream fis = new FileInputStream(sourceFile);

        logger.info("Start uploading");
        blob.upload(fis, fileSize);
        logger.info("End uploading");

    }
    finally {
        if (sourceFile.exists()) {
            sourceFile.delete();
        }
    }
}
```

With 2.0.0, the above code was not raising any exception. With 4.0.0, the exception is now thrown correctly.

The default timeout is 5 minutes. See https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/core/Utility.java#L352-L375

Closes #12567.

Release notes from 2.0.0:

 * Removed deprecated table AtomPub support.
 * Removed deprecated constructors which take service clients in favor of constructors which take credentials.
 * Added support for "Add" permissions on Blob SAS.
 * Added support for "Create" permissions on Blob and File SAS.
 * Added support for IP Restricted SAS and Protocol SAS.
 * Added support for Account SAS to all services.
 * Added support for Minute and Hour Metrics to FileServiceProperties and added support for File Metrics to CloudAnalyticsClient.
 * Removed deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Removed deprecated Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.

 * Fixed a bug in table where a select on a non-existent field resulted in a null reference exception if the corresponding field in the TableEntity was not nullable.
 * Fixed a bug in table where JsonParser was automatically closing the response stream before it was completely drained causing socket exhaustion.
 * Fixed a bug in StorageCredentialsAccountAndKey.updateKey(String) which prevented valid keys from being set.
 * Added CloudBlobContainer.listBlobs(final String, final boolean) method.
 * Fixed a bug in blob where using AccessConditions on block blob uploads larger than 64MB done with the upload* methods or block blob uploads done openOutputStream with would fail if the blob did not already exist.
 * Added support for setting a proxy per request. Proxy can be set on an OperationContext instance and will be used when that instance is passed to the request method.

 * Added support for SAS to the Azure File service.
 * Added support for Append Blob.
 * Added support for Access Control Lists (ACL) to File Shares.
 * Added support for getting and setting of CORS rules to File service.
 * Added support for ShareStats to File Shares.
 * Added support for copying an Azure File to another Azure File or a Block Blob asynchronously, and aborting Azure File copy operations asynchronously.
 * Added support for copying a Blob to an Azure File asynchronously.
 * Added support for setting a maximum quota property on a File Share.
 * Removed deprecated AuthenticationScheme and its getter and setter. In the future only SharedKey will be used.
 * Removed deprecated getter/setters for all request option properties on the service clients. Please use the default request options getter/setters instead.
 * Removed getSubDirectoryReference() for blob directories and file directories. Use getDirectoryReference() instead.
 * Removed getEntityClass() in TableQuery. Please use getClazzType() instead.
 * Added client-side verification for lease duration and break periods.
 * Deprecated the setters in table for timestamp as this property is only modifiable by the service.
 * Deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Deprecated the Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
 * Deprecated constructors which take service clients in favor of constructors which take credentials.
 * Fixed a bug where the DateBackwardCompatibility flag was not applied if set on the CloudTableClient default request options.
 * Changed library behavior to retry all exceptions thrown when parsing a response object.
 * Changed behavior to stop removing query parameters passed in with the resource URI if that URI contains a SAS token. Some query parameters such as comp, restype, snapshot and api-version will still be removed.
 * Added support for logging StringToSign to SharedKey and SAS.
 * **Added a connect timeout to prevent hangs when establishing the network connection.**
 * **Made performance enhancements to the BlobOutputStream class.**

 * Fixed a bug where maximum execution time was ignored for file, queue, and table services.
 * **Changed the socket timeout to be set to the service side timeout plus 5 minutes when maximum execution time is not set.**
 * **Changed the socket timeout to default to 5 minutes rather than infinite when neither service side timeout or maximum execution time are set.**
 * Fixed a bug where MD5 was calculated for commitBlockList even though UseTransactionalMD5 was set to false.
 * Fixed a bug where selecting fields that did not exist returned an error rather than an EntityProperty with a null value.
 * Fixed a bug where table entities with a single quote in their partition or row key could be inserted but not operated on in any other way.

 * Fixed a bug for all listing API's where next() would sometimes throw an exception if hasNext() had not been called even if there were more elements to iterate on.
 * Added sequence number to the blob properties. This is populated for page blobs.
 * Creating a page blob sets its length property.
 * Added support for page blob sequence numbers and sequence number access conditions.
 * Fixed a bug in abort copy where the lease access condition was not sent to the service.
 * Fixed an issue in startCopyFromBlob where if the URI of the source blob contained certain non-ASCII characters they would not be encoded appropriately. This would result in Authorization failures.
 * Fixed a small performance issue in XML serialization.
 * Fixed a bug in BlobOutputStream and FileOutputStream where flush added data to a request pool rather than immediately committing it to the Azure service.
 * Refactored to remove the blob, queue, and file package dependency on table in the error handling code.
 * Added additional client-side logging for REST requests, responses, and errors.

Closes #15976.

Backport of #16084 in 2.x branch

dadoonet added a commit that referenced this pull request Feb 29, 2016

Upgrade Azure Storage client to 4.0.0
We are using `2.0.0` today but Azure team now recommends:

```xml
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>4.0.0</version>
</dependency>
```

This new version fix the timeout issues we have seen with azure storage although #15080 adds a timeout support.
Azure storage client 2.0.0 was not passing correctly this value when it was calling Azure services.

Note that the timeout is a server side timeout and not client side timeout.
It means that it will raise only a timeout when:

* upload of blob is complete
* if azure service is not able to process the blob (and store it) within a given time range.

In which case it will raise an exception which elasticsearch can deal with:

```
java.io.IOException
    at __randomizedtesting.SeedInfo.seed([91BC11AEF16E073F:6886FA5308FCE4D8]:0)
    at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
    at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.azure.storage.StorageException: Operation could not be completed within the specified time.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:89)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:175)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
    ... 9 more
```

The following code was used to test this against Azure platform:

```java
public void testDumb() throws URISyntaxException, StorageException, IOException, InvalidKeyException {
    String connectionString = "MY-AZURE-STRING";

    CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
    CloudBlobClient client = storageAccount.createCloudBlobClient();
    client.getDefaultRequestOptions().setTimeoutIntervalInMs(1000);
    CloudBlobContainer container = client.getContainerReference("dumb");
    container.createIfNotExists();
    CloudBlockBlob blob = container.getBlockBlobReference("blob");

    File sourceFile = File.createTempFile("sourceFile", ".tmp");

    try {
        int fileSize = 10000000;

        byte[] buffer = new byte[fileSize];
        Random random = new Random();
        random.nextBytes(buffer);

        logger.info("Generate local file");
        FileOutputStream fos = new FileOutputStream(sourceFile);
        fos.write(buffer);
        fos.close();
        logger.info("End generate local file");

        FileInputStream fis = new FileInputStream(sourceFile);

        logger.info("Start uploading");
        blob.upload(fis, fileSize);
        logger.info("End uploading");

    }
    finally {
        if (sourceFile.exists()) {
            sourceFile.delete();
        }
    }
}
```

With 2.0.0, the above code was not raising any exception. With 4.0.0, the exception is now thrown correctly.

The default timeout is 5 minutes. See https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/core/Utility.java#L352-L375

Closes #12567.

Release notes from 2.0.0:

 * Removed deprecated table AtomPub support.
 * Removed deprecated constructors which take service clients in favor of constructors which take credentials.
 * Added support for "Add" permissions on Blob SAS.
 * Added support for "Create" permissions on Blob and File SAS.
 * Added support for IP Restricted SAS and Protocol SAS.
 * Added support for Account SAS to all services.
 * Added support for Minute and Hour Metrics to FileServiceProperties and added support for File Metrics to CloudAnalyticsClient.
 * Removed deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Removed deprecated Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.

 * Fixed a bug in table where a select on a non-existent field resulted in a null reference exception if the corresponding field in the TableEntity was not nullable.
 * Fixed a bug in table where JsonParser was automatically closing the response stream before it was completely drained causing socket exhaustion.
 * Fixed a bug in StorageCredentialsAccountAndKey.updateKey(String) which prevented valid keys from being set.
 * Added CloudBlobContainer.listBlobs(final String, final boolean) method.
 * Fixed a bug in blob where using AccessConditions on block blob uploads larger than 64MB done with the upload* methods or block blob uploads done openOutputStream with would fail if the blob did not already exist.
 * Added support for setting a proxy per request. Proxy can be set on an OperationContext instance and will be used when that instance is passed to the request method.

 * Added support for SAS to the Azure File service.
 * Added support for Append Blob.
 * Added support for Access Control Lists (ACL) to File Shares.
 * Added support for getting and setting of CORS rules to File service.
 * Added support for ShareStats to File Shares.
 * Added support for copying an Azure File to another Azure File or a Block Blob asynchronously, and aborting Azure File copy operations asynchronously.
 * Added support for copying a Blob to an Azure File asynchronously.
 * Added support for setting a maximum quota property on a File Share.
 * Removed deprecated AuthenticationScheme and its getter and setter. In the future only SharedKey will be used.
 * Removed deprecated getter/setters for all request option properties on the service clients. Please use the default request options getter/setters instead.
 * Removed getSubDirectoryReference() for blob directories and file directories. Use getDirectoryReference() instead.
 * Removed getEntityClass() in TableQuery. Please use getClazzType() instead.
 * Added client-side verification for lease duration and break periods.
 * Deprecated the setters in table for timestamp as this property is only modifiable by the service.
 * Deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Deprecated the Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
 * Deprecated constructors which take service clients in favor of constructors which take credentials.
 * Fixed a bug where the DateBackwardCompatibility flag was not applied if set on the CloudTableClient default request options.
 * Changed library behavior to retry all exceptions thrown when parsing a response object.
 * Changed behavior to stop removing query parameters passed in with the resource URI if that URI contains a SAS token. Some query parameters such as comp, restype, snapshot and api-version will still be removed.
 * Added support for logging StringToSign to SharedKey and SAS.
 * **Added a connect timeout to prevent hangs when establishing the network connection.**
 * **Made performance enhancements to the BlobOutputStream class.**

 * Fixed a bug where maximum execution time was ignored for file, queue, and table services.
 * **Changed the socket timeout to be set to the service side timeout plus 5 minutes when maximum execution time is not set.**
 * **Changed the socket timeout to default to 5 minutes rather than infinite when neither service side timeout or maximum execution time are set.**
 * Fixed a bug where MD5 was calculated for commitBlockList even though UseTransactionalMD5 was set to false.
 * Fixed a bug where selecting fields that did not exist returned an error rather than an EntityProperty with a null value.
 * Fixed a bug where table entities with a single quote in their partition or row key could be inserted but not operated on in any other way.

 * Fixed a bug for all listing API's where next() would sometimes throw an exception if hasNext() had not been called even if there were more elements to iterate on.
 * Added sequence number to the blob properties. This is populated for page blobs.
 * Creating a page blob sets its length property.
 * Added support for page blob sequence numbers and sequence number access conditions.
 * Fixed a bug in abort copy where the lease access condition was not sent to the service.
 * Fixed an issue in startCopyFromBlob where if the URI of the source blob contained certain non-ASCII characters they would not be encoded appropriately. This would result in Authorization failures.
 * Fixed a small performance issue in XML serialization.
 * Fixed a bug in BlobOutputStream and FileOutputStream where flush added data to a request pool rather than immediately committing it to the Azure service.
 * Refactored to remove the blob, queue, and file package dependency on table in the error handling code.
 * Added additional client-side logging for REST requests, responses, and errors.

Closes #15976.

Backport of #16084 in 2.2 branch

@dadoonet dadoonet deleted the dadoonet:fix/15976-update-azure-storage-client branch Feb 29, 2016

@dadoonet

This comment has been minimized.

Copy link
Member Author

commented Feb 29, 2016

@clintongormley I merged this change in 2.2, 2.x and master. I'm not sure if I should merge it in 2.1 and 2.0. I removed the labels.

dadoonet added a commit to dadoonet/elasticsearch that referenced this pull request Jul 25, 2016

Remove TODO about Timeout in Azure
In elastic#15950 elastic#15080 elastic#16084 we added the support of TimeOut for Requests with a default client`setTimeoutIntervalInMs`.
So we can remove this useless todo which was added for only one method.

Closes elastic#18617.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.