Azure blob store's readBlob() method first checks if the blob exists #23483

abeyad · 2017-03-03T18:48:08Z

Previously, the Azure blob store relied on Azure returning a 404
StorageException when opening an input stream to a non-existent blob.
This works for Azure repositories that access a primary location path.
For repositories configured to access a secondary location path,
however, the Azure SDK keeps retrying for a long time before returning
the 404 StorageException, which can delay the snapshot APIs. This commit
first checks whether the blob exists in Azure and, if it does not,
immediately throws a NoSuchFileException instead of trying to open an
input stream to the blob.

Closes #23480
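
A minimal sketch of the check-then-open pattern described above, not the code from this PR. `BlobClient`, `InMemoryBlobClient`, and `ExistsFirstBlobReader` are hypothetical names introduced for illustration; the real change goes through the Azure storage client, which is replaced here by an in-memory fake so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.NoSuchFileException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the storage client; NOT the real Azure SDK API.
interface BlobClient {
    boolean blobExists(String container, String blob) throws IOException;

    InputStream openInputStream(String container, String blob) throws IOException;
}

// In-memory fake that only exists to make the sketch runnable.
final class InMemoryBlobClient implements BlobClient {
    private final Map<String, byte[]> blobs = new HashMap<>();
    int openCalls = 0; // counts how often a stream open was attempted

    void put(String container, String blob, String content) {
        blobs.put(container + "/" + blob, content.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public boolean blobExists(String container, String blob) {
        return blobs.containsKey(container + "/" + blob);
    }

    @Override
    public InputStream openInputStream(String container, String blob) throws IOException {
        openCalls++;
        byte[] data = blobs.get(container + "/" + blob);
        if (data == null) {
            // Stands in for the slow path: the 404 that only arrives after retries.
            throw new IOException("simulated late 404 for [" + blob + "]");
        }
        return new ByteArrayInputStream(data);
    }
}

// Checks for existence first, so a missing blob fails fast with NoSuchFileException.
final class ExistsFirstBlobReader {
    private final BlobClient client;
    private final String container;

    ExistsFirstBlobReader(BlobClient client, String container) {
        this.client = client;
        this.container = container;
    }

    InputStream readBlob(String blobName) throws IOException {
        if (!client.blobExists(container, blobName)) {
            throw new NoSuchFileException("Blob [" + blobName + "] does not exist");
        }
        return client.openInputStream(container, blobName);
    }
}

final class ReadBlobSketch {
    public static void main(String[] args) throws IOException {
        InMemoryBlobClient client = new InMemoryBlobClient();
        client.put("container", "snap-1.dat", "payload");
        ExistsFirstBlobReader reader = new ExistsFirstBlobReader(client, "container");
        try {
            reader.readBlob("missing.dat");
        } catch (NoSuchFileException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
        System.out.println("stream opens attempted: " + client.openCalls);
    }
}
```

The existence check costs one extra round trip for blobs that are present, in exchange for a fast failure on blobs that are missing.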

dadoonet

Left a few small comments. Looks good to me.

dadoonet · 2017-03-03T18:58:41Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+ * This test needs Azure to run and -Dtests.thirdparty=true to be set
+ * and -Dtests.config=/path/to/elasticsearch.yml
+ * @see AbstractAzureWithThirdPartyIntegTestCase
+ */


I think we should add that this test requires an Azure storage account configured as read-access geo-redundant storage (RA-GRS).

dadoonet · 2017-03-03T19:01:03Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+        logger.info("-->  creating azure primary repository");
+        PutRepositoryResponse putRepositoryResponsePrimary = client.admin().cluster().preparePutRepository("primary")
+                .setType("azure").setSettings(Settings.builder()
+                        .put(Repository.ACCOUNT_SETTING.getKey(), "my_account")


I think this is not needed. It should use the default account available.

dadoonet · 2017-03-03T19:01:08Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+        logger.info("-->  creating azure secondary repository");
+        PutRepositoryResponse putRepositoryResponseSecondary = client.admin().cluster().preparePutRepository("secondary")
+                .setType("azure").setSettings(Settings.builder()
+                        .put(Repository.ACCOUNT_SETTING.getKey(), "my_account")


I think this is not needed. It should use the default account available.

dadoonet · 2017-03-03T19:02:05Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+        PutRepositoryResponse putRepositoryResponsePrimary = client.admin().cluster().preparePutRepository("primary")
+                .setType("azure").setSettings(Settings.builder()
+                        .put(Repository.ACCOUNT_SETTING.getKey(), "my_account")
+                        .put(Repository.CONTAINER_SETTING.getKey(), "container")


Maybe randomize the container name as we do in AzureSnapshotRestoreTests?

private static String getContainerName() {
    String testName = "snapshot-itest-".concat(RandomizedTest.getContext().getRunnerSeedAsString().toLowerCase(Locale.ROOT));
    return testName.contains(" ") ? Strings.split(testName, " ")[0] : testName;
}

dadoonet · 2017-03-03T19:02:24Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+        PutRepositoryResponse putRepositoryResponseSecondary = client.admin().cluster().preparePutRepository("secondary")
+                .setType("azure").setSettings(Settings.builder()
+                        .put(Repository.ACCOUNT_SETTING.getKey(), "my_account")
+                        .put(Repository.CONTAINER_SETTING.getKey(), "container")


And reuse the randomized container name here?

abeyad · 2017-03-03T20:28:43Z

Thanks for the review, @dadoonet. I pushed a commit that addresses your comments and creates a random container name each time.

dadoonet · 2017-03-03T20:29:25Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java


        logger.info("--> start get snapshots on primary");
        long startWait = System.currentTimeMillis();
        client.admin().cluster().prepareGetSnapshots("primary").get();
        long endWait = System.currentTimeMillis();
        // definitely should be done in 30s, and if its not working as expected, it takes over 1m
        assertThat(endWait - startWait, lessThanOrEqualTo(30000L));
+        removeContainer(containerName);


Why do you remove and create again?

dadoonet · 2017-03-03T20:31:10Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+        endWait = System.currentTimeMillis();
+        logger.info("--> end of get snapshots on secondary. Took {} ms", endWait - startWait);
+        assertThat(endWait - startWait, lessThanOrEqualTo(30000L));
+        removeContainer(containerName);


Maybe do that in an @After method so it's always removed?

dadoonet · 2017-03-03T20:31:56Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+        // definitely should be done in 30s, and if its not working as expected, it takes over 1m
+        assertThat(endWait - startWait, lessThanOrEqualTo(30000L));
+        removeContainer(containerName);
+


I don't think removing and re-creating the container is needed.

abeyad · 2017-03-03T20:35:29Z

I pushed c192669

dadoonet

Great! Left one last small comment.

dadoonet · 2017-03-03T20:47:59Z

...c/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreListSnapshotsTests.java

+    private final String containerName = getContainerName();
+
+    @Before
+    public void setupContainer() {


abeyad · 2017-03-03T22:02:02Z

thanks @dadoonet

abeyad · 2017-03-03T22:09:11Z

5.x commit: 02cd2ae
5.3 commit: 47ae063

dadoonet and others added 2 commits March 3, 2017 12:44

Add secondary azure test for 5.2 branch

a9f5883

abeyad added :Plugin Repository Azure >bug v5.3.1 v5.4.0 v6.0.0-alpha1 labels Mar 3, 2017

abeyad requested a review from dadoonet March 3, 2017 18:48

Ali Beyad added 2 commits March 3, 2017 13:48

check location mode

ca96489

javadocs

065e0e4

dadoonet approved these changes Mar 3, 2017

address review

d054f44

dadoonet requested changes Mar 3, 2017

feedback

c192669

dadoonet approved these changes Mar 3, 2017

remove unused method

8d0a2e2

abeyad merged commit 3dff0d0 into elastic:master Mar 3, 2017

abeyad deleted the fix/azure_read_nonexistant_blob branch March 3, 2017 22:01

clintongormley added :Distributed Coordination/Snapshot/Restore and removed :Plugin Repository Azure labels Feb 14, 2018
