Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure blob store's readBlob() method first checks if the blob exists #23483

Merged
merged 7 commits into from
Mar 3, 2017

Conversation

abeyad
Copy link

@abeyad abeyad commented Mar 3, 2017

Previously, the Azure blob store would depend on a 404 StorageException
coming back from Azure if trying to open an input stream to a
non-existent blob. This works for Azure repositories which access a
primary location path. For those configured to access a secondary
location path, the Azure SDK keeps trying for a long while before
returning a 404 StorageException, causing potential delays in the
snapshot APIs. This commit makes an initial check if the blob exists in
Azure and returns immediately with a NoSuchFileException, instead of
trying to open the input stream to the blob.

Closes #23480

dadoonet and others added 2 commits March 3, 2017 12:44
Previously, the Azure blob store would depend on a 404 StorageException
coming back from Azure if trying to open an input stream to a
non-existent blob.  This works for Azure repositories which access a
primary location path.  For those configured to access a secondary
location path, the Azure SDK keeps trying for a long while before
returning a 404 StorageException, causing potential delays in the
snapshot APIs.  This commit makes an initial check if the blob exists in
Azure and returns immediately with a NoSuchFileException, instead of
trying to open the input stream to the blob.

Closes elastic#23480
Copy link
Member

@dadoonet dadoonet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left small comments. It's good to me.

* This test needs Azure to run and -Dtests.thirdparty=true to be set
* and -Dtests.config=/path/to/elasticsearch.yml
* @see AbstractAzureWithThirdPartyIntegTestCase
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add that this test requires an azure storage account defined as a Read-access geo-redundant storage (RA-GRS).

logger.info("--> creating azure primary repository");
PutRepositoryResponse putRepositoryResponsePrimary = client.admin().cluster().preparePutRepository("primary")
.setType("azure").setSettings(Settings.builder()
.put(Repository.ACCOUNT_SETTING.getKey(), "my_account")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not needed. It should use the default account available.

logger.info("--> creating azure secondary repository");
PutRepositoryResponse putRepositoryResponseSecondary = client.admin().cluster().preparePutRepository("secondary")
.setType("azure").setSettings(Settings.builder()
.put(Repository.ACCOUNT_SETTING.getKey(), "my_account")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not needed. It should use the default account available.

PutRepositoryResponse putRepositoryResponsePrimary = client.admin().cluster().preparePutRepository("primary")
.setType("azure").setSettings(Settings.builder()
.put(Repository.ACCOUNT_SETTING.getKey(), "my_account")
.put(Repository.CONTAINER_SETTING.getKey(), "container")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be randomize the container name as we do in AzureSnapshotRestoreTests?

    private static String getContainerName() {
        String testName = "snapshot-itest-".concat(RandomizedTest.getContext().getRunnerSeedAsString().toLowerCase(Locale.ROOT));
        return testName.contains(" ") ? Strings.split(testName, " ")[0] : testName;
    }

PutRepositoryResponse putRepositoryResponseSecondary = client.admin().cluster().preparePutRepository("secondary")
.setType("azure").setSettings(Settings.builder()
.put(Repository.ACCOUNT_SETTING.getKey(), "my_account")
.put(Repository.CONTAINER_SETTING.getKey(), "container")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And reuse the randomized container name here?

@abeyad
Copy link
Author

abeyad commented Mar 3, 2017

thanks for the review @dadoonet, I pushed a commit to address your comments and create a random container name each time.


logger.info("--> start get snapshots on primary");
long startWait = System.currentTimeMillis();
client.admin().cluster().prepareGetSnapshots("primary").get();
long endWait = System.currentTimeMillis();
// definitely should be done in 30s, and if its not working as expected, it takes over 1m
assertThat(endWait - startWait, lessThanOrEqualTo(30000L));
removeContainer(containerName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you remove and create again?

endWait = System.currentTimeMillis();
logger.info("--> end of get snapshots on secondary. Took {} ms", endWait - startWait);
assertThat(endWait - startWait, lessThanOrEqualTo(30000L));
removeContainer(containerName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be do that in an After method so it's always removed?

// definitely should be done in 30s, and if its not working as expected, it takes over 1m
assertThat(endWait - startWait, lessThanOrEqualTo(30000L));
removeContainer(containerName);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove and create again is not needed I think

@abeyad
Copy link
Author

abeyad commented Mar 3, 2017

I pushed c192669

Copy link
Member

@dadoonet dadoonet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Left a small last comment.

private final String containerName = getContainerName();

@Before
public void setupContainer() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@abeyad abeyad merged commit 3dff0d0 into elastic:master Mar 3, 2017
@abeyad abeyad deleted the fix/azure_read_nonexistant_blob branch March 3, 2017 22:01
@abeyad
Copy link
Author

abeyad commented Mar 3, 2017

thanks @dadoonet

abeyad pushed a commit that referenced this pull request Mar 3, 2017
…23483)

Previously, the Azure blob store would depend on a 404 StorageException
coming back from Azure if trying to open an input stream to a
non-existent blob. This works for Azure repositories which access a
primary location path. For those configured to access a secondary
location path, the Azure SDK keeps trying for a long while before
returning a 404 StorageException, causing potential delays in the
snapshot APIs. This commit makes an initial check if the blob exists in
Azure and returns immediately with a NoSuchFileException, instead of
trying to open the input stream to the blob.

Closes #23480
abeyad pushed a commit that referenced this pull request Mar 3, 2017
…23483)

Previously, the Azure blob store would depend on a 404 StorageException
coming back from Azure if trying to open an input stream to a
non-existent blob. This works for Azure repositories which access a
primary location path. For those configured to access a secondary
location path, the Azure SDK keeps trying for a long while before
returning a 404 StorageException, causing potential delays in the
snapshot APIs. This commit makes an initial check if the blob exists in
Azure and returns immediately with a NoSuchFileException, instead of
trying to open the input stream to the blob.

Closes #23480
@abeyad
Copy link
Author

abeyad commented Mar 3, 2017

5.x commit: 02cd2ae
5.3 commit: 47ae063

@clintongormley clintongormley added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed :Plugin Repository Azure labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants