Recursively Delete Unreferenced Index Directories #42189

Merged

Commits (67):
09ec6f3 Recursively Delete Unreferenced Index Directories (original-brownbear, May 16, 2019)
20a6720 more bulk (original-brownbear, May 17, 2019)
273a165 add docs (original-brownbear, May 17, 2019)
aba91e1 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 17, 2019)
ac3c496 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 17, 2019)
e593e0c cleanup after snapshotting as well (original-brownbear, May 18, 2019)
de61d35 fix checkstyle (original-brownbear, May 18, 2019)
22a668e Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 18, 2019)
3d7da2e nicer (original-brownbear, May 18, 2019)
826e513 start marker logic (original-brownbear, May 19, 2019)
96dabb7 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 19, 2019)
c0e025e clean (original-brownbear, May 19, 2019)
e99e5c0 prevent index dir blobs from being listed (original-brownbear, May 19, 2019)
0e66c19 add assertion about proper S3 behavior (original-brownbear, May 19, 2019)
e4aa770 fix azure blob children listing (original-brownbear, May 19, 2019)
a2bfa0f fix azure (original-brownbear, May 19, 2019)
98dc56c nicer (original-brownbear, May 19, 2019)
372fd26 catch all the exceptions (original-brownbear, May 19, 2019)
5eb036b cleanup (original-brownbear, May 19, 2019)
3d1e26b smaller changeset (original-brownbear, May 20, 2019)
cb2a196 nicer (original-brownbear, May 20, 2019)
2ca8c4c nicer (original-brownbear, May 20, 2019)
908cfeb dry up test infra (original-brownbear, May 20, 2019)
2d0f368 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 20, 2019)
d72cbcd test cleanup on hdfs repository (original-brownbear, May 20, 2019)
b9b4c32 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 20, 2019)
8a82885 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 22, 2019)
ae4c9a8 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 23, 2019)
b8441ac azure 3rd party cleanup tests (original-brownbear, May 23, 2019)
f50a895 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 23, 2019)
014f5e0 gcs third party tests (original-brownbear, May 23, 2019)
f825d3f s3 third party tests (original-brownbear, May 23, 2019)
2e0bd14 better comment (original-brownbear, May 23, 2019)
1c33d31 add todo about eventual consistency on S3 (original-brownbear, May 23, 2019)
bb3fb73 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 23, 2019)
def2d77 much drier (original-brownbear, May 24, 2019)
d6fa0fe much drier consistency checks (original-brownbear, May 24, 2019)
e7a3c2b Fix consistency tests to account for S3 eventual consistency (original-brownbear, May 24, 2019)
bb95f7f Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 24, 2019)
9cac1a4 S3 resiliency tests work with minio (original-brownbear, May 24, 2019)
c8394fe add comment on Minio hack (original-brownbear, May 24, 2019)
b338a91 shorter diff (original-brownbear, May 24, 2019)
a8bf18e Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 28, 2019)
be8b479 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, May 28, 2019)
58fcfe2 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 5, 2019)
1e39e9b shorter diff (original-brownbear, Jun 5, 2019)
5b5ddb2 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 11, 2019)
dc62d64 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 11, 2019)
c52a394 CR: removed redundant test (original-brownbear, Jun 11, 2019)
dea1a07 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 11, 2019)
43bb00a Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 17, 2019)
af47bb8 CR: add repo consistency check globally (original-brownbear, Jun 17, 2019)
47766e1 Checkstyle (original-brownbear, Jun 17, 2019)
81aab9b Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 17, 2019)
deb73e3 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 18, 2019)
b8a7067 CR: remove . in ex messages (original-brownbear, Jun 18, 2019)
c613d89 remove now redundant directory deletes (original-brownbear, Jun 18, 2019)
d841af4 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 18, 2019)
e13d799 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 18, 2019)
f384e9c Fix test issues with MockRepository (original-brownbear, Jun 18, 2019)
79478cb Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 20, 2019)
bc1497e CR: comments (original-brownbear, Jun 20, 2019)
421553b CR: move container creation (original-brownbear, Jun 20, 2019)
b3478b5 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 21, 2019)
bbb9632 Remute test (original-brownbear, Jun 21, 2019)
26df577 CR: comments (original-brownbear, Jun 21, 2019)
7617660 Merge remote-tracking branch 'elastic/master' into allow-listing-dire… (original-brownbear, Jun 21, 2019)
Files changed
URLBlobContainer.java
@@ -20,6 +20,7 @@
package org.elasticsearch.common.blobstore.url;

import org.elasticsearch.common.SuppressForbidden;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.support.AbstractBlobContainer;
@@ -74,6 +75,11 @@ public Map<String, BlobMetaData> listBlobs() throws IOException {
throw new UnsupportedOperationException("URL repository doesn't support this operation");
}

@Override
public Map<String, BlobContainer> children() throws IOException {
throw new UnsupportedOperationException("URL repository doesn't support this operation");
}

/**
* This operation is not supported by URLBlobContainer
*/
AzureBlobContainer.java
@@ -24,6 +24,7 @@
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.support.AbstractBlobContainer;
@@ -135,6 +136,15 @@ public Map<String, BlobMetaData> listBlobs() throws IOException {
return listBlobsByPrefix(null);
}

@Override
public Map<String, BlobContainer> children() throws IOException {
try {
return blobStore.children(path());
} catch (URISyntaxException | StorageException e) {
throw new IOException("Failed to list children.", e);
}
}

protected String buildKey(String blobName) {
return keyPath + (blobName == null ? "" : blobName);
}
AzureBlobStore.java
@@ -21,7 +21,6 @@

import com.microsoft.azure.storage.LocationMode;
import com.microsoft.azure.storage.StorageException;

import org.elasticsearch.cluster.metadata.RepositoryMetaData;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
@@ -33,6 +32,8 @@
import java.io.InputStream;
import java.net.URISyntaxException;
import java.nio.file.FileAlreadyExistsException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import static java.util.Collections.emptyMap;
@@ -103,6 +104,13 @@ public Map<String, BlobMetaData> listBlobsByPrefix(String keyPath, String prefix
return service.listBlobsByPrefix(clientName, container, keyPath, prefix);
}

public Map<String, BlobContainer> children(BlobPath path) throws URISyntaxException, StorageException {
return Collections.unmodifiableMap(service.children(clientName, container, path).stream().collect(
HashMap::new,
(existing, name) -> existing.put(name, new AzureBlobContainer(path.add(name), this)),
HashMap::putAll));
}

public void writeBlob(String blobName, InputStream inputStream, long blobSize, boolean failIfAlreadyExists)
throws URISyntaxException, StorageException, FileAlreadyExistsException {
service.writeBlob(this.clientName, container, blobName, inputStream, blobSize, failIfAlreadyExists);
AzureStorageService.java
@@ -38,6 +38,7 @@
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.message.ParameterizedMessage;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.support.PlainBlobMetaData;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.common.settings.Settings;
@@ -54,7 +55,9 @@
import java.security.InvalidKeyException;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

import static java.util.Collections.emptyMap;
@@ -241,6 +244,30 @@ public Map<String, BlobMetaData> listBlobsByPrefix(String account, String contai
return Map.copyOf(blobsBuilder);
}

public Set<String> children(String account, String container, BlobPath path) throws URISyntaxException, StorageException {
final var blobsBuilder = new HashSet<String>();
final EnumSet<BlobListingDetails> enumBlobListingDetails = EnumSet.of(BlobListingDetails.METADATA);
final Tuple<CloudBlobClient, Supplier<OperationContext>> client = client(account);
final CloudBlobContainer blobContainer = client.v1().getContainerReference(container);
final String keyPath = path.buildAsString();
SocketAccess.doPrivilegedVoidException(() -> {
for (final ListBlobItem blobItem : blobContainer.listBlobs(keyPath, false,
enumBlobListingDetails, null, client.v2().get())) {
// in a non-flat listing only directory items represent child containers; plain blobs at this level are not children
if (blobItem instanceof CloudBlobDirectory) {
final URI uri = blobItem.getUri();
logger.trace(() -> new ParameterizedMessage("blob url [{}]", uri));
// uri.getPath is of the form /container/keyPath.* and we want to strip off the /container/
// this requires 1 + container.length() + 1, with each 1 corresponding to one of the /
final String blobPath = uri.getPath().substring(1 + container.length() + 1);
// strip the keyPath prefix and the trailing '/' to get the child container's name
final String name = blobPath.substring(keyPath.length(), blobPath.length() - 1);
logger.trace(() -> new ParameterizedMessage("blob url [{}], name [{}]", uri, name));
blobsBuilder.add(name);
}
}
});
return Set.copyOf(blobsBuilder);
}
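To make the substring arithmetic above concrete, here is a small standalone walk-through; the container and index names are made up for illustration and are not part of the change:

public class AzurePathStrippingExample {
    public static void main(String[] args) {
        final String container = "backups";
        final String keyPath = "indices/";
        // uri.getPath() of a child directory under this container's path
        final String uriPath = "/backups/indices/my-index/";
        // strip the leading "/" and the "container/" segment: 1 + container.length() + 1
        final String blobPath = uriPath.substring(1 + container.length() + 1); // "indices/my-index/"
        // strip the keyPath prefix and the trailing "/" to get the child's name
        final String name = blobPath.substring(keyPath.length(), blobPath.length() - 1); // "my-index"
        System.out.println(name);
    }
}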

public void writeBlob(String account, String container, String blobName, InputStream inputStream, long blobSize,
boolean failIfAlreadyExists)
throws URISyntaxException, StorageException, FileAlreadyExistsException {
GoogleCloudStorageBlobContainer.java
@@ -19,6 +19,7 @@

package org.elasticsearch.repositories.gcs;

import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.BlobStoreException;
@@ -55,6 +56,11 @@ public Map<String, BlobMetaData> listBlobs() throws IOException {
return blobStore.listBlobs(path);
}

@Override
public Map<String, BlobContainer> children() throws IOException {
return blobStore.listChildren(path());
}

@Override
public Map<String, BlobMetaData> listBlobsByPrefix(String prefix) throws IOException {
return blobStore.listBlobsByPrefix(path, prefix);
GoogleCloudStorageBlobStore.java
@@ -30,7 +30,6 @@
import com.google.cloud.storage.Storage.BlobListOption;
import com.google.cloud.storage.StorageBatch;
import com.google.cloud.storage.StorageException;

import org.elasticsearch.common.SuppressForbidden;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
@@ -50,11 +49,11 @@
import java.nio.channels.WritableByteChannel;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.NoSuchFileException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;

@@ -132,13 +131,26 @@ Map<String, BlobMetaData> listBlobs(String path) throws IOException {
Map<String, BlobMetaData> listBlobsByPrefix(String path, String prefix) throws IOException {
final String pathPrefix = buildKey(path, prefix);
final MapBuilder<String, BlobMetaData> mapBuilder = MapBuilder.newMapBuilder();
SocketAccess.doPrivilegedVoidIOException(
() -> client().get(bucketName).list(BlobListOption.prefix(pathPrefix)).iterateAll().forEach(blob -> {
assert blob.getName().startsWith(path);
final String suffixName = blob.getName().substring(path.length());
mapBuilder.put(suffixName, new PlainBlobMetaData(suffixName, blob.getSize()));
}));
return mapBuilder.immutableMap();
}

Map<String, BlobContainer> listChildren(BlobPath path) throws IOException {
final String pathStr = path.buildAsString();
final MapBuilder<String, BlobContainer> mapBuilder = MapBuilder.newMapBuilder();
SocketAccess.doPrivilegedVoidIOException(
() -> client().get(bucketName).list(BlobListOption.currentDirectory(), BlobListOption.prefix(pathStr)).iterateAll()
.forEach(blob -> {
// only the directory placeholders returned by a currentDirectory() listing are children
if (blob.isDirectory()) {
assert blob.getName().startsWith(pathStr);
final String suffixName = blob.getName().substring(pathStr.length());
mapBuilder.put(suffixName, new GoogleCloudStorageBlobContainer(path.add(suffixName), this));
}
}));
return mapBuilder.immutableMap();
}
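The listing above hinges on BlobListOption.currentDirectory(), which makes GCS fold every key more than one level below the prefix into a synthetic directory placeholder whose isDirectory() returns true. A minimal standalone sketch of that behavior, assuming a hypothetical bucket name and prefix:

import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.Storage.BlobListOption;
import com.google.cloud.storage.StorageOptions;

public class GcsChildrenSketch {
    public static void main(String[] args) {
        final Storage storage = StorageOptions.getDefaultInstance().getService();
        // with keys "indices/foo/0/snap.dat" and "indices/bar/0/snap.dat" in the bucket,
        // this prints the placeholders "indices/foo/" and "indices/bar/"
        for (Blob blob : storage.list("my-bucket",
                BlobListOption.currentDirectory(), BlobListOption.prefix("indices/")).iterateAll()) {
            if (blob.isDirectory()) {
                System.out.println(blob.getName());
            }
        }
    }
}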

HdfsBlobContainer.java
@@ -25,6 +25,7 @@
import org.apache.hadoop.fs.Options.CreateOpts;
import org.apache.hadoop.fs.Path;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.fs.FsBlobContainer;
@@ -137,9 +138,9 @@ public void writeBlobAtomic(String blobName, InputStream inputStream, long blobS

@Override
public Map<String, BlobMetaData> listBlobsByPrefix(@Nullable final String prefix) throws IOException {
FileStatus[] files = store.execute(fileContext -> fileContext.util().listStatus(path,
path -> prefix == null || path.getName().startsWith(prefix)));
Map<String, BlobMetaData> map = new LinkedHashMap<>();
for (FileStatus file : files) {
map.put(file.getPath().getName(), new PlainBlobMetaData(file.getPath().getName(), file.getLen()));
}
@@ -151,6 +152,20 @@ public Map<String, BlobMetaData> listBlobs() throws IOException {
return listBlobsByPrefix(null);
}

@Override
public Map<String, BlobContainer> children() throws IOException {
FileStatus[] files = store.execute(fileContext -> fileContext.util().listStatus(path));
Map<String, BlobContainer> map = new LinkedHashMap<>();
for (FileStatus file : files) {
if (file.isDirectory()) {
final String name = file.getPath().getName();
map.put(name,
new HdfsBlobContainer(path().add(name), store, new Path(path, name), bufferSize, securityContext));
}
}
return Collections.unmodifiableMap(map);
}

/**
* Exists to wrap underlying InputStream methods that might make socket connections in
* doPrivileged blocks. This is due to the way that hdfs client libraries might open
S3BlobContainer.java
@@ -38,6 +38,7 @@
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.BlobStoreException;
@@ -231,6 +232,36 @@ public Map<String, BlobMetaData> listBlobs() throws IOException {
return listBlobsByPrefix(null);
}

@Override
public Map<String, BlobContainer> children() throws IOException {
try (AmazonS3Reference clientReference = blobStore.clientReference()) {
ObjectListing prevListing = null;
final var entries = new ArrayList<Map.Entry<String, BlobContainer>>();
while (true) {
ObjectListing list;
if (prevListing != null) {
final ObjectListing finalPrevListing = prevListing;
list = SocketAccess.doPrivileged(() -> clientReference.client().listNextBatchOfObjects(finalPrevListing));
} else {
// the "/" delimiter is what makes S3 return the child directories as common prefixes
final ListObjectsRequest listObjectsRequest =
new ListObjectsRequest().withBucketName(blobStore.bucket()).withPrefix(keyPath).withDelimiter("/");
list = SocketAccess.doPrivileged(() -> clientReference.client().listObjects(listObjectsRequest));
}
for (final String summary : list.getCommonPrefixes()) {
final String name = summary.substring(keyPath.length());
final BlobPath path = path().add(name);
entries.add(entry(name, blobStore.blobContainer(path)));
}
if (list.isTruncated()) {
prevListing = list;
} else {
break;
}
}
return Maps.ofEntries(entries);
} catch (final AmazonClientException e) {
throw new IOException("Exception when listing children", e);
Contributor: also mention current path here
}
}

private String buildKey(String blobName) {
return keyPath + blobName;
}
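For context on the loop above: S3 has no real directories, so listing with a delimiter of "/" is what folds keys one level below the prefix into common prefixes. A minimal sketch against the AWS SDK v1, with a hypothetical bucket name and prefix:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;

public class S3CommonPrefixSketch {
    public static void main(String[] args) {
        final AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // with keys "indices/foo/0/snap.dat" and "indices/bar/0/snap.dat" in the bucket,
        // this prints the common prefixes "indices/foo/" and "indices/bar/"
        final ObjectListing listing = s3.listObjects(new ListObjectsRequest()
                .withBucketName("my-bucket").withPrefix("indices/").withDelimiter("/"));
        for (String commonPrefix : listing.getCommonPrefixes()) {
            System.out.println(commonPrefix);
        }
    }
}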
BlobContainer.java
@@ -158,6 +158,16 @@ default void deleteBlobIgnoringIfNotExists(String blobName) throws IOException {
*/
Map<String, BlobMetaData> listBlobs() throws IOException;

/**
* Lists all child containers under this container. A child container is a container whose {@link #path()} is equal to
* this container's {@link #path()} with exactly one more path component appended.
* The path component by which a container and its child differ is referred to as the name of the child container below.
*
* @return Map of name of the child container to child container
* @throws IOException on failure to list child containers
*/
Map<String, BlobContainer> children() throws IOException;

Contributor: this reads a bit funnily, maybe rephrase
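As a rough illustration of this contract (not part of the change), a caller can combine path() and children() to walk an entire repository tree, which is what the PR's cleanup of unreferenced index directories builds on. The walker class below is hypothetical:

import org.elasticsearch.common.blobstore.BlobContainer;

import java.io.IOException;

public final class BlobTreeWalker {
    private BlobTreeWalker() {}

    // depth-first visit of a container and all of its descendants
    static void walk(BlobContainer container) throws IOException {
        System.out.println(container.path().buildAsString());
        for (BlobContainer child : container.children().values()) {
            walk(child);
        }
    }
}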

/**
* Lists all blobs in the container that match the specified prefix.
*
FsBlobContainer.java
@@ -20,6 +20,7 @@
package org.elasticsearch.common.blobstore.fs;

import org.elasticsearch.common.UUIDs;
import org.elasticsearch.common.blobstore.BlobContainer;
import org.elasticsearch.common.blobstore.BlobMetaData;
import org.elasticsearch.common.blobstore.BlobPath;
import org.elasticsearch.common.blobstore.support.AbstractBlobContainer;
@@ -73,6 +74,22 @@ public Map<String, BlobMetaData> listBlobs() throws IOException {
return listBlobsByPrefix(null);
}

@Override
public Map<String, BlobContainer> children() throws IOException {
// If we get duplicate files we should just take the last entry

Contributor: why would we expect duplicates?
Member Author: Good question, for some reason I just mindlessly copied that comment from the list by prefix method. It makes 0 sense imo, I'll just remove it.

Map<String, BlobContainer> builder = new HashMap<>();
try (DirectoryStream<Path> stream = Files.newDirectoryStream(path)) {
for (Path file : stream) {
final BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
if (attrs.isDirectory()) {

Contributor: why not just use Files.isDirectory(file)?
Member Author: jup ... this was just a left over from copying the logic of list by prefix, adjusting :)

final String name = file.getFileName().toString();
builder.put(name, new FsBlobContainer(blobStore, path().add(name), file));
}
}
}
return unmodifiableMap(builder);
}

@Override
public Map<String, BlobMetaData> listBlobsByPrefix(String blobNamePrefix) throws IOException {
// If we get duplicate files we should just take the last entry