feat: auto content-type on blob creation #338

Merged: 8 commits, Jul 15, 2020
@@ -521,6 +521,15 @@ public static BlobTargetOption disableGzipContent() {
return new BlobTargetOption(StorageRpc.Option.IF_DISABLE_GZIP_CONTENT, true);
}

/**
* Returns an option for detecting content type. If this option is used, the content type is
* detected from the blob name if not explicitly set. This option is on the client side only, it
* does not appear in a RPC call.
*/
public static BlobTargetOption detectContentType() {
return new BlobTargetOption(StorageRpc.Option.DETECT_CONTENT_TYPE, true);
}

/**
* Returns an option to set a customer-supplied AES256 key for server-side encryption of the
* blob.
@@ -593,6 +602,7 @@ enum Option {
CUSTOMER_SUPPLIED_KEY,
KMS_KEY_NAME,
USER_PROJECT,
DETECT_CONTENT_TYPE,
IF_DISABLE_GZIP_CONTENT;

StorageRpc.Option toRpcOption() {
@@ -733,6 +743,15 @@ public static BlobWriteOption userProject(String userProject) {
public static BlobWriteOption disableGzipContent() {
return new BlobWriteOption(Option.IF_DISABLE_GZIP_CONTENT, true);
}

/**
* Returns an option for detecting content type. If this option is used, the content type is
* detected from the blob name if not explicitly set. This option is on the client side only, it
* does not appear in a RPC call.
*/
public static BlobWriteOption detectContentType() {
return new BlobWriteOption(Option.DETECT_CONTENT_TYPE, true);
}
}

/** Class for specifying blob source options. */
@@ -1832,9 +1851,10 @@ public static Builder newBuilder() {
* Creates a new blob. Direct upload is used to upload {@code content}. For large content, {@link
* #writer} is recommended as it uses resumable upload. MD5 and CRC32C hashes of {@code content}
* are computed and used for validating transferred data. Accepts an optional userProject {@link
* BlobGetOption} option which defines the project id to assign operational costs.
* BlobGetOption} option which defines the project id to assign operational costs. The content
* type is detected from the blob name if not explicitly set.
*
* <p>Example of creating a blob from a byte array.
* <p>Example of creating a blob from a byte array:
*
* <pre>{@code
* String bucketName = "my-unique-bucket";
@@ -1857,7 +1877,7 @@ public static Builder newBuilder() {
* Accepts a userProject {@link BlobGetOption} option, which defines the project id to assign
* operational costs.
*
* <p>Example of creating a blob from a byte array.
* <p>Example of creating a blob from a byte array:
*
* <pre>{@code
* String bucketName = "my-unique-bucket";
@@ -1876,7 +1896,7 @@ Blob create(

/**
* Creates a new blob. Direct upload is used to upload {@code content}. For large content, {@link
* #writer} is recommended as it uses resumable upload. By default any md5 and crc32c values in
* #writer} is recommended as it uses resumable upload. By default any MD5 and CRC32C values in
* the given {@code blobInfo} are ignored unless requested via the {@code
* BlobWriteOption.md5Match} and {@code BlobWriteOption.crc32cMatch} options. The given input
* stream is closed upon success.
@@ -2603,11 +2623,11 @@ Blob createFrom(
ReadChannel reader(BlobId blob, BlobSourceOption... options);

/**
* Creates a blob and return a channel for writing its content. By default any md5 and crc32c
* Creates a blob and returns a channel for writing its content. By default any MD5 and CRC32C
* values in the given {@code blobInfo} are ignored unless requested via the {@code
* BlobWriteOption.md5Match} and {@code BlobWriteOption.crc32cMatch} options.
*
* <p>Example of writing a blob's content through a writer.
* <p>Example of writing a blob's content through a writer:
*
* <pre>{@code
* String bucketName = "my-unique-bucket";
@@ -77,9 +77,12 @@
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;
import java.net.FileNameMap;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class HttpStorageRpc implements StorageRpc {
@@ -98,6 +101,7 @@ public class HttpStorageRpc implements StorageRpc {
private final HttpRequestInitializer batchRequestInitializer;

private static final long MEGABYTE = 1024L * 1024L;
private static final FileNameMap FILE_NAME_MAP = URLConnection.getFileNameMap();

public HttpStorageRpc(StorageOptions options) {
HttpTransportOptions transportOptions = (HttpTransportOptions) options.getTransportOptions();
@@ -286,7 +290,7 @@ public StorageObject create(
.insert(
storageObject.getBucket(),
storageObject,
new InputStreamContent(storageObject.getContentType(), content));
new InputStreamContent(detectContentType(storageObject, options), content));
insert.getMediaHttpUploader().setDirectUploadEnabled(true);
Boolean disableGzipContent = Option.IF_DISABLE_GZIP_CONTENT.getBoolean(options);
if (disableGzipContent != null) {
@@ -372,6 +376,19 @@ public Tuple<String, Iterable<StorageObject>> list(final String bucket, Map<Opti
}
}

private static String detectContentType(StorageObject object, Map<Option, ?> options) {
String contentType = object.getContentType();
if (contentType != null) {
return contentType;
}

if (Boolean.TRUE == Option.DETECT_CONTENT_TYPE.get(options)) {
contentType = FILE_NAME_MAP.getContentTypeFor(object.getName().toLowerCase(Locale.ENGLISH));
}

return firstNonNull(contentType, "application/octet-stream");
}

private static Function<String, StorageObject> objectFromPrefix(final String bucket) {
return new Function<String, StorageObject>() {
@Override
@@ -834,9 +851,7 @@ public String open(StorageObject object, Map<Option, ?> options) {
HttpRequest httpRequest =
requestFactory.buildPostRequest(url, new JsonHttpContent(jsonFactory, object));
HttpHeaders requestHeaders = httpRequest.getHeaders();
requestHeaders.set(
"X-Upload-Content-Type",
firstNonNull(object.getContentType(), "application/octet-stream"));
requestHeaders.set("X-Upload-Content-Type", detectContentType(object, options));
String key = Option.CUSTOMER_SUPPLIED_KEY.getString(options);
if (key != null) {
BaseEncoding base64 = BaseEncoding.base64();
@@ -65,7 +65,8 @@ enum Option {
KMS_KEY_NAME("kmsKeyName"),
SERVICE_ACCOUNT_EMAIL("serviceAccount"),
SHOW_DELETED_KEYS("showDeletedKeys"),
REQUESTED_POLICY_VERSION("optionsRequestedPolicyVersion");
REQUESTED_POLICY_VERSION("optionsRequestedPolicyVersion"),
DETECT_CONTENT_TYPE("detectContentType");

private final String value;

@@ -3363,4 +3363,61 @@ public void testUploadWithEncryption() throws Exception {
byte[] readBytes = blob.getContent(Blob.BlobSourceOption.decryptionKey(KEY));
assertArrayEquals(BLOB_BYTE_CONTENT, readBytes);
}

private Blob createBlob(String method, BlobInfo blobInfo, boolean detectType) throws IOException {
switch (method) {
case "create":
return detectType
? storage.create(blobInfo, Storage.BlobTargetOption.detectContentType())
: storage.create(blobInfo);
case "createFrom":
InputStream inputStream = new ByteArrayInputStream(BLOB_BYTE_CONTENT);
return detectType
? storage.createFrom(blobInfo, inputStream, Storage.BlobWriteOption.detectContentType())
: storage.createFrom(blobInfo, inputStream);
case "writer":
if (detectType) {
storage.writer(blobInfo, Storage.BlobWriteOption.detectContentType()).close();
} else {
storage.writer(blobInfo).close();
}
return storage.get(BlobId.of(blobInfo.getBucket(), blobInfo.getName()));
default:
throw new IllegalArgumentException("Unknown method " + method);
}
}

private void testAutoContentType(String method) throws IOException {
String[] names = {"file1.txt", "dir with spaces/Pic.Jpg", "no_extension"};
Member:

Is there a way to infer the type based on bytes rather than file extension? That's how Golang does it.

Contributor Author:

It's not always possible. In the case of a resumable upload, the blob is created first and its content only becomes available later:

writer = storage.writer(blobInfo); // the blob is created and its content type is set here, before any content exists
writer.write(content);
writer.close();

I ran an experiment: I created a file PDF.txt containing PDF content, uploaded it as PDF.jpg with several clients, and checked the resulting content type. The results:

gsutil: text/plain
NodeJS: image/jpeg
Go: application/pdf
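
The disagreement between name-based and byte-based detection can be reproduced locally with a rough sketch like the one below. It is only an illustration: `SniffVersusName`, `fromBytes`, and `fromName` are hypothetical names, the byte check recognizes nothing but the `%PDF-` signature, and none of this is how gsutil, Node.js, or Go actually detect types. The name-based side uses the JDK's built-in extension table via `URLConnection.guessContentTypeFromName`.

```java
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

final class SniffVersusName {
  // Hypothetical helper: recognizes only the PDF signature "%PDF-".
  // Returns null for anything else; real sniffers know many more signatures.
  static String fromBytes(byte[] content) {
    byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    if (content.length < magic.length) {
      return null;
    }
    for (int i = 0; i < magic.length; i++) {
      if (content[i] != magic[i]) {
        return null;
      }
    }
    return "application/pdf";
  }

  // Name-based guess backed by the JDK's extension-to-type table.
  static String fromName(String name) {
    return URLConnection.guessContentTypeFromName(name);
  }

  public static void main(String[] args) {
    // A PDF payload uploaded under a misleading name, as in the experiment.
    byte[] pdf = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
    System.out.println("by name:  " + fromName("PDF.jpg"));
    System.out.println("by bytes: " + fromBytes(pdf));
  }
}
```

The two answers differ for the same upload, which is the inconsistency described above: the result depends on which signal a client happens to trust.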

To achieve consistent behavior across all clients, detection should be implemented on the server side, at the point when the upload finishes.

Until that is implemented, I suggest detecting the type from the file extension, as was recently done in Node.js (googleapis/nodejs-storage#1190).
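
For reference, the precedence this PR implements (explicit type first, then the extension-based guess if the option is set, then `application/octet-stream`) can be sketched without any storage client. This is a simplified stand-in for the `detectContentType` helper in the diff, not the library code itself: `ContentTypeResolver`, `resolve`, and the boolean `detect` flag are assumed names that replace the `StorageObject` and option map.

```java
import java.net.FileNameMap;
import java.net.URLConnection;
import java.util.Locale;

final class ContentTypeResolver {
  private static final FileNameMap FILE_NAME_MAP = URLConnection.getFileNameMap();
  private static final String DEFAULT_TYPE = "application/octet-stream";

  // Hypothetical stand-in: explicitType plays the role of the blob's own
  // content type, detect plays the role of the detectContentType() option.
  static String resolve(String explicitType, String blobName, boolean detect) {
    if (explicitType != null) {
      return explicitType; // an explicitly set type always wins
    }
    String detected = null;
    if (detect) {
      // Lower-case the name so that "Pic.Jpg" and "pic.jpg" resolve the same way.
      detected = FILE_NAME_MAP.getContentTypeFor(blobName.toLowerCase(Locale.ENGLISH));
    }
    return detected != null ? detected : DEFAULT_TYPE;
  }

  public static void main(String[] args) {
    String[] names = {"file1.txt", "dir with spaces/Pic.Jpg", "no_extension"};
    for (String name : names) {
      System.out.println(name + " -> " + resolve(null, name, true));
    }
    System.out.println("explicit -> " + resolve("custom/type", "file1.txt", true));
  }
}
```

The names in `main` are the same ones the integration test uses, so the sketch can be compared against the expected types asserted there.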

String[] types = {"text/plain", "image/jpeg", "application/octet-stream"};
for (int i = 0; i < names.length; i++) {
BlobId blobId = BlobId.of(BUCKET, names[i]);
BlobInfo blobInfo = BlobInfo.newBuilder(blobId).build();
Blob blob_true = createBlob(method, blobInfo, true);
assertEquals(types[i], blob_true.getContentType());

Blob blob_false = createBlob(method, blobInfo, false);
assertEquals("application/octet-stream", blob_false.getContentType());
}
String customType = "custom/type";
BlobId blobId = BlobId.of(BUCKET, names[0]);
BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType(customType).build();
Blob blob = createBlob(method, blobInfo, true);
assertEquals(customType, blob.getContentType());
}

@Test
public void testAutoContentTypeCreate() throws IOException {
testAutoContentType("create");
}

@Test
public void testAutoContentTypeCreateFrom() throws IOException {
testAutoContentType("createFrom");
}

@Test
public void testAutoContentTypeWriter() throws IOException {
testAutoContentType("writer");
}
}