Implement backblaze-b2 gateway support #5002
26f2f35
to
6634afa
Compare
Codecov Report
@@ Coverage Diff @@
## master #5002 +/- ##
==========================================
- Coverage 62.36% 61.21% -1.16%
==========================================
Files 190 192 +2
Lines 27756 28320 +564
==========================================
+ Hits 17310 17336 +26
- Misses 9209 9740 +531
- Partials 1237 1244 +7
Continue to review full report at Codecov.
54fc3aa
to
a2bf27b
Compare
There is a consistent, unrelated failure on AppVeyor; please ignore the Windows error for now.
I am talking to AppVeyor support at the moment.
a2bf27b
to
3b19f12
Compare
Looks like this PR pushes us beyond the 32K limit on the Windows shell (PATH_MAX), see golang/go#18468. As a workaround, I will try merging some files to reduce the overall length and see if that fixes it.
482a29f
to
21c1275
Compare
Was able to fix it by merging some files.
dec5bb7
to
ef3ff30
Compare
cmd/gateway-s3.go
Outdated
@@ -134,8 +135,9 @@ func newS3Gateway(host string) (GatewayLayer, error) {
	anonClient.SetCustomTransport(newCustomHTTPTransport())

	return &s3Objects{
		Client: client,
		anonClient: anonClient,
		unsupportedAPIs: unsupportedAPIs{},
`unsupportedAPIs: unsupportedAPIs{},` can be left out
cmd/gateway-unsupported.go
Outdated
@@ -16,27 +16,30 @@

package cmd

type unsupportedAPIs struct{}
can be renamed to the more specific `gatewayUnsupported`
ef3ff30
to
4815c9a
Compare
cmd/gateway-b2.go
Outdated
}
// startPartNumber must be in the range 1 - 10000 for B2.
if partNumberMarker == 0 {
	partNumberMarker = 1
This will cause the list response to always skip the first part.
It won't, @krishnasrinivas. Not in the case of Backblaze.
`startPartNumber` includes part 1 and later; it is not the same style as S3.
In that case we need to do `partNumberMarker++` always, because:
krishna@escape:~/dev/minio-tile-2$ miniodebug multipart listparts --bucket krishna1980 --object testobject --uploadid 4_z1efea6cee041ea395aeb021f_f205a55e6b4215c82_d20171003_m202115_c001_v0001093_t0023 --partmarker 2
{
"Bucket": "krishna1980",
"Key": "testobject",
"UploadID": "4_z1efea6cee041ea395aeb021f_f205a55e6b4215c82_d20171003_m202115_c001_v0001093_t0023",
"Initiator": {
"ID": "02d6176db174dc93cb1b899f7c6078f08654445fe8cf1b6ce98d8855f66bdbf4",
"DisplayName": ""
},
"Owner": {
"DisplayName": "",
"ID": "02d6176db174dc93cb1b899f7c6078f08654445fe8cf1b6ce98d8855f66bdbf4"
},
"StorageClass": "STANDARD",
"PartNumberMarker": 2,
"NextPartNumberMarker": 0,
"MaxParts": 1000,
"IsTruncated": false,
"ObjectParts": [
{
"PartNumber": 2,
"LastModified": "0001-01-01T00:00:00Z",
"ETag": "\"5da54579c335464b6ba52090bf57d063384bb1e9\"",
"Size": 2461
},
{
"PartNumber": 3,
"LastModified": "0001-01-01T00:00:00Z",
"ETag": "\"5da54579c335464b6ba52090bf57d063384bb1e9\"",
"Size": 2461
}
],
"EncodingType": ""
}
Oh yes, you are right, @krishnasrinivas.
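The exchange above comes down to an off-by-one difference: S3's `part-number-marker` is exclusive (only parts with higher numbers are listed), while the output above shows B2 including the start part number itself. A minimal sketch of the translation, assuming a hypothetical helper name `b2StartPartNumber` that is not part of the PR:

```go
package main

import "fmt"

// b2StartPartNumber is a hypothetical helper (not from the PR). It converts an
// S3-style exclusive part-number marker into an inclusive B2 start part number.
func b2StartPartNumber(partNumberMarker int) int {
	// Marker 0 means "from the beginning", which maps to part 1.
	// Marker N means "after part N", which maps to part N+1.
	return partNumberMarker + 1
}

func main() {
	for _, marker := range []int{0, 2} {
		fmt.Printf("marker=%d start=%d\n", marker, b2StartPartNumber(marker))
	}
}
```

Incrementing unconditionally covers both the zero marker and any non-zero marker, which is why the special case for `partNumberMarker == 0` is not enough on its own.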
fa10005
to
07211fb
Compare
These are some initial comments.
cmd/gateway-b2-anonymous.go
Outdated
return fmt.Sprintf("%s%d-%d", byteRangePrefix, offset, offset+size-1)
}

// AnonGetObject - Get object anonymously
This comment doesn't say anything more than the name of the function. The comment could be `AnonGetObject - Get object using a plain GET request.` or something to that effect.
Sure
// Converts http Header into ObjectInfo. This function looks for all the
// standard Backblaze B2 headers to convert into ObjectInfo.
//
// Content-Length is converted to Size.
nit: we could order the headers in the comment such that the `X-Bz-...` headers appear together.
They are written in the order they are handled inside the code.
cmd/gateway-b2-anonymous.go
Outdated
return objInfo, err
}

info := make(map[string]string)
We could call this `userDefinedHdrs` instead.
Sure
cmd/gateway-b2-anonymous.go
Outdated
Bucket: bucket,
Name: object,
ContentType: header.Get("Content-Type"),
ModTime: time.Unix(timeStamp/1000, timeStamp%1000*1e6),
`ModTime` can be idiomatically computed like `time.Unix(0, 0).Add(timeStamp * time.Millisecond)`.
Yes, but why is your approach more idiomatic, @krisis?
It is idiomatic since it uses standard Go types like `time.Duration` to work with milliseconds. To work with the milliseconds available as a string in the header, we could do the following:
timeStamp, err := time.ParseDuration(header.Get("X-Bz-Upload-Timestamp") + "ms") // ParseDuration requires a unit suffix
if err != nil {
	return objInfo, err
}
modTime := time.Unix(0, 0).Add(timeStamp)
This way we use the appropriate Go types to work with time in milliseconds instead of dropping down to low-level types like `int64` or `int`.
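As a hedged illustration of the two conversions discussed in this thread, the sketch below parses a millisecond timestamp string and builds a `time.Time` both ways. It assumes the `X-Bz-Upload-Timestamp` value is a plain decimal count of milliseconds since the epoch; the helper names are hypothetical. Since `time.ParseDuration` only accepts strings that carry a unit suffix (such as `ms`), the sketch parses the number with `strconv.ParseInt` and converts it to a `time.Duration` explicitly.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// modTimeViaDuration (hypothetical name) converts epoch milliseconds to a
// time.Time by adding a time.Duration to the Unix epoch.
func modTimeViaDuration(ms string) (time.Time, error) {
	n, err := strconv.ParseInt(ms, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	return time.Unix(0, 0).Add(time.Duration(n) * time.Millisecond), nil
}

// modTimeViaUnix (hypothetical name) does the same conversion by splitting the
// value into whole seconds and leftover nanoseconds.
func modTimeViaUnix(ms string) (time.Time, error) {
	n, err := strconv.ParseInt(ms, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	return time.Unix(n/1000, (n%1000)*int64(time.Millisecond)), nil
}

func main() {
	a, errA := modTimeViaDuration("1507062075123")
	b, errB := modTimeViaUnix("1507062075123")
	if errA != nil || errB != nil {
		fmt.Println("parse error:", errA, errB)
		return
	}
	fmt.Println(a.UTC(), b.UTC(), a.Equal(b))
}
```

Both helpers yield the same instant, so the choice between them is about readability rather than behavior.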
cmd/gateway-b2-anonymous.go
Outdated
return objInfo, err
}
defer resp.Body.Close()
if resp.StatusCode != 200 && resp.StatusCode != 206 {
Do we expect partial content for a HEAD request?
Yes, if you set the Range header.
We aren't setting Range headers here. Should we still check?
Then we don't need it.
cmd/gateway-b2.go
Outdated
uploadID = params[2]
}

// Following code is a non-exhaustive list.
nit: `Following code is a non-exhaustive list` is out of place since there is no list of codes below.
cmd/gateway-b2.go
Outdated
// MakeBucket creates a new container on B2 backend.
func (l *b2Objects) MakeBucketWithLocation(bucket, location string) error {
	info := make(map[string]string)
	info["x-minio-bucket-createTime"] = UTCNow().Format(time.RFC1123)
`"x-minio-bucket-createTime"` should be a const.
Agreed
cmd/gateway-b2.go
Outdated
return b2ToObjectError(traceError(err), bucket)
}

// Bucket - bucket
This comment can be expanded.
Sure
cmd/gateway-b2.go
Outdated
}

// Looks for x-minio-bucket-createTime as part of bucket metadata, if key
// is absent then returns current UTC time instead. Returns error if
When the creation time is unavailable we should use a sentinel value like `time.Unix(0,0)` to indicate that we couldn't determine the actual time of creation. Returning the current time may be misleading.
`time.Unix(0,0)` is not a good idea because of the tests that expect CreateTime to always be within some acceptable boundary of time. The reason to use the current time here is to provide lastAccessedTime instead.
> time.Unix(0,0) is not a good idea because of the tests that expect CreateTime to always be within some acceptable boundary of time
Which tests are you referring to? If it is the unit tests, we can (and should) change them as fit. This suggestion is to avoid any confusion for the user, who knows that the bucket was created much earlier than the time we might show (i.e., the last-accessed time you mention above).
I am talking about the mint tests. Setting `time.Unix(0, 0)` will be more confusing to the user, IMO. I would prefer not to make this change and keep it the way it is. We can discuss this further, but I would like to avoid the epoch date.
For the most part, users using buckets from the minio gateway will not have any problems; B2 users aren't expecting this value anyway. So I chose to set this value to something more meaningful than epoch time.
As a matter of fact, perhaps I can remove `x-minio-bucket-createTime` and avoid such intelligence in the first place.
> Setting time.Unix(0, 0) will be more confusing to user IMO. I would prefer not make this
The idea behind `time.Unix(0,0)` is that it is a constant and doesn't change with time like `time.Now()`. We could choose a different constant `time.Time` value if that's the only issue.
> As a matter of fact perhaps i can remove the x-minio-bucket-createTime and avoid such intelligence in the first place.
What would the `InitiatedTime` value for `BucketInfo` be?
Nothing, just `time.Unix(0, 0)`. There is no such info available from the B2 API, so we don't return anything.
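To make the options in this thread concrete, here is a small sketch of a metadata lookup with a constant fallback. It assumes the bucket metadata is available as a `map[string]string` and that the value was written with `time.RFC1123`, as in the `MakeBucketWithLocation` snippet earlier; `bucketCreateTime` and `createTimeKey` are hypothetical names, and falling back to the epoch on a parse failure is a simplification made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// createTimeKey is a hypothetical constant for the metadata key under discussion.
const createTimeKey = "x-minio-bucket-createTime"

// bucketCreateTime returns the stored creation time, or the constant sentinel
// time.Unix(0, 0) when the key is missing or cannot be parsed.
func bucketCreateTime(info map[string]string) time.Time {
	v, ok := info[createTimeKey]
	if !ok {
		return time.Unix(0, 0)
	}
	t, err := time.Parse(time.RFC1123, v)
	if err != nil {
		return time.Unix(0, 0)
	}
	return t
}

func main() {
	withKey := map[string]string{createTimeKey: "Tue, 03 Oct 2017 20:21:15 UTC"}
	fmt.Println(bucketCreateTime(withKey).UTC())
	fmt.Println(bucketCreateTime(map[string]string{}).UTC())
}
```

Unlike `time.Now()`, the sentinel is stable across calls, which is the property the reviewer is asking for.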
3fb3748
to
9e1aeb1
Compare
The new change in this PR is that regular `PutObject()` does a multipart upload if the input stream size is bigger than `MinPartSize`. This ensures that we don't corrupt the namespace if there are errors during data transfer.
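A rough sketch of the size check described here, not the PR's actual code. `minPartSize` is a placeholder value chosen for illustration (the PR's `MinPartSize` may differ), and `useMultipart` and `partSizes` are hypothetical helpers.

```go
package main

import "fmt"

// minPartSize is an assumed threshold for this example only.
const minPartSize int64 = 5 * 1024 * 1024

// useMultipart reports whether an upload of the given size would take the
// multipart path, i.e. when it is strictly bigger than minPartSize.
func useMultipart(size int64) bool {
	return size > minPartSize
}

// partSizes splits size into chunks of at most partSize bytes; the final
// chunk holds the remainder.
func partSizes(size, partSize int64) []int64 {
	if partSize <= 0 {
		return nil
	}
	var sizes []int64
	for size > 0 {
		n := partSize
		if size < n {
			n = size
		}
		sizes = append(sizes, n)
		size -= n
	}
	return sizes
}

func main() {
	size := 2*minPartSize + 1024
	fmt.Println(useMultipart(size), partSizes(size, minPartSize))
}
```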
51e141f
to
6043296
Compare
ac1bf8e
to
36994a0
Compare
cmd/gateway-b2.go
Outdated
}
}

type hexEndReader struct {
Could be renamed to `b2Hasher`, as `hexEndReader` would be a generic name.
cmd/gateway-b2.go
Outdated
}

hr := newHexDigitsEndReader(data, data.Size())
sha1, err := fc.UploadPart(l.ctx, hr, "hex_digits_at_end", hr.Len(), partID)
`"hex_digits_at_end"` should be a const.
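For context, B2 lets an uploader send `hex_digits_at_end` in place of the SHA1 value and append the 40 hex digits of the checksum to the end of the body, so the digest can be computed while streaming. The sketch below illustrates that idea with a standalone reader. It is not the PR's `hexEndReader`; `sha1TailReader`, `newSHA1TailReader`, and `readAllWithTail` are hypothetical names.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"strings"
)

// sha1TailReader streams the wrapped reader and, once it is exhausted, emits
// the hex-encoded SHA1 of everything that was read.
type sha1TailReader struct {
	r    io.Reader
	h    hash.Hash
	tail *strings.Reader // set after the wrapped reader reaches EOF
}

func newSHA1TailReader(r io.Reader) *sha1TailReader {
	return &sha1TailReader{r: r, h: sha1.New()}
}

func (s *sha1TailReader) Read(p []byte) (int, error) {
	if s.tail != nil {
		return s.tail.Read(p)
	}
	n, err := s.r.Read(p)
	s.h.Write(p[:n]) // hash.Hash writes never return an error
	if err == io.EOF {
		s.tail = strings.NewReader(hex.EncodeToString(s.h.Sum(nil)))
		if n == 0 {
			return s.tail.Read(p)
		}
		return n, nil // the digest is served on the following calls
	}
	return n, err
}

// readAllWithTail is a small helper for trying the reader on a string.
func readAllWithTail(data string) (string, error) {
	b, err := io.ReadAll(newSHA1TailReader(strings.NewReader(data)))
	return string(b), err
}

func main() {
	out, err := readAllWithTail("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

With this scheme the transmitted length is the payload size plus 40 bytes, which is presumably why the snippet above passes `hr.Len()` rather than the raw data size.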
36994a0
to
c6bff39
Compare
cmd/gateway-b2-anonymous.go
Outdated
ContentType: header.Get("Content-Type"),
ModTime: time.Unix(0, 0).Add(time.Duration(timeStamp) * time.Millisecond),
Size: clen,
ETag: header.Get("X-Bz-Content-Sha1"),
Looks like fileID is the right candidate for ETag, because the functionality of `If-Match` is supported by `b2_download_file_by_id` and `b2_get_file_info`, both of which take a fileID; there is no SHA1 support for these calls.
6fa8891
to
7939c51
Compare
87524ff
to
2672aea
Compare
// Content-Type is converted to ContentType.
// X-Bz-Content-Sha1 is converted to ETag.
func headerToObjectInfo(bucket, object string, header http.Header) (objInfo ObjectInfo, err error) {
	clen, err := strconv.ParseInt(header.Get("Content-Length"), 10, 64)
Needs a comment mentioning what is going on here. E.g., `// Converting upload timestamp in milliseconds to a time.Time value for ObjectInfo.ModTime`.
done
cmd/gateway-b2-anonymous.go
Outdated
return objInfo, err
}

userMetadata := make(map[string]string)
A comment explaining why we're skipping headers that don't start with `X-Bz-Info` is needed.
Done
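One way to read the filter being discussed, as a sketch: B2 returns custom file info as `X-Bz-Info-*` response headers, so only those keys are treated as user metadata, while standard headers such as `Content-Length` are handled separately. `extractUserMetadata` is a hypothetical helper name, and stripping the prefix from the key is an assumption about the desired output.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// b2InfoPrefix is the canonical form of the prefix B2 uses for custom file info.
const b2InfoPrefix = "X-Bz-Info-"

// extractUserMetadata keeps only the X-Bz-Info-* headers and drops everything
// else, since other headers are not user-defined metadata.
func extractUserMetadata(header http.Header) map[string]string {
	meta := make(map[string]string)
	for key := range header {
		if !strings.HasPrefix(key, b2InfoPrefix) {
			continue // standard header, not user metadata
		}
		meta[strings.TrimPrefix(key, b2InfoPrefix)] = header.Get(key)
	}
	return meta
}

func main() {
	h := http.Header{}
	h.Set("Content-Type", "text/plain")
	h.Set("X-Bz-Info-Color", "blue")
	fmt.Println(extractUserMetadata(h))
}
```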
cmd/gateway-b2-anonymous.go
Outdated
return objInfo, err
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
Why are we returning a new error value, i.e. `errors.New(resp.Status)`, while returning the error as is in `AnonGetObject`?
What is the error that you would return as is?
cmd/gateway-b2.go
Outdated
return b2ToObjectError(traceError(err), bucket)
}

// Bucket - bucket
Could we name this function similar to other functions like `GetBucket`? Is this function required to be exported? Please add a comment too.
done
cmd/gateway-b2.go
Outdated
Name: file.Name,
ModTime: file.Timestamp,
Size: file.Size,
ETag: file.Info.SHA1,
Shouldn't `file.FileID` be returned for `ETag`?
done
cmd/gateway-b2.go
Outdated
Name: file.Name,
ModTime: file.Timestamp,
Size: file.Size,
ETag: file.Info.SHA1,
Shouldn't `file.FileID` be returned for `ETag`?
done
cmd/gateway-b2.go
Outdated
type B2Reader struct {
	r *HashReader
	size int64
	hsh hash.Hash
We can call it `sha1Sum` instead of `hsh`.
done
if err != nil {
	return b2ToObjectError(traceError(err), bucket, object)
}
io.Copy(ioutil.Discard, reader)
Can we not `Close` the reader without copying?
done
cmd/gateway-b2.go
Outdated
return lpi, nil
}

// AbortMultipartUpload aborts a ongoing multipart upload, uses B2's LargeFile upload API.
typo: s/a ongoing/an ongoing/
done
cmd/gateway-b2.go
Outdated
}
_, err = bkt.File(uploadID, object).CompileParts(0, hashes).FinishLargeFile(l.ctx)
if err != nil {
	return oi, err
Shouldn't we call `b2ToObjectError(traceError(err), ...)`?
done
f26bdf4
to
14e577c
Compare
"sync"
"time"

b2 "github.com/minio/blazer/base"
Why are we not using the "github.com/minio/blazer/b2" client, which has reauthorization built into it?
We need the low-level API, @krisis.
f1c3ac9
to
cb814ad
Compare
d379d42
to
de8231a
Compare
de8231a
to
274ca16
Compare
Description
Implement backblaze-b2 gateway support
Motivation and Context
Fixes #4072
How Has This Been Tested?
Manual testing and mint.
Types of changes
Checklist: