New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cat: Ignore Stat() error and pass to Get() #2166
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2166 +/- ##
=========================================
- Coverage 8.91% 8.87% -0.05%
=========================================
Files 92 92
Lines 6863 6865 +2
=========================================
- Hits 612 609 -3
- Misses 6119 6125 +6
+ Partials 132 131 -1
Continue to review full report at Codecov.
|
Not WIP anymore. Ready for testing and reviewing and discussing design change of client interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the approach is fine, what needs to be done is a full work perhaps in the next PR optimizing all the areas to avoid double calls like these.
cmd/client-fs.go
Outdated
@@ -451,7 +451,7 @@ func (f *fsClient) Copy(source string, size int64, progress io.Reader) *probe.Er | |||
|
|||
// get - get wrapper returning reader and additional metadata if any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment on additional metadata is superfluous with this change, please remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment on additional metadata is superfluous with this change, please remove it.
Done
cmd/client-s3.go
Outdated
} | ||
objectMetadata.URL = *c.targetURL | ||
objectMetadata.Time = objectStat.LastModified | ||
objectMetadata.Size = objectStat.Size | ||
objectMetadata.Type = os.FileMode(0664) | ||
return objectMetadata, nil | ||
|
||
metadata := objectStat.Metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If metadata
can be constructed from *clientContent
then why should we return it separately? Also, we ignore the returned metadata
in most (other) places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If metadata can be constructed from *clientContent
Not sure what do you mean, clientContent
is defined in mc and metadata
is returned from minio-go. so metadata is not constructed from clientContent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it possible to construct a map[string]string
(metadata) from objectMetadata
even outside this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting that we should merge the returned metadata with objectMetadata
instead of returning it separately ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. IIUC, we don't use metadata everywhere. So, I am suggesting we construct map[string] string
wherever required with corresponding values found in objectMetadata
from the Stat method. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. IIUC, we don't use metadata everywhere. So, I am suggesting we construct map[string] string wherever required with corresponding values found in objectMetadata from the Stat method. Does that make sense?
Yes, I agree.
So the client interface will be changed from:
40 type Client interface {
41 // Common operations
42 Stat(isIncomplete bool) (content *clientContent, metadata map[string][]string, err *probe.Error)
To
40 type Client interface {
41 // Common operations
42 Stat(isIncomplete bool) (content *clientContent, err *probe.Error)
and clientContent will have new Metadata additional field:
75 type clientContent struct {
76 URL clientURL
77 Time time.Time
78 Size int64
79 Type os.FileMode
Metadata map[string][]string
80 Err *probe.Error
81 }
@abperiasamy, @harshavardhana what do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes clientContent can have the metadata which is completely fine @vadmeste and it is the right place IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes clientContent can have the metadata which is completely fine @vadmeste and it is the right place IMHO.
Okay, will do.
client-s3.Get() calls Stat() method because Get() is supposed to return object metadata. This PR makes a change in client interface so metadata is returned in Stat() method. This is beneficial for us to avoid calling HEAD method in S3 when run `mc cat`
In case of cache gateway, HEAD objects returns 404 even though objects are gettable. This PR ignores Stat() error since its purpose in case of cat is to check if the download object size is equal to what the server announced.
I made further changes, can you review again ? this PR was partly fixing the issue that I wanted to address. But now cat command is also changed. @deekoder, can you block this PR? because it cann't be fully tested unless Minio cache gateway feature gets merged. |
ping @krisis anything else from your end on this PR? |
// Ignore size for filesystem objects since os.Stat() would not | ||
// return proper size all the time, for example with /proc files. | ||
if client.GetURL().Type == objectStorage { | ||
if err == nil && client.GetURL().Type == objectStorage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, we return immediately if err != nil
from url2Stat
. Why are we ignoring the error in this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, we return immediately if err != nil from url2Stat. Why are we ignoring the error in this change?
In the first place, url2Stat() is only needed to fetch the size of the remote object so we can check that we downloaded all the object data from the server. So actually, testing on url2Stat() is not crucial since we call getSourceStreamFromURL() later.
But why we ignore that error now? it is because we need to enhance support of mc with the new minio cache gateway. In some cases, with the cache gateway, a HEAD object returns 404 but GET can return 200 with object data. Refer to #2156 or ping me online.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vadmeste, OK. I am withholding approval of PR pending test. Otherwise, the change looks good to me.
@@ -894,12 +884,14 @@ func (c *s3Client) Stat(isIncomplete bool) (*clientContent, *probe.Error) { | |||
objectMetadata.Time = objectStat.LastModified | |||
objectMetadata.Size = objectStat.Size | |||
objectMetadata.Type = os.FileMode(0664) | |||
objectMetadata.Metadata = map[string][]string{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why should we try to find the given object by listing the objects in the bucket for fetching stat information? We can directly call c.api.StatObject(bucket, object)
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@krisis, Stat() in client-s3 is supposed to be able to Stat() a directory but StatObject() of minio-go cannot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vadmeste Ah, didn't see that. Thanks for the explanation.
Can we allow this to be merged? reviewers can mark this PR reviewed and not tested |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not tested, but changes LGTM.
I am not sure if I should merge it if its not tested. |
This PR was open for a long time, we can't test because Minio cache is not ready yet, also by time we will have to rebase it many times. Also I'll have less opened issues in my buzz list if we merge this :) |
Since its approved by the reviewers, can i merge it? Its not clear whether testability was an issue? |
Yes you can.
Because Minio cache is not ready yet.. reviewers checked debug log to confirm it is working as expected |
client-s3.Get() calls Stat() method because Get() is supposed to return
object metadata. This PR makes a change in client interface so metadata
is returned in Stat() method. This is beneficial for us to avoid calling
HEAD method in S3 when run
mc cat
Fixes #2156