Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cat: Ignore Stat() error and pass to Get() #2166

Merged
merged 2 commits into from Jul 4, 2017

Conversation

vadmeste
Copy link
Member

@vadmeste vadmeste commented May 31, 2017

client-s3.Get() calls Stat() method because Get() is supposed to return
object metadata. This PR makes a change in client interface so metadata
is returned in Stat() method. This is beneficial for us to avoid calling
HEAD method in S3 when run mc cat

Fixes #2156

@vadmeste vadmeste changed the title client: Avoid returning metadata with Get() [WIP] client: Avoid returning metadata with Get() May 31, 2017
@codecov-io
Copy link

codecov-io commented Jun 2, 2017

Codecov Report

Merging #2166 into master will decrease coverage by 0.04%.
The diff coverage is 9.67%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #2166      +/-   ##
=========================================
- Coverage    8.91%   8.87%   -0.05%     
=========================================
  Files          92      92              
  Lines        6863    6865       +2     
=========================================
- Hits          612     609       -3     
- Misses       6119    6125       +6     
+ Partials      132     131       -1
Impacted Files Coverage Δ
cmd/cat-main.go 0% <0%> (ø) ⬆️
cmd/pipe-main.go 0% <0%> (ø) ⬆️
cmd/common-methods.go 0% <0%> (ø) ⬆️
cmd/client-s3.go 13.85% <18.18%> (-0.3%) ⬇️
cmd/client-fs.go 29.06% <50%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be33703...6b6b624. Read the comment docs.

@vadmeste vadmeste changed the title [WIP] client: Avoid returning metadata with Get() client: Avoid returning metadata with Get() Jun 2, 2017
@vadmeste
Copy link
Member Author

vadmeste commented Jun 2, 2017

Not WIP anymore. Ready for testing and reviewing and discussing design change of client interface.

Copy link
Member

@harshavardhana harshavardhana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the approach is fine, what needs to be done is a full work perhaps in the next PR optimizing all the areas to avoid double calls like these.

cmd/client-fs.go Outdated
@@ -451,7 +451,7 @@ func (f *fsClient) Copy(source string, size int64, progress io.Reader) *probe.Er

// get - get wrapper returning reader and additional metadata if any.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on additional metadata is superfluous with this change, please remove it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on additional metadata is superfluous with this change, please remove it.

Done

cmd/client-s3.go Outdated
}
objectMetadata.URL = *c.targetURL
objectMetadata.Time = objectStat.LastModified
objectMetadata.Size = objectStat.Size
objectMetadata.Type = os.FileMode(0664)
return objectMetadata, nil

metadata := objectStat.Metadata
Copy link
Member

@krisis krisis Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If metadata can be constructed from *clientContent then why should we return it separately? Also, we ignore the returned metadatain most (other) places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If metadata can be constructed from *clientContent

Not sure what do you mean, clientContent is defined in mc and metadata is returned from minio-go. so metadata is not constructed from clientContent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't​ it possible to construct a map[string]string (metadata) from objectMetadata even outside this function?

Copy link
Member Author

@vadmeste vadmeste Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting that we should merge the returned metadata with objectMetadata instead of returning it separately ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. IIUC, we don't use metadata everywhere. So, I am suggesting we construct map[string] string wherever required with corresponding values found in objectMetadata from the Stat method. Does that make sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. IIUC, we don't use metadata everywhere. So, I am suggesting we construct map[string] string wherever required with corresponding values found in objectMetadata from the Stat method. Does that make sense?

Yes, I agree.

So the client interface will be changed from:

 40 type Client interface {
 41         // Common operations
 42         Stat(isIncomplete bool) (content *clientContent, metadata map[string][]string, err *probe.Error)

To

 40 type Client interface {
 41         // Common operations
 42         Stat(isIncomplete bool) (content *clientContent, err *probe.Error)

and clientContent will have new Metadata additional field:

 75 type clientContent struct {
 76         URL  clientURL
 77         Time time.Time
 78         Size int64
 79         Type os.FileMode
            Metadata map[string][]string
 80         Err  *probe.Error
 81 }

@abperiasamy, @harshavardhana what do you think ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes clientContent can have the metadata which is completely fine @vadmeste and it is the right place IMHO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes clientContent can have the metadata which is completely fine @vadmeste and it is the right place IMHO.

Okay, will do.

client-s3.Get() calls Stat() method because Get() is supposed to return
object metadata. This PR makes a change in client interface so metadata
is returned in Stat() method. This is beneficial for us to avoid calling
HEAD method in S3 when run `mc cat`
In case of cache gateway, HEAD objects returns 404 even though
objects are gettable. This PR ignores Stat() error since its
purpose in case of cat is to check if the download object size
is equal to what the server announced.
@vadmeste vadmeste changed the title client: Avoid returning metadata with Get() cat: Ignore Stat() error and pass to Get() Jun 5, 2017
@vadmeste
Copy link
Member Author

vadmeste commented Jun 5, 2017

I made further changes, can you review again ? this PR was partly fixing the issue that I wanted to address. But now cat command is also changed.

@deekoder, can you block this PR? because it cann't be fully tested unless Minio cache gateway feature gets merged.

@harshavardhana
Copy link
Member

ping @krisis anything else from your end on this PR?

// Ignore size for filesystem objects since os.Stat() would not
// return proper size all the time, for example with /proc files.
if client.GetURL().Type == objectStorage {
if err == nil && client.GetURL().Type == objectStorage {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we return immediately if err != nil from url2Stat. Why are we ignoring the error in this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we return immediately if err != nil from url2Stat. Why are we ignoring the error in this change?

In the first place, url2Stat() is only needed to fetch the size of the remote object so we can check that we downloaded all the object data from the server. So actually, testing on url2Stat() is not crucial since we call getSourceStreamFromURL() later.

But why we ignore that error now? it is because we need to enhance support of mc with the new minio cache gateway. In some cases, with the cache gateway, a HEAD object returns 404 but GET can return 200 with object data. Refer to #2156 or ping me online.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadmeste, OK. I am withholding approval of PR pending test. Otherwise, the change looks good to me.

@@ -894,12 +884,14 @@ func (c *s3Client) Stat(isIncomplete bool) (*clientContent, *probe.Error) {
objectMetadata.Time = objectStat.LastModified
objectMetadata.Size = objectStat.Size
objectMetadata.Type = os.FileMode(0664)
objectMetadata.Metadata = map[string][]string{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why should we try to find the given object by listing the objects in the bucket for fetching stat information? We can directly call c.api.StatObject(bucket, object), right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krisis, Stat() in client-s3 is supposed to be able to Stat() a directory but StatObject() of minio-go cannot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadmeste Ah, didn't see that. Thanks for the explanation.

@vadmeste
Copy link
Member Author

Can we allow this to be merged? reviewers can mark this PR reviewed and not tested

Copy link
Member

@krisis krisis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not tested, but changes LGTM.

@deekoder
Copy link
Contributor

I am not sure if I should merge it if its not tested.

@vadmeste
Copy link
Member Author

I am not sure if I should merge it if its not tested.

This PR was open for a long time, we can't test because Minio cache is not ready yet, also by time we will have to rebase it many times.

Also I'll have less opened issues in my buzz list if we merge this :)

@krisis
Copy link
Member

krisis commented Jun 29, 2017

@deekoder Tested locally (using mc cat --debug) and works as expected. Like @vadmeste mentioned, this change can't be tested in the context of edge cache.

@harshavardhana
Copy link
Member

@deekoder Tested locally (using mc cat --debug) and works as expected. Like @vadmeste mentioned, this change can't be tested in the context of edge cache.

IMO its a good change to have anycase.

@deekoder
Copy link
Contributor

Since its approved by the reviewers, can i merge it? Its not clear whether testability was an issue?

@vadmeste
Copy link
Member Author

Since its approved by the reviewers, can i merge it?

Yes you can.

Its not clear whether testability was an issue?

Because Minio cache is not ready yet.. reviewers checked debug log to confirm it is working as expected

@deekoder deekoder merged commit 1b9bead into minio:master Jul 4, 2017
@vadmeste vadmeste deleted the issues/2156 branch May 29, 2018 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants