
Conversation

karmel (Contributor) commented Mar 10, 2018

These changes bring us to 76+% accuracy and also improve performance, especially in the 8-GPU case. Notable differences in training:

  • Use the bounding boxes supplied in the ImageNet dataset (accuracy)
  • Use the sample_distorted_bounding_box op instead of creating a random box for cropping during training (speed)
  • Use the fused op for decoding and cropping instead of two separate ops (speed)
  • The fused decoding requires that we resize after cropping

Eval is the same as before, but cleaned up slightly now that those functions are no longer shared with training. The new training crop path is sketched below.
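For readers following along, here is a minimal sketch of the crop path described in the bullets above, using the TF 1.x image ops they name; the thresholds, crop-window math, and the 224x224 output size are illustrative defaults, not necessarily the exact values in this PR:

```python
import tensorflow as tf

def _train_crop_sketch(image_buffer, bbox):
  """image_buffer: raw JPEG bytes; bbox: [1, num_boxes, 4] normalized coords."""
  # Sample a distorted crop box that respects the ImageNet-supplied bounding box.
  bbox_begin, bbox_size, _ = tf.image.sample_distorted_bounding_box(
      tf.image.extract_jpeg_shape(image_buffer),
      bounding_boxes=bbox,
      min_object_covered=0.1,
      aspect_ratio_range=[0.75, 1.33],
      area_range=[0.05, 1.0],
      max_attempts=100,
      use_image_if_no_bounding_boxes=True)

  # Convert the sampled box into the crop window expected by the fused op.
  offset_y, offset_x, _ = tf.unstack(bbox_begin)
  target_height, target_width, _ = tf.unstack(bbox_size)
  crop_window = tf.stack([offset_y, offset_x, target_height, target_width])

  # Fused decode + crop: only the cropped region of the JPEG is decoded.
  cropped = tf.image.decode_and_crop_jpeg(image_buffer, crop_window, channels=3)

  # Because cropping happens at decode time, the resize comes last.
  return tf.image.resize_images(cropped, [224, 224])
```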

A screenshot from a 4-GPU run is below.

FYI @reedwm @bignamehyp @scott7z @robieta

[Screenshot: 4-GPU training run, 2018-03-09 4:45 PM]

karmel requested review from k-w-w and nealwu on March 10, 2018 at 00:46
googlebot commented

So there's good news and bad news.

👍 The good news is that everyone who needs to sign a CLA (the pull request submitter and all commit authors) has done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

nealwu (Contributor) left a comment


It looks like you're mainly swapping in code from https://github.com/tensorflow/models/blob/master/research/inception/inception/image_processing.py, right? Would it work better to just switch to that file?

Also, this PR ended up somehow including a commit from https://github.com/MTDzi, which I don't think you intended.

On this docstring excerpt:

"The output of the build_image_data.py image preprocessing script is a dataset containing serialized Example protocol buffers. Each Example proto contains the following fields: image/encoded (a JPEG-encoded string) and image/class/label (int)"

The numbers below are example values, right? Could we specify that in this comment?
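
For context on how a consumer reads those two fields back, here is a minimal sketch using the TF 1.x parsing API; the feature spec and default values are assumptions for illustration, not the repo's actual parser:

```python
import tensorflow as tf

def _parse_record_sketch(serialized_example):
  # Only the two fields quoted above; real records carry more features.
  features = tf.parse_single_example(
      serialized_example,
      features={
          'image/encoded': tf.FixedLenFeature([], tf.string, default_value=''),
          'image/class/label': tf.FixedLenFeature([], tf.int64, default_value=-1),
      })
  image = tf.image.decode_jpeg(features['image/encoded'], channels=3)
  label = tf.cast(features['image/class/label'], tf.int32)
  return image, label
```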

nealwu (Contributor) commented Mar 13, 2018

Regardless, nice work on 76+% accuracy!

karmel (Contributor, Author) commented Mar 13, 2018

As I understand it, there are several preprocessing names used colloquially: "inception," "resnet," and "vgg." Inception preprocessing does more color distortion, which can improve accuracy but slows down processing. ResNet preprocessing is what we now have here, pulled largely from tf_cnn_benchmarks but with fewer flags and options for simplicity. VGG preprocessing is similar, except it does an aspect-preserving resize and uses no bounding boxes. The linked Inception preprocessing has the aforementioned color distortion, no fused op, etc., so it's a little more complex. Arguably we might want to rename the vgg file at this point, but for now it's probably okay to leave.
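
To make the VGG-vs-ResNet distinction concrete, here is a minimal sketch of an aspect-preserving resize (the VGG-style step mentioned above); the function name and the 256-pixel shorter side are illustrative assumptions, not taken from the repo:

```python
import tensorflow as tf

def _aspect_preserving_resize_sketch(image, smallest_side=256):
  # Scale so the shorter edge equals smallest_side while keeping the aspect ratio.
  shape = tf.shape(image)
  height = tf.cast(shape[0], tf.float32)
  width = tf.cast(shape[1], tf.float32)
  target = tf.constant(smallest_side, tf.float32)
  scale = tf.cond(tf.greater(height, width),
                  lambda: target / width,
                  lambda: target / height)
  new_height = tf.cast(tf.round(height * scale), tf.int32)
  new_width = tf.cast(tf.round(width * scale), tf.int32)
  return tf.image.resize_images(image, tf.stack([new_height, new_width]))
```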

Pulling in the random commits was my attempt to squash a bunch of debugging commits and a master merge, but that ended in confusion and regret, as git rebasing always seems to :/ The changes themselves are limited to the intended set, though; only the commit messages got weirdly included.

Comment updated; thanks for the review.

nealwu (Contributor) left a comment


I'm in favor of renaming to resnet_preprocessing, as right now the VGG name and the docstring at the top of the file aren't quite accurate for what the code actually does.

Also, we should definitely fix up the commit situation before merging. I would suggest just taking a git diff and writing that to a patch file, then hard resetting to the same state as master and git apply-ing your patch file to create a single commit. Then you can force push to overwrite this branch so that only your single commit is present.

karmel merged commit 86b1f07 into master on Mar 13, 2018
karmel deleted the fix/perf-tune-2 branch on March 13, 2018 at 21:48