Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-116] object detector class added #10179

Closed
wants to merge 38 commits into from

Conversation

lanking520
Copy link
Member

@lanking520 lanking520 commented Mar 21, 2018

Description

Add Object Detection Class for MXNet Inference API, also add test and example to it. This code is fully tested with SSD model.
@nswamy @Roshrini

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • [X ] Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

@lanking520 lanking520 requested a review from yzhliu as a code owner March 21, 2018 00:11
@@ -0,0 +1,96 @@
package SSDClassifierExample
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add license and standard preamble docs as to what this does and how you use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the License, will add a short description to this example

* See the License for the specific language governing permissions and
* limitations under the License.
*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comments on what this does and how it is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add this part in the next day

* Takes input as NDArrays, useful when you want to perform multiple operations on
* the input Array or when you want to pass a batch of input.
* @param input: Indexed Sequence of NDArrays
* @param topK: (Optional) How many top_k(sorting will be based on the last axis)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after "k".

* the input Array or when you want to pass a batch of input.
* @param input: Indexed Sequence of NDArrays
* @param topK: (Optional) How many top_k(sorting will be based on the last axis)
* elements to return, if not passed returns unsorted output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comma after passed

}

/**
* Takes input as NDArrays, useful when you want to perform multiple operations on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switch comma to period and start new sentence.

class SSDClassifierExample {
@Option(name = "--model-dir", usage = "the input model directory")
private val modelPath: String = "/model"
@Option(name = "--model-prefix", usage = "The prefix of the model")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case The... unless you make the others all The

@Option(name = "--model-prefix", usage = "The prefix of the model")
private val modelPrefix: String = "/ssd_resnet50_512"
@Option(name = "--input-image", usage = "the input image")
private val inputImagePath: String = "/images/Cat-hd-wallpapers.jpg"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we stick to kitten.jpg since it's probably around in the other test artifacts?


import ml.dmlc.mxnet._
import ml.dmlc.mxnet.infer._
import org.kohsuke.args4j.{CmdLineParser, Option}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any comment why we need these things (next four lines)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used the ObjectDetection Class and ImageClassifier Object from mxnet.infer package. Do I need to import them separately in two lines or adding some comments in the code?

Will double check the functions being imported from mxnet._

Args4j used to get the user input args (such as '--model-dir'), same question here, do I need to add comments about this usage?

try {
println(mdDir)
val dType = DType.Float32
val inputShape = Shape(1, 3, 512, 512)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this being fetched from the signature.json file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add this section on 03/21

val dType = DType.Float32
val inputShape = Shape(1, 3, 512, 512)
// ssd detections, numpy.array([[id, score, x1, y1, x2, y2]...])
val outputShape = Shape(1, 6132, 6)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signature.json?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Member

@nswamy nswamy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job!.
Please make the changes requested.

class ObjectDetector(modelPathPrefix: String,
inputDescriptors: IndexedSeq[DataDesc])
extends Classifier(modelPathPrefix: String,
inputDescriptors: IndexedSeq[DataDesc]) {1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is 1 here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will kill that 1 in the next update


// Considering 'NCHW' as default layout when not provided
// Else get axis according to the layout
val batch = inputShape(if (inputLayout.indexOf('N')<0) 0 else inputLayout.indexOf('N'))
Copy link
Member

@nswamy nswamy Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should just require user to pass the input descriptor. Most models will have batch size in the input so thats the only thing you should default and force the user to pass CHW.
Also these are already checked in the Classifier constructor, you don't need to do it again here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that one is the ImageClassifier part, this is ObjectDetector, we need to do this part again. Will force user input CHW

val width = inputShape(if (inputLayout.indexOf('W')<0) 3 else inputLayout.indexOf('W'))

/**
* To classify the image according to the provided model
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To Detect bounding boxes and corresponding labels

}

/**
* Takes input as NDArrays. Useful when you want to perform multiple operations on
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Takes input images as NDArrays. Useful when you want to perform multiple operations on
the input Array, or when you want to pass a batch of input images.

r.dispose()
}
handler.execute(predictResult.dispose())
batchResult.toList.toIndexedSeq
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just do batchResult.toIndexedSeq

val predictResult: ListBuffer[Array[Float]] = ListBuffer[Array[Float]]()
val accuracy : ListBuffer[Float] = ListBuffer[Float]()

// iterating over the individual items(batch size is in axis 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are actually iterating over all results here

r.dispose()
}

var sortedIndices = accuracy.zipWithIndex.sortBy(-_._1).map(_._2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to sort if topK is not defined.

val op = NDArray.concatenate(imageBatch)
// printf("concatenateed shape %s", op.shape)
val result = objectDetectWithNDArray(IndexedSeq(op), topK)
handler.execute(op.dispose())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you also have to dispose NDArrays created in imageBatch


val result = sortedIndices.map(idx
=> (synset(predictResult(idx)(0).toInt),
predictResult(idx).takeRight(5))).toList
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment on Why takeRight(5)

sortedIndices = sortedIndices.take(topK.get)
}

val result = sortedIndices.map(idx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again no need sort if topK is not defined.

*/
class ObjectDetector(modelPathPrefix: String,
inputDescriptors: IndexedSeq[DataDesc])
extends Classifier(modelPathPrefix: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does not feel like ObjectDetector "is a" Classifier, this should be a composition rather than inhertiance

wget https://cloud.githubusercontent.com/assets/3307514/20012566/cbb53c76-a27d-11e6-9aaa-91939c9a1cd5.jpg -O 000001.jpg
wget https://cloud.githubusercontent.com/assets/3307514/20012567/cbb60336-a27d-11e6-93ff-cbc3f09f5c9e.jpg -O dog.jpg
wget https://cloud.githubusercontent.com/assets/3307514/20012563/cbb41382-a27d-11e6-92a9-18dab4fd1ad3.jpg -O person.jpg
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the return char


java -Xmx8G -cp $CLASS_PATH \
ml.dmlc.mxnetexamples.inferexample.objectdetector.SSDClassifierExample \
--model-dir $MODEL_DIR \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you just take model-path-prefix instead of two different directories ?


// Considering 'NCHW' as default layout when not provided
// Else get axis according to the layout
val batch = inputShape(if (inputLayout.indexOf('N')<0) 0 else inputLayout.indexOf('N'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably can access all the below variables from the ImageClassifier
Also no need for the if else


val result = objectDetectWithNDArray(IndexedSeq(op), topK)
handler.execute(op.dispose())
for (ele <- imageBatch) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The beauty of Scala is you can express statements in concise format like this:
imageBatch.foreach(_.dispose())

@aaronmarkham
Copy link
Contributor

I think that you should have the examples work from a common directory.
scala-package/examples/scripts/inferexample/models/ and
scala-package/examples/scripts/inferexample/images/
Also use the exact prefix for naming the model subfolder:
/resnet-152/ and
resnet50_ssd
Rather than have the models being saved inconsistently in your example subfolder. SSD is using model and imageclassifer is using resnet.
You want to give the user the option of reusing the models and experimenting with different ones.

@Roshrini
Copy link
Member

@lanking520 we can put them in a folder like this:
models/ssd/
models/resnet-152/

@lanking520
Copy link
Member Author

Yes, Will update that


test("objectDetectWithInputImage") {
val inputDescriptor = IndexedSeq[DataDesc](new DataDesc(modelPath, Shape(1, 3, 512, 512)))
val inputImage = new BufferedImage(224, 224, BufferedImage.TYPE_INT_RGB)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are these hardcode 224 size for? I don't think object detection works on such resolution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. This image is a sample image feed into the unit test. Made the change.

@aaronmarkham
Copy link
Contributor

Docs GTG for now, but would like to test again after #10054 is merged.

lanking520 and others added 5 commits March 22, 2018 16:32
* update windows installation doc to inform user that they need to update the PATH variable

* update
* Fix failed test with opencv 3.4.1

* Replace int conversion with `//`

* retrigger test
@lanking520 lanking520 requested a review from szha as a code owner March 23, 2018 18:44
@lanking520 lanking520 closed this Mar 23, 2018
@lanking520
Copy link
Member Author

Please move to: #10229.

@lanking520 lanking520 deleted the scala-infer-package branch April 9, 2018 20:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants