Feature extraction #3210
Conversation
…issing value test (apple#3126)
…wo support functions (apple#3126)
… and without extracted_features column
…re type and then take the decision
…re type and then take the decision
…_to_train test case
…noput features in classify, predict and _canonize_input functions
…cted features as feature
Address #3126
@NeerajKomuravalli - Thanks for working on it! I took a first look. Generally things look good; this is a great start.
I left a few comments about things to work on. My suggestion would be to work on the unit tests first. Let's make sure we're testing using both deep features and using images.
src/python/turicreate/toolkits/image_analysis/image_analysis.py
src/python/turicreate/toolkits/image_analysis/image_analysis.py
…_extracted_features_column
…to _find_only_image_extracted_features_column as chnages were made in image_analysis
This is looking good. I pulled down your branch and tried it out. Things seem to be generally working.
If we can get the unit tests all fixed up, I'll run your branch on a large set of internal tests that we have.
from array import array

try:
    feature_columns, _ = zip(*list(filter(lambda x: x[1] == array, list(zip(sframe.column_names(), sframe.column_types())))))
This line is a bit complicated. I think you should just be able to use a helper function that we already have.
This makes sense. Will use _find_only_column_of_type.
feature = None

if feature is None:
    try:
Why not do all this in the except block above?
Basically, that part of the code is responsible for figuring out which column is the feature column when the user does not pass one.
In that case we first check whether there is a deep feature column in the SFrame; if there is, we use it, and if it's not found we go ahead and look for an Image column.
The try block will fail for two reasons:
- There is no deep feature column.
- More than one deep feature column is present in the SFrame. (This, I am assuming, is by design, because we throw a similar error when more than one Image column is found in the dataset.)
When the try/except fails, one of the two cases above has happened, so we ignore the exception and move ahead to see if we can find an Image column in the SFrame dataset.
So this choice was by design. Please let me know if I am missing something here.
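The fallback just described can be sketched as a small helper. This is an illustrative sketch only: infer_feature_column is a hypothetical name (the actual code uses Turi Create's internal _find_only_column_of_type), and the Image type is passed in so the sketch stays self-contained.

```python
from array import array

def infer_feature_column(column_names, column_types, image_type):
    """Prefer a single deep-feature (array) column; otherwise fall back
    to a single column of image_type. Hypothetical helper, not the
    actual Turi Create implementation."""
    def only_column_of_type(target_type):
        matches = [name for name, ctype in zip(column_names, column_types)
                   if ctype is target_type]
        if len(matches) != 1:
            # Zero or several candidates is ambiguous, so give up.
            raise ValueError(
                "expected exactly one column of type %s" % target_type.__name__)
        return matches[0]

    try:
        return only_column_of_type(array)
    except ValueError:
        # No unique deep-feature column: look for the Image column instead.
        return only_column_of_type(image_type)
```

The except clause deliberately swallows both failure modes (no deep-feature column, or more than one) and retries with the image type, matching the behavior described above.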
…he model specific name
…ly_column_of_type instead of writing a new logic
list(self.model.predict(deep_features, output_type="probability_vector")),
self.tolerance,
)
# If the code came here that means the type of the feature used is deep_deatures and the predict fwature in coremltools doesn't work with deep_features yet so we will ignore this specific test case unitl the same is written.
Should "fwature" be "function"? We try to maintain an 80 or 100 character line-length limit. It would be nice if the text wrapped to the next line after 80 or 100 characters.
tc_distances = tc_ret.sort("reference_label")["distance"].to_numpy()
psnr_value = get_psnr(coreml_distances, tc_distances)
self.assertTrue(psnr_value > 50)
# If the code came here that means the type of the feature used is deep_deatures and the predict fwature in coremltools doesn't work with deep_features yet so we will ignore this specific test case unitl the same is written.
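For reference, a PSNR comparison like the one asserted above might be computed as follows. This is a hedged sketch; the actual get_psnr helper in the test suite may define the peak or normalization differently.

```python
import math

def get_psnr(reference, estimate):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences. Higher is better; identical inputs give infinity.
    (Sketch of a helper like the test suite's get_psnr.)"""
    assert len(reference) == len(estimate) and len(reference) > 0
    mse = sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return float("inf")  # signals are identical
    peak = max(abs(v) for v in reference)  # peak of the reference signal
    return 20.0 * math.log10(peak / math.sqrt(mse))
```

With this definition, psnr_value > 50 requires the Core ML and Turi Create distances to agree to within roughly 0.3% RMS of the peak distance.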
same comment
@NeerajKomuravalli - Your most recent changes look good, just one minor comment. I'm running your changes now on an internal testing system. If that all passes, I think we're ready to merge this change.
The unit tests are failing on Linux. We should skip the tests for that model if …
@TobyRoseman …
Ok, if we're already skipping those classes on Linux, then we just need to skip extracting the deep features on Linux (since that is done independently of the test classes). I've just added comments in the code about that.
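The platform gating described here could look roughly like the following. This is a sketch with assumed names: maybe_extract_deep_features and the class name are illustrative, not the actual test code.

```python
import sys
import unittest

# The deep-feature models are unavailable on Linux (assumption stated above).
IS_LINUX = sys.platform.startswith("linux")

def maybe_extract_deep_features(data, model_name):
    """Extract deep features only on platforms where the models exist."""
    if IS_LINUX:
        return None  # skip extraction entirely on Linux
    # Imported lazily so this module stays importable everywhere.
    from turicreate.toolkits.image_analysis.image_analysis import get_deep_features
    return get_deep_features(data, model_name)

@unittest.skipIf(IS_LINUX, "deep-feature models are unavailable on Linux")
class ImageClassifierWithDeepFeaturesTest(unittest.TestCase):
    pass
```

Because the extraction happens outside the test classes, the skipIf decorator alone is not enough; the extraction call itself also has to be guarded, which is what maybe_extract_deep_features illustrates.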
… if its a Linux system or any other OS
Hi @TobyRoseman, I have made the requested changes, let me know if there is anything else.
Thanks @NeerajKomuravalli. Those changes look good. I've just kicked off another internal test run. I'll let you know the results.
from turicreate.toolkits._main import ToolkitError as _ToolkitError
from turicreate.toolkits.image_analysis.image_analysis import MODEL_TO_FEATURE_SIZE_MAPPING, get_deep_features

import coremltools
This is no longer getting used.
Unfortunately, due to the (brittle) way that we test the minimal version of TuriCreate, we are going to need to remove this unnecessary line. coremltools is not a dependency for the minimal version of TuriCreate, so having it imported at the top level breaks our unit tests even though the tests in this file are not run for the minimal version.
I believe removing this line should be the final required change for this pull request. All of our other internal tests are passing.
Makes sense, will remove the import coremltools line and push changes.
Made the required changes!
Thanks @NeerajKomuravalli - I'm rerunning the internal tests now.
Our unit tests for the minimal version of our package are still failing with this change. When our minimal package is installed we can't call …
Makes sense, I shifted the …
@@ -84,19 +88,24 @@ class ImageClassifierTest(unittest.TestCase):
    def setUpClass(
        self,
        model="resnet-50",
        feature="resnet-50_deep_features",
I don't think this is right. If the class name doesn't end in WithDeepFeature, then I think it should be feature="awesome_image". This code is only calling get_deep_features when self.feature != "awesome_image".
I think you need to make this change to all of your test classes.
Changed the deep feature column name in test data to the one suggested by you and made all the requested changes.
I think you misunderstood my previous comment. I was not suggesting you need to change the name of the feature. I was basically just trying to say that it's important for the name of the class to represent what that class is actually testing.
For example, if the name of the class is ImageClassifierResnetTestWithDeepFeatures, then it should be feature="resnet-50_WithDeepFeature" or feature="resnet-50_deep_features", but not feature="awesome_image". Similarly, if the name of the class is VisionFeaturePrintSceneTest, then it should not be creating the model from deep features, so it should be feature="awesome_image", not feature="VisionFeaturePrint_Scene_WithDeepFeature".
Does that make sense? Let me know if you have any questions.
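The convention being described can be sketched like so. This is illustrative only: the class names and column names come from the test data discussed above, and get_deep_features is passed in as a parameter rather than imported, so the sketch stays self-contained.

```python
class ImageClassifierResnetTest:
    # Raw-image class: features are extracted during test setup.
    feature = "awesome_image"

class ImageClassifierResnetTestWithDeepFeatures:
    # Deep-feature class: the column already holds extracted features.
    feature = "resnet-50_deep_features"

def prepare_features(data, test_cls, get_deep_features):
    """Extract deep features only for classes that start from raw images."""
    if test_cls.feature == "awesome_image":
        # Raw image column: extract the deep features now.
        return get_deep_features(data, "resnet-50")
    return data  # precomputed deep features: use as-is
```

The point of the naming rule is exactly this branch: the class name tells the reader which side of the if the setup takes.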
Oh yeah, you are right, I did not see it there. It was correct before, but during one of my recent pushes I changed it to test something and forgot to revert it. My bad; will change it and push it.
And changing the names of the feature columns in the test data in test_image_classifier.py and test_image_similarity.py to have the suffix _WithDeepFeature instead of _deep_features makes more sense to me, so I am keeping that change as it is.
…ed the name of the same to WithDeepFeatures
…intain consistency
I made the required changes. Let me know if you need anything else.
@NeerajKomuravalli - Those changes look good. Thank you. I'm rerunning our internal tests now.
Internal tests now pass. @NeerajKomuravalli - thanks so much for all your work! I think this is a great feature. It will be included in our next release.
That's great news @TobyRoseman! It was my first contribution to an open source project, so I am really excited. It was great working on this; let me know if I can help contribute more! Thanks!
@TobyRoseman, please have a look and let me know what you think.
Thank you!