Training with custom data models results very poor. #250

Open
cappittall opened this issue Oct 5, 2023 · 1 comment

cappittall commented Oct 5, 2023

I have trained a custom object detection model with relevant data (~750 annotated images x 3 augmentations).
There are 4 choices of pretrained model, and I tested all 4 of them with different learning rates (default 0.3).
I tried different combinations of 200 to 300 epochs and learning rates from 0.01 to 0.3. However, the results are always very poor: AP50 is at most 0.35.

Here is a reproduced result.
What am I doing wrong?

Note:
With the same data I got ~86% on object detection with the EfficientDet pretrained models, as described.

index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=1.95s).
Accumulating evaluation results...
DONE (t=0.17s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.082
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.318
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.008
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.070
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.351
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.118
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.182
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.184
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.167
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.422
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Validation loss: [1.0166717767715454, 0.42808738350868225, 0.010559567250311375, 0.9560657143592834]
Validation coco metrics: {'AP': 0.08189376, 'AP50': 0.31780246, 'AP75': 0.00782904, 'APs': 0.069966085, 'APm': 0.3512227, 'APl': -1.0, 'ARmax1': 0.11827957, 'ARmax10': 0.18172044, 'ARmax100': 0.18387097, 'ARs': 0.16743295, 'ARm': 0.42222223, 'ARl': -1.0}
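
For readers interpreting the log above, here is a minimal sketch that post-processes the reported metrics dictionary. It relies on two facts about COCO-style evaluation: a value of -1 means no ground-truth box fell into that category (here the "large" area bucket, i.e. boxes above 96 x 96 pixels), and AP75 uses a stricter IoU threshold than AP50. The helper name `summarize_coco` is hypothetical and is not part of MediaPipe or pycocotools. Reading a very low AP75/AP50 ratio as a sign of loosely localized boxes is an interpretation, not a confirmed diagnosis.

```python
# Metrics copied verbatim from the validation log above.
metrics = {
    'AP': 0.08189376, 'AP50': 0.31780246, 'AP75': 0.00782904,
    'APs': 0.069966085, 'APm': 0.3512227, 'APl': -1.0,
    'ARmax1': 0.11827957, 'ARmax10': 0.18172044, 'ARmax100': 0.18387097,
    'ARs': 0.16743295, 'ARm': 0.42222223, 'ARl': -1.0,
}


def summarize_coco(metrics):
    """Hypothetical helper: separate 'not applicable' entries from real scores.

    COCO evaluation reports -1 for a category that has no ground-truth
    objects, so those entries say nothing about model quality.
    """
    not_applicable = sorted(k for k, v in metrics.items() if v < 0)
    # Ratio of strict-IoU precision to loose-IoU precision. A value near 0
    # means few detections that pass IoU 0.50 also pass IoU 0.75.
    ap75_over_ap50 = metrics['AP75'] / metrics['AP50']
    return not_applicable, ap75_over_ap50


not_applicable, ratio = summarize_coco(metrics)
print("No ground truth for:", not_applicable)
print("AP75 / AP50 = %.3f" % ratio)
```

In this run the validation set appears to contain no large objects, so APl and ARl can be ignored, and only a small share of the detections that count at IoU 0.50 still count at IoU 0.75.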

@BoHellgren

I have the same experience. I created a folder containing 115 subfolders of cropped dog images and uploaded it to Google Drive. I ran the MediaPipe Model Maker Image Classifier Demo with all defaults except for image_path and spec = image_classifier.SupportedModels (EfficientNet-Lite0). I expected to get a tflite model that is smaller, faster and more accurate at classifying dog images than EfficientNet-Lite0. What I got is indeed smaller, but the accuracy is very poor. I suspect this has to do with the metadata created by the model maker. If I display the metadata of the new model, it shows mean 0.0 and std 255.0, while EfficientNet-Lite0 has mean 127.0 and std 128.0. The metadata of both models follows, the new model first and EfficientNet-Lite0 second.

{
  "name": "ImageClassifier",
  "description": "Identify the most prominent object in the image from a known set of categories.",
  "subgraph_metadata": [
    {
      "input_tensor_metadata": [
        {
          "name": "image",
          "description": "Input image to be processed.",
          "content": {
            "content_properties_type": "ImageProperties",
            "content_properties": {
              "color_space": "RGB"
            }
          },
          "process_units": [
            {
              "options_type": "NormalizationOptions",
              "options": {
                "mean": [
                  0.0
                ],
                "std": [
                  255.0
                ]
              }
            }
          ],
          "stats": {
            "max": [
              1.0
            ],
            "min": [
              0.0
            ]
          }
        }
      ],
      "output_tensor_metadata": [
        {
          "name": "score",
          "description": "Score of the labels respectively.",
          "content": {
            "content_properties_type": "FeatureProperties",
            "content_properties": {
            }
          },
          "stats": {
            "max": [
              1.0
            ],
            "min": [
              0.0
            ]
          },
          "associated_files": [
            {
              "name": "labels.txt",
              "description": "Labels for categories that the model can recognize.",
              "type": "TENSOR_AXIS_LABELS"
            }
          ]
        }
      ]
    }
  ]
}

{
  "name": "EfficientNet-lite image classifier",
  "description": "Identify the most prominent object in the image from a set of 1,000 categories such as trees, animals, food, vehicles, person etc.",
  "version": "1",
  "subgraph_metadata": [
    {
      "input_tensor_metadata": [
        {
          "name": "image",
          "description": "Input image to be classified. The expected image is 224 x 224, with three channels (red, blue, and green) per pixel. Each element in the tensor is a value between min and max, where (per-channel) min is [-0.9921875] and max is [1.0].",
          "content": {
            "content_properties_type": "ImageProperties",
            "content_properties": {
              "color_space": "RGB"
            }
          },
          "process_units": [
            {
              "options_type": "NormalizationOptions",
              "options": {
                "mean": [
                  127.0
                ],
                "std": [
                  128.0
                ]
              }
            }
          ],
          "stats": {
            "max": [
              1.0
            ],
            "min": [
              -0.992188
            ]
          }
        }
      ],
      "output_tensor_metadata": [
        {
          "name": "probability",
          "description": "Probabilities of the 1000 labels respectively.",
          "content": {
            "content_properties_type": "FeatureProperties"
          },
          "stats": {
            "max": [
              1.0
            ],
            "min": [
              0.0
            ]
          },
          "associated_files": [
            {
              "name": "labels_without_background.txt",
              "description": "Labels for objects that the model can recognize.",
              "type": "TENSOR_AXIS_LABELS"
            }
          ]
        }
      ]
    }
  ],
  "author": "MediaPipe",
  "license": "Apache License. Version 2.0 http://www.apache.org/licenses/LICENSE-2.0."
}
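
To make the difference between the two metadata dumps concrete, the sketch below applies the TFLite metadata normalization formula, (pixel - mean) / std, to the extreme 8-bit pixel values 0 and 255. The function and constant names are illustrative only. The (mean, std) pairs are copied from the two dumps above.

```python
def normalize(pixel, mean, std):
    """NormalizationOptions formula: (pixel - mean) / std."""
    return (pixel - mean) / std


def input_range(mean, std):
    """Range the model input covers for 8-bit pixels in [0, 255]."""
    return normalize(0, mean, std), normalize(255, mean, std)


# (mean, std) pairs taken from the metadata shown above.
MODEL_MAKER_OUTPUT = (0.0, 255.0)
EFFICIENTNET_LITE0 = (127.0, 128.0)

for name, (mean, std) in [("Model Maker output", MODEL_MAKER_OUTPUT),
                          ("EfficientNet-Lite0", EFFICIENTNET_LITE0)]:
    low, high = input_range(mean, std)
    print("%s: input range [%s, %s]" % (name, low, high))
```

The two ranges agree with the "stats" min/max fields in each dump: [0.0, 1.0] for the new model and [-0.992188, 1.0] for EfficientNet-Lite0. This only shows that the two models expect differently scaled inputs. It does not show which range the exported model was actually trained on, so it neither confirms nor rules out the metadata as the cause of the poor accuracy.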
