Steps for creating a custom dataset #21

Closed
ffazil opened this issue Jan 21, 2020 · 6 comments
Labels
question: Further information is requested

Comments

@ffazil

ffazil commented Jan 21, 2020

I am trying out DJL and am really excited to finally try object detection in Java. What are the steps for labeling images for object detection?

I see that the images need to be separated into train and test directories, and that an index.file contains the coordinates of the annotated objects. What tool can be used to annotate images and generate index.file?

@frankfliu
Contributor

@ffazil
You can try AWS SageMaker Ground Truth: https://aws.amazon.com/sagemaker/groundtruth/

frankfliu added the "question" label on Jan 21, 2020
@lanking520
Contributor

lanking520 commented Jan 21, 2020

Hi @ffazil, in order to train a basic classification or object detection model, you need to label each image.

For a custom dataset, you can use a labeling tool or service to generate the labels for you, for example SageMaker Ground Truth or Mechanical Turk.

@zachgk
Contributor

zachgk commented Jan 21, 2020

@ffazil Also, you are not required to follow the same format as the PikachuDetection dataset. You can create your own dataset class that extends RandomAccessDataset and then format your data in whatever way works best for you.

@stu1130
Contributor

stu1130 commented Jan 21, 2020

> @ffazil Also, you are not required to follow the same format as the PikachuDetection dataset. You can create your own dataset class that extends RandomAccessDataset and then format your data in whatever way works best for you.

You can follow the instructions here if you want to extend RandomAccessDataset:
https://github.com/awslabs/djl/blob/master/docs/development/how_to_use_dataset.md#how-to-create-your-own-dataset
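
As a rough illustration of the "format your data in whatever way works best for you" idea, here is a minimal plain-Java sketch of an annotation index. Everything in it is hypothetical: the class name `CsvAnnotations`, the `Box` type, and the assumed line format `file,label,xMin,yMin,xMax,yMax` are not part of DJL. It only shows the index-based lookup that a class extending RandomAccessDataset would need behind it; wiring it into DJL itself is covered by the linked guide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical annotation index in plain Java; not part of DJL. */
final class CsvAnnotations {

    /** One labeled bounding box in pixel coordinates. */
    static final class Box {
        final String label;
        final int xMin;
        final int yMin;
        final int xMax;
        final int yMax;

        Box(String label, int xMin, int yMin, int xMax, int yMax) {
            this.label = label;
            this.xMin = xMin;
            this.yMin = yMin;
            this.xMax = xMax;
            this.yMax = yMax;
        }
    }

    // Image file names in first-seen order, so each image has a stable index.
    private final List<String> images = new ArrayList<>();
    private final Map<String, List<Box>> boxesByImage = new HashMap<>();

    private CsvAnnotations() {}

    /** Parses lines of the assumed form: file,label,xMin,yMin,xMax,yMax */
    static CsvAnnotations parse(List<String> lines) {
        CsvAnnotations store = new CsvAnnotations();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] f = trimmed.split("\\s*,\\s*");
            if (f.length != 6) {
                throw new IllegalArgumentException("Expected 6 fields: " + line);
            }
            Box box = new Box(f[1], Integer.parseInt(f[2]), Integer.parseInt(f[3]),
                    Integer.parseInt(f[4]), Integer.parseInt(f[5]));
            List<Box> boxes = store.boxesByImage.get(f[0]);
            if (boxes == null) {
                boxes = new ArrayList<>();
                store.boxesByImage.put(f[0], boxes);
                store.images.add(f[0]);
            }
            boxes.add(box); // several boxes (and labels) may share one image
        }
        return store;
    }

    /** Number of distinct images. */
    int size() {
        return images.size();
    }

    String imageAt(int index) {
        return images.get(index);
    }

    List<Box> boxesAt(int index) {
        return boxesByImage.get(images.get(index));
    }

    public static void main(String[] args) {
        CsvAnnotations store = parse(Arrays.asList(
                "a.jpg,cat,10,20,110,220",
                "a.jpg,dog,0,0,50,50",
                "b.jpg,cat,5,5,15,15"));
        for (int i = 0; i < store.size(); i++) {
            System.out.println(store.imageAt(i) + ": " + store.boxesAt(i).size() + " box(es)");
        }
    }
}
```

Grouping boxes by image means one index corresponds to one image with all of its objects, which is the unit an object detection dataset typically returns per lookup.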

@ffazil
Author

ffazil commented Jan 28, 2020

Thank you all.

ffazil closed this as completed on Jan 28, 2020
@thhart
Contributor

thhart commented Sep 25, 2020

Hi contributors, thanks for the suggestions added here. However, I have some criticism: the documentation is very incomplete. It may be worth putting more effort into this whole topic.

  1. SageMaker is a very high-level labeling tool, which does not answer how to integrate a dataset into DJL.
  2. PikachuDetection is a good basic example, but unfortunately it contains only one label. Furthermore, it is not clear how the Pikachu data is woven into a record dataset. From the index structure I can see that it uses relative bounding boxes, but it is still unclear how multiple labels can be converted to an NDArray record structure.
  3. Finally, your link on creating your own dataset addresses a different AI problem, not image labeling for object detection, not to mention image segmentation.
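
One possible way to approach the multiple-label question in point 2, offered as a sketch rather than DJL's documented format: flatten every object in an image into a fixed-size float array, one row of five values per object, with box coordinates divided by the image size so they are relative. The row layout `[classId, xMin, yMin, xMax, yMax]`, the `-1` padding for unused rows, and the names `LabelEncoder` and `encode` are all assumptions made for illustration. Whether this matches what PikachuDetection does internally is not confirmed in this thread. The resulting array is the kind of data that would then be wrapped in an NDArray of shape (maxObjects, 5).

```java
import java.util.Arrays;

/** Hypothetical helper in plain Java; not part of DJL. */
final class LabelEncoder {

    private static final int VALUES_PER_OBJECT = 5;

    /**
     * Flattens pixel-space boxes into relative coordinates.
     *
     * @param boxes rows of {classId, xMin, yMin, xMax, yMax} in pixels (assumed layout)
     * @param imageWidth image width in pixels
     * @param imageHeight image height in pixels
     * @param maxObjects fixed number of rows, so every image yields the same shape
     * @return array of length maxObjects * 5; unused rows are filled with -1
     */
    static float[] encode(int[][] boxes, int imageWidth, int imageHeight, int maxObjects) {
        if (boxes.length > maxObjects) {
            throw new IllegalArgumentException("Too many objects: " + boxes.length);
        }
        float[] out = new float[maxObjects * VALUES_PER_OBJECT];
        Arrays.fill(out, -1f); // padding rows stay at -1
        for (int i = 0; i < boxes.length; i++) {
            int[] b = boxes[i];
            int offset = i * VALUES_PER_OBJECT;
            out[offset] = b[0];                            // class id
            out[offset + 1] = (float) b[1] / imageWidth;   // relative xMin
            out[offset + 2] = (float) b[2] / imageHeight;  // relative yMin
            out[offset + 3] = (float) b[3] / imageWidth;   // relative xMax
            out[offset + 4] = (float) b[4] / imageHeight;  // relative yMax
        }
        return out;
    }

    public static void main(String[] args) {
        // Two objects with different class ids in a 200x400 image, padded to 3 rows.
        int[][] boxes = {
            {0, 50, 100, 150, 200},
            {2, 0, 0, 200, 400},
        };
        System.out.println(Arrays.toString(encode(boxes, 200, 400, 3)));
    }
}
```

Padding to a fixed `maxObjects` is a design choice that keeps every label array the same shape, which makes batching straightforward; code that consumes the labels would need to skip the `-1` rows.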

aksrajvanshi added a commit to aksrajvanshi/djl that referenced this issue Mar 18, 2021
Lokiiiiii pushed a commit to Lokiiiiii/djl that referenced this issue Oct 10, 2023