
How to train on own dataset? #3

Closed
gauthsvenkat opened this issue Jan 16, 2019 · 11 comments

Comments

@gauthsvenkat

I want to try training LCFCN on my own dataset. What are the things I should be looking at (images, annotations, etc.) to train the model on my own dataset?

@IssamLaradji
Collaborator

IssamLaradji commented Jan 16, 2019

The __getitem__ function in the dataset loaders, such as datasets/trancos.py, shows you what LCFCN and its loss expect. They expect the following items:

    return {"images": image, "points": points,
            "counts": counts, "index": index,
            "image_path": self.path + name + ".jpg"}

where images is an RGB image with shape (1, 3, H, W); points is a matrix with a single point for each object and has shape (1, H, W); counts holds the count for each category with shape (1, K); and index is the image id. Here:

H: the image height
W: the image width
K: the number of classes
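
As a rough illustration, a custom loader's __getitem__ could look like the sketch below. This is my own sketch rather than code from the repo; the class name MyDataset, the constructor arguments, and the point-map construction are placeholders to adapt to your annotation format:

    from PIL import Image
    import torch
    from torch.utils.data import Dataset

    # Hypothetical sketch, not repo code; assumes transform includes ToTensor().
    class MyDataset(Dataset):
        def __init__(self, path, image_names, num_classes=1, transform=None):
            self.path = path
            self.image_names = image_names   # e.g. the lines of image_sets/train.txt
            self.num_classes = num_classes
            self.transform = transform

        def __len__(self):
            return len(self.image_names)

        def __getitem__(self, index):
            name = self.image_names[index]
            image = Image.open(self.path + name + ".jpg").convert("RGB")
            image = self.transform(image)                # (3, H, W) float tensor

            H, W = image.shape[1], image.shape[2]
            points = torch.zeros(1, H, W, dtype=torch.long)
            # ... set points[0, y, x] = class_id for one pixel per object ...

            # counts[0, c-1] = number of annotated points of class c
            counts = torch.stack([(points == c).sum()
                                  for c in range(1, self.num_classes + 1)]).view(1, -1)

            return {"images": image.unsqueeze(0), "points": points,
                    "counts": counts, "index": index,
                    "image_path": self.path + name + ".jpg"}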

You can create a file like trancos.py for your dataset and then load it for training. Let me know if you need help with this part. Cheers!

@gauthsvenkat
Author

First of all, thanks a ton for the fast reply!

I'll explore the file and try to reverse-engineer it as much as possible. I've never worked with PyTorch, so this is pretty new to me.

When you say a single point for each object... does that mean something like the center point of each object? For example,

    0 0 0 0
    0 1 0 0
    0 0 0 0
    0 0 0 0

would mean that the 1 corresponds to the center of an object?

If that's the case, I have the four coordinates for each object (I annotated them because I tried to solve this as an object detection challenge), so I could just take the centroid, right?

@IssamLaradji
Collaborator

Happy to help!

When you say a single point for each object... does that mean something like the center point of each object?

Yes, you can take the center of the object as a single point, just like the example you showed. The value of the point represents the class of the object.
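
For illustration, converting box annotations into such a point map could look like this sketch (my code, not the repo's; the (x1, y1, x2, y2) box format is an assumption about your annotation layout):

    import torch

    # Hypothetical helper: place one point per object at the box centroid.
    def boxes_to_points(boxes, class_ids, H, W):
        points = torch.zeros(1, H, W, dtype=torch.long)
        for (x1, y1, x2, y2), c in zip(boxes, class_ids):
            cx = min(int((x1 + x2) / 2), W - 1)   # clamp to stay inside the image
            cy = min(int((y1 + y2) / 2), H - 1)
            points[0, cy, cx] = c                 # the pixel value encodes the class
        return points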

@gauthsvenkat
Author

Okay, I've gone through the trancos.py file and mostly understand what's happening.

  1. What are the .mat files that are being loaded?
  2. Also, what are the .txt files present in the images directory? (I'm looking at the TRANCOS dataset.)
  3. What exactly is the transform function?

@IssamLaradji
Collaborator

  1. The .mat files are binary matrices that represent the regions of interest in the image. Not all datasets have them; for example, shanghai.py doesn't.

  2. The .txt files contain the paths to the images, which are used to load an image at every iteration when __getitem__ is called.

  3. The transform functions are used to flip, rotate, and/or normalize the image. Normalization is important if you are using a pretrained network like ResNet, which expects a specific input distribution (see the sketch below).
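
For example, a minimal transform for a pretrained backbone might be the following (a sketch using torchvision; the repo's actual transforms may differ):

    import torchvision.transforms as transforms

    # Sketch: the ImageNet statistics that pretrained torchvision backbones
    # such as ResNet expect; not necessarily what the LCFCN repo uses.
    transform = transforms.Compose([
        transforms.ToTensor(),                    # HWC uint8 -> CHW float in [0, 1]
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

Note that geometric transforms such as flips also have to be applied to the points matrix, otherwise the annotations no longer line up with the objects.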

@gauthsvenkat
Author

  1. So I don't necessarily need to have regions of interest included in my training images, right?

  2. I get the .txt files in image_sets/*.txt, but what about the .txt files in images/ that have the same names as the images? They have some numbers in them.

@IssamLaradji
Collaborator

IssamLaradji commented Jan 18, 2019

  1. You are right, you don't need regions of interest for your training images;
  2. The image_sets files specify which images are used for training, validation, and testing. So for training, you only load the images listed in image_sets/train.txt from images/ (see the snippet below).
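
Concretely, building the training list could be as simple as the following sketch (the file layout follows the TRANCOS convention described above):

    # Read the image names listed in image_sets/train.txt; each name is then
    # resolved to images/<name>.jpg inside __getitem__.
    with open("image_sets/train.txt") as f:
        train_names = [line.strip() for line in f if line.strip()]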

@gauthsvenkat
Author

gauthsvenkat commented Jan 21, 2019

Man, the last few days I've been breaking my head over this. I don't exactly "get" the loss function (all four losses) or how you implemented it in torch. I was hoping that if I understood the loss function, I could rewrite it in Keras (which I'm comfortable with). Is there maybe another source (like a blog post or an article) that explains how you practically implemented the loss (and the entire model in general)?

Thanks a ton for helping out!

(I'm also closing this, since you did solve the actual issue.)

@IssamLaradji
Collaborator

You're welcome! Feel free to open another issue where I can explain each part of the loss and/or architecture for you.

I don't think there is another source yet, but I am planning to create a blog post on this at some point. Sorry :(
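
In the meantime, here is a rough sketch of the first two terms (the image-level and point-level losses) as described in the LCFCN paper. This is my own illustrative PyTorch, not the repo's implementation, and it omits the split-level and false-positive terms, which need the blob/watershed machinery:

    import torch
    import torch.nn.functional as F

    # Sketch of two of the four LCFCN loss terms; the split-level and
    # false-positive terms are omitted.
    # logits: (1, K+1, H, W) raw scores, class 0 = background (assumption)
    # points: (1, H, W) long tensor, 0 = unlabeled, c = a point of class c
    def image_and_point_loss(logits, points):
        probs = F.softmax(logits, dim=1)
        probs_flat = probs.view(probs.shape[1], -1)        # (K+1, H*W)

        # Image-level: the background and every class with a point must appear
        # somewhere in the image; classes with no points must not fire anywhere.
        present = set(points.unique().tolist()) | {0}
        loss = logits.new_zeros(())
        for c in range(probs.shape[1]):
            pmax = probs_flat[c].max()
            if c in present:
                loss = loss - torch.log(pmax.clamp(min=1e-8))
            else:
                loss = loss - torch.log((1 - pmax).clamp(min=1e-8))

        # Point-level: plain cross-entropy at the annotated pixels only.
        mask = points.squeeze(0) != 0
        if mask.any():
            target = points.squeeze(0)[mask]               # (P,) class ids
            pix = probs.squeeze(0)[:, mask].t()            # (P, K+1)
            loss = loss - torch.log(
                pix.gather(1, target.unsqueeze(1)).clamp(min=1e-8)).mean()

        return loss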

@tongpinmo

  1. You are right, you don't need regions of interest for your training images;
  2. The image_sets files specify which images are used for training, validation, and testing. So for training, you only load the images listed in image_sets/train.txt from images/

I have the images and the point files; should I produce the dots.png and .mat files?

@tongpinmo

  1. The .mat files are binary matrices that represent the regions of interest in the image. Not all datasets have them; for example, shanghai.py doesn't.
  2. The .txt files contain the paths to the images, which are used to load an image at every iteration when __getitem__ is called.
  3. The transform functions are used to flip, rotate, and/or normalize the image. Normalization is important if you are using a pretrained network like ResNet, which expects a specific input distribution.

Actually, in shanghai.py, line 45, there are .mat files. So what did you mean by "shanghai.py doesn't have that"?
