
missing example of training a gan "without requiring supervision of masks or points" #4

romain-rsr opened this issue Dec 15, 2022 · 4 comments


@romain-rsr

Hi,

We were able to run your GAN on the CelebA example, but we are struggling to apply it to raw images (without any supervision from masks or keypoints).

After carefully reading your work, we understand that the main interest of your model is to generate keypoints, then masks, then segmented images by training on a dataset of raw images only, without any masks or keypoints in the training set. Can you please confirm this and provide us with a version of your code that can be applied to a folder containing only raw 256x256 images?

Many thanks again for your work and responsiveness,
Romain

@xingzhehe
Owner

Hi Romain,

Yes, our main interest is obtaining the masks and keypoints by training only on raw images.

I am not sure I understand your request. Our model can be directly applied if you simply resize the 256x256 images to 128x128.
If you want to re-train on 256x256 images, you can

  1. delete the hard-coded architecture in lines 149-225 of generator.py to allow it to consume 256x256 images.
  2. copy line 41 to line 42 in discriminator.py to add one more downsampling block so it can consume 256x256 images (see the sketch below).
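For reference, here is a minimal PyTorch sketch of the idea behind step 2, not the actual GANSeg discriminator.py code: duplicating one early strided convolution block adds an extra downsampling stage, so a 256x256 input reaches the same 8x8 feature resolution that a 128x128 input reaches in the original stack. The channel counts and layer names are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative only: a toy discriminator backbone where each block halves the
# spatial resolution. Channel counts are made up, not GANSeg's actual values.
def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
    )

blocks_128 = nn.Sequential(      # accepts 128x128 input
    conv_block(3, 64),           # 128 -> 64
    conv_block(64, 128),         # 64 -> 32
    conv_block(128, 256),        # 32 -> 16
    conv_block(256, 512),        # 16 -> 8
)

# Duplicating one early block (the "copy line 41 to line 42" idea) adds an extra
# downsampling stage, so the rest of the network sees the same 8x8 feature map
# when the input is 256x256.
blocks_256 = nn.Sequential(
    conv_block(3, 64),           # 256 -> 128
    conv_block(64, 64),          # extra copied stage: 128 -> 64
    conv_block(64, 128),         # 64 -> 32
    conv_block(128, 256),        # 32 -> 16
    conv_block(256, 512),        # 16 -> 8
)

x = torch.randn(1, 3, 256, 256)
print(blocks_256(x).shape)       # torch.Size([1, 512, 8, 8])
```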

If you have further questions, don't hesitate to reach out.

Bests,
Xingzhe

@romain-rsr
Author

romain-rsr commented Dec 16, 2022

Hi,

I'm sorry I didn't go straight to the point: we ran your model successfully on CelebA, but we failed to apply it to this toy example, where the raw images we train the model on are plain blue rectangles of various sizes and positions on a plain grey background:

image1

The model succeeds in generating such shapes but fails to segment the blue rectangle from the background.

  • here are two examples with 3 keypoints:

image2
image3b
image3
(the segmentation mask is a tiny point on this one)

image4a
image4b
image5

  • here is another example with 8 keypoints:

image6
image7
image8

---------------------- more info

  • the number of training epochs for these failed experiments was similar to (and actually higher than) the number used for the successful runs on CelebA.
  • we used the same batch size of 4 for all experiments (CelebA and toy samples)

---------------------- why we asked about the preprocessing first

Since we produced the model's input h5 file by applying the CelebA preprocessing file to our toy samples, our previous question aimed to verify that the preprocessing step was not at fault. In the meantime, we wrote our own generic preprocessing file, which only requires a folder of raw images (without any segmentation information). It's available here if needed: https://github.com/romain-rsr/colab/blob/main/uprocess.py
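For reference, a minimal sketch of such a generic preprocessing step (not the actual uprocess.py): it packs every image in a folder into a single HDF5 file, resized to 128x128. The dataset key name "images" and the output layout are assumptions and may not match what the GANSeg dataloader expects.

```python
import glob
import os

import h5py
import numpy as np
from PIL import Image

def pack_folder(image_dir, out_path, size=128):
    """Resize every image in image_dir and stack them into one HDF5 dataset."""
    paths = sorted(glob.glob(os.path.join(image_dir, "*.png")) +
                   glob.glob(os.path.join(image_dir, "*.jpg")))
    images = []
    for p in paths:
        img = Image.open(p).convert("RGB").resize((size, size), Image.BILINEAR)
        images.append(np.asarray(img, dtype=np.uint8))
    with h5py.File(out_path, "w") as f:
        # "images" is an assumed key name; the real dataloader may expect another layout.
        f.create_dataset("images", data=np.stack(images), compression="gzip")

# pack_folder("toy_rectangles/", "toy_dataset.h5")
```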

Bests,
Romain

@xingzhehe
Owner

Hi Romain,

Thanks for this experiment! It is indeed very interesting! It reveals some issues I had never thought about.

I also tested GANSeg on it myself, and it didn't work. I think it is because the positional encoding, part embedding, or background embedding overfits this overly simple dataset.

In parallel, I also tested two different unsupervised keypoint detection methods that use keypoint embeddings. They also fail, although they do slightly better than GANSeg, probably because they don't have background embeddings or positional encoding. I suspect that any method using embeddings without additional care could run into this problem to some extent.

Finally, I tested a method that does not use embeddings: https://xingzhehe.github.io/autolink/
and it works:
image
image
Although it only detects keypoints and their linkages, masks can be easily extracted (one possible approach is sketched below).
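For reference, here is one simple way to do that, not part of AutoLink itself: rasterize the predicted linkages as thick line segments and fill their convex hull. It assumes the keypoints are pixel coordinates (x, y) and that the edge list is available; both the function and the toy example below are illustrative.

```python
import cv2
import numpy as np

def keypoints_to_mask(keypoints, edges, height, width, thickness=5):
    """Rasterize linked keypoints and fill their convex hull to get a rough mask."""
    canvas = np.zeros((height, width), dtype=np.uint8)
    for i, j in edges:
        p1 = tuple(int(v) for v in np.round(keypoints[i]))
        p2 = tuple(int(v) for v in np.round(keypoints[j]))
        cv2.line(canvas, p1, p2, 255, thickness)
    pts = cv2.findNonZero(canvas)
    if pts is None:              # nothing was drawn
        return canvas
    hull = cv2.convexHull(pts)
    cv2.fillConvexPoly(canvas, hull, 255)
    return canvas

# Toy usage: four keypoints linked into a square, roughly like the blue-rectangle example.
kps = np.array([[40.0, 40.0], [90.0, 40.0], [90.0, 90.0], [40.0, 90.0]])
mask = keypoints_to_mask(kps, edges=[(0, 1), (1, 2), (2, 3), (3, 0)],
                         height=128, width=128)
```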

Bests,
Xingzhe

@romain-rsr
Author

Hi,

Thanks a lot for these complementary experiments on our examples. My experiments focus on getting the segmentation encoding for a private dataset whose characteristics are halfway between those of the toy dataset and those of highly structured datasets (CelebA, Flowers, etc.). Since we can't communicate publicly about it, I'll try to find a public dataset that shares the same characteristics (high detail and diversity with very low overall structure).

Bests,
Romain
