ImageDataGenerator with masks as labels #3059
@fchollet Still waiting for your thoughts :)
@pengpaiSH I don't know if this would work, but maybe it's enough to do it like this:
@mayorpain Thank you for your response. In your proposed solution, you set batch_size=1. I don't understand why X and y will be transformed simultaneously.
@pengpaiSH have a look at this https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
@mayorpain Thanks for your reference. Is batch_size = 1 necessary? And in this way, you could get batches one by one, each being the augmented output?
@pengpaiSH I don't know if you can use batch_size > 1. If you look at the example on the page, they treat x as one image. OK, they did this just for plotting reasons, but I don't know if this would also work for batch sizes > 1. I would loop over my image set and set x to the current image, then apply the image generator as in the example, and then use the same generator again on the mask, like this:
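A minimal sketch of that idea, assuming a shared random seed keeps the image and mask transforms in sync (parameter values and array shapes here are made up for the demo, not taken from the thread):

```python
# Hypothetical demo: two ImageDataGenerators with identical arguments and the
# same seed produce the same random transform for images and for masks.
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

data_gen_args = dict(rotation_range=10.0, horizontal_flip=True)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

X = np.random.rand(4, 64, 64, 3)  # dummy images
Y = np.random.rand(4, 64, 64, 1)  # dummy masks

seed = 1
image_flow = image_datagen.flow(X, batch_size=1, seed=seed, shuffle=False)
mask_flow = mask_datagen.flow(Y, batch_size=1, seed=seed, shuffle=False)

x_aug = next(image_flow)  # one augmented image, shape (1, 64, 64, 3)
y_aug = next(mask_flow)   # the matching mask, same random transform applied
```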
@mayorpain Really, thank you for your detailed comments. I think your idea looks in the right direction. If we are appending each image or mask one at a time, then we should set batch_size=1.
@mayorpain I have tried this idea and it works! Thank you again! By the way, it seems that @oeway is trying to extend ImageGenerator to support more flexibilities. |
Yes, you could try my fork (branch: extendImageDataGenerator). For now you can do:

train_generator = dataGen1 + dataGen2
model.fit_generator(train_generator)

Suggestions would be appreciated.
@oeway Thank you for your contribution extending the current ImageDataGenerator!
For reference, here's another extension by He Xie (HEXIE). It should be useful when this enhancement is finalised and added to the unit tests. |
I see, so he added y directly to random_transformation. It will work, but is less likely to generalize into a customizable preprocessing pipeline. For my extension, I used a separate ImageDataGenerator for X and y, so the pipeline can easily be extended with your own functions.
If anyone else gets here from search, the new answer is that you can do this with the ImageDataGenerator, from the docs:
Hi @peachthiefmedia! I have two questions: 1) ...from the code of your last comment? And 2) I've tried flow_from_directory() with images of type '.jpg', giving the folder containing them as a parameter of the function, i.e. image_datagen.flow_from_directory(folder_of_jpg_images, target_size, class_mode='categorical'). I've double-checked the folder and the image type is correct. What could be the problem? Thanks in advance!
It's expecting that you will give it the data/images and data/masks directories, not data/images/where_they_are/. flow_from_directory is really built around multiclass problems, but it still works if you only have one folder in there (single class).
@chaitanya1chaitanya Mine was slightly different but based on that, I used
Some of that was specific to my segmentation, but it should give you the idea. I use my own generators now for the most part. |
@peachthiefmedia For a classification problem it's fine; for a segmentation problem, how do I give the path correctly? If the path is given as in case (i), why does it show double the number of images for both generators? Finally, can we use ImageDataGenerator for a segmentation problem at all?
I'm at the same point as you @chaitanya1chaitanya, the exact same mistake and the same thoughts.
@chaitanya1chaitanya @ixoneioseba You need to use the following structure; the example is with 2 images, but it works with however many:

/data/images/0/image1.jpg
/data/images/0/image2.jpg
/data/masks/0/image1.jpg
/data/masks/0/image2.jpg

Then you would use /data/images as your image directory and /data/masks as your mask directory in the generator, i.e. image_datagen.flow_from_directory('data/images', ...). You need to have a directory inside the directory because flow_from_directory is really built for classification, so it expects that you have more than one label set.
The generator works well; however, when fed to a neural network that does segmentation, it gives the error: "Error when checking target: expected activation_layer (final softmax) to have 3 dimensions, but got array with shape (32, 360, 480, 3)". Trying to fit_generator throws this error for both FCNN and SegNet.
It's because the loss function is expecting the masks to be 2D arrays, and the image generator is reading 3D RGB arrays.
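One way around this (a sketch with a made-up color palette, not the commenter's code) is to collapse each 3-channel RGB mask to a 2-D array of integer class labels before it reaches the loss:

```python
# Sketch: map RGB mask pixels to integer class labels so the target has
# shape (batch, height, width) instead of (batch, height, width, 3).
import numpy as np

# hypothetical palette: background=black, class 1=red, class 2=green
palette = {(0, 0, 0): 0, (255, 0, 0): 1, (0, 255, 0): 2}

def rgb_mask_to_labels(mask_batch):
    labels = np.zeros(mask_batch.shape[:3], dtype=np.int64)
    for rgb, cls in palette.items():
        # True where all three channels match this palette color
        match = np.all(mask_batch == np.array(rgb), axis=-1)
        labels[match] = cls
    return labels

masks = np.zeros((2, 4, 4, 3), dtype=np.uint8)
masks[0, 0, 0] = (255, 0, 0)              # one red (class 1) pixel
labels = rgb_mask_to_labels(masks)        # shape (2, 4, 4)
```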
@peachthiefmedia so many thanks for your solution. I'm trying to make it work, but the process takes a lot of RAM (CPU, not GPU) in the zip step (more than 16 GB for 1000 512x512 images) and takes a long time. Do you think this is normal? Does somebody have an idea for reducing this RAM consumption?
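On the RAM question: one common culprit is an eager pairing of two effectively infinite generators (Python 2's zip built a full list; itertools.izip was the lazy variant there, while Python 3's zip is lazy already). A tiny sketch with dummy generators, purely to illustrate lazy pairing:

```python
# Sketch: pair two never-ending generators lazily; no list is materialized.
import itertools

def images():
    i = 0
    while True:
        yield 'img%d' % i
        i += 1

def masks():
    i = 0
    while True:
        yield 'mask%d' % i
        i += 1

paired = zip(images(), masks())                 # lazy in Python 3
first_three = list(itertools.islice(paired, 3)) # take only what you need
```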
Will this work if batch_size is greater than 1?
Hey @aliechoes! I solved this problem by creating my own data generator for images and masks simultaneously. Here is the code I used:
Note that I have not used image augmentation, but the code resizes the images to feed the network.
@amlarraz: thanks, it works. However, I'm having a problem with the fit generator: it goes file by file, even though I chose a batch_size of 64, for example. Did you have the same issue? Am I making a mistake? Thanks.
Hey @aliechoes, how many images do you have for training? I think you're seeing the number of iterations, not the number of images. Each iteration is one batch; I mean, if you have 6 images with a batch size of 3, you only have 2 steps. Please check how many images you have. Happy to help!
@amlarraz: oh yeah, I totally missed the difference between fit_generator and fit. Thanks a lot :)
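The iterations-vs-images relationship from the exchange above is just a ceiling division:

```python
# Steps per epoch = number of batches needed to cover the dataset once.
import math

num_images = 6
batch_size = 3
steps_per_epoch = math.ceil(num_images / batch_size)
# 6 images in batches of 3 -> 2 steps per epoch, not 6
```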
Without the hassle of organizing the folders so that one image matches one mask, as in the answer above, we can use .flow().
Note: make sure you don't have other np.random seed calls in the code before this function. For those who don't get a GPU and get stuck with zip(), please read this:
Hi @peachthiefmedia, I have a question: in multi-class segmentation, masks have multiple colors, and each pixel needs to be converted to a one-hot vector or an integer label. How can the ImageDataGenerator API be used to augment the masks correctly?
@JianyingLi I have always been directly outputting the masks themselves for segmentation, but I've only done up to 3 classes at a time, so it's fine outputting R, G, B as the effective integer 0, 1, 2 class numbers. I find it gets progressively more difficult to segment more classes, so at the moment I normally split the training into single classes only, then run all the models on the image and use some image-based work afterwards to get the final segmentation.
I will try these methods. Thanks for your reply. :)
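For the integer-label route, a tiny sketch of turning such labels into one-hot maps with keras.utils.to_categorical (the array values here are made up):

```python
# Sketch: integer-label mask -> per-pixel one-hot encoding.
import numpy as np
from keras.utils import to_categorical

# a 2x2 mask whose pixels carry integer class labels in 0..2
mask = np.array([[0, 1],
                 [2, 1]])
one_hot = to_categorical(mask, num_classes=3)  # shape (2, 2, 3)
```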
@sammilei Hi, I did something pretty similar (identical) to your code, but my images and masks (which I save through the "save_to_dir" option) do not seem to match. Moreover, the masks are OK, but the images are saved as totally black (I have RGB images).
Does anyone have suggestions?
I followed the tutorial from the official Keras documentation for image and mask generators up to this line:
I got this error message:
Does anyone know why? Thanks!
@yanfengliu were you able to resolve the issue? I also have the same one.
@shivg7706 I actually ended up writing my own generator. To make sure that the images and masks get the same augmentation, I recommend https://github.com/albu/albumentations
@shivg7706 This is the data generator I wrote. I haven't optimized it for speed/multi-worker, but at least it works:

import os
import albumentations as albu
import cv2
import keras
import numpy as np

def read_image_from_list(image_list, idx):
    img_path = image_list[idx]
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return img

def read_mask_from_list(mask_list, idx):
    mask_path = mask_list[idx]
    mask = cv2.imread(mask_path)
    mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
    mask = np.expand_dims(mask, axis=-1)
    return mask

def augment_image_and_mask(image, mask, aug):
    augmented = aug(image=image, mask=mask)
    image_augmented = augmented['image']
    mask_augmented = augmented['mask']
    return image_augmented, mask_augmented

def normalize_img(img):
    img = img / 255.0
    img[img > 1.0] = 1.0
    img[img < 0.0] = 0.0
    return img

class SegmentationDataGenerator:
    def __init__(self, image_list, mask_list, img_size,
                 batch_size, num_classes, augmentation, shuffle):
        self.image_list = image_list
        self.mask_list = mask_list
        self.img_size = img_size
        self.batch_size = batch_size
        self.num_classes = num_classes
        self.aug = augmentation
        self.shuffle = shuffle
        self.indices = np.arange(start=0, stop=len(image_list), step=1).tolist()
        self.idx = 0
        self.batch_num = len(image_list) // batch_size
        if shuffle:
            np.random.shuffle(self.indices)

    def read_batch(self):
        image_batch = np.zeros((self.batch_size, self.img_size, self.img_size, 3))
        mask_batch = np.zeros((self.batch_size, self.img_size, self.img_size, 1))
        indices = self.indices[(self.idx * self.batch_size):((self.idx + 1) * self.batch_size)]
        for i in range(self.batch_size):
            idx = indices[i]
            img = read_image_from_list(self.image_list, idx)
            mask = read_mask_from_list(self.mask_list, idx)
            img, mask = augment_image_and_mask(img, mask, self.aug)
            img = normalize_img(img)
            image_batch[i] = img
            mask_batch[i] = mask
        self.idx += 1
        return image_batch, mask_batch

    def get_batch(self):
        if self.idx < self.batch_num:
            image_batch, mask_batch = self.read_batch()
        else:
            # end of epoch: reshuffle if requested and start over
            if self.shuffle:
                np.random.shuffle(self.indices)
            self.idx = 0
            image_batch, mask_batch = self.read_batch()
        return image_batch, mask_batch

    def prep_for_model(self, img, mask):
        img = img * 2 - 1
        mask = keras.utils.to_categorical(mask, self.num_classes)
        return img, mask

# constants
IMG_SIZE = 256
BATCH_SIZE = 4
NUM_CLASSES = 7

# read list of filenames from dir
image_list = os.listdir("image")
mask_list = os.listdir("mask")

# shuffle files with a fixed seed for reproducibility
idx = np.arange(len(image_list))
np.random.seed(1)
np.random.shuffle(idx)
image_list = [image_list[i] for i in idx]
mask_list = [mask_list[i] for i in idx]

# split train and test data
train_test_split_idx = int(0.9 * len(image_list))
train_image_list = image_list[:train_test_split_idx]
test_image_list = image_list[train_test_split_idx:]
train_mask_list = mask_list[:train_test_split_idx]
test_mask_list = mask_list[train_test_split_idx:]

# define image augmentation operations for train and test set
aug_train = albu.Compose([
    albu.Blur(blur_limit=3),
    albu.RandomGamma(),
    albu.augmentations.transforms.Resize(height=IMG_SIZE, width=IMG_SIZE),
    albu.RandomSizedCrop((IMG_SIZE - 50, IMG_SIZE - 1), IMG_SIZE, IMG_SIZE)
])
aug_test = albu.Compose([
    albu.augmentations.transforms.Resize(height=IMG_SIZE, width=IMG_SIZE)
])

# construct train and test data generators
train_generator = SegmentationDataGenerator(
    image_list=train_image_list,
    mask_list=train_mask_list,
    img_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    num_classes=NUM_CLASSES,
    augmentation=aug_train,
    shuffle=True)
test_generator = SegmentationDataGenerator(
    image_list=test_image_list,
    mask_list=test_mask_list,
    img_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    num_classes=NUM_CLASSES,
    augmentation=aug_test,
    shuffle=False)

To use the generator in training, do the following:

# training
step = 0
EPOCHS = 100
loss_history = []
for epoch in range(EPOCHS):
    print(f'Training, epoch {epoch}')
    for i in range(train_generator.batch_num):
        step += 1
        img_batch, mask_batch = train_generator.get_batch()
        img_batch, mask_batch = train_generator.prep_for_model(img_batch, mask_batch)
        history = model.fit(img_batch, mask_batch, batch_size=BATCH_SIZE, verbose=False)
        loss_history.append(history.history['loss'][-1])

I hope this helps. If you spot anything to correct or change, feel free to leave a comment :)
When I use the ImageDataGenerator with masks as labels, the values of the masks change, even though I do not set rescale. The original mask values are between 0 and 5; after the ImageDataGenerator, they become 255.
While @peachthiefmedia's solution is very neat, the use of zip on the two generators can cause problems. One solution is to implement a keras.utils.Sequence that merges them:

from keras.utils import Sequence

class MergedGenerators(Sequence):
    def __init__(self, *generators):
        self.generators = generators
        # TODO add a check to verify that all generators have the same length

    def __len__(self):
        return len(self.generators[0])

    def __getitem__(self, index):
        return [generator[index] for generator in self.generators]

train_generator = MergedGenerators(image_generator, mask_generator)
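A quick way to sanity-check this pattern with dummy in-memory batches (the ArrayBatches helper is made up for the demo, and the MergedGenerators class is repeated so the snippet runs on its own):

```python
# Sketch: verify that MergedGenerators pairs batches index-by-index.
import numpy as np
from keras.utils import Sequence

class ArrayBatches(Sequence):
    """Dummy Sequence yielding fixed-size batches from an in-memory array."""
    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __len__(self):
        return len(self.data) // self.batch_size

    def __getitem__(self, index):
        return self.data[index * self.batch_size:(index + 1) * self.batch_size]

class MergedGenerators(Sequence):
    def __init__(self, *generators):
        self.generators = generators

    def __len__(self):
        return len(self.generators[0])

    def __getitem__(self, index):
        return [g[index] for g in self.generators]

images = ArrayBatches(np.zeros((8, 32, 32, 3)), batch_size=2)
masks = ArrayBatches(np.zeros((8, 32, 32, 1)), batch_size=2)
train_generator = MergedGenerators(images, masks)
x, y = train_generator[0]  # image batch and its matching mask batch
```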
Given the folder structure

/data/images/image1.jpg
/data/images/image2.jpg
/data/masks/image1.jpg
/data/masks/image2.jpg

and the corresponding error output "Found 0 images belonging to 0 classes": the workaround discussed above is to tweak the folder structure as below.

/data/images/0/image1.jpg
/data/images/0/image2.jpg
/data/masks/0/image1.jpg
/data/masks/0/image2.jpg

However, this can be avoided by changing the path and using the additional "classes" parameter:

image_generator = image_datagen.flow_from_directory(
    'data/',
    classes=['images'],
    class_mode=None,
    seed=seed)

mask_generator = mask_datagen.flow_from_directory(
    'data/',
    classes=['masks'],
    class_mode=None,
    seed=seed)

https://kylewbanks.com/blog/loading-unlabeled-images-with-imagedatagenerator-flowfromdirectory-keras should help to understand better.
What will be the scenario if I have to send multiple masks for the same image as input? What changes need to be made to the ImageDataGenerator?
Thank you very much!
@michirup I also have the problem that the masks and images do not match anymore! Could you fix it?
I form the dataset and the input pipeline differently now:

#n_images_to_iterate = len(frames_paths_train)

I do the same for the validation dataset. Later on I call:

If I execute the same code on my laptop instead of the computer I usually execute the code on, I again face the problem that the mask and image files do not match. I figured out that it is caused by the line:

This line of code seems to work differently on the two computers. Maybe it's caused by the version of the package, or of some other package? I am not sure, but on one of my computers it behaves the way it is supposed to.
@YasarL Thanks for the swift response. Is it due to version changes? I've never faced such an issue before.
@amlarraz , do you have RGB images in your 'img_dir' and black & white images in your 'label_dir'? |
@EtagiBI yes, I had RGB images in img_dir and grayscale images in label_dir. However, if you want to use grayscale images in img_dir you can (but the typical architectures pretrained on ImageNet need 3-channel images as input).
Hmm. It should work then. I get the following error at the very beginning of the learning process:
Could you please share the whole error, to see the line where the problem is?
@amlarraz here's a complete traceback:
I think everything is OK in your generator; the error says that images with 3 channels are expected, and your images have 3 channels.
Where "mean" and "std" are your train-set channel mean and std.
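The commenter's normalization snippet wasn't captured above; a per-channel version along those lines might look like this (the statistics below are placeholder values, not computed from any real dataset):

```python
# Sketch: per-channel standardization of an image with precomputed
# train-set statistics (values here are hypothetical).
import numpy as np

mean = np.array([0.485, 0.456, 0.406])  # per-channel mean over the train set
std = np.array([0.229, 0.224, 0.225])   # per-channel std over the train set

def normalize(img):
    # img: float array in [0, 1], shape (H, W, 3); broadcasting handles channels
    return (img - mean) / std

img = np.ones((4, 4, 3)) * 0.5
out = normalize(img)
```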
@fchollet
We know that ImageDataGenerator provides a way to do image data augmentation: ImageDataGenerator.flow(X, Y). Now consider the image segmentation task, where Y is not a categorical label but an image mask of the same size as the input X, e.g. 256x256 pixels. If we would like to use data augmentation, the same transformation should also be applied to Y. Is there any simple way to handle this?