This repository has been archived by the owner on Nov 3, 2022. It is now read-only.

flow_from_dataframe() found 0 images #92

Closed
Mahi-Mai opened this issue Nov 14, 2018 · 11 comments

Comments

@Mahi-Mai

Hello again! I'm still struggling with flow_from_dataframe() after the issues I had here.

In order to use the new fixes, I cloned the keras repo, and then replaced the contents of the preprocessing folder with the latest from the keras-preprocessing repo. I renamed the local repo keras2 to avoid importing the vanilla repo. The code finally runs, but it's not finding any images.

Here's my script:

import pandas as pd
import numpy as np
import sys
sys.path.append('/Users/lmcane/documents/tools/keras2/')
from keras2.preprocessing.image import ImageDataGenerator


train = pd.read_csv('short_dir_train.csv', index_col=0)
print(train.filepath[0] + '\n')
train.info()

Returns:

Using TensorFlow backend.

March 29 2018/Top view_1-2/IMG_6823.JPG

<class 'pandas.core.frame.DataFrame'>
Int64Index: 869 entries, 0 to 868
Data columns (total 2 columns):
filepath    869 non-null object
label       869 non-null object
dtypes: object(2)
memory usage: 60.4+ KB

Then the main body of the script:

main_dir = '/Users/lmcane/Documents/Datasets/Unsorted Extracted/224x224px'

img_width, img_height = 224, 224
nb_train_samples = 433
nb_validation_samples = 216
batch_size = 20
epochs = 10

train_datagen = ImageDataGenerator(horizontal_flip = True,
                                   fill_mode = "nearest",
                                   zoom_range = 0.3,
                                   width_shift_range = 0.1,
                                   height_shift_range = 0.1,
                                   rotation_range = 30)

train_generator = train_datagen.flow_from_dataframe(dataframe=train,
                                                    directory=main_dir,
                                                    x_col='filepath',
                                                    y_col='label',
                                                    has_ext=True,
                                                    target_size = (img_height, img_width),
                                                    batch_size = batch_size, 
                                                    class_mode = "binary")

Returns:

Found 0 images belonging to 2 classes.

It should find 433. I suspect I didn't import the repo correctly?

@Mahi-Mai
Author

Reporting back: I made some test directories and can confirm that providing absolute paths to images with uppercase extensions returned images.

My csv file looks like this:

location	label
/Users/lmcane/Desktop/mag/1.JPG	beep
/Users/lmcane/Desktop/mag/2.JPG	boop
/Users/lmcane/Desktop/mag/3.JPG	boop
/Users/lmcane/Desktop/mag/4.JPG	beep
/Users/lmcane/Desktop/mag/new/5.JPG	boop
/Users/lmcane/Desktop/mag/new/6.JPG	beep
/Users/lmcane/Desktop/mag/new/7.JPG	boop

.info() returns:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 2 columns):
location    7 non-null object
label       7 non-null object
dtypes: object(2)
memory usage: 192.0+ bytes

And finally, here's my code:

import pandas as pd
import numpy as np

import sys
sys.path.append('/Users/lmcane/documents/tools/keras2/')
from keras2.preprocessing.image import ImageDataGenerator

train = pd.read_csv('absolute directories.csv', delimiter='\t')

main_dir = None

img_width, img_height = 224, 224
batch_size = 20

train_datagen = ImageDataGenerator(horizontal_flip = True,
                                   fill_mode = "nearest",
                                   zoom_range = 0.3,
                                   width_shift_range = 0.1,
                                   height_shift_range = 0.1,
                                   rotation_range = 30)

train_generator = train_datagen.flow_from_dataframe(dataframe=train,
                                                    directory=main_dir,
                                                    x_col='location',
                                                    y_col='label',
                                                    has_ext=True,
                                                    target_size = (img_height, img_width),
                                                    batch_size = batch_size, 
                                                    class_mode = "binary")

Returns:

Found 7 images belonging to 2 classes.

This leads me to believe I'm doing something wrong in my original path list, but I'll try it again.

Can I get a confirmation that I'm applying this fix correctly, in terms of replacing the contents of the main Keras folder with the preprocessing scripts from keras-preprocessing? It appears to be working, but I'd like to avoid any nasty surprises.

@Mahi-Mai
Author

Can confirm it was an error in my paths! It found all the images it was supposed to. :)
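
For anyone debugging a similar path problem, here is a minimal sketch of a check that could be run before calling flow_from_dataframe(). The helper name find_missing_files is hypothetical (it is not part of Keras), and it assumes, as in the scripts above, that each entry is either an absolute path or a path relative to the directory argument.

```python
import os


def find_missing_files(paths, directory=None):
    """Return the entries of `paths` that do not resolve to an existing file.

    Hypothetical helper, not part of Keras. Each entry is joined onto
    `directory` when one is given; otherwise it is used as-is.
    """
    missing = []
    for p in paths:
        full = os.path.join(directory, p) if directory else p
        if not os.path.isfile(full):
            missing.append(p)
    return missing
```

With the names from the first script, this would be called as find_missing_files(train['filepath'], main_dir). An empty result means every path resolves to a file; anything else lists the entries worth inspecting.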

@Dref360 Dref360 closed this as completed Nov 26, 2018
@648374hub

I've tried providing absolute paths to images with uppercase extensions, but it still couldn't find the images. I can't understand what I'm doing wrong.

@tufail117

tufail117 commented Sep 27, 2019

I was facing the same error and found a solution.
I was using absolute paths and a correct DataFrame, and everything looked fine, yet the code was still throwing an error: "image not found".

I inspected the data and found that the image names in my DataFrame had no extension, while the image files in the folder did.
E.g. the image name in the DataFrame was 'abc', but the file in the folder was named 'abc.png'.
Appending .png to the image names in the DataFrame should solve the problem.
I tried the code below and it worked:

def append_ext(fn):
    return fn+".png"
train_valid_data["id_code"]=train_valid_data["id_code"].apply(append_ext)
test_data["id_code"]=test_data["id_code"].apply(append_ext)

Let me know if it solves your problem or if you need any further explanation.
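
A slightly more defensive variant of the same idea, as a sketch: it appends the extension only when the name does not already have one, so running it twice does not produce 'abc.png.png'. The function name ensure_ext is hypothetical, and the check relies on os.path.splitext, so a name that contains a dot for other reasons (for example 'img.v2') would be treated as already having an extension.

```python
import os


def ensure_ext(filename, ext=".png"):
    """Append `ext` only when `filename` has no extension yet."""
    _, current_ext = os.path.splitext(filename)
    return filename if current_ext else filename + ext


# Usage with the DataFrame from the comment above (not executed here):
# train_valid_data["id_code"] = train_valid_data["id_code"].apply(ensure_ext)
```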

@sbcool01

I have a CSV file with the image ID in one column and labels (0 or 1) corresponding to six different classes in six other columns. I am using flow_from_dataframe to create the training generator, but the generator does not load the images, which I checked by printing the filenames of the generated files. Here is a screenshot for reference.

I also tried modifying the image ID column to hold the absolute path of each image with an uppercase extension and setting the directory parameter of fit_generator to "none", but the problem is not solved.
[screenshot: issue3]

@sbcool01

@Mahi-Mai
@Dref360
@tufail117
@648374hub
Please suggest a fix.

@DanRunfola

Wow! I just ran into this, and it seems like there are about a thousand things that can cause this issue. For me, this worked only when I did not specify classes in flow_from_dataframe. Good luck to others; hopefully this helps!

@federicoweill

> Wow! I just ran into this, and it seems like there are about a thousand things that can cause this issue. For me, this worked only when I did not specify classes in flow_from_dataframe. Good luck to others; hopefully this helps!

Exactly the same here... I can't figure out why.

@Philip-Kovacs

For me, the issue was that I accidentally specified delimiter=' ' instead of ',' when reading the CSV file into the dataframe, so each whole CSV row was taken as the image path.
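
A quick way to spot this kind of mistake, sketched with the standard csv module: parse the header row with the delimiter you intend to use and look at how many columns come back. The function header_columns and the sample row are made up for illustration.

```python
import csv
import io


def header_columns(csv_text, delimiter=","):
    """Return the column names found in the first row of `csv_text`."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return next(reader, [])


# Made-up sample data: a comma-separated file with two columns.
sample = "filepath,label\nimages/1.JPG,beep\n"
print(header_columns(sample, delimiter=","))  # ['filepath', 'label']
print(header_columns(sample, delimiter=" "))  # ['filepath,label']
```

With the wrong delimiter the header collapses into a single column, which is the symptom described above. The same check on a loaded dataframe is simply printing its columns attribute.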

@pankaj-mahadik-eagle

For me, it was because I omitted validation_split=0.2 when instantiating the ImageDataGenerator. It worked fine once I added it.
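
One possible explanation, which the comment above does not confirm: if subset='training' or subset='validation' is passed to flow_from_dataframe() while validation_split is left at its default of 0.0, the validation slice is empty. The sketch below mimics that kind of fractional slicing in plain Python. The function split_bounds is hypothetical, and the slicing rule is an assumption about the library's behavior, not something stated in this thread.

```python
def split_bounds(num_files, validation_split, subset):
    """Return (start, stop) row indices for a subset.

    Hypothetical sketch: assumes the validation subset takes the first
    `validation_split` fraction of rows and training takes the rest.
    """
    if subset == "validation":
        lo, hi = 0.0, validation_split
    elif subset == "training":
        lo, hi = validation_split, 1.0
    else:
        raise ValueError("subset must be 'training' or 'validation'")
    return int(lo * num_files), int(hi * num_files)


print(split_bounds(869, 0.0, "validation"))  # (0, 0)
print(split_bounds(869, 0.2, "validation"))  # (0, 173)
```

Under that assumption, a split of 0.0 gives the validation subset a zero-length range, which would show up as "Found 0 images".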

@AlphaLaser

> I was facing the same error and found a solution. I was using absolute paths and a correct DataFrame, and everything looked fine, yet the code was still throwing an error: "image not found".
>
> I inspected the data and found that the image names in my DataFrame had no extension, while the image files in the folder did. E.g. the image name in the DataFrame was 'abc', but the file in the folder was named 'abc.png'. Appending .png to the image names in the DataFrame should solve the problem. I tried the code below and it worked:
>
> def append_ext(fn):
>     return fn+".png"
> train_valid_data["id_code"]=train_valid_data["id_code"].apply(append_ext)
> test_data["id_code"]=test_data["id_code"].apply(append_ext)
>
> Let me know if it solves your problem or if you need any further explanation.

Yooooooooooooooooooooooooo, this was the exact problem I was facing. You just saved me at an ML competition. I hope from my heart you have a good day :)
