error with the mjsynth-tfrecord.py file #28

Closed
kai-kaushik opened this issue Jun 15, 2018 · 8 comments

Comments

@kai-kaushik

I downloaded the mjsynth dataset separately and stored the images in the image subpath under the data directory. Basically, I did everything manually up until the "make mjsynth-tfrecord.py" command.
When I ran the command, it reported a syntax error on the print line in this part of the mjsynth-tfrecord.py file.

    print str(i),'of',str(num_shards),'[',str(start),':',str(end),']',out_filename
    gen_shard(sess, input_base_dir, image_filenames[start:end], out_filename)
# Clean up writing last shard
start = num_shards*images_per_shard
out_filename = output_filebase+'-'+(shard_format % num_shards)+'.tfrecord'
print str(i),'of',str(num_shards),'[',str(start),':]',out_filename
gen_shard(sess, input_base_dir, image_filenames[start:], out_filename) 

Since I am using Python 3.6, I thought the problem was the missing parentheses in the print statement, so I changed it to this:

    print (str(i),'of',str(num_shards),'[',str(start),':',str(end),']',out_filename)
    gen_shard(sess, input_base_dir, image_filenames[start:end], out_filename)
# Clean up writing last shard
start = num_shards*images_per_shard
out_filename = output_filebase+'-'+(shard_format % num_shards)+'.tfrecord'
print (str(i),'of',str(num_shards),'[',str(start),':]',out_filename)
gen_shard(sess, input_base_dir, image_filenames[start:], out_filename)

The program then started running, but I am seeing a lot of file read errors, corresponding to this part of the code:

    except:
        # Some files have bogus payloads, catch and note the error, moving on
        print('ERROR',filename)

Can anyone tell me why this is happening? Thank you in advance for the help.

@weinman
Owner

weinman commented Jun 15, 2018

Thanks for the note! If you find enough other places that vary significantly for Python 3, I'd be happy to have a separate branch that contains updates for Python 3, and I'd merge it with master if it works in both Python 2 and Python 3.
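For the progress messages specifically, one approach that behaves the same on both interpreters is to build a single string and print it in parentheses. With several comma-separated arguments inside the parentheses, Python 2 would print a tuple instead. The sketch below assumes a hypothetical helper `shard_message` and made-up file names; neither comes from mjsynth-tfrecord.py, and the spacing of the message differs slightly from the original print statement.

```python
def shard_message(i, num_shards, start, end, out_filename):
    """Format the per-shard progress line as one string.

    `end` may be None for the final, open-ended shard.
    """
    end_text = '' if end is None else str(end)
    return '{} of {} [{}:{}] {}'.format(
        i, num_shards, start, end_text, out_filename)


# A single parenthesized string prints identically on Python 2 and Python 3.
# The values and file names below are placeholders for illustration.
print(shard_message(0, 4, 0, 1000, 'words-000.tfrecord'))
print(shard_message(4, 4, 4000, None, 'words-004.tfrecord'))
```

Adding `from __future__ import print_function` at the top of the file is another common option; it makes `print(a, b, c)` behave the same way on Python 2 as on Python 3.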

In any case, several of the jpg files in the raw mjsynth archive are just garbage. You can verify this by trying to load them in any image viewer (they might be truncated, but they tend to be only a handful of bytes relative to the valid images).
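A rough way to spot such files without opening an image viewer is to check the file size and the first two bytes, which for a JPEG are the start-of-image marker 0xFF 0xD8. This is only a heuristic sketch: the function name `looks_like_jpeg` and the `min_bytes` threshold are assumptions chosen for illustration, and a file that passes can still be truncated further in.

```python
import os


def looks_like_jpeg(path, min_bytes=100):
    """Heuristic: True if the file has at least `min_bytes` bytes and
    begins with the JPEG start-of-image marker (0xFF 0xD8)."""
    if os.path.getsize(path) < min_bytes:
        return False
    with open(path, 'rb') as f:  # binary mode: compare raw bytes
        return f.read(2) == b'\xff\xd8'
```

Calling `looks_like_jpeg(path)` on each image path before encoding would give a list of suspect files to inspect by hand.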

I don't know why that is, but the way the TFRecord encoder handles this is to detect the exception inevitably thrown by the image file decoder and let you know about it whilst moving on to another example.

@kai-kaushik
Author

Thanks for the prompt reply. The problem is that all I see are errors when reading the files. I'll try it once in Python 2 and see if I get the same error.
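When every file reports an error, it can help to see which exception is being raised, since the bare `except:` in the script hides it. The sketch below keeps the same catch-and-move-on behavior but prints the exception type and message. `try_load` and `load_fn` are hypothetical names for illustration; `load_fn` stands in for whatever reads and decodes an image in mjsynth-tfrecord.py.

```python
def try_load(filename, load_fn):
    """Call load_fn(filename); on failure, report the reason and return None."""
    try:
        return load_fn(filename)
    except Exception as e:
        # Still catch, note, and move on, but show what went wrong.
        print('ERROR {}: {}: {}'.format(filename, type(e).__name__, e))
        return None


def always_fails(filename):
    """Stand-in loader that simulates a bad file."""
    raise ValueError('bogus payload')


try_load('example.jpg', always_fails)
```

If every file fails with the same exception type, the cause is more likely systematic (for example, how the files are opened) than a handful of corrupt images.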

@Hust-ShuaiWang

Did you run this code on a Windows system?

@weinman
Owner

weinman commented Jan 7, 2019

@Hust-ShuaiWang I'm not sure whether you're asking me or @Kumara-Kaushik, but I can tell you that I did not run any of this repo on a Windows system. Are you suggesting that the problems with bad input data do not occur on a Windows-based file system?

@weinman weinman reopened this Jan 7, 2019
@Hust-ShuaiWang

I am asking @Kumara-Kaushik. I hit the same error when I ran this repo on a Windows system. The reason for this problem is that the file has different storage formats on Windows and Linux, so you have to change the way you read the file. Just change "with tf.gfile.FastGFile( filename, 'r' ) as f:" to "with tf.gfile.FastGFile( filename, 'rb' ) as f:" (line 133 in mjsynth-tfrecord.py). For the detailed cause of the error, you will have to look up the relevant information yourself; it is not difficult.

@weinman
Owner

weinman commented Jan 16, 2019

@Hust-ShuaiWang Thanks for the report. I'll try out that change on Linux, and if it works there too, I will commit it.

@david-morris

@weinman I can confirm @Hust-ShuaiWang's technique. I think it works because the 'b' signifies bytes, and the default file-opening mode is now text.
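The text-versus-bytes difference can be illustrated with Python 3's built-in `open`. This is a sketch using plain `open` with an explicit UTF-8 encoding, not `tf.gfile.FastGFile`; that FastGFile fails for the same reason is an assumption here. The sample file is a made-up stand-in that merely starts with the bytes of a JPEG header.

```python
import os
import tempfile

# Write a few bytes that start like a JPEG; 0xFF is never valid in UTF-8.
path = os.path.join(tempfile.mkdtemp(), 'sample.jpg')
with open(path, 'wb') as f:
    f.write(b'\xff\xd8\xff\xe0' + b'\x00' * 16)

# Binary mode returns the raw bytes unchanged.
with open(path, 'rb') as f:
    data = f.read()

# Text mode tries to decode the bytes as UTF-8 and fails on Python 3.
try:
    with open(path, 'r', encoding='utf-8') as f:
        f.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

print(type(data).__name__, len(data), text_mode_failed)
```

Image decoders expect the raw bytes, so reading image files in binary mode is the safer choice regardless of platform.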

@weinman
Owner

weinman commented Mar 9, 2019

Updated in defd8ae
