TFRecordReader keeps files locked after session closes #14772

Closed
Utumno opened this issue Nov 21, 2017 · 10 comments
Labels
stat:awaiting response Status - Awaiting response from author

Comments

Utumno commented Nov 21, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 7 64-bit
  • TensorFlow installed from (source or binary): pip install
  • TensorFlow version (use command below): 1.4.0
  • Python version: 3.5.2
  • Bazel version (if compiling from source): -
  • GCC/Compiler version (if compiling from source): -
  • CUDA/cuDNN version: -
  • GPU model and memory: -
  • Exact command to reproduce:

Running this script (you need some tfrecords from here):

import os
import shutil
import sys
import tempfile

import tensorflow as tf

data_dir = r'/path/to/tfrecords'

def test_generate_tfrecords_from_csv():
    with tempfile.TemporaryDirectory() as tmpdirname:
        filenames = os.listdir(data_dir)
        for f in filenames:
            shutil.copy(os.path.join(data_dir, f), os.path.join(tmpdirname, f))
        filenames = sorted([os.path.join(tmpdirname, f) for f in filenames])
        # Create a queue that produces the filenames to read.
        queue = tf.train.string_input_producer(filenames, num_epochs=1,
                                               shuffle=False)
        with tf.Session() as sess:
            sess.run(tf.local_variables_initializer()) # Local !
            tf.train.start_queue_runners(sess=sess)
            reader = tf.TFRecordReader()
            for j in range(len(filenames)):
                key, value = reader.read(queue)
                features_dict = tf.parse_single_example(value, features={
                    'label': tf.FixedLenFeature([], tf.string),})
                # the decode call below is needed, if you replace it with
                # label = tf.constant(0) no files are locked
                label = tf.decode_raw(features_dict['label'], tf.float32)
                _ = sess.run([label]) # files are locked here
        listdir = os.listdir(tmpdirname)
        print(tmpdirname, listdir)
        for f in sorted(listdir):
            os.remove(os.path.join(tmpdirname, f))

print(tf.__version__)
print(sys.version)
test_generate_tfrecords_from_csv()

Produces:

C:\_\Python35>python.exe C:\Users\MrD\.PyCharm2017.2\config\scratches\so_46259067.py
1.4.0
3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)]
C:\Users\MrD\AppData\Local\Temp\tmp3hqhkgy0 ['img_2013-01-01-00-00.tfrecords', 'img_2013-01-01-00-01.tfrecords', 'img_2013-01-01-00-02.tfrecords']
Traceback (most recent call last):
  File "C:\Users\MrD\.PyCharm2017.2\config\scratches\so_46259067.py", line 38, in <module>
    test_generate_tfrecords_from_csv()
  File "C:\Users\MrD\.PyCharm2017.2\config\scratches\so_46259067.py", line 34, in test_generate_tfrecords_from_csv
    os.remove(os.path.join(tmpdirname, f))
  File "C:\_\Python35\lib\tempfile.py", line 808, in __exit__
    self.cleanup()
  File "C:\_\Python35\lib\tempfile.py", line 812, in cleanup
    _shutil.rmtree(self.name)
  File "C:\_\Python35\lib\shutil.py", line 488, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\_\Python35\lib\shutil.py", line 383, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\_\Python35\lib\shutil.py", line 381, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\MrD\\AppData\\Local\\Temp\\tmp3hqhkgy0\\img_2013-01-01-00-02.tfrecords'
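
The traceback above ends with the temporary directory cleanup failing. As a stopgap that does not address the lock itself, the cleanup can be made tolerant so the script reports which files are still held instead of crashing. This is only a sketch: `remove_files_best_effort` is a hypothetical helper name, and it uses `tempfile.mkdtemp` in place of `TemporaryDirectory` so that cleanup is under the script's control.

```python
import os
import shutil
import tempfile


def remove_files_best_effort(dirname):
    """Try to delete every entry in dirname; return the paths that could not be removed."""
    still_locked = []
    for name in sorted(os.listdir(dirname)):
        path = os.path.join(dirname, name)
        try:
            os.remove(path)
        except OSError:  # PermissionError is a subclass of OSError
            still_locked.append(path)
    return still_locked


tmpdirname = tempfile.mkdtemp()  # unlike TemporaryDirectory, no automatic cleanup
try:
    # ... copy the tfrecords into tmpdirname and run the session here ...
    pass
finally:
    leftover = remove_files_best_effort(tmpdirname)
    print('could not remove:', leftover)
    # ignore_errors=True keeps a still-locked file from raising here
    shutil.rmtree(tmpdirname, ignore_errors=True)
```

Any file that stays locked is left behind in the temp directory and has to be deleted later, once the process holding it has exited.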

(I had also asked on Stack Overflow here.) Unless I am doing something stupid, shouldn't the tfrecord file be free for deletion after the session closes? Do I have to explicitly close it (is that even possible)?

The equivalent dataset code has the same issue:

def test_generate_tfrecords_from_csv_dataset():
    with tempfile.TemporaryDirectory() as tmpdirname:
        filenames = os.listdir(data_dir)
        for f in filenames:
            shutil.copy(os.path.join(data_dir, f), os.path.join(tmpdirname, f))
        filenames = sorted([os.path.join(tmpdirname, f) for f in filenames])
        def _parse_rec(value):
            features_dict = tf.parse_single_example(value, features={
                    'label': tf.FixedLenFeature([], tf.string),})
            # return tf.constant(0, tf.float32)  # files are locked all the same
            return tf.decode_raw(features_dict['label'], tf.float32)
        dataset = tf.data.TFRecordDataset(filenames).map(_parse_rec)
        get_next = dataset.make_one_shot_iterator().get_next
        with tf.Session() as sess:
            for j in range(len(filenames)):
                label = get_next()
                _ = sess.run([label]) # files are locked here
        listdir = os.listdir(tmpdirname)
        print(tmpdirname, listdir)
        for f in sorted(listdir):
            os.remove(os.path.join(tmpdirname, f))

In both cases it seems to lock only the last file; the others are removed without problems.
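
On the question of closing the file explicitly: one possible workaround, not confirmed anywhere in this thread, is to read the serialized records without TFRecordReader so that the file handle is owned by Python. The sketch below assumes the standard uncompressed TFRecord framing (little-endian uint64 length, uint32 CRC of the length, payload, uint32 CRC of the payload). `read_tfrecord_payloads` is a hypothetical helper, and it skips CRC verification for brevity.

```python
import struct


def read_tfrecord_payloads(path):
    """Return the raw payload bytes of every record in an uncompressed TFRecord file.

    The file is opened in a with-block, so it is closed before this function returns.
    The CRC fields are read but NOT verified (simplification for this sketch).
    """
    payloads = []
    with open(path, 'rb') as f:
        while True:
            header = f.read(12)  # uint64 length + uint32 CRC of the length
            if not header:
                break  # clean end of file
            if len(header) != 12:
                raise IOError('truncated record header in {}'.format(path))
            (length,) = struct.unpack('<Q', header[:8])
            payload = f.read(length)
            footer = f.read(4)  # uint32 CRC of the payload
            if len(payload) != length or len(footer) != 4:
                raise IOError('truncated record body in {}'.format(path))
            payloads.append(payload)
    return payloads
```

Each payload is a serialized Example that would still need parsing, for instance with `tf.train.Example.FromString(payload)`. Whether this avoids the lock on Windows has not been tested here.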

Contributor

drpngx commented Dec 7, 2017

OK, it looks like it's a problem in the TFRecordReader. I glanced at the code of the reader itself and it seems fine. Would you mind adding a few LOG(INFO) statements to see whether the lock reset code gets called, and if not, why?

drpngx added the stat:awaiting response label on Dec 7, 2017
Author

Utumno commented Dec 7, 2017

Thanks for the response @drpngx. Where should I add those statements?

Contributor

drpngx commented Dec 7, 2017

I would start here:

Status OnWorkFinishedLocked() override {

Author

Utumno commented Dec 7, 2017

This means I will have to recompile TensorFlow. I have no time for this now, but hopefully soon.

Contributor

drpngx commented Dec 7, 2017

Sounds good, thanks!
/CC @josh11b

aselle removed the stat:awaiting response label on Dec 7, 2017
drpngx added the stat:awaiting response label on Dec 9, 2017
Author

Utumno commented Jan 3, 2018

It is still an issue.

Contributor

drpngx commented Jan 3, 2018

Could you provide more information by printing out what the call sequences look like? Is OnWorkFinishedLocked called and if not, why not?

Author

Utumno commented Jan 3, 2018

Sorry, I have no time to look into the C++ code now.

tensorflowbutler removed the stat:awaiting response label on Jan 23, 2018
josh11b added the stat:awaiting response label on Jun 25, 2018
@tensorflowbutler
Member

We are closing this issue for now due to lack of activity. Please comment if this is still an issue for you. Thanks!

Contributor

rvinas commented May 23, 2019

Any update here?
