Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fileobj I/O un-deterministic behavior #1297

Merged
merged 4 commits into from
Feb 23, 2021
Merged

Conversation

mthrok
Copy link
Collaborator

@mthrok mthrok commented Feb 23, 2021

Ever since the file-like object support was added in #1158, the test
was occasionally failing in CI. This PR fixes this.

What issues are resolved by this PR?

The failing tests exhibit flakiness in the following three areas;

  1. info_test
    For mp3 and vorbis formats, the number of reported frames were most times zero (this is due to the container format), but occasionally, the test reported a correct number. It should be always reporting the same number.
  2. load_test
    The shape of the Tensor loaded with load function occasionally had different number of frames than what actually the file has.
  3. save_test
    The saved data occasionally lacked some frames at the end.

On which environment were the issues happening?

unix system, all Python environment

What was the cause and what is the fix?

When opening file (for both read and write), libsox initializes the dedicated data structure called sox_format_t, and set various attributes. One of which is seekable and this value is supposed to be false for file-like objects. In the investigation #1229, it turned out that this attribute is occasionally true which contributed to the un-deterministic behavior described above. The following code from libsox determines if the opened FILE* is seekable, by simply checking if the associated file descriptor is a regular file. [src]

static sox_bool is_seekable(sox_format_t const * ft)
{
  struct stat st;

  assert(ft);
  if (!ft->fp)
    return sox_false;
  fstat(fileno((FILE*)ft->fp), &st);
  return ((st.st_mode & S_IFMT) == S_IFREG);
}

The problem is that fileno returns -1 for the case of file-like object. This is because for file-like object, we use sox_open_mem_read or sox_open_mem_stream_write, which calls fmemopen and open_memstream respectively, and the resulting *FILE object does not have the associated file descriptor. In this case, fstat received -1 as the file descriptor value and (I guess) the stat structure is untouched. Thus the attribute of st_mode has an un-initialized value, which result in un-deterministic behavior.

This PR adds patch to check the the value of file descriptor and mark seekable=false if it does not have a valid file descriptor.

How do we know this patch fixes the issue?

I took the detailed note on #1229. The following script will keep running the tests.

https://github.com/mthrok/audio/blob/10ba189d8bc5f0beb5f7200fcb59778acc490033/debug_fileobj.sh#L1

Before applying the patch, it typically took ~15 mins for one of the tests to fail. After this patch is applied, they do not fail. I kept running the test over night and none of the tests failed.

TODOs

Ever since the file-like object support was added in pytorch#1158, the test
was occasionally failing in CI. This PR fixes this.
@mthrok mthrok marked this pull request as ready for review February 23, 2021 14:03
@mthrok mthrok added this to the v0.8 milestone Feb 23, 2021
@mthrok mthrok changed the title Fix fileobj I/O undeterministic behavior Fix fileobj I/O un-deterministic behavior Feb 23, 2021
@mthrok mthrok merged commit 71486ac into pytorch:master Feb 23, 2021
@mthrok mthrok deleted the fix-fileobj branch February 23, 2021 18:08
mthrok added a commit to mthrok/audio that referenced this pull request Feb 23, 2021
* Fix fileobj I/O undeterministic behavior

Ever since the file-like object support was added in pytorch#1158, the test
was occasionally failing in CI. This PR fixes this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants