New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Named Pipe as source broken #101

Closed
GoogleCodeExporter opened this Issue Mar 24, 2015 · 12 comments

Comments

Projects
None yet
1 participant
@GoogleCodeExporter

GoogleCodeExporter commented Mar 24, 2015

What steps will reproduce the problem?
1.

The "plain" call would be this:

  xdelta3 -s v0 v1 v1.vcdiff

I'm trying to get it working using a windows named pipe for the source. e.g.,

xdelta3 -s "\\.\pipe\mypipe" v1 v1.vcdiff

Where the pipe exists and is set to stream out the contents of the source
file (e.g. version 0 in this case).

What is the expected output? What do you see instead?

Two issues:
 1. Performance was at first really terrible. I gave up after a couple of
minutes (should take seconds). This led me to discover what I believe is a
bug. 
 2. Once I'd fixed (?) that bug, it now prints out:

  xdelta3: target window checksum mismatch: XD3_INVALID_INPUT

What version of the product are you using? On what operating system?

xdelta3 w on Windows XP 32 bit, Visual Studio 2008 (from source code).

Tried also the latest from SVN - same result.

Please provide any additional information below.

OK, so first I think there may be a bug. In "main_set_source" it calls
"main_file_stat" to work out the size of the file, and if that fails, it
assumes it's a FIFO (see xdelta3-main.h:2804).

The issue is that GetFileSizeEx returns non failure for pipes - just
returns a size of 0. I believe this confuses things and somehow causes the
read size to be 1 byte (hence the poor performance of calling ReadFile for
a single byte each time). 

So, I fixed that with the attached patch which just makes "main_file_stat"
fail when the file is a pipe.

So anyway, now I can see that it succesfully opens the pipe and treats it
as "size unknown".

The next problem is that it now fails with the error: 

  xdelta3: target window checksum mismatch: XD3_INVALID_INPUT

This happens quite quickly, and since it's a 200 MB file, I don't think
it's anything to do with the end of stream.

I'm 99% sure the pipe is providing the right data (I wrote a dummy app to
read from the pipe and dump it out - it was correct).

I also know that it works when I don't use a pipe, but instead use the
source file as normal (see first command at, above).

The source code which creates the pipe is something like:

  HANDLE pipe = CreateNamedPipe(L"\\\\.\\pipe\\mypipe",
                                PIPE_ACCESS_OUTBOUND,
                                PIPE_TYPE_BYTE | PIPE_WAIT | 
PIPE_READMODE_BYTE,
                                50, // 50 instances allowed
                                1024 * 1024 * 5, // out buffer size
                                1024 * 1024 * 5, // in buffer size
                                10000, // Client time out (default)
                                NULL); // Security attrs

  if (pipe == INVALID_HANDLE_VALUE) 
  {
    throw 0;
  }

  if (ConnectNamedPipe(pipe, NULL) == 0)
  {
    throw 1;
  }

  FILE* f = fopen("D:\\Projects\\TestData\\Temp\\0_OUT", "rb");

  int size = 1024 * 1024 * 32;
  byte* buf = new byte[size];

  uint32 c;
  while ((c = fread(buf, 1, size, f)) != 0)
  {
    DWORD temp;

    if (!WriteFile(pipe, buf, c, &temp, NULL))
    {
      throw GetLastError();
    }

    if (temp != c)
    {
      throw 2;
    }
  }

  CloseHandle(pipe);
  return 0;

Maybe it was just never supposed to work with named pipes, but they are
pretty similar to FIFO so I don't see why they shouldn't work.

Original issue reported on code.google.com by jdmw...@gmail.com on 15 Jan 2010 at 5:19

Attachments:

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

This is very cool.  I'm also trying to fix a performance bug in 3.0w related to 
the 
recent source-from-fifo support, and I'll be able to look at your patch in 
detail in a 
few days.  Unfortunately, no one has reported having short examples of inputs 
that give 
bad performance

Original comment by josh.mac...@gmail.com on 15 Jan 2010 at 6:01

  • Changed state: Started

GoogleCodeExporter commented Mar 24, 2015

This is very cool.  I'm also trying to fix a performance bug in 3.0w related to 
the 
recent source-from-fifo support, and I'll be able to look at your patch in 
detail in a 
few days.  Unfortunately, no one has reported having short examples of inputs 
that give 
bad performance

Original comment by josh.mac...@gmail.com on 15 Jan 2010 at 6:01

  • Changed state: Started
@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

OK interestingly I tried it on a smaller file and it worked ok. 

Could it be a bug with the streaming stuff?

Original comment by jdmw...@gmail.com on 15 Jan 2010 at 6:04

GoogleCodeExporter commented Mar 24, 2015

OK interestingly I tried it on a smaller file and it worked ok. 

Could it be a bug with the streaming stuff?

Original comment by jdmw...@gmail.com on 15 Jan 2010 at 6:04

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

(addendum): Re my last comment, I mean I tried it using my patch and it worked.

Original comment by jdmw...@gmail.com on 15 Jan 2010 at 6:05

GoogleCodeExporter commented Mar 24, 2015

(addendum): Re my last comment, I mean I tried it using my patch and it worked.

Original comment by jdmw...@gmail.com on 15 Jan 2010 at 6:05

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

There was a bug introduced in 3.0w and fixed in SVN 310.  I wonder how this 
relates to 
your fix, but I haven't had a chance to look yet.

Original comment by josh.mac...@gmail.com on 16 Feb 2010 at 5:43

GoogleCodeExporter commented Mar 24, 2015

There was a bug introduced in 3.0w and fixed in SVN 310.  I wonder how this 
relates to 
your fix, but I haven't had a chance to look yet.

Original comment by josh.mac...@gmail.com on 16 Feb 2010 at 5:43

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

I've released 3.0x (source).  Would you mind trying again?

Original comment by dotdotis...@gmail.com on 16 Feb 2010 at 6:47

GoogleCodeExporter commented Mar 24, 2015

I've released 3.0x (source).  Would you mind trying again?

Original comment by dotdotis...@gmail.com on 16 Feb 2010 at 6:47

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

I've released 3.0y...

Original comment by josh.mac...@gmail.com on 23 Feb 2010 at 4:41

GoogleCodeExporter commented Mar 24, 2015

I've released 3.0y...

Original comment by josh.mac...@gmail.com on 23 Feb 2010 at 4:41

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

OK, I think I've found the problem.

The issue is the way xdelta handles "EOF" on an input stream.

It seems that if it reads less than it asked for in "main_read_primary_input" 
then
that must signify EOF. 

E.g., if the buffer size is 16KB and it only receives 8KB, then we're at EOF.

While this is usually a safe bet on a file where it will always fill the input
buffer, this may not work with windows pipes.

I've verified this experimentally using latest release (3.0y).

In my code, I create a pipe to stream 1025 bytes worth of data.

I use a standard copy loop which copies 1024 bytes at a time.

I put in a pause after the first loop (i.e. 1024 bytes are copied). This leaves 
1
byte remaining to be copied.

When I do this xdelta reads the 1024 bytes, processes it but then exits.

I can see that the xdelta process has ended before I am able to send the last
reamining byte.


Note that you *could* buffer up bytes on the input but I suspect this might end 
in
disaster in rare cases. E.g., you'd have to make sure you passed in data as 
multiples
of the buffer size. Otherwise, you could end up with a case where xdelta gets 
ahead
of the sended and thinks there is EOF.

Note that the docs for ReadFile do NOT indicate that failure to fill a buffer
indicates EOF. In fact, it says that EOF will be indicated by returning with 0 
read
bytes.

So, it seems to me the bug is in the use of ReadFile. 

To reiterate:

 * Failure to fill read buffer should NOT be considered EOF on Windows
 * EOF is correctly established by ReadFile returning zero.

Original comment by jdmw...@gmail.com on 12 Mar 2010 at 3:30

GoogleCodeExporter commented Mar 24, 2015

OK, I think I've found the problem.

The issue is the way xdelta handles "EOF" on an input stream.

It seems that if it reads less than it asked for in "main_read_primary_input" 
then
that must signify EOF. 

E.g., if the buffer size is 16KB and it only receives 8KB, then we're at EOF.

While this is usually a safe bet on a file where it will always fill the input
buffer, this may not work with windows pipes.

I've verified this experimentally using latest release (3.0y).

In my code, I create a pipe to stream 1025 bytes worth of data.

I use a standard copy loop which copies 1024 bytes at a time.

I put in a pause after the first loop (i.e. 1024 bytes are copied). This leaves 
1
byte remaining to be copied.

When I do this xdelta reads the 1024 bytes, processes it but then exits.

I can see that the xdelta process has ended before I am able to send the last
reamining byte.


Note that you *could* buffer up bytes on the input but I suspect this might end 
in
disaster in rare cases. E.g., you'd have to make sure you passed in data as 
multiples
of the buffer size. Otherwise, you could end up with a case where xdelta gets 
ahead
of the sended and thinks there is EOF.

Note that the docs for ReadFile do NOT indicate that failure to fill a buffer
indicates EOF. In fact, it says that EOF will be indicated by returning with 0 
read
bytes.

So, it seems to me the bug is in the use of ReadFile. 

To reiterate:

 * Failure to fill read buffer should NOT be considered EOF on Windows
 * EOF is correctly established by ReadFile returning zero.

Original comment by jdmw...@gmail.com on 12 Mar 2010 at 3:30

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

Just an afterthought: My previous comment actually related to using pipes as the
input file not the reference/source file (-s). But, I suspect the same code is 
used
either way to read from a stream so I think it's still relevant.

Original comment by jdmw...@gmail.com on 15 Mar 2010 at 10:10

GoogleCodeExporter commented Mar 24, 2015

Just an afterthought: My previous comment actually related to using pipes as the
input file not the reference/source file (-s). But, I suspect the same code is 
used
either way to read from a stream so I think it's still relevant.

Original comment by jdmw...@gmail.com on 15 Mar 2010 at 10:10

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

I see.  This kind of issue is handled in the POSIX code path.

Original comment by josh.mac...@gmail.com on 15 Mar 2010 at 9:46

GoogleCodeExporter commented Mar 24, 2015

I see.  This kind of issue is handled in the POSIX code path.

Original comment by josh.mac...@gmail.com on 15 Mar 2010 at 9:46

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

Hmm. It should be fairly straightforward to fix. All we'd need to do is make 
ReadFile loop around if the 
input buffer isn't filled. I'm happy to submit a patch if you like? Question: 
will the code base tolerate that 
function returning zero bytes? I guess this must already be ok to do this...

Original comment by jdmw...@gmail.com on 15 Mar 2010 at 10:49

GoogleCodeExporter commented Mar 24, 2015

Hmm. It should be fairly straightforward to fix. All we'd need to do is make 
ReadFile loop around if the 
input buffer isn't filled. I'm happy to submit a patch if you like? Question: 
will the code base tolerate that 
function returning zero bytes? I guess this must already be ok to do this...

Original comment by jdmw...@gmail.com on 15 Mar 2010 at 10:49

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

Hi, I'm working on this one now (with a new Windows 7 machine).  First, I agree 
with your diagnosis and I'm working on the ReadFile loop (similar to the 
handling of POSIX file handles in the same location).

I've made a test pipe-server program using the code above and I'm having 
trouble getting ERROR_PIPE_DISCONNECTED.  I added a call to FlushFileBuffer() 
and DisconnectNamedPipe() prior to CloseHandle(), but my pipe-client (xdelta3) 
still gets read errors.  I'll keep working on it. 

My next question is whether/how to support a pipe as the input and output file. 
 In POSIX, xdelta uses the default /dev/stdin and /dev/stdout to connect to the 
standard input and output, but I don't know how to test these features in 
windows.  What command-line would I use to compress from stdin to stdout?


Thanks.

Original comment by josh.mac...@gmail.com on 31 Jul 2010 at 4:05

GoogleCodeExporter commented Mar 24, 2015

Hi, I'm working on this one now (with a new Windows 7 machine).  First, I agree 
with your diagnosis and I'm working on the ReadFile loop (similar to the 
handling of POSIX file handles in the same location).

I've made a test pipe-server program using the code above and I'm having 
trouble getting ERROR_PIPE_DISCONNECTED.  I added a call to FlushFileBuffer() 
and DisconnectNamedPipe() prior to CloseHandle(), but my pipe-client (xdelta3) 
still gets read errors.  I'll keep working on it. 

My next question is whether/how to support a pipe as the input and output file. 
 In POSIX, xdelta uses the default /dev/stdin and /dev/stdout to connect to the 
standard input and output, but I don't know how to test these features in 
windows.  What command-line would I use to compress from stdin to stdout?


Thanks.

Original comment by josh.mac...@gmail.com on 31 Jul 2010 at 4:05

@GoogleCodeExporter

This comment has been minimized.

Show comment
Hide comment
@GoogleCodeExporter

GoogleCodeExporter Mar 24, 2015

SVN 317 should solve this, although I couldn't figure out a proper way to 
detect EOF when reading from a named pipe.  The solution I came up with 
involves breaking out of the ReadFile loop when an ERROR_BROKEN_PIPE happens.  
When the pipe-writer wants to end the file, it closes the handle, causing the 
broken pipe.  I've tested this and it seems to work.

Original comment by josh.mac...@gmail.com on 1 Aug 2010 at 2:43

  • Changed state: Fixed

GoogleCodeExporter commented Mar 24, 2015

SVN 317 should solve this, although I couldn't figure out a proper way to 
detect EOF when reading from a named pipe.  The solution I came up with 
involves breaking out of the ReadFile loop when an ERROR_BROKEN_PIPE happens.  
When the pipe-writer wants to end the file, it closes the handle, causing the 
broken pipe.  I've tested this and it seems to work.

Original comment by josh.mac...@gmail.com on 1 Aug 2010 at 2:43

  • Changed state: Fixed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment