Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long filename crashes dzsave #184

Closed
ax-meyer opened this issue Nov 24, 2022 · 13 comments · Fixed by libvips/libvips#3188
Closed

Long filename crashes dzsave #184

ax-meyer opened this issue Nov 24, 2022 · 13 comments · Fixed by libvips/libvips#3188
Labels
bug Something isn't working

Comments

@ax-meyer
Copy link

ax-meyer commented Nov 24, 2022

Using long filenames (>~230 characters) crashes the Image.Dzsave method without an exception - it actually crashes the whole application without any warning or exception.

Example code:

using NetVips;

using var image = Image.NewFromFile("testImage.jpg");

string shortFileName = "short";
image.Dzsave(shortFileName);
Console.WriteLine("short success");
Directory.Delete(shortFileName + "_files", true);

string longFileName = "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong";
image.Dzsave(longFileName);
Console.WriteLine("long success");
Directory.Delete(longFileName + "_files", true);

Output:

> NetVipsPathBugDemo.exe 
short success

** (process:14852): CRITICAL **: 13:26:48.876: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed

** (process:14852): CRITICAL **: 13:26:48.878: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed

** (process:14852): CRITICAL **: 13:26:48.879: gsf_output_write: assertion 'output != NULL' failed

** (process:14852): CRITICAL **: 13:26:48.879: gsf_output_close: assertion 'GSF_IS_OUTPUT (output)' failed

** (process:14852): CRITICAL **: 13:26:48.879: gsf_output_set_error: assertion 'GSF_IS_OUTPUT (output)' failed

** (process:14852): CRITICAL **: 13:26:48.879: gsf_output_error: assertion 'GSF_IS_OUTPUT (output)' failed

csrpoj:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net7.0</TargetFramework>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
    </PropertyGroup>

    <ItemGroup>
      <PackageReference Include="NetVips" Version="2.2.0" />
      <PackageReference Include="NetVips.Native.win-x64" Version="8.13.2" />
    </ItemGroup>

    <ItemGroup>
      <None Update="testImage.tif">
        <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      </None>
    </ItemGroup>

</Project>

Test image file:
testImage

@kleisauke kleisauke added the triage This issue is being investigated label Nov 24, 2022
@kleisauke
Copy link
Owner

I assume this was tested on Windows since you're using the NetVips.Native.win-x64 package? If so, it seems that Windows has a 260-character pathname limit (including the null terminator), which GLib uses internally, but libvips limits it to 4096 on all platforms, hence the mismatch, see:
https://github.com/libvips/libvips/blob/5947f6ed11eb1abd7a6a6282a3a34d1d08e9e934/libvips/include/vips/util.h#L213-L216
https://github.com/GNOME/glib/blob/320b24f1f1cab0102f8ad81ec88acbfd6400ec8d/glib/gfileutils.c#L2852-L2860
https://github.com/mingw-w64/mingw-w64/blob/3dfe762034a6bc1b1062dd2f64b79685d66903c1/mingw-w64-headers/crt/limits.h#L11-L20

/cc @jcupitt

@ax-meyer
Copy link
Author

Yes, Windows has the character limit.
I think this is usually solved using this: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getshortpathnamew
At the very least, I think DzSave should throw an exception if the path is too long and not just crash the whole application.

@jcupitt
Copy link
Contributor

jcupitt commented Nov 24, 2022

I believe Windows has a separate API which can handle long paths on some filesystems, but it's not very well supported. Explorer, for example, will fail for long paths, and node on windows is always struggling with this. My understanding is that it's simplest just to avoid deep directory structures on Windows.

(I could be very out of date here, perhaps this has been improved since I was last bitten by it)

@jcupitt
Copy link
Contributor

jcupitt commented Nov 24, 2022

At the very least, I think DzSave should throw an exception if the path is too long and not just crash the whole application.

I'd be a bit surprised if dzsave is crashing, I think it checks all return values. I would suspect libgsf (all software devs are always certain it's someone else's fault haha). Someone brave would need to investigate.

@jcupitt
Copy link
Contributor

jcupitt commented Nov 24, 2022

... as a workaround, you could save to a zip container instead of the filesystem. It would avoid all filename length issues, and it's much faster, since you avoid the large file creation overhead on win. It'll make better use of your disc space too.

@jcupitt
Copy link
Contributor

jcupitt commented Nov 24, 2022

I tried on my win VM with the openslide tests data here: https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/

jcupi@DESKTOP-HGI6HBR MINGW64 ~/pics
$ ../vips-dev-8.13/bin/vipsheader.exe CMU-1.svs
CMU-1.svs: 46000x32914 uchar, 4 bands, srgb, openslideload

jcupi@DESKTOP-HGI6HBR MINGW64 ~/pics
$ time ../vips-dev-8.13/bin/vips.exe dzsave CMU-1.svs x.zip

real    0m34.119s
user    0m0.000s
sys     0m0.015s

jcupi@DESKTOP-HGI6HBR MINGW64 ~/pics
$ time ../vips-dev-8.13/bin/vips.exe dzsave CMU-1.svs x

real    1m8.675s
user    0m0.015s
sys     0m0.000s

So zip output is about 2x faster, on this machine anyway.

@ax-meyer
Copy link
Author

Thanks for the hint to the zip container.
I switched my application to using those, and yes, it solved the long file path issue, and is about 2x faster for me as well (at least my unit tests only take half as long).

@jcupitt
Copy link
Contributor

jcupitt commented Nov 25, 2022

Windows file creation has got a lot quicker over the last 10 years or so (zip output used to be 5x faster, ouch), but it's still a bit sluggish.

You'll find you need noticeably less disc space with zip too, and it's not because of compression (dzsave zips are uncompressed). JPEG tiles from dzsave are typically ~2kb, so NTFS will store each in a separate 4kb disc sector, leaving 2kb unused. Zip output puts them all end to end in one file, so each disc sector is filled. You can expect maybe a 30% saving in disc space if you count allocated sectors.

You often don't need to unzip them. Because they are uncompressed, most zip libraries (eg. the standard python one) will be able to fetch a tile with a one seek and one read.

@kleisauke
Copy link
Owner

I could reproduce this on my Windows PC. Somehow the directory is not created at all, so perhaps it's (silently) failing on this line(?):
https://github.com/libvips/libvips/blob/5947f6ed11eb1abd7a6a6282a3a34d1d08e9e934/libvips/foreign/dzsave.c#L346

Though, _wmkdir should probably set errno to ENAMETOOLONG when that happens.

I'll have another look tomorrow. It could also be a security feature of UCRT (Universal C Runtime in Windows) which llvm-mingw uses by default.

@kleisauke kleisauke added bug Something isn't working blocked-upstream-dependency Upstream dependency needs to be updated and removed triage This issue is being investigated labels Nov 27, 2022
kleisauke added a commit to kleisauke/libvips that referenced this issue Nov 27, 2022
It can return `NULL` when it exceeds the path limits.

See: kleisauke/net-vips#184
@kleisauke
Copy link
Owner

kleisauke commented Nov 27, 2022

g_mkdir() returns a non-zero result when it exceeds the path limits, as expected.

Details
#include <assert.h>
#include <stdio.h>
#include <errno.h>

#include <glib.h>

int main(int argc, char **argv) {
    if (argc != 2) {
        printf("usage: %s dirname", argv[0]);
        return 1;
    }

    GDir *dir;
    int ret;

    ret = g_mkdir(argv[1], 0777);
    assert(errno == 0);
    assert(ret == 0);

    dir = g_dir_open(argv[1], 0, NULL);
    assert(dir != NULL);

    g_dir_close(dir);

    return 0;
}

Compile with:

PS> cd C:\Users\kleisauke
PS> cl /Zi /MTd /IC:\vips-dev-8.14\include\glib-2.0 /IC:\vips-dev-8.14\lib\glib-2.0\include test.c /link /libpath:C:\vips-dev-8.14\lib libglib-2.0.lib

Test with:

PS> test.exe short
PS> test.exe longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong

PS> copy test.exe C:\Users\kleisauke\net-vips\samples\NetVips.Samples\bin\x64\Release\net7.0
        1 file(s) copied.

PS> cd C:\Users\kleisauke\net-vips\samples\NetVips.Samples\bin\x64\Release\net7.0
PS> test.exe short
PS> test.exe longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong

PS> mkdir another_subdir
PS> copy test.exe another_subdir
PS> cd another_subdir
PS> test.exe short
PS> test.exe longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong
Assertion failed: ret == 0, file test.c, line 18

This means that vips_gsf_path() can also NULL when it exceeds the path limits, however libvips did not always check the return result of this. I just added error handling for this with PR libvips/libvips#3188.

With that PR and when applying this patch:

@@ -8,6 +8,14 @@ Console.WriteLine("short success");
 Directory.Delete(shortFileName + "_files", true);
 
 string longFileName = "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong";
-image.Dzsave(longFileName);
+try
+{
+    image.Dzsave(longFileName);
+}
+catch (VipsException e)
+{
+    Console.WriteLine($"{e.Message}");
+    throw;
+}
 Console.WriteLine("long success");
 Directory.Delete(longFileName + "_files", true);

I now see:

** (process:15976): CRITICAL **: 13:42:10.618: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed

** (process:15976): CRITICAL **: 13:42:10.619: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed

** (process:15976): CRITICAL **: 13:42:10.620: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed

** (process:15976): CRITICAL **: 13:42:10.621: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed

** (process:15976): CRITICAL **: 13:42:10.621: gsf_outfile_new_child_full: assertion 'outfile != NULL' failed
unable to call dzsave
wbuffer_write: write failed

And the program terminates with a proper exception instead.

Thanks for reporting this! It'll be in libvips 8.14, due RSN.

jcupitt pushed a commit to libvips/libvips that referenced this issue Nov 27, 2022
It can return `NULL` when it exceeds the path limits.

See: kleisauke/net-vips#184
@jcupitt
Copy link
Contributor

jcupitt commented Nov 27, 2022

Nice find @kleisauke ! So yes (ahem), I was wrong, it was libvips's fault.

@kleisauke
Copy link
Owner

A release candidate of NetVips.Native v8.14.0 is now available for testing.
https://www.nuget.org/packages/NetVips.Native/8.14.0-rc1

@kleisauke kleisauke removed the blocked-upstream-dependency Upstream dependency needs to be updated label Mar 24, 2023
@kleisauke
Copy link
Owner

NetVips v2.3.0 and NetVips.Native v8.14.2 is now available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants