GDAL_FILENAME_IS_UTF8 config option doesn't work for cyrillic symbols #38

Closed
Gigas002 opened this issue Feb 3, 2021 · 10 comments
Labels
bug-windows-only This bug is only spawns on windows

Comments

@Gigas002

Gigas002 commented Feb 3, 2021

No one has picked up this issue in GDAL's repo for 2 years now, so I decided to test it against your version of the GDAL bindings.

Any GeoTIFF will do as input data. Slightly updated test code:

using System;
using System.IO;
using System.Linq;
using OSGeo.GDAL;
using MaxRev.Gdal.Core;

namespace TestGdalBuildVrt
{
    internal static class Program
    {
        private static void Main()
        {
            // Configure Gdal's paths before using it. Don't forget to change target system to x64
            GdalBase.ConfigureAll();

            // Check the default state of "GDAL_FILENAME_IS_UTF8" config option.
            string currentState = Gdal.GetConfigOption("GDAL_FILENAME_IS_UTF8", "YES");

            // Paths local variables
            string dataDirectoryPath = "D:/test";
            string inputEngDirectoryPath = Path.Combine(dataDirectoryPath, "input-eng");
            string inputCyrDirectoryPath = Path.Combine(dataDirectoryPath, "input-cyr");

            string englishInputFilePath = Path.Combine(inputEngDirectoryPath, "input data.tif");
            string cyrillicInputFilePath = Path.Combine(inputCyrDirectoryPath, "исходные данные.tif");
            string outputFilePath;

            #region Test 1

            // Test 1 - Passing. Paths don't contain any Cyrillic symbols

            Console.WriteLine($"GDAL_FILENAME_IS_UTF8 is set to {currentState} by default");

            outputFilePath = Path.Combine(dataDirectoryPath, "test1.vrt");
            string testResult = RunTest(englishInputFilePath, inputEngDirectoryPath, outputFilePath) ? "passed" : "failed";
            Console.WriteLine($"Test 1 {testResult}");

            #endregion

            #region Test 2

            // Test 2 - Gdal.Open passes, BuildVrt fails (doesn't throw errors/exceptions, but produces no output file), writes a "warning" to the console

            Console.WriteLine($"GDAL_FILENAME_IS_UTF8 is set to {currentState} before the test");

            outputFilePath = Path.Combine(dataDirectoryPath, "test2.vrt");
            testResult = RunTest(cyrillicInputFilePath, inputCyrDirectoryPath, outputFilePath) ? "passed" : "failed";
            Console.WriteLine($"Test 2 {testResult}");

            #endregion

            // Change the "GDAL_FILENAME_IS_UTF8" value to "NO" and check that it was changed correctly.
            Gdal.SetConfigOption("GDAL_FILENAME_IS_UTF8", "NO");
            currentState = Gdal.GetConfigOption("GDAL_FILENAME_IS_UTF8", "YES");

            #region Test 3

            // Test 3 - Gdal.Open passes, BuildVrt fails (doesn't throw errors/exceptions, but produces no output file), writes a "warning" to the console

            Console.WriteLine($"GDAL_FILENAME_IS_UTF8 is set to {currentState} before the test");

            outputFilePath = Path.Combine(dataDirectoryPath, "test3.vrt");
            testResult = RunTest(englishInputFilePath, inputEngDirectoryPath, outputFilePath) ? "passed" : "failed";
            Console.WriteLine($"Test 3 {testResult}");

            #endregion

            #region Test 4

            // Test 4 - everything fails: Gdal.Open throws an exception, BuildVrt writes a "" warning

            Console.WriteLine($"GDAL_FILENAME_IS_UTF8 is set to {currentState} before the test");

            outputFilePath = Path.Combine(dataDirectoryPath, "test4.vrt");
            testResult = RunTest(cyrillicInputFilePath, inputCyrDirectoryPath, outputFilePath) ? "passed" : "failed";
            Console.WriteLine($"Test 4 {testResult}");

            #endregion
        }

        private static bool GdalBuildVrt(string[] inputFilesPaths, string outputFilePath, string[] options, Gdal.GDALProgressFuncDelegate callback)
        {
            try
            {
                using Dataset result = Gdal.wrapper_GDALBuildVRT_names(outputFilePath, inputFilesPaths, new GDALBuildVRTOptions(options), callback, null);
            }
            catch (Exception exception)
            {
                Console.WriteLine(exception.Message);

                return false;
            }

            return true;
        }

        private static bool OpenDataset(string inputFilePath)
        {
            try
            {
                using Dataset inputDataset = Gdal.Open(inputFilePath, Access.GA_ReadOnly);
            }
            catch (Exception exception)
            {
                Console.WriteLine(exception.Message);

                return false;
            }

            return true;
        }

        private static bool RunTest(string inputFilePath, string inputDirectoryPath, string outputFilePath)
        {
            bool isTestSuccessful = OpenDataset(inputFilePath);

            string[] inputFilesPaths = new DirectoryInfo(inputDirectoryPath)
                                      .EnumerateFiles().Select(fileInfo => fileInfo.FullName).ToArray();
            if (!GdalBuildVrt(inputFilesPaths, outputFilePath, null, null))
                isTestSuccessful = false;

            // Check if .vrt file was created, because GdalBuildVrt doesn't throw exceptions in that case
            if (!new FileInfo(outputFilePath).Exists)
                isTestSuccessful = false;

            return isTestSuccessful;
        }
    }
}

Tested on Win10 x64, MaxRev.Gdal.Core ver. 3.2.0.250, MaxRev.Gdal.WindowsRuntime.Minimal ver. 3.2.0.250. Run on .NET ver. 5.0.2.

An interesting difference from the original issue is that Test 4 now produces an almost correct .vrt file, but the file's name is corrupted, and so is the path inside it. Also, in that last test GDAL fails to open the dataset, yet GdalBuildVrt doesn't seem to emit any warnings or exceptions.
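
One way a file name can end up corrupted like this is when UTF-8 bytes get decoded with a single-byte code page somewhere along the way. The sketch below only illustrates that mechanism; it is not a diagnosis of what GDAL or the bindings actually do. `PathMojibake`, `Misdecode` and `Repair` are hypothetical names, and ISO-8859-1 is just a stand-in for whatever code page might be involved.

```csharp
using System.Text;

internal static class PathMojibake
{
    // ISO-8859-1 is only a stand-in single-byte code page (an assumption).
    private static readonly Encoding SingleByte = Encoding.GetEncoding("iso-8859-1");

    // Encodes the path as UTF-8, then decodes those bytes as single-byte text.
    public static string Misdecode(string path) =>
        SingleByte.GetString(Encoding.UTF8.GetBytes(path));

    // Reverses Misdecode: recovers the raw bytes and decodes them as UTF-8.
    public static string Repair(string garbled) =>
        Encoding.UTF8.GetString(SingleByte.GetBytes(garbled));
}
```

An ASCII-only name such as "input data.tif" passes through `Misdecode` unchanged, while a Cyrillic name comes back garbled and `Repair` restores it. That would be consistent with English paths working and Cyrillic ones not.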

This also reminds me of #23. Did we end up using iconv in vcpkg, or not? Ah, never mind, I found it in GdalCore.opt.

@MaxRev-Dev
Owner

That's interesting, but I don't see any solution here at the moment. In my opinion, this should be resolved inside GDAL itself.
Do you have any ideas how to solve this?

Anyway, thanks. I just found that on Linux, configure has an additional flag for libiconv, --with-libiconv-prefix. I will apply this fix in the next build.

@Gigas002
Author

Gigas002 commented Feb 4, 2021

I'm not sure it's an internal GDAL issue. I've tested the same dataset with the same paths using:

docker run -it -v D:/test:/test-data --rm osgeo/gdal
gdalbuildvrt "/test-data/input-eng/input.vrt" "/test-data/input-eng/input data.tif" -overwrite
gdalbuildvrt "/test-data/input-cyr/input.vrt" "/test-data/input-cyr/исходные данные.tif" -overwrite

Everything worked without any issues or warnings. Judging by this, testing with GDAL_FILENAME_IS_UTF8 in Docker (the same as on Linux, I suppose) is redundant.

./gdalbuildvrt D:/test/input-eng/input.vrt "D:/test/input-eng/input data.tif" -overwrite --config GDAL_FILENAME_IS_UTF8 NO
./gdalbuildvrt D:/test/input-eng/input.vrt "D:/test/input-eng/input data.tif" -overwrite --config GDAL_FILENAME_IS_UTF8 YES
./gdalbuildvrt D:/test/input-cyr/input.vrt "D:/test/input-cyr/исходные данные.tif" -overwrite --config GDAL_FILENAME_IS_UTF8 NO
./gdalbuildvrt D:/test/input-cyr/input.vrt "D:/test/input-cyr/исходные данные.tif" -overwrite --config GDAL_FILENAME_IS_UTF8 YES

The only test that failed was number 3 (GDAL_FILENAME_IS_UTF8 set to NO with a Cyrillic path). Number 4 worked without any issues or warnings.
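
Since only the Cyrillic path is sensitive to the option in these runs, a quick pre-check can flag the inputs worth testing under both settings. This is a minimal sketch; `PathChecks.ContainsNonAscii` is a hypothetical helper, not part of GDAL or the bindings.

```csharp
using System.Linq;

internal static class PathChecks
{
    // True when any character falls outside the 7-bit ASCII range.
    public static bool ContainsNonAscii(string path) => path.Any(c => c > 127);
}
```

For the paths above, this would return false for "D:/test/input-eng/input data.tif" and true for the Cyrillic one.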

MaxRev-Dev added a commit that referenced this issue Mar 7, 2021
@MaxRev-Dev MaxRev-Dev added the bug-windows-only This bug is only spawns on windows label Mar 7, 2021
@MaxRev-Dev
Owner

I tested with your code, slightly adjusted for xUnit tests.
I can confirm that this issue exists in the latest packages (*.300), and only on the Windows runtime.

Another important detail is that the Docker images were built on Ubuntu 20.04 and newer for Windows.
As this fails on Windows only, I think it may be related to iconv usage. Both these packages and GISInternals use libiconv 1.16, so there is no version difference.
Finally, a question: is iconv the only character translator in GDAL, or are we missing something?

@MaxRev-Dev
Owner

More on this. I had to test Cyrillic symbols via the clipboard, because I can't type them directly in the Windows 10 console, and it's not a font issue. The console behaves weirdly, probably because it works through a tty layer: the caret just jumps around randomly on input.
Maybe I should test it in the legacy console? But that requires restarting all cmd instances. Anyway, it causes more problems than it solves.

@Gigas002
Author

Sorry for the late answer. I've tested the latest vcpkg build (version 3.2.2, x64-windows) and it seems to work fine with Cyrillic symbols (though I ran into this old issue: OSGeo/gdal#568).

@anton-petrov
Contributor

anton-petrov commented Oct 31, 2021

Maybe this issue should be closed?

@Gigas002
Author

Sorry for necroposting, but the issue isn't actually resolved. I just tested it on a Windows 11 PC with the 3.3.3 binaries and it fails. The tests mentioned in the commit above run under the ubuntu-latest runner, and indeed there's no issue with Cyrillic paths on Linux.

@MaxRev-Dev
Owner

@Gigas002 Here's a dirty hack: this is what I changed in the tests that are executed on Windows.
The other tests are passing as intended.
Could you please provide another test method for that case?

// this works like a charm on linux even without config flag
if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    Assert.True(result);
// windows can't find a file though
else if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    Assert.False(result);
else
    throw new XunitException("This test was not created for current os platform");
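
The same platform branching could be pulled out of the test body so the expectation is stated in one place. This is only a sketch of one possible shape: `CyrillicPathExpectations`, `ExpectedResult` and `CurrentPlatform` are hypothetical names, and the expected values simply mirror what was observed in this thread.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class CyrillicPathExpectations
{
    // Expected result of the Cyrillic-path test per platform, as observed in this thread.
    public static bool ExpectedResult(OSPlatform platform)
    {
        if (platform == OSPlatform.Linux) return true;     // works even without the config flag
        if (platform == OSPlatform.Windows) return false;  // Windows can't find the file
        throw new NotSupportedException($"No expectation recorded for {platform}");
    }

    // Resolves the platform the test is currently running on.
    public static OSPlatform CurrentPlatform()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux)) return OSPlatform.Linux;
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return OSPlatform.Windows;
        if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX)) return OSPlatform.OSX;
        throw new NotSupportedException("Unrecognized OS platform");
    }
}
```

In an xUnit test this would reduce to a single assertion such as `Assert.Equal(CyrillicPathExpectations.ExpectedResult(CyrillicPathExpectations.CurrentPlatform()), result);`, and flipping the Windows expectation later means changing one line.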

@Gigas002
Author

Well, the 4th test fails for me too, but IMO the main problem is that the 2nd test fails for the same reason. That happens even if I try to convert the paths manually like this:

var cyrillicBytes = Encoding.Default.GetBytes(cyrillicPath);
var utf8Path = Encoding.UTF8.GetString(cyrillicBytes);
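
A note on this conversion: on .NET Core and .NET 5+, `Encoding.Default` is UTF-8, so the round trip above should hand back the original string unchanged rather than re-encode it. The sketch below makes that easy to check on a given runtime; `EncodingRoundTrip` and its method names are hypothetical.

```csharp
using System.Text;

internal static class EncodingRoundTrip
{
    // Mirrors the conversion above: bytes from Encoding.Default, decoded as UTF-8.
    public static string DefaultToUtf8(string path) =>
        Encoding.UTF8.GetString(Encoding.Default.GetBytes(path));

    // True when Encoding.Default is UTF-8 (code page 65001), as on .NET Core / .NET 5+.
    public static bool DefaultIsUtf8() => Encoding.Default.CodePage == 65001;
}
```

If `DefaultIsUtf8()` returns true, `DefaultToUtf8(cyrillicPath)` equals `cyrillicPath`, which would explain why the manual conversion makes no difference to the test outcome.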

@MaxRev-Dev
Owner

That's really weird. What's the output of
dotnet test --filter="FullyQualifiedName~CyrillicSymbols_YES_OptionDefault"
On my machine, everything works as expected.
I don't know, maybe it's an issue with the runtime version?
Please tell me what you think.
