Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for Issue 7648. #3138

Closed
wants to merge 13 commits into from
Closed

Fix for Issue 7648. #3138

wants to merge 13 commits into from

Conversation

aka-demik
Copy link
Contributor

@schveiguy
Copy link
Member

Can you add a unittest for this situation?

@mgwasus
Copy link

mgwasus commented Apr 3, 2015

I loaded master.phobos and compiled phobos.lib. I try to compile with new phobos.lib. In case of execution I receive an error. Windows 7 (32), cp1251. This error is and in 65 and 66 and 67 versions dmd.
Source text in Utf-8.


import std.stdio;
void main() {
File(«Ёлки Объект №876».txt, "w").writeln("Привет из D");
}

I correct this error the following patch in stdio.d


private FILE* fopen(in char[] name, in char[] mode = "r") @trusted // nothrow @nogc -> mgw disable for fromUtf8toAnsiW
{
import std.internal.cstring : tempCString;

version(Windows)
{
    // 02.04.2015 8:31:19 Адаптированно для dmd 2.067.0 
    wchar* fromUtf8toAnsiW(in char[] s, uint codePage = 0) @trusted  {
        import std.c.windows.windows: WideCharToMultiByte, GetLastError;
        import std.windows.syserror: sysErrorString;
        import std.conv: to;
        char[] result, rez; int readLen; auto ws = std.utf.toUTF16z(s);
        result.length = WideCharToMultiByte(codePage, 0, ws, -1, null, 0, null, null);
        if (result.length)  {
            readLen = WideCharToMultiByte(codePage, 0, ws, -1, result.ptr,  to!int(result.length), null, null);
            for(int i; i != result.length; i++) { rez ~= result[i]; rez ~= 0; } rez ~= 0; rez ~= 0;
        }
        if (!readLen || readLen != result.length)   {
            throw new Exception("Couldn't convert string: " ~ sysErrorString(GetLastError()));
        }
        return cast(wchar*)rez.ptr;
    }

    import std.internal.cstring : tempCStringW;
    // return _wfopen(name.tempCStringW(), mode.tempCStringW()); // mgw disable for  fromUtf8toAnsiW
    return _wfopen(fromUtf8toAnsiW(name), mode.tempCStringW());
}
else version(Posix)
{
    import core.sys.posix.stdio : fopen;
    return fopen(name.tempCString(), mode.tempCString());
}
else
{
    return .fopen(name.tempCString(), mode.tempCString());
}

}

@aka-demik
Copy link
Contributor Author

Added a simple unit test.

@schveiguy
Copy link
Member

@mgwasus so you are saying that this patch does NOT fix your issue?

@aka-demik
Copy link
Contributor Author

to mgwasus: WideCharToMultiByte convert to local code page. So this may fail on windows with cp1251:

File("Sehenswürdigkeit.txt", "r").writeln("Hello from Germany.");

In any case, I think that your patch for "fopen" makes sense. Because now we can not use "fopen" on windows, if the file name contains non-ASCII characters.
But it does not solve Issue 7648.

@aka-demik
Copy link
Contributor Author

I think mgwasus offered an alternative solution.

@schveiguy
Copy link
Member

I'm going to defer to others on this regarding whether this fixes the issue, as language support is not my forte, but the implementation seems sound to me.

@aka-demik
Copy link
Contributor Author

This PR is the implementation of @CyberShadow -s suggestion, from 7648 thread:

Vladimir Panteleev 2014-02-14 11:32:05 UTC
How about using CreateFile + windowsHandleOpen?

Maybe he could check?

@aka-demik
Copy link
Contributor Author

Woops. Based on unittest in std\stdio.d:1219 fopen opens the file in FILE_SHARE_READ + FILE_SHARE_WRITE. I will fix it.

@CyberShadow
Copy link
Member

Hi,

  • mgwasus' solution only applies to the DigitalMars C runtime, and it only works if the filename contains characters in the system codepage. It is an improvement but a minor one.
  • Using CreateFile/windowsHandleOpen allows us to bypass the C runtime's crappy filename parsing at the expense of duplicating some of its functionality (parsing mode flags, at least). Some regressions might appear (esp. with C runtime interoperability) if we don't match it's functionality exactly.
  • Another approach would be to ask for the source code to the DM C runtime from Walter Bright. He has given it out in the past to people who have shown intent to fix things in them. If DMC's _wfopen can be fixed to properly work with wide characters (instead of casting each char to the system CP like it does now), the Phobos-side patch would be much simpler.

About this patch:

  • trustedHandleToFD seems to be a copy-paste of windowsHandleOpen. Please refactor things instead of copy-pasting them.
  • I have a third-party implementation of the rest of mode parsing here. Might be worth comparing the implementations, discrepancies might indicate bugs in one or another.

@aka-demik
Copy link
Contributor Author

I'll be ready for the second round on Monday.
Fixing DMC's _wfopen seems to me the right decision. But I can not estimate how difficult it is.

@rainers
Copy link
Member

rainers commented Apr 4, 2015

The functions that are explicitely mapped from Unicode back to the bad multibyte versions by the DMC runtime are CreateFile, DeleteFile, MoveFile, GetFileAttributes, SetFileAttributes, CreateDirectory, RemoveDirectory, FindFirstFile, FindNextFile, GetCurrentDirectory, SetCurrentDirectory, CreateProcess, GetCommandLine and GetModuleFileName. So there are quite a few other C runtime functions other than fopen that fail to work with non-ascii characters.
If dmc ditches Win95 support, it should be rather simple for @WalterBright to map the unicode functions to the correct API.

@CyberShadow
Copy link
Member

I think DMC still supports MS-DOS, but as D supports XP and up, maybe @WalterBright would be OK with a DM libc branch/fork for D.

@rainers
Copy link
Member

rainers commented Apr 4, 2015

The DOS versions would not be affected, but some early Windows version might. For full compatibility a runtime check for availability of the W-Versions could be added, too.

@CyberShadow
Copy link
Member

If you have time to handle this, I think you should email Walter Bright and get the code.

@rainers
Copy link
Member

rainers commented Apr 4, 2015

Actually I've got it a few years ago as part of the commercial compiler version, but never tried to build it. I'll have a look...

@MGWL
Copy link

MGWL commented Apr 4, 2015

I have loaded function _wfopen () from msvcrt.dll using functions LoadLibrary and GetProcAddress. This loaded dynamically function fine works with wide lines and creates files with any names. Thus, precisely mistake in _wfopen from dmc.

It is strange, why such serious mistake is not corrected till now.

@CyberShadow
Copy link
Member

Yeah, by now it's not a secret that DM libc's "wide" functions simply convert the wide strings down to ANSI (and not even doing any sort of Unicode/codepage conversion, but simply by casting each wchar_t to a char).

@rainers
Copy link
Member

rainers commented Apr 4, 2015

The dmc lib does a WideCharToMultiByte conversion, but that only works for characters in the current code page.
I've emailed a fix to Walter, let's see whether it is accepted.

@d-random-contributor
Copy link

Unicode winapi is supported on win9x through MSLU

@aka-demik
Copy link
Contributor Author

As I understand it, we waiting for a reply Walter Bright.
In case if we can not fix DMC, I corrected PR taking into account the wishes of CyberShadow.
Rebased. Just not sure that is correct.

@aka-demik
Copy link
Contributor Author

Ping @rainers
Have passed two weeks. Is there no news from Walter Bright?

@rainers
Copy link
Member

rainers commented Apr 22, 2015

Sorry, I didn't hear back from @WalterBright regarding my second version. I don't know if it is ignored or accepted.

@WalterBright
Copy link
Member

Sorry, I'd overlooked it. Will check it out.

@WalterBright
Copy link
Member

dmd does not support Windows versions prior to XP (and not sure if we officially dumped XP or not). Hence, there is no need to deal with MSLU. There never was anyway - prior versions of Phobos handled Win95 correctly, but this support was removed.

I.e. just go ahead and call the Windows 'W' functions.

dmc does need to support Win95, but it'd be better to just work around it in Phobos.

@CyberShadow
Copy link
Member

I.e. just go ahead and call the Windows 'W' functions.

I'm not sure we can implement this completely in Phobos. We need to basically completely reimplement fopen but in a way that it constructs a FILE* compatible with other C functions. For example, how would you implement the "a+" file mode, which reads from one point in the file and writes to another?

@aka-demik
Copy link
Contributor Author

"a+" implemented by FILE_APPEND_DATA | GENERIC_READ flags.
I will add unittest that would confirm this.

@CyberShadow
Copy link
Member

"a+" implemented by FILE_APPEND_DATA | GENERIC_READ flags.

Nice, didn't know that. I don't see this documented on MSDN anywhere.

@aka-demik
Copy link
Contributor Author

Nice, didn't know that. I don't see this documented on MSDN anywhere.

On MSDN CreateFile i found comment

Opening File for APPEND access
You can get atomic append on local files by opening a file with FILE_APPEND_DATA access and without FILE_WRITE_DATA access.
This behavior is documented in Windows Driver Kit / Device and Driver Technologies / Installable File System / Reference / IO Manager Routines / IoCreateFileSpecifyDeviceObject

If only the FILE_APPEND_DATA and SYNCHRONIZE flags are set, the caller can write only to the end of the file, and any offset information about writes to the file is ignored. However, the file will automatically be extended as necessary for this type of write operation.

In any case, on vanilla dmd(v2.067.1) and ldc(0.15.2-beta1) read+write with "a+" also does not working. My testcase Can't check result on DPaste because Result: Runtime error / Return code: 25 (File size limit exceeded).
dmd result:

Test 1 - ok
Test 2 - fail
Test 3 - fail

ldc result:

Test 1 - ok
Test 2 - ok
std.exception.ErrnoException@e:\develop\apps\ldc2-0.15.2-beta1-mingw-x86\bin/../import\std\stdio.d(2223):  (Invalid argument)
----------------
...

I do not correctly understand the mode "a+"?
If I add a unittest for mode "a+" in this PR, it will not be passed for all configurations except DIGITAL_MARS_STDIO on Windows.

@aka-demik
Copy link
Contributor Author

Mmm... I checked with tutorialspoint's DMD64 v2.066. Got:

Test 1 - ok                                                                                                                                
Test 2 - ok                                                                                                                                
Test 3 - ok

Apparently, it is a problem only on windows.

@rainers
Copy link
Member

rainers commented May 19, 2015

Ping @rainers
Have passed two weeks. Is there no news from Walter Bright?

Walter has built a new version of snn.lib with the patches, it is already in the dmc package found here http://www.digitalmars.com/download/freecompiler.html. This should fix _wfopen, _wunlink, _wrename, _wmkdir, _wrmdir and a few more.

@aka-demik
Copy link
Contributor Author

Thank you. I looked at the name of the archive dm857c.zip, rather than the content. And I did not notice the update.

@aka-demik
Copy link
Contributor Author

aka-demik commented May 19, 2015

With the new DMC everything is working fine on the master branch (and on 2.067.1 to). Thanks to the preparatory steps in #788. I close this PR. And opened a new one to perform the cleanup doc's info about Issue 7648.
Perhaps we need to rebuild the current release (exe and zip) with the new version DMC?
Thanks!

@CyberShadow
Copy link
Member

Walter has built a new version of snn.lib with the patches, it is already in the dmc package found here http://www.digitalmars.com/download/freecompiler.html. This should fix _wfopen, _wunlink, _wrename, _wmkdir, _wrmdir and a few more.

Which link on that page includes the new snn.lib? dm857c.zip still has the old one.

@aka-demik
Copy link
Contributor Author

Hmm...
When i download "http://ftp.digitalmars.com/Digital_Mars_C++/Patch/dm857c.zip" on 19 May 2015 snn.lib was 557.0 Kib, 2015-05-17, md5:45920f522533547f58eac2ba77883ec3
But now by this link snn.lib is 560.5 Kib, 2013-02-26T05:19, md5:9357508e541067ea34056dade4381dd8

That is, the link to the old version again. Although previously it was new.
My copy of dm857c.zip 19 May 2015

@aka-demik
Copy link
Contributor Author

http://ftp.digitalmars.com/snn.lib - new version of snn.lib.

@CyberShadow
Copy link
Member

Would be nice to have permanent links to specific versions... I guess I'll just nab it from recent DMD zip files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants