New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switch to using 'W' versions of windows functions instead of 'A' #534

Open
andrewrk opened this Issue Oct 14, 2017 · 11 comments

Comments

Projects
None yet
4 participants
@andrewrk
Member

andrewrk commented Oct 14, 2017

Here's the situation:

The 'A' versions of windows functions depend on a global code-page setting. We don't want to depend on global state in this way.

The 'W' versions use UTF-16LE with no global state, which at least supports all of unicode.

There are also some microsoft decisions such as: 'A' versions of file paths are limited to 260 bytes while 'W' versions are limited to 32,727. For some new API functions, there is no 'A' version, only 'W'. In general, 'A' seems legacy and deprecated, and 'W' is the correct way to use the Windows API.

Sadly, since Zig uses UTF-8 in the standard library (and this remains the correct decision), this essentially means decoding UTF-8, encoding UTF-16LE, making a Windows API call, decoding UTF-16LE, encoding UTF-8 for many of our syscalls on windows. But that's how it goes in the windows world.

@andrewrk andrewrk added this to the 0.2.0 milestone Oct 14, 2017

@andrewrk andrewrk added the os-windows label Oct 15, 2017

@PavelVozenilek

This comment has been minimized.

PavelVozenilek commented Oct 16, 2017

'A' functions convert narrow string parameter to UTF16 and call the 'W' variant.

Manual conversion can be done by Win32 functions MultiByteToWideChar/WideCharToMultiByte. Their use is a bit tricky (CP_THREAD_ACP does not work in some situations, CP_OEMCP does) but they allow to find out how much memory will be needed for the conversion and then use good old alloca, not touching the heap.

It is also possible to set codepage to UTF8 per application (or per thread) and let Windows do the conversions.

@andrewrk

This comment has been minimized.

Member

andrewrk commented Oct 16, 2017

Manual conversion can be done by Win32 functions MultiByteToWideChar/WideCharToMultiByte

Zig will use its own functions for this. Less room for error.

use good old alloca, not touching the heap.

This is not acceptable since the strings have a non-upper-bounded length.

It is also possible to set codepage to UTF8 per application (or per thread) and let Windows do the conversions.

How?

@PavelVozenilek

This comment has been minimized.

PavelVozenilek commented Oct 16, 2017

Zig will use its own functions for this. Less room for error.

The above functions handle (or claim to handle) invalid input sentences and surrogates.

[alloca] is not acceptable since the strings have a non-upper-bounded length.

I have yet to see string longer than a page. Applications dealing with big texts (editors) split them into parts, long blob is a no-no.
Allocations larger than N can be trivially rerouted to the heap.

It is also possible to set codepage to UTF8 per application (or per thread) and let Windows do the conversions.

How?

Oops, not possible (to switch Win32 app to "UTF8 mode"). I was thinking something else.

@andrewrk

This comment has been minimized.

Member

andrewrk commented Oct 16, 2017

The above functions handle (or claim to handle) invalid input sentences and surrogates.

Better to not make syscalls when no syscalls are fundamentally required.

Allocations larger than N can be trivially rerouted to the heap.

Not true since we accept an allocator argument when we want to use the heap.

@pluto439

This comment has been minimized.

pluto439 commented Oct 18, 2017

File paths is probably the only time this really matters. But yeah, to be safe.

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Jan 17, 2018

@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018

@andrewrk andrewrk added the userland label Aug 9, 2018

andrewrk added a commit that referenced this issue Aug 22, 2018

fix linux
 * error.BadFd is not a valid error code. it would always be a bug to
   get this error code.
 * merge error.Io with existing error.InputOutput
 * merge error.PathNotFound with existing error.FileNotFound.
   Not all OS's support both.
 * add os.File.openReadC
 * add error.BadPathName for windows file operations with invalid
   characters
 * add os.toPosixPath to help stack allocate a null terminating byte
 * add some TODOs for other functions to investigate removing the
   allocator requirement
 * optimize some implementations to use the alternate functions when
   a null byte is already available
 * add a missing error.SkipZigTest
 * os.selfExePath uses a non-allocating API
 * os.selfExeDirPath uses a non-allocating API
 * os.path.real uses a non-allocating API
 * add os.path.realAlloc and os.path.realC
 * convert many windows syscalls to use the W versions (See #534)

andrewrk added a commit that referenced this issue Aug 22, 2018

Merge branch 'shawnl-path_max'
This does a proof of concept of changing most file system APIs to not
require an allocator and remove the possibility of failure via
OutOfMemory.

This also does most of the work of #534.

@andrewrk andrewrk modified the milestones: 0.4.0, 0.3.0 Aug 22, 2018

@andrewrk

This comment has been minimized.

Member

andrewrk commented Aug 22, 2018

Bumping up to 0.3.0 since this is mostly solved by the above commit.

@andrewrk

This comment has been minimized.

Member

andrewrk commented Sep 3, 2018

progress:

  • std/os/windows/advapi32.zig:pub extern "advapi32" stdcallcc fn CryptAcquireContextA(
  • std/os/windows/kernel32.zig:pub extern "kernel32" stdcallcc fn FreeEnvironmentStringsA(penv: [*]u8) BOOL;
  • std/os/windows/kernel32.zig:pub extern "kernel32" stdcallcc fn GetCommandLineA() LPSTR;
  • std/os/windows/kernel32.zig:pub extern "kernel32" stdcallcc fn GetEnvironmentStringsA() ?[*]u8;
  • std/os/windows/kernel32.zig:pub extern "kernel32" stdcallcc fn GetEnvironmentVariableA(lpName: LPCSTR, lpBuffer: LPSTR, nSize: DWORD) DWORD;
  • std/os/windows/kernel32.zig:pub extern "kernel32" stdcallcc fn LoadLibraryA(lpLibFileName: LPCSTR) ?HMODULE;
  • std/os/windows/shlwapi.zig:pub extern "shlwapi" stdcallcc fn PathFileExistsA(pszPath: ?LPCTSTR) BOOL;
  • std/os/windows/user32.zig:pub extern "user32" stdcallcc fn MessageBoxA(hWnd: ?HANDLE, lpText: ?LPCTSTR, lpCaption: ?LPCTSTR, uType: UINT) c_int;
  • std/os/windows/util.zig: return windows.LoadLibraryA(padded_buff.ptr) orelse error.DllNotFound;

andrewrk added a commit that referenced this issue Sep 3, 2018

@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Sep 3, 2018

@emekoi

This comment has been minimized.

Contributor

emekoi commented Sep 17, 2018

about the first one, should we really be using it? accord to the docs that api is deprecated.

@andrewrk

This comment has been minimized.

Member

andrewrk commented Sep 17, 2018

See #1318 for the details on CryptAcquireContextA vs RtlGenRandom.

@andrewrk

This comment has been minimized.

Member

andrewrk commented Sep 18, 2018

Oops, according to that issue, actually we should just remove CryptAcquireContextA from the standard library since it's deprecated according to microsoft and the zig std lib does not depend on it.

andrewrk added a commit that referenced this issue Sep 18, 2018

remove deprecated, unused windows functions
 * `CryptAcquireContextA`
 * `CryptReleaseContext`
 * `CryptGenRandom`

See #534 (comment)
@andrewrk

This comment has been minimized.

Member

andrewrk commented Dec 13, 2018

Thanks to @suirad this issue is now down to the only problem being GetCommandLineA

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment