Consider using win-iconv for iconv on Windows #4277

Open
samth opened this issue Jun 2, 2022 · 19 comments
Labels
platform:windows Windows specific topics

Comments

@samth
Sponsor Member

samth commented Jun 2, 2022

Currently, the "base" package depends on the "racket-win32-x86_64" package, which includes libiconv. libiconv is under the LGPL v2.1, and is the only LGPL dependency of minimal Racket (the others are from OpenSSL and libedit on various platforms).

The win-iconv project (see https://github.com/burgerrg/win-iconv) has a Windows-only iconv replacement that is already mentioned by the Chez documentation. It is available under a non-restrictive license. It's also much smaller (about 30 kB vs. 1 MB for libiconv).

@Aeva

Aeva commented Jun 2, 2022

What license is win-iconv under? I don't see it in the git repository.

@samth
Sponsor Member Author

samth commented Jun 2, 2022

The readme says "public domain" [1] (see https://github.com/burgerrg/win-iconv/blob/master/readme.txt#L3)

Footnotes

  1. Yes I know this isn't really something you can do in all relevant legal systems.

@Aeva

Aeva commented Jun 2, 2022

We should try to reach out to the author and see if they're willing to apply CC0 to it, so it can be used internationally. Otherwise, this is exchanging one set of licensing problems for another.

How extensively is libiconv used in Racket right now? Maybe we can just use the relevant win32 APIs directly on Windows builds?

@samth
Sponsor Member Author

samth commented Jun 2, 2022

I'd be fine with contacting the author, although there haven't been any commits since 2016. But I think that while the public domain declaration has various legal issues, it's almost certainly not a problem in practice and would improve the licensing situation.

iconv is used only by bytes-open-converter and related functions, as far as I can tell. There is also an implementation of string character set conversion using iconv in Chez Scheme (see the iconv-codec procedure), but that is not used by Racket. Judging by the amount of code in win-iconv which does what you suggest, I think that would be a lot of work. Certainly adding (an adapted version of) the content of win-iconv to rktio would be a possible approach, though.
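For readers unfamiliar with the API under discussion, here is a minimal sketch of the POSIX iconv calls that a converter such as bytes-open-converter ultimately relies on. The helper name `convert_bytes` and its buffer handling are assumptions made for illustration; this is not the rktio code.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert in_len bytes from encoding `from` to
   encoding `to`. Returns the number of bytes written to `out`, or
   (size_t)-1 if the pair is unsupported or the conversion fails. */
size_t convert_bytes(const char *from, const char *to,
                     const char *in, size_t in_len,
                     char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);

    /* Note the argument order: iconv_open takes the target encoding first. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;          /* this encoding pair is not available */

    char *src = (char *)in;         /* iconv's prototype is not const-correct */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &dst, &dst_left);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_cap - dst_left;
}
```

Only these three entry points (`iconv_open`, `iconv`, `iconv_close`) are needed, which is why a small replacement library is plausible at all.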

@otherjoel
Sponsor Contributor

If it is in the public domain, I would assume any person or organization in a country that recognizes that status could simply fork it and issue it under a new license. But now you're the maintainer.

@LiberalArtist
Contributor

LiberalArtist commented Jun 4, 2022

@samth wrote:

> @Aeva wrote:
>
> > We should try to reach out to the author and see if they're willing to apply CC0 to it, so it can be used internationally. Otherwise, this is exchanging one set of licensing problems for another.
>
> I'd be fine with contacting the author, although there haven't been any commits since 2016. But I think that while the public domain declaration has various legal issues, it's almost certainly not a problem in practice and would improve the licensing situation.

I know @burgerrg is involved in Chez Scheme, but it looks to me from https://github.com/burgerrg/win-iconv/blob/db14b52d58f98b7700863bb38ae69d7d5121f1fa/readme.txt#L20 like win-iconv is forked from an upstream project with several contributors (see https://github.com/win-iconv/win-iconv/graphs/contributors). I think that, to be really robust, we would need all of them to agree to CC0 (unless maybe some have made only trivial commits). (Maybe this is a situation where we should seek guidance from the SFC?)

I agree that the informal public-domain declaration is not very likely to be a problem in practice. However, when not building with --embed-dlls or for a platform (like a game console) that's hostile to relinking, I would find the LGPL more confidence-inspiring than the ad-hoc public-domain statement.

@Aeva

Aeva commented Jun 6, 2022

> I agree that the informal public-domain declaration is not very likely to be a problem in practice. However, when not building with --embed-dlls or for a platform (like a game console) that's hostile to relinking, I would find the LGPL more confidence-inspiring than the ad-hoc public-domain statement.

Neither the LGPL nor a DIY public-domain declaration "spark joy" here. Poking someone at the SFC might be a good idea, but I suspect the expedient thing may be to #if defined(WIN32) out all of the iconv calls and replace them w/ equivalent win32 syscalls? Would this still be a problem on other platforms? I'm unfamiliar with the finer points of pre-modern C89, building Racket, or the relevant win32 APIs, but I could give it a go.
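To make the `#if defined(WIN32)` idea concrete, here is a rough sketch for a single encoding pair (Latin-1 to UTF-8). The helper name `latin1_to_utf8` and the fixed 256-unit pivot buffer are assumptions for illustration. The sketch tests `_WIN32`, which compilers targeting Windows predefine, rather than `WIN32`. Win32 has no direct byte-to-byte conversion, so the Windows branch pivots through UTF-16.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h>
#else
#include <iconv.h>
#endif

/* Hypothetical helper: convert Latin-1 bytes to UTF-8.
   Returns the number of bytes written, or -1 on failure. */
int latin1_to_utf8(const char *in, int in_len, char *out, int out_cap)
{
    assert(in != NULL && out != NULL && in_len > 0);
#if defined(_WIN32)
    /* 28591 is the Windows code page identifier for ISO-8859-1. */
    wchar_t wide[256];
    if (in_len > 256)
        return -1;                  /* sketch only: fixed-size pivot buffer */
    int wlen = MultiByteToWideChar(28591, 0, in, in_len, wide, 256);
    if (wlen == 0)
        return -1;
    /* For CP_UTF8 the last two arguments must be NULL. */
    int n = WideCharToMultiByte(CP_UTF8, 0, wide, wlen, out, out_cap, NULL, NULL);
    return n == 0 ? -1 : n;
#else
    /* Other platforms keep using the system iconv. */
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return -1;
    char *src = (char *)in;
    char *dst = out;
    size_t src_left = (size_t)in_len;
    size_t dst_left = (size_t)out_cap;
    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    return rc == (size_t)-1 ? -1 : (int)((size_t)out_cap - dst_left);
#endif
}
```

A general replacement would also need a table mapping iconv encoding names to Windows code page numbers, plus handling for input that ends mid-character, which is a large part of what win-iconv provides.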

@Aeva

Aeva commented Jun 6, 2022

> If it is in the public domain, I would assume any person or organization in a country that recognizes that status could simply fork it and issue it under a new license. But now you're the maintainer.

I'm pretty sure it would not be public domain in some jurisdictions, and therefore forking it to apply a fallback license for those jurisdictions would violate the authors' copyright in those jurisdictions, which would effectively solve no problems, but create several new ones.

@Aeva

Aeva commented Jun 6, 2022

> iconv is used only by bytes-open-converter and related functions, as far as I can tell. There is also an implementation of string character set conversion using iconv in Chez Scheme (see the iconv-codec procedure), but that is not used by Racket. Judging by the amount of code in win-iconv which does what you suggest, I think that would be a lot of work. Certainly adding (an adapted version of) the content of win-iconv to rktio would be a possible approach, though.

Would it be viable to remove the unused portion from Chez? As for the other parts, maybe there's another library we can use to provide similar functionality as the subset of iconv that is actually used? Is there a quick way to enumerate the calls we're actually using?

@sorawee
Collaborator

sorawee commented Jun 6, 2022

iconv was added by @mflatt to rktio in 6a543f4. It looks like there are various flags to disable it.

I don't know much about linking, but: since Racket doesn't use the Chez Scheme portion that uses iconv, can't we create a stub libiconv that has a blank implementation? That way, we don't need to change Chez Scheme itself.
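A stub along those lines could be as small as the following sketch. The `stub_` prefix is an assumption made so the example does not clash with a system iconv. A real stub DLL would have to export whatever symbol names the loader actually looks up. Every entry point simply fails with the error code POSIX specifies for that function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for iconv_t. */
typedef void *stub_iconv_t;

/* Always report that the requested conversion is unsupported. */
stub_iconv_t stub_iconv_open(const char *tocode, const char *fromcode)
{
    assert(tocode != NULL && fromcode != NULL);
    errno = EINVAL;                 /* POSIX: conversion not supported */
    return (stub_iconv_t)-1;
}

/* No descriptor can ever be valid, so any conversion attempt fails. */
size_t stub_iconv(stub_iconv_t cd, char **inbuf, size_t *inleft,
                  char **outbuf, size_t *outleft)
{
    (void)cd; (void)inbuf; (void)inleft; (void)outbuf; (void)outleft;
    errno = EBADF;                  /* POSIX: cd is not a valid descriptor */
    return (size_t)-1;
}

/* Closing likewise fails, since nothing was ever opened. */
int stub_iconv_close(stub_iconv_t cd)
{
    (void)cd;
    errno = EBADF;
    return -1;
}
```

Because `stub_iconv_open` never succeeds, a well-behaved caller should never reach the other two functions. They are present only so the symbols resolve.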

@burgerrg
Contributor

burgerrg commented Jun 6, 2022

Chez Scheme dynamically links iconv when needed, so not having the iconv library should be fine if you don't use iconv-codec.

@samth
Sponsor Member Author

samth commented Jun 6, 2022

It's possible to build Racket with --disable-iconv, which omits the rktio code. Chez Scheme also has a --disable-iconv flag for its configure script, although I haven't checked whether the Racket-level flag also sets the Chez-level one.

Given that, the remaining challenge is the dynamic dependency on the iconv dll. It seems like it should be possible to disable that dependency if --disable-iconv is provided, but it's not obvious what the best way to do that is.

@LiberalArtist
Contributor

From racket/libs@8ba7a87, it sounds like the iconv dll is also used by e.g. libintl-9.dll in the support packages for racket/draw. That seems like it would require a more complete and compatible replacement than might suffice for bytes-open-converter.

@sorawee
Collaborator

sorawee commented Jun 7, 2022

My understanding is that it's totally fine to have the iconv dependency as long as it's not a dependency of base. libintl is already LGPL, so having libiconv be LGPL too is not an issue.

@Aeva

Aeva commented Jun 7, 2022

I agree with @sorawee. An optional LGPL dependency to an optional LGPL package is not the end of the world.

@LiberalArtist
Contributor

How bad would it be to just take libiconv-2.dll out of e.g. racket-win32-x86_64-3 and put it in a new (set of) package(s) libiconv-win32-x86_64?

In some sense it would be a non-compatible change for users of "minimal Racket" (with only base, racket-lib, and their dependencies installed), but only for uses of bytes-open-converter with one of the non-guaranteed encoding combinations—and we already, well, don't guarantee them. Specifically, we currently say:

> The set of available encodings and combinations varies by platform, depending on the iconv library that is installed; the from-name and to-name arguments are passed on to iconv_open. On Windows, "iconv.dll" or "libiconv.dll" must be in the same directory as "libmzschVERS.dll" (where VERS is a version number), in the user’s path, in the system directory, or in the current executable’s directory at run time, and the DLL must either supply _errno or link to "msvcrt.dll" for _errno; otherwise, only the guaranteed combinations are available.
>
> [Margin note:] In the Racket software distributions for Windows, a suitable "iconv.dll" is included with "libmzschVERS.dll".

All we would have to change is "the Racket software distributions" in the margin note to something like "the main Racket distribution".

We could provide a meta-package libiconv to conveniently depend on the Racket-packaged DLL for the applicable platform.

@LiberalArtist
Contributor

Re #4276 (comment) (keeping iconv-specific discussion in one place seemed best):

> I think by "mandatory" you mean "required by the base package", which just includes libiconv on Windows (iconv is required on all platforms but is available in the OS on both macOS and Linux, and the other packages required there are distributed under more permissive licenses).
>
> Here are a few possible things we can do wrt iconv, roughly in order of likelihood.
>
> 1. Provide an easy way to install `win-iconv` instead of the regular LGPL-licensed iconv.
>
> 2. Provide a way to configure without iconv (so that the relevant functions just error instead of using `dlopen()`).
>
> 3. Switch to `win-iconv` by default.
>
> 4. Switch to some other encoding-conversion library (such as ICU, which has a permissive license, but is not available on most platforms by default).
>
> If there's some other iconv approach (or encoding transformation approach) that would work here, it would be good to know about that.

I noticed 8a08704 says that Android doesn't provide iconv, either.

It seems like it might be generally useful to implement the POSIX iconv API on top of ICU. I haven't found any project that's done so, but apparently ICU distributes a program uconv with the same interface as the command-line tool iconv, which makes me hopeful that it should be reasonably possible. One design decision would be whether to expose ICU's encoding names or to translate from iconv-style names.
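If the choice were to translate iconv-style names, the lookup could be sketched roughly as below. The function names are hypothetical, and the table entries are illustrative assumptions that have not been checked against either library's alias lists. A real table would need to be verified name by name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One mapping from an iconv-style name to an ICU-style name. */
struct name_map {
    const char *iconv_name;
    const char *icu_name;
};

/* Illustrative entries only (assumptions, not a verified alias list). */
static const struct name_map NAME_MAP[] = {
    { "LATIN1", "ISO-8859-1"   },
    { "CP1252", "windows-1252" },
    { "SJIS",   "Shift_JIS"    },
};

/* Case-insensitive string equality for ASCII encoding names. */
static int equal_ignoring_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;                /* equal only if both strings ended */
}

/* Hypothetical translation step: map a known iconv-style name to its
   ICU-style counterpart, and pass unknown names through unchanged. */
const char *translate_encoding_name(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof NAME_MAP / sizeof NAME_MAP[0]; i++) {
        if (equal_ignoring_case(name, NAME_MAP[i].iconv_name))
            return NAME_MAP[i].icu_name;
    }
    return name;
}
```

Passing unknown names through unchanged means the caller still gets a normal "unsupported encoding" failure from the underlying library rather than a separate error path.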

@LiberalArtist
Contributor

As another option, the Apache Portable Runtime project provides an iconv implementation with a permissive (but not dubious DIY) license (see https://apr.apache.org). I haven't tried building or using it, and I don't know how its features compare to GNU libiconv or others. (But my general view is that the non-Unicode functionality for which bytes-open-converter uses iconv is fairly niche these days.)

@LiberalArtist
Contributor

I've looked into this a bit more. Apparently Windows has incorporated the ICU C API as a public system library since Windows 10 Creators Update (1703), released in 2017. IIUC, that means it is available in all versions of Windows currently supported by Microsoft (except for enterprise "extended support" for Windows 8.1 which ends on January 10, 2023).

The ICU function ucnv_convertEx looks especially promising for implementing the POSIX iconv API, and certainly for bytes-open-converter.

I think we could reasonably either distribute an iconv DLL implemented using ICU or support ICU as a backend within rktio without having to start distributing ICU ourselves (which seems relatively large and complex).

I've looked into it a little, but I thought I'd post here in case anyone else has the time and inclination to explore it before I do.
