I18N isn't working (mostly) via changing the environment variables #783

@brlin-tw

I noticed that tidy currently (5.6.0-64-gc6f1946) doesn't print localized messages even when I change the environment variables as described in tidy-html5/localize at next · htacg/tidy-html5:

export LANG=fr_FR
export LC_ALL=fr_FR

Digging into the code, I found the setlocale call in tidy-html5/tidylib.c at 86b52dc1081ca4b0582c7bad279bf254bad268e1 · htacg/tidy-html5:

    /* Set the locale for tidy's output. This both configures
    ** LibTidy to use the environment's locale as well as the
    ** standard library.
    */
#if SUPPORT_LOCALIZATIONS
    if ( TY_(tidyGetLanguageSetByUser)() == no )
    {
        TY_(tidySetLanguage)( setlocale( LC_ALL, "") );
    }
#endif

I wrote a small program to test it:

#include <stdlib.h>
#include <stdio.h>
#include <locale.h>

int main(void){
	/* Adopt the locale described by the environment variables. */
	char* result = setlocale(LC_ALL, "");

	if(result != NULL){
		/* Quote the value so it is easy to tell apart from NULL. */
		printf("setlocale(LC_ALL, \"\") returns \"%s\"\n", result);
	}else{
		printf("setlocale(LC_ALL, \"\") returns NULL.\n");
	}
	return EXIT_SUCCESS;
}

The following are my locale settings:

LANG=zh_TW.UTF-8
LANGUAGE=zh_TW:zh_HK:zh_CN:en_US:en
LC_CTYPE="zh_TW.UTF-8"
LC_NUMERIC=zh_TW.UTF-8
LC_TIME=zh_TW.UTF-8
LC_COLLATE="zh_TW.UTF-8"
LC_MONETARY=zh_TW.UTF-8
LC_MESSAGES="zh_TW.UTF-8"
LC_PAPER=zh_TW.UTF-8
LC_NAME=zh_TW.UTF-8
LC_ADDRESS=zh_TW.UTF-8
LC_TELEPHONE=zh_TW.UTF-8
LC_MEASUREMENT=zh_TW.UTF-8
LC_IDENTIFICATION=zh_TW.UTF-8
LC_ALL=

and these are the locale definitions compiled on my system:

$ cat /var/lib/locales/supported.d/*
en_US.UTF-8 UTF-8
en_GB.UTF-8 UTF-8
fr_FR.UTF-8 UTF-8

zh_CN.UTF-8 UTF-8
zh_TW.UTF-8 UTF-8
zh_HK.UTF-8 UTF-8

Here are the test results:

$ ./test_setlocale 
setlocale(LC_ALL, "") returns "zh_TW.UTF-8"

$ env LANG=zh_TW.UTF-8 ./test_setlocale 
setlocale(LC_ALL, "") returns "zh_TW.UTF-8"

$ env LANG=fr_FR ./test_setlocale 
setlocale(LC_ALL, "") returns NULL.

$ env LANG=fr_FR.UTF-8 ./test_setlocale 
setlocale(LC_ALL, "") returns "LC_CTYPE=fr_FR.UTF-8;LC_NUMERIC=zh_TW.UTF-8;LC_TIME=zh_TW.UTF-8;LC_COLLATE=fr_FR.UTF-8;LC_MONETARY=zh_TW.UTF-8;LC_MESSAGES=fr_FR.UTF-8;LC_PAPER=zh_TW.UTF-8;LC_NAME=zh_TW.UTF-8;LC_ADDRESS=zh_TW.UTF-8;LC_TELEPHONE=zh_TW.UTF-8;LC_MEASUREMENT=zh_TW.UTF-8;LC_IDENTIFICATION=zh_TW.UTF-8"

$ env LC_ALL=zh_TW.UTF-8 ./test_setlocale 
setlocale(LC_ALL, "") returns "zh_TW.UTF-8"

$ env LC_ALL=zh_TW ./test_setlocale 
setlocale(LC_ALL, "") returns NULL.

$ env LC_ALL=fr_FR.UTF-8 ./test_setlocale 
setlocale(LC_ALL, "") returns "fr_FR.UTF-8"

$ env LC_ALL=fr_FR ./test_setlocale 
setlocale(LC_ALL, "") returns NULL.
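
In the LANG=fr_FR.UTF-8 run above, the return value is not a single locale name but a list of NAME=value pairs separated by semicolons, one per category. The following is a minimal sketch of how a caller could pull one category's value out of such a string. It assumes the composite format shown in that output, which is specific to this C library and not guaranteed by the C standard; `locale_category_value` is a hypothetical helper and not part of LibTidy.

```c
#include <stddef.h>
#include <string.h>

/* Copy the value for `category` (e.g. "LC_MESSAGES") out of a setlocale()
 * result into `out`. Accepts the plain form ("fr_FR.UTF-8") and the
 * composite form ("LC_CTYPE=...;LC_MESSAGES=...") seen in the test output.
 * Returns 0 on success, -1 if the category is missing or `out` is too small.
 * Hypothetical helper for illustration only. */
int locale_category_value(const char *result, const char *category,
                          char *out, size_t outlen)
{
    if (result == NULL || category == NULL || out == NULL || outlen == 0)
        return -1;

    const char *value = result;   /* plain form: the whole string */
    size_t len = strlen(result);

    if (strchr(result, '=') != NULL) {
        /* Composite form: scan the ';'-separated NAME=value pairs. */
        size_t catlen = strlen(category);
        const char *p = result;
        value = NULL;
        while (*p != '\0') {
            const char *end = strchr(p, ';');
            size_t pairlen = end ? (size_t)(end - p) : strlen(p);
            if (pairlen > catlen && strncmp(p, category, catlen) == 0
                && p[catlen] == '=') {
                value = p + catlen + 1;
                len = pairlen - catlen - 1;
                break;
            }
            p += pairlen;
            if (*p == ';')
                p++;
        }
        if (value == NULL)
            return -1;
    }

    if (len >= outlen)
        return -1;
    memcpy(out, value, len);
    out[len] = '\0';
    return 0;
}
```

Called with the composite string from the LANG=fr_FR.UTF-8 run and the category "LC_MESSAGES", this would yield fr_FR.UTF-8; called with a plain name such as zh_TW.UTF-8, it returns that name unchanged.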

Speculations

  • I believe the composite LC_CTYPE=... return value causes I18N to fail, as no locale named LC_CT exists
  • I'm not sure why LANG=zh_TW.UTF-8 and LANG=fr_FR.UTF-8 produce different results
  • The only configuration I found that makes Tidy's I18N work is env LC_ALL=fr_FR.UTF-8 tidy -help
  • I noticed that env LC_ALL=fr_FR.UTF-8 tidy -help only works when the locale definition for fr_FR.UTF-8 is installed
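
One possible direction, sketched here only as an illustration and not as a proposed patch: after calling setlocale(LC_ALL, ""), the message locale can be queried on its own with setlocale(LC_MESSAGES, NULL), which returns a single locale name instead of a composite string. This assumes a POSIX system, since LC_MESSAGES is defined by POSIX and not by ISO C; `current_messages_locale` is a hypothetical name and does not exist in LibTidy.

```c
#include <locale.h>
#include <stddef.h>

/* Adopt the environment's locale, then report only the message locale.
 * Hypothetical helper for illustration; assumes LC_MESSAGES is available
 * (POSIX systems). */
const char *current_messages_locale(void)
{
    /* This call may return NULL, e.g. when the requested locale is not
     * installed; the program's locale is then left unchanged. */
    (void)setlocale(LC_ALL, "");

    /* A NULL locale argument queries the category without modifying it. */
    return setlocale(LC_MESSAGES, NULL);
}
```

If the environment's locale cannot be applied, the query simply reports whatever locale is still in effect (the "C" locale at program start), so the caller gets a usable name in both cases.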
