[ci] fix locale-setting in jobs running in ubuntu container #5643

Merged
merged 4 commits into from
Dec 28, 2022

Conversation

jameslamb
Collaborator

@jameslamb jameslamb commented Dec 21, 2022

Today, the Linux CI jobs on Azure DevOps run in a container using the ubuntu:22.04 image. Since that image is just a minimal operating system distribution, running LightGBM's CI in it requires a lot of extra configuration.

Here's where much of that happens:

if [[ $IN_UBUNTU_LATEST_CONTAINER == "true" ]]; then

I noticed while working on #5638 that one piece of that has been silently failing for a while (unsure how long).

There are a few problems with this bit of configuration that tries to change the locale inside the container to en_US.UTF-8:

LightGBM/.ci/setup.sh

Lines 58 to 61 in a174893

export LANG="en_US.UTF-8"
export LC_ALL="${LANG}"
sudo locale-gen ${LANG}
sudo update-locale

Setting the LC_ALL environment variable to a locale that doesn't exist yet (i.e. hasn't been downloaded or generated with locale-gen) doesn't work: the variable gets assigned, but the shell cannot switch to that locale. That's why this warning shows up in the Linux_latest logs:

/__w/1/s/.ci/setup.sh: line 59: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

(recent build on master)
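One way to avoid that warning is to confirm the locale is actually available before exporting LC_ALL. The following is only a minimal sketch, not code from this PR: `locale_is_available` is a hypothetical helper name, and it assumes the `locale` command is installed (as it is on glibc-based images such as ubuntu).

```shell
#!/bin/bash

# Hypothetical helper (not part of LightGBM's scripts): succeeds only if
# the requested locale appears in the output of `locale -a`.
locale_is_available() {
    local wanted
    # `locale -a` usually prints names like "en_US.utf8", so compare
    # case-insensitively and ignore dashes ("en_US.UTF-8" -> "en_us.utf8").
    wanted="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')"
    locale -a 2>/dev/null \
        | tr '[:upper:]' '[:lower:]' \
        | tr -d '-' \
        | grep -Fqx "${wanted}"
}

# Only export LC_ALL when the locale exists; otherwise say so explicitly.
if locale_is_available "en_US.UTF-8"; then
    export LC_ALL="en_US.UTF-8"
else
    echo "en_US.UTF-8 is not available yet; generate or install it first" >&2
fi
```

Run before locale-gen in a bare ubuntu container, a check like this would take the `else` branch instead of triggering the setlocale warning.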

Setting environment variables inside a process only affects that process and its sub-processes, so those variables won't be used by test.sh, since it runs in a separate process and not as a sub-process of setup.sh:

LightGBM/.vsts-ci.yml

Lines 142 to 145 in a174893

- bash: $(Build.SourcesDirectory)/.ci/setup.sh
  displayName: Setup
- bash: $(Build.SourcesDirectory)/.ci/test.sh
  displayName: Test
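The process-scoping behavior described above can be reproduced locally without Azure DevOps. This is an illustrative sketch only: the two `bash -c` invocations stand in for the separate Setup and Test steps, and `DEMO_LC_ALL` is a made-up variable name used so the demo doesn't touch the real locale settings.

```shell
#!/bin/bash

# Make sure the demo variable isn't already set in this shell.
unset DEMO_LC_ALL

# Stand-in for setup.sh: exports a variable, then the process exits.
bash -c 'export DEMO_LC_ALL="en_US.UTF-8"'

# Stand-in for test.sh: a separate process started afterwards.
# It is a sibling of the first process, not its child, so it sees nothing.
sibling_value="$(bash -c 'echo "${DEMO_LC_ALL:-unset}"')"
echo "sibling sees: ${sibling_value}"   # prints "sibling sees: unset"

# By contrast, a variable placed in a process's own environment
# is visible inside that process (and would be inherited by its children).
inherited_value="$(DEMO_LC_ALL="en_US.UTF-8" bash -c 'echo "${DEMO_LC_ALL:-unset}"')"
echo "child sees: ${inherited_value}"   # prints "child sees: en_US.UTF-8"
```

This is why the variable has to be set again in whichever script actually needs it, or be provided by the pipeline itself.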

I noticed this because once I started trying to get .ci/test_r_package.sh to run in an ubuntu:22.04 container, the R unit test about handling of non-ASCII feature names was failing, and R was reporting that it had fallen back to the default character set for the image (ASCII, the "POSIX" locale).

This PR proposes the following changes:

  • just install locales-all instead of generating a specific locale with locale-gen
  • set environment variable LC_ALL again in test.sh
  • switch environment variable IN_UBUNTU_LATEST_CONTAINER to IN_UBUNTU_BASE_CONTAINER
    • In [ci] run r-package Linux jobs in containers #5638, I'd like to run the R 3.6 (older version of R) Linux jobs in an ubuntu:18.04 container instead of an ubuntu:22.04 one, while re-using all of the same setup.
    • So everything these scripts guard with that environment variable as being relevant in an ubuntu "latest" container is actually useful whenever the scripts run in any ubuntu:{something} container
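Roughly, the changes listed above could look like the sketch below. This is not the actual diff from this PR: the function names `install_all_locales` and `configure_test_locale` are hypothetical wrappers used to keep the example self-contained, and the `apt-get` flags are an assumption about how the package would be installed.

```shell
#!/bin/bash

# Sketch of the setup.sh side (assumed shape): install every locale up front
# rather than generating a single one with locale-gen.
install_all_locales() {
    if [[ "${IN_UBUNTU_BASE_CONTAINER:-}" == "true" ]]; then
        sudo apt-get update
        sudo apt-get install --no-install-recommends -y locales-all
    fi
}

# Sketch of the test.sh side (assumed shape): set the locale variables again,
# because test.sh runs in its own process and inherits nothing from setup.sh.
configure_test_locale() {
    if [[ "${IN_UBUNTU_BASE_CONTAINER:-}" == "true" ]]; then
        export LANG="en_US.UTF-8"
        export LC_ALL="en_US.UTF-8"
    fi
}
```

Guarding both pieces with the same IN_UBUNTU_BASE_CONTAINER variable keeps them usable for any ubuntu:{something} image, which is the point of the rename.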

@jameslamb jameslamb changed the title WIP: [ci] fix locale-setting in jobs running in ubuntu container [ci] fix locale-setting in jobs running in ubuntu container Dec 21, 2022
@jameslamb jameslamb marked this pull request as ready for review December 21, 2022 23:51
@jameslamb jameslamb merged commit 59c7313 into master Dec 28, 2022
@jameslamb jameslamb deleted the ci/fix-locale branch December 28, 2022 07:19
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 16, 2023