Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MultiQC: Try to get py3 working by exporting a locale. #5541

Merged
merged 5 commits into from Aug 23, 2017

Conversation

Projects
None yet
4 participants
@ewels
Copy link
Member

ewels commented Aug 16, 2017

  • I have read the guidelines for bioconda recipes.
  • This PR adds a new recipe.
  • This PR updates an existing recipe.
  • This PR does something else (explain below).

Trying to get the Python 3 build to work for MultiQC by explicitly setting the locale before running test commands.

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 21, 2017

Hmm, an annoying couple of errors from the two linux builds..

  1. First test with export LC_ALL=C.UTF-8 LANG=C.UTF-8

    RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. Consult http://click.pocoo.org/python3/for mitigation steps.

    This system lists a couple of UTF-8 supporting locales that you can pick from. The following suitable locales where discovered: en_US.utf8

    Click discovered that you exported a UTF-8 locale but the locale system could not pick up from it because it does not exist. The exported locale is "C.UTF-8" but it is not supported

  2. Second test with export LC_ALL=en_US.utf8 LANG=en_US.utf8

    RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. Consult http://click.pocoo.org/python3/for mitigation steps.

    This system supports the C.UTF-8 locale which is recommended.
    You might be able to resolve your issue by exporting the following environment variables:

    export LC_ALL=C.UTF-8
    export LANG=C.UTF-8

    Click discovered that you exported a UTF-8 locale but the locale system could not pick up from it because it does not exist. The exported locale is "en_US.utf8" but it is not supported

So, in summary:

  1. C.UTF-8 doesn't work, but en_US.utf8 should
  2. en_US.utf8 doesn't work, but C.UTF-8 should

The fact that click auto-discovers and suggests a locale and then claims that it doesn't exist when used is quite annoying 😠

I'm at a loss now. Any ideas from anyone else? Is there anything we can do to the extended-base docker image? (eg. as in this post?)

Phil

@daler

This comment has been minimized.

Copy link
Member

daler commented Aug 21, 2017

@ewels the commands might be in separate shell calls, so you might to put everything in one line:

test:
  commands:
    - LC_ALL=en_US.utf8 LANG=en_US.utf8 multiqc --version # [linux]
    - LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 multiqc --version # [osx]

(edited to remove export)

Alternatively you can try using the new(ish) extended-base: in meta.yaml. See https://github.com/bioconda/bioconda-recipes/blob/master/recipes/nanoplotter/meta.yaml#L44 for example that uses this. This will use a larger base image that has locale installed.

@daler

This comment has been minimized.

Copy link
Member

daler commented Aug 21, 2017

@ewels whoops you're using extended-base already, sorry about that. I was lazily looking for extra: at the bottom of meta.yaml.

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 21, 2017

Ok thanks, I'll try! I did wonder to myself where the best place for the extra: was - I'll move it to the end of the file 😉

ewels added some commits Aug 21, 2017

@ewels ewels force-pushed the ewels:multiqc branch from d51ba47 to ce56f20 Aug 22, 2017

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 22, 2017

Amazing, it finally works! Thanks for the suggestion @daler!

Are there any repercussions of this for downstream users? Will the python 3 installations / docker containers work as expected or do people need to do any of their own locale trickery?

@daler

This comment has been minimized.

Copy link
Member

daler commented Aug 22, 2017

@ewels great! Conda installations should typically be fine; docker containers may need some help.

One way around this could be to add a wrapper script like locale_multiqc or multiqc_docker or something to the recipe that users can call directly when they have issues. That wrapper can call multiqc prefixed by the env vars like in this test.

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 22, 2017

Yes, that's the kind of thing I was wondering about - what's the best way to handle this? Should I add a wrapper script to bioconda somewhere? Or add a new alias to the build somehow?

Is there any scope for setting these locale env vars in the extended-base container itself directly so that we don't have to work around their absence or is that going to cause problems?

@daler

This comment has been minimized.

Copy link
Member

daler commented Aug 22, 2017

As for the wrapper, it's up to you how you want to document and support it. You could store a wrapper in the recipe and in build.sh you would use cp $RECIPE_DIR/multiqc-wrapper $PREFIX/bin && chmod +x $PREFIX/bin/multiqc-wrapper to get the wrapper into the built package. Or there might be something fancy you could do with entrypoints in the actual multiqc python install process.

I haven't played with it yet, but it may be possible to set env vars (1, 2).

@bgruening @johanneskoester what are your thoughts on setting LC_ALL and LANG in the extended base?

@epruesse

This comment has been minimized.

Copy link
Member

epruesse commented Aug 22, 2017

Most of bioinformatics assumes english, but a lot of the underlying infrastructure such as Python is designed to support locales. Since the software in Bioconda is very unlikely to have been tested against varying locale settings, I'd consider the results under any but en_US.UTF8 as "unspecified" (=wrong as soon as it matters).

If it was possible, I'd strongly suggest "pinning" the locale to en_US.UTF-8 (or GB). Everything else just risks faulty results which is very non-Bioconda. It's not really possible, but we could have a package bioconda-locale to pin the locale in environments and in the test docker.

Here's a fun scenario:
A cluster in Germany was installed by a sysadmin who doesn't speak english well. He chose to set the system language to German. A scientist working on this cluster has his personal locale set to English, as he doesn't understand German. Curiously, when he submits jobs to the cluster, the results are different from what he gets on the head node. After a lot of investigation, the cause is identified as one of the tools called in the job actually adhering to LC_NUMERIC settings. When jobs are submitted, the scientists shell environment is wiped and replaced with default, which on the German cluster is en_DE.UTF-8. That indicates that the decimal separator should be ,, not .....

The "C.UTF-8" that seems to be the only one working in the docker is actually a "workaround" to have UTF-8 encoded output without setting the language. It a fairly recent idea (as those things go), IIRC.

@bgruening

This comment has been minimized.

Copy link
Member

bgruening commented Aug 22, 2017

@daler LC_ALL and LANG should be set in the extended env vars, actually I thought this is already the case :(
I can look at this @daler if you like.

Thanks @ewels for keep working on this!

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 23, 2017

@epruesse - scary story! I've had a few similar issues with MultiQC not being able to parse output with German locale settings (commas) - it's impossible to detect retrospectively in some places and leads to horrible hacky solutions like this. Good to hear that I'm not the only one who's suffered from this issue! 😉

@bgruening - I did add a test commit to print all of the locale settings for debugging purposes, but when the above tests passed I cancelled the jobs and removed the commits. But seeing as click keeps saying that nothing is set I'm fairly confident that they're not set (or at least that something strange is going on). Would be nice to avoid having to write wrapper scripts etc!

Someone give me poke if we think it's necessary to go ahead with writing wrapper scripts. For now is it sensible to merge this PR to make MultiQC available for Python3 users and we can continue the discussion in a new issue?

Phil

@bgruening

This comment has been minimized.

Copy link
Member

bgruening commented Aug 23, 2017

Merging sounds good. Unfortunately, I'm fighting at the moment with travis, so I will take care about this as soon as travis is working for us again. Thanks again!

@ewels ewels merged commit c5d539a into bioconda:master Aug 23, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@ewels ewels deleted the ewels:multiqc branch Aug 23, 2017

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 24, 2017

Bah, the Travis linux build on the merge commit failed! Any ideas?

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 29, 2017

@daler / @bgruening - do you know if the above merge commit failed because of generic travis problems? The py3 MultiQC v1.3 build doesn't appear to be available on linux when I do a conda search multiqc..

Cheers,

Phil

@daler

This comment has been minimized.

Copy link
Member

daler commented Aug 29, 2017

I restarted it just to see if that easy fix would work, @bgruening did you end up doing anything with the extended container to set LC_ALL and LANG?

@ewels

This comment has been minimized.

Copy link
Member Author

ewels commented Aug 29, 2017

Ah, sad times - looks like the restarted job failed immediately due to problems with the git diff command.. I noticed that the merge commit was missing from the repo commit history, not sure if it was lost somewhere in some kind of rebasing or something?

@daler daler referenced this pull request Aug 29, 2017

Closed

Multiqc bump #5888

3 of 5 tasks complete
@daler

This comment has been minimized.

Copy link
Member

daler commented Aug 29, 2017

Let's see what happens with #5888

@vladsaveliev vladsaveliev referenced this pull request Sep 22, 2017

Merged

Downgrade networkx dependency #6094

3 of 5 tasks complete

@daler daler referenced this pull request Sep 26, 2017

Merged

locale-gen en_US.UTF-8 #3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.