Extract workflow failed when a dataset has unicode character in its name #7421

Closed
vloux opened this issue Feb 27, 2019 · 5 comments

Comments

@vloux vloux commented Feb 27, 2019

Galaxy version 18.09.

To reproduce:

  • Put a special character such as "é" or "è" in the name of a dataset.
  • Extract the workflow from the history menu. The workflow is extracted without error.
  • Try to edit the workflow: a "Loading workflow failed" error occurs.

Removing the special character from the dataset name and re-extracting the workflow solves the problem.

Error in the logs:

File '/project/proteore/galaxy/lib/galaxy/managers/workflows.py', line 1014 in __set_default_label
if str(default_label).lower() not in ['input dataset', 'input dataset collection']:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 16: ordinal not in range(128)

Thanks!
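
For context on the traceback above: on Python 2, calling `str()` on a `unicode` value implicitly encodes it with the ASCII codec, which is why a label containing "é" fails at that line. The sketch below performs that encoding step explicitly and shows one way the comparison could avoid it. The name `is_default_input_label` is hypothetical; this is not Galaxy's code and not necessarily what the eventual fix does.

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch, not Galaxy's implementation.
DEFAULT_LABELS = (u'input dataset', u'input dataset collection')


def is_default_input_label(label):
    """Return True if `label` is one of the generic default labels.

    Lowercases the text value directly instead of passing it through
    str(), so no implicit ASCII encoding takes place.
    """
    return label.lower() in DEFAULT_LABELS


# Explicit form of what Python 2's str() does to a unicode object:
try:
    u'donn\xe9es'.encode('ascii')
    ascii_encode_failed = False
except UnicodeEncodeError:
    ascii_encode_failed = True

print(ascii_encode_failed)                       # True
print(is_default_input_label(u'Input Dataset'))  # True
print(is_default_input_label(u'donn\xe9es'))     # False
```

Comparing text with text sidesteps the codec entirely, so the result no longer depends on the process locale.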

Member

@mvdbeek mvdbeek commented Feb 27, 2019

Thanks @vloux, it looks like your environment doesn't have LC_ALL set (or has it set to ASCII). If that is the case, Python will fall back to ASCII as the default encoding, and that'll break a lot of things. You may want to set the locale to en_US.UTF-8 or any other UTF-8 locale.
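
To see what the Galaxy process would actually inherit, a small diagnostic along these lines can help. It is a sketch that uses only the standard library; `describe_encoding_environment` is a name made up for this example, and it should be run as the same user and from the same service environment as Galaxy for the result to be meaningful.

```python
import locale
import os
import sys


def describe_encoding_environment():
    """Collect the settings that influence Python's default text encoding."""
    return {
        'LC_ALL': os.environ.get('LC_ALL'),
        'LC_CTYPE': os.environ.get('LC_CTYPE'),
        'LANG': os.environ.get('LANG'),
        'preferred_encoding': locale.getpreferredencoding(),
        'stdout_encoding': getattr(sys.stdout, 'encoding', None),
        'filesystem_encoding': sys.getfilesystemencoding(),
    }


if __name__ == '__main__':
    for key, value in sorted(describe_encoding_environment().items()):
        print('{}: {}'.format(key, value))
```

On a UTF-8 setup, `preferred_encoding` should report UTF-8. A value such as `ANSI_X3.4-1968` (glibc's name for ASCII) points to the unset or ASCII locale described above.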

Member

@mvdbeek mvdbeek commented Feb 27, 2019

If you want to check that it works, you can run python -c "print(u'\xe9')" and you should see:

é

You can simulate the error with LC_ALL="ASCII" python -c "print(u'\xe9')":

Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

@mvdbeek mvdbeek closed this Feb 27, 2019
Member

@mvdbeek mvdbeek commented Feb 27, 2019

Well, we can also force UTF-8 when loading the step dict, but you should really investigate the environment.
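
As an illustration of what "forcing UTF-8" on a loaded structure could look like, the sketch below recursively decodes byte strings in a nested dict to text. The helper `to_text` and the sample `step` dict are hypothetical and only stand in for the idea; the actual change in Galaxy may be implemented differently.

```python
# -*- coding: utf-8 -*-
def to_text(value, encoding='utf-8'):
    """Recursively decode byte strings inside `value` to text.

    Hypothetical helper: dicts, lists and tuples are walked, byte
    strings are decoded, and everything else is returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, 'replace')
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding)
                for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(to_text(item, encoding) for item in value)
    return value


# Made-up sample data containing UTF-8 encoded bytes.
step = {b'label': u'donn\xe9es'.encode('utf-8'), 'inputs': [b'caf\xc3\xa9', 3]}
decoded = to_text(step)
print(decoded == {u'label': u'donn\xe9es', u'inputs': [u'caf\xe9', 3]})  # True
```

Decoding once at the boundary means later code only handles text and never triggers an implicit ASCII conversion, whatever the locale is.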

Author

@vloux vloux commented Feb 28, 2019

Hello Marius,
thank you for your quick answer. Indeed, the LC_ALL environment variable wasn't set in our Galaxy server environment. I tried to set it (via the GALAXY_LOCAL_ENV_FILE), but it doesn't seem to resolve the problem.
For information, I can reproduce the same bug on usegalaxy.org.

Member

@mvdbeek mvdbeek commented Feb 28, 2019

This is a system-wide setting that you can't control with GALAXY_LOCAL_ENV_FILE. On Debian systems you'd use something like sudo locale-gen fr_FR fr_FR.UTF-8 followed by update-locale LANG=fr_FR.UTF-8. That said, my production instance has the same issue, but my local dev instance does not. It might be a difference in how the database is set up ... in any case, #7422 fixes this.
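
After generating a locale, one way to confirm that the system can actually activate it is to ask the C library through Python's `locale` module. This is a sketch using standard-library calls only; `locale_is_available` is a name made up for this example, and the locale names tried are just the ones mentioned in this thread.

```python
import locale


def locale_is_available(name):
    """Return True if the C library can activate the locale `name`."""
    previous = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, previous)  # restore what was active


if __name__ == '__main__':
    for candidate in ('C', 'en_US.UTF-8', 'fr_FR.UTF-8'):
        print('{}: {}'.format(candidate, locale_is_available(candidate)))
```

A `False` for a UTF-8 locale suggests it has not been generated on that machine, in which case exporting LC_ALL or LANG with that name has no effect.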

@nsoranzo nsoranzo closed this in 9e0d3b6 Mar 1, 2019