UTF-8 locale #483

Closed
tuk05578 opened this issue May 27, 2022 · 14 comments
Labels
colab AlphaFold colab issue

Comments

@tuk05578

tuk05578 commented May 27, 2022

Hello,

I am running AlphaFold on a flavoenzyme, photolyase. I had a few successful runs, but it suddenly stopped working and now gives me this error whenever I try to run it:

NotImplementedError                       Traceback (most recent call last)
<ipython-input-1-bc0091fa34e2> in <module>()
    577 sequence = 'AKIGLFYGTQTGVTQTIAESIQQEFGGESIVDLNDIANADASDLNAYDYLIIGCPTWNVGELQSDWEGIYDDLDSVNFQGKKVAYFGAGDQVGYSDNFQDAMGILEEKISSLGSQTVGYWPIEGYDFNESKAVRNNQFVGLAIDEDNQPDLTKNRIKTWVSQLKSEFGL'  #@param {type:"string"}
    578 
--> 579 run_prediction(sequence)

3 frames
/usr/local/lib/python3.7/dist-packages/google/colab/_system_commands.py in _run_command(cmd, clear_streamed_output)
    166   if locale_encoding != _ENCODING:
    167     raise NotImplementedError(
--> 168         'A UTF-8 locale is required. Got {}'.format(locale_encoding))
    169 
    170   parent_pty, child_pty = pty.openpty()

NotImplementedError: A UTF-8 locale is required. Got ANSI_X3.4-1968

If anyone can provide context or explain how to fix this (I am NOT very knowledgeable about coding or the nitty-gritty details of AlphaFold), that would be much appreciated :)
Thank you for your time in advance guys!!

Best wishes,
Jared

@li-yq

li-yq commented May 29, 2022

I also met the same problem and a temporary workaround is to manually download result files via the files panel on the left.

@tuk05578
Author

> I also met the same problem and a temporary workaround is to manually download result files via the files panel on the left.

I don't see a results file, and when I try to download the prediction that it reports as the best one, nothing actually downloads to my computer.

@kamdiehl

kamdiehl commented Jun 9, 2022

Same here, it won't save to mine either. Have you found a solution yet?

@sokrypton

This error seems to occur only when conda is installed. It looks like conda is interfering with Google Colab's %shell magic.

@Hans-Yolo

Just checking in as well; I'm having the same issue. I do have conda installed, and I don't want to uninstall it just for this. I can't download, and I also can't seem to mount my Google Drive as a workaround. Has anyone figured out how to grab their PDB?

@tuk05578
Author

Hi all, sorry I haven't been replying. Life has gotten VERY busy at our lab now that it's summer time.

The workaround I found was to connect to a hosted runtime (top right corner) after a run has completed and given the error. Once it connects, I click "Runtime" in the top menu and choose "Restart and run all". It should finish relatively quickly and bring up the prediction in ChimeraX.

Let me know if it works for any of you!

@Hans-Yolo

Thanks for your response, I really appreciate it! Yes this worked, but only the second time. I ran again and got the same error, then I tried this method. When I ran again, it just started over and took another hour and threw the same error. Then I tried the same thing one more time, and then it ran for 1 minute and dropped the prediction into ChimeraX.

Thanks for the fix!

@Belfield

Belfield commented Jul 1, 2022

> Hi all, sorry I haven't been replying. Life has gotten VERY busy at our lab now that it's summer time.
>
> The workaround I found was to connect to a hosted runtime (top right corner) after a run has completed and given the error. Once it connects, I click "Runtime" in the top menu and choose "Restart and run all". It should finish relatively quickly and bring up the prediction in ChimeraX.
>
> Let me know if it works for any of you!

Thanks for the suggestion but it's still not running for me. I get this:

image

@Augustin-Zidek
Collaborator

Augustin-Zidek commented Aug 3, 2022

Could you try adding this code just before the problematic line is called?

import os
del os.environ['LC_ALL']
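
Note that the del statement above raises KeyError when LC_ALL is not set in the environment. A slightly more defensive variant might look like the sketch below; the helper name clear_lc_all is hypothetical and is not part of AlphaFold or Colab.

```python
import os


def clear_lc_all(environ=os.environ):
    """Remove LC_ALL if present and return its previous value (or None)."""
    # pop() with a default does not raise when the variable is unset,
    # unlike `del os.environ['LC_ALL']`.
    return environ.pop("LC_ALL", None)


previous = clear_lc_all()
print("LC_ALL was:", previous)
```

Whether removing LC_ALL actually resolves the error is not confirmed in this thread; the sketch only makes the suggested step safe to run repeatedly.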

Also, is this issue happening in the AlphaFold Colab, or is it in ColabFold?

@li-yq

li-yq commented Aug 3, 2022

> Also, is this issue happening in the AlphaFold Colab, or is it in ColabFold?

It's the AlphaFold Colab.

@Belfield

Belfield commented Aug 4, 2022 via email

@gmihaila

This is more of a Google Colab issue that occurs when creating the output_dir zip file from the output_dir folder. I solved it by using shutil to create the zip file.
You can simply replace this line in the cell "5. Run AlphaFold and download prediction":

!zip -q -r {output_dir}.zip {output_dir}

with:

import shutil; shutil.make_archive(output_dir, 'zip', output_dir)

This creates the zip file and should not cause any issues in the future, since it doesn't rely on the Google Colab terminal commands.
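
The shutil replacement described above can be tried outside the notebook. The following is a minimal sketch: the temporary directory and the file name example.pdb are placeholders invented for illustration, standing in for the notebook's real output_dir and its contents.

```python
import os
import shutil
import tempfile
import zipfile

# Hypothetical stand-in for the notebook's output_dir, so the sketch runs anywhere.
work_dir = tempfile.mkdtemp()
output_dir = os.path.join(work_dir, "prediction")
os.makedirs(output_dir)
with open(os.path.join(output_dir, "example.pdb"), "w", encoding="utf-8") as f:
    f.write("REMARK placeholder file\n")

# make_archive appends ".zip" to the base name and returns the archive path.
# Passing output_dir as root_dir stores the folder's contents at the top
# level of the archive.
zip_path = shutil.make_archive(output_dir, "zip", output_dir)

with zipfile.ZipFile(zip_path) as zf:
    names = zf.namelist()
print(zip_path, names)
```

One difference worth knowing: with root_dir set to output_dir the files sit at the top level of the archive, whereas zip -r given a directory argument stores them under that directory's name.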

I added this fix to a pull request here: #672

@tomgoddard

tomgoddard commented Jan 19, 2023

This problem is caused by Python somehow being switched from the default UTF-8 text encoding to ANSI_X3.4-1968 (the technical name for the ASCII text encoding). Google Colab shell magic (a leading "!" to run shell commands in Python scripts) gives the reported error if the encoding is not UTF-8. The switch from UTF-8 to ASCII happens when AlphaFold runs OpenMM energy minimization. I am not sure how OpenMM causes that switch. Usually the encoding is controlled by environment variables such as LANG or LC_ALL, and those settings are not changed when the error happens.
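
One way to observe the switch described above is to print the relevant values before and after the minimization step. This is only a diagnostic sketch; describe_encoding is a hypothetical helper name, and the set of environment variables it reads is an assumption about what is worth inspecting.

```python
import locale
import os


def describe_encoding():
    """Collect the values that usually determine Python's text encoding."""
    return {
        "preferred_encoding": locale.getpreferredencoding(),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
    }


print(describe_encoding())
```

If the preferred encoding changes between two calls while the environment variables stay the same, that matches the behavior reported in this comment.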

This bug has been reported for ColabFold run on Google Colab

sokrypton/ColabFold#237

and also for ChimeraX AlphaFold predictions run on Google Colab

https://www.rbvi.ucsf.edu/trac/ChimeraX/ticket/8313

I debugged the ChimeraX case but was not able to find the underlying cause. The Python function locale.getpreferredencoding() returns ANSI_X3.4-1968 when the error occurs but UTF-8 when there is no error. In Python 3.8 that routine uses _locale.nl_langinfo(_locale.CODESET), a call into C code that uses the nl_langinfo(CODESET) C library call. I did hours of testing and could not figure out why the C library call is not reporting UTF-8. Details are in the above ChimeraX ticket. Ultimately I put a very ugly workaround into the ChimeraX AlphaFold code, monkey patching _locale.nl_langinfo(CODESET) to report UTF-8, a horrible solution.

The fix suggested in the Dec 30, 2022 comment by gmihaila, replacing the !zip shell magic, works the first time AlphaFold is run. But other uses of shell magic then break if you do another run in the same Google Colab session. Another run will also create output files with the default ASCII encoding, which will cause failures (e.g. in ColabFold when it tries to write out citations with non-ASCII characters). A real fix will need to figure out how the text encoding is being changed, or how to reset it to UTF-8 after the OpenMM minimization changes it.
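
For code you control, the file-writing failures mentioned above can be sidestepped by naming the encoding explicitly instead of relying on the process default. A minimal sketch, using a made-up sample string and a temporary file path:

```python
import os
import tempfile

# Made-up sample text containing non-ASCII characters.
sample_text = "example with non-ASCII characters: é, Å, µ\n"

path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Passing encoding explicitly means the write does not depend on the
# process-wide default encoding, which is what flips to ASCII in this bug.
with open(path, "w", encoding="utf-8") as f:
    f.write(sample_text)

with open(path, encoding="utf-8") as f:
    round_trip = f.read()
print(round_trip == sample_text)
```

This does not address the root cause, and it does not help with shell magic or with library code that opens files without an explicit encoding.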

@Augustin-Zidek
Collaborator

Fixed in 0d9a24b. Thanks for reporting!
