Python3 Incompatibilities #6

Open
apeltzer opened this issue Nov 3, 2018 · 13 comments

Comments

@apeltzer
Contributor

apeltzer commented Nov 3, 2018

So apparently there are quite a few issues with Python 3+ that make reading input files difficult.

a.) UTF-8/Latin-1 file encodings are quite difficult to handle in Python 3, especially when trying to make input reading behave consistently with how Python 2 handled it automatically.

cf for details:

https://stackoverflow.com/questions/12468179/unicodedecodeerror-utf8-codec-cant-decode-byte-0x9c

b.) I guess the easiest way would be to add pysam (which is installable via pip, conda, etc.) and then rely on it as the reading/writing library for SAM/BAM compatibility. One could even have automatic MD tagging activated, making the process easier for users too.
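
To illustrate point a.), here is a minimal sketch of one way to read input lines in Python 3 without hitting a `UnicodeDecodeError`: read raw bytes and decode each line as UTF-8, falling back to Latin-1 when that fails. The helper names `decode_line` and `read_lines` are hypothetical and not part of PMDtools; whether a Latin-1 fallback is the right choice for a given input is an assumption.

```python
import io


def decode_line(raw, primary="utf-8", fallback="latin-1"):
    """Decode one raw line, falling back to Latin-1 if the primary codec fails."""
    try:
        return raw.decode(primary)
    except UnicodeDecodeError:
        # Latin-1 maps every byte 0x00-0xFF to a code point, so this cannot fail.
        return raw.decode(fallback)


def read_lines(binary_stream):
    """Yield decoded lines (without the trailing newline) from a binary stream."""
    for raw in binary_stream:
        yield decode_line(raw).rstrip("\r\n")


if __name__ == "__main__":
    # 0x9c on its own is not valid UTF-8, as in the linked Stack Overflow question.
    sample = io.BytesIO(b"read1\tACGT\n" + b"read2\t\x9c\n")
    for line in read_lines(sample):
        print(repr(line))
```

If the script reads SAM text from standard input, `sys.stdin.buffer` provides the binary stream that could be passed to `read_lines`.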

@pontussk
Owner

pontussk commented Nov 4, 2018 via email

@apeltzer
Contributor Author

apeltzer commented Nov 4, 2018

Hi Pontus,

yes, we could work on fixing these issues in the code directly - @boulund apparently ported the code base to Python 3 without changing that, so we could ask him again whether he can share what he changed :-)

cf nf-core/eager#36

@boulund

boulund commented Nov 8, 2018

I haven't had time to look at this in detail, maybe my code has the same issue with encodings as you've encountered?

Please have a look at the start of the rewrite I made here: https://github.com/boulund/PMDtools/tree/boulund-rewrite

Note that I made my rewrite of this a while back, before the code was put online under a permissive license, so if there are recent changes they are probably not incorporated in my version of the code.

@pontussk
Owner

pontussk commented Nov 9, 2018 via email

@rsh249

rsh249 commented Jan 25, 2021

Was there a solution to the python3 incompatibility?

@apeltzer
Contributor Author

None that I'm aware of so far, unfortunately. I've not had time to take a more detailed look, but would be happy to see such an update :-)

@boulund

boulund commented Jan 25, 2021

I haven't compared the results from my Python 3 conversion with the original code, but it runs in Python 3 and produces reasonable-looking output.

@rsh249

rsh249 commented Jan 26, 2021

I made an attempt to make the original script work with python3 and also get sensible results (so far).

https://github.com/rsh249/PMDtools *cloned repository with fixes for python3

I’ll check out yours @boulund too!

@apeltzer
Contributor Author

If this works reliably, it could be a good idea to open a PR to this repository and get it included in a new release, maybe?

@rsh249

rsh249 commented Feb 1, 2021

I will keep testing and let you know if my version seems OK for a PR. Thanks all!

@pontussk
Owner

pontussk commented Feb 25, 2021 via email

@boulund

boulund commented Sep 5, 2022

Is there still interest in a Python 3 port of PMDtools?
If there are any significant updates to the original code of version 0.6 (which is labelled 0.5 in the code) I could add them to my py3 port and create a PR to this repo, if anyone still wants it. Maybe PMDtools is good enough as is, or people have moved on to something else?
Is there a good test data set that could be used to verify that the output doesn't change?
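
A rough sketch of how such a check might look, assuming the output of the original script and of the Python 3 port on the same input has already been captured as text. The helper `diff_outputs` is hypothetical, and the labels `python2` and `python3` are only illustrative.

```python
import difflib


def diff_outputs(old_text, new_text, max_lines=20):
    """Return up to max_lines of unified diff between two captured outputs.

    An empty list means the two outputs are identical line by line.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile="python2", tofile="python3", lineterm=""
    )
    return list(diff)[:max_lines]


if __name__ == "__main__":
    # Placeholder strings standing in for real script output.
    for line in diff_outputs("a\nb\n", "a\nc\n"):
        print(line)
```

An exact line-by-line comparison is strict: if the port formats floating-point values differently, a numeric comparison with a tolerance might be more appropriate.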

@pontussk
Owner

pontussk commented Sep 30, 2022 via email

4 participants