Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when running buttery-eel on macOS #36

Closed
kisarur opened this issue Apr 29, 2024 · 9 comments
Closed

Error when running buttery-eel on macOS #36

kisarur opened this issue Apr 29, 2024 · 9 comments
Assignees
Labels
bug Something isn't working

Comments

@kisarur
Copy link

kisarur commented Apr 29, 2024

I could successfully setup buttery-eel on my MacBook Air, which has the Apple M2 chip and is running macOS Sonoma version 14.1.1. Below are the commands I used for the setup:

# setup buttery-eel inside its own environment:
git clone https://github.com/Psy-Fer/buttery-eel
mv buttery-eel buttery-eel-0.4.2+dorado7.2.13
cd buttery-eel-0.4.2+dorado7.2.13
git checkout 0860e66fcf783e8e # the commit I am testing
sed -i 's/7.1.4/7.2.13/' requirements.txt
python3 -m venv venv3
source venv3/bin/activate
pip3 install --upgrade pip
pip3 install .
deactivate

# download ont-dorado-server for macOS
wget https://cdn.oxfordnanoportal.com/software/analysis/ont-dorado-server_7.2.13_osxarm64.zip 
unzip ont-dorado-server_7.2.13_osxarm64.zip 

I tried running buttery-eel with a sample dataset using the command below:

scripts/eel -g ont-dorado-server/bin --config dna_r10.4.1_e8.2_400bps_5khz_hac_prom.cfg --device 'metal' -i PGXXXX230339_reads_20k.blow5 -o PGXXXX230339_reads_20k_dorado7213hac.fastq --qscore 9

BTW, when running above command, I'm getting the following error related to multiprocessing on macOS:

==========================================================================
  Basecalling
==========================================================================

Process read_worker:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Volumes/KisaruExtSSD/Buttery-eel/buttery-eel-0.4.2+dorado7.2.13/venv3/lib/python3.11/site-packages/src/buttery_eel.py", line 238, in read_worker
    if iq.qsize() < max_limit:
       ^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/queues.py", line 126, in qsize
    return self._maxsize - self._sem._semlock._get_value()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError

Thanks!

@Psy-Fer
Copy link
Owner

Psy-Fer commented Apr 30, 2024

Oh interesting! Thanks for reporting this, i'll see what I can do about it.

@Psy-Fer Psy-Fer self-assigned this Apr 30, 2024
@Psy-Fer Psy-Fer added the bug Something isn't working label Apr 30, 2024
@Psy-Fer
Copy link
Owner

Psy-Fer commented Apr 30, 2024

Hey @kisarur could you try this again with a new branch I just pushed called new_queue
https://github.com/Psy-Fer/buttery-eel/tree/new_queue

switch to this branch, modify the requirements.txt again, then do a pip install . again.

Then try running it. (I don't have macOS to test myself, sorry)

The fix I'm trying here, is instead of using miltiprocessing.JoinableQueue(), I'm using an Manager instance with multiprocessing.Manager.JoinableQueue()

Apparently that is implemented for MacOS.

The issue here, is related to
return self._maxsize - self._sem._semlock._get_value()

Basically, semlock is bugged or something for MacOS, so can't call qsize on the queue.

Let me know how you go with this version. If it works, I'll make this the default for all systems.

Cheers,
James

@hasindu2008
Copy link
Collaborator

@Psy-Fer I heard a rumour that it is working on Mac.

Do you think this will affect the stability/compatibility on architectures and O/S in which buttery-eel is well tested?

@kisarur
Copy link
Author

kisarur commented Apr 30, 2024

Hey @Psy-Fer, thanks for working on this fix! I tried the new branch with the fix and it worked perfectly on my mac. BTW, you could consider @hasindu2008's opinion to see if you should make this the default.

Cheers!
Kisaru

@kisarur
Copy link
Author

kisarur commented Apr 30, 2024

BTW, according to this Python doc, the value returned by Queue.qsize() seems to be not reliable on other OSs as well?

@Psy-Fer
Copy link
Owner

Psy-Fer commented Apr 30, 2024

Ahh good news.

Yea while it is approximate, I am dealing in batches, so the queue is not really that high a number, and there are not a lot of transactions happening to lend to it being inaccurate.

Using this fix shouldn't really impact much but of course I could just wrap this code in an OS detection method we built for interARTIC, as the code change is rather simple.

Thoughts @hasindu2008?

@hasindu2008
Copy link
Collaborator

Looking at your code, the use of queue size is a heuristic to prevent memory blowing up and thus the exact accuracy of the queue size does not matter.

Not sure if this article is right, but seems like manager.queue may impact performance which is quite important to us especially that we pass terabytes of data over a large run. So i am more inclined for that os dependent approach if it is not too hard and doesn't add non standard dependencies

https://www.geeksforgeeks.org/python-multiprocessing-queue-vs-multiprocessing-manager-queue/

@Psy-Fer
Copy link
Owner

Psy-Fer commented Aug 16, 2024

Alright i'm doing this now in the new version

    if platform.system() == "Darwin":
            im = mp.Manager()
            rm = mp.Manager()
            input_queue = im.JoinableQueue()
            result_queue = rm.JoinableQueue()
    else:
            input_queue = mp.JoinableQueue()
            result_queue = mp.JoinableQueue()
            ```
            
            
That should detect macOS and then use Manager. A bit slower, but it will work at least.

@Psy-Fer
Copy link
Owner

Psy-Fer commented Sep 7, 2024

This should now be resolved with
https://github.com/Psy-Fer/buttery-eel/releases/tag/v0.5.0

cheers
James

@Psy-Fer Psy-Fer closed this as completed Sep 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants