parasail_memalign: posix_memalign failed: Cannot allocate memory #70

Open
silenus092 opened this issue Mar 20, 2023 · 4 comments

@silenus092

Hi,

I am having a problem running the tool on my server, and it is unclear to me what is happening.

First, I installed the package (via pip), integrated it into my application, and ran the application on my notebook (WSL, Ubuntu 20), where it worked fine.

However, when I deployed my application on the server to perform a bigger task, it failed with the following error:

parasail_memalign: posix_memalign failed: Cannot allocate memory

The specs of my mini server are:

Debian GNU/Linux 10 (buster)
Release: 10
16 CPUs and 32 GB of RAM

Even when I tried to align only 1 sample, I still got the same error.

I also tried installing parasail from source (e.g., python setup.py bdist_wheel), and I tried the CMake build (https://github.com/jeffdaily/parasail#compiling-and-installing) and manually copied libparasail.so into the parasail-python shared library folder. Both installations went well; however, when I ran my application, the problem remained.

Could you please shed some light on what else the problem could be? Does it have something to do with cross-platform compiling or dependencies?

Best regards,
Note

@jeffdaily
Owner

This is a generic out of memory error being reported from the C parasail library.

Could you provide more information about the sizes of your input sequences and the "bigger task"? Also, which matrix and alignment routine are you using? These all affect how much memory is needed to perform the alignment.

@silenus092
Author

The bigger task is just more sequences to be aligned; sorry for the confusion.

I want to align mpox sequences (sequence length ~19X,XXX) from NCBI against an mpox reference.

The settings in the code are:

gapopen=16, gapextend=4
parasail.sg_trace(query_seq, reference_seq, gapopen, gapextend, parasail.blosum62)

I use the same code on both machines.

Right now I use only a single thread to perform the alignment; however, I also plan to run multiple alignments at the same time (multithreaded).

@silenus092
Author

Hi, I found the solution: my server had no swap space configured to supplement the RAM.

Anyway, is it normal for an alignment to consume so much RAM?
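
For anyone hitting the same error, a quick check of available RAM and swap before launching a large alignment may help. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo; the helper names parse_meminfo and memory_headroom_kb are hypothetical and not part of parasail. The result is only a rough indicator, since whether an allocation succeeds also depends on the kernel's overcommit settings.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict mapping field name to kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_headroom_kb(info):
    """Rough headroom: available RAM plus free swap, in kB (hypothetical helper)."""
    return info.get("MemAvailable", 0) + info.get("SwapFree", 0)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            meminfo = parse_meminfo(fh.read())
    except OSError:
        print("/proc/meminfo is not readable on this system")
    else:
        print("SwapTotal (kB):", meminfo.get("SwapTotal", 0))
        print("Headroom (GB):", memory_headroom_kb(meminfo) / 1e6)
```

A SwapTotal of 0 would match the situation described here, where the server had no swap configured.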

parasail_aligner -a sw_trace -f NC_063383.1.fasta -q sample-1.fasta -e 4 -g 16 -r 2GB -O SAM -V

      memory budget: 2.0000 GB
read and pack memory: 0.0008 GB
  openmp prep memory: 0.0008 GB
  post-result memory: 38.8786 GB

@jeffdaily
Owner

It's been too long. I'm having to go back through my code to see how much memory it uses. Roughly, an alignment uses 3 NxM arrays and 1 length N array where N is the length of the s1 sequence and M is the length of the s2 sequence. The arrays are typically 4-byte integers.
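
The rough formula above can be turned into a back-of-the-envelope estimate. This is a minimal sketch, not parasail's actual accounting: the function name estimate_alignment_bytes is hypothetical, the array count and cell size are left as parameters because they are described only as "roughly" and "typically", and the sequence lengths in the example are placeholder values rather than the real mpox lengths.

```python
def estimate_alignment_bytes(n, m, bytes_per_cell=4, full_arrays=3):
    """Estimate memory for one alignment from the rough formula:
    `full_arrays` arrays of n*m cells plus one array of n cells.
    n and m are the lengths of the s1 and s2 sequences."""
    return (full_arrays * n * m + n) * bytes_per_cell


def to_gb(num_bytes):
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return num_bytes / 1e9


if __name__ == "__main__":
    # Placeholder lengths (assumption): two sequences of 190,000 residues each.
    n = m = 190_000
    print(f"estimated: {to_gb(estimate_alignment_bytes(n, m)):.1f} GB")
```

Because the cost grows with the product of the two lengths, doubling both sequences roughly quadruples the estimate, which is why genome-scale pairs can exceed the RAM of a 32 GB machine.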

How long is the NCBI mpox reference?

The parasail_aligner by default uses a suffix array filter. It can eliminate alignments by only aligning pairs with exact-match segments longer than a cutoff. It uses N + M memory. For your use case, I would turn it off using -x. The use case for keeping the filter turned on is more like BLAST where you are potentially aligning many sequences, and alignments are more expensive than computing the filter first so it does save time.

Also, you used sw_trace for the alignment. This is the vanilla implementation that doesn't use any vector instructions, so you aren't getting the best performance. Might I recommend sw_trace_striped_16 or sw_trace_striped_32? These will dispatch to the function that supports your CPU's best vector ISA. Here the 16 and 32 refer to the width in bits of the intermediate solution values: use 16 if you know the alignment score will remain below the 16-bit integer limit of 32767, or 32 if it will remain below the 32-bit integer limit of 2147483647.
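
One way to choose between the 16 and 32 variants is to bound the best possible score up front. The sketch below assumes a simple upper bound (every position of the shorter sequence aligned at the matrix's largest score); the helper pick_striped_routine is hypothetical and only builds a routine name, it does not call parasail. It bounds the final score only and does not account for how negative intermediate values may become.

```python
INT16_MAX = 32767
INT32_MAX = 2147483647


def pick_striped_routine(query_len, ref_len, max_match_score,
                         prefix="sw_trace_striped"):
    """Return a routine name ending in _16 or _32 based on a score upper bound.

    The bound assumes every residue of the shorter sequence is aligned
    with the highest score in the substitution matrix.
    """
    bound = min(query_len, ref_len) * max_match_score
    if bound <= INT16_MAX:
        return f"{prefix}_16"
    if bound <= INT32_MAX:
        return f"{prefix}_32"
    raise ValueError("score bound exceeds the 32-bit integer limit")


if __name__ == "__main__":
    # 11 is the largest entry in BLOSUM62 (W/W); lengths are placeholders.
    print(pick_striped_routine(190_000, 190_000, 11))
    print(pick_striped_routine(100, 200, 11))
```

With parasail-python, the returned name could then be looked up with getattr(parasail, name), assuming the binding exposes the routine under that name.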
