I ran into OOMs when processing a set of large fastq files.
I reproduced the problem in a small example: there appears to be a memory leak that is noticeable when you access .quali.
The attached program repeatedly opens test.fq.gz and prints the memory used after each iteration of opening the file and processing all the reads. There is some small growth in memory (with plateaus) when accessing .seq and .qual, but a much larger and more consistent growth when accessing .quali.
This is the output:
Running calling seq
After run 0, using 12,210,176 bytes of memory
After run 1, using 12,845,056 bytes of memory
After run 2, using 13,230,080 bytes of memory
After run 3, using 13,340,672 bytes of memory
After run 4, using 13,471,744 bytes of memory
After run 5, using 13,520,896 bytes of memory
After run 6, using 13,524,992 bytes of memory
After run 7, using 13,594,624 bytes of memory
After run 8, using 13,594,624 bytes of memory
After run 9, using 13,594,624 bytes of memory
After run 10, using 13,627,392 bytes of memory
After run 11, using 13,705,216 bytes of memory
After run 12, using 13,787,136 bytes of memory
After run 13, using 13,791,232 bytes of memory
After run 14, using 13,824,000 bytes of memory
After run 15, using 13,856,768 bytes of memory
After run 16, using 13,877,248 bytes of memory
After run 17, using 14,000,128 bytes of memory
After run 18, using 14,053,376 bytes of memory
After run 19, using 14,053,376 bytes of memory
After run 20, using 14,053,376 bytes of memory
After run 21, using 14,053,376 bytes of memory
After run 22, using 14,086,144 bytes of memory
After run 23, using 14,127,104 bytes of memory
After run 24, using 14,127,104 bytes of memory
After run 25, using 14,127,104 bytes of memory
After run 26, using 14,127,104 bytes of memory
After run 27, using 14,127,104 bytes of memory
After run 28, using 14,127,104 bytes of memory
After run 29, using 14,127,104 bytes of memory
After run 30, using 14,127,104 bytes of memory
After run 31, using 14,131,200 bytes of memory
After run 32, using 14,131,200 bytes of memory
After run 33, using 14,163,968 bytes of memory
After run 34, using 14,163,968 bytes of memory
After run 35, using 14,163,968 bytes of memory
After run 36, using 14,163,968 bytes of memory
After run 37, using 14,163,968 bytes of memory
After run 38, using 14,163,968 bytes of memory
After run 39, using 14,163,968 bytes of memory
After run 40, using 14,196,736 bytes of memory
After run 41, using 14,196,736 bytes of memory
After run 42, using 14,196,736 bytes of memory
After run 43, using 14,196,736 bytes of memory
After run 44, using 14,196,736 bytes of memory
After run 45, using 14,196,736 bytes of memory
After run 46, using 14,196,736 bytes of memory
After run 47, using 14,229,504 bytes of memory
After run 48, using 14,229,504 bytes of memory
After run 49, using 14,229,504 bytes of memory
Running calling qual
After run 0, using 14,299,136 bytes of memory
After run 1, using 14,299,136 bytes of memory
After run 2, using 14,299,136 bytes of memory
After run 3, using 14,299,136 bytes of memory
After run 4, using 14,299,136 bytes of memory
After run 5, using 14,299,136 bytes of memory
After run 6, using 14,299,136 bytes of memory
After run 7, using 14,299,136 bytes of memory
After run 8, using 14,299,136 bytes of memory
After run 9, using 14,299,136 bytes of memory
After run 10, using 14,299,136 bytes of memory
After run 11, using 14,299,136 bytes of memory
After run 12, using 14,299,136 bytes of memory
After run 13, using 14,303,232 bytes of memory
After run 14, using 14,303,232 bytes of memory
After run 15, using 14,303,232 bytes of memory
After run 16, using 14,303,232 bytes of memory
After run 17, using 14,303,232 bytes of memory
After run 18, using 14,303,232 bytes of memory
After run 19, using 14,303,232 bytes of memory
After run 20, using 14,303,232 bytes of memory
After run 21, using 14,303,232 bytes of memory
After run 22, using 14,303,232 bytes of memory
After run 23, using 14,303,232 bytes of memory
After run 24, using 14,303,232 bytes of memory
After run 25, using 14,360,576 bytes of memory
After run 26, using 14,372,864 bytes of memory
After run 27, using 14,372,864 bytes of memory
After run 28, using 14,372,864 bytes of memory
After run 29, using 14,405,632 bytes of memory
After run 30, using 14,409,728 bytes of memory
After run 31, using 14,409,728 bytes of memory
After run 32, using 14,462,976 bytes of memory
After run 33, using 14,462,976 bytes of memory
After run 34, using 14,462,976 bytes of memory
After run 35, using 14,462,976 bytes of memory
After run 36, using 14,462,976 bytes of memory
After run 37, using 14,462,976 bytes of memory
After run 38, using 14,462,976 bytes of memory
After run 39, using 14,462,976 bytes of memory
After run 40, using 14,462,976 bytes of memory
After run 41, using 14,462,976 bytes of memory
After run 42, using 14,462,976 bytes of memory
After run 43, using 14,462,976 bytes of memory
After run 44, using 14,462,976 bytes of memory
After run 45, using 14,462,976 bytes of memory
After run 46, using 14,462,976 bytes of memory
After run 47, using 14,462,976 bytes of memory
After run 48, using 14,462,976 bytes of memory
After run 49, using 14,462,976 bytes of memory
Running calling quali
After run 0, using 14,630,912 bytes of memory
After run 1, using 14,798,848 bytes of memory
After run 2, using 15,265,792 bytes of memory
After run 3, using 15,433,728 bytes of memory
After run 4, using 15,601,664 bytes of memory
After run 5, using 15,769,600 bytes of memory
After run 6, using 15,937,536 bytes of memory
After run 7, using 16,105,472 bytes of memory
After run 8, using 16,273,408 bytes of memory
After run 9, using 16,740,352 bytes of memory
After run 10, using 16,908,288 bytes of memory
After run 11, using 17,076,224 bytes of memory
After run 12, using 17,281,024 bytes of memory
After run 13, using 17,448,960 bytes of memory
After run 14, using 17,498,112 bytes of memory
After run 15, using 17,797,120 bytes of memory
After run 16, using 17,952,768 bytes of memory
After run 17, using 18,120,704 bytes of memory
After run 18, using 18,292,736 bytes of memory
After run 19, using 18,460,672 bytes of memory
After run 20, using 18,546,688 bytes of memory
After run 21, using 18,845,696 bytes of memory
After run 22, using 18,964,480 bytes of memory
After run 23, using 19,132,416 bytes of memory
After run 24, using 19,300,352 bytes of memory
After run 25, using 19,468,288 bytes of memory
After run 26, using 19,595,264 bytes of memory
After run 27, using 19,894,272 bytes of memory
After run 28, using 19,972,096 bytes of memory
After run 29, using 20,140,032 bytes of memory
After run 30, using 20,307,968 bytes of memory
After run 31, using 20,480,000 bytes of memory
After run 32, using 20,647,936 bytes of memory
After run 33, using 20,688,896 bytes of memory
After run 34, using 21,024,768 bytes of memory
After run 35, using 21,192,704 bytes of memory
After run 36, using 21,360,640 bytes of memory
After run 37, using 21,528,576 bytes of memory
After run 38, using 21,729,280 bytes of memory
After run 39, using 21,770,240 bytes of memory
After run 40, using 22,069,248 bytes of memory
After run 41, using 22,233,088 bytes of memory
After run 42, using 22,433,792 bytes of memory
After run 43, using 22,622,208 bytes of memory
After run 44, using 22,790,144 bytes of memory
After run 45, using 22,872,064 bytes of memory
After run 46, using 23,171,072 bytes of memory
After run 47, using 23,298,048 bytes of memory
After run 48, using 23,478,272 bytes of memory
After run 49, using 23,646,208 bytes of memory
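To put a number on the difference, the average growth per run can be computed from the log lines above. This is only a minimal sketch: it assumes the lines keep the exact "After run N, using X bytes of memory" format, and the helper name `mean_growth_per_run` is hypothetical, not part of pyfastx or psutil.

```python
import re

# Matches lines such as: After run 3, using 15,433,728 bytes of memory
LINE_RE = re.compile(r"After run (\d+), using ([\d,]+) bytes of memory")


def mean_growth_per_run(log_text: str) -> float:
    """Average RSS growth in bytes per run between the first and last logged run."""
    points = [
        (int(m.group(1)), int(m.group(2).replace(",", "")))
        for m in LINE_RE.finditer(log_text)
    ]
    if len(points) < 2:
        raise ValueError("need at least two 'After run' lines")
    (first_run, first_size), (last_run, last_size) = points[0], points[-1]
    if last_run == first_run:
        raise ValueError("first and last lines refer to the same run")
    return (last_size - first_size) / (last_run - first_run)


# First and last lines of the qual and quali sections of the log above.
qual_sample = (
    "After run 0, using 14,299,136 bytes of memory\n"
    "After run 49, using 14,462,976 bytes of memory\n"
)
quali_sample = (
    "After run 0, using 14,630,912 bytes of memory\n"
    "After run 49, using 23,646,208 bytes of memory\n"
)

print(f"qual:  {mean_growth_per_run(qual_sample):,.0f} bytes per run")
print(f"quali: {mean_growth_per_run(quali_sample):,.0f} bytes per run")
```

On the numbers above this works out to roughly 184,000 bytes per run for .quali, against roughly 3,300 bytes per run for .qual.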
The code is (sorry, I can't attach a .py directly)
import gc
import psutil
import pyfastx

print("\nRunning calling seq")
for run_idx in range(50):
    # Re-open the file on every run so each iteration starts fresh
    f = pyfastx.Fastq("tests/data/test.fq.gz")
    mysum = 0
    for read in f:
        mysum += hash(read.seq)
    del f
    gc.collect()
    # Resident set size of this process after cleanup
    mem_used = psutil.Process().memory_info().rss
    print(f"After run {run_idx}, using {mem_used:,} bytes of memory")

print("\nRunning calling qual")
for run_idx in range(50):
    f = pyfastx.Fastq("tests/data/test.fq.gz")
    mysum = 0
    for read in f:
        mysum += hash(read.qual)
    del f
    gc.collect()
    mem_used = psutil.Process().memory_info().rss
    print(f"After run {run_idx}, using {mem_used:,} bytes of memory")

print("\nRunning calling quali")
for run_idx in range(50):
    f = pyfastx.Fastq("tests/data/test.fq.gz")
    mysum = 0
    for read in f:
        # .quali is the accessor that shows the steady growth
        mysum += sum(read.quali)
    del f
    gc.collect()
    mem_used = psutil.Process().memory_info().rss
    print(f"After run {run_idx}, using {mem_used:,} bytes of memory")
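The three loops above differ only in which attribute is read, so the same measurement could be folded into one helper. The following is a sketch under assumptions, not the code that produced the output: `measure_runs` and its parameters are hypothetical names, and the memory probe is passed in as a callable so the block runs without pyfastx or psutil installed.

```python
import gc
from typing import Any, Callable, Iterable, List


def measure_runs(
    open_reads: Callable[[], Iterable[Any]],
    consume: Callable[[Any], int],
    runs: int,
    get_rss: Callable[[], int],
) -> List[int]:
    """Open the reads, consume every record, clean up, and record memory each run."""
    readings = []
    for _ in range(runs):
        reads = open_reads()
        total = 0
        for read in reads:
            total += consume(read)
        # Drop the reader and force a collection before sampling memory,
        # mirroring the del / gc.collect() steps in the loops above.
        del reads
        gc.collect()
        readings.append(get_rss())
    return readings


if __name__ == "__main__":
    # Stand-in data and a fake memory probe, only to show the call shape.
    fake_memory = iter(range(1000, 1050, 10))
    print(measure_runs(lambda: ["ACGT", "GGCC"], len, 5, lambda: next(fake_memory)))
```

With the libraries from the report, the .quali case would be called as `measure_runs(lambda: pyfastx.Fastq("tests/data/test.fq.gz"), lambda read: sum(read.quali), 50, lambda: psutil.Process().memory_info().rss)`.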
Relevant system details