Huge Memory consumption #3

Closed
Colorstorm opened this issue Apr 27, 2020 · 6 comments

@Colorstorm (Collaborator)

Hi,

I have tried to run some tests and ran into a lot of trouble with the huge amount of memory required.

@konnosif which settings do you use?

Maybe we could make the settings with the lowest memory usage the default and add a note that other settings take a lot of memory.

@konnosif (Collaborator) commented Apr 27, 2020

@Colorstorm

--filter_sTF 1 --filter_sStart 3 --filter_sEnd 4 --suffix_label 0

--genus (should not affect memory)

--corr (should not affect memory)
Those should be the settings with the lowest memory usage.

If there is still a memory problem, it might be because of the memory needed to create the huge images. The input itself should not be affecting memory, but image creation is.
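A rough sketch of that cost (the figure dimensions here are hypothetical, and this assumes matplotlib's Agg backend, whose rendered canvas is an RGBA buffer of width × height × 4 bytes):

```python
# Back-of-the-envelope canvas memory, assuming matplotlib's Agg backend:
# the rendered image is an RGBA buffer of (width_px * height_px * 4) bytes.
width_in, height_in, dpi = 200, 200, 300   # hypothetical heatmap size
width_px, height_px = width_in * dpi, height_in * dpi
gib = width_px * height_px * 4 / 1024**3
print(f"{width_px} x {height_px} px canvas needs ~{gib:.1f} GiB for the pixel buffer alone")
```

If image creation is indeed the bottleneck, halving the figure size or dpi cuts that buffer by a factor of four.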

@Colorstorm (Collaborator, Author)

Thanks, I am trying the options now and will keep you up to date.

@Colorstorm (Collaborator, Author)

How much memory did you have allocated in Slurm?

rcug_lw@hpc05:/working2/rcug_lw/konstantinos/fabian_test$ srun python3 heat5.py --batch_files marie.txt.filt.sort.csv.forpython.txt --filter_sTF 1 --filter_sStart 3 --filter_sEnd 4 --suffix_label 0 --where ../cov_test_mariep/
srun: job 3367604 queued and waiting for resources
srun: error: Lookup failed: Unknown host
srun: job 3367604 has been allocated resources
/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm
####Current directory is : /working2/rcug_lw/konstantinos/fabian_test
####Current file.txt with filenames to open is:marie.txt.filt.sort.csv.forpython.txt
KGCF01_S4_R1 dataset
Successfully created the directory /working2/rcug_lw/konstantinos/fabian_test/KGCF01_S4_R1
Successfully created the directory /working2/rcug_lw/konstantinos/fabian_test/KGCF01_S4_R1/heatmap
Traceback (most recent call last):
  File "heat5.py", line 211, in <module>
    result = pd.merge(leftPANDA, rightPANDA, how='outer', on=['colA'])
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 88, in merge
    return op.get_result()
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 668, in get_result
    self._maybe_add_join_keys(result, left_indexer, right_indexer)
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 821, in _maybe_add_join_keys
    result[name] = key_col
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2938, in __setitem__
    self._set_item(key, value)
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 3000, in _set_item
    value = self._sanitize_column(key, value)
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 3645, in _sanitize_column
    value = value.copy(deep=True)
  File "/working2/rcug_lw/miniconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 861, in copy
    new_index = self._shallow_copy(self._data.copy())
MemoryError: Unable to allocate 41.4 GiB for an array with shape (5556669383,) and data type object
srun: error: hpc-rc05: task 0: Exited with exit code 1
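(On the Slurm side, memory can be requested explicitly, e.g. `srun --mem=64G ...`.) The failure is in the outer pd.merge at heat5.py line 211. As a minimal sketch of how such a merge can balloon to billions of rows (toy frames below, not the script's actual data): an outer merge on a key column with repeated values produces the Cartesian product of the matching rows.

```python
import pandas as pd

# Toy illustration of merge blow-up: every left row with key "x"
# pairs with every right row with key "x", so two 1,000-row inputs
# yield a 1,000,000-row result.
left = pd.DataFrame({"colA": ["x"] * 1000, "val_l": range(1000)})
right = pd.DataFrame({"colA": ["x"] * 1000, "val_r": range(1000)})
merged = pd.merge(left, right, how="outer", on=["colA"])
print(len(merged))  # 1000000
```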

@colindaven (Collaborator)

@Colorstorm I haven't tried the script out, but I saw it use up to 480 GB of RAM on one server this morning.

@colindaven (Collaborator) commented Apr 28, 2020

Konstantinos' working command:

The command I used on Ubuntu (Ubuntu on Windows 10!) on an 8 GB desktop with the toy dataset (from GitHub):

python3 heat5.py --batch_files sample.txt --filter_sTF 1 --filter_sStart 0 --filter_sEnd 4 --suffix_label A --genus 1 --corr 0

@colindaven (Collaborator)

Huge RAM use was due to the coverage window file, not the bam.txt files, being used as input.
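A cheap pre-flight check for that mix-up (file names follow the toy-dataset command above and are otherwise hypothetical) is to count the rows of each listed input before running; a genome-wide coverage-window table will be orders of magnitude longer than a per-sample bam.txt summary:

```python
# Hypothetical sanity check: report the row count of every file listed
# in the batch file, so an oversized coverage-window input stands out.
batch_file = "sample.txt"  # name taken from the toy-dataset command above

with open(batch_file) as fh:
    for name in (line.strip() for line in fh if line.strip()):
        with open(name) as data:
            rows = sum(1 for _ in data)
        print(f"{name}: {rows:,} rows")
```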
