<a id='home'></a>
### purpose

Create allele-frequency (pooled) runs of Gradient Forests (GF) using the same sets of random loci that were assigned to the individual-based runs.

### outline

1. [get dirs with training files](#dirs)

    get a list of directories containing genomic data

1. [symlink supplementary files](#sup)

    directories are nested by the number of loci provided for training. There is no need to copy environmental data to each directory, so I symlink it. These files are used by training scripts.
    
1. [create snpfiles](#snp)

    use the individual-based files to get a list of loci, then subset the pooled files
    
1. [create sh files for training and predicting](#shfiles)

    create slurm sbatch files to submit training jobs to the cluster, as well as jobs that use the trained models to make predictions to common garden environments
    
    1. [create sh files for training](#training)
    1. [create shfiles for making predictions of the trained GF models to specific environments](#predict)

1. [submit jobs using 500 loci](#submit500)

1. [submit jobs using 5000 loci](#submit5000)

1. [submit jobs using 10000 loci](#submit10k)

1. [submit jobs using 20000 loci](#submit20k)

In [1]:
from pythonimports import *

import MVP_summary_functions as mvp

lview, dview = get_client(cluster_id='1707744598-70rf', profile='lotterhos')

outerdir = '/work/lotterhos/brandon/ind_runtimes'
pooled_dir = makedir('/work/lotterhos/brandon/pooled_runtimes')

mvp.latest_commit()
session_info.show()

36 36
#########################################################
Today:	February 05, 2024 - 15:13:09 EST
python version: 3.8.5
conda env: mvp_env

Current commit of pythonimports:
commit 419895d157c97717f835390196c13cf973d25eba  
Merge: e20434f 1e09b6c  
Author: Brandon Lind <lind.brandon.m@gmail.com>

Current commit of MVP_offsets:
commit 8b790072e7a46d7f58a30c40cf4660986a612764  
Author: Brandon Lind <lind.brandon.m@gmail.com>  
Date:   Fri Feb 2 13:55:49 2024 -0500
#########################################################



<a id='dirs'></a>
# get dirs with training files
[top](#home)

In [2]:
# get a list of directories for each rep
reps = fs(outerdir, startswith='run', dirs=True, bnames=True)

reps

['run_20220919_0-225', 'run_20220919_225-450', 'run_20220919_450-675']

In [3]:
# directories with pooled SNP data with all markers
src_dirs = {}
for rep in reps:
    src_dirs[rep] = f'/work/lotterhos/MVP-Offsets/{rep}/gradient_forests/training/training_files'

In [4]:
# create pooled directories

set_nums = ['00500', '05000', '10000', '20000']

pooled_dirs = defaultdict(list)
for rep, d in src_dirs.items():
    print(ColorText(rep).bold())
    
    for set_num in set_nums: 
        pooled_dirs[rep].append(
            makedir(
                f'{pooled_dir}/{rep}/{set_num}/gradient_forests/training/training_files'
            )
        )
        
    print(pooled_dirs[rep], '\n')

run_20220919_0-225
['/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files', '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files', '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files', '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files'] 

run_20220919_225-450
['/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/00500/gradient_forests/training/training_files', '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/05000/gradient_forests/training/training_files', '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/10000/gradient_forests/training/training_files', '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/20000/gradient_forests/training/training_files'] 

run_20220919_450-675
['/work/lotterhos/brandon/poo
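
The set labels are zero-padded so that the directory names sort by locus count. A minimal sketch of generating the same labels and nested directories, assuming a temporary root in place of `pooled_dir` and plain `os.makedirs` in place of the `makedir` helper from `pythonimports`:

```python
import os
import tempfile

# Locus counts for the runtime sets; zero-padding to five digits
# reproduces labels such as '00500' and keeps listings sorted.
locus_counts = [500, 5000, 10000, 20000]
set_nums = [f'{n:05d}' for n in locus_counts]

# Hypothetical stand-ins for `pooled_dir` and one rep name.
root = tempfile.mkdtemp()
rep = 'run_20220919_0-225'

created = []
for set_num in set_nums:
    path = os.path.join(
        root, rep, set_num, 'gradient_forests', 'training', 'training_files'
    )
    os.makedirs(path, exist_ok=True)  # safe to re-run; existing dirs are left alone
    created.append(path)
```

Deriving the labels from integers avoids typing the padded strings by hand, which matters because later cells index `mems` and `times` by exactly these strings.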

<a id='sup'></a>
# symlink supplementary files

[top](#home)

In [5]:
# symlink envfile and rangefile
for rep, d in src_dirs.items():
    files = fs(d, exclude=['maf-gt-p01', 'adaptive', 'neutral'], endswith='pooled.txt')
    
    assert len(files) == 450, len(files)  # 2 files for each of 225 seeds (one envfile, one rangefile)

    for src in pbar(files, desc=rep):
        for repdir in pooled_dirs[rep]:
            dst = f'{repdir}/{op.basename(src)}'
            
            try:
                os.symlink(src, dst)
            except FileExistsError as e:
                pass

run_20220919_0-225: 100%|███████████████| 450/450 [00:00<00:00, 574.38it/s]
run_20220919_225-450: 100%|███████████████| 450/450 [00:00<00:00, 611.40it/s]
run_20220919_450-675: 100%|███████████████| 450/450 [00:00<00:00, 643.21it/s]


In [6]:
# add ind envfiles to directories so jobs pass mvp02.get_envdata filecount assertion
for rep, repdirs in pooled_dirs.items():
    for dst_dir in pbar(repdirs, desc=rep):

        src_dir = src_dirs[rep]
        assert src_dir != dst_dir

        envfiles = fs(src_dir, endswith='envfile_GFready_ind.txt')
        assert len(envfiles) == 225

        for src in envfiles:
            dst = f'{dst_dir}/{op.basename(src)}'

            try:
                os.symlink(src, dst)
            except FileExistsError as e:
                pass

run_20220919_0-225: 100%|███████████████| 4/4 [00:00<00:00, 10.44it/s]
run_20220919_225-450: 100%|███████████████| 4/4 [00:00<00:00, 10.61it/s]
run_20220919_450-675: 100%|███████████████| 4/4 [00:00<00:00, 10.94it/s]
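
Both symlink cells above repeat the same try/except pattern. A small helper could hold it in one place; the sketch below is an illustration only, with a hypothetical function name and toy file names in a temporary directory:

```python
import os
import tempfile

def symlink_into(src, dst_dir):
    """Symlink `src` into `dst_dir`, leaving an existing link or file alone."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    try:
        os.symlink(src, dst)
    except FileExistsError:
        pass  # already linked on a previous run
    return dst

# Toy demonstration (hypothetical file name).
tmp = tempfile.mkdtemp()
src_dir = os.path.join(tmp, 'src')
dst_dir = os.path.join(tmp, 'dst')
os.makedirs(src_dir)
os.makedirs(dst_dir)

src = os.path.join(src_dir, '0000001_envfile_GFready_pooled.txt')
with open(src, 'w') as fh:
    fh.write('env\n')

first = symlink_into(src, dst_dir)
second = symlink_into(src, dst_dir)  # re-running does not raise
```

Swallowing `FileExistsError` makes the cell safe to re-run, but it also hides the case where `dst` exists and points somewhere else, so an explicit check of the link target may be worth adding.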


<a id='snp'></a>
# create snpfiles

use the individual-based files to get a list of loci, then subset the pooled files
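
The loci chosen for each individual-based run are the column names of its SNP file, so only the header line is needed to recover them. As a sanity check before subsetting, the sketch below confirms that every locus in an individual-based file is present in the pooled file. It uses only the standard library; the function names and toy locus names are hypothetical, and it assumes tab-delimited files with an `index` column as in the cells below.

```python
import os
import tempfile

def read_header(path, sep='\t'):
    """Return the column names from the first line of a delimited text file."""
    with open(path) as fh:
        return fh.readline().rstrip('\n').split(sep)

def missing_loci(ind_file, pooled_file, index_col='index'):
    """List loci named in the individual file that are absent from the pooled file."""
    ind_loci = [c for c in read_header(ind_file) if c != index_col]
    pooled_cols = set(read_header(pooled_file))
    return [locus for locus in ind_loci if locus not in pooled_cols]

# Toy files with hypothetical locus names.
tmp = tempfile.mkdtemp()
ind_file = os.path.join(tmp, 'toy_ind_all.txt')
pooled_file = os.path.join(tmp, 'toy_pooled_all.txt')
with open(ind_file, 'w') as fh:
    fh.write('index\tlocus_2\tlocus_5\n' 'i1\t0\t2\n')
with open(pooled_file, 'w') as fh:
    fh.write('index\tlocus_1\tlocus_2\tlocus_5\n' 'pop1\t0.1\t0.4\t0.9\n')

absent = missing_loci(ind_file, pooled_file)
```

An empty `absent` list means the column subset will not raise a `KeyError` for that pair of files.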

In [7]:
ind_snpfiles = wrap_defaultdict(dict, 2)
for rep, repdirs in pooled_dirs.items():
    for repdir in repdirs:
        num_loci = repdir.split('/')[6]
        
        ind_dir = repdir.replace('pooled_runtimes', 'ind_runtimes')
        
        ind_snpfiles[rep][num_loci] = fs(ind_dir, endswith='ind_all.txt')
        
        print(rep, num_loci, len(ind_snpfiles[rep][num_loci]))

run_20220919_0-225 00500 225
run_20220919_0-225 05000 225
run_20220919_0-225 10000 225
run_20220919_0-225 20000 225
run_20220919_225-450 00500 225
run_20220919_225-450 05000 225
run_20220919_225-450 10000 225
run_20220919_225-450 20000 225
run_20220919_450-675 00500 225
run_20220919_450-675 05000 225
run_20220919_450-675 10000 225
run_20220919_450-675 20000 225
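
The cell above pulls the locus count out of the path with a fixed index (`split('/')[6]`), which only works while the root directory keeps the same depth. A sketch of an alternative that reads the label relative to the root, assuming the `root/rep/num_loci/...` layout used here (the function name is hypothetical):

```python
import os

def set_label(path, root):
    """Return (rep, num_loci) for a directory laid out as root/rep/num_loci/..."""
    parts = os.path.relpath(path, root).split(os.sep)
    rep, num_loci = parts[0], parts[1]
    return rep, num_loci

root = '/work/lotterhos/brandon/pooled_runtimes'
repdir = f'{root}/run_20220919_0-225/00500/gradient_forests/training/training_files'

rep_name, num_loci_label = set_label(repdir, root)
```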


In [8]:
ind_snpfiles[rep][num_loci][0]

'/work/lotterhos/brandon/ind_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files/1231544_Rout_Gmat_sample_maf-gt-p01_GFready_ind_all.txt'

In [9]:
rep

'run_20220919_450-675'

In [10]:
def subset_loci(rep, num_loci, ind_file):
    import pandas as pd
    from os import path as op
    
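    # only the column names (loci) are needed, so read just a few rows
    ind_snps = pd.read_table(ind_file, index_col='index', nrows=5)
    
    # only check the smaller sets: some datasets have too few SNPs to fill the larger sets
    if int(num_loci) < 10000:
        assert len(ind_snps.columns) == int(num_loci)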
    
    basename = op.basename(ind_file).replace('_ind_', '_pooled_')
    pooled_file = f'/home/b.lind/offsets/{rep}/gradient_forests/training/training_files/{basename}'
    
    assert op.exists(pooled_file)
    
    pooled_snps = pd.read_table(pooled_file, index_col='index')[ind_snps.columns]
    pooled_snps['index'] = pooled_snps.index.tolist()
    
    dst = ind_file.replace('ind_runtimes', 'pooled_runtimes').replace('_ind_all.txt', '_pooled_all.txt')
    
    assert 'pooled_runtimes' in dst
    
    pooled_snps.to_csv(dst, sep='\t', index=False, header=True)
    
    return dst

In [11]:
jobs = []
for (rep, num_loci), ind_files in unwrap_dictionary(ind_snpfiles):
    for ind_file in ind_files:
        jobs.append(
            lview.apply_async(
                subset_loci, *(rep, num_loci, ind_file)
            )
        )
    
watch_async(jobs)

Watching 2700 jobs ...


100%|███████████████| 2700/2700 [03:39<00:00, 12.29it/s]


In [12]:
# check for errors
for i, j in enumerate(jobs):
    x = j.r

In [13]:
for j in jobs:
    assert op.exists(j.r)

<a id='shfiles'></a>
# create sh files for training and predicting

[top](#home)

<a id='training'></a>
### create training shfiles

[top](#home)

I reuse the training sh files previously created for Lind & Lotterhos (2024), editing the paths, time, memory, and email address to suit these runs.

In [14]:
mems = {
    '00500' : '10000M',
    '05000' : '100000M',
    '10000' : '180000M',
    '20000' : '200000M'
}

times = {
    '00500' : '0-03:00:00',
    '05000' : '0-12:00:00',
    '10000' : '1-00:00:00',
    '20000' : '1-00:00:00'
}

In [15]:
all_shfiles = []
for rep, src_dir in src_dirs.items():
    # get the sh files
    shdir = src_dir.replace('_files', '_shfiles')
    
    shfiles = fs(shdir, endswith='pooled_all.sh', exclude='watcher')
    
    assert len(shfiles) == 225
    
    # create new shfiles in each repdir
    for sh in pbar(shfiles, desc=rep):
    
        for dst_dir in pooled_dirs[rep]:
            rep_num = dst_dir.split('/')[6]
            assert int(rep_num) == float(rep_num)
            
            text = read(sh, lines=False)

            assert text.count(f'/home/b.lind/offsets/{rep}') == 2
            
            # replace training files and training outfiles dirs
            text = text.replace(f'/home/b.lind/offsets/{rep}', dst_dir.split('/gradient')[0])
            
            # replace mem and time and email
            assert all(
                [
                    '--time=2-00:00:00' in text or '--time=1-00:00:00' in text or '--time=5-00:00:00' in text,
                    '--mem=300000M' in text or '--mem=400000M' in text,
                    'b.lind@northeastern.edu' in text
                ]
            )
            
            text = text.replace('#SBATCH --time=2-00:00:00', '#SBATCH --time=%s' % times[rep_num]).\
                        replace('#SBATCH --time=5-00:00:00', '#SBATCH --time=%s' % times[rep_num]).\
                        replace('#SBATCH --time=1-00:00:00', '#SBATCH --time=%s' % times[rep_num])
            text = text.replace('#SBATCH --mem=300000M', '#SBATCH --mem=%s' % mems[rep_num]).\
                        replace('#SBATCH --mem=400000M', '#SBATCH --mem=%s' % mems[rep_num])
            text = text.replace('b.lind@northeastern.edu', 'dummy_email@gmail.com')
            
            # make dirs
            dst_shdir = makedir(dst_dir.replace('training_files', 'training_shfiles'))
            dst_outdir = makedir(dst_dir.replace('training_files', 'training_outfiles'))
            
            # write new sh file
            newsh = f'{dst_shdir}/{op.basename(sh)}'
            with open(newsh, 'w') as o:
                o.write(text)
                
            all_shfiles.append(newsh)
            
len(all_shfiles), luni(all_shfiles)

run_20220919_0-225: 100%|███████████████| 225/225 [00:05<00:00, 39.10it/s]
run_20220919_225-450: 100%|███████████████| 225/225 [00:06<00:00, 36.20it/s]
run_20220919_450-675: 100%|███████████████| 225/225 [00:05<00:00, 41.19it/s]


(2700, 2700)
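
The assertion in the cell above checks that each sh file carries one of several known `--time` and `--mem` values before they are replaced. When writing such checks, note that a bare non-empty string on the right-hand side of `or` is always truthy, so every alternative needs its own `in text`. A sketch of the same check written with `any`, using a hypothetical helper and a toy header:

```python
def has_any(text, candidates):
    """True if at least one candidate substring occurs in `text`."""
    return any(candidate in text for candidate in candidates)

# Toy sbatch header (hypothetical job name).
text = '\n'.join([
    '#!/bin/bash',
    '#SBATCH --job-name=toy_GF_training_pooled_all',
    '#SBATCH --time=2-00:00:00',
    '#SBATCH --mem=300000M',
])

time_ok = has_any(text, ['--time=1-00:00:00', '--time=2-00:00:00', '--time=5-00:00:00'])
mem_ok = has_any(text, ['--mem=300000M', '--mem=400000M'])

# An unexpected memory request is caught.
unexpected_mem = has_any(text.replace('300000M', '250000M'),
                         ['--mem=300000M', '--mem=400000M'])

# A bare non-empty string is truthy, so this expression can never be False.
always_true = bool('--mem=300000M' in 'no directives here' or '--mem=400000M')
```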

In [16]:
for sh in pbar(all_shfiles):
    assert sh.startswith('/work/lotterhos/brandon/pooled_runtimes')

100%|███████████████| 2700/2700 [00:00<00:00, 1009054.69it/s]


In [17]:
all_shfiles[0]

'/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_shfiles/1231094_GF_training_pooled_all.sh'

In [18]:
print(text)

#!/bin/bash
#SBATCH --job-name=1231768_GF_training_pooled_all
#SBATCH --time=1-00:00:00
#SBATCH --mem=200000M
#SBATCH --partition=long
#SBATCH --output=1231768_GF_training_pooled_all_%j.out
#SBATCH --mail-user=dummy_email@gmail.com
#SBATCH --mail-type=FAIL

source $HOME/.bashrc  # assumed that conda init is within .bashrc
conda deactivate
conda activate r35

cd /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files

/home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript \
/home/b.lind/code/MVP-offsets/01_src/MVP_gf_training_script.R \
1231768_Rout_Gmat_sample_maf-gt-p01_GFready_pooled_all.txt \
1231768_envfile_GFready_pooled.txt \
1231768_rangefile_GFready_pooled.txt \
1231768_GF_training_pooled_all \
/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_outfiles




<a id='predict'></a>
### create shfiles for making predictions of the trained GF models to specific environments

MVP_02 doesn't use slimdir, so it doesn't matter that the path replacement below also rewrites its slimdir argument

MVP_03 needs the original slimdir to get fitness/envdata/locations etc for validation

[top](#home)

In [19]:
src_dirs

{'run_20220919_0-225': '/work/lotterhos/MVP-Offsets/run_20220919_0-225/gradient_forests/training/training_files',
 'run_20220919_225-450': '/work/lotterhos/MVP-Offsets/run_20220919_225-450/gradient_forests/training/training_files',
 'run_20220919_450-675': '/work/lotterhos/MVP-Offsets/run_20220919_450-675/gradient_forests/training/training_files'}

In [20]:
pooled_dirs[rep]

['/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_files',
 '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_files',
 '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_files',
 '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files']

In [21]:
for rep, src_dir in src_dirs.items():
    fitting_shdir = src_dir.replace('training/training_files', 'fitting/fitting_shfiles')
    
    shfiles = fs(fitting_shdir, endswith='.sh', exclude='watcher')
    
    for sh in pbar(shfiles, desc=rep):
        for dst_dir in pooled_dirs[rep]:
            dst_shdir = makedir(dst_dir.replace('training/training_files', 'fitting/fitting_shfiles'))
            new_sh = f'{dst_shdir}/{op.basename(sh)}'
            
            text = read(sh, lines=True, ignore_blank=True)
            
            assert 'time' in text[2]
            text[2] = '#SBATCH --time=1-00:00:00'

            assert 'MVP_02' in text[-2]
            text[-2] = text[-2].replace(f'/home/b.lind/offsets/{rep}',
                                        dst_dir.split('/grad')[0])
            text[-2] += ' 1 ind'  # expect one RDS file and exclude pooled (non-default for MVP_02)
            
            assert 'MVP_03' in text[-1]
            text[-1] = text[-1].replace(f'/home/b.lind/offsets/{rep}/gradient_forests',
                                        dst_dir.split('/training')[0])
            text[-1] += ' 100 ind'  # expect 100 RDS files and exclude pooled (non-default for MVP_03)
            
            if 'partition' in text[5]:
                text.remove(text[5])
            
            # erase previous dependencies, will update after sbatching
            assert 'dependency' in text[6]
            text[6] = '#SBATCH --dependency=afterok:'
            
            assert 'mail' in text[7]
            text[7] = '#SBATCH --mail-user=dummy_email@gmail.com'  # new code, don't want 1e6 emails
            
            # write text to file
            with open(new_sh, 'w') as o:
                o.write('\n'.join(text))
                
                
dst_dir 

run_20220919_0-225: 100%|███████████████| 225/225 [00:07<00:00, 31.39it/s]
run_20220919_225-450: 100%|███████████████| 225/225 [00:05<00:00, 37.92it/s]
run_20220919_450-675: 100%|███████████████| 225/225 [00:06<00:00, 36.14it/s]


'/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files'

In [22]:
new_sh

'/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/fitting/fitting_shfiles/1231768_gf_fitting.sh'

In [23]:
text

['#!/bin/bash',
 '#SBATCH --job-name=1231768_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231768_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231768 /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231768 /home/b.lind/offsets/run_20220919_450-675/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests 100 ind']

In [24]:
read(new_sh)

['#!/bin/bash',
 '#SBATCH --job-name=1231768_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231768_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231768 /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231768 /home/b.lind/offsets/run_20220919_450-675/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests 100 ind']
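
The fitting cell above edits directives by position (`text[2]`, `text[6]`, `text[7]`) and guards each edit with an assert. Locating a directive by name is an alternative that does not depend on line order. The helpers below are hypothetical and only a sketch; they assume each `#SBATCH` option appears at most once and is written as `--option=value`.

```python
def set_directive(lines, option, value):
    """Set `#SBATCH --<option>=<value>` in a list of lines, located by name."""
    prefix = f'#SBATCH --{option}='
    matches = [i for i, line in enumerate(lines) if line.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f'expected one --{option} directive, found {len(matches)}')
    lines[matches[0]] = f'{prefix}{value}'
    return lines

def drop_directive(lines, option):
    """Return the lines without any `#SBATCH --<option>=...` directive."""
    prefix = f'#SBATCH --{option}='
    return [line for line in lines if not line.startswith(prefix)]

# Toy header with hypothetical values.
lines = [
    '#!/bin/bash',
    '#SBATCH --job-name=toy_gf_fitting',
    '#SBATCH --time=5-00:00:00',
    '#SBATCH --partition=short',
    '#SBATCH --dependency=afterok:123',
    '#SBATCH --mail-user=someone@example.com',
]

lines = drop_directive(lines, 'partition')
lines = set_directive(lines, 'time', '1-00:00:00')
lines = set_directive(lines, 'dependency', 'afterok:')  # placeholder, filled after sbatching
lines = set_directive(lines, 'mail-user', 'dummy_email@gmail.com')
```

Raising when a directive is missing keeps the behavior of the original asserts: a file with an unexpected layout stops the loop instead of being edited silently.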

<a id='submit500'></a>
# submit jobs using 500 loci

I have to be careful about which shfiles I submit since my code relies on non-duplicated job names in the slurm queue

[top](#home)
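
Because the sh files keep the same basename in every locus-count directory, submitting two sets at once would put identical job names in the queue. A quick pre-submission check could look like the sketch below, which assumes the job name equals the sh file's basename without its extension (the function name and paths are hypothetical):

```python
from collections import Counter
import os

def duplicated_job_names(shfiles):
    """Return job names (sh basenames without extension) that occur more than once."""
    names = [os.path.splitext(os.path.basename(sh))[0] for sh in shfiles]
    return sorted(name for name, count in Counter(names).items() if count > 1)

# Hypothetical paths: the same seed appears under two locus-count directories.
shfiles = [
    '/tmp/pooled/run_a/00500/training_shfiles/0000001_GF_training_pooled_all.sh',
    '/tmp/pooled/run_a/05000/training_shfiles/0000001_GF_training_pooled_all.sh',
    '/tmp/pooled/run_a/00500/training_shfiles/0000002_GF_training_pooled_all.sh',
]

dups = duplicated_job_names(shfiles)
```

An empty list means the batch is safe to submit under this assumption; the check says nothing about jobs already in the queue.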

### submit training files

In [25]:
pooled_dirs[rep]

['/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_files',
 '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_files',
 '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_files',
 '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files']

In [26]:
d_index = 0  # which dir I'm submitting

for rep, repdirs in pooled_dirs.items():
    print(repdirs[d_index])

/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/00500/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_files


In [27]:
# submit
jobnames = []
pids = defaultdict(dict)
for rep, repdirs in pooled_dirs.items():
    sh_dir = repdirs[d_index].replace('training_files', 'training_shfiles')
    
    shfiles = fs(sh_dir, endswith='.sh', exclude='watcher')
    
    assert len(shfiles) == 225
    
    for sh in pbar(shfiles, desc=rep):
        jobnames.append(op.basename(sh))
        seed = op.basename(sh).split("_")[0]
        pids[rep][seed] = sbatch(sh, progress_bar=False)

luni(jobnames), len(jobnames)

run_20220919_0-225: 100%|███████████████| 225/225 [00:39<00:00,  5.76it/s]
run_20220919_225-450: 100%|███████████████| 225/225 [00:43<00:00,  5.15it/s]
run_20220919_450-675: 100%|███████████████| 225/225 [00:45<00:00,  4.94it/s]


(675, 675)

In [28]:
fitting_shfiles = []
for (rep, seed), pid in unwrap_dictionary(pids, progress_bar=True):
    fitting_shdir = pooled_dirs[rep][d_index].replace('training/training_files', 'fitting/fitting_shfiles')
    fitting_sh = f'{fitting_shdir}/{seed}_gf_fitting.sh'

    text = read(fitting_sh)

    assert 'dependency' in text[6]
    text[6] = f'#SBATCH --dependency=afterok:{pid[0]}'

    with open(fitting_sh, 'w') as o:
        o.write('\n'.join(text))

    fitting_shfiles.append(fitting_sh)

print(len(fitting_shfiles))

text

100%|███████████████| 3/3 [00:03<00:00,  1.19s/it]

675





['#!/bin/bash',
 '#SBATCH --job-name=1231768_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231768_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:40707672',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231768 /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231768 /home/b.lind/offsets/run_20220919_450-675/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests 100 ind']
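
The cell above fills each fitting job's `afterok` dependency with the job ID returned when its training job was submitted. How the `sbatch` helper from `pythonimports` obtains that ID is not shown here. As a sketch of the idea, the hypothetical functions below parse the message that slurm's `sbatch` prints on success and build the directive, refusing to write one without an ID:

```python
import re

def parse_job_id(sbatch_stdout):
    """Extract the numeric job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r'Submitted batch job (\d+)', sbatch_stdout)
    if match is None:
        raise ValueError(f'no job ID found in: {sbatch_stdout!r}')
    return match.group(1)

def dependency_line(job_ids):
    """Build an afterok dependency directive; multiple IDs are colon-separated."""
    if not job_ids:
        raise ValueError('at least one job ID is required')
    return '#SBATCH --dependency=afterok:' + ':'.join(str(j) for j in job_ids)

job_id = parse_job_id('Submitted batch job 40707672\n')
line = dependency_line([job_id])
```

With `afterok`, the fitting job starts only if the training job exits successfully, so a failed training run leaves its fitting job pending rather than running on missing output.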

In [29]:
fitting_shfiles[0]

'/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/fitting/fitting_shfiles/1231094_gf_fitting.sh'

In [30]:
read(fitting_shfiles[0])

['#!/bin/bash',
 '#SBATCH --job-name=1231094_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231094_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:40706964',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231094 /work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231094 /home/b.lind/offsets/run_20220919_0-225/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests 100 ind']

In [31]:
fitting_shfiles[0]

'/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/fitting/fitting_shfiles/1231094_gf_fitting.sh'

In [32]:
fitting_pids = sbatch(fitting_shfiles)

sbatching: 100%|███████████████| 675/675 [03:00<00:00,  3.74it/s]


In [33]:
Squeue(grepping='fit')

🗒️  Queue Summary:
{'short': {'PD': 675}}

In [34]:
Squeue(grepping='train')

🗒️  Queue Summary:
{'long': {'PD': 383, 'R': 24},
 'short': {'PD': 122, 'R': 39}}

In [35]:
Squeue(grepping='fit', partition='short').update(to_partition='long', num_jobs=0.5)

update: 100%|███████████████| 338/338 [00:24<00:00, 13.96it/s]


In [36]:
Squeue(grepping='fit')

🗒️  Queue Summary:
{'long': {'PD': 338},
 'short': {'PD': 337}}

In [37]:
Squeue(grepping='train')

🗒️  Queue Summary:
{'long': {'PD': 379, 'R': 21},
 'short': {'PD': 101, 'R': 48}}

In [42]:
Squeue(grepping=['train', '10000'], partition='long').update(to_partition='lotterhos', num_jobs=30)

update: 100%|███████████████| 30/30 [00:02<00:00, 14.32it/s]
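
The `Squeue(...).update(...)` calls above move pending jobs between partitions to spread the load. The internals of that helper are not shown here; the standard slurm command for changing a job's partition is `scontrol update`. The sketch below only builds the command lists and does not run them. The function name is hypothetical, and rounding a fractional `num_jobs` up is an assumption (it happens to match the 338 of 675 jobs moved above).

```python
import math

def partition_update_commands(job_ids, to_partition, num_jobs):
    """Build `scontrol update` commands that move jobs to another partition.

    `num_jobs` may be an integer count or a fraction in (0, 1]; rounding the
    fraction up is an assumption made for this sketch.
    """
    if isinstance(num_jobs, float) and 0 < num_jobs <= 1:
        n = math.ceil(len(job_ids) * num_jobs)
    else:
        n = int(num_jobs)
    return [
        ['scontrol', 'update', f'JobId={job_id}', f'Partition={to_partition}']
        for job_id in job_ids[:n]
    ]

# Hypothetical pending job IDs.
pending = [str(40700000 + i) for i in range(675)]
commands = partition_update_commands(pending, 'long', 0.5)
```

Each inner list could be passed to `subprocess.run` on a machine with slurm installed.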


In [43]:
Squeue()

🗒️  Queue Summary:
{'long': {'PD': 639, 'R': 18},
 'lotterhos': {'PD': 36, 'R': 18},
 'short': {'PD': 425, 'R': 33}}

In [44]:
pooled_dirs

defaultdict(list,
            {'run_20220919_0-225': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files',
              '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files',
              '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files',
              '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files'],
             'run_20220919_225-450': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/00500/gradient_forests/training/training_files',
              '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/05000/gradient_forests/training/training_files',
              '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/10000/gradient_forests/training/training_files',
              '/work/lotterhos/brandon/pooled_runtimes/run_20220919_

<a id='submit5000'></a>
# submit jobs using 5000 loci
[top](#home)

In [1]:
from pythonimports import *

import MVP_summary_functions as mvp

lview, dview = get_client(cluster_id='1707154768-urwa', profile='lotterhos')

outerdir = '/work/lotterhos/brandon/ind_runtimes'
pooled_dir = '/work/lotterhos/brandon/pooled_runtimes'

mvp.latest_commit()
session_info.show()

36 36
#########################################################
Today:	February 05, 2024 - 16:16:37 EST
python version: 3.8.5
conda env: mvp_env

Current commit of pythonimports:
commit 419895d157c97717f835390196c13cf973d25eba  
Merge: e20434f 1e09b6c  
Author: Brandon Lind <lind.brandon.m@gmail.com>

Current commit of MVP_offsets:
commit 8b790072e7a46d7f58a30c40cf4660986a612764  
Author: Brandon Lind <lind.brandon.m@gmail.com>  
Date:   Fri Feb 2 13:55:49 2024 -0500
#########################################################



In [3]:
pooled_dirs = {
    'run_20220919_0-225': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files'],
    
    'run_20220919_225-450': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/00500/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/05000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/10000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/20000/gradient_forests/training/training_files'],
    
    'run_20220919_450-675': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files']
}

In [4]:
d_index = 1  # which dir I'm submitting

for rep, repdirs in pooled_dirs.items():
    print(repdirs[d_index])

/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/05000/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_files


In [5]:
# submit
jobnames = []
pids = defaultdict(dict)
for rep, repdirs in pooled_dirs.items():
    sh_dir = repdirs[d_index].replace('training_files', 'training_shfiles')
    
    shfiles = fs(sh_dir, endswith='.sh', exclude='watcher')
    
    assert len(shfiles) == 225
    
    for sh in pbar(shfiles, desc=rep):
        jobnames.append(op.basename(sh))
        seed = op.basename(sh).split("_")[0]
        pids[rep][seed] = sbatch(sh, progress_bar=False)

luni(jobnames), len(jobnames)

run_20220919_0-225: 100%|███████████████| 225/225 [00:49<00:00,  4.57it/s]
run_20220919_225-450: 100%|███████████████| 225/225 [00:47<00:00,  4.76it/s]
run_20220919_450-675: 100%|███████████████| 225/225 [00:47<00:00,  4.77it/s]


(675, 675)

In [6]:
fitting_shfiles = []
for (rep, seed), pid in unwrap_dictionary(pids, progress_bar=True):
    fitting_shdir = pooled_dirs[rep][d_index].replace('training/training_files', 'fitting/fitting_shfiles')
    fitting_sh = f'{fitting_shdir}/{seed}_gf_fitting.sh'

    text = read(fitting_sh)

    assert 'dependency' in text[6]
    text[6] = f'#SBATCH --dependency=afterok:{pid[0]}'

    with open(fitting_sh, 'w') as o:
        o.write('\n'.join(text))

    fitting_shfiles.append(fitting_sh)

print(len(fitting_shfiles))

text

100%|███████████████| 3/3 [00:04<00:00,  1.61s/it]

675





['#!/bin/bash',
 '#SBATCH --job-name=1231768_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231768_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:40709572',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231768 /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231768 /home/b.lind/offsets/run_20220919_450-675/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests 100 ind']

In [7]:
pids = sbatch(fitting_shfiles)

sbatching: 100%|███████████████| 675/675 [02:42<00:00,  4.14it/s]


<a id='submit10k'></a>
# submit jobs using 10000 loci

[top](#home)

In [5]:
from pythonimports import *

import MVP_summary_functions as mvp

outerdir = '/work/lotterhos/brandon/ind_runtimes'
pooled_dir = '/work/lotterhos/brandon/pooled_runtimes'

mvp.latest_commit()
session_info.show()

#########################################################
Today:	February 07, 2024 - 14:24:37 EST
python version: 3.8.5
conda env: mvp_env

Current commit of pythonimports:
commit 419895d157c97717f835390196c13cf973d25eba  
Merge: e20434f 1e09b6c  
Author: Brandon Lind <lind.brandon.m@gmail.com>

Current commit of MVP_offsets:
commit 8b790072e7a46d7f58a30c40cf4660986a612764  
Author: Brandon Lind <lind.brandon.m@gmail.com>  
Date:   Fri Feb 2 13:55:49 2024 -0500
#########################################################



In [6]:
pooled_dirs = {
    'run_20220919_0-225': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files'],
    
    'run_20220919_225-450': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/00500/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/05000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/10000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/20000/gradient_forests/training/training_files'],
    
    'run_20220919_450-675': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files']
}

In [7]:
d_index = 2  # position in each rep's dir list to submit (2 = the 10000-loci dir)

for rep, repdirs in pooled_dirs.items():
    print(repdirs[d_index])

/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/10000/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_files
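
Instead of hard-coding `d_index`, the position could be looked up from the loci count embedded in each path. The following is a minimal sketch, assuming the zero-padded loci directory (e.g. `10000`) appears as its own path component; `index_for_loci` is a hypothetical helper and not part of `pythonimports`.

```python
def index_for_loci(repdirs, num_loci):
    """Return the position in `repdirs` whose path has the zero-padded loci dir as a component."""
    tag = f'{num_loci:05d}'  # e.g. 500 -> '00500'
    matches = [i for i, d in enumerate(repdirs) if tag in d.split('/')]
    if len(matches) != 1:
        raise ValueError(f'expected exactly one dir for {tag}, found {len(matches)}')
    return matches[0]


# paths copied from the pooled_dirs entry for run_20220919_0-225
example_dirs = [
    '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files',
    '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files',
    '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files',
    '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files',
]

example_index = index_for_loci(example_dirs, 10000)
```

Raising when zero or several directories match keeps a typo in the loci count from silently selecting the wrong set of jobs.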


In [8]:
# submit the training jobs; keep what sbatch returns for each seed so the fitting jobs can depend on it
jobnames = []
pids = defaultdict(dict)
for rep, repdirs in pooled_dirs.items():
    sh_dir = repdirs[d_index].replace('training_files', 'training_shfiles')
    
    shfiles = fs(sh_dir, endswith='.sh', exclude='watcher')
    
    assert len(shfiles) == 225
    
    for sh in pbar(shfiles, desc=rep):
        jobnames.append(op.basename(sh))
        seed = op.basename(sh).split("_")[0]
        pids[rep][seed] = sbatch(sh, progress_bar=False)

luni(jobnames), len(jobnames)

run_20220919_0-225: 100%|███████████████| 225/225 [00:37<00:00,  5.99it/s]
run_20220919_225-450: 100%|███████████████| 225/225 [00:42<00:00,  5.34it/s]
run_20220919_450-675: 100%|███████████████| 225/225 [00:46<00:00,  4.88it/s]


(675, 675)
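
The job IDs stored in `pids` come from the `sbatch` wrapper in `pythonimports`, whose internals are not shown in this notebook. As a rough illustration only: Slurm's `sbatch` command prints `Submitted batch job <id>` on success, so an ID could be pulled from that message as sketched below. `parse_jobid` is a hypothetical helper.

```python
import re


def parse_jobid(sbatch_stdout):
    """Extract the numeric job ID (as a string) from sbatch's success message."""
    match = re.search(r'Submitted batch job (\d+)', sbatch_stdout)
    if match is None:
        raise ValueError(f'no job ID found in: {sbatch_stdout!r}')
    return match.group(1)


# example message in the format Slurm prints after a successful submission
example_jobid = parse_jobid('Submitted batch job 40735645\n')
```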

In [9]:
fitting_shfiles = []
for (rep, seed), pid in unwrap_dictionary(pids, progress_bar=True):
    fitting_shdir = pooled_dirs[rep][d_index].replace('training/training_files', 'fitting/fitting_shfiles')
    fitting_sh = f'{fitting_shdir}/{seed}_gf_fitting.sh'

    text = read(fitting_sh)

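    # line 7 of each fitting script is the dependency line; point it at the
    # training job so fitting starts only after training exits successfully
    assert 'dependency' in text[6]
    text[6] = f'#SBATCH --dependency=afterok:{pid[0]}'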

    with open(fitting_sh, 'w') as o:
        o.write('\n'.join(text))

    fitting_shfiles.append(fitting_sh)

print(len(fitting_shfiles))

text

100%|███████████████| 3/3 [00:05<00:00,  1.86s/it]

675





['#!/bin/bash',
 '#SBATCH --job-name=1231768_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231768_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:40735645',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231768 /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231768 /home/b.lind/offsets/run_20220919_450-675/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests 100 ind']
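
The cell above relies on the dependency directive sitting at index 6 of every fitting script. A slightly more defensive variant could locate the line by its content instead. This is a sketch under that assumption; `set_dependency` is a hypothetical helper, and the example header is abbreviated from the output above.

```python
def set_dependency(lines, jobid):
    """Return a copy of `lines` with the single '#SBATCH --dependency' line set to afterok:<jobid>."""
    hits = [i for i, line in enumerate(lines) if line.startswith('#SBATCH --dependency')]
    if len(hits) != 1:
        raise ValueError(f'expected exactly one dependency line, found {len(hits)}')
    updated = list(lines)  # leave the caller's list untouched
    updated[hits[0]] = f'#SBATCH --dependency=afterok:{jobid}'
    return updated


# abbreviated header; the placeholder job ID 0 is made up for illustration
header = [
    '#!/bin/bash',
    '#SBATCH --job-name=1231768_gf_fitting',
    '#SBATCH --dependency=afterok:0',
    '#SBATCH --nodes=1',
]

new_header = set_dependency(header, 40735645)
```

Searching for the directive means the rewrite still works if another `#SBATCH` line is later added above it, and the explicit count check plays the same role as the `assert` in the original cell.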

In [10]:
pids = sbatch(fitting_shfiles)  # expect many cancelled jobs: some 5k fitting jobs are still in the queue

sbatching:   9%|█▎             | 60/675 [00:12<02:07,  4.82it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 15.66it/s]
sbatching:  42%|██████▎        | 285/675 [01:03<01:31,  4.28it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 15.13it/s]
sbatching:  42%|██████▎        | 286/675 [01:03<01:36,  4.03it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.62it/s]
sbatching:  43%|██████▍        | 289/675 [01:04<01:37,  3.96it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.45it/s]
sbatching:  43%|██████▍        | 290/675 [01:04<01:43,  3.72it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 15.50it/s]
sbatching:  44%|██████▌        | 294/675 [01:05<01:37,  3.93it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.80it/s]
sbatching:  44%|██████▋        | 300/675 [01:07<01:24,  4.44it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.59it/s]
sbatching:  45%|██████▋        | 301/675 [01:07<01:31,  4.10it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.2

scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.32it/s]
sbatching:  63%|█████████▍     | 426/675 [01:46<01:27,  2.85it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 15.30it/s]
sbatching:  63%|█████████▍     | 427/675 [01:46<01:21,  3.05it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 10.43it/s]
sbatching:  63%|█████████▌     | 428/675 [01:46<01:18,  3.13it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 15.28it/s]
sbatching:  64%|█████████▌     | 429/675 [01:47<01:18,  3.12it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.67it/s]
sbatching:  64%|█████████▌     | 430/675 [01:47<01:15,  3.23it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.52it/s]
sbatching:  64%|█████████▌     | 431/675 [01:47<01:12,  3.36it/s]
scancel: 100%|███████████████| 1/1 [00:00<00:00, 14.45it/s]
sbatching:  64%|█████████▌     | 432/675 [01:48<01:00,  3.99it/s]


KeyboardInterrupt: 

In [11]:
Squeue(grepping='train')

[1m[38;2;128;128;128m🗒️  Queue Summary:
[0m[0m
{'[4m[1mlong[0m[0m': {'PD': 433},
 '[4m[1mshort[0m[0m': {'PD': 235, 'R': 7}}

In [12]:
Squeue(grepping='fit')

[1m[38;2;128;128;128m🗒️  Queue Summary:
[0m[0m
{'[4m[1mlong[0m[0m': {'PD': 220, 'R': 1},
 '[4m[1mshort[0m[0m': {'PD': 320}}

In [15]:
Squeue(partition='short', grepping='train').update(to_partition='lotterhos', num_jobs=9)

update: 100%|███████████████| 9/9 [00:00<00:00, 15.10it/s]
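
`Squeue(...).update(...)` is a `pythonimports` helper whose implementation is not shown in this notebook. With plain Slurm, a pending job can be moved to another partition with `scontrol update`, so an equivalent step might look like the sketch below. The function name and the job IDs are hypothetical, and the commands are only built as strings, not executed.

```python
def partition_update_cmds(jobids, to_partition, num_jobs=None):
    """Build one 'scontrol update' command per job, for at most `num_jobs` jobs."""
    selected = list(jobids)[:num_jobs]  # a slice end of None keeps every job
    return [f'scontrol update JobId={jobid} Partition={to_partition}' for jobid in selected]


# made-up job IDs for illustration
example_cmds = partition_update_cmds(['101', '102', '103'], 'lotterhos', num_jobs=2)
```

Because a slice that runs past the end of a list is simply clipped, asking for more jobs than are queued returns commands for all of them rather than raising.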


In [22]:
# sanity check: a slice end past the list length is clipped rather than raising
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9][3:1000]

[3, 4, 5, 6, 7, 8, 9]

In [23]:
Squeue()

[1m[38;2;128;128;128m🗒️  Queue Summary:
[0m[0m
{'[4m[1mlong[0m[0m': {'PD': 653, 'R': 1},
 '[4m[1mlotterhos[0m[0m': {'PD': 9, 'R': 2},
 '[4m[1mshort[0m[0m': {'PD': 530, 'R': 22}}

<a id='submit20k'></a>
# submit jobs using 20000 loci

[top](#home)

In [1]:
from pythonimports import *

import MVP_summary_functions as mvp

outerdir = '/work/lotterhos/brandon/ind_runtimes'
pooled_dir = '/work/lotterhos/brandon/pooled_runtimes'

mvp.latest_commit()
session_info.show()

#########################################################
Today:	February 12, 2024 - 09:03:56 EST
python version: 3.8.5
conda env: mvp_env

Current commit of [1mpythonimports[0m:
[33mcommit 419895d157c97717f835390196c13cf973d25eba[m  
Merge: e20434f 1e09b6c  
Author: Brandon Lind <lind.brandon.m@gmail.com>

Current commit of [94m[1mMVP_offsets[0m[0m:
[33mcommit 8b790072e7a46d7f58a30c40cf4660986a612764[m  
Author: Brandon Lind <lind.brandon.m@gmail.com>  
Date:   Fri Feb 2 13:55:49 2024 -0500
#########################################################



In [2]:
pooled_dirs = {
    'run_20220919_0-225': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/00500/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/05000/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/10000/gradient_forests/training/training_files',
                           '/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files'],
    
    'run_20220919_225-450': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/00500/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/05000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/10000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/20000/gradient_forests/training/training_files'],
    
    'run_20220919_450-675': ['/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/00500/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/05000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/10000/gradient_forests/training/training_files',
                             '/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files']
}

In [3]:
d_index = 3  # position in each rep's dir list to submit (3 = the 20000-loci dir)

for rep, repdirs in pooled_dirs.items():
    print(repdirs[d_index])

/work/lotterhos/brandon/pooled_runtimes/run_20220919_0-225/20000/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_225-450/20000/gradient_forests/training/training_files
/work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_files


In [4]:
# submit the training jobs; keep what sbatch returns for each seed so the fitting jobs can depend on it
jobnames = []
pids = defaultdict(dict)
for rep, repdirs in pooled_dirs.items():
    sh_dir = repdirs[d_index].replace('training_files', 'training_shfiles')
    
    shfiles = fs(sh_dir, endswith='.sh', exclude='watcher')
    
    assert len(shfiles) == 225
    
    for sh in pbar(shfiles, desc=rep):
        jobnames.append(op.basename(sh))
        seed = op.basename(sh).split("_")[0]
        pids[rep][seed] = sbatch(sh, progress_bar=False)

luni(jobnames), len(jobnames)

run_20220919_0-225: 100%|███████████████| 225/225 [00:44<00:00,  5.03it/s]
run_20220919_225-450: 100%|███████████████| 225/225 [00:46<00:00,  4.81it/s]
run_20220919_450-675: 100%|███████████████| 225/225 [00:51<00:00,  4.41it/s]


(675, 675)

In [6]:
fitting_shfiles = []
for (rep, seed), pid in unwrap_dictionary(pids, progress_bar=True):
    fitting_shdir = pooled_dirs[rep][d_index].replace('training/training_files', 'fitting/fitting_shfiles')
    fitting_sh = f'{fitting_shdir}/{seed}_gf_fitting.sh'

    text = read(fitting_sh)

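    # line 7 of each fitting script is the dependency line; point it at the
    # training job so fitting starts only after training exits successfully
    assert 'dependency' in text[6]
    text[6] = f'#SBATCH --dependency=afterok:{pid[0]}'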

    with open(fitting_sh, 'w') as o:
        o.write('\n'.join(text))

    fitting_shfiles.append(fitting_sh)

print(len(fitting_shfiles))

text

100%|███████████████| 3/3 [00:05<00:00,  1.74s/it]

675





['#!/bin/bash',
 '#SBATCH --job-name=1231768_gf_fitting',
 '#SBATCH --time=1-00:00:00',
 '#SBATCH --ntasks=1',
 '#SBATCH --mem=300000M',
 '#SBATCH --output=1231768_gf_fitting_%j.out',
 '#SBATCH --dependency=afterok:40811756',
 '#SBATCH --mail-user=dummy_email@gmail.com',
 '#SBATCH --mail-type=FAIL',
 '#SBATCH --nodes=1',
 '#SBATCH --cpus-per-task=7',
 'cd /home/b.lind/code/MVP-offsets/01_src',
 'source $HOME/.bashrc',
 'conda activate mvp_env',
 'python MVP_02_fit_gradient_forests.py 1231768 /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests/training/training_outfiles /home/b.lind/anaconda3/envs/r35/lib/R/bin/Rscript 1 ind',
 'python MVP_03_validate_gradient_forests.py 1231768 /home/b.lind/offsets/run_20220919_450-675/slimdir /work/lotterhos/brandon/pooled_runtimes/run_20220919_450-675/20000/gradient_forests 100 ind']