**Set environment**

In [1]:
suppressMessages(suppressWarnings(source("../run_config_project_sing.R")))
show_env()

You are working on        Singularity: singularity_proj_encode_fcc 
BASE DIRECTORY (FD_BASE): /data/reddylab/Kuei 
REPO DIRECTORY (FD_REPO): /data/reddylab/Kuei/repo 
WORK DIRECTORY (FD_WORK): /data/reddylab/Kuei/work 
DATA DIRECTORY (FD_DATA): /data/reddylab/Kuei/data 

You are working with      ENCODE FCC 
PATH OF PROJECT (FD_PRJ): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC 
PROJECT RESULTS (FD_RES): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results 
PROJECT SCRIPTS (FD_EXE): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/scripts 
PROJECT DATA    (FD_DAT): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/data 
PROJECT NOTE    (FD_NBK): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/notebooks 
PROJECT DOCS    (FD_DOC): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/docs 
PROJECT LOG     (FD_LOG): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/log 
PROJECT REF     (FD_REF): /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/references 



## Import, arrange, and save data

**Helper function for labeling regions by their TSS proximity**

In [2]:
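### label each region by its distance to the closest TSS:
### "Proximal" if Distance <= 2000 bp, otherwise "Distal" (NA stays NA)
fun_label_by_tss_proximity = function(vec_num_distance){
    vec_txt_label = ifelse(
        vec_num_distance <= 2000,
        "Proximal",
        "Distal"
    )
    return(vec_txt_label)
}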
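
As a quick check of the helper above, the sketch below applies the same 2000 bp threshold to a small distance vector. The distances and the names `vec_num_example` and `vec_txt_example` are hypothetical and only serve as illustration; the function is repeated so the cell runs on its own. The example assumes unsigned distances: with the `<=` comparison, a negative (signed) distance would always be labeled "Proximal".

```r
# Same threshold rule as the helper above, repeated so this cell is self-contained
fun_label_by_tss_proximity = function(vec_num_distance){
    ifelse(vec_num_distance <= 2000, "Proximal", "Distal")
}

# Hypothetical distances in bp, including the boundary value and a missing value
vec_num_example = c(0, 1500, 2000, 2001, 50000, NA)
vec_txt_example = fun_label_by_tss_proximity(vec_num_example)

# The boundary value 2000 is labeled "Proximal"; NA is propagated as NA
print(vec_txt_example)
```

If signed distances are present in the input, wrapping the argument in `abs()` before the comparison would be one way to handle them; whether that applies here depends on how `Distance` was computed upstream.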

**Process each region set**

In [3]:
### init
vec_txt_region_label = c(
    "fcc_astarr_macs_input_overlap",
    "fcc_astarr_macs_input_union"
)

### loop through each region pair file
for (txt_region_label in vec_txt_region_label){
    
    ### set file directory
    txt_fdiry = file.path(
        FD_RES, 
        "region_closest",
        txt_region_label,
        "summary"
    )
    txt_fname = "region.pair.genome_tss.tsv"
    txt_fpath = file.path(txt_fdiry, txt_fname)
    dat_region_pair = read_tsv(txt_fpath, show_col_types = FALSE)

    ### select and rename columns
    dat = dat_region_pair
    dat = dat %>% 
        dplyr::filter(Annotation_B == "genome_tss_pol2_rnaseq") %>%
        dplyr::select(
            Chrom      = Chrom_A, 
            ChromStart = ChromStart_A, 
            ChromEnd   = ChromEnd_A, 
            Region     = Region_A,
            Annotation_A, Annotation_B,
            Distance
        ) %>% 
        dplyr::distinct()
    dat_region_rename = dat
    
    ### add label
    dat = dat_region_rename
    dat = dat %>%
        dplyr::mutate(
            Label_TSS_Proximity = fun_label_by_tss_proximity(Distance)
        )
    dat_region_summary = dat

    ### write table
    txt_fname = "region.summary.genome_tss.tsv"
    txt_fpath = file.path(txt_fdiry, txt_fname)
    dat = dat_region_summary
    write_tsv(dat, txt_fpath)

    ### show progress
    cat("Save:", txt_fpath, "\n")
    cat("\n")
}

Save: /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results/region_closest/fcc_astarr_macs_input_overlap/summary/region.summary.genome_tss.tsv 

Save: /data/reddylab/Kuei/repo/Proj_ENCODE_FCC/results/region_closest/fcc_astarr_macs_input_union/summary/region.summary.genome_tss.tsv 
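
A simple sanity check after saving is to read a summary table back and count regions per label. The sketch below does this on a toy table written to a temporary file rather than on the real results; all values, the `Region` naming format, and the names `dat_toy`, `txt_fpath_toy`, `dat_check`, and `dat_count` are hypothetical. It assumes `readr` and `dplyr` are installed, as the loop above already relies on them.

```r
library(readr)
library(dplyr)

# Toy stand-in for one region.summary.genome_tss.tsv file (values are hypothetical)
dat_toy = tibble(
    Chrom      = c("chr1", "chr1", "chr2"),
    ChromStart = c(1000, 50000, 2000),
    ChromEnd   = c(1500, 50600, 2400),
    Region     = c("chr1:1000-1500", "chr1:50000-50600", "chr2:2000-2400"),
    Distance   = c(0, 35000, 1800),
    Label_TSS_Proximity = c("Proximal", "Distal", "Proximal")
)

# Round-trip through a temporary TSV, mirroring write_tsv/read_tsv in the loop
txt_fpath_toy = tempfile(fileext = ".tsv")
write_tsv(dat_toy, txt_fpath_toy)
dat_check = read_tsv(txt_fpath_toy, show_col_types = FALSE)

# Count regions per TSS proximity label
dat_count = dat_check %>% dplyr::count(Label_TSS_Proximity)
print(dat_count)
```

For the real output, replacing `txt_fpath_toy` with one of the saved paths printed above should give the Proximal/Distal breakdown for that region set.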

