**Set environment**

In [1]:
suppressMessages(suppressWarnings(source("../run_config_project_sing.R")))
show_env()

You are working on        Singularity 
BASE DIRECTORY (FD_BASE): /mount 
REPO DIRECTORY (FD_REPO): /mount/repo 
WORK DIRECTORY (FD_WORK): /mount/work 
DATA DIRECTORY (FD_DATA): /mount/data 

You are working with      ENCODE FCC 
PATH OF PROJECT (FD_PRJ): /mount/repo/Proj_ENCODE_FCC 
PROJECT RESULTS (FD_RES): /mount/repo/Proj_ENCODE_FCC/results 
PROJECT SCRIPTS (FD_EXE): /mount/repo/Proj_ENCODE_FCC/scripts 
PROJECT DATA    (FD_DAT): /mount/repo/Proj_ENCODE_FCC/data 
PROJECT NOTE    (FD_NBK): /mount/repo/Proj_ENCODE_FCC/notebooks 
PROJECT DOCS    (FD_DOC): /mount/repo/Proj_ENCODE_FCC/docs 
PROJECT LOG     (FD_LOG): /mount/repo/Proj_ENCODE_FCC/log 
PROJECT APP     (FD_APP): /mount/repo/Proj_ENCODE_FCC/app 
PROJECT REF     (FD_REF): /mount/repo/Proj_ENCODE_FCC/references 



**Set global variable**

In [2]:
TXT_FOLDER_INP = "fcc_crispri_hcrff"
TXT_FOLDER_OUT = "fcc_table"

## Import data

In [3]:
### set directory
txt_folder = TXT_FOLDER_INP
txt_fdiry  = file.path(FD_RES, "region", txt_folder)
dir(txt_fdiry)

In [4]:
### set file path
txt_folder = TXT_FOLDER_INP
txt_fdiry  = file.path(FD_RES, "region", txt_folder, "summary")
txt_fname  = "description.tsv"
txt_fpath  = file.path(txt_fdiry, txt_fname)

### read table
dat = read_tsv(txt_fpath, show_col_types = FALSE)

### assign and show
dat_cnames = dat
fun_display_table(dat)

Name,Note
Chrom,Name of the chromosome
ChromStart,The starting position of the feature in the chromosome
ChromEnd,The ending position of the feature in the chromosome
Name,Region location
Score,CASA peak score
Strand,Defines the strand. Either '.' (=no strand) or '+' or '-'.
Gene_Symbol,Gene symbol; Gene that is screened for CRISPRi-FlowFish
Gene_Ensembl,Gene Ensembl ID; Gene that is screened for CRISPRi-FlowFish
Group,Assay Name
Label,Region Label; {Assay Name}:{Tested Gene Name}


In [5]:
### set file path
txt_folder = TXT_FOLDER_INP
txt_fdiry  = file.path(FD_RES, "region", txt_folder)
txt_fname  = "K562.hg38.CRISPRi_HCRFF.CASA.bed.gz"
txt_fpath  = file.path(txt_fdiry, txt_fname)

### read table
vec = dat_cnames$Name
dat = read_tsv(txt_fpath, col_names = vec, show_col_types = FALSE)

### assign and show
dat_region_import = dat
print(dim(dat))
fun_display_table(head(dat, 3))

[1] 113  10


Chrom,ChromStart,ChromEnd,Name,Score,Strand,Gene_Symbol,Gene_Ensembl,Group,Label
chr11,5248847,5249047,chr11:5248847-5249047,1.068624,.,HBG1,ENST00000330597.5,CRISPRi-HCRFF,CRISPRi-HCRFF:HBG1
chr11,5248847,5249047,chr11:5248847-5249047,0.9357701,.,HBG2,ENST00000336906.6,CRISPRi-HCRFF,CRISPRi-HCRFF:HBG2
chr11,5249847,5250847,chr11:5249847-5250847,1.8908899,.,HBG1,ENST00000330597.5,CRISPRi-HCRFF,CRISPRi-HCRFF:HBG1
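
Because the BED file has no header row, `read_tsv(..., col_names = vec)` relies on the description table listing exactly one name per column. A minimal sketch of a pre-check, assuming a tab-delimited file: it writes the first preview row above to a temporary file and compares its field count with the number of names. `vec_cnames`, `txt_fpath_tmp`, and `num_fields` are illustrative names, not part of the project code, and base R is used so the sketch runs without the project configuration.

```r
# Column names as listed in description.tsv (copied from the table above)
vec_cnames = c(
    "Chrom", "ChromStart", "ChromEnd", "Name", "Score",
    "Strand", "Gene_Symbol", "Gene_Ensembl", "Group", "Label"
)

# Toy stand-in for the real BED file: the first preview row, tab-delimited
txt_line = paste(
    "chr11", "5248847", "5249047", "chr11:5248847-5249047", "1.068624",
    ".", "HBG1", "ENST00000330597.5", "CRISPRi-HCRFF", "CRISPRi-HCRFF:HBG1",
    sep = "\t"
)
txt_fpath_tmp = tempfile(fileext = ".bed")
writeLines(txt_line, txt_fpath_tmp)

# Count the tab-separated fields in the first line of the file
txt_first  = readLines(txt_fpath_tmp, n = 1)
num_fields = length(strsplit(txt_first, "\t", fixed = TRUE)[[1]])

# Stop early if the names and the file disagree
stopifnot(num_fields == length(vec_cnames))
print(num_fields)
```

For the real input, the same check could be pointed at `txt_fpath`; `readLines` reads gzip-compressed files when given a `gzfile()` connection.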


## Arrange table

In [6]:
### get table
dat = dat_region_import
vec = c(
    "Chrom", "ChromStart", "ChromEnd", "Group", "Label",
    "Assay", "Region", "Target", "Score", "NLog10P",
    "Method", "Source"
)

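### add annotation columns, then keep only the columns in vec (in that order)
dat = dat %>%
    dplyr::mutate(
        Assay   = "CRISPRi-HCR FlowFISH",
        Region  = fun_gen_region(Chrom, ChromStart, ChromEnd),
        Target  = Gene_Symbol,
        NLog10P = NA_real_,  # no p-value in this input; typed NA keeps the column numeric
        Method  = "CASA",
        Source  = "Riley Lab"
    ) %>%
    dplyr::select(dplyr::all_of(vec))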

dat_region_arrange = dat
print(dim(dat))
fun_display_table(head(dat, 3))

[1] 113  12


Chrom,ChromStart,ChromEnd,Group,Label,Assay,Region,Target,Score,NLog10P,Method,Source
chr11,5248847,5249047,CRISPRi-HCRFF,CRISPRi-HCRFF:HBG1,CRISPRi-HCR FlowFISH,chr11:5248847-5249047,HBG1,1.068624,,CASA,Riley Lab
chr11,5248847,5249047,CRISPRi-HCRFF,CRISPRi-HCRFF:HBG2,CRISPRi-HCR FlowFISH,chr11:5248847-5249047,HBG2,0.9357701,,CASA,Riley Lab
chr11,5249847,5250847,CRISPRi-HCRFF,CRISPRi-HCRFF:HBG1,CRISPRi-HCR FlowFISH,chr11:5249847-5250847,HBG1,1.8908899,,CASA,Riley Lab
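
`fun_gen_region` comes from the sourced project configuration and is not shown in this notebook. Judging from the `Region` values in the preview above, it appears to build `chrom:start-end` strings. A minimal sketch under that assumption follows; `fun_gen_region_sketch` is a hypothetical stand-in, not the project's helper, and it assumes integer-valued coordinates.

```r
# Hypothetical stand-in for the project's fun_gen_region helper.
# Coordinates are cast to integer so large positions never print
# in scientific notation (e.g. 1e+05).
fun_gen_region_sketch = function(chrom, chrom_start, chrom_end) {
    sprintf("%s:%d-%d", chrom, as.integer(chrom_start), as.integer(chrom_end))
}

# Coordinates taken from the preview rows above
vec_region = fun_gen_region_sketch(
    c("chr11", "chr11"),
    c(5248847, 5249847),
    c(5249047, 5250847)
)
print(vec_region)
```

The function is vectorized over its arguments, so it can be called inside `dplyr::mutate` in the same way as the original helper.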


## Export results

In [7]:
### set file path
txt_folder = TXT_FOLDER_OUT
txt_fdiry  = file.path(FD_RES, "region", txt_folder)
txt_fname  = "K562.hg38.fcc_crispri_hcrff.bed.gz"
txt_fpath  = file.path(txt_fdiry, txt_fname)

### set table
dat = dat_region_arrange
dat = dat %>% dplyr::arrange(Chrom, ChromStart, ChromEnd)

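### write table (create the output folder first if it does not exist)
dir.create(txt_fdiry, recursive = TRUE, showWarnings = FALSE)
write_tsv(dat, txt_fpath, col_names = FALSE)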
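
Since the table is written without a header, the column order is the only schema a downstream reader has. A minimal round-trip sketch, assuming a small toy table with rows copied from the preview above: it sorts, writes a gzip-compressed headerless TSV to a temporary file, and reads it back with the names supplied explicitly. Base R (`gzfile`, `write.table`, `read.delim`) is used here as a stand-in for `write_tsv`/`read_tsv` so the sketch runs on its own; `dat_toy`, `dat_back`, and `txt_fpath_tmp` are illustrative names.

```r
# Toy table standing in for dat_region_arrange (subset of columns)
dat_toy = data.frame(
    Chrom      = c("chr11", "chr11"),
    ChromStart = c(5249847L, 5248847L),
    ChromEnd   = c(5250847L, 5249047L),
    Target     = c("HBG1", "HBG2"),
    stringsAsFactors = FALSE
)

# Sort by genomic position, mirroring dplyr::arrange(Chrom, ChromStart, ChromEnd)
dat_toy = dat_toy[order(dat_toy$Chrom, dat_toy$ChromStart, dat_toy$ChromEnd), ]

# Write a gzip-compressed, headerless, tab-delimited file
txt_fpath_tmp = tempfile(fileext = ".bed.gz")
con = gzfile(txt_fpath_tmp, open = "wt")
write.table(
    dat_toy, con,
    sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE
)
close(con)

# Read it back, supplying the column names that the file itself lacks
dat_back = read.delim(
    gzfile(txt_fpath_tmp),
    header = FALSE, col.names = names(dat_toy), stringsAsFactors = FALSE
)
print(dat_back)
```

Keeping the column-name vector next to the exported file (as the `description.tsv` table does for the input) lets the headerless output be read back without guessing.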