Fragment length is not consistent with fithic resolution (-r) in tests data #48

Open
bnjvrjnke opened this issue Sep 6, 2021 · 1 comment

Comments

@bnjvrjnke

When running the test data, I noticed that the resolution (-r) is set to 100000, but the input file has a fragment size of 1000000.

Here is the test script:

#/fithic/fithic/tests/run_tests-git.sh
#line 46-49
for i in Dixon_IMR90_HindIII_hg19_w100000; do
    python3 ../fithic.py -r 100000 -l "$i" -i $inI/$i.gz -f $inF/$i.gz -b $noOfBins -p $noOfPasses -o outputs/${i}.interOnly -x interOnly
    python3 ../fithic.py -r 100000 -l "$i" -i $inI/$i.gz -f $inF/$i.gz -b $noOfBins -p $noOfPasses -o outputs/${i}.all -x All
done

Here is the test data:

#/fithic/fithic/tests/contactCounts/Dixon_IMR90_HindIII_hg19_w100000.gz
chr10   500000  chr10   500000  13850
chr10   500000  chr10   1500000 3472
chr10   500000  chr10   10500000        370
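
The bin spacing can be estimated from the contact-count rows themselves. Below is a minimal Python sketch, assuming the five-column layout shown above (chr1, midpoint1, chr2, midpoint2, count) and fixed-size bins. `infer_bin_size` is a hypothetical helper written for this illustration, not part of fithic.

```python
from math import gcd

def infer_bin_size(rows):
    """Estimate the bin size as the GCD of all intra-chromosomal midpoint distances.

    `rows` is any iterable of whitespace-separated lines in the assumed
    layout: chr1 mid1 chr2 mid2 count. Returns 0 if no usable distance is found.
    """
    size = 0
    for row in rows:
        fields = row.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        chr1, mid1, chr2, mid2 = fields[0], int(fields[1]), fields[2], int(fields[3])
        if chr1 == chr2:
            # gcd(0, x) == x, so zero distances (self-contacts) do not disturb the result
            size = gcd(size, abs(mid2 - mid1))
    return size

# The three rows quoted from Dixon_IMR90_HindIII_hg19_w100000.gz
sample = [
    "chr10   500000  chr10   500000  13850",
    "chr10   500000  chr10   1500000 3472",
    "chr10   500000  chr10   10500000        370",
]

inferred = infer_bin_size(sample)
print(inferred)

# For the full file, the same function can consume gzip.open(path, "rt"),
# which yields text lines.
```

For the three rows above this gives 1000000, ten times the `-r 100000` passed in the test script.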

Here is the log file:

#Dixon_IMR90_HindIII_hg19_w100000.fithic.log
Interactions file read successfully
------------------------------------------------------------------------------------
Observed, Intra-chr in range: pairs= 215762      totalCount= 91387585
Observed, Intra-chr all: pairs= 218642   totalCount= 121700752
Observed, Inter-chr all: pairs= 3878618  totalCount= 99952107
Range of observed genomic distances [1000000 249000000]

Making equal occupancy bins
------------------------------------------------------------------------------------
Observed intra-chr read counts in range 91387585
Desired number of contacts per bin      456937.925,
Number of bins  200
Equal occupancy bins generated

Looping through all possible fragment pairs in-range
------------------------------------------------------------------------------------
Chromosome 'chr1',      250 mappable fragments,         -2487765 possible intra-chr fragment pairs in range,    715750 possible inter-chr fragment pairs
Chromosome 'chr10',     136 mappable fragments,         -733191 possible intra-chr fragment pairs in range,     404872 possible inter-chr fragment pairs
Chromosome 'chr11',     136 mappable fragments,         -733191 possible intra-chr fragment pairs in range,     404872 possible inter-chr fragment pairs
Chromosome 'chr12',     134 mappable fragments,         -711689 possible intra-chr fragment pairs in range,     399186 possible inter-chr fragment pairs
Chromosome 'chr13',     116 mappable fragments,         -532571 possible intra-chr fragment pairs in range,     347652 possible inter-chr fragment pairs
Chromosome 'chr14',     108 mappable fragments,         -461283 possible intra-chr fragment pairs in range,     324540 possible inter-chr fragment pairs
Chromosome 'chr15',     103 mappable fragments,         -419328 possible intra-chr fragment pairs in range,     310030 possible inter-chr fragment pairs
Chromosome 'chr16',     91 mappable fragments,  -326796 possible intra-chr fragment pairs in range,     275002 possible inter-chr fragment pairs
Chromosome 'chr17',     82 mappable fragments,  -264957 possible intra-chr fragment pairs in range,     248542 possible inter-chr fragment pairs
Chromosome 'chr18',     79 mappable fragments,  -245784 possible intra-chr fragment pairs in range,     239686 possible inter-chr fragment pairs
Chromosome 'chr19',     60 mappable fragments,  -141075 possible intra-chr fragment pairs in range,     183180 possible inter-chr fragment pairs
Chromosome 'chr2',      244 mappable fragments,         -2369499 possible intra-chr fragment pairs in range,    700036 possible inter-chr fragment pairs
Chromosome 'chr20',     64 mappable fragments,  -160719 possible intra-chr fragment pairs in range,     195136 possible inter-chr fragment pairs
Chromosome 'chr21',     49 mappable fragments,  -93654 possible intra-chr fragment pairs in range,      150136 possible inter-chr fragment pairs
Chromosome 'chr22',     52 mappable fragments,  -105627 possible intra-chr fragment pairs in range,     159172 possible inter-chr fragment pairs
Chromosome 'chr3',      199 mappable fragments,         -1574304 possible intra-chr fragment pairs in range,    579886 possible inter-chr fragment pairs
Chromosome 'chr4',      192 mappable fragments,         -1465167 possible intra-chr fragment pairs in range,    560832 possible inter-chr fragment pairs
Chromosome 'chr5',      181 mappable fragments,         -1301586 possible intra-chr fragment pairs in range,    530692 possible inter-chr fragment pairs
Chromosome 'chr6',      172 mappable fragments,         -1174947 possible intra-chr fragment pairs in range,    505852 possible inter-chr fragment pairs
Chromosome 'chr7',      160 mappable fragments,         -1016175 possible intra-chr fragment pairs in range,    472480 possible inter-chr fragment pairs
Chromosome 'chr8',      147 mappable fragments,         -857172 possible intra-chr fragment pairs in range,     436002 possible inter-chr fragment pairs
Chromosome 'chr9',      142 mappable fragments,         -799617 possible intra-chr fragment pairs in range,     421882 possible inter-chr fragment pairs
Chromosome 'chrX',      156 mappable fragments,         -965811 possible intra-chr fragment pairs in range,     461292 possible inter-chr fragment pairs
Chromosome 'chrY',      60 mappable fragments,  -141075 possible intra-chr fragment pairs in range,     183180 possible inter-chr fragment pairs
Number of all fragments= 3113
Possible, Intra-chr in range: pairs= -19082983 
Possible, Intra-chr all: pairs= 241996.0 
Possible, Inter-chr all: pairs= 4604945.0 
Desired genomic distance range   [0 inf] 
Range of possible genomic distances  [100000  249450000] 
Baseline intrachromosomal probability is 4.13229970743318e-06 
Interchromosomal probability is 2.1715785964870374e-07 
5th quantile of biases: 0.57080572791248
50th quantile of biases: 1.01076079547
95th quantile of biases: 1.20269227401
Out of 3053 loci 85 were discarded with biases not in range [0.5 2]


Calculating probability means and standard deviations of contact counts
------------------------------------------------------------------------------------
Means and error written to outputs/Dixon_IMR90_HindIII_hg19_w100000.all/Dixon_IMR90_HindIII_hg19_w100000.fithic_pass1.res100000.txt


Fitting a univariate spline to the probability means
-----------------------------------------------------------------------------------
Spline successfully fit

The line 'Possible, Intra-chr in range: pairs= -19082983' looks wrong, since a count of possible pairs should not be negative. If I set -r to 1000000, the 'Intra-chr in range: pairs= ' value becomes positive and the number of significant interactions drops considerably. Shouldn't the resolution parameter (fithic -r) match the fragment length of the input file (Dixon_IMR90_HindIII_hg19_w100000.gz)?
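
A quick way to spot this symptom in a log is to scan the per-chromosome lines for negative pair counts. The sketch below assumes the log line format shown above; the regular expression and the `negative_pair_counts` helper are written for this illustration and are not part of fithic.

```python
import re

# Matches lines such as:
# Chromosome 'chr11',  136 mappable fragments,  -733191 possible intra-chr fragment pairs in range, ...
PAIR_RE = re.compile(
    r"Chromosome '([^']+)',\s+(\d+) mappable fragments,\s+(-?\d+) possible intra-chr"
)

def negative_pair_counts(log_text):
    """Return {chromosome: count} for every chromosome whose possible
    intra-chr pair count is reported as a negative number."""
    flagged = {}
    for match in PAIR_RE.finditer(log_text):
        count = int(match.group(3))
        if count < 0:
            flagged[match.group(1)] = count
    return flagged

# Two lines copied from the log above
log_excerpt = (
    "Chromosome 'chr11',     136 mappable fragments,         -733191 possible intra-chr "
    "fragment pairs in range,     404872 possible inter-chr fragment pairs\n"
    "Chromosome 'chr12',     134 mappable fragments,         -711689 possible intra-chr "
    "fragment pairs in range,     399186 possible inter-chr fragment pairs\n"
)

flagged = negative_pair_counts(log_excerpt)
for chrom, count in flagged.items():
    print(f"{chrom}: {count} possible intra-chr pairs (negative, check -r against the input bin size)")
```

Any chromosome reported here suggests that the `-r` value and the bin spacing of the input file should be compared before the results are trusted.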

@ay-lab
Owner

ay-lab commented Sep 16, 2021

Thanks. We have now made this change in the test script.
