This repository has been archived by the owner on May 15, 2020. It is now read-only.

How is localSD metric calculated #24

eroller opened this issue Aug 18, 2016 · 0 comments

eroller commented Aug 18, 2016

        /// <summary>
        /// Estimate local standard deviation (SD).
        /// </summary>
        /// <param name="bins">Genomic bins from which we filter out local SD outliers associated with FFPE biases.</param>
        /// <param name="threshold">Median SD value used to determine whether to run RemoveBinsWithExtremeLocalMad on a sample and which set of bins to remove (set as threshold*5).</param>
        /// <remarks>
        /// The rationale of this function is that the standard deviation of the differences between consecutive bin values, taken over a small range of bins (i.e. 20 bins),
        /// has a distinct distribution for FFPE compared to Fresh Frozen (FF) samples. This property is used to flag and remove such bins.
        /// </remarks>

        static double getLocalStandardDeviation(List<GenomicBin> bins)
        {
            // Will hold consecutive bin count difference (approximates Skellam Distribution: mean centred on zero so agnostic to CN changes)
            double[] countsDiffs = new double[bins.Count - 1];

            for (int binIndex = 0; binIndex < bins.Count - 1; binIndex++)
            {
                countsDiffs[binIndex] = Convert.ToDouble(bins[binIndex + 1].Count - bins[binIndex].Count);
            }

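            // holder of local SD values (one SD per window of 20 consecutive differences)
            List<double> localSDs = new List<double>();
            // chromosome of the first bin in each window, kept parallel to localSDs
            List<string> chromosomeBin = new List<string>();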

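            // calculate local SD metric over non-overlapping windows of windowSize differences;
            // a window is only processed while windowEnd < countsDiffs.Length, so trailing differences are skipped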
            int windowSize = 20;
            for (int windowEnd = windowSize, windowStart = 0; windowEnd < countsDiffs.Length; windowStart += windowSize, windowEnd += windowSize)
            {
                double localSD = Utilities.StandardDeviation(countsDiffs, windowStart, windowEnd);
                localSDs.Add(localSD);
                chromosomeBin.Add(bins[windowStart].Chromosome);
                for (int binIndex = windowStart; binIndex < windowEnd; binIndex += 1)
                {
                    bins[binIndex].MadOfDiffs = localSD;
                }
            }

            // average of local SD metric
            double localSDaverage = GetLocalStandardDeviationAverage(localSDs, chromosomeBin);
            return localSDaverage;
        }
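
For anyone reading this without the rest of the code base, here is a minimal standalone sketch of the windowing step above. It is not the project's implementation: `LocalSdSketch`, its `StandardDeviation` helper (assumed here to be the sample SD over the half-open range `[start, end)`) and the plain `int` counts standing in for `GenomicBin.Count` are all hypothetical. The aggregation done by `GetLocalStandardDeviationAverage` is left out because its body is not shown in this issue.

```csharp
using System;
using System.Collections.Generic;

public static class LocalSdSketch
{
    // Hypothetical stand-in for Utilities.StandardDeviation:
    // sample SD (n - 1 denominator) of values[start..end), end exclusive.
    public static double StandardDeviation(double[] values, int start, int end)
    {
        int n = end - start;
        if (n < 2) return 0.0;

        double mean = 0.0;
        for (int i = start; i < end; i++) mean += values[i];
        mean /= n;

        double sumSq = 0.0;
        for (int i = start; i < end; i++)
        {
            double d = values[i] - mean;
            sumSq += d * d;
        }
        return Math.Sqrt(sumSq / (n - 1));
    }

    // One SD per non-overlapping window of consecutive-count differences.
    public static List<double> LocalStandardDeviations(IReadOnlyList<int> counts, int windowSize = 20)
    {
        var localSDs = new List<double>();
        if (counts.Count < 2) return localSDs; // no differences to compute

        // Differences between neighbouring bins.
        double[] diffs = new double[counts.Count - 1];
        for (int i = 0; i < diffs.Length; i++)
        {
            diffs[i] = counts[i + 1] - counts[i];
        }

        // Same stepping and stop condition as the loop quoted above.
        for (int start = 0, end = windowSize; end < diffs.Length; start += windowSize, end += windowSize)
        {
            localSDs.Add(StandardDeviation(diffs, start, end));
        }
        return localSDs;
    }

    public static void Main()
    {
        // Toy input: counts alternating between 100 and 101.
        int[] counts = new int[61];
        for (int i = 0; i < counts.Length; i++) counts[i] = 100 + (i % 2);

        foreach (double sd in LocalStandardDeviations(counts))
        {
            Console.WriteLine(sd);
        }
    }
}
```

With 61 counts there are 60 differences, and because the loop stops as soon as `windowEnd` reaches the array length, only the first two windows produce a value in this sketch.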
@eroller eroller closed this as completed Aug 18, 2016