From 91ed9e07116cd855c94f9edcd75cda7455773119 Mon Sep 17 00:00:00 2001
From: "a.teschendorff"
Date: Mon, 6 Dec 2010 00:00:00 +0000
Subject: [PATCH] version 1.1

---
 DESCRIPTION         | 10 +++++-----
 man/DoISVA.Rd       |  4 ++--
 man/EstDimRMT.Rd    |  4 ++--
 man/isva-package.Rd | 10 +++++-----
 man/isvaFn.Rd       |  2 +-
 man/simdataISVA.Rd  |  3 ++-
 6 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 898378b..394548f 100755
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,16 +1,16 @@
 Package: isva
 Type: Package
 Title: Independent Surrogate Variable Analysis
-Version: 1.0
-Date: 2010-08-14
+Version: 1.1
+Date: 2010-12-06
 Author: Andrew E Teschendorff
 Maintainer:
 Depends: qvalue, fastICA
 Description: Independent Surrogate Variable Analysis is a general
-    algorithm for feature selection in the presence of potentially
+    algorithm for feature selection in the presence of potential
     confounding factors.
 License: GPL-2
 LazyLoad: yes
-Packaged: Mon Aug 16 22:56:38 2010; aet21
+Packaged: Mon Dec 6 14:45:22 2010; aet21
 Repository: CRAN
-Date/Publication: 2010-08-17 05:06:03
+Date/Publication: 2010-12-06 16:00:23
diff --git a/man/DoISVA.Rd b/man/DoISVA.Rd
index 8f20661..ab8fcfc 100755
--- a/man/DoISVA.Rd
+++ b/man/DoISVA.Rd
@@ -2,7 +2,7 @@
 \alias{DoISVA}
 \title{Feature selection using independent surrogate variables}
 \description{
-This function performs feature selection for features associated with a phenotype of interest in the presence of known potential confounding factors subject to uncertainty or measurement error.
+Given a data matrix and a phenotype of interest, this function performs feature selection for features associated with the phenotype of interest in the presence of potential confounding factors. The algorithm first finds the variation in the data matrix not associated with the phenotype of interest, and subsequently performs Independent Component Analysis on this residual variation matrix. The number of independent components to be inferred can be prespecified or estimated using Random Matrix Theory. Independent Surrogate Variables (ISVs) are constructed from the independent components and provide estimates of the effect of confounders on the data. These ISVs are then included as covariates in a multivariate regression model to identify features that correlate with the phenotype of interest independently of these potential confounders.
 }
 \usage{
 DoISVA(data.m, pheno.v, cf.m, factor.log, pvthCF = 0.01, th = 0.05, ncomp = NULL)
@@ -10,7 +10,7 @@ DoISVA(data.m, pheno.v, cf.m, factor.log, pvthCF = 0.01, th = 0.05, ncomp = NULL
 %- maybe also 'usage' for other objects documented here.
 \arguments{
   \item{data.m}{Data matrix: rows label features, columns label samples. It is assumed that number of features is much larger than number of samples.}
-  \item{pheno.v}{Numeric vector of length equal to number of columns of data matrix. At present categorical phenotypes are not supported.}
+  \item{pheno.v}{Numeric vector of length equal to number of columns of data matrix. At present only numeric (ordinal) phenotypes are supported, so categorical phenotypes are excluded.}
   \item{cf.m}{Matrix of confounding factors. Rows label samples, Columns label confounding factors, which may be numeric or categorical.}
   \item{factor.log}{A logical vector of same length as columns of \code{cf.m}. FALSE indicates factor is to be treated as a numeric, TRUE as categorical.}
   \item{pvthCF}{P-value threshold to call a significant association between an independent surrogate variable and a confounding factor. By default this is 0.01.}
diff --git a/man/EstDimRMT.Rd b/man/EstDimRMT.Rd
index 5f43975..6923e75 100755
--- a/man/EstDimRMT.Rd
+++ b/man/EstDimRMT.Rd
@@ -1,9 +1,9 @@
 \name{EstDimRMT}
 \alias{EstDimRMT}
 %- Also NEED an '\alias' for EACH other topic documented here.
-\title{Estimates dimensionality of data set using Approximate Random Matrix Theory}
+\title{Estimates dimensionality of a data set using Random Matrix Theory}
 \description{
-Given the data matrix, it estimates the approximate number of significant components of variation by comparing the observed distribution of spectral eigenvalues to the theoretical one under a Gaussian Orthogonal Ensemble (GOE).
+Given the data matrix, it estimates the number of significant components of variation by comparing the observed distribution of spectral eigenvalues to the theoretical one under a Gaussian Orthogonal Ensemble (GOE). Specifically, a spectral decomposition of the data covariance matrix is performed and the number of eigenvalues larger than the theoretical maximum predicted by the GOE is taken as an estimate of the number of significant components.
 }
 \usage{
 EstDimRMT(data.m)
diff --git a/man/isva-package.Rd b/man/isva-package.Rd
index e6eb4ac..8ef5ec3 100755
--- a/man/isva-package.Rd
+++ b/man/isva-package.Rd
@@ -6,18 +6,18 @@
 Independent Surrogate Variable Analysis
 }
 \description{
-Independent Surrogate Variable Analysis is a general algorithm for feature selection in the presence of potentially confounding factors, designed for the analysis of large-scale quantitative genomic data (e.g microarrays). It uses Independent Component Analysis to model the confounding factors and allows for heterogeneity within the phenotype of interest.
+Independent Surrogate Variable Analysis is a general algorithm for feature selection in the presence of potential confounding factors, designed for the analysis of large-scale quantitative genomic data (e.g microarrays). It uses Independent Component Analysis (ICA) to model the confounding factors as independent surrogate variables (ISVs). These ISVs are included as covariates in a multivariate regression model to identify features that correlate with a phenotype of interest independently of these confounders. The ICA implementation used is that of the fastICA R-package.
 }
 \details{
 \tabular{ll}{
 Package: \tab isva\cr
 Type: \tab Package\cr
-Version: \tab 1.0\cr
-Date: \tab 2010-08-14\cr
-License: \tab What license is it under?\cr
+Version: \tab 1.1\cr
+Date: \tab 2010-12-06\cr
+License: \tab GPL-2\cr
 LazyLoad: \tab yes\cr
 }
-Two internal functions perform the dimensionality estimation using approximate Random Matrix Theory (EstDimRMT) and modelling of confounding factors using Independent Component Analysis on the residual variation orthogonal to that of the phenotype of interest (isvaFn). DoISVA allows is the main user function, performing feature selection using independent surrogate variables.
+Two internal functions perform the dimensionality estimation using approximate Random Matrix Theory (EstDimRMT) and modelling of confounding factors using Independent Component Analysis on the residual variation orthogonal to that of the phenotype of interest (isvaFn). DoISVA is the main user function, performing feature selection using independent surrogate variables.
 }
 \author{
 Andrew E Teschendorff
diff --git a/man/isvaFn.Rd b/man/isvaFn.Rd
index 153710a..14e371d 100755
--- a/man/isvaFn.Rd
+++ b/man/isvaFn.Rd
@@ -3,7 +3,7 @@
 %- Also NEED an '\alias' for EACH other topic documented here.
 \title{Main engine function for inference of independent surrogate variables (ISVs)}
 \description{
-This function infers statistically independent surrogate variables by performing Independent Component Analysis on the residual variation orthogonal to that of a phenotype of interest.
+This is the main engine function which infers statistically independent surrogate variables by performing Independent Component Analysis on the residual variation orthogonal to that of a phenotype of interest. It uses the ICA implementation of the fastICA R-package.
 }
 \usage{
 isvaFn(data.m, pheno.v, ncomp = NULL)
diff --git a/man/simdataISVA.Rd b/man/simdataISVA.Rd
index de81af7..c98edee 100755
--- a/man/simdataISVA.Rd
+++ b/man/simdataISVA.Rd
@@ -5,7 +5,8 @@
 \description{
   A data set over 2000 features and 50 samples with a binary phenotype
   and two confounding factors. Relative effect size of confounding
-  factors to that of phenotype of interest is 4.
+  factors to that of phenotype of interest is 4. For further details please
+  see reference.
 }
 \usage{simdataISVA}
 \format{