Greedy AUC stepwise

BioStaCs

Data manipulation and analytic tools for GWAS data.

Data Clean

The following languages need to be integrated into our projects:

  1. C/C++
  2. Python/Perl
  3. R
  4. Java

TODO LIST:

  1. Use text files to store data: csv/txt.
  2. Provide a README file that includes:
  • Data description
  • Column description (type: Numeric/Categorical/Ordered; category: what does the variable represent?)
  • Header/No header
  3. State how to handle NA/NULL/missing values.
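As a rough illustration of the checklist above, the following sketch reads a small CSV file with an explicit header, declared column types, and a stated missing-value policy. The file contents, the column names, and the choice to treat `NA`, `NULL`, and empty fields as missing are assumptions made for this example, not conventions fixed by the project.

```r
# Hypothetical example data, written to a temporary file so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,genotype,phenotype",
  "s1,AA,1.2",
  "s2,NA,0.7",
  "s3,AG,NULL",
  "s4,,2.1"
), csv_path)

# Treat "NA", "NULL" and empty fields as missing; declare the column types up front.
dat <- read.csv(
  csv_path,
  header = TRUE,
  na.strings = c("NA", "NULL", ""),
  colClasses = c("character", "factor", "numeric")
)

# Count missing values per column, then keep only the complete rows.
na_per_column <- colSums(is.na(dat))
dat_complete <- dat[complete.cases(dat), ]
```

Dropping incomplete rows is only one possible policy; whichever one is chosen should be written down in the data README.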

R Code Analysis

Additional packages are needed:

# Requires R > 3.0
library(devtools)
devtools::install_github("hadley/lineprof")
devtools::install_github("wch/shiny-slickgrid")

To use it, one can try

#library(lineprof)
#source(find_ex("read-delim.r"))
#wine <- find_ex("wine.csv")
#x <- lineprof(read_delim(wine, sep = ","), torture = TRUE)
#shine(x)

It will open a web page such as:

Profiling

The profile view shows time and memory usage for each line of the profiled code.

Memory Strategy

This package includes some built-in functions to monitor and clean up memory.

Built-in functions

lsos() shows the memory usage in a neat format;
showMemoryUse() shows the memory usage and runs gc() automatically.
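For readers without the package installed, the idea behind these helpers can be approximated with base R alone. The sketch below is not the package's implementation: `list_objects_by_size()` is a hypothetical helper written for this example, and it only relies on `ls()`, `object.size()`, and `gc()`.

```r
# Hypothetical helper (not the package's lsos()): list objects in an
# environment, largest first, using base R only.
list_objects_by_size <- function(env = globalenv(), n = 10) {
  object_names <- ls(envir = env)
  if (length(object_names) == 0) {
    return(data.frame(name = character(0), bytes = numeric(0)))
  }
  sizes <- vapply(
    object_names,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  out <- data.frame(name = object_names, bytes = sizes,
                    row.names = NULL, stringsAsFactors = FALSE)
  out <- out[order(out$bytes, decreasing = TRUE), , drop = FALSE]
  head(out, n)
}

# Demonstrate on a scratch environment so the global workspace is untouched.
demo_env <- new.env()
assign("small", 1:10, envir = demo_env)
assign("large", numeric(1e5), envir = demo_env)
usage <- list_objects_by_size(demo_env)

# Remove the large object and trigger garbage collection to release its memory.
rm("large", envir = demo_env)
invisible(gc())
```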

In addition, the following external packages are useful for large data sets:

External Packages

bigalgebra BIGMEM
biganalytics BIGMEM
bigmemory BIGMEM
bigtabulate BIGMEM
synchronicity BIGMEM

GAS

GAS (Greedy AUC Stepwise) is a classification framework that has been successfully applied in our SpermatogenesisOnline project. For a binary classification problem, GAS automatically maximizes the area under the ROC curve (AUC) for a given number of variables. It addresses a K-sparse problem: finding the best K features by a greedy search that maximizes AUC. The strategy is similar to forward selection: at each step, GAS adds a single variable that is not already in the model and that increases the AUC. If GAS cannot find a solution with K variables, it instead outputs the model with the maximum AUC found within the maximum allowed number of variables.
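The greedy search described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual interface: the function names `auc()` and `greedy_auc_stepwise()` are hypothetical, logistic regression via `glm()` is assumed as the underlying classifier, AUC is computed on the training data, and the data are simulated.

```r
# AUC via the rank-sum (Mann-Whitney) identity; y is 0/1, score is numeric.
auc <- function(y, score) {
  r <- rank(score)  # average ranks handle ties
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)
  (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical sketch of greedy AUC forward selection with at most k variables.
greedy_auc_stepwise <- function(x, y, k) {
  selected <- character(0)
  best_auc <- 0.5
  while (length(selected) < k) {
    candidates <- setdiff(colnames(x), selected)
    if (length(candidates) == 0) break
    # In-sample AUC of the current model extended by each candidate variable.
    cand_auc <- vapply(candidates, function(v) {
      d <- data.frame(y = y, x[, c(selected, v), drop = FALSE])
      fit <- suppressWarnings(glm(y ~ ., data = d, family = binomial()))
      auc(y, fitted(fit))
    }, numeric(1))
    # Stop early if no candidate increases the AUC.
    if (max(cand_auc) <= best_auc) break
    selected <- c(selected, candidates[which.max(cand_auc)])
    best_auc <- max(cand_auc)
  }
  list(variables = selected, auc = best_auc)
}

# Simulated data (assumption): only v1 and v3 carry signal.
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("v", 1:5)))
y <- rbinom(n, 1, plogis(2 * x[, "v1"] - 1.5 * x[, "v3"]))
result <- greedy_auc_stepwise(x, y, k = 2)
```

Because the AUC here is measured on the training data, adding variables rarely lowers it; a held-out or cross-validated AUC would give a stricter stopping rule.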

Pic1 AUC