# Getting a subset of a data structure
Credits: http://www.cookbook-r.com/ (Creative Commons Attribution-Share Alike 3.0 Unported License)

## Problem
You want to get a subset of the elements of a vector, matrix, or data frame.

## Solution
To get a subset based on a condition, use the subset() function or index with square brackets. The examples here show both ways.

In [2]:
# A sample vector
v <- c(1,4,4,3,2,2,3)

In [3]:
subset(v, v<3)
v[v<3]

In [4]:
# Another vector
t <- c("small", "small", "large", "medium")

In [5]:
# Remove "small" entries
subset(t, t!="small")
t[t!="small"]

One important difference between the two methods is that you can assign values to elements with square bracket indexing, but you cannot with subset().

In [6]:
# Square-bracket indexing supports assignment
v[v<3] <- 9

# subset() does not: this line raises an error
subset(v, v<3) <- 9

ERROR: Error in subset(v, v < 3) <- 9: could not find function "subset<-"
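
The error arises because base R has no `subset<-` replacement function, so conditional assignment has to go through square brackets. The following is a minimal sketch, assuming you want a modified copy rather than changing `v` in place. `replace()` is a base R helper that wraps the same bracket assignment; the names `v2` and `v3` are only illustrative and do not appear in the original recipe.

```r
v <- c(1, 4, 4, 3, 2, 2, 3)

# Option 1: copy the vector, then assign through a logical index
v2 <- v
v2[v2 < 3] <- 9

# Option 2: replace() returns a modified copy and leaves v untouched
v3 <- replace(v, v < 3, 9)

identical(v2, v3)  # both are c(9, 4, 4, 3, 9, 9, 3)
```

Either form leaves the original `v` unchanged, which can be convenient when the unmodified values are still needed later.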


With data frames:

In [7]:
# A sample data frame
data <- read.table(header=TRUE, text='
 subject sex size
       1   M    7
       2   F    6
       3   F    9
       4   M   11
 ')

In [8]:
subset(data, subject < 3)
data[data$subject < 3, ]

Unnamed: 0,subject,sex,size
1,1,M,7
2,2,F,6


Unnamed: 0,subject,sex,size
1,1,M,7
2,2,F,6


In [9]:
# Subset of particular rows and columns
subset(data, subject < 3, select = -subject)
subset(data, subject < 3, select = c(sex,size))
subset(data, subject < 3, select = sex:size)
data[data$subject < 3, c("sex","size")]

Unnamed: 0,sex,size
1,M,7
2,F,6


Unnamed: 0,sex,size
1,M,7
2,F,6


Unnamed: 0,sex,size
1,M,7
2,F,6


Unnamed: 0,sex,size
1,M,7
2,F,6
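
One case where the two approaches can differ is selecting a single column. The sketch below rebuilds the sample data frame so that it is self-contained. The variable names `as_vector`, `as_frame`, and `as_subset` are hypothetical and used only for illustration.

```r
data <- read.table(header = TRUE, text = '
 subject sex size
       1   M    7
       2   F    6
       3   F    9
       4   M   11
')

# Bracket indexing with a single column drops to a plain vector by default
as_vector <- data[data$subject < 3, "size"]

# drop = FALSE keeps the result as a one-column data frame
as_frame <- data[data$subject < 3, "size", drop = FALSE]

# subset() keeps a data frame for a single selected column
# (its drop argument defaults to FALSE for data frames)
as_subset <- subset(data, subject < 3, select = size)

class(as_vector)  # "integer"
class(as_frame)   # "data.frame"
class(as_subset)  # "data.frame"
```

If downstream code expects a data frame, passing `drop = FALSE` to the bracket form is one way to keep the two approaches consistent.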


In [10]:
# Logical AND of two conditions
subset(data, subject < 3  &  sex=="M")
data[data$subject < 3  &  data$sex=="M", ]

Unnamed: 0,subject,sex,size
1,1,M,7


Unnamed: 0,subject,sex,size
1,1,M,7


In [11]:
# Condition based on transformed data
subset(data, log2(size) > 3 )
data[log2(data$size) > 3, ]

Unnamed: 0,subject,sex,size
3,3,F,9
4,4,M,11


Unnamed: 0,subject,sex,size
3,3,F,9
4,4,M,11


In [12]:
# Subset if elements are in another vector
subset(data, subject %in% c(1,3))
data[data$subject %in% c(1,3), ]

Unnamed: 0,subject,sex,size
1,1,M,7
3,3,F,9


Unnamed: 0,subject,sex,size
1,1,M,7
3,3,F,9


### Help doc
Get the documentation with:

In [13]:
?subset

subset {base},R Documentation

Argument,Description
x,object to be subsetted.
subset,logical expression indicating elements or rows to keep: missing values are taken as false.
select,"expression, indicating columns to select from a data frame."
drop,passed on to [ indexing operator.
...,further arguments to be passed to or from other methods.
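
The note that missing values are taken as false marks a practical difference from square-bracket indexing. The sketch below uses a hypothetical vector `x` containing an `NA`, which is not part of the original examples, to show how the two approaches behave.

```r
x <- c(1, NA, 5, 2)

# subset() treats an NA condition as FALSE, so the NA element is dropped
kept_subset <- subset(x, x < 3)

# Bracket indexing keeps an NA placeholder wherever the condition is NA
kept_bracket <- x[x < 3]

# Wrapping the condition in which() makes brackets drop NAs as well
kept_which <- x[which(x < 3)]

kept_subset   # 1 2
kept_bracket  # 1 NA 2
kept_which    # 1 2
```

Which behavior is preferable depends on whether missing values should be carried through or silently discarded.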
