# Creating subsets of data frames

Here we use data from the British Election Study 2010. The data set [bes2010feelings-prepost.RData](https://github.com/melff/dataman-r/raw/main/03-data-frames/bes2010feelings-prepost.RData) is prepared from the original available at https://www.britishelectionstudy.com/data-object/2010-bes-cross-section/ by removing identifying information and scrambling the data.

In [1]:
load("bes2010feelings-prepost.RData")

We then create a subset that contains only the observations from Scotland
and only the feelings towards the parties and party leaders that run in Scotland:

In [2]:
bes2010flngs_pre_scotland <- subset(bes2010flngs_pre,
                                    region=="Scotland",
                                    select=c(
                                        flng.brown,
                                        flng.cameron,
                                        flng.clegg,
                                        flng.salmond,
                                        flng.labour,
                                        flng.cons,
                                        flng.libdem,
                                        flng.snp,
                                        flng.green))
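
The two arguments used above can be tried out on a small made-up data frame. This is only a sketch: `toy`, its variables and its values are hypothetical and have nothing to do with the BES data. The second argument of `subset()` selects the rows, and `select` picks the columns by their unquoted names:

```r
# Hypothetical toy data, not taken from the BES data set
toy <- data.frame(
    region = c("Scotland", "Wales", "Scotland", "London"),
    flng.a = c(5, 3, 7, 2),
    flng.b = c(1, 4, 6, 8),
    age    = c(34, 51, 28, 45)
)

# Keep the rows where the condition is TRUE and
# only the two columns named in 'select'
toy_scotland <- subset(toy,
                       region == "Scotland",
                       select = c(flng.a, flng.b))

print(toy_scotland)
```

Inside `subset()` the variable names are looked up in the data frame itself, so neither `toy$` nor quotation marks are needed.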

We can now compare the average feeling about Gordon Brown
in the whole sample and in the subsample from Scotland.
First the whole UK:

In [3]:
with(bes2010flngs_pre,mean(flng.brown,na.rm=TRUE))

[1] 4.339703

Then the Scotland subsample:

In [4]:
with(bes2010flngs_pre_scotland,mean(flng.brown,na.rm=TRUE))

[1] 5.395
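
What `with()` and `na.rm=TRUE` contribute to the calls above can be seen in a minimal sketch with made-up values (the data frame `toy` and its variable `flng.x` are hypothetical):

```r
# Hypothetical toy data with one missing value
toy <- data.frame(flng.x = c(4, NA, 6))

# Without na.rm=TRUE a single missing value makes the mean missing
m_default <- with(toy, mean(flng.x))

# With na.rm=TRUE missing values are dropped before averaging
m_narm <- with(toy, mean(flng.x, na.rm = TRUE))

# with() only saves us from writing 'toy$'; this is equivalent
m_dollar <- mean(toy$flng.x, na.rm = TRUE)

print(c(m_default, m_narm, m_dollar))
```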

It is also possible to create a subset of cases and variables with the
bracket operator, but this is more tedious: the data frame has to be named
again in the row condition and the variable names have to be quoted.

In [5]:
bes2010flngs_pre_scotland <- bes2010flngs_pre[
    bes2010flngs_pre$region=="Scotland",c(
                             "flng.labour",
                             "flng.cons",
                             "flng.libdem",
                             "flng.snp",
                             "flng.green",
                             "flng.brown",
                             "flng.cameron",
                             "flng.clegg",
                             "flng.salmond")]

The average feeling about Gordon Brown in this subset is the same as before:

In [6]:
with(bes2010flngs_pre_scotland,mean(flng.brown,na.rm=TRUE))

[1] 5.395
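
Besides the amount of typing, the two approaches differ in how they treat missing values in the row condition. The following sketch uses a made-up data frame (`toy` and its values are hypothetical; whether `region` has missing values in the BES data is not checked here). `subset()` drops rows for which the condition is `NA`, whereas a logical index inside the brackets keeps them as rows filled with `NA`. Wrapping the condition in `which()` is one way to get the same rows as with `subset()`:

```r
# Hypothetical toy data with a missing region
toy <- data.frame(
    region = c("Scotland", "Wales", NA, "Scotland"),
    flng.a = c(5, 3, 2, 7)
)

# subset() treats an NA in the condition like FALSE
by_subset <- subset(toy, region == "Scotland", select = flng.a)

# A logical index with an NA produces an additional all-NA row
by_bracket <- toy[toy$region == "Scotland", "flng.a", drop = FALSE]

# which() returns only the positions where the condition is TRUE
by_which <- toy[which(toy$region == "Scotland"), "flng.a", drop = FALSE]

print(nrow(by_subset))
print(nrow(by_bracket))
print(nrow(by_which))
```

Because the extra row contains only `NA`, a mean computed with `na.rm=TRUE` is not affected by it, but counts of cases would be.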