# Data Manipulation 
## dplyr Exercises

Complete the following exercises using only the dplyr library. They review these operations:
* filter() (and slice())
* arrange()
* select() (and rename())
* distinct()
* mutate() (and transmute())
* summarise()
* sample_n() and sample_frac()

In [3]:
library(dplyr)

"package 'dplyr' was built under R version 3.3.3"
Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union



**We will use the mtcars data frame for these exercises!**

In [4]:
head(mtcars)

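| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 |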


**Return rows of cars that have an mpg value greater than 20 and 6 cylinders.**

In [5]:
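filter(mtcars, mpg > 20, cyl == 6)

| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|
| 21.0 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
| 21.0 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |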
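
The overview pairs `filter()` with `slice()`, which this exercise does not use. As a minimal sketch (the variable names `by_condition` and `by_position` are illustrative, not part of the exercise), the difference is that `filter()` selects rows by condition while `slice()` selects them by position:

```r
library(dplyr)

# filter() keeps rows that satisfy the condition; writing the two tests
# with & is equivalent to passing them as separate arguments.
by_condition <- filter(mtcars, mpg > 20 & cyl == 6)

# slice() keeps rows by position instead: here, the first three rows.
by_position <- slice(mtcars, 1:3)
```

Both calls return a data frame, so either result can be passed to further dplyr verbs.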


**Reorder the data frame by cyl first, then by descending wt.**

In [6]:
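head(arrange(mtcars, cyl, desc(wt)))

| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|
| 24.4 | 4 | 146.7 | 62 | 3.69 | 3.19 | 20.0 | 1 | 0 | 4 | 2 |
| 22.8 | 4 | 140.8 | 95 | 3.92 | 3.15 | 22.9 | 1 | 0 | 4 | 2 |
| 21.4 | 4 | 121.0 | 109 | 4.11 | 2.78 | 18.6 | 1 | 1 | 4 | 2 |
| 21.5 | 4 | 120.1 | 97 | 3.7 | 2.465 | 20.01 | 1 | 0 | 3 | 1 |
| 22.8 | 4 | 108.0 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 |
| 32.4 | 4 | 78.7 | 66 | 4.08 | 2.2 | 19.47 | 1 | 1 | 4 | 1 |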
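
`desc()` can wrap any sort column, not only a secondary one. A small sketch, with `top_mpg` as an illustrative name, that sorts by descending mpg and keeps the three most fuel-efficient cars:

```r
library(dplyr)

# Sort the whole data frame by descending mpg, then keep the first three rows.
top_mpg <- head(arrange(mtcars, desc(mpg)), 3)
```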


**Select the columns mpg and hp.**

In [7]:
head(select(mtcars, mpg, hp))

| | mpg | hp |
|---|---|---|
| Mazda RX4 | 21.0 | 110 |
| Mazda RX4 Wag | 21.0 | 110 |
| Datsun 710 | 22.8 | 93 |
| Hornet 4 Drive | 21.4 | 110 |
| Hornet Sportabout | 18.7 | 175 |
| Valiant | 18.1 | 105 |
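
The overview also mentions `rename()`, the companion to `select()`. The sketch below assumes the new column names `horsepower` and `miles_per_gallon`, which are made up for illustration: `rename()` keeps every column and changes only the named ones, while `select()` can rename as it picks.

```r
library(dplyr)

# rename() keeps all 11 columns; only hp gets a new name (new_name = old_name).
renamed <- rename(mtcars, horsepower = hp)

# select() can rename while selecting, dropping every column not listed.
picked <- select(mtcars, miles_per_gallon = mpg, horsepower = hp)
```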


**Select the distinct values of the gear column.**

In [8]:
distinct(select(mtcars, gear))

| gear |
|---|
| 4 |
| 3 |
| 5 |
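
The inner `select()` is optional here: `distinct()` accepts column names directly and then returns only the unique combinations of those columns. A short sketch of that form, with `gears` as an illustrative name:

```r
library(dplyr)

# Passing the column to distinct() gives the unique gear values
# without a separate select() step.
gears <- distinct(mtcars, gear)
```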


**Create a new column called "Performance" which is calculated by hp divided by wt.**

In [9]:
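head(mutate(mtcars, Performance = hp / wt))

| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | Performance |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21.0 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 | 41.98473 |
| 21.0 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 | 38.26087 |
| 22.8 | 4 | 108 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 | 40.08621 |
| 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 | 34.21462 |
| 18.7 | 8 | 360 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 | 50.87209 |
| 18.1 | 6 | 225 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 | 30.34682 |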
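
`transmute()`, listed alongside `mutate()` in the overview, computes the same column but drops everything else. A minimal sketch, with `perf` as an illustrative name:

```r
library(dplyr)

# transmute() returns only the columns it creates, so the result
# has a single Performance column and the same 32 rows as mtcars.
perf <- transmute(mtcars, Performance = hp / wt)
```

This is convenient when only the derived value is needed downstream.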


**Find the mean mpg value using dplyr.**

In [10]:
summarise(mtcars, avg_mpg = mean(mpg))

| avg_mpg |
|---|
| 20.09062 |
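
`summarise()` is not limited to one statistic per call. As a sketch (the output names `max_mpg` and `n_cars` are assumptions chosen for illustration), several summaries can be computed at once, each becoming one column of a single-row result:

```r
library(dplyr)

# Each name = expression pair becomes one column in the one-row result.
# n() is dplyr's row counter and only works inside verbs such as summarise().
stats <- summarise(mtcars,
                   avg_mpg = mean(mpg),
                   max_mpg = max(mpg),
                   n_cars  = n())
```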


**Use pipe operators to get the mean hp value for cars with 6 cylinders.**

In [11]:
mtcars %>% filter(cyl == 6) %>% summarise(avg_hp = mean(hp))

| avg_hp |
|---|
| 122.2857 |
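
The overview lists `sample_n()` and `sample_frac()`, which none of the exercises use. A brief sketch of both follows; the seed value and sample sizes are arbitrary choices for illustration. Which rows come back depends on the seed, so only the row counts are predictable.

```r
library(dplyr)

# Fix the random seed so repeated runs draw the same rows.
set.seed(42)

# sample_n() draws a fixed number of rows.
five_cars <- sample_n(mtcars, 5)

# sample_frac() draws a fraction of the rows: 25% of 32 rows is 8.
quarter <- sample_frac(mtcars, 0.25)
```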
