# Sort & Order
> Sort a data frame using ```order()```

This notebook follows [this tutorial](https://www.guru99.com/r-sort-data-frame.html).

In data analysis, you often need to sort a dataset by one or more of its variables. In R, `sort()` arranges a single vector of continuous or factor values, while ```order()``` returns the positions that would put a vector in order, which is what you use to rearrange the rows of a data frame. Both can sort in ascending or descending order.

Syntax:

```R
sort(x, decreasing = FALSE, na.last = TRUE)
```

Arguments:

* x: A vector holding a continuous or factor variable
* decreasing: Controls the direction of the sort. The default is `FALSE`, which sorts in ascending order
* na.last: Controls where `NA` values go: `TRUE` puts them last, `FALSE` puts them first, and `NA` removes them
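The arguments above can be tried on a short vector. This is a minimal sketch; the vector `x` and the result names are made up for illustration. Note that when `na.last` is not given at all, base R's `sort()` uses `na.last = NA` and drops missing values.

```R
# Hypothetical vector with one missing value
x <- c(3, NA, 1, 2)

# No na.last given: sort() falls back to na.last = NA and drops the NA
asc_default <- sort(x)

# Keep the NA and place it at the end
asc_na_last <- sort(x, na.last = TRUE)

# Descending order, with the NA placed first
desc_na_first <- sort(x, decreasing = TRUE, na.last = FALSE)

print(asc_default)    # 1 2 3
print(asc_na_last)    # 1 2 3 NA
print(desc_na_first)  # NA 3 2 1
```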

## Example 1


### Tibble
For instance, we can create a **tibble** and sort it by one or more variables. A tibble is a modern take on the data frame: it tidies up data frame syntax and avoids frustrating type conversions, most notably the silent conversion of character columns to factors. It is also a convenient way to build a data frame by hand, which is our purpose here. To learn more about tibbles, see the [vignette](https://cran.r-project.org/web/packages/tibble/vignettes/tibble.html).

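```R
library(dplyr)  # dplyr re-exports tibble()
set.seed(1234)  # make the random draws reproducible

# Five columns, each with 50 draws from a normal distribution (mean 5, sd 1.5)
df <- tibble(
    c1 = rnorm(50, 5, 1.5),
    c2 = rnorm(50, 5, 1.5),
    c3 = rnorm(50, 5, 1.5),
    c4 = rnorm(50, 5, 1.5),
    c5 = rnorm(50, 5, 1.5)
)
```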

```R
head(df)
```

| c1 | c2 | c3 | c4 | c5 |
|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 3.189401 | 2.290953 | 5.621785 | 4.434144 | 5.72784 |
| 5.416144 | 4.126886 | 4.287922 | 5.146429 | 6.045153 |
| 6.626662 | 3.336666 | 5.09899 | 7.458117 | 5.278271 |
| 1.481453 | 3.477557 | 4.246283 | 3.686611 | 6.0511 |
| 5.643687 | 4.756536 | 3.761002 | 5.18264 | 5.467522 |
| 5.759084 | 5.844584 | 5.250484 | 7.043196 | 6.140694 |


Sort by column c1, in **ascending** order

```R
# order() returns the row positions that sort c1 ascending; use them to index the rows
df2 <- df[order(df$c1), ]
head(df2, 10)
```

| c1 | c2 | c3 | c4 | c5 |
|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1.481453 | 3.477557 | 4.246283 | 3.686611 | 6.0511003 |
| 1.729941 | 5.824996 | 4.525823 | 6.753663 | 0.1502718 |
| 2.55636 | 6.275348 | 2.524849 | 6.368483 | 5.4787404 |
| 2.827693 | 4.769902 | 5.120089 | 3.743626 | 4.0103449 |
| 2.98851 | 4.395902 | 2.077631 | 4.236894 | 4.617688 |
| 3.122021 | 6.317305 | 5.41384 | 3.551145 | 5.6067027 |
| 3.189401 | 2.290953 | 5.621785 | 4.434144 | 5.7278402 |
| 3.248571 | 6.046413 | 4.451222 | 7.975598 | 3.3836868 |
| 3.339023 | 3.298088 | 7.494285 | 5.930315 | 7.0359117 |
| 3.397036 | 5.382794 | 7.092722 | 0.716362 | 5.6200983 |
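To see what `order()` hands back before it is used as a row index, here is a minimal sketch on a small data frame. The names `toy`, `id`, and `score` are hypothetical and not part of the tutorial data.

```R
# Hypothetical three-row data frame (base R only)
toy <- data.frame(id = c("a", "b", "c"), score = c(2.5, 0.7, 1.9))

# order() returns positions, not values: row 2 has the smallest score, then row 3, then row 1
idx <- order(toy$score)
print(idx)  # 2 3 1

# Index the rows with those positions; the trailing comma keeps every column
toy_sorted <- toy[idx, ]
print(toy_sorted)
```

Because `order()` only produces positions, the same index can be reused to reorder any other object of the same length.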


Sort by column c2, in **descending** order

```R
# decreasing = TRUE reverses the direction, so the largest c2 comes first
df3 <- df[order(df$c2, decreasing = TRUE), ]
head(df3, 10)
```

| c1 | c2 | c3 | c4 | c5 |
|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 3.50242 | 8.823487 | 4.026395 | 4.609041 | 4.399647 |
| 4.254725 | 8.181676 | 5.521328 | 1.882644 | 5.402066 |
| 3.95942 | 8.105406 | 7.736312 | 7.116894 | 5.431565 |
| 4.834572 | 7.665627 | 6.246711 | 5.93695 | 5.633013 |
| 3.716953 | 7.558946 | 5.955012 | 3.81553 | 3.353342 |
| 4.13789 | 7.471726 | 3.655603 | 4.648068 | 7.763695 |
| 4.153322 | 7.408864 | 5.532452 | 3.695325 | 5.048996 |
| 3.633207 | 7.051741 | 5.25354 | 4.89696 | 4.090773 |
| 3.744242 | 6.994347 | 6.009749 | 4.51874 | 4.542918 |
| 4.579065 | 6.50227 | 4.837352 | 5.731722 | 6.066763 |
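There is more than one way to get a descending sort. The sketch below compares three options on a small hypothetical data frame (`toy`, `id`, and `score` are illustrative names). The `dplyr::arrange()` variant is an alternative to the `order()` approach used in this notebook, and it is guarded so the block still runs if dplyr is not installed.

```R
# Hypothetical three-row data frame
toy <- data.frame(id = c("a", "b", "c"), score = c(2.5, 0.7, 1.9))

# Option 1: the decreasing flag, as used in this notebook
by_flag <- toy[order(toy$score, decreasing = TRUE), ]

# Option 2: negate the key; this only works for numeric columns
by_negation <- toy[order(-toy$score), ]

# Option 3: dplyr's arrange() with desc(), run only if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  by_arrange <- dplyr::arrange(toy, dplyr::desc(score))
  print(by_arrange)
}

print(by_flag)
```

Negation is compact, but the `decreasing` flag also works for character and factor columns, so it is the safer default.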


Order by two columns: rows are sorted by the first column, and the second column only breaks ties.

In this case the second key has no effect, because no two rows share the same `c3` value.

```R
# Sort by c3 first; c4 only breaks ties in c3 (decreasing = TRUE applies to both keys)
df4 <- df[order(df$c3, df$c4, decreasing = TRUE), ]
head(df4, 10)
```

| c1 | c2 | c3 | c4 | c5 |
|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 4.339178 | 4.450214 | 8.087243 | 4.501014 | 8.410225 |
| 3.95942 | 8.105406 | 7.736312 | 7.1168936 | 5.431565 |
| 3.339023 | 3.298088 | 7.494285 | 5.9303153 | 7.035912 |
| 3.397036 | 5.382794 | 7.092722 | 0.716362 | 5.620098 |
| 6.653446 | 4.733315 | 6.520536 | 0.9016707 | 4.51341 |
| 4.558559 | 4.712609 | 6.380086 | 6.0562703 | 5.044277 |
| 5.096688 | 3.99555 | 6.273911 | 4.7254238 | 2.589379 |
| 4.834572 | 7.665627 | 6.246711 | 5.9369497 | 5.633013 |
| 5.689384 | 5.97243 | 6.125752 | 2.2666469 | 6.760246 |
| 3.744242 | 6.994347 | 6.009749 | 4.5187401 | 4.542918 |
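A second sort key only matters when the first key contains ties. The following sketch uses a small hypothetical data frame with repeated `group` values (`toy`, `group`, and `value` are illustrative names) to show tie-breaking, and two ways to sort the keys in different directions.

```R
# Hypothetical data frame where the first key has ties
toy <- data.frame(
  group = c("b", "a", "b", "a"),
  value = c(1, 4, 3, 2)
)

# Ascending by group; rows within the same group are ordered by ascending value
both_asc <- toy[order(toy$group, toy$value), ]

# Ascending group, descending value: negate the numeric key
mixed <- toy[order(toy$group, -toy$value), ]

# Same mixed ordering with one decreasing flag per key, which requires method = "radix"
mixed_radix <- toy[order(toy$group, toy$value,
                         decreasing = c(FALSE, TRUE), method = "radix"), ]

print(both_asc$value)     # 2 4 1 3
print(mixed$value)        # 4 2 3 1
print(mixed_radix$value)  # 4 2 3 1
```

The per-key `decreasing` vector is the more general option, since negation is not available for character or factor keys.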
