# Sorting and Ranking

In [1]:
import numpy as np
import pandas as pd
from pandas import DataFrame, Series

- Sorting a data set by some criterion is an important operation.
- The `sort_index()` method sorts by the row or column index in lexicographic (dictionary) order and returns a new, sorted object.

## Sorting
---

### Sorting series

In [2]:
obj = Series(range(4), index=list('dabc'))
obj

d    0
a    1
b    2
c    3
dtype: int64

In [3]:
obj.sort_index()

a    1
b    2
c    3
d    0
dtype: int64

### Sorting DataFrames

In [4]:
frame = DataFrame(np.arange(8).reshape(2, 4),
                  index=['three', 'one'],
                  columns=list('dabc'))
frame

       d  a  b  c
three  0  1  2  3
one    4  5  6  7


In [5]:
frame.sort_index()

       d  a  b  c
one    4  5  6  7
three  0  1  2  3


In [6]:
# sort the columns (axis=1) by label in descending order
frame.sort_index(axis=1, ascending=False)

       d  c  b  a
three  0  3  2  1
one    4  7  6  5
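
The calls above each return a new object and leave `frame` untouched, so they can be chained to sort both axes. A minimal sketch, rebuilding the same `frame`; the names `sorted_rows` and `sorted_both` are only illustrative:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(8).reshape(2, 4),
                     index=['three', 'one'],
                     columns=list('dabc'))

# sort_index() returns a new DataFrame; `frame` keeps its original order
sorted_rows = frame.sort_index()

# chain two calls to sort the rows first, then the columns
sorted_both = frame.sort_index().sort_index(axis=1)

print(list(frame.index))          # original row order
print(list(sorted_both.index))    # sorted row labels
print(list(sorted_both.columns))  # sorted column labels
```

Sorting reorders labels together with their data, so each value stays attached to its row and column label.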


## Ranking

- **Ranking** is closely related to sorting: it assigns ranks from one through the number of valid data points in an array.
- The `rank()` method breaks ties, by default, by assigning each group of tied values the mean of the ranks they would occupy.

In [7]:
obj = Series([7, -5, 7, 4, 2, 0, 4])
obj.rank()

0    6.5
1    1.0
2    6.5
3    4.5
4    3.0
5    2.0
6    4.5
dtype: float64
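
In the output above, the two 7s would occupy positions 6 and 7 in sorted order, so both receive (6 + 7) / 2 = 6.5. As a rough comparison, the sketch below puts the default next to two other tie-breaking rules available through the `method` argument of `rank()`; `'min'` and `'first'` are not used elsewhere in this section and appear here only for illustration.

```python
import pandas as pd

obj = pd.Series([7, -5, 7, 4, 2, 0, 4])

# default (method='average'): tied values share the mean of their positions
mean_ranks = obj.rank()

# method='min': tied values all get the lowest rank in their group
min_ranks = obj.rank(method='min')

# method='first': ties are ranked in the order they appear in the data
first_ranks = obj.rank(method='first')

print(pd.DataFrame({'value': obj,
                    'mean': mean_ranks,
                    'min': min_ranks,
                    'first': first_ranks}))
```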

In [14]:
df = DataFrame({'values': [100, 200, 200, 300, 400, 500]})
df

   values
0     100
1     200
2     200
3     300
4     400
5     500


In [16]:
df['rank'] = df['values'].rank()

By default, ranks are assigned in ascending order, so the smallest value gets rank 1.

In [17]:
df

   values  rank
0     100   1.0
1     200   2.5
2     200   2.5
3     300   4.0
4     400   5.0
5     500   6.0
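
To give rank 1 to the largest value instead, `rank()` accepts `ascending=False`. A small sketch on the same data; the column name `rank_desc` is made up for this example:

```python
import pandas as pd

df = pd.DataFrame({'values': [100, 200, 200, 300, 400, 500]})

# ascending=False ranks from the top: the largest value gets rank 1
df['rank_desc'] = df['values'].rank(ascending=False)

print(df)
```

The tied 200s still share a mean rank; counted from the top they occupy positions 4 and 5, so both get 4.5.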
