## Chapter 11: Sorting Arrays

---
* Author:  [Yuttapong Mahasittiwat](mailto:khala1391@gmail.com)
* Technologist | Data Modeler | Data Analyst
* [YouTube](https://www.youtube.com/khala1391)
* [LinkedIn](https://www.linkedin.com/in/yuttapong-m/)
---

Source: [**Python Data Science Handbook** by **VanderPlas**](https://jakevdp.github.io/PythonDataScienceHandbook/)

In [2]:
import numpy as np
import pandas as pd
print("numpy version :",np.__version__)
print("pandas version :",pd.__version__)

numpy version : 1.26.4
pandas version : 2.2.1


In [6]:
L = [3,1,4,1,5,9,2,6]
sorted(L)  # sorted copy

[1, 1, 2, 3, 4, 5, 6, 9]

In [10]:
L.sort()  # sorts in place and returns None
L

[1, 1, 2, 3, 4, 5, 6, 9]

In [12]:
sorted('Python')

['P', 'h', 'n', 'o', 't', 'y']
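
The difference between the two built-ins is easy to miss in separate cells, so here is a minimal side-by-side sketch. The variable names (`original`, `copy_sorted`, `ret`) are illustrative only and not from the source.

```python
L = [3, 1, 4, 1, 5, 9, 2, 6]

copy_sorted = sorted(L)   # builds a new sorted list
original = list(L)        # snapshot: L is still in its original order here

ret = L.sort()            # reorders L itself; the method returns None

letters = sorted("Python")  # sorted() accepts any iterable and returns a list
print(copy_sorted, original, ret, L, letters)
```

Uppercase `'P'` comes first because characters are compared by code point, and uppercase letters have lower code points than lowercase ones.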

## Fast Sorting: np.sort, np.argsort

In [14]:
x = np.array([2,1,4,3,5])
np.sort(x)  # sorted copy

array([1, 2, 3, 4, 5])

In [20]:
# ndarray method: sorts the array in place
x.sort()
x

array([1, 2, 3, 4, 5])

In [22]:
# argsort returns the indices that would sort the array
x = np.array([2,1,4,3,5])
i = np.argsort(x)
i

array([1, 0, 3, 2, 4], dtype=int64)

In [24]:
x[i]  # fancy indexing

array([1, 2, 3, 4, 5])
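
Since `np.argsort` returns indices rather than values, the same index array can reorder a second array that is aligned with the first. A small sketch, assuming a hypothetical `names` array that is not part of the source:

```python
import numpy as np

scores = np.array([2, 1, 4, 3, 5])
names = np.array(["b", "a", "d", "c", "e"])  # hypothetical labels aligned with scores

order = np.argsort(scores)         # indices that sort scores in ascending order

names_by_score = names[order]      # reorder the labels with the same indices
descending = scores[order[::-1]]   # reversing the index array gives descending order

print(names_by_score)
print(descending)
```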

## Sorting Along Rows or Columns

In [38]:
rng = np.random.default_rng(seed=42)
X = rng.integers(0,10,(4,6))
X

array([[0, 7, 6, 4, 4, 8],
       [0, 6, 2, 0, 5, 9],
       [7, 7, 7, 7, 5, 1],
       [8, 4, 5, 3, 1, 9]], dtype=int64)

In [40]:
# sort each column
np.sort(X, axis=0)

array([[0, 4, 2, 0, 1, 1],
       [0, 6, 5, 3, 4, 8],
       [7, 7, 6, 4, 5, 9],
       [8, 7, 7, 7, 5, 9]], dtype=int64)

In [42]:
# sort each row
np.sort(X, axis=1)

array([[0, 4, 4, 6, 7, 8],
       [0, 0, 2, 5, 6, 9],
       [1, 5, 7, 7, 7, 7],
       [1, 3, 4, 5, 8, 9]], dtype=int64)
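
Note that `np.sort` along an axis treats every row or column as an independent array, so any relationship between values in the same row or column is lost. If whole rows should stay together, one option is to sort by a single column with `np.argsort`. The sketch below hard-codes the values of `X` shown above instead of regenerating them; the choice of column 1 and of `kind="stable"` is an assumption made for illustration.

```python
import numpy as np

# Same values as X above, written out explicitly
X = np.array([[0, 7, 6, 4, 4, 8],
              [0, 6, 2, 0, 5, 9],
              [7, 7, 7, 7, 5, 1],
              [8, 4, 5, 3, 1, 9]])

# Order of the rows when ranked by their value in column 1;
# a stable sort keeps tied rows in their original relative order
order = np.argsort(X[:, 1], kind="stable")

# Fancy indexing with the row order moves whole rows, keeping each row intact
X_by_col1 = X[order]
print(X_by_col1)
```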

## Partial Sorts: Partitioning

In [36]:
x = np.array([7,2,3,1,6,5,4])
np.partition(x,3)

array([2, 1, 3, 4, 6, 5, 7])
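
What `np.partition(x, 3)` guarantees can be checked directly: the element at index 3 is the one a full sort would place there, everything to its left is no larger, and everything to its right is no smaller. The order within each side is arbitrary. A minimal sketch of that check (the names `k`, `p`, `left_ok`, `right_ok` are illustrative):

```python
import numpy as np

x = np.array([7, 2, 3, 1, 6, 5, 4])
k = 3
p = np.partition(x, k)

# The pivot position holds the same value a full sort would put there
pivot_matches_sort = bool(p[k] == np.sort(x)[k])

# Left side is <= pivot, right side is >= pivot (order inside each side is arbitrary)
left_ok = bool(np.all(p[:k] <= p[k]))
right_ok = bool(np.all(p[k:] >= p[k]))

# Sorting only the left side recovers the k smallest values in order
smallest_k = np.sort(p[:k])
print(pivot_matches_sort, left_ok, right_ok, smallest_k)
```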

In [46]:
np.partition(X,2,axis=1)  # partition within each row (along axis 1)

array([[0, 4, 4, 7, 6, 8],
       [0, 0, 2, 6, 5, 9],
       [1, 5, 7, 7, 7, 7],
       [1, 3, 4, 5, 8, 9]], dtype=int64)

In [50]:
np.partition(X,2,axis=0)  # partition within each column (along axis 0)

array([[0, 4, 2, 0, 1, 1],
       [0, 6, 5, 3, 4, 8],
       [7, 7, 6, 4, 5, 9],
       [8, 7, 7, 7, 5, 9]], dtype=int64)
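
Just as `np.argsort` returns the indices of a sort, `np.argpartition` returns the indices of a partition. A short sketch on the 1-D array from earlier; the variable names are illustrative, and the order of the returned indices is not guaranteed, so they are sorted before printing.

```python
import numpy as np

x = np.array([7, 2, 3, 1, 6, 5, 4])
k = 3

# Indices of the k smallest values, in arbitrary order
idx = np.argpartition(x, k)[:k]

# Fancy indexing recovers the values themselves
smallest = x[idx]

print(np.sort(idx), np.sort(smallest))
```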

## Example: k-Nearest Neighbors