# Alphamind Beginner's Guide, Part 2: Factor Ranking and Quantiles

The `data` folder of alpha-mind provides utility functions for ranking factor data and assigning factor values to quantile groups.

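### 1. Factor Ranking: *rank*
- Sorts values in ascending order and returns each value's rank, counted from 0 (the smallest value gets rank 0).
- Ranking can be done over the whole universe or separately within each industry (group).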

In [1]:
import numpy as np
import pandas as pd
from alphamind.data.rank import rank

# Suppose there are 10 stocks with 2 factors each, forming a 10 x 2 matrix
factors = pd.DataFrame(np.random.rand(10, 2))
factors.columns = ['factor_1', 'factor_2']
factors['rank_1'] = rank(factors['factor_1'].values)
factors['rank_2'] = rank(factors['factor_2'].values)

factors


| | factor_1 | factor_2 | rank_1 | rank_2 |
|---|---|---|---|---|
| 0 | 0.3679 | 0.66722 | 2.0 | 5.0 |
| 1 | 0.748232 | 0.522733 | 6.0 | 3.0 |
| 2 | 0.550754 | 0.41541 | 5.0 | 2.0 |
| 3 | 0.165541 | 0.144597 | 1.0 | 0.0 |
| 4 | 0.089515 | 0.912614 | 0.0 | 8.0 |
| 5 | 0.536981 | 0.901607 | 4.0 | 7.0 |
| 6 | 0.943403 | 0.967978 | 9.0 | 9.0 |
| 7 | 0.415528 | 0.87031 | 3.0 | 6.0 |
| 8 | 0.831134 | 0.144879 | 8.0 | 1.0 |
| 9 | 0.811315 | 0.55429 | 7.0 | 4.0 |
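
To make the meaning of the rank values concrete, the following is a minimal NumPy sketch of a 0-based ascending rank. It is not alpha-mind's implementation: `rank_sketch` is a hypothetical helper written for illustration, and it assumes a 1-D array without ties or missing values.

```python
import numpy as np


def rank_sketch(x):
    """Hypothetical helper: 0-based ascending rank of each element of a 1-D array."""
    order = np.argsort(x, kind="stable")   # indices that would sort x ascending
    ranks = np.empty(len(x), dtype=int)
    ranks[order] = np.arange(len(x))       # write each element's sorted position back
    return ranks


# First five factor_1 values from the output above, ranked among themselves
values = np.array([0.3679, 0.748232, 0.550754, 0.165541, 0.089515])
print(rank_sketch(values))  # [2 4 3 1 0]
```

The smallest value (0.089515) receives rank 0 and the largest (0.748232) receives rank 4, which mirrors the 0-based convention visible in the table.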


In [2]:
# Suppose there are 10 stocks with 1 factor each
factors = pd.DataFrame(np.random.rand(10, 1))
factors.columns = ['factor_1']

# Suppose these 10 stocks fall into two industries: the first 5 in one, the last 5 in the other
industry = np.concatenate([np.array([1.0]*5), np.array([2.0]*5)])

factors['rank'] = rank(factors['factor_1'].values, groups=industry)
factors

| | factor_1 | rank |
|---|---|---|
| 0 | 0.969648 | 4 |
| 1 | 0.776467 | 3 |
| 2 | 0.283248 | 0 |
| 3 | 0.408707 | 1 |
| 4 | 0.502428 | 2 |
| 5 | 0.435588 | 3 |
| 6 | 0.656777 | 4 |
| 7 | 0.378913 | 2 |
| 8 | 0.294724 | 1 |
| 9 | 0.069706 | 0 |
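
As a cross-check of the grouped result, the same within-industry ranks can be reproduced with plain pandas. This is only a sketch under the assumption that grouped ranking means "rank each stock among the members of its own group"; it uses `groupby(...).rank()` from pandas rather than anything from alpha-mind, and the column name `industry` is chosen here for illustration.

```python
import numpy as np
import pandas as pd

# factor_1 values copied from the output above
factor = np.array([0.969648, 0.776467, 0.283248, 0.408707, 0.502428,
                   0.435588, 0.656777, 0.378913, 0.294724, 0.069706])
industry = np.array([1.0] * 5 + [2.0] * 5)

df = pd.DataFrame({"factor_1": factor, "industry": industry})

# pandas ranks start at 1, so subtract 1 to get the 0-based convention
df["rank"] = (df.groupby("industry")["factor_1"].rank(method="first") - 1).astype(int)
print(df["rank"].tolist())  # [4, 3, 0, 1, 2, 3, 4, 2, 1, 0]
```

Each industry restarts its ranking at 0, so a rank is only comparable with other ranks from the same group.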


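### 2. Factor Quantiles: *quantile*
- Given a number of groups *(n_bins)*, splits the factor values into groups in ascending order and returns the index of the group each value belongs to (group 0 holds the smallest values).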

In [3]:
from alphamind.data.quantile import quantile

factors['quantile'] = quantile(factors['factor_1'].values, n_bins=5)
factors

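| | factor_1 | rank | quantile |
|---|---|---|---|
| 0 | 0.969648 | 4 | 4 |
| 1 | 0.776467 | 3 | 4 |
| 2 | 0.283248 | 0 | 0 |
| 3 | 0.408707 | 1 | 2 |
| 4 | 0.502428 | 2 | 3 |
| 5 | 0.435588 | 3 | 2 |
| 6 | 0.656777 | 4 | 3 |
| 7 | 0.378913 | 2 | 1 |
| 8 | 0.294724 | 1 | 1 |
| 9 | 0.069706 | 0 | 0 |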
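
The quantile labels above are consistent with equal-count bucketing of the overall ranks. The sketch below illustrates that idea in NumPy; `quantile_sketch` is a hypothetical helper, not alpha-mind's code, and how the library treats ties or a sample size that is not divisible by `n_bins` is not covered here and may differ.

```python
import numpy as np


def quantile_sketch(x, n_bins):
    """Hypothetical helper: assign each value to one of n_bins equal-count groups."""
    n = len(x)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)            # 0-based ascending rank over all values
    return ranks * n_bins // n             # map rank 0..n-1 onto bin 0..n_bins-1


# factor_1 values copied from the output above
factor = np.array([0.969648, 0.776467, 0.283248, 0.408707, 0.502428,
                   0.435588, 0.656777, 0.378913, 0.294724, 0.069706])
print(quantile_sketch(factor, n_bins=5))  # [4 4 0 2 3 2 3 1 1 0]
```

With 10 stocks and `n_bins=5`, each group holds two stocks: the two smallest values land in group 0 and the two largest in group 4, matching the `quantile` column.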
