In [None]:
import pandas as pd
pd.set_option("display.max_columns", None)

# [Scale and Transform](https://pycaret.gitbook.io/docs/get-started/preprocessing/scale-and-transform)

## Normalize <a id="normalize"></a>

Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to rescale the values of numeric columns in the dataset without distorting differences in the ranges of values or losing information. Several normalization methods are available; by default, PyCaret uses `zscore`.

### PARAMETERS

- **normalize**: bool, default = False
  - When set to `True`, the feature space is transformed using the method defined by the `normalize_method` parameter.
- **normalize_method**: string, default = 'zscore'
  - Defines the method to be used for normalization. By default, the method is set to `zscore`. The available options are:
    - **`zscore`**: computes the standard z-score, `z = (x - u) / s`, where `u` is the mean and `s` is the standard deviation of the feature.
    - **`minmax`**: scales and translates each feature individually so that it lies in the range `0` to `1`.
    - **`maxabs`**: scales each feature individually so that its maximal absolute value is `1.0`. It does not shift or center the data, and thus does not destroy any sparsity.
    - **`robust`**: scales and translates each feature according to the interquartile range. When the dataset contains outliers, the robust scaler often gives better results.

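The four options can be illustrated with a minimal NumPy sketch. It assumes the options follow the usual scikit-learn scaler definitions (population standard deviation for `zscore`, median and 25th to 75th percentile range for `robust`); the function names and the sample column `x` are hypothetical and are not part of PyCaret's API.

```python
import numpy as np

def zscore(x):
    # z = (x - u) / s, with u the mean and s the population standard deviation
    return (x - x.mean()) / x.std()

def minmax(x):
    # Map the minimum to 0 and the maximum to 1
    return (x - x.min()) / (x.max() - x.min())

def maxabs(x):
    # Divide by the largest absolute value; no shifting, so zeros stay zero
    return x / np.abs(x).max()

def robust(x):
    # Center on the median and divide by the interquartile range (assumed 25%-75%)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return (x - median) / (q3 - q1)

# Hypothetical feature column containing one outlier (1000.0)
x = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])

scaled = {f.__name__: f(x) for f in (zscore, minmax, maxabs, robust)}
for name, values in scaled.items():
    print(f"{name:>7}: {np.round(values, 3)}")
```

With the outlier present, `zscore`, `minmax` and `maxabs` compress the four small values into a narrow band, while `robust` keeps them spread between -1 and 0.5. This is the behavior the `robust` description above refers to.
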
### Example

In [1]:
# load dataset
from pycaret.datasets import get_data
pokemon = get_data('pokemon')

# init setup
from pycaret.classification import *
clf1 = setup(
    data=pokemon,
    target='Legendary',
    normalize=True
)

Unnamed: 0,Description,Value
0,session_id,7843
1,Target,Legendary
2,Target Type,Binary
3,Label Encoded,"False: 0, True: 1"
4,Original Data,"(800, 13)"
5,Missing Values,True
6,Numeric Features,8
7,Categorical Features,4
8,Ordinal Features,False
9,High Cardinality Features,False


In [2]:
display(pokemon)
get_config("X")

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True


Unnamed: 0,#,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Name_Abomasnow,Name_AbomasnowMega Abomasnow,...,Type 2_Rock,Type 2_Steel,Type 2_Water,Type 2_not_available,Generation_1,Generation_2,Generation_3,Generation_4,Generation_5,Generation_6
0,-1.718623,-0.941105,-0.895726,-0.911126,-0.761737,-0.223239,-0.204259,-0.818617,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
1,-1.713838,-0.211094,-0.323529,-0.505791,-0.306949,0.239062,0.342759,-0.288447,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
2,-1.709053,0.795817,0.439401,0.117802,0.342747,0.855464,1.072117,0.418445,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
3,-1.709053,1.634910,0.439401,0.679036,1.642138,1.533506,1.801475,0.418445,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
4,-1.704268,-1.016623,-1.124605,-0.817587,-0.956645,-0.377339,-0.751278,-0.111724,0.0,0.0,...,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,1.716825,1.425136,-0.704994,0.679036,2.519228,0.855464,2.895512,-0.641894,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
796,1.716825,2.264229,-0.704994,2.549814,1.219836,2.704669,1.436796,1.478783,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
797,1.721610,1.425136,0.439401,0.990832,-0.404404,2.396468,2.166154,0.064999,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
798,1.721610,2.096411,0.439401,2.549814,-0.404404,3.012870,2.166154,0.418445,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0

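A quick way to sanity-check a `zscore` transformation is to confirm that each standardized column has a mean close to 0 and a standard deviation close to 1 on the data the statistics were computed from. The sketch below does this in plain pandas on the five rows of `HP`, `Attack` and `Defense` visible in the first table. It is an illustration only: it does not call PyCaret, the use of the population standard deviation (`ddof=0`) is an assumption, and the values will not match the transformed table above, whose statistics come from a much larger set of rows.

```python
import pandas as pd

# First five rows of three numeric columns, copied from the table above
sample = pd.DataFrame({
    "HP": [45, 60, 80, 80, 39],
    "Attack": [49, 62, 82, 100, 52],
    "Defense": [49, 63, 83, 123, 43],
})

# z = (x - u) / s per column; ddof=0 (population std) is an assumption
standardized = (sample - sample.mean()) / sample.std(ddof=0)

# Per-column mean and standard deviation after the transformation
summary = pd.DataFrame({
    "mean": standardized.mean(),
    "std": standardized.std(ddof=0),
}).round(6)
print(summary)
```

Equal inputs map to equal outputs: rows 2 and 3 both have `HP` = 80 and share the same standardized value, just as they share 0.439401 in the transformed table.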

#### Effect of Normalization: