## Experiment Process for Building the Best Model

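What kinds of experiments are possible?
- Training experiments
    - Check model performance while changing the data
    - Check training results while updating a step's code
    - Check model performance while changing a step's parameters
    - Retrain only a specific step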
<br><br>
 
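- Inference experiments
    - Run inference with the model from a previous training experiment
    - Check inference results while changing the data with a fixed model
    - Check inference results while changing the parameters with a fixed model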

<br>

**The goal is to find the model that produces the best inference results.**

### Step 0. Prerequisites

##### Prerequisite 1: Register the conda environment as a Jupyter kernel
```bash
    conda activate {ENV-NAME}           ## virtual environment used to run python main.py
    pip install ipykernel        
    python -m ipykernel install --user --name {ENV-NAME} --display-name {IPYKERNEL-NAME}
```

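##### Prerequisite 2: Save a summary from your Asset
In your Asset, call `self.asset.save_summary()` (an ALO API) to save the training and inference results.
- Depending on whether it is called in train_pipeline or inference_pipeline, the summary is stored separately in the train history or the inference history.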

```python 
# {solution_name}/assets/output/asset_output.py
summary = {}
# Example values; in a real asset they would come from the model.
summary['result'] = 'OK'                                         # e.g. model.predict()        ## Mandatory
summary['score'] = 0.98                                          # e.g. model.predict_proba()  ## Mandatory
summary['note'] = "The score is the probability of the model's inference prediction."  ## Mandatory
summary['probability'] = {'OK': 0.65, 'NG1': 0.25, 'NG2': 0.1}   # e.g. model.predict_proba()  ## Optional

self.asset.save_summary(result=summary['result'], score=summary['score'], note=summary['note'], probability=summary['probability'])
```
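Because three of the four fields are mandatory, it can help to validate the summary before handing it to `save_summary()`. The sketch below is a minimal, hypothetical helper (`build_summary` is not part of ALO). It assumes the score is a probability in [0, 1], as the note in the example above suggests.

```python
from typing import Optional


def build_summary(result: str, score: float, note: str,
                  probability: Optional[dict] = None) -> dict:
    """Hypothetical helper (not an ALO API): check the fields before saving."""
    if not result:
        raise ValueError("'result' is mandatory")
    if not note:
        raise ValueError("'note' is mandatory")
    # Assumption: the score is a probability, so it must lie in [0, 1].
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"'score' must be in [0, 1], got {score}")
    return {
        "result": result,
        "score": score,
        "note": note,
        "probability": probability or {},   # optional field
    }


summary = build_summary(
    "OK", 0.98, "The score is the probability of the predicted result.",
    {"OK": 0.65, "NG1": 0.25, "NG2": 0.1},
)
```

Inside an asset, the returned dict could then be passed on as `self.asset.save_summary(**summary)`, since its keys match the keyword arguments used above.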

In [None]:
## pandas is used to display the history table
%pip install pandas

In [1]:
from src.alo import ALO
import os 
from pprint import pprint
import pandas as pd

alo = ALO()
train_pipeline = alo.pipeline(pipes='train_pipeline')

[INFO][PROCESS][2024-03-09 16:21:30,204]: alolib already exists in local path.
[INFO][PROCESS][2024-03-09 16:21:31,370]: Success installing alolib requirements.txt

[INFO][PROCESS][2024-03-09 16:21:31,377]: Successfully loaded experimental plan yaml: 
 /home/sehyun.song/Project/alo/dev-240228/solution/experimental_plan.yaml
[INFO][PROCESS][2024-03-09 16:21:31,380]: Successfully loaded << experimental_plan.yaml >> from: 
 /home/sehyun.song/Project/alo/dev-240228/solution/experimental_plan.yaml
[INFO][META][2024-03-09 16:21:31,398]: Loaded solution_metadata: 
None

[INFO][PROCESS][2024-03-09 16:21:31,401]: Process start-time: 240309_162131
[INFO][META][2024-03-09 16:21:31,403]: ALO version = develop


### Step 1. Run the training process
- setup() : clones the code for each step from its git repository and installs the required Python packages.
- load() : loads external_data into the ALO environment.
- run() : executes the steps in pipeline order.
- save() : saves the training and inference results to `*_artifacts` and backs up the experiment to history.


The history folder stores the following contents.


```bash
├── history 
     ├── train
     │   ├── {UTC}-{random}-{contents_name}
     │        ├── experimental_plan.yaml
     │        ├── register_source    ## sources for building the train docker container
     │            ├── assets
     │            ├── solution
     │            ├── src
     │            ├── alolib
     │        ├── models
     │        ├── score
     │        ├── report
     │        ├── output
     │        ├── log
     │            ├── experimental_history.json  ## records changes to training data, code, and parameters
     │
     ├── inference
     │   ├── {UTC}-{random}-{contents_name}
     │        ├── experimental_plan.yaml
     │        ├── register_source    ## sources for building the inference docker container
     │            ├── assets
     │            ├── solution
     │            ├── src
     │            ├── alolib
     │        ├── score
     │        ├── extra_output
     │        ├── output
     │        ├── log
     │            ├── experimental_history.json  ## records changes to inference data, code, and parameters
 ```
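Given this layout, past experiments can be enumerated straight from disk. The following is a minimal sketch using only the standard library. `list_experiments` is a hypothetical helper, and because the schema of `experimental_history.json` is not described here, the file is simply parsed and returned as-is.

```python
import json
from pathlib import Path


def list_experiments(history_root, kind="train"):
    """Return {experiment_id: parsed experimental_history.json, or None if absent}."""
    experiments = {}
    kind_dir = Path(history_root) / kind          # history/train or history/inference
    if not kind_dir.is_dir():
        return experiments
    for exp_dir in sorted(kind_dir.iterdir()):
        if not exp_dir.is_dir():
            continue
        record = exp_dir / "log" / "experimental_history.json"
        experiments[exp_dir.name] = (
            json.loads(record.read_text(encoding="utf-8")) if record.is_file() else None
        )
    return experiments
```

For example, `list_experiments("./history", kind="inference")` would return one entry per folder under `history/inference`.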

In [8]:
train_pipeline.setup()
train_pipeline.load()
train_pipeline.run()
train_pipeline.save()

[INFO][PROCESS][2024-03-09 16:21:54,924]: Successfully emptied << /home/sehyun.song/Project/alo/dev-240228/train_artifacts/score >> 
[INFO][PROCESS][2024-03-09 16:21:54,935]: Successfully emptied << /home/sehyun.song/Project/alo/dev-240228/train_artifacts/models >> 
[INFO][PROCESS][2024-03-09 16:21:54,937]: Successfully emptied << /home/sehyun.song/Project/alo/dev-240228/train_artifacts/output >> 
[INFO][PROCESS][2024-03-09 16:21:54,940]: Successfully emptied << /home/sehyun.song/Project/alo/dev-240228/train_artifacts/report >> 
[INFO][PROCESS][2024-03-09 16:21:54,942]: Start setting-up << input >> asset @ << assets >> directory.
[INFO][PROCESS][2024-03-09 16:21:54,945]: << input >> asset had already been created at 2024-03-06 23:43:57.387463
[INFO][PROCESS][2024-03-09 16:21:54,947]: Start setting-up << train >> asset @ << assets >> directory.
[INFO][PROCESS][2024-03-09 16:21:54,949]: << train >> as

Folder '/home/sehyun.song/Project/alo/dev-240228/.package_list/train_pipeline/' has been removed.
Folder '/home/sehyun.song/Project/alo/dev-240228/.package_list/train_pipeline/' has been created.
SSH2


[INFO][ASSET][2024-03-09 16:21:55,176][train_pipeline][train]: Successfully got model path for saving or loading your AI model: 
 /home/sehyun.song/Project/alo/dev-240228/train_artifacts/models/train/
[INFO][ASSET][2024-03-09 16:21:55,186][train_pipeline][train]: Successfully saved inference summary yaml. 
 >> /home/sehyun.song/Project/alo/dev-240228/train_artifacts/score/train_summary.yaml
[INFO][ASSET][2024-03-09 16:21:55,188][train_pipeline][train]: 

- time (UTC)        : 2024-03-09 07:21:55
- current step      : train
- save config. keys : dict_keys(['meta', 'x_columns', 'y_column', 'model_path'])
- save data keys    : dict_keys(['dataframe0'])

[INFO][PROCESS][2024-03-09 16:21:55,195]: None of external path is written in your experimental_plan.yaml. Skip saving artifacts into external path. 

[INFO][PROCESS][2024-03-09 16:21:55,197]: Process finish-time: 2024-03-09 16:21:55


SHS


[INFO][PROCESS][2024-03-09 16:21:55,618]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/main.py "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-13050773-demo-titanic/register_source/ " 
[INFO][PROCESS][2024-03-09 16:21:55,626]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/src "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-13050773-demo-titanic/register_source/ " 
[INFO][PROCESS][2024-03-09 16:21:55,658]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/assets "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-13050773-demo-titanic/register_source/ " 
[INFO][PROCESS][2024-03-09 16:21:55,679]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/solution "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-13050773-demo-titanic/register_source/ " 
[INFO][PROCES

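Check the experiment results.
- "data_id"    : changes if any part of the data used in the experiment changes. Use it to compare results across experiments run on the same data.
- "code_id"    : changes if any part of the step code changes (even adding a single print statement changes the ID). Use it to compare results across experiments run on the same code.
- "param_id"   : changes when a parameter is added or a parameter value changes.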
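How ALO derives these IDs is not shown in this notebook. Purely as an illustration of the idea, the sketch below assumes a content hash: SHA-256 truncated to 12 hex characters, which matches the length of IDs such as `7732711228ca` but is otherwise a guess. `content_id` and `param_id` are hypothetical helpers.

```python
import hashlib
import json


def content_id(payload: bytes, length: int = 12) -> str:
    """Hypothetical: derive a short ID that changes whenever the bytes change."""
    return hashlib.sha256(payload).hexdigest()[:length]


def param_id(params: dict) -> str:
    # Serialize with sorted keys so that key order does not affect the ID.
    return content_id(json.dumps(params, sort_keys=True).encode("utf-8"))


code_v1 = b"def train():\n    fit()\n"
code_v2 = b"def train():\n    print('debug')\n    fit()\n"   # only a print added

same = content_id(code_v1) == content_id(code_v1)      # identical code, identical ID
changed = content_id(code_v1) != content_id(code_v2)   # a one-line change gives a new ID
```

Under this scheme two experiments share an ID only when the hashed content is byte-for-byte identical, which is the behavior the list above describes for `code_id`.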

In [10]:
table_list = train_pipeline.history()
display(pd.DataFrame(table_list).head(20))

| | id | start_time | end_time | score | result | note | probability | version | data_id | code_id | param_id |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20240309T072131Z-13050773-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:55 | 0.2 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:55) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 1 | 20240309T072131Z-53505915-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:53 | 0.85 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:53) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 2 | 20240309T072131Z-92229745-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:51 | 0.55 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:51) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 3 | 20240309T072131Z-34778465-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:35 | 0.58 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:35) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |


### Step 2. Experiment with different data

In [13]:
new_data = './solution/sample_data2/train_data/'  ## change this to the data needed for your experiment

train_pipeline.load(data_path=new_data)
train_pipeline.run()
train_pipeline.save()

                                     you have to write the aws_key_profile or set << AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY >> in your os environment. 

[INFO][PROCESS][2024-03-09 16:23:18,105]: Successfuly removed << /home/sehyun.song/Project/alo/dev-240228/input/train/ >> before loading external data.
[INFO][PROCESS][2024-03-09 16:23:18,107]: << ./solution/sample_data2/train_data/ >> may be relative path. The reference folder of relative path is << /home/sehyun.song/Project/alo/dev-240228/ >>. 
 If this is not appropriate relative path, Loading external data process would raise error.
[INFO][PROCESS][2024-03-09 16:23:18,110]: Successfully done loading external data: 
 ./solution/sample_data2/train_data/ --> /home/sehyun.song/Project/alo/dev-240228/input/train/
[INFO][PROCESS][2024-03-09 16:23:18,113]: Successfuly finish loading << ./solution/sample_data2/train_data/ >> into << /home/sehyun.song/Project/alo/dev-240228/input/ >>
[INF

SSH2


[INFO][ASSET][2024-03-09 16:23:18,328][train_pipeline][train]: Successfully saved inference summary yaml. 
 >> /home/sehyun.song/Project/alo/dev-240228/train_artifacts/score/train_summary.yaml
[INFO][ASSET][2024-03-09 16:23:18,330][train_pipeline][train]: 

- time (UTC)        : 2024-03-09 07:23:18
- current step      : train
- save config. keys : dict_keys(['meta', 'x_columns', 'y_column', 'model_path'])
- save data keys    : dict_keys(['dataframe0'])

[INFO][PROCESS][2024-03-09 16:23:18,337]: None of external path is written in your experimental_plan.yaml. Skip saving artifacts into external path. 

[INFO][PROCESS][2024-03-09 16:23:18,339]: Process finish-time: 2024-03-09 16:23:18


SHS


[INFO][PROCESS][2024-03-09 16:23:18,789]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/main.py "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-27087406-demo-titanic/register_source/ " 
[INFO][PROCESS][2024-03-09 16:23:18,796]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/src "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-27087406-demo-titanic/register_source/ " 
[INFO][PROCESS][2024-03-09 16:23:18,829]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/assets "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-27087406-demo-titanic/register_source/ " 
[INFO][PROCESS][2024-03-09 16:23:18,849]: [INFO] copy from " /home/sehyun.song/Project/alo/dev-240228/solution "  -->  " /home/sehyun.song/Project/alo/dev-240228/history/train/20240309T072131Z-27087406-demo-titanic/register_source/ " 
[INFO][PROCES

In [19]:
table_list = train_pipeline.history()
display(pd.DataFrame(table_list).head(20))

| | id | start_time | end_time | score | result | note | probability | version | data_id | code_id | param_id |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20240309T072131Z-27087406-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:23:18 | 0.99 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:23:18) | {} | | e72889ca48a6 | 8a3cb8f9217c | 80885fb84940 |
| 1 | 20240309T072131Z-03916014-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:22:44 | 0.56 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:22:44) | {} | | e72889ca48a6 | 8a3cb8f9217c | 80885fb84940 |
| 2 | 20240309T072131Z-13050773-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:55 | 0.2 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:55) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 3 | 20240309T072131Z-53505915-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:53 | 0.85 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:53) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 4 | 20240309T072131Z-92229745-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:51 | 0.55 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:51) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 5 | 20240309T072131Z-34778465-demo-titanic | 2024-03-09 07:21:31 | 2024-03-09 16:21:35 | 0.58 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 07:21:35) | {} | | 7732711228ca | 8a3cb8f9217c | 80885fb84940 |
| 6 | 20240309T065929Z-03426922-demo-titanic | 2024-03-09 06:59:29 | 2024-03-09 15:59:33 | 0.64 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 06:59:33) | {} | | 0000000017c20530 | 8a3cb8f9217c | 80885fb84940 |
| 7 | 20240309T031352Z-04407409-demo-titanic | 2024-03-09 03:13:52 | 2024-03-09 12:16:20 | 0.25 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-09 03:16:20) | {} | | | 9484828294349639189 | 16505050446478898296 |
| 8 | 20240308T180553Z-88428745-demo-titanic | 2024-03-08 18:05:53 | 2024-03-09 03:20:40 | 0.98 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-08 18:20:40) | {} | | 398591280 | 9484828294349639189 | 3860141117657530653 |
| 9 | 20240308T131729Z-91458146-demo-titanic | 2024-03-08 13:17:29 | 2024-03-08 22:17:31 | 0.35 | precision: 0.796594261196031 | Test Titanic-demo (date: 2024-03-08 13:17:31) | {} | | 398591280 | 9484828294349639189 | 3860141117657530653 |
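Scores are easiest to compare between runs that used the same data, so it can help to group the history by `data_id` first. Here is a minimal standard-library sketch, assuming each history row is a dict with `id`, `score`, and `data_id` keys as in the table columns above. `best_per_data_id` is a hypothetical helper, and the sample rows are copied from that table.

```python
def best_per_data_id(rows):
    """Return {data_id: row with the highest score}, skipping rows without a data_id."""
    best = {}
    for row in rows:
        key = row.get("data_id")
        if not key:                      # run recorded without a data_id
            continue
        if key not in best or row["score"] > best[key]["score"]:
            best[key] = row
    return best


rows = [
    {"id": "20240309T072131Z-27087406-demo-titanic", "score": 0.99, "data_id": "e72889ca48a6"},
    {"id": "20240309T072131Z-03916014-demo-titanic", "score": 0.56, "data_id": "e72889ca48a6"},
    {"id": "20240309T072131Z-13050773-demo-titanic", "score": 0.2, "data_id": "7732711228ca"},
    {"id": "20240309T072131Z-53505915-demo-titanic", "score": 0.85, "data_id": "7732711228ca"},
]
best = best_per_data_id(rows)
```

The same grouping could be applied to `code_id` or `param_id` by changing the key.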


### Step 3. Run with modified code

In [None]:
train_pipeline.run()
train_pipeline.save()

In [None]:
table_list = train_pipeline.history()
display(pd.DataFrame(table_list).head(20))

### Step 4. Run with modified parameters

In [None]:
train_pipeline.get_parameter('train')['n_estimators'] = 150
pprint(train_pipeline.get_parameter('train'))  ## confirm the updated parameters
train_pipeline.run()
train_pipeline.save()

### Step 4-2: Change parameters and repeatedly run only a specific step

In [None]:
for value in [100, 110, 120, 130]:
    train_pipeline.get_parameter('train')['n_estimators'] = value
    train_pipeline.run(steps = ['train'])
    train_pipeline.save()
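The sweep in Step 4-2 can be wrapped in a small helper so that other steps or parameters reuse the same pattern. This is a sketch only: `sweep_parameter` is hypothetical and relies solely on the `get_parameter()`, `run(steps=...)`, and `save()` calls used in this notebook, while `FakePipeline` is a stand-in so the example runs without ALO.

```python
def sweep_parameter(pipeline, step, name, values):
    """Hypothetical helper: rerun a single step once per parameter value."""
    for value in values:
        pipeline.get_parameter(step)[name] = value
        pipeline.run(steps=[step])
        pipeline.save()          # one save per value, as in the loop of Step 4-2


class FakePipeline:
    """Stand-in for an ALO pipeline, for illustration only."""

    def __init__(self):
        self.params = {"train": {"n_estimators": 100}}
        self.saved = []
        self.last_steps = None

    def get_parameter(self, step):
        return self.params[step]

    def run(self, steps=None):
        self.last_steps = steps

    def save(self):
        # Record a copy of the parameters that were active for this run.
        self.saved.append(dict(self.params["train"]))


fake = FakePipeline()
sweep_parameter(fake, "train", "n_estimators", [100, 110, 120, 130])
```

With a real pipeline the call would look like `sweep_parameter(train_pipeline, 'train', 'n_estimators', [100, 110, 120, 130])`.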

In [None]:
table_list = train_pipeline.history()
display(pd.DataFrame(table_list).head(20))

### Step 5: Check inference results (data, code, and parameters can be changed)

In [None]:
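train_id = '20240308T131729Z-91458146-demo-titanic'  ## ID of a previous training experiment, taken from the train history table
# train_id = ''  ## alternative kept from the original notebook; uncommenting it overrides the ID above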
inference_pipeline = alo.pipeline(pipes='inference_pipeline', train_id=train_id)

In [None]:
inference_pipeline.load()
inference_pipeline.system_envs['inference_history']


In [None]:
inference_pipeline.setup()
inference_pipeline.load()
inference_pipeline.run()
inference_pipeline.save()

In [None]:
table_list = inference_pipeline.history()
display(pd.DataFrame(table_list).head(20))

### Register experiment results (enter train_id and inference_id)

In [None]:
import getpass

username = input('Username: ')
password = getpass.getpass('Password: ')

print("Your ID : ", username)
print("Your PW : ", '*' * len(password))  ## mask the password when echoing it

In [None]:
infra_setup = "./setting/example_infra_config/infra_config.localtest.yaml"
solution_info = {
    'inference_only': False,  # True, False
    'solution_update': False,
    'solution_name': 'titanic-exp-test1',
    'solution_type': 'private',
    'contents_type': {
        'support_labeling': False,
        'inference_result_datatype': 'table',  # 'image'
        'train_datatype': 'table',  # 'image'
        'labeling_column_name': '',
    },
    'train_gpu': False,  ## cpu, gpu
    'inference_gpu': False,
    'inference_arm': False,  # amd, arm
}


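Look up the IDs in the history table and enter them to register the solution.
- Training experiments and inference experiments can be mixed and matched when registering a solution.
- If train_id or inference_id is None, the most recent experiment result is registered.
    - The pipeline is run once using the experimental_plan.yaml stored in history, and that run is registered as the solution.
    - Keep in mind that this extra run can pollute the history.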
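Since a missing ID triggers an extra run that can pollute the history, it may be worth checking both IDs before calling `alo.register()`. The sketch below assumes the `history/{train,inference}/{id}` folder layout shown in Step 1 and treats an empty string like `None`. `check_history_ids` is a hypothetical helper, not an ALO API.

```python
from pathlib import Path


def check_history_ids(history_root, train_id, inference_id):
    """Raise if an ID is empty or has no matching folder under history/."""
    for kind, exp_id in (("train", train_id), ("inference", inference_id)):
        if not exp_id:
            raise ValueError(f"{kind}_id is empty; pick one from the {kind} history table")
        if not (Path(history_root) / kind / exp_id).is_dir():
            raise FileNotFoundError(f"no {kind} experiment named {exp_id!r} in history")
```

A call such as `check_history_ids('./history', train_id, inference_id)` placed before registration would then fail fast on a typo instead of silently registering a fresh run.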

In [None]:
register = alo.register(
    infra_setup=infra_setup,
    solution_info=solution_info,
    train_id='',   ## enter the Train ID found in the history table
    inference_id='',  ## enter the Inference ID found in the history table
)

register.login(username, password)

In [None]:
register.debugging = True  ## default: False (items skipped: docker build, solution registration)
register.run(username=username, password=password)

In [None]:
register.run_train(
    status_period = 10, ## interval, in seconds, at which the training status is checked
    delete_instance = False,   ## whether to delete the solution instance & stream used for the training test
    delete_solution= False,  ## whether to delete the solution
    )