## AutoGluon: third attempt

In [12]:
from autogluon.multimodal import MultiModalPredictor
import pandas as pd
from pathlib import Path
from sklearn.model_selection import train_test_split
import torch
import os

# Settings
train_dir = 'open/train'
test_csv_path = 'open/test.csv'
output_dir = 'autogluon_output_best3'
resume = True  # If True, reload a previously saved predictor and continue training

In [13]:
# 1. Build the training DataFrame (the label is the parent folder name)
all_img_paths = list(Path(train_dir).rglob("*/*.jpg"))
df = pd.DataFrame({'img_path': [str(p) for p in all_img_paths]})
df['label'] = df['img_path'].apply(lambda x: Path(x).parent.name)
df = df.rename(columns={'img_path': 'image'})  # Rename the column to 'image'

# 2. Randomly sample 7,000 images from each class folder
sampled_parts = []
for label in df['label'].unique():
    label_df = df[df['label'] == label]
    # Note: sample(n=7000) raises ValueError if a class has fewer than 7,000 images
    sampled_parts.append(label_df.sample(n=7000, random_state=41))
df_balanced = pd.concat(sampled_parts, axis=0)

# 3. Split into train/validation sets (stratified by label)
train_df, val_df = train_test_split(df_balanced, test_size=0.3, stratify=df_balanced['label'], random_state=41)
train_df_small = train_df.sample(n=min(49000, len(train_df)), random_state=41)
val_df_small = val_df.sample(n=min(15000, len(val_df)), random_state=41)

# # 4. GPU setup (disabled)
# num_gpus = 1 if torch.cuda.is_available() else 0
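
The per-class sampling in step 2 can also be written so that it does not fail when a class folder holds fewer images than requested. The following is a minimal sketch on toy data, not the notebook's actual code: the helper name `sample_per_class` and the toy labels are illustrative assumptions.

```python
import pandas as pd

def sample_per_class(df, label_col, n_per_class, seed=41):
    """Sample up to n_per_class rows from every class (hypothetical helper)."""
    parts = [
        # min() caps the request at the class size, so small classes are kept whole
        group.sample(n=min(n_per_class, len(group)), random_state=seed)
        for _, group in df.groupby(label_col)
    ]
    return pd.concat(parts, axis=0)

# Toy data: class "a" has 7 images, class "b" only 3
toy = pd.DataFrame({
    "image": [f"img_{i}.jpg" for i in range(10)],
    "label": ["a"] * 7 + ["b"] * 3,
})
balanced = sample_per_class(toy, "label", n_per_class=5)
counts = balanced["label"].value_counts().to_dict()
print(counts)
```

With a cap like this, classes smaller than the requested size stay unbalanced, so it is worth printing the per-class counts before training.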


In [14]:
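# 5. Load the existing predictor or create a new one
# Note: this check assumes a saved predictor directory contains 'predictor.pkl';
# confirm the file names your AutoGluon version actually writes to output_dir.
if resume and os.path.exists(os.path.join(output_dir, 'predictor.pkl')):
    print("🔁 Loading the previous training result.")
    predictor = MultiModalPredictor.load(output_dir)
else:
    print("🆕 Creating a new predictor.")
    predictor = MultiModalPredictor(
        label='label',
        problem_type='classification',
        eval_metric='accuracy',
        path=output_dir
    )

🆕 Creating a new predictor.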


In [18]:
hyperparameters = None  # or initialize with an empty dict {}

# 6. Start training (for the remaining epochs)
predictor.fit(
    train_data=train_df_small,
    tuning_data=val_df_small,
    hyperparameters=hyperparameters,   # None here; the argument could also be omitted
    presets='medium_quality',
    time_limit=1800,  # seconds (30 minutes)
    column_types={'image': 'image'}
)


AutoGluon Version:  1.3.0
Python Version:     3.11.1
Operating System:   Windows
Platform Machine:   AMD64
Platform Version:   10.0.26100
CPU Count:          6
Pytorch Version:    2.6.0+cu126
CUDA Version:       12.6
Memory Avail:       15.90 GB / 23.91 GB (66.5%)
Disk Space Avail:   33.31 GB / 222.28 GB (15.0%)

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir C:\Users\FOR\Deep Learning\autogluon_output_best3
    ```

Seed set to 0
GPU Count: 1
GPU Count to be Used: 1

Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params | Mode 
------------------------------------------------------------------------------
0 | model       

Epoch 0, global step 134: 'val_accuracy' reached 0.51252 (best 0.51252), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=0-step=134.ckpt' as top 3


Epoch 0, global step 268: 'val_accuracy' reached 0.61463 (best 0.61463), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=0-step=268.ckpt' as top 3


Epoch 1, global step 402: 'val_accuracy' reached 0.65578 (best 0.65578), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=1-step=402.ckpt' as top 3


Epoch 1, global step 536: 'val_accuracy' reached 0.67898 (best 0.67898), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=1-step=536.ckpt' as top 3


Epoch 2, global step 670: 'val_accuracy' reached 0.69816 (best 0.69816), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=2-step=670.ckpt' as top 3


Epoch 2, global step 804: 'val_accuracy' reached 0.70891 (best 0.70891), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=2-step=804.ckpt' as top 3


Epoch 3, global step 938: 'val_accuracy' reached 0.72000 (best 0.72000), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=3-step=938.ckpt' as top 3


Epoch 3, global step 1072: 'val_accuracy' reached 0.73680 (best 0.73680), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=3-step=1072.ckpt' as top 3


Epoch 4, global step 1206: 'val_accuracy' reached 0.73728 (best 0.73728), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=4-step=1206.ckpt' as top 3


Epoch 4, global step 1340: 'val_accuracy' reached 0.73109 (best 0.73728), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=4-step=1340.ckpt' as top 3


Epoch 5, global step 1474: 'val_accuracy' reached 0.73830 (best 0.73830), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=5-step=1474.ckpt' as top 3


Epoch 5, global step 1608: 'val_accuracy' reached 0.74891 (best 0.74891), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=5-step=1608.ckpt' as top 3
Time limit reached. Elapsed time is 0:30:00. Signaling Trainer to stop.


Epoch 6, global step 1657: 'val_accuracy' reached 0.75156 (best 0.75156), saving model to 'C:\\Users\\FOR\\Deep Learning\\autogluon_output_best3\\epoch=6-step=1657.ckpt' as top 3
Start to fuse 3 checkpoints via the greedy soup algorithm.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.


AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("C:\Users\FOR\Deep Learning\autogluon_output_best3")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).




<autogluon.multimodal.predictor.MultiModalPredictor at 0x23233bfa6d0>
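
To review how validation accuracy moved over the run, the checkpoint messages in the log above can be parsed into a small table. This is a rough sketch: the regular expression is written against the message format shown in this log, the sample lines are abbreviated copies of three of those messages (checkpoint paths omitted), and `parse_val_accuracy` is a hypothetical helper name.

```python
import re

# Matches messages like: Epoch 4, global step 1206: 'val_accuracy' reached 0.73728 ...
LOG_PATTERN = re.compile(
    r"Epoch (\d+), global step (\d+): 'val_accuracy' reached ([0-9.]+)"
)

def parse_val_accuracy(log_text):
    """Return (epoch, step, val_accuracy) tuples in order of appearance."""
    return [
        (int(epoch), int(step), float(acc))
        for epoch, step, acc in LOG_PATTERN.findall(log_text)
    ]

# Abbreviated lines taken from the training log (paths removed)
sample_log = """
Epoch 4, global step 1206: 'val_accuracy' reached 0.73728 (best 0.73728), saving model
Epoch 4, global step 1340: 'val_accuracy' reached 0.73109 (best 0.73728), saving model
Epoch 5, global step 1474: 'val_accuracy' reached 0.73830 (best 0.73830), saving model
"""

history = parse_val_accuracy(sample_log)
best = max(history, key=lambda row: row[2])
print(history)
print("best:", best)
```

In the full log the accuracy dips once (step 1340) and then recovers, which is easy to miss when scanning the raw output.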

In [24]:
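# 7. Prepare the test data
test_df = pd.read_csv(test_csv_path)
# removeprefix (Python 3.9+) drops only a leading './'; strip('./') would remove any leading/trailing '.' or '/'
test_df['image'] = test_df['img_path'].apply(lambda x: os.path.join('open', x.removeprefix('./')))

# 8. Predict
preds = predictor.predict(test_df)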

Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.


In [25]:
# 9. Create the submission file
submission = pd.read_csv('open/sample_submission.csv')
submission['rock_type'] = preds
submission.to_csv('submission3.csv', index=False)
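
Before writing the file, it can help to confirm that the predictions line up with the submission rows. The sketch below uses toy data and makes several assumptions: the `ID` column name, the placeholder class names, and the helper `fill_submission` are illustrative and not taken from the competition files. It assigns predictions by position, which sidesteps pandas index alignment if the two objects carry different indexes.

```python
import pandas as pd

def fill_submission(submission, preds, target_col):
    """Copy predictions into the submission by position (hypothetical helper)."""
    if len(submission) != len(preds):
        raise ValueError(
            f"Row count mismatch: {len(submission)} submission rows vs {len(preds)} predictions"
        )
    out = submission.copy()
    # to_numpy() discards the index, so values are assigned row by row in order
    out[target_col] = pd.Series(preds).to_numpy()
    return out

# Toy stand-ins for sample_submission.csv and the predictor output
toy_submission = pd.DataFrame({"ID": ["TEST_00000", "TEST_00001"], "rock_type": [0, 0]})
toy_preds = pd.Series(["class_a", "class_b"], index=[10, 11])  # deliberately different index

filled = fill_submission(toy_submission, toy_preds, "rock_type")
print(filled)
```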

### Verify that the test data loaded

In [26]:
print(test_df['image'].head(3))  # Check that the paths were corrected
print("Exists (example):", os.path.exists(test_df['image'].iloc[0]))  # True means success


0    open\test/TEST_00000.jpg
1    open\test/TEST_00001.jpg
2    open\test/TEST_00002.jpg
Name: image, dtype: object
Exists (example): True
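
The check above only looks at the first row. A fuller check would list every path that does not exist on disk. The following is a minimal sketch using a temporary directory in place of the real `open/` folder; `normalize_test_path` and `find_missing` are hypothetical helper names, and `str.removeprefix` requires Python 3.9 or newer.

```python
import os
import tempfile

def normalize_test_path(raw_path, root="open"):
    """Drop a leading './' (if present) and join the rest onto the data root."""
    return os.path.join(root, raw_path.removeprefix("./"))

def find_missing(paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Temporary directory standing in for the dataset root
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "test"))
    # Create only the first image so the second one is reported as missing
    open(os.path.join(tmp, "test", "TEST_00000.jpg"), "w").close()

    raw = ["./test/TEST_00000.jpg", "./test/TEST_00001.jpg"]
    paths = [normalize_test_path(r, root=tmp) for r in raw]
    missing = find_missing(paths)
    print(f"{len(missing)} of {len(paths)} paths are missing")
```

On the real data, `find_missing(test_df['image'])` should return an empty list before `predictor.predict` is called.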


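### 🔁 How to resume training

1. First run: `resume=True`, but no `predictor.pkl` exists yet, so training starts from scratch.
2. Training is interrupted (for example, it stops after only epoch 1 has finished).
3. Run again: `predictor.pkl` now exists, so the saved predictor is loaded.

Calling `predictor.fit()` again then continues training from the loaded state. Note that the total number of epochs has to be adjusted manually.

Example: if the first run trained for 1 epoch and the next run is launched with `epochs=2`, the model is trained for 1 + 2 = 3 epochs in total.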
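
The epoch bookkeeping in the example can be made explicit with a small calculation. This sketch assumes, as the note states, that each new `fit()` call adds its epochs on top of what was already trained; `total_epochs` and `remaining_epochs` are hypothetical helpers and not part of AutoGluon.

```python
def total_epochs(epochs_per_run):
    """Total epochs trained if each run adds to the previous ones."""
    return sum(epochs_per_run)

def remaining_epochs(target_total, completed):
    """Epochs to request on the next run to land on the target total."""
    return max(target_total - completed, 0)

# The example from the note: 1 epoch first, then a run with epochs=2
print(total_epochs([1, 2]))

# To stop at 3 epochs overall after 1 completed epoch, request 2 more
print(remaining_epochs(target_total=3, completed=1))
```

Tracking the completed count separately (for example in a small text file next to the output directory) avoids training longer than intended across restarts.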