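# **👶🏻 북실북실 : A Children's Book Management Application 📚**
- **Main goal** : A service that collects book data from a bookshelf densely packed with children's books, so the collection can be viewed and managed at a glance
- **Keyword** : Book title extraction, Detectron2, Segmentation, OCR
- **Modeling Overview**
  - **CV : Book title extraction**
    - Built an image dataset of 685 images in total from the COCO dataset plus additionally collected and labeled bookshelf data
    - Trained a Mask R-CNN model with Detectron2, one of the libraries that support segmentation
    - Extracted book title text from the segmented book-spine images using CLOVA OCR
    - Removed unnecessary information such as publisher names and library call numbers → to improve recognition accuracy, corrected each title by comparing it against the book titles in a database we built ourselves
  - **NLP : Bookshelf analysis and new book recommendation**
    - **Book analysis** : Given the list of owned books, provides an analysis over four categories
      - Genre (play book / picture book / learning book)
      - Publication year (new release / other / steady seller)
      - Target age (0-3 years / 4-6 years / young children)
      - Country (Korean book / foreign book)
    - **Bookshelf by keywords**
      - Summarizes the owned books with a few representative keywords
      - Title + synopsis (1-gram, 2-gram) → SBERT embedding → similarity computation → keyword extraction
    - **Book recommendation : How about this book?**
      - Recommends books similar to those on the shelf that the user is likely to be interested in, as well as books with entirely new content that are not on the shelf
      - Synopsis embedding (SBERT, TF-IDF) → cosine similarity computation
      - Input : List of ISBNs of the books recognized on the shelf
      - Compute the cosine similarity between each shelf book and the books in the database
      - Output : List of books with the highest / lowest cosine similarity that are not in the input ISBN list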
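
The recommendation step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the ISBN-like keys and 3-dimensional vectors are made-up toy data, `recommend` is a hypothetical helper, and averaging the similarity over the shelf books is one possible aggregation, not necessarily the one used in the project.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(shelf_isbns, embeddings, top_k=1):
    """Return (most_similar, least_similar) ISBNs that are not already on the shelf.

    `embeddings` maps ISBN -> vector. Each candidate is scored by its mean
    cosine similarity to the shelf books (this aggregation is an assumption).
    """
    shelf = [isbn for isbn in shelf_isbns if isbn in embeddings]
    scores = {}
    for isbn, vec in embeddings.items():
        if isbn in shelf_isbns:
            continue  # skip books the user already owns
        sims = [cosine(vec, embeddings[s]) for s in shelf]
        scores[isbn] = sum(sims) / len(sims) if sims else 0.0
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], ranked[-top_k:]

# Hypothetical toy data: made-up keys and tiny embeddings.
embeddings = {
    "isbn-A": [1.0, 0.0, 0.0],   # on the shelf
    "isbn-B": [0.9, 0.1, 0.0],   # close to A
    "isbn-C": [0.0, 0.0, 1.0],   # unrelated to A
    "isbn-D": [0.5, 0.5, 0.0],   # in between
}
similar, novel = recommend(["isbn-A"], embeddings)
print(similar, novel)
```

In practice the vectors would come from SBERT or TF-IDF over the synopsis text; only the ranking and the "not already owned" filter are shown here.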


---




## 1. Environment Setting

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
%cd /content/drive/MyDrive/detectron2

/content/drive/MyDrive/detectron2


In [None]:
# Install torch, torchvision, and fvcore
!pip install -U torch torchvision
!pip install git+https://github.com/facebookresearch/fvcore.git

import torch, torchvision
torch.__version__

'2.0.0+cu118'

In [None]:
# Clone the original detectron2 repo and install it in editable mode
%cd /content/drive/MyDrive/detectron2
!git clone https://github.com/facebookresearch/detectron2 detectron2_repo
!pip install -e detectron2_repo

Successfully installed black-22.3.0 detectron2-0.6 fvcore-0.1.5.post20221221


In [None]:
!pip install 'git+https://github.com/facebookresearch/detectron2.git'

  Building wheel for detectron2 (setup.py) ... done
  Created wheel for detectron2: filename=detectron2-0.6-cp39-cp39-linux_x86_64.whl size=7801612 sha256=c9c2724f2f9d414e82f45fa43f06a1164e8c6bd1bfd350fcf2cedf2166070547
  Stored in directory: /tmp/pip-ephem-wheel-cache-gc_fnosz/wheels/59/b4/83/84bfca751fa4dcc59998468be8688eb50e97408a83af171d42
Successfully built detectron2
Installing collected packages: detectron2
  Attempting uninstall: detectron2
    Found existing installation: detectron2 0.6
    Can't uninstall 'detectron2'. No files were found to uninstall.
Successfully installed detectron2-0.6


In [None]:
# restart the runtime prior to this

# setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog

In [None]:
# # Create the data folder (one-time setup, kept commented out)
# %cd /content/drive/My Drive/Coding/detectron2
# !mkdir data
# %cd data
# !mkdir images
# %cd ..
# !pwd     

## 2. Dataset
- COCO dataset + custom bookshelf dataset + additional dataset = 685 images
- Register the combined dataset as a COCO-format dataset named `ybigkidsbook`

In [None]:
# Register the new dataset in COCO instance format
from detectron2.data.datasets import register_coco_instances

register_coco_instances("ybigkidsbook", {}, "/content/drive/MyDrive/detectron2/data/trainval.json", "/content/drive/MyDrive/detectron2/data/images")
person_metadata = MetadataCatalog.get("ybigkidsbook")
dataset_dicts = DatasetCatalog.get("ybigkidsbook")

Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[04/04 14:25:25 d2.data.datasets.coco]: Loaded 686 images in COCO format from /content/drive/MyDrive/detectron2/data/trainval.json
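
The "Category ids in annotations are not in [1, #categories]" message appears when the category ids in the annotation file are not exactly 1..N; to my understanding Detectron2 then remaps the sorted ids to contiguous training ids 0..N-1. The sketch below illustrates that idea on a hypothetical minimal COCO-style file. The file name, the category name `book`, and both helper functions are assumptions for illustration, not the contents of `trainval.json`.

```python
import json

# Hypothetical minimal COCO-style annotation file with one category whose id is 0.
coco = {
    "images": [{"id": 1, "file_name": "shelf_001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 0, "name": "book"}],
    "annotations": [{
        "id": 1, "image_id": 1, "category_id": 0, "iscrowd": 0,
        "bbox": [100, 50, 40, 300],                                  # [x, y, width, height]
        "segmentation": [[100, 50, 140, 50, 140, 350, 100, 350]],    # polygon x1,y1,x2,y2,...
        "area": 40 * 300,
    }],
}

def contiguous_id_map(categories):
    """Map the sorted original category ids to 0..N-1."""
    return {cat_id: i for i, cat_id in enumerate(sorted(c["id"] for c in categories))}

def ids_in_expected_range(categories):
    """True if the ids are exactly 1..N, the case in which no remapping message is expected."""
    ids = sorted(c["id"] for c in categories)
    return ids == list(range(1, len(ids) + 1))

id_map = contiguous_id_map(coco["categories"])
print(json.dumps(coco["categories"]), id_map, ids_in_expected_range(coco["categories"]))
```

With a single category the remapping is harmless; it matters mainly when several categories use sparse or zero-based ids.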


In [None]:
# Visually check the labeled data on a few random samples
import random

for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=person_metadata, scale=0.5)
    vis = visualizer.draw_dataset_dict(d)
    cv2_imshow(vis.get_image()[:, :, ::-1])
    

Output hidden; open in https://colab.research.google.com to view.

## 3. Train
- Train a Mask R-CNN model with the detection/segmentation API: **Detectron2 → Mask R-CNN**
- Extract book title text from the segmented book-spine images using **CLOVA OCR**

In [None]:
# Training
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os

cfg = get_cfg()
cfg.merge_from_file("/content/drive/MyDrive/detectron2/detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("ybigkidsbook",)
cfg.DATASETS.TEST = ()   # no metrics implemented for this dataset
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"  # initialize from model zoo

cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.MAX_ITER = 5000    # total training iterations; the learning rate decays at iteration 2500 (see STEPS)
cfg.SOLVER.WEIGHT_DECAY = 0.0001
cfg.SOLVER.WEIGHT_DECAY_NORM = 0.0

cfg.SOLVER.BASE_LR = 0.01
cfg.SOLVER.BASE_LR_END = 0.0
cfg.SOLVER.GAMMA = 0.1
cfg.SOLVER.STEPS = (2500,) # lr decay
cfg.SOLVER.MOMENTUM = 0.9
cfg.SOLVER.CHECKPOINT_PERIOD = 1000

cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 200   # faster, and good enough for this toy dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # a single foreground class

cfg.OUTPUT_DIR = "/content/drive/MyDrive/detectron2/output"

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()                                        
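
Setting `NUM_CLASSES = 1` changes the size of the box predictor, which is why the COCO checkpoint's predictor weights are reported as skipped in the training log: the classifier has one output per class plus background, and the class-specific box regressor has four outputs per class. A minimal sketch of that arithmetic, assuming the 1024-dimensional box-head feature seen in the log (`box_head_shapes` is a hypothetical helper, not a Detectron2 function):

```python
def box_head_shapes(num_classes, feature_dim=1024, class_agnostic=False):
    """Weight shapes of the box predictor for a given number of foreground classes."""
    cls_score = (num_classes + 1, feature_dim)               # +1 for the background class
    num_bbox_outputs = 4 * (1 if class_agnostic else num_classes)
    bbox_pred = (num_bbox_outputs, feature_dim)              # 4 box deltas per class
    return {"cls_score.weight": cls_score, "bbox_pred.weight": bbox_pred}

coco_head = box_head_shapes(80)   # COCO pretrained checkpoint: 80 classes
book_head = box_head_shapes(1)    # this notebook: 1 class
mismatched = [name for name in coco_head if coco_head[name] != book_head[name]]
print(coco_head, book_head, mismatched)
```

The mismatched layers are initialized from scratch and learned during fine-tuning, while the backbone and the remaining layers start from the pretrained weights.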

[06/23 15:28:12 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
      )
 

model_final_f10217.pkl: 178MB [00:04, 40.8MB/s]                           
Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (2, 1024) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (2,) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (4, 1024) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (4,) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.mask_head.predictor.weight' to the model due to i

[06/23 15:28:22 d2.engine.train_loop]: Starting training from iteration 0


  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]


[06/23 15:28:46 d2.utils.events]:  eta: 1:39:48  iter: 19  total_loss: 2.555  loss_cls: 0.6107  loss_box_reg: 0.9275  loss_mask: 0.6832  loss_rpn_cls: 0.03218  loss_rpn_loc: 0.2871  time: 1.1526  data_time: 0.6860  lr: 0.00019981  max_mem: 2337M
[06/23 15:29:09 d2.utils.events]:  eta: 1:37:34  iter: 39  total_loss: 2.066  loss_cls: 0.3896  loss_box_reg: 0.7975  loss_mask: 0.5651  loss_rpn_cls: 0.02983  loss_rpn_loc: 0.2848  time: 1.1465  data_time: 0.6705  lr: 0.00039961  max_mem: 2337M
[06/23 15:29:33 d2.utils.events]:  eta: 1:39:00  iter: 59  total_loss: 1.556  loss_cls: 0.2465  loss_box_reg: 0.625  loss_mask: 0.3601  loss_rpn_cls: 0.02232  loss_rpn_loc: 0.317  time: 1.1665  data_time: 0.7224  lr: 0.00059941  max_mem: 2337M
[06/23 15:29:55 d2.utils.events]:  eta: 1:37:51  iter: 79  total_loss: 1.291  loss_cls: 0.209  loss_box_reg: 0.4904  loss_mask: 0.3047  loss_rpn_cls: 0.01884  loss_rpn_loc: 0.257  time: 1.1465  data_time: 0.5857  lr: 0.00079921 
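
The `lr` values in the log are far below `BASE_LR = 0.01` because training starts with a warmup phase. The sketch below reproduces the logged values under the assumption that the schedule is a linear warmup followed by step decay, with 1000 warmup iterations and a warmup factor of 0.001; these two values are not set in the notebook and are assumed to be the library defaults. `lr_at` is a hypothetical helper, not a Detectron2 function.

```python
from bisect import bisect_right

def lr_at(iteration, base_lr=0.01, steps=(2500,), gamma=0.1,
          warmup_iters=1000, warmup_factor=0.001):
    """Learning rate at a given iteration: linear warmup, then step decay by `gamma`."""
    if iteration < warmup_iters:
        alpha = iteration / warmup_iters
        warmup = warmup_factor * (1 - alpha) + alpha   # ramps from warmup_factor up to 1
    else:
        warmup = 1.0
    decay = gamma ** bisect_right(list(steps), iteration)  # one factor of gamma per passed step
    return base_lr * warmup * decay

for it in (19, 39, 59, 79, 1000, 2500):
    print(it, f"{lr_at(it):.8f}")
```

For iterations 19, 39, 59, and 79 this gives 0.00019981, 0.00039961, 0.00059941, and 0.00079921, matching the log; the rate reaches 0.01 at iteration 1000 and drops to 0.001 at iteration 2500.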

## 4. Test

In [None]:
# Install torch, torchvision, and fvcore
!pip install -U torch torchvision
!pip install git+https://github.com/facebookresearch/fvcore.git
import torch, torchvision
torch.__version__

Installing collected packages: fvcore
  Attempting uninstall: fvcore
    Found existing installation: fvcore 0.1.5.post20221221
    Uninstalling fvcore-0.1.5.post20221221:
      Successfully uninstalled fvcore-0.1.5.post20221221
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
detectron2 0.6 requires fvcore<0.1.6,>=0.1.5, but you have fvcore 0.1.6 which is incompatible.
Successfully installed fvcore-0.1.6


'2.0.0+cu118'

In [None]:
# Reinstall detectron2 from the previously cloned repo (editable mode)
!pip install -e detectron2_repo

Installing collected packages: fvcore, detectron2
  Attempting uninstall: fvcore
    Found existing installation: fvcore 0.1.6
    Uninstalling fvcore-0.1.6:
      Successfully uninstalled fvcore-0.1.6
  Attempting uninstall: detectron2
    Found existing installation: detectron2 0.6
    Uninstalling detectron2-0.6:
      Successfully uninstalled detectron2-0.6
  Running setup.py develop for detectron2
Successfully installed detectron2-0.6 fvcore-0.1.5.post20221221


In [None]:
%cd /content/drive/MyDrive/detectron2

/content/drive/MyDrive/detectron2


In [None]:
# You may need to restart your runtime prior to this, to let your installation take effect
# Some basic setup
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog

from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os

In [None]:
cfg = get_cfg()

cfg.merge_from_file("/content/drive/MyDrive/detectron2/detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("ybigkidsbook",)
cfg.DATASETS.TEST = ()   # no metrics implemented for this dataset
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"  # initialize from model zoo

cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.MAX_ITER = 10000    # solver settings only affect training; they are not used for inference with DefaultPredictor
cfg.SOLVER.WEIGHT_DECAY = 0.0001
cfg.SOLVER.WEIGHT_DECAY_NORM = 0.0

cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.BASE_LR_END = 0.0
cfg.SOLVER.GAMMA = 0.1
cfg.SOLVER.STEPS = (5000,) # lr decay
cfg.SOLVER.MOMENTUM = 0.9
cfg.SOLVER.CHECKPOINT_PERIOD = 1000

cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 200   # faster, and good enough for this toy dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # a single foreground class


In [None]:
# Test

cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5   # set the testing threshold for this model
cfg.DATASETS.TEST = ("ybigkidsbook", )
predictor = DefaultPredictor(cfg)

path = "/content/drive/MyDrive/detectron2/data/images_test/kidbooks2.jpg"
im = cv2.imread(path)
outputs = predictor(im)
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(v.get_image()[:, :, ::-1])


Output hidden; open in https://colab.research.google.com to view.
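
`SCORE_THRESH_TEST = 0.5` means that only detections with a confidence score above 0.5 are returned by the predictor. A minimal sketch of that filtering on hypothetical detections (the dictionaries and `filter_by_score` are illustrative; the real predictions are held in `outputs["instances"]`):

```python
def filter_by_score(detections, score_thresh=0.5):
    """Keep detections whose score is strictly above `score_thresh`."""
    return [d for d in detections if d["score"] > score_thresh]

# Hypothetical detections with made-up scores and (x1, y1, x2, y2) boxes.
detections = [
    {"score": 0.97, "box": (10, 5, 60, 400)},
    {"score": 0.52, "box": (65, 8, 110, 395)},
    {"score": 0.31, "box": (300, 20, 340, 380)},
]
kept = filter_by_score(detections)
print(len(kept), [d["score"] for d in kept])
```

Raising the threshold trades missed book spines for fewer false detections, which may matter because every kept instance is later sent to OCR.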

## 5. Masking
- Crop each detected book individually and save it as an image

In [None]:
import numpy as np
from matplotlib.image import imread
from PIL import Image

def cropper(org_image_path, mask_array, out_file_name):
    # mask_array: (num_instances, H, W) boolean masks from the predictor
    num_instances = mask_array.shape[0]
    mask_array = np.moveaxis(mask_array, 0, -1)  # -> (H, W, num_instances)
    img = imread(str(org_image_path))
    for i in range(num_instances):
        mask = mask_array[:, :, i:(i+1)]  # (H, W, 1), broadcasts over the color channels
        # Keep the pixels of instance i and paint everything else white
        output_img = np.where(mask, img, 255).astype(np.uint8)
        im = Image.fromarray(output_img)
        # Files are named result0, result1, ... (no extension) and stored as PNG
        im.save(out_file_name + str(i), 'png')


mask_array = outputs["instances"].pred_masks.cpu().numpy()
cropper(path, mask_array, 'result')
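
The function above whitens everything outside each mask but keeps the full image size. If a tight crop around each book is wanted, the masked image can additionally be cut to the mask's bounding box. The following is a sketch on a toy array, with `mask_and_crop` as a hypothetical helper rather than part of the notebook's pipeline:

```python
import numpy as np

def mask_and_crop(image, mask, background=255):
    """Whiten pixels outside `mask`, then crop to the mask's bounding box."""
    masked = np.where(mask[:, :, None], image, background).astype(np.uint8)
    rows = np.flatnonzero(mask.any(axis=1))   # row indices containing mask pixels
    cols = np.flatnonzero(mask.any(axis=0))   # column indices containing mask pixels
    if rows.size == 0:
        return masked  # empty mask: nothing to crop
    return masked[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Toy 4x6 RGB image with a 2x3 rectangular "book" mask.
image = np.full((4, 6, 3), 100, dtype=np.uint8)
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True
crop = mask_and_crop(image, mask)
print(crop.shape)
```

A smaller image with less blank background may also reduce the amount of data sent to the OCR service.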

## 6. OCR
- Extract book title text from the segmented book-spine images using **CLOVA OCR**

In [None]:
import uuid
import requests
import time
import json

In [None]:
api_url = 'https://sgihdfj2uw.apigw.ntruss.com/custom/v1/16108/0712cf53bcfa1d74a4adcf8507ec8173cb85748862a62158049854dffe4bfef5/general'
secret_key = 'Y3dhVnJkdGFsYWtXQUpYbVNzWld1bGRMdHhBY1hWRks='

In [None]:
files = [('file', open('/content/drive/MyDrive/detectron2/result10', 'rb'))]

In [None]:
request_json = {
    'images': [{'format': 'png',   # cropper() stores the masked images as PNG
                'name': 'demo'}],
    'requestId': str(uuid.uuid4()),
    'version': 'V2',
    'timestamp': int(round(time.time() * 1000)),
}

payload = {'message': json.dumps(request_json).encode('UTF-8')}

headers = {
    'X-OCR-SECRET': secret_key,
}

response = requests.post(api_url, headers=headers, data=payload, files=files)
response.raise_for_status()  # fail early on HTTP errors
result = response.json()

In [None]:
title = ' '.join([i['inferText'] for i in result['images'][0]['fields']])

In [None]:
title

'오러와오도 먀오족의 콩쥐팥쥐이야기 이영경 글. 그림 길벗어린이'
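
The one-line join above raises an exception if the response contains no images or no recognized fields. A slightly more defensive variant might look like the sketch below; `extract_title` is a hypothetical helper, and the mock response only mimics the `images` / `fields` / `inferText` keys used in this notebook.

```python
def extract_title(result):
    """Join the recognized text fragments of the first image; return '' if there are none."""
    images = result.get("images") or []
    if not images:
        return ""
    fields = images[0].get("fields") or []
    return " ".join(f["inferText"] for f in fields if f.get("inferText"))

# Mock response with the same nesting as the OCR result used above.
mock_result = {"images": [{"fields": [{"inferText": "Hello"}, {"inferText": "Books"}]}]}
print(extract_title(mock_result))
print(repr(extract_title({"images": []})))
```

As the real output shows, the joined string still contains the author and publisher, so it is matched against the title database in the next section rather than used directly.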

## 7. ISBN
- International Standard Book Number (ISBN)
- A unique identifier assigned to books worldwide according to an internationally standardized scheme

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from tqdm.notebook import tqdm
import numpy as np
import pandas as pd

In [None]:
%cd /content/drive/MyDrive/detectron2

/content/drive/MyDrive/detectron2


In [None]:
def search_for_isbn(text):
    # Load the title-only table and the full book table (the latter holds the ISBN column)
    books_name_only = pd.read_csv('./books_name_only.csv')
    books = pd.read_csv('./books.csv', index_col=0)

    # Append the OCR text as the last row
    # (DataFrame.append is deprecated and removed in pandas 2.0, so use pd.concat)
    input_sentence = pd.DataFrame([{'제목': text}])
    books_name_only = pd.concat([books_name_only, input_sentence], ignore_index=True)

    # TF-IDF vectorization of all database titles plus the query
    tfidf = TfidfVectorizer()
    tfidf_matrix = tfidf.fit_transform(books_name_only['제목'])

    # Cosine similarity between the query (last row) and every database title
    sim_scores = cosine_similarity(tfidf_matrix[-1], tfidf_matrix[:-1]).ravel()

    # Return the ISBN of the most similar title
    # (assumes books.csv and books_name_only.csv list the books in the same order)
    answer_index = int(np.argmax(sim_scores))
    return books['ISBN'].iloc[answer_index]


In [None]:
search_for_isbn(title)
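
The matching idea behind `search_for_isbn` (TF-IDF vectors plus cosine similarity between the noisy OCR string and each database title) can be illustrated without the CSV files. The sketch below is a simplified standard-library version with whitespace tokenization, smoothed IDF, and L2 normalization; it is not scikit-learn's implementation, whose default tokenizer differs. The titles and the query are made-up toy data.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Whitespace-tokenized TF-IDF with smoothed IDF and L2 normalization."""
    tokenized = [doc.split() for doc in docs]
    n = len(tokenized)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n) / (1 + d)) + 1 for tok, d in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {tok: count * idf[tok] for tok, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in vec.items()})
    return vectors

def best_match(query, titles):
    """Index and cosine score of the title most similar to `query`."""
    *title_vecs, query_vec = tfidf_vectors(list(titles) + [query])
    scores = [sum(w * tv.get(tok, 0.0) for tok, w in query_vec.items()) for tv in title_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical database titles and a noisy OCR string with extra author/publisher tokens.
titles = ["little bear goes to school", "the moon and the rabbit", "little bear learns to swim"]
idx, score = best_match("little bear goes to school author name publisher", titles)
print(titles[idx], round(score, 3))
```

Because the extra tokens appear in no database title, they lower the score without changing which title ranks first, which is why the OCR output can be matched without first stripping the author and publisher.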

