# 1. load model



## reference

`load pretrained TransformerSum models` [github](https://github.com/HHousen/TransformerSum) | [tutorial](https://transformersum.readthedocs.io/en/latest/general/getting-started.html#install)


1. `distilbert-base-uncased-ext-sum(CNN/DM): 6h 22m 32s` [model download](https://drive.google.com/uc?id=1VNoFhqfwlvgwKuJwjlHnlGcGg38cGM--)
    
    > Currently, distilbert beats bert-base-uncased by 1.0014%. Since bert-base-uncased has more parameters than distilbert, this is unusual and is likely a tuning issue. This suggests that tuning the hyperparameters of bert-base-uncased can improve its performance. distilroberta matches 92.7% of the performance of roberta-base.

2. `bert-base-uncased-ext-sum(CNN/DM): 12h 51m 17s` [model download](https://drive.google.com/uc?id=1yGvarxhq78Vl6m8IZgG9HFQC2qXDB-KU)

3. `mobilebert-uncased-ext-sum(CNN/DM): 8h 26m 32s` [model download](https://drive.google.com/uc?id=1R3tRH07z_9nYW8sC8eFceBmxC7u0kP_W)

    > mobilebert-uncased-ext-sum achieves 96.59% of the performance of BertSum while containing 4.45 times fewer parameters. It achieves 94.06% of the performance of MatchSum, the current extractive state-of-the-art.

4. `distilroberta-base-ext-sum(WikiHow): 4h 27m 23s` [model download](https://drive.google.com/uc?id=1RdFcoeuHd_JCj5gBQRFXFpieb-3EXkiN)
    
    > These are the results of an extractive model, which means they are fairly good because they come close to abstractive models. The R1/R2/RL-Sum results of a base transformer model from the PEGASUS paper are 32.48/10.53/23.86. The net difference from distilroberta-base-ext-sum is +1.41/+1.57/-5.09. Compared to the abstractive SOTA prior to PEGASUS, which was 28.53/9.23/26.54, distilroberta-base-ext-sum performs +2.54/-0.27/+2.41.
    >
    > However, the base PEGASUS model obtains scores of 36.58/15.64/30.01, which are much better than distilroberta-base-ext-sum, as one would expect.

5. `bert-base-uncased-ext-sum(WikiHow): 7h 29m 06s` [model download](https://drive.google.com/uc?id=1EPCaQySWJgm368XypDeCwEMdRCxB5w7Z)

`check architecture, parameters` [blog](https://rabo0313.tistory.com/entry/Pytorch-%EB%AA%A8%EB%8D%B8-%EA%B5%AC%EC%A1%B0-%ED%99%95%EC%9D%B8-parameter%ED%99%95%EC%9D%B8)


### variables

In [62]:
# PRETRAINED_MODEL_PATH="./models/cnn dm [bert-base-uncased-ext-sum] epoch=3.ckpt"
# PRETRAINED_MODEL_PATH="./models/cnn dm [distilroberta-base-ext-sum] epoch=3.ckpt"
PRETRAINED_MODEL_PATH="./models/cnn dm [mobilebert-uncased-ext-sum] epoch=3.ckpt"
# PRETRAINED_MODEL_PATH="./models/wikihow [bert-base-uncased-ext-sum] epoch=2.ckpt"
# PRETRAINED_MODEL_PATH="./models/wikihow [distilbert-base-uncased-ext-sum] epoch=3.ckpt"
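
The checkpoint file names above contain spaces and square brackets, so a typo is easy to miss until loading fails. A minimal sketch of a pre-check, assuming the checkpoints sit under `./models/` as in the paths above; `resolve_checkpoint` is a hypothetical helper for illustration, not part of TransformerSum:

```python
from pathlib import Path


def resolve_checkpoint(path_str):
    """Return the checkpoint path as a Path, or raise if the file is missing.

    Hypothetical helper for illustration; not part of TransformerSum.
    """
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path


# Usage (assumes PRETRAINED_MODEL_PATH is defined as in the cell above):
# checkpoint = resolve_checkpoint(PRETRAINED_MODEL_PATH)
```

Failing early with the full path in the message tends to be easier to debug than an error raised from inside the checkpoint loader.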

### import modules

In [63]:
from src.extractive import ExtractiveSummarizer
import torch
from torch import nn
# !pip install torchsummary
from torchsummary import summary as summary_
from torch.nn import functional as F

### load model

In [64]:
model = ExtractiveSummarizer.load_from_checkpoint(PRETRAINED_MODEL_PATH)

### check model architecture

In [65]:
print(model)

ExtractiveSummarizer(
  (word_embedding_model): MobileBertModel(
    (embeddings): MobileBertEmbeddings(
      (word_embeddings): Embedding(30522, 128, padding_idx=0)
      (position_embeddings): Embedding(512, 512)
      (token_type_embeddings): Embedding(2, 512)
      (embedding_transformation): Linear(in_features=384, out_features=512, bias=True)
      (LayerNorm): NoNorm()
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (encoder): MobileBertEncoder(
      (layer): ModuleList(
        (0): MobileBertLayer(
          (attention): MobileBertAttention(
            (self): MobileBertSelfAttention(
              (query): Linear(in_features=128, out_features=128, bias=True)
              (key): Linear(in_features=128, out_features=128, bias=True)
              (value): Linear(in_features=512, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): MobileBertSelfOutput(
              (dense): Linear(in_features=128, o

### check model parameters
- `named_parameters()`: size and name of each parameter tensor

In [66]:
# Count parameter tensors (not individual weights) and print each shape and name.
param_count = 0
for name, param in model.named_parameters():
    param_count += 1
    print(f"{param.size()}: {name}")
print(f"parameter tensors: {param_count}")

torch.Size([30522, 128]): word_embedding_model.embeddings.word_embeddings.weight
torch.Size([512, 512]): word_embedding_model.embeddings.position_embeddings.weight
torch.Size([2, 512]): word_embedding_model.embeddings.token_type_embeddings.weight
torch.Size([512, 384]): word_embedding_model.embeddings.embedding_transformation.weight
torch.Size([512]): word_embedding_model.embeddings.embedding_transformation.bias
torch.Size([512]): word_embedding_model.embeddings.LayerNorm.bias
torch.Size([512]): word_embedding_model.embeddings.LayerNorm.weight
torch.Size([128, 128]): word_embedding_model.encoder.layer.0.attention.self.query.weight
torch.Size([128]): word_embedding_model.encoder.layer.0.attention.self.query.bias
torch.Size([128, 128]): word_embedding_model.encoder.layer.0.attention.self.key.weight
torch.Size([128]): word_embedding_model.encoder.layer.0.attention.self.key.bias
torch.Size([128, 512]): word_embedding_model.encoder.layer.0.attention.self.value.weight
torch.Size([128]): word
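
The loop above counts parameter tensors, not individual weights. To compare model sizes (for example the "fewer parameters" claim in the reference section), the number of elements is usually the more useful figure. A minimal sketch, assuming `model` is a `torch.nn.Module`; `count_parameters` is a hypothetical helper name:

```python
def count_parameters(model):
    """Return (total, trainable) element counts over all parameter tensors.

    Assumes `model` is a torch.nn.Module, i.e. `model.parameters()` yields
    tensors that provide `numel()` and `requires_grad`.
    """
    total = 0
    trainable = 0
    for param in model.parameters():
        n = param.numel()  # number of scalar weights in this tensor
        total += n
        if param.requires_grad:
            trainable += n
    return total, trainable


# Usage (assumes `model` is the ExtractiveSummarizer loaded above):
# total, trainable = count_parameters(model)
# print(f"total: {total:,}  trainable: {trainable:,}")
```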

- `named_children()`: name and structure of each top-level module (layer)

In [67]:
moduleCount = 0
for name,layer in model.named_children():
    moduleCount += 1
    print(f"==========={name}===========")
    print(layer)
print("===============================")
print(f"module: {moduleCount}")


MobileBertModel(
  (embeddings): MobileBertEmbeddings(
    (word_embeddings): Embedding(30522, 128, padding_idx=0)
    (position_embeddings): Embedding(512, 512)
    (token_type_embeddings): Embedding(2, 512)
    (embedding_transformation): Linear(in_features=384, out_features=512, bias=True)
    (LayerNorm): NoNorm()
    (dropout): Dropout(p=0.0, inplace=False)
  )
  (encoder): MobileBertEncoder(
    (layer): ModuleList(
      (0): MobileBertLayer(
        (attention): MobileBertAttention(
          (self): MobileBertSelfAttention(
            (query): Linear(in_features=128, out_features=128, bias=True)
            (key): Linear(in_features=128, out_features=128, bias=True)
            (value): Linear(in_features=512, out_features=128, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): MobileBertSelfOutput(
            (dense): Linear(in_features=128, out_features=128, bias=True)
            (LayerNorm): NoNorm()
          )
        )
    

# 2. check data

In [68]:
import pandas as pd
import math
import logging
from datetime import datetime

import torch
from torch.utils.data import DataLoader

# ! pip install openpyxl

In [69]:
file_path='dataset.xlsx'
df = pd.read_excel(file_path, engine='openpyxl')

In [70]:
df_list = df.values.tolist()
len(df_list)

2448

In [None]:
question_list=[]
for row in df_list:
    value=row[0].replace('\n',' ').replace('  ',' ').strip()
    question_list.append(value)
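
The chained `replace` calls collapse a double space only once, so a run of three or more spaces (or several newlines in a row) still leaves extra spaces behind. A minimal alternative sketch using `str.split()` and `str.join()`; `normalize_whitespace` is a hypothetical helper name:

```python
def normalize_whitespace(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) to one space."""
    return " ".join(text.split())


sample = "line one\n\n   line    two  "
# Chained replace, as in the cell above: longer runs are only partly collapsed.
print(repr(sample.replace("\n", " ").replace("  ", " ").strip()))
# split/join: every run becomes a single space and the ends are trimmed.
print(repr(normalize_whitespace(sample)))
```

Note that this also collapses tabs, which the original chain leaves untouched.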

In [72]:
original=[]
for row in df_list:
    value=row[0].replace('\n',' ').replace('  ',' ').strip()
    original.append([value,row[1]])
len(original)

2448

In [49]:
import csv
with open('csv/text_origin1.csv','w',newline='',encoding='utf-8') as f:
    write = csv.writer(f) 
    write.writerows(original)
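
Opening `csv/text_origin1.csv` for writing fails if the `csv/` directory does not exist yet. A minimal sketch that creates the parent directory first; `write_rows` is a hypothetical helper name:

```python
import csv
from pathlib import Path


def write_rows(path, rows):
    """Write rows to a UTF-8 CSV file, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return path


# Usage (assumes `original` is the list built above):
# write_rows("csv/text_origin1.csv", original)
```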

# 3. predict

In [74]:
result=[]
for question in question_list:
    result.append(model.predict(question))
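
With a few thousand inputs, a single empty or malformed row can abort the whole loop partway through. A minimal sketch that keeps going and records which rows failed, so the output stays aligned with `question_list` for the later `zip()` calls. `predict_all` is a hypothetical helper; it takes the prediction function as an argument and makes no assumption about `model.predict` beyond "one string in, one result out":

```python
def predict_all(predict_fn, texts):
    """Apply predict_fn to each text; return (results, failed_indices).

    Empty inputs and inputs that raise are recorded as "" so that the
    results list keeps the same length and order as `texts`.
    """
    results = []
    failed = []
    for i, text in enumerate(texts):
        if not text:
            results.append("")
            failed.append(i)
            continue
        try:
            results.append(predict_fn(text))
        except Exception:  # broad on purpose: one bad row should not stop the run
            results.append("")
            failed.append(i)
    return results, failed


# Usage (assumes `model` and `question_list` from the cells above):
# result, failed = predict_all(model.predict, question_list)
```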

In [50]:
import csv
with open('csv/text_result1.csv','w',newline='',encoding='utf-8') as f:
    write = csv.writer(f) 
    for q,r in zip(question_list,result):
        write.writerow([q,r])

In [84]:
import csv
with open('csv/text_result_origin1.csv','w',newline='',encoding='utf-8') as f:
    write = csv.writer(f) 
    for q,r,o in zip(question_list,result,original):
        # o[0] repeats q; o[1] is the second spreadsheet column kept in `original`
        write.writerow([q,r,o[1]])

In [55]:
# ! pip install torchvision
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy
print("PyTorch Version: ",torch.__version__)
print("Torchvision Version: ",torchvision.__version__)

PyTorch Version:  1.13.0
Torchvision Version:  0.14.0+cpu


In [None]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print('Device:', device)

In [57]:
from sklearn.model_selection import train_test_split

In [75]:
train_text, val_text, train_result, val_result = train_test_split(question_list, result, test_size=0.3, shuffle=False)

In [76]:
len(train_text),len(train_result)

(1713, 1713)
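
The size 1713 follows from how a fractional `test_size` is turned into counts. As far as I know, scikit-learn rounds the test count up and gives the remainder to the training set; with `shuffle=False` the training set is then the first rows and the validation set the rest. A minimal sketch of that arithmetic in plain Python, under that assumption; `unshuffled_split_sizes` is a hypothetical helper name:

```python
import math


def unshuffled_split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test count is rounded up and the training set
    receives whatever is left.
    """
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_train, n_test = unshuffled_split_sizes(2448, 0.3)
print(n_train, n_test)

# With shuffle=False the split is equivalent to plain slicing:
items = list(range(2448))
train_part, val_part = items[:n_train], items[n_train:]
```

Because nothing is shuffled, any ordering in the spreadsheet (by topic or date, for instance) carries over into the split.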