# Contents

### 1. Introduction
- Source of the data: [Naver movie review data](https://github.com/songys/Toxic_comment_data)
- Type of content in `documents`: strings containing movie reviews

### 2. Dataset Dimensions:
- State the total number of records (rows) and features (columns).

### 3. Feature Descriptions:
- document
- toxic, obscene, threat, insult, identity_hate

### 4. Data Distribution:
- Count and percentage of positive (1) labels.
- Count and percentage of negative (0) labels.

### 5. Data Quality
- Missing Values
- Duplicates
- Noise or anomalies encountered

### 6. Descriptive Statistics for `document`:
- Average, median and mode of document lengths
- Most frequent words or phrases
- Distribution of document lengths
  
### 7. Inter-label Relationship:
- Correlations:
- correlation matrix or heatmap

### 8. Visualizations:
- Bar graphs for label distributions
- Word clouds for most common words in each label
- Histogram for document lengths
- Heatmap for label correlations

### 9. Preliminary Observations:
- Based on the descriptive analysis:

### 10. Challenges & Limitations:

# Library Import

In [1]:
import pandas as pd
import numpy as np
from konlpy.tag import Okt, Hannanum, Mecab, Komoran, Kkma

# set column width
pd.set_option('max_colwidth', 1000)

## Data Load

In [2]:
train = pd.read_csv('./data/ko_train_label.csv')
test = pd.read_csv('./data/ko_test_label.csv')

In [3]:
train

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate
0,9976970,아 더빙.. 진짜 짜증나네요 목소리,1,0,0,0,0
1,9045019,교도소 이야기구먼 ..솔직히 재미는 없다..평점 조정,1,0,0,0,0
2,5403919,막 걸음마 뗀 3세부터 초등학교 1학년생인 8살용영화.ㅋㅋㅋ...별반개도 아까움.,1,0,0,0,0
3,7797314,원작의 긴장감을 제대로 살려내지못했다.,1,0,0,0,0
4,9443947,별 반개도 아깝다 욕나온다 이응경 길용우 연기생활이몇년인지..정말 발로해도 그것보단 낫겟다 납치.감금만반복반복..이드라마는 가족도없다 연기못하는사람만모엿네,1,0,0,0,1
...,...,...,...,...,...,...,...
9994,7448293,혹시나 그래도 카메론디아즈니까 하고봤는데...먼이런영화를..이도저도아닌 .아암튼 시간 남아도보지않길.내용.전개 꽝입니다,1,0,0,0,0
9995,5824024,10점주는것들 한국영화는 1점주네. M창,1,0,0,0,0
9996,6420437,영상도 아름답고 뭘 말하는지 알겠지만 그렇기 때문에 짜증나고 답답하다,1,0,0,0,0
9997,6777278,영화를 왜 영화라고 하는지 모르는 애들이 나왔네.,1,0,0,0,0


In [4]:
test

Unnamed: 0,id,document,"""toxic""","""obscene""","""threat""","""insult""","""identity_hate""",Unnamed: 7
0,8544678.0,뭐야 이 평점들은.... 나쁘진 않지만 10점 짜리는 더더욱 아니잖아,1.0,0.0,0.0,0.0,0.0,
1,6825595.0,지루하지는 않은데 완전 막장임... 돈주고 보기에는....,1.0,0.0,0.0,0.0,0.0,
2,6723715.0,3D만 아니었어도 별 다섯 개 줬을텐데.. 왜 3D로 나와서 제 심기를 불편하게 하죠??,1.0,0.0,0.0,0.0,0.0,
3,6315043.0,진정한 쓰레기,1.0,0.0,0.0,0.0,0.0,
4,8932678.0,갈수록 개판되가는 중국영화 유치하고 내용없음 폼잡다 끝남 말도안되는 무기에 유치한cg남무 아 그립다 동사서독같은 영화가 이건 3류아류작이다,1.0,0.0,0.0,0.0,0.0,
...,...,...,...,...,...,...,...,...
11408,,,,,,,,
11409,,,,,,,,
11410,,,,,,,,
11411,,,,,,,,


In [5]:
del test['Unnamed: 7']

In [6]:
print(train.toxic.value_counts())
print(train.obscene.value_counts())
print(train.threat.value_counts())
print(train.insult.value_counts())
print(train.identity_hate.value_counts())

toxic
1    9969
0      30
Name: count, dtype: int64
obscene
0    9904
1      95
Name: count, dtype: int64
threat
0    9966
1      33
Name: count, dtype: int64
insult
0    9774
1     225
Name: count, dtype: int64
identity_hate
0    9738
1     261
Name: count, dtype: int64
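
The outline asks for both counts and percentages of positive and negative labels, while `value_counts()` reports counts only. The following is a minimal sketch, assuming the labels are 0/1 with no missing values. The `label_summary` helper and the toy frame are hypothetical and used only for illustration.

```python
import pandas as pd

label_columns = ["toxic", "obscene", "threat", "insult", "identity_hate"]

def label_summary(df, columns):
    """Return positive/negative counts and the positive percentage per label.

    Assumes every label value is 0 or 1 and that no value is missing.
    """
    positives = df[columns].eq(1).sum()
    total = len(df)
    return pd.DataFrame({
        "positive": positives,
        "negative": total - positives,
        "positive_pct": (positives / total * 100).round(2),
    })

# Toy frame standing in for `train` (made-up values, not the real data)
toy = pd.DataFrame({
    "toxic": [1, 1, 1, 0],
    "obscene": [0, 0, 1, 0],
    "threat": [0, 0, 0, 0],
    "insult": [0, 1, 0, 0],
    "identity_hate": [1, 0, 0, 0],
})
summary = label_summary(toy, label_columns)
print(summary)
```

On the notebook's data the call would be `label_summary(train, label_columns)`.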


- Preprocess the test column names (strip the stray double quotes)

In [7]:
col_list = test.columns.tolist()
pre_col_list = [x.replace('"', "") for x in col_list]
test.columns = pre_col_list

In [8]:
test.isnull().sum()

id               1414
document         1414
toxic            1414
obscene          1410
threat           1414
insult           1413
identity_hate    1414
dtype: int64
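
The null counts are not identical across columns (1410 for `obscene` and 1413 for `insult`, against 1414 elsewhere), which suggests that a few rows are only partially empty rather than blank. Before dropping them, it may be worth looking at those rows separately. The sketch below uses a made-up toy frame, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for `test` (made-up values, not the real data)
toy = pd.DataFrame({
    "document": ["a", None, "c", None],
    "toxic": [1.0, np.nan, 1.0, np.nan],
    "obscene": [0.0, np.nan, np.nan, 1.0],
})

# Number of missing cells in each row
nulls_per_row = toy.isnull().sum(axis=1)
n_cols = toy.shape[1]

# Rows where every column is missing (safe to drop)
fully_empty = toy[nulls_per_row == n_cols]
# Rows where only some columns are missing (worth inspecting first)
partially_empty = toy[(nulls_per_row > 0) & (nulls_per_row < n_cols)]

print(len(fully_empty), len(partially_empty))
```

Note that `dropna()` with its default arguments removes both kinds of rows.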

In [9]:
test.dropna(inplace=True)
test.reset_index(drop=True, inplace=True)

In [10]:
print(test.toxic.value_counts())
print(test.obscene.value_counts())
print(test.threat.value_counts())
print(test.insult.value_counts())
print(test.identity_hate.value_counts())

toxic
1.0    9908
0.0      91
Name: count, dtype: int64
obscene
0.0    9911
1.0      88
Name: count, dtype: int64
threat
0.0    9977
1.0      22
Name: count, dtype: int64
insult
0.0    9806
1.0     193
Name: count, dtype: int64
identity_hate
0.0    9788
1.0     211
Name: count, dtype: int64
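
The outline also lists inter-label correlations. One possible approach, shown here as a sketch on made-up toy data, is the Pearson correlation between the binary label columns via `DataFrame.corr()`. Because `toxic` is 1 for almost every row in this dataset, correlations involving it are likely to carry little information, and a label column that is constant yields `NaN`.

```python
import pandas as pd

# Toy labels (made-up values, not the real data)
toy = pd.DataFrame({
    "toxic":  [1, 1, 0, 0],
    "insult": [1, 1, 0, 0],
    "threat": [0, 1, 0, 1],
})

# Pearson correlation between each pair of label columns
corr = toy.corr()
print(corr.round(2))
```

On the notebook's data the equivalent call would be `train[label_columns].corr()`, and the resulting matrix could be passed to a heatmap plot.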


## Data Distributions

In [11]:
documents = train['document'].tolist()

In [12]:
documents[:5]

['아 더빙.. 진짜 짜증나네요 목소리',
 '교도소 이야기구먼 ..솔직히 재미는 없다..평점 조정',
 '막 걸음마 뗀 3세부터 초등학교 1학년생인 8살용영화.ㅋㅋㅋ...별반개도 아까움.',
 '원작의 긴장감을 제대로 살려내지못했다.',
 '별 반개도 아깝다 욕나온다 이응경 길용우 연기생활이몇년인지..정말 발로해도 그것보단 낫겟다 납치.감금만반복반복..이드라마는 가족도없다 연기못하는사람만모엿네']

In [13]:
len('아 더빙.. 진짜 짜증나네요 목소리')

19

In [14]:
train['senteces_len'] = [len(x) for x in train['document']]
test['senteces_len'] = [len(x) for x in test['document']]

In [15]:
train.senteces_len.value_counts()

senteces_len
14     297
13     276
15     274
12     264
16     254
      ... 
1        7
110      6
122      6
143      1
141      1
Name: count, Length: 142, dtype: int64

In [16]:
train['senteces_len'].describe()

count    9999.000000
mean       36.518852
std        30.713891
min         1.000000
25%        16.000000
50%        27.000000
75%        43.000000
max       143.000000
Name: senteces_len, dtype: float64

In [17]:
test.senteces_len.describe()

count    9999.000000
mean       36.283228
std        30.427773
min         1.000000
25%        16.000000
50%        27.000000
75%        43.000000
max       144.000000
Name: senteces_len, dtype: float64
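
`describe()` reports the mean and the median (the 50% row) but not the mode, which the outline also asks for. A minimal sketch on a made-up toy series follows; `Series.mode()` returns every value tied for the highest frequency, so the result is kept as a list.

```python
import pandas as pd

# Toy lengths standing in for train['senteces_len'] (made-up values)
lengths = pd.Series([3, 5, 5, 8, 13], name="senteces_len")

stats = {
    "mean": lengths.mean(),
    "median": lengths.median(),
    "mode": lengths.mode().tolist(),  # may contain several values on ties
}
print(stats)
```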

In [18]:
test.loc[test.senteces_len == 144]

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate,senteces_len
5692,8108106.0,"""차라리 막장이면 욕이나 하지 이건 개콘의 시청률의 제왕이 만드는 드라마보다 못해 작가랑 피디가 """"기왕 망한거 끝까지 가보자""""는 마음으로 제대로 낄낄대며 장난친듯. 웃어라 동해야 이후로 제대로 쓰레기를 만난 느낌ㅋㅋ 민폐녀 최세영은 드라마 최고악녀 등극""",1.0,0.0,0.0,0.0,0.0,144


In [19]:
train.loc[train.senteces_len == 143]

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate,senteces_len
2558,10141871,"""배우들 연기 :10점, 연출 :10점 , 스토리 : 10노잼 .. 내용이 짜집기가 안될거같을때 주인공이 """"이건현실이 아니야!!"""" 한마디하면서 게속 얼렁뚱땅 넘어가는느낌? 보는데 진짜 찝찝;; 내용이 앞뒤는 안맞고 뭐 자꾸 이건현실이 아니라면서넘어가니""",1,0,0,0,0,143


In [20]:
test.loc[test.senteces_len == 1]

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate,senteces_len
190,7191789.0,헐,1.0,0.0,0.0,0.0,0.0,1
2308,5096494.0,땡,1.0,0.0,0.0,0.0,0.0,1
4793,1320638.0,뷁,1.0,0.0,0.0,0.0,0.0,1
9602,5584524.0,헐,0.0,0.0,0.0,0.0,0.0,1


In [21]:
train.loc[train.senteces_len == 1]

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate,senteces_len
2329,4920691,즐,1,0,0,0,0,1
4168,1640173,굿,1,0,0,0,0,1
4192,6931782,쒯,1,0,0,0,0,1
5550,5873214,똥,1,0,0,0,0,1
7340,132033,음,1,0,0,0,0,1
7738,7080841,헐,1,0,0,0,0,1
8657,6383337,꽝,1,0,0,0,0,1


In [22]:
test.loc[test.senteces_len == 2]

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate,senteces_len
97,1050028.0,글쎄,0.0,0.0,0.0,0.0,0.0,2
220,1650924.0,별로,1.0,0.0,0.0,0.0,0.0,2
295,7086051.0,토해,1.0,0.0,0.0,0.0,0.0,2
313,7648802.0,최악,1.0,0.0,0.0,0.0,0.0,2
498,6995414.0,졸작,1.0,0.0,0.0,0.0,0.0,2
...,...,...,...,...,...,...,...,...
9520,82108.0,꾸엑,1.0,0.0,0.0,0.0,0.0,2
9577,6160534.0,토해,1.0,0.0,0.0,0.0,0.0,2
9579,5058289.0,별로,1.0,0.0,0.0,0.0,0.0,2
9617,4562968.0,냠냠,1.0,0.0,0.0,0.0,0.0,2


In [38]:
# Find documents that appear more than once, sorted by their label values
label_columns = ["toxic", "obscene", "threat", "insult", "identity_hate"]
duplicates = train[train.duplicated(subset=['document'], keep=False)].sort_values(by=label_columns)

In [39]:
num_duplicates = len(duplicates)
sample_duplicates = duplicates.head(10)

In [43]:
document_counts = duplicates.groupby('document').size().reset_index(name='counts')
document_counts_sorted = document_counts.sort_values(by='counts', ascending=False)

In [44]:
# These duplicated documents need to be handled
document_counts_sorted

Unnamed: 0,document,counts
12,별로,8
4,bad,8
41,최악,5
3,OOO기영화,4
9,물체가 움직이거나 어떤 일이 진행되는 빠르기.,4
22,비추,3
33,재미없어,3
18,별루,3
34,재미없음,3
15,별로다,3
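
How to handle the duplicated documents is left open above. One option, sketched here on made-up toy data and not confirmed as the approach used later, is to flag documents whose copies carry different labels and then drop only the exact repeats. The variable names are hypothetical.

```python
import pandas as pd

labels = ["toxic", "insult"]

# Toy frame standing in for `train` (made-up values, not the real data)
toy = pd.DataFrame({
    "document": ["별로", "별로", "bad", "bad", "최악"],
    "toxic":    [1, 1, 1, 0, 1],
    "insult":   [0, 0, 0, 0, 0],
})

# A document is conflicting if any label takes more than one value across its copies
conflict_mask = toy.groupby("document")[labels].nunique().gt(1).any(axis=1)
conflicting_docs = conflict_mask[conflict_mask].index.tolist()

# Drop rows that repeat both the text and all labels; conflicting copies remain
deduped = toy.drop_duplicates(subset=["document"] + labels).reset_index(drop=True)

print(conflicting_docs, len(deduped))
```

Conflicting copies are kept on purpose so that they can be reviewed by hand rather than resolved silently.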


## pos tagging

In [55]:
from tqdm import tqdm

In [45]:
train_document = train['document'].tolist()
test_document = test['document'].tolist() 

In [48]:
okt = Okt()
hannanum = Hannanum()
komoran = Komoran()
kkma = Kkma()

In [60]:
okt_pos = []
hannanum_pos = []
komoran_pos = []
kkma_pos = []  

for value in tqdm(train_document):
    okt_pos.append(okt.pos(value))
    hannanum_pos.append(hannanum.pos(value))
    komoran_pos.append(komoran.pos(value))
    kkma_pos.append(kkma.pos(value))

100%|██████████| 9999/9999 [03:07<00:00, 53.43it/s]


In [61]:
okt_pos_test = []
hannanum_pos_test = []
komoran_pos_test = []
kkma_pos_test = []  

for value in tqdm(test_document):
    okt_pos_test.append(okt.pos(value))
    hannanum_pos_test.append(hannanum.pos(value))
    komoran_pos_test.append(komoran.pos(value))
    kkma_pos_test.append(kkma.pos(value))

100%|██████████| 9999/9999 [03:30<00:00, 47.51it/s]


In [64]:
train['okt_pos'] = okt_pos
train['hannanum_pos'] = hannanum_pos
train['komoran_pos'] = komoran_pos
train['kkma_pos'] = kkma_pos
test['okt_pos'] = okt_pos_test
test['hannanum_pos'] = hannanum_pos_test
test['komoran_pos'] = komoran_pos_test
test['kkma_pos'] = kkma_pos_test

In [65]:
test

Unnamed: 0,id,document,toxic,obscene,threat,insult,identity_hate,senteces_len,okt_pos,hannanum_pos,komoran_pos,kkma_pos
0,8544678.0,뭐야 이 평점들은.... 나쁘진 않지만 10점 짜리는 더더욱 아니잖아,1.0,0.0,0.0,0.0,0.0,38,"[(뭐, Noun), (야, Josa), (이, Noun), (평점, Noun), (들, Suffix), (은, Josa), (...., Punctuation), (나쁘진, Adjective), (않지만, Verb), (10, Number), (점, Noun), (짜, Verb), (리, Noun), (는, Josa), (더, Noun), (더욱, Noun), (아니잖아, Adjective)]","[(뭐, N), (이, J), (야, E), (이, M), (평점들, N), (은, J), (...., S), (나쁘진, N), (않, P), (지, E), (말, P), (ㄴ, E), (10점, N), (짜, P), (리는, E), (더더욱, M), (아니, P), (잖아, E)]","[(뭐, NP), (야, JX), (이, MM), (평점, NNG), (들, XSN), (은, JX), (..., SE), (., SF), (나쁘, VA), (지, EC), (ㄴ, JX), (않, VX), (지만, EC), (10, SN), (점, NNB), (짜리, XSN), (는, JX), (더더욱, MAG), (아니, VCN), (잖아, EC)]","[(뭐, NP), (야, JX), (이, MDT), (평점, NNG), (들, XSN), (은, JX), (...., SW), (나쁘, VA), (지, EFN), (는, JX), (않, VXV), (지만, ECE), (10, NR), (점, NNM), (짜리, VV), (는, ETD), (더더욱, MAG), (아니, VV), (잖아, EFN)]"
1,6825595.0,지루하지는 않은데 완전 막장임... 돈주고 보기에는....,1.0,0.0,0.0,0.0,0.0,32,"[(지루하지는, Adjective), (않은데, Verb), (완전, Noun), (막장, Noun), (임, Noun), (..., Punctuation), (돈, Noun), (주고, Verb), (보기, Noun), (에는, Josa), (...., Punctuation)]","[(지루, N), (하, X), (어, E), (지, P), (는, E), (않, P), (은, E), (데, N), (완전, N), (막장, N), (이, J), (ㅁ, E), (..., S), (돈주, N), (이, J), (고, E), (보, P), (기, E), (에는, J), (...., S)]","[(지루, XR), (하, XSA), (지, EC), (는, JX), (않, VX), (은데, EC), (완전, NNG), (막, NNG), (장임, NNP), (..., SE), (돈, NNG), (주, VV), (고, EC), (보, VV), (기, ETN), (에, JKB), (는, JX), (..., SE), (., SF)]","[(지루, XR), (하, XSA), (지, ECD), (는, JX), (않, VXA), (은데, ECD), (완전, NNG), (막장, NNG), (임, NNG), (..., SE), (돈, NNG), (주고, NNG), (보, VV), (기에, ECD), (는, JX), (...., SW)]"
2,6723715.0,3D만 아니었어도 별 다섯 개 줬을텐데.. 왜 3D로 나와서 제 심기를 불편하게 하죠??,1.0,0.0,0.0,0.0,0.0,49,"[(3, Number), (D, Alpha), (만, Noun), (아니었어도, Adjective), (별, Noun), (다섯, Noun), (개, Noun), (줬을텐데, Verb), (.., Punctuation), (왜, Noun), (3, Number), (D, Alpha), (로, Noun), (나와서, Verb), (제, Noun), (심기, Noun), (를, Josa), (불편하게, Adjective), (하죠, Verb), (??, Punctuation)]","[(3D, N), (만, J), (아니, P), (었어도, E), (별, M), (다섯, N), (개, N), (주, P), (었을텐데, E), (.., S), (왜, M), (3D, N), (로, J), (나오, P), (아, E), (저, N), (의, J), (심기, N), (를, J), (불편, N), (하, X), (게, E), (하, P), (죠, E), (??, S)]","[(3D, NNP), (만, JX), (아니, VCN), (었, EP), (어도, EC), (별, MM), (다섯, NR), (개, NNB), (주, VX), (었, EP), (을, ETM), (텐, NNG), (데, NNB), (., SF), (., SF), (왜, MAG), (3D, NNP), (로, JKB), (나오, VV), (아서, EC), (제, XPN), (심기, NNG), (를, JKO), (불편, NNG), (하, XSV), (게, EC), (하, VX), (죠, EF), (?, SF), (?, SF)]","[(3, NR), (D, OL), (만, JX), (아니, VCN), (었, EPT), (어도, ECD), (별, NNG), (다섯, NR), (개, NNM), (주, VV), (었, EPT), (을, ETD), (터, NNB), (이, VCP), (ㄴ데, ECE), (.., SW), (왜, MAG), (3, NR), (D, OL), (로, JKM), (나오, VV), (아서, ECD), (저, NP), (의, JKG), (심기, NNG), (를, JKO), (불편, NNG), (하, XSV), (게, ECD), (하, VV), (죠, EFN), (??, SW)]"
3,6315043.0,진정한 쓰레기,1.0,0.0,0.0,0.0,0.0,7,"[(진정한, Adjective), (쓰레기, Noun)]","[(진정한, N), (쓰레기, N)]","[(진정, XR), (하, XSA), (ㄴ, ETM), (쓰레기, NNP)]","[(진정, NNG), (하, XSV), (ㄴ, ETD), (쓰레기, NNG)]"
4,8932678.0,갈수록 개판되가는 중국영화 유치하고 내용없음 폼잡다 끝남 말도안되는 무기에 유치한cg남무 아 그립다 동사서독같은 영화가 이건 3류아류작이다,1.0,0.0,0.0,0.0,0.0,77,"[(갈수록, Noun), (개판, Noun), (되가는, Verb), (중국영화, Noun), (유치하고, Adjective), (내용, Noun), (없음, Adjective), (폼, Noun), (잡다, Verb), (끝남, Verb), (말, Noun), (도, Josa), (안되는, Adjective), (무기, Noun), (에, Josa), (유치한, Adjective), (cg, Alpha), (남무, Noun), (아, Exclamation), (그립다, Verb), (동사서독, Noun), (같은, Adjective), (영화, Noun), (가, Josa), (이건, Noun), (3, Number), (류, Noun), (아, Josa), (류작, Noun), (이다, Josa)]","[(가, P), (ㄹ수록, E), (개판되가, N), (는, J), (중국영화, N), (유치, N), (하고, J), (내용, N), (없, X), (음, E), (폼잡, P), (다, E), (끝나, P), (ㅁ, E), (말도안되, N), (는, J), (무기, N), (에, J), (유치한cg남무, N), (아, I), (그립, P), (다, E), (동사서독, N), (같, X), (은, E), (영화, N), (가, J), (이, N), (이, J), (건, E), (3류아류작, N), (이, J), (다, E)]","[(갈수록, MAG), (개판, NNG), (되, XSV), (가, XSN), (는, JX), (중국, NNP), (영화, NNP), (유치, XR), (하, XSA), (고, EC), (내용, NNG), (없, VA), (음, ETN), (폼, NNG), (잡, VV), (다, EC), (끝나, VV), (ㅁ, ETN), (말도, NNP), (안, NNG), (되, XSV), (는, ETM), (무기, NNG), (에, JKB), (유치, NNP), (한, NNP), (cg, SL), (남무, NNP), (아, IC), (그립, VA), (다, EC), (동사서독, NNP), (같, VA), (은, ETM), (영화, NNG), (가, JKS), (이건, NNP), (3, SN), (류, NNP), (아, NNP), (류, NNP), (작, NNG), (이, VCP), (다, EC)]","[(갈수록, MAG), (개판되가, UN), (는, JX), (중국, NNG), (영화, NNG), (유치, NNG), (하, XSV), (고, ECE), (내용, NNG), (없, VA), (음, ETN), (폼잡, VV), (다, ECS), (끝나, VV), (ㅁ, ETN), (말, NNG), (도, JX), (안되, VA), (는, ETD), (무기, NNG), (에, JKM), (유치, NNG), (하, XSV), (ㄴ, ETD), (cg, OL), (남무, NNG), (아, VV), (아, ECS), (그립, VA), (다, EFN), (동사, NNG), (서독, NNG), (같, VA), (은, ETD), (영화, NNG), (가, JKS), (이건, NNP), (3, NR), (류, NNG), (아류, NNG), (작, NNG), (이, VCP), (다, EFN)]"
...,...,...,...,...,...,...,...,...,...,...,...,...
9994,6766192.0,올해 최악의 영화,1.0,0.0,0.0,0.0,0.0,9,"[(올해, Noun), (최악, Noun), (의, Josa), (영화, Noun)]","[(올해, N), (최악, N), (의, J), (영화, N)]","[(올해, NNG), (최악, NNG), (의, JKG), (영화, NNG)]","[(올해, NNG), (최악, NNG), (의, JKG), (영화, NNG)]"
9995,3006940.0,"정말재미없게본영화는시간이지난다음 내용,엔딩이잘기억안나는데방탄승도생각안난다..",1.0,0.0,0.0,0.0,0.0,42,"[(정말, Noun), (재미없게, Adjective), (본, Modifier), (영화, Noun), (는, Josa), (시간, Noun), (이, Josa), (지난, Noun), (다음, Noun), (내용, Noun), (,, Punctuation), (엔딩, Noun), (이, Josa), (잘, Verb), (기억, Noun), (안나, Noun), (는, Josa), (데, Noun), (방탄, Noun), (승도, Noun), (생각, Noun), (안, Noun), (난, Josa), (다, Adverb), (.., Punctuation)]","[(정말재미없게본영화는시간이지난다음, N), (내용,엔딩이잘기억안나는데방탄승도생각안난다, N), (.., S)]","[(정말, MAG), (재미없, VA), (게, EC), (보, VX), (ㄴ, ETM), (영화, NNP), (는, JX), (시간, NNG), (이, VCP), (지, EC), (나, VX), (ㄴ, ETM), (다음, NNP), (내용, NNG), (,, SP), (엔, NNG), (딩, MAG), (이, VCP), (잘, ETM), (기억, NNP), (안나, NNP), (는, JX), (데, NNB), (방탄, NNP), (승도, NNG), (생각, NNG), (안나, NNP), (ㄴ다, EF), (., SF), (., SF)]","[(정말, MAG), (재미없, VA), (게, ECD), (보, VV), (ㄴ, ETD), (영화, NNG), (는, JX), (시간, NNG), (이, JKS), (지나, VV), (ㄴ, ETD), (다음, NNG), (내용, NNG), (,, SP), (엔딩, NNG), (이, JKS), (잘, MAG), (기억, NNG), (안, NNG), (나, VV), (는데, ECD), (방탄, NNG), (승, NNG), (도, JX), (생각, NNG), (안, NNG), (낳, VV), (ㄴ다, ECS), (.., SW)]"
9996,3342697.0,내 취향 아닌갑다,1.0,0.0,0.0,0.0,0.0,9,"[(내, Noun), (취향, Noun), (아닌, Adjective), (갑다, Verb)]","[(내, N), (취향, N), (아닌갑다, N)]","[(내, NP), (취향, NNG), (아니, VCN), (ㄴ, ETM), (갑, NNG), (다, JX)]","[(내, NP), (취향, NNG), (아니, VV), (ㄴ, ETD), (갑, NNG), (닿, VV)]"
9997,10120473.0,"""여주 """"이들과 함께아니면 안가"""",""""날속이다니 퉤"""",""""내친구를 쐈어"""" 브루스""""헬기돌려""",1.0,0.0,0.0,0.0,0.0,55,"[("", Punctuation), (여주, Noun), ("""", Punctuation), (이, Noun), (들, Suffix), (과, Josa), (함께, Adverb), (아니면, Adjective), (안, Noun), (가, Josa), ("""","""", Punctuation), (날, Noun), (속이다니, Verb), (퉤, Noun), ("""","""", Punctuation), (내, Determiner), (친구, Noun), (를, Josa), (쐈어, Verb), ("""", Punctuation), (브루스, Noun), ("""", Punctuation), (헬기, Noun), (돌려, Verb), ("", Punctuation)]","[(""여주, N), ("""", S), (이, N), (들, X), (과, J), (함께아니, N), (이, J), (면, E), (안, N), (가, J), ("""","""", S), (날, N), (속, X), (이, J), (다니, E), (퉤"""",""""내친구, N), (를, J), (쓰, P), (었어, E), ("""", S), (브루스""""헬기돌려, N), ("", S)]","[("", SS), (여주, NNP), ("", SS), ("", SS), (이, NP), (들, XSN), (과, JC), (함께, MAG), (아니, VA), (면, EC), (알, VV), (ㄴ가, EC), ("", SS), ("", SS), (,, SP), ("", SS), ("", SS), (날, NNG), (속, NNG), (이, VCP), (다니, EC), (퉤, MAG), ("", SS), ("", SS), (,, SP), ("", SS), ("", SS), (내, NP), (친구, NNG), (를, JKO), (쏘, VV), (았, EP), (어, EC), ("", SS), ("", SS), (브루스, NNP), ("", SS), ("", SS), (헬기, NNG), (돌리, VV), (어, EC), ("", SS)]","[("", SS), (여주, NNG), ("""", SW), (이, NNG), (들, XSN), (과, JKO), (함께, MAG), (아니, VV), (면, ECE), (안, MAG), (가, VV), (아, ECS), ("""", SW), (,, SP), ("""", SW), (날, NNG), (속, NNG), (이, JKS), (닿, VV), (니, ECD), (퉤, MAG), ("""", SW), (,, SP), ("""", SW), (내, NP), (친구, NNG), (를, JKO), (쏘, VV), (았, EPT), (어, EFN), ("""", SW), (브루스, NNG), ("""", SW), (헬기, NNG), (돌리, VV), (어, ECS), ("", SS)]"
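
With the tagged columns in place, the "most frequent words" item from the outline can be approached by counting tokens of selected tags. The sketch below assumes the `(token, tag)` tuple shape shown in the `okt_pos` column. The `top_tokens` helper and the toy input are hypothetical, and the tag names would differ for the other taggers (for example `NNG` rather than `Noun`).

```python
from collections import Counter

def top_tokens(tagged_docs, keep_tags=("Noun",), n=10):
    """Count tokens whose tag is in keep_tags and return the n most common."""
    counter = Counter(
        token
        for doc in tagged_docs
        for token, tag in doc
        if tag in keep_tags
    )
    return counter.most_common(n)

# Toy input in the same shape as the okt_pos column (made-up documents)
toy_okt_pos = [
    [("올해", "Noun"), ("최악", "Noun"), ("의", "Josa"), ("영화", "Noun")],
    [("진정한", "Adjective"), ("쓰레기", "Noun")],
    [("영화", "Noun"), ("가", "Josa"), ("별로", "Noun")],
]
top = top_tokens(toy_okt_pos, n=1)
print(top)
```

On the notebook's data the call would be `top_tokens(train['okt_pos'])`, optionally filtered by label first to compare vocabularies.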
