In [10]:
#!pip install python-crfsuite

Collecting python-crfsuite
  Downloading python_crfsuite-0.9.7-cp37-cp37m-manylinux1_x86_64.whl (743 kB)
[K     |████████████████████████████████| 743 kB 1.4 MB/s eta 0:00:01
[?25hInstalling collected packages: python-crfsuite
Successfully installed python-crfsuite-0.9.7
You should consider upgrading via the '/home/marynepo/anaconda3/envs/py37/bin/python -m pip install --upgrade pip' command.[0m


# Чтение файлов с данными

In [1]:
from sklearn.model_selection import train_test_split
import stanza
stanza.download('ru')

HBox(children=(FloatProgress(value=0.0, description='Downloading https://raw.githubusercontent.com/stanfordnlp…

2021-12-27 02:53:46 INFO: Downloading default packages for language: ru (Russian)...





2021-12-27 02:53:49 INFO: File exists: /home/marynepo/stanza_resources/ru/default.zip.
2021-12-27 02:53:59 INFO: Finished downloading models and saved to /home/marynepo/stanza_resources.


In [2]:
from sklearn.model_selection import train_test_split
import pandas as pd

train_asp = pd.read_csv(
    'train_split_aspects.txt', 
    delimiter='\t', 
    names=['text_id', 'category', 'mention', 'start', 'end', 'sentiment']
)
train_texts = pd.read_csv('train_split_reviews.txt', delimiter='\t', names=['text_id','text'])

# Baseline 1: категория упоминаний

Основной способ, которым мы пользуемся - crf (через pycrfsuite). Для этого делаем bio-разметку данных.
Признаки слов, которые подаем на вход в crf: лемма, часть речи, регистр (istitle, islower, issuper) самого слова и его соседей справа и слева.

Эксперименты:

1) Добавление tfidf и word2vec (ruscorpora_mystem_cbow_300_2_2015.bin.gz) в признаки (смотрим вектора лемм, а не самих слов). Результат практически не изменился (возможно, их надо подавать по-другому, но не нашли как, или эта модель не предназначена для обучения на них).

2) Изменение алгоритмов crf (lbfgs, l2sgd, ap, pa, arow). Дают похожие результаты, лучше всего работают lbfgs, l2sgd  (и у pa лучшая полнота, но хуже все остальное). Неплохие точность и accuracy, хуже всего показатели полноты. Поэтому пробуем добавить информацию из словарей (пункт 3). 

3) Добавление информации из словарей. Составляем правила с помощью yargy и добавляем то, что он извлек, к результатам crf. В результате для всех алгоритмов полнота действительно возрастает на 0.3-0.5, но ожидаемо падают точность и accuracy. 

Датасеты, которые для этого использовались:

1) Датасет со списом заведений общественного питания Москвы от правительства Москвы 2015 г.. Из него брались названия заведений и их тип для категории Whole. Давал совсем плохие результаты на yargy, поэтому не использовался в дальнейшем. [cafes.csv]() [Источник](https://data.gov.ru/opendata/7710881420-obshchestvennoe)

2) Датасет с 147000 рецептами с сайта povarenok.ru. Использовалась только колонка с ингредиентами (брались названия продуктов), поскольку названия блюд были слишком длинные и их нужно было дополнительно парсить, но при попытке выделить NP из них с помощью yargy компьютер зависал и падал. [recipes.csv]() [Источник](https://www.kaggle.com/rogozinushka/povarenok-recipes)

3) Семантический датасет от КартыСлов (есть набор слов, где каждому соответсутвует тег какой-то семантической категории). Брались слова с тегами FOOD (для категории Food) и CONSTRUCTION (для категории Interior, поскольку там встречалась мебель) [semantic_simple.csv]() [Источник](https://raw.githubusercontent.com/dkulagin/kartaslov/master/dataset/open_semantics/simple/semantics_simple.csv)

Результаты хранятся в файлах вида pred_asp.csv (без всего), pred_asp_arg.csv (векторы+алгоритмы) и pred_asp_arg_dict.csv (векторы+алгоритмы+словари), где вместо arg -  название алгоритм.

Лучшие результаты по заданным метрикам (pred_asp_lbfgs):

Full match precision: 0.7906724511930586
Full match recall: 0.6126050420168068
Partial match ratio in pred: 0.8893709327548807
Full category accuracy: 0.7657266811279827
Partial category accuracy: 0.8828633405639913

Судя по этим метрикам + по другим метрикам для crf, самые большие проблемы возникают с аспектами, в которых больше 1 токена, плохо предсказываются I- теги.

### Yargy и словари.

In [3]:
import json

with open('cafes.json') as f:
    cafes = json.load(f)

cafes = pd.json_normalize(cafes)

In [4]:
import re

names = set()
for name in cafes['Name'].values:
    if re.search('«.+»', name):
        names.add(re.search('«(.+)»', name).group(1).lower())

In [27]:
from yargy import rule, Parser, or_, and_
from yargy.predicates import eq
from yargy.predicates import gte, lte, caseless, normalized, dictionary, gram, is_capitalized
from yargy.pipelines import morph_pipeline

'''NP = rule(
    gram('ADJF').optional().repeatable(),
    gram('NOUN'),
    dictionary('и').optional(),
    gram('NOUN').optional(),
    dictionary('c').optional(),
    gram('NOUN').optional(),
)'''

CAFE_TYPE = dictionary(set(cafes['TypeObject'].values))

CAFE_NAMES = dictionary(names)

#WHOLE = rule(CAFE_TYPE.optional(), eq('"').optional(), and_(CAFE_NAMES, is_capitalized()), eq('"').optional())

In [28]:
recipes = pd.read_csv('recipes.csv')

In [29]:
from tqdm import tqdm

In [30]:
import ast
ingrs = set()

for ing in recipes['ingredients'].values:
    try:
        for k in ast.literal_eval(ing).keys():
            ingrs.add(k.lower())
    except:
        pass

In [31]:
!wget https://raw.githubusercontent.com/dkulagin/kartaslov/master/dataset/open_semantics/simple/semantics_simple.csv semantics_simple.csv

--2021-12-27 03:45:07--  https://raw.githubusercontent.com/dkulagin/kartaslov/master/dataset/open_semantics/simple/semantics_simple.csv
Распознаётся raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...
Подключение к raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... соединение установлено.
HTTP-запрос отправлен. Ожидание ответа... 200 OK
Длина: 1123450 (1,1M) [text/plain]
Сохранение в каталог: ««semantics_simple.csv.2»».


2021-12-27 03:45:08 (1,67 MB/s) - «semantics_simple.csv.2» сохранён [1123450/1123450]

--2021-12-27 03:45:08--  http://semantics_simple.csv/
Распознаётся semantics_simple.csv (semantics_simple.csv)... ошибка: Имя или служба не известны.
wget: не удаётся разрешить адрес «semantics_simple.csv»
ЗАВЕРШЕНО --2021-12-27 03:45:08--
Общее время: 1,2s
Загружено: 1 файлов, 1,1M за 0,6s (1,67 MB/s)


In [32]:
semsim = pd.read_csv('semantics_simple.csv', delimiter=';')

In [62]:
set(semsim['tag'].values)

{'ABSTRACT',
 'ABSTRACT:ACTION',
 'ANATOMY',
 'ANIMAL',
 'CONSTRUCTION',
 'FOOD',
 'HUMAN',
 'PLACE',
 'PLANT',
 'SUBSTANCE',
 'THING',
 'TRANSPORT'}

In [65]:
FOOD_N = rule(dictionary(set(semsim['term'][semsim['tag'] == 'FOOD'].values).union(ingrs)))
INTERIOR = rule(dictionary(set(semsim['term'][semsim['tag'] == 'CONSTRUCTION'].values)))

In [38]:
def rule_matches(r, t):
    parser = Parser(r)
    res = []
    for match in parser.findall(t):
        start, end = match.span
        res.append((t[start:end], start, end))
    return res

In [7]:
test_asp = pd.read_csv(
    'dev_aspects.txt', 
    delimiter='\t', 
    names=['text_id', 'category', 'mention', 'start', 'end', 'sentiment']
)
test_texts = pd.read_csv('dev_reviews.txt', delimiter='\t', names=['text_id','text'])



In [66]:
foods = []
wholes = []
intrs = []
for text in tqdm(test_texts['text'].values):
    foods.append(rule_matches(FOOD_N, text))
    wholes.append(rule_matches(WHOLE, text))
    intrs.append(rule_matches(INTERIOR, text))

100%|██████████| 71/71 [00:26<00:00,  2.71it/s]


### Векторайзеры

In [19]:
import urllib

urllib.request.urlretrieve("http://rusvectores.org/static/models/rusvectores2/ruscorpora_mystem_cbow_300_2_2015.bin.gz", "ruscorpora_mystem_cbow_300_2_2015.bin.gz")

('ruscorpora_mystem_cbow_300_2_2015.bin.gz',
 <http.client.HTTPMessage at 0x7f02a26ebd90>)

In [11]:
import gensim

m = 'ruscorpora_mystem_cbow_300_2_2015.bin.gz'

if m.endswith('.vec.gz'):
    model = gensim.models.KeyedVectors.load_word2vec_format(m, binary=False)
elif m.endswith('.bin.gz'):
    model = gensim.models.KeyedVectors.load_word2vec_format(m, binary=True)
else:
    model = gensim.models.KeyedVectors.load(m)

In [12]:
from sklearn.feature_extraction.text import TfidfVectorizer
nlp = stanza.Pipeline('ru', processors='tokenize, pos, lemma')

def normalize(text):
    doc = nlp(text)
    words = [word.lemma for sent in doc.sentences for word in sent.words if word.pos != 'PUNCT']
    return ' '.join(words)
corpus = [normalize(text) for text in train_texts['text'].values]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)

2021-12-27 02:58:03 INFO: Loading these models for language: ru (Russian):
| Processor | Package   |
-------------------------
| tokenize  | syntagrus |
| pos       | syntagrus |
| lemma     | syntagrus |

2021-12-27 02:58:03 INFO: Use device: cpu
2021-12-27 02:58:03 INFO: Loading: tokenize
2021-12-27 02:58:03 INFO: Loading: pos
2021-12-27 02:58:04 INFO: Loading: lemma
2021-12-27 02:58:04 INFO: Done loading processors!


### CRF

In [13]:
import pandas as pd
nlp = stanza.Pipeline('ru', processors='tokenize, pos, lemma')
texts = []
bios = []
poses = []
sentences = []
train_sents = []
lemmas = []
sent_ids = []
sent_id=0

for text_id, text in tqdm(train_texts.values):
    processed = nlp(text)
    text_data = []
    bio_data = []
    lemma_data = []
    pos_data = []
    count_token = 0
    for token in processed.iter_tokens():
        add = False
        for mention in train_asp[train_asp['text_id'] == text_id].values:
            if token.start_char == int(mention[3]) and token.end_char <= int(mention[4]):
                text_data.append(token.text)
                lemma_data.append(token.words[0].lemma)
                bio_data.append('B-'+mention[1])
                pos_data.append(token.words[0].pos)
                count_token += 1
                add = True
                sent_ids.append(sent_id)
            elif token.start_char > int(mention[3]) and token.end_char <= int(mention[4]):
                text_data.append(token.text)
                lemma_data.append(token.words[0].lemma)
                bio_data.append('I-'+mention[1])
                pos_data.append(token.words[0].pos)
                count_token += 1
                add = True
                sent_ids.append(sent_id)
        if not add:
            text_data.append(token.text)
            lemma_data.append(token.words[0].lemma)
            bio_data.append('O')
            pos_data.append(token.words[0].pos)
            count_token += 1
            sent_ids.append(sent_id)
    texts.extend(text_data)
    lemmas.extend(lemma_data)
    bios.extend(bio_data)
    poses.extend(pos_data)
    train_sents.append(list(zip(text_data, lemma_data, pos_data, bio_data)))
    sentences.extend([f'Sentence: {sent_id}'] + ['']*(count_token - 1))
    sent_id+=1

2021-12-27 02:59:23 INFO: Loading these models for language: ru (Russian):
| Processor | Package   |
-------------------------
| tokenize  | syntagrus |
| pos       | syntagrus |
| lemma     | syntagrus |

2021-12-27 02:59:23 INFO: Use device: cpu
2021-12-27 02:59:23 INFO: Loading: tokenize
2021-12-27 02:59:23 INFO: Loading: pos
2021-12-27 02:59:24 INFO: Loading: lemma
2021-12-27 02:59:25 INFO: Done loading processors!
100%|██████████| 213/213 [01:34<00:00,  2.25it/s]


In [14]:
import pandas as pd
nlp = stanza.Pipeline('ru', processors='tokenize, pos, lemma')
texts = []
bios = []
poses = []
sentences = []
test_sents = []
lemmas = []
sent_ids = []
sent_id=0

for text_id, text in tqdm(test_texts.values):
    processed = nlp(text)
    text_data = []
    bio_data = []
    lemma_data = []
    pos_data = []
    count_token = 0
    for token in processed.iter_tokens():
        add = False
        for mention in test_asp[test_asp['text_id'] == text_id].values:
            if token.start_char == int(mention[3]) and token.end_char <= int(mention[4]):
                text_data.append(token.text)
                lemma_data.append(token.words[0].lemma)
                bio_data.append('B-'+mention[1])
                pos_data.append(token.words[0].pos)
                count_token += 1
                add = True
                sent_ids.append(sent_id)
            elif token.start_char > int(mention[3]) and token.end_char <= int(mention[4]):
                text_data.append(token.text)
                lemma_data.append(token.words[0].lemma)
                bio_data.append('I-'+mention[1])
                pos_data.append(token.words[0].pos)
                count_token += 1
                add = True
                sent_ids.append(sent_id)
        if not add:
            text_data.append(token.text)
            lemma_data.append(token.words[0].lemma)
            bio_data.append('O')
            pos_data.append(token.words[0].pos)
            count_token += 1
            sent_ids.append(sent_id)
    texts.extend(text_data)
    lemmas.extend(lemma_data)
    bios.extend(bio_data)
    poses.extend(pos_data)
    test_sents.append(list(zip(text_data, lemma_data, pos_data, bio_data)))
    sentences.extend([f'Sentence: {sent_id}'] + ['']*(count_token - 1))
    sent_id+=1

2021-12-27 03:00:59 INFO: Loading these models for language: ru (Russian):
| Processor | Package   |
-------------------------
| tokenize  | syntagrus |
| pos       | syntagrus |
| lemma     | syntagrus |

2021-12-27 03:00:59 INFO: Use device: cpu
2021-12-27 03:00:59 INFO: Loading: tokenize
2021-12-27 03:01:00 INFO: Loading: pos
2021-12-27 03:01:00 INFO: Loading: lemma
2021-12-27 03:01:00 INFO: Done loading processors!
100%|██████████| 71/71 [00:31<00:00,  2.23it/s]


In [15]:
import numpy as np

In [16]:
def word2features(sent, i):
    word = sent[i][0]
    lemma = sent[i][1]
    postag = sent[i][2]
    tfvec = vectorizer.transform([lemma])
    if lemma + '_' + postag in model:
        w2vec = model[lemma + '_' + postag]
    else:
        w2vec= np.zeros((300,))
    features = [
        'bias',
        'word.lower=' + word.lower(),
        'word.isupper=%s' % word.isupper(),
        'word.istitle=%s' % word.istitle(),
        'word.isdigit=%s' % word.isdigit(),
        'postag=' + postag,
        'lemma=' + lemma,
        'w2v=%s' % w2vec,
        'tfvec=%s' % tfvec,
    ]
    if i > 0:
        word1 = sent[i-1][0]
        lemma1 = sent[i-1][1]
        postag1 = sent[i-1][2]
        features.extend([
            '-1:word.lower=' + word1.lower(),
            '-1:word.istitle=%s' % word1.istitle(),
            '-1:word.isupper=%s' % word1.isupper(),
            '-1:postag=' + postag1,
            '-1:lemma=' + lemma,
        ])
    else:
        features.append('BOS')
  
    if i < len(sent)-1:
        word1 = sent[i+1][0]
        lemma1 = sent[i+1][1]
        postag1 = sent[i+1][2]
        features.extend([
            '+1:word.lower=' + word1.lower(),
            '+1:word.istitle=%s' % word1.istitle(),
            '+1:word.isupper=%s' % word1.isupper(),
            '+1:postag=' + postag1,
            '+1:lemma=' + lemma,
        ])
    else:
        features.append('EOS')
                
    return features


def sent2features(sent):
    return [word2features(sent, i) for i in range(len(sent))]

def sent2labels(sent):
    return [label for token, lemma, postag, label in sent]

def sent2tokens(sent):
    return [token for token, lemma, postag, label in sent] 

In [17]:
X_train = [sent2features(s) for s in train_sents]
y_train = [sent2labels(s) for s in train_sents]

X_test = [sent2features(s) for s in test_sents]
y_test = [sent2labels(s) for s in test_sents]

In [19]:
def bio_classification_report(y_true, y_pred):
    """
    Classification report for a list of BIO-encoded sequences.
    It computes token-level metrics and discards "O" labels.
    
    Note that it requires scikit-learn 0.15+ (or a version from github master)
    to calculate averages properly!
    """
    lb = LabelBinarizer()
    y_true_combined = lb.fit_transform(list(chain.from_iterable(y_true)))
    y_pred_combined = lb.transform(list(chain.from_iterable(y_pred)))
        
    tagset = set(lb.classes_) - {'O'}
    tagset = sorted(tagset, key=lambda tag: tag.split('-', 1)[::-1])
    class_indices = {cls: idx for idx, cls in enumerate(lb.classes_)}
    
    return classification_report(
        y_true_combined,
        y_pred_combined,
        labels = [class_indices[cls] for cls in tagset],
        target_names = tagset,
    )

In [69]:
from itertools import chain
import nltk
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import LabelBinarizer
import sklearn
import pycrfsuite

nlp = stanza.Pipeline('ru', processors='tokenize')

algs = ['lbfgs', 'l2sgd', 'ap', 'pa', 'arow']
for i, alg in enumerate(algs):
    trainer = pycrfsuite.Trainer(algorithm=alg, verbose=False)

    for xseq, yseq in zip(X_train, y_train):
        trainer.append(xseq, yseq)
    '''trainer.set_params({
        'c1': 1.0,   # coefficient for L1 penalty
        'c2': 1e-3,  # coefficient for L2 penalty
        'max_iterations': 50,  # stop earlier

        # include transitions that are possible, but not observed
        'feature.possible_transitions': True
    })'''

    trainer.train('restaurants.crfsuite')
    tagger = pycrfsuite.Tagger()
    tagger.open('restaurants.crfsuite')
    y_pred = [tagger.tag(xseq) for xseq in X_test]
    print(bio_classification_report(y_test, y_pred))

    tids = []
    mentions = []
    cats = []
    starts = []
    ends = []
    for texts, ss, pred, fd, interior in zip(test_texts.values, test_sents, y_pred, foods, intrs):
        text = list(nlp(texts[1]).iter_tokens())
        ment = []
        for i, ss in enumerate(zip(text, pred)):
            token = ss[0]
            tg = ss[1]
            if tg != 'O':
                ment.append(token.text)
                cat = tg[2:]
                if tg[:2] == 'B-':
                    st = token.start_char
                if i + 1 != len(pred):
                    if pred[i+1][:2] != 'I-':
                        en = token.end_char
                        mentions.append(ment)
                        tids.append(texts[0])
                    #print(i, en, token, tg, pred[i+1], pred[i-1])
                        starts.append(st)
                        ends.append(en)
                        cats.append(cat)

                        with open(f'pred_asp_{alg}_dict.txt', 'a') as f:
                            print(texts[0], cat, texts[1][st:en], st, en, 'positive', sep='\t', file=f)
                        ment = []
                else:
                    en = token.end_char
                    mentions.append(ment)
                    tids.append(texts[0])
                    starts.append(st)
                    ends.append(en)
                    cats.append(cat)
                    with open(f'pred_asp_{alg}_dict.txt', 'a') as f:
                        print(texts[0], cat, texts[1][st:en], st, en, 'positive', sep='\t', file=f)
                    ment = []
            elif (token.text, token.start_char, token.end_char) in fd:
                with open(f'pred_asp_{alg}_dict.txt', 'a') as f:
                        print(texts[0], 'B-Food', token.text, token.start_char, token.end_char, 'positive', sep='\t', file=f)
            elif (token.text, token.start_char, token.end_char) in interior:
                with open(f'pred_asp_{alg}_dict.txt', 'a') as f:
                        print(texts[0], 'B-Interior', token.text, token.start_char, token.end_char, 'positive', sep='\t', file=f)

2021-12-27 04:13:53 INFO: Loading these models for language: ru (Russian):
| Processor | Package   |
-------------------------
| tokenize  | syntagrus |

2021-12-27 04:13:53 INFO: Use device: cpu
2021-12-27 04:13:53 INFO: Loading: tokenize
2021-12-27 04:13:53 INFO: Done loading processors!
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

      B-Food       0.76      0.65      0.70       449
      I-Food       0.78      0.58      0.67       273
  B-Interior       0.90      0.54      0.68       176
  I-Interior       0.53      0.34      0.42        29
     B-Price       0.79      0.56      0.66        34
     I-Price       0.00      0.00      0.00        11
   B-Service       0.91      0.63      0.75       338
   I-Service       0.75      0.49      0.59        74
     B-Whole       0.80      0.74      0.77       185
     I-Whole       0.75      0.31      0.44        48

   micro avg       0.81      0.60      0.69      1617
   macro avg       0.70      0.48      0.57      1617
weighted avg       0.80      0.60      0.69      1617
 samples avg       0.08      0.08      0.08      1617



  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

      B-Food       0.75      0.67      0.71       449
      I-Food       0.80      0.59      0.68       273
  B-Interior       0.90      0.54      0.68       176
  I-Interior       0.50      0.34      0.41        29
     B-Price       0.78      0.53      0.63        34
     I-Price       0.00      0.00      0.00        11
   B-Service       0.89      0.65      0.75       338
   I-Service       0.61      0.51      0.56        74
     B-Whole       0.80      0.78      0.79       185
     I-Whole       0.77      0.35      0.49        48

   micro avg       0.80      0.62      0.70      1617
   macro avg       0.68      0.50      0.57      1617
weighted avg       0.80      0.62      0.69      1617
 samples avg       0.08      0.08      0.08      1617



  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

      B-Food       0.73      0.67      0.70       449
      I-Food       0.83      0.49      0.62       273
  B-Interior       0.86      0.60      0.71       176
  I-Interior       0.59      0.34      0.43        29
     B-Price       0.81      0.65      0.72        34
     I-Price       0.00      0.00      0.00        11
   B-Service       0.87      0.65      0.75       338
   I-Service       0.80      0.43      0.56        74
     B-Whole       0.77      0.74      0.76       185
     I-Whole       0.71      0.35      0.47        48

   micro avg       0.80      0.61      0.69      1617
   macro avg       0.70      0.49      0.57      1617
weighted avg       0.79      0.61      0.68      1617
 samples avg       0.08      0.08      0.08      1617



  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

      B-Food       0.73      0.71      0.72       449
      I-Food       0.81      0.53      0.64       273
  B-Interior       0.85      0.64      0.73       176
  I-Interior       0.58      0.52      0.55        29
     B-Price       0.79      0.68      0.73        34
     I-Price       0.00      0.00      0.00        11
   B-Service       0.84      0.68      0.75       338
   I-Service       0.74      0.47      0.58        74
     B-Whole       0.76      0.77      0.77       185
     I-Whole       0.75      0.38      0.50        48

   micro avg       0.78      0.64      0.70      1617
   macro avg       0.68      0.54      0.60      1617
weighted avg       0.78      0.64      0.70      1617
 samples avg       0.09      0.09      0.09      1617



  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

      B-Food       0.62      0.67      0.64       449
      I-Food       0.71      0.46      0.56       273
  B-Interior       0.77      0.56      0.65       176
  I-Interior       0.45      0.48      0.47        29
     B-Price       0.71      0.59      0.65        34
     I-Price       0.00      0.00      0.00        11
   B-Service       0.67      0.64      0.65       338
   I-Service       0.49      0.31      0.38        74
     B-Whole       0.70      0.67      0.68       185
     I-Whole       0.46      0.23      0.31        48

   micro avg       0.65      0.58      0.61      1617
   macro avg       0.56      0.46      0.50      1617
weighted avg       0.65      0.58      0.61      1617
 samples avg       0.08      0.08      0.08      1617

