An implementation of a practical assignment for the special mathematics course "Machine Learning and Artificial Intelligence" at the CMC faculty of Moscow State University (ВМК МГУ), 2022.

As part of a larger project, lines of technical specification documents need to be classified into the following classes:
- `raw_text`: plain text
- `title`: document title
- `toc`: table-of-contents entry
- `item`: heading
- `part`: list item

Each line is described by the following parameters:
- `day_month_regexp`: whether the day-and-month regular expression matched
- `year_regexp`: whether the year regular expression matched
- `dot_number_regexp`: whether the bulleted-list regular expression matched
- `named_item_regexp`: whether the numbered-list regular expression matched
- `dot_number_regexp_len`: length of the substring matched by the bulleted-list regular expression
- `indentation`: indentation
- `is_in_toc`: whether the line is inside the table of contents
- `is_lower`: whether the whole line is lowercase
- `is_toc_line`: whether the line is a table-of-contents line
- `is_upper`: whether the whole line is uppercase
- `line_id`: line number
- `list_item`: list item
- `start_regexp_0`: whether the first start regular expression matched
- `start_regexp_1`: whether the second start regular expression matched
- `start_regexp_2`: whether the third start regular expression matched
- `start_regexp_3`: whether the fourth start regular expression matched
- `start_regexp_num_matches`: number of start regular expressions that matched
- `text_length`: text length
- `words_number`: number of words
- `uid`: unique identifier
- `text`: text content
- `label`: target label

In addition to the current line, classification uses some parameters of the three lines above and the three lines below the line being classified. The names of such features end in `_prev_{1..3}` and `_next_{1..3}`, for example `is_lower_prev_3`.
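
The context features described above could be built roughly as follows. This is a minimal sketch, not the project's code: the helper `add_context_features`, the choice of features, and the default value `0.0` for neighbours that fall outside the document are all assumptions made for illustration.

```python
from typing import Dict, List


def add_context_features(rows: List[Dict[str, float]],
                         names: List[str],
                         window: int = 3,
                         default: float = 0.0) -> List[Dict[str, float]]:
    """Return copies of rows extended with <name>_prev_k and <name>_next_k."""
    result = []
    for i, row in enumerate(rows):
        extended = dict(row)
        for k in range(1, window + 1):
            # Neighbours k lines above and below; None at document borders.
            prev_row = rows[i - k] if i - k >= 0 else None
            next_row = rows[i + k] if i + k < len(rows) else None
            for name in names:
                extended[f"{name}_prev_{k}"] = (
                    prev_row[name] if prev_row is not None else default)
                extended[f"{name}_next_{k}"] = (
                    next_row[name] if next_row is not None else default)
        result.append(extended)
    return result


# Toy document of five lines with two base features each (made-up values).
lines = [
    {"is_lower": 1, "indentation": 0},
    {"is_lower": 0, "indentation": 2},
    {"is_lower": 1, "indentation": 4},
    {"is_lower": 0, "indentation": 2},
    {"is_lower": 1, "indentation": 0},
]
features = add_context_features(lines, ["is_lower", "indentation"])
```

Each base feature gains six context copies (three `_prev_` and three `_next_`), so a line with two base features ends up with 14 values in this toy example.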

Dataset size:
- Number of features: 145
- Number of samples: 8557

Models compared:
- Nearest centroid
- Support vector machine (SVM)
- k-nearest neighbours (KNN)
- Naive Bayes
- Perceptron
- Passive aggressive
- Decision tree
- Multi-layer perceptron
- Ridge regression (ridge)
- Logistic regression
- Random forest
- Gradient boosting (gradient boost, XGBoost, CatBoost)

Each model was evaluated with 10-fold cross-validation. After cross-validation, the dataset was split into a training part and a test part; the models were trained on the training set and their quality was measured on the test set.
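
To make the 10-fold scheme concrete, here is a small sketch of how 8557 samples can be divided into ten folds and how a per-fold metric is averaged. It assumes contiguous, unshuffled folds, which the text does not state; `k_fold_indices` is a hypothetical helper. The recall values are the per-fold numbers from the first results table in this document.

```python
from statistics import mean
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int,
                   n_splits: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(8557, 10))
fold_sizes = [len(test_idx) for _, test_idx in folds]

# Per-fold recall from the first results table; the reported mean is 0.8036.
fold_recalls = [0.8316, 0.7318, 0.8252, 0.8726, 0.7960,
                0.6105, 0.8804, 0.8557, 0.8215, 0.8105]
mean_recall = round(mean(fold_recalls), 4)
```

With 8557 samples, seven folds hold 856 samples and three hold 855, and the unweighted mean of the per-fold recalls reproduces the value in the "mean" column.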

Some models perform better on normalized input, while for others (gradient boosting) raw data is sufficient, and sometimes even required. All experiments were therefore run in three normalization modes:
- no normalization (data as is)
- min-max normalization
- standardization
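
The two non-trivial modes can be sketched for a single feature column as follows. This is an illustration under assumptions: the function names are hypothetical, and fitting the scaling parameters on the training part only and reusing them for the test part is a common practice that the text does not confirm for this project.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def min_max_scale(train: List[float],
                  test: List[float]) -> Tuple[List[float], List[float]]:
    """Map the training range to [0, 1]; test values reuse the training range."""
    low, high = min(train), max(train)
    span = (high - low) or 1.0  # constant column: avoid division by zero
    return ([(v - low) / span for v in train],
            [(v - low) / span for v in test])


def standardize(train: List[float],
                test: List[float]) -> Tuple[List[float], List[float]]:
    """Shift to zero mean and unit variance using training statistics."""
    mu = mean(train)
    sigma = pstdev(train) or 1.0  # constant column: avoid division by zero
    return ([(v - mu) / sigma for v in train],
            [(v - mu) / sigma for v in test])


column_train = [2.0, 4.0, 6.0]   # made-up feature values
column_test = [8.0]
mm_train, mm_test = min_max_scale(column_train, column_test)
st_train, st_test = standardize(column_train, column_test)
```

Test values may fall outside [0, 1] after min-max scaling (here 8.0 maps to 1.5) because the range is taken from the training part.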

Logistic regression, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8036 | 0.8316 | 0.7318 | 0.8252 | 0.8726 | 0.7960 | 0.6105 | 0.8804 | 0.8557 | 0.8215 | 0.8105 |
Precision | 0.8065 | 0.6848 | 0.8656 | 0.8980 | 0.8825 | 0.8094 | 0.6929 | 0.8031 | 0.7417 | 0.8435 | 0.8433 |
f1 | 0.7847 | 0.7072 | 0.7507 | 0.8286 | 0.8742 | 0.7987 | 0.6324 | 0.8294 | 0.7783 | 0.8261 | 0.8211 |
Accuracy | 0.8818 | 0.7967 | 0.8575 | 0.9299 | 0.9077 | 0.8902 | 0.9848 | 0.8737 | 0.8632 | 0.8596 | 0.8550 |

Nearest centroid, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.2617 | 0.4036 | 0.1838 | 0.2611 | 0.2483 | 0.1107 | 0.0000 | 0.3377 | 0.4108 | 0.3679 | 0.2927 |
Precision | 0.3527 | 0.4168 | 0.3112 | 0.5353 | 0.5000 | 0.1577 | 0.0000 | 0.4024 | 0.4085 | 0.3857 | 0.4089 |
f1 | 0.2294 | 0.4014 | 0.1314 | 0.2767 | 0.2543 | 0.1290 | 0.0000 | 0.2605 | 0.3027 | 0.2533 | 0.2845 |
Accuracy | 0.2788 | 0.6893 | 0.1589 | 0.2336 | 0.2921 | 0.2079 | 0.0000 | 0.2772 | 0.3041 | 0.3111 | 0.3135 |

KNN, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.5024 | 0.5907 | 0.4079 | 0.5604 | 0.5361 | 0.3747 | 0.4846 | 0.5238 | 0.4610 | 0.5233 | 0.5613 |
Precision | 0.4545 | 0.3652 | 0.3720 | 0.4780 | 0.4807 | 0.3395 | 0.4923 | 0.5014 | 0.4358 | 0.5316 | 0.5487 |
f1 | 0.4533 | 0.4118 | 0.3852 | 0.4931 | 0.4940 | 0.3442 | 0.4880 | 0.4878 | 0.4097 | 0.4812 | 0.5377 |
Accuracy | 0.5801 | 0.5596 | 0.5724 | 0.4650 | 0.5082 | 0.7407 | 0.9311 | 0.4947 | 0.4807 | 0.4901 | 0.5591 |

Naive Bayes, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.5695 | 0.7818 | 0.5838 | 0.6286 | 0.6781 | 0.3830 | 0.2296 | 0.6465 | 0.5510 | 0.6772 | 0.5349 |
Precision | 0.5977 | 0.7686 | 0.4877 | 0.7708 | 0.7439 | 0.4022 | 0.1581 | 0.6810 | 0.6088 | 0.7222 | 0.6333 |
f1 | 0.5253 | 0.7692 | 0.4987 | 0.5974 | 0.6543 | 0.3478 | 0.0991 | 0.5810 | 0.5132 | 0.6619 | 0.5304 |
Accuracy | 0.5894 | 0.8026 | 0.6600 | 0.6297 | 0.6834 | 0.5970 | 0.1542 | 0.6211 | 0.5485 | 0.6468 | 0.5509 |

Passive aggressive, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.6284 | 0.5617 | 0.4733 | 0.6209 | 0.7240 | 0.3647 | 0.7494 | 0.7400 | 0.6289 | 0.6375 | 0.7840 |
Precision | 0.5668 | 0.4208 | 0.5971 | 0.5967 | 0.6246 | 0.3216 | 0.4383 | 0.7245 | 0.6455 | 0.6375 | 0.6616 |
f1 | 0.5423 | 0.4525 | 0.4910 | 0.4855 | 0.6214 | 0.2101 | 0.5227 | 0.7208 | 0.6152 | 0.6210 | 0.6831 |
Accuracy | 0.6855 | 0.6811 | 0.7150 | 0.5339 | 0.7196 | 0.3902 | 0.9790 | 0.7532 | 0.6409 | 0.7357 | 0.7064 |

SVM, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.4644 | 0.2718 | 0.4593 | 0.2158 | 0.5123 | 0.7829 | 0.2418 | 0.4918 | 0.5742 | 0.4420 | 0.6520 |
Precision | 0.3018 | 0.2505 | 0.3438 | 0.1595 | 0.3137 | 0.5611 | 0.2491 | 0.2477 | 0.3474 | 0.2246 | 0.3205 |
f1 | 0.2953 | 0.1929 | 0.3685 | 0.1556 | 0.3180 | 0.5654 | 0.2454 | 0.2325 | 0.3416 | 0.2029 | 0.3305 |
Accuracy | 0.5012 | 0.4428 | 0.6098 | 0.3107 | 0.4077 | 0.7465 | 0.9638 | 0.3778 | 0.4398 | 0.3029 | 0.4105 |

Ridge, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8586 | 0.8843 | 0.8494 | 0.8429 | 0.8500 | 0.9152 | 0.8520 | 0.8733 | 0.8378 | 0.8316 | 0.8498 |
Precision | 0.7492 | 0.6342 | 0.6987 | 0.9263 | 0.8017 | 0.6988 | 0.7500 | 0.7550 | 0.6750 | 0.7789 | 0.7731 |
f1 | 0.7726 | 0.7096 | 0.7541 | 0.8725 | 0.8162 | 0.7318 | 0.7409 | 0.7795 | 0.7213 | 0.7989 | 0.8010 |
Accuracy | 0.8714 | 0.8259 | 0.8493 | 0.9042 | 0.8902 | 0.8902 | 0.9778 | 0.8374 | 0.8444 | 0.8468 | 0.8480 |

Perceptron, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.5761 | 0.5348 | 0.3267 | 0.4637 | 0.4907 | 0.6961 | 0.6850 | 0.5773 | 0.7079 | 0.6698 | 0.6086 |
Precision | 0.5209 | 0.3692 | 0.3286 | 0.3392 | 0.5409 | 0.6358 | 0.8062 | 0.6128 | 0.4183 | 0.5704 | 0.5879 |
f1 | 0.4833 | 0.3636 | 0.2369 | 0.3622 | 0.5031 | 0.6571 | 0.6114 | 0.5826 | 0.4129 | 0.5187 | 0.5843 |
Accuracy | 0.6314 | 0.6986 | 0.4136 | 0.5035 | 0.6600 | 0.7255 | 0.8423 | 0.6117 | 0.5696 | 0.6304 | 0.6585 |

Multi layer perceptron, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8011 | 0.8655 | 0.7608 | 0.8695 | 0.8142 | 0.7690 | 0.6994 | 0.8812 | 0.7854 | 0.7884 | 0.7777 |
Precision | 0.7848 | 0.6785 | 0.8582 | 0.8271 | 0.8309 | 0.7418 | 0.6953 | 0.8481 | 0.7304 | 0.8669 | 0.7708 |
f1 | 0.7782 | 0.7385 | 0.7754 | 0.8252 | 0.8191 | 0.7537 | 0.6960 | 0.8593 | 0.7442 | 0.8113 | 0.7591 |
Accuracy | 0.8642 | 0.8329 | 0.8528 | 0.8563 | 0.8820 | 0.8493 | 0.9942 | 0.8795 | 0.8281 | 0.8842 | 0.7825 |

Decision tree, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.6854 | 0.8057 | 0.4701 | 0.7559 | 0.7747 | 0.4614 | 0.4834 | 0.8321 | 0.7394 | 0.7724 | 0.7592 |
Precision | 0.7130 | 0.7438 | 0.6307 | 0.8575 | 0.8703 | 0.4381 | 0.5103 | 0.7972 | 0.6595 | 0.8520 | 0.7705 |
f1 | 0.6843 | 0.7505 | 0.4830 | 0.7992 | 0.8132 | 0.4474 | 0.4920 | 0.8095 | 0.6842 | 0.8009 | 0.7633 |
Accuracy | 0.8362 | 0.8037 | 0.6811 | 0.8703 | 0.8692 | 0.8843 | 0.9778 | 0.8491 | 0.7778 | 0.8491 | 0.8000 |

Random forest, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8830 | 0.8454 | 0.8585 | 0.9595 | 0.9472 | 0.6935 | 0.9992 | 0.9081 | 0.9225 | 0.8507 | 0.8451 |
Precision | 0.8067 | 0.5745 | 0.8884 | 0.9500 | 0.8840 | 0.5764 | 0.9454 | 0.8371 | 0.7261 | 0.8948 | 0.7898 |
f1 | 0.8250 | 0.6191 | 0.8664 | 0.9544 | 0.9094 | 0.6093 | 0.9710 | 0.8646 | 0.7792 | 0.8639 | 0.8123 |
Accuracy | 0.9018 | 0.8002 | 0.8820 | 0.9650 | 0.9322 | 0.9100 | 0.9977 | 0.8959 | 0.8830 | 0.8982 | 0.8538 |

Gradient boost, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8329 | 0.8966 | 0.5546 | 0.9365 | 0.9078 | 0.7017 | 0.7494 | 0.9247 | 0.8797 | 0.8897 | 0.8885 |
Precision | 0.7921 | 0.8046 | 0.6795 | 0.9146 | 0.9085 | 0.5733 | 0.7079 | 0.8488 | 0.7453 | 0.9142 | 0.8243 |
f1 | 0.7967 | 0.8378 | 0.5498 | 0.9227 | 0.9058 | 0.6071 | 0.7276 | 0.8787 | 0.7891 | 0.8981 | 0.8504 |
Accuracy | 0.9051 | 0.8680 | 0.8201 | 0.9544 | 0.9276 | 0.9112 | 0.9930 | 0.9099 | 0.8655 | 0.9170 | 0.8842 |

XGBoost, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8510 | 0.8860 | 0.5871 | 0.8953 | 0.9227 | 0.6929 | 0.9622 | 0.9202 | 0.9033 | 0.8866 | 0.8535 |
Precision | 0.8242 | 0.7796 | 0.7159 | 0.9084 | 0.9425 | 0.5772 | 0.9450 | 0.8560 | 0.7561 | 0.9182 | 0.8428 |
f1 | 0.8248 | 0.8233 | 0.6010 | 0.9003 | 0.9313 | 0.6096 | 0.9534 | 0.8800 | 0.8033 | 0.8990 | 0.8469 |
Accuracy | 0.9119 | 0.8692 | 0.8657 | 0.9439 | 0.9276 | 0.9112 | 0.9965 | 0.9123 | 0.8678 | 0.9427 | 0.8819 |

CatBoost, no preprocessing:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.9004 | 0.9225 | 0.7452 | 0.9247 | 0.9302 | 0.9496 | 0.9622 | 0.9249 | 0.9245 | 0.8747 | 0.8457 |
Precision | 0.8630 | 0.7939 | 0.8815 | 0.9147 | 0.9369 | 0.7661 | 0.9450 | 0.8560 | 0.7866 | 0.9354 | 0.8141 |
f1 | 0.8680 | 0.8456 | 0.7778 | 0.9182 | 0.9327 | 0.8127 | 0.9534 | 0.8814 | 0.8329 | 0.8970 | 0.8282 |
Accuracy | 0.9139 | 0.8855 | 0.8633 | 0.9498 | 0.9334 | 0.9136 | 0.9965 | 0.9088 | 0.8901 | 0.9298 | 0.8678 |

Logistic regression, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8099 | 0.9174 | 0.6571 | 0.8089 | 0.8837 | 0.7940 | 0.6164 | 0.8994 | 0.8464 | 0.8525 | 0.8228 |
Precision | 0.7600 | 0.6576 | 0.6064 | 0.9146 | 0.8699 | 0.7296 | 0.6026 | 0.8137 | 0.7518 | 0.8873 | 0.7664 |
f1 | 0.7647 | 0.7148 | 0.6249 | 0.8357 | 0.8741 | 0.7500 | 0.5751 | 0.8408 | 0.7787 | 0.8646 | 0.7884 |
Accuracy | 0.8883 | 0.8668 | 0.8341 | 0.9194 | 0.9054 | 0.8914 | 0.9836 | 0.8854 | 0.8643 | 0.8936 | 0.8386 |

Nearest centroid, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.5090 | 0.4742 | 0.4796 | 0.4970 | 0.5212 | 0.4033 | 0.4620 | 0.5890 | 0.4803 | 0.5603 | 0.6234 |
Precision | 0.5279 | 0.3193 | 0.5433 | 0.5910 | 0.7007 | 0.3591 | 0.2272 | 0.6610 | 0.5508 | 0.6564 | 0.6704 |
f1 | 0.4700 | 0.2829 | 0.4689 | 0.4980 | 0.5343 | 0.3751 | 0.2860 | 0.5988 | 0.4771 | 0.5601 | 0.6186 |
Accuracy | 0.6571 | 0.6157 | 0.6928 | 0.6484 | 0.6414 | 0.7523 | 0.7371 | 0.6515 | 0.5544 | 0.6480 | 0.6292 |

KNN, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.7256 | 0.5945 | 0.6370 | 0.8521 | 0.8063 | 0.8035 | 0.4616 | 0.8288 | 0.7667 | 0.7760 | 0.7293 |
Precision | 0.6351 | 0.4627 | 0.4996 | 0.8481 | 0.8122 | 0.6948 | 0.4830 | 0.6627 | 0.5757 | 0.6786 | 0.6340 |
f1 | 0.6514 | 0.4927 | 0.5449 | 0.8455 | 0.8059 | 0.7210 | 0.4356 | 0.6952 | 0.6106 | 0.7130 | 0.6498 |
Accuracy | 0.8257 | 0.7500 | 0.8072 | 0.8680 | 0.8879 | 0.8703 | 0.9685 | 0.7883 | 0.7743 | 0.7813 | 0.7614 |

Naive Bayes, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.4720 | 0.6667 | 0.4906 | 0.4149 | 0.5855 | 0.3859 | 0.2096 | 0.3938 | 0.4904 | 0.6349 | 0.4477 |
Precision | 0.5140 | 0.6307 | 0.3945 | 0.6914 | 0.6787 | 0.3489 | 0.0706 | 0.6145 | 0.5544 | 0.5720 | 0.5847 |
f1 | 0.3909 | 0.5021 | 0.3221 | 0.4546 | 0.5231 | 0.2442 | 0.0265 | 0.4585 | 0.4324 | 0.4725 | 0.4728 |
Accuracy | 0.4135 | 0.3189 | 0.3271 | 0.4848 | 0.5654 | 0.4591 | 0.0292 | 0.5357 | 0.4655 | 0.4491 | 0.5006 |

Passive aggressive, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8091 | 0.8197 | 0.8519 | 0.7782 | 0.8454 | 0.8847 | 0.6050 | 0.8966 | 0.7559 | 0.8584 | 0.7953 |
Precision | 0.7897 | 0.6448 | 0.8607 | 0.9234 | 0.8853 | 0.8024 | 0.6893 | 0.7854 | 0.6653 | 0.8795 | 0.7608 |
f1 | 0.7791 | 0.6787 | 0.8486 | 0.8138 | 0.8615 | 0.8330 | 0.6255 | 0.8148 | 0.6798 | 0.8605 | 0.7746 |
Accuracy | 0.8565 | 0.7780 | 0.8586 | 0.8843 | 0.8703 | 0.9077 | 0.9708 | 0.8702 | 0.7579 | 0.8725 | 0.7942 |

SVM, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8705 | 0.9303 | 0.8548 | 0.9030 | 0.8705 | 0.8820 | 0.7762 | 0.9010 | 0.8709 | 0.8779 | 0.8382 |
Precision | 0.7949 | 0.7258 | 0.7363 | 0.9188 | 0.8728 | 0.7284 | 0.7325 | 0.8173 | 0.7498 | 0.8927 | 0.7749 |
f1 | 0.8085 | 0.7986 | 0.7833 | 0.9099 | 0.8689 | 0.7629 | 0.6529 | 0.8454 | 0.7821 | 0.8828 | 0.7979 |
Accuracy | 0.8968 | 0.8727 | 0.8598 | 0.9369 | 0.9065 | 0.9030 | 0.9766 | 0.8854 | 0.8713 | 0.9088 | 0.8468 |

Ridge, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8637 | 0.9205 | 0.8898 | 0.8567 | 0.8409 | 0.9203 | 0.8531 | 0.8705 | 0.8290 | 0.8291 | 0.8274 |
Precision | 0.7520 | 0.6339 | 0.6904 | 0.9309 | 0.8180 | 0.7123 | 0.8027 | 0.7462 | 0.6657 | 0.7757 | 0.7442 |
f1 | 0.7770 | 0.7162 | 0.7577 | 0.8840 | 0.8217 | 0.7491 | 0.7919 | 0.7695 | 0.7122 | 0.7959 | 0.7721 |
Accuracy | 0.8698 | 0.8271 | 0.8446 | 0.9054 | 0.8925 | 0.8949 | 0.9813 | 0.8316 | 0.8409 | 0.8444 | 0.8351 |

Perceptron, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.7992 | 0.8482 | 0.8328 | 0.7819 | 0.8511 | 0.8428 | 0.4925 | 0.8924 | 0.7477 | 0.8585 | 0.8438 |
Precision | 0.7845 | 0.6452 | 0.8107 | 0.9321 | 0.9057 | 0.7056 | 0.6254 | 0.7729 | 0.7567 | 0.8796 | 0.8110 |
f1 | 0.7613 | 0.6657 | 0.8158 | 0.8128 | 0.8698 | 0.7338 | 0.4975 | 0.8032 | 0.7398 | 0.8539 | 0.8205 |
Accuracy | 0.8676 | 0.8271 | 0.8551 | 0.8972 | 0.8820 | 0.8925 | 0.9159 | 0.8620 | 0.8222 | 0.8678 | 0.8538 |

Multi layer perceptron, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.7953 | 0.8929 | 0.8630 | 0.8555 | 0.8437 | 0.6725 | 0.4814 | 0.8829 | 0.8101 | 0.8284 | 0.8228 |
Precision | 0.7860 | 0.7833 | 0.8771 | 0.8744 | 0.9108 | 0.5663 | 0.5213 | 0.8612 | 0.7571 | 0.8973 | 0.8116 |
f1 | 0.7796 | 0.8162 | 0.8658 | 0.8517 | 0.8740 | 0.5956 | 0.4913 | 0.8701 | 0.7638 | 0.8526 | 0.8152 |
Accuracy | 0.8913 | 0.8762 | 0.8692 | 0.9182 | 0.9159 | 0.9054 | 0.9743 | 0.8854 | 0.8316 | 0.8947 | 0.8421 |

Decision tree, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.6858 | 0.8057 | 0.4696 | 0.7577 | 0.7803 | 0.4627 | 0.4834 | 0.8321 | 0.7394 | 0.7678 | 0.7592 |
Precision | 0.7124 | 0.7438 | 0.6296 | 0.8583 | 0.8703 | 0.4390 | 0.5103 | 0.7972 | 0.6595 | 0.8450 | 0.7705 |
f1 | 0.6843 | 0.7505 | 0.4819 | 0.8007 | 0.8173 | 0.4484 | 0.4920 | 0.8095 | 0.6842 | 0.7957 | 0.7633 |
Accuracy | 0.8361 | 0.8037 | 0.6787 | 0.8727 | 0.8692 | 0.8867 | 0.9778 | 0.8491 | 0.7778 | 0.8456 | 0.8000 |

Random forest, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8808 | 0.8450 | 0.8585 | 0.9413 | 0.9472 | 0.6923 | 0.9992 | 0.9081 | 0.9225 | 0.8507 | 0.8430 |
Precision | 0.8043 | 0.5746 | 0.8884 | 0.9367 | 0.8840 | 0.5721 | 0.9454 | 0.8371 | 0.7261 | 0.8934 | 0.7851 |
f1 | 0.8224 | 0.6191 | 0.8664 | 0.9380 | 0.9094 | 0.6047 | 0.9710 | 0.8646 | 0.7792 | 0.8636 | 0.8083 |
Accuracy | 0.9015 | 0.8002 | 0.8820 | 0.9638 | 0.9322 | 0.9089 | 0.9977 | 0.8959 | 0.8830 | 0.8982 | 0.8526 |

Gradient boost, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8327 | 0.8966 | 0.5546 | 0.9365 | 0.9078 | 0.7017 | 0.7494 | 0.9247 | 0.8797 | 0.8871 | 0.8885 |
Precision | 0.7917 | 0.8046 | 0.6795 | 0.9146 | 0.9085 | 0.5733 | 0.7079 | 0.8488 | 0.7453 | 0.9105 | 0.8243 |
f1 | 0.7964 | 0.8378 | 0.5498 | 0.9227 | 0.9058 | 0.6071 | 0.7276 | 0.8787 | 0.7891 | 0.8949 | 0.8504 |
Accuracy | 0.9046 | 0.8680 | 0.8201 | 0.9544 | 0.9276 | 0.9112 | 0.9930 | 0.9099 | 0.8655 | 0.9123 | 0.8842 |

XGBoost, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8511 | 0.8877 | 0.5871 | 0.8953 | 0.9227 | 0.6929 | 0.9622 | 0.9202 | 0.9033 | 0.8866 | 0.8535 |
Precision | 0.8243 | 0.7813 | 0.7159 | 0.9084 | 0.9425 | 0.5772 | 0.9450 | 0.8560 | 0.7561 | 0.9182 | 0.8428 |
f1 | 0.8250 | 0.8253 | 0.6010 | 0.9003 | 0.9313 | 0.6096 | 0.9534 | 0.8800 | 0.8033 | 0.8990 | 0.8469 |
Accuracy | 0.9120 | 0.8703 | 0.8657 | 0.9439 | 0.9276 | 0.9112 | 0.9965 | 0.9123 | 0.8678 | 0.9427 | 0.8819 |

CatBoost, min-max normalization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.9004 | 0.9225 | 0.7452 | 0.9247 | 0.9302 | 0.9496 | 0.9622 | 0.9249 | 0.9245 | 0.8747 | 0.8457 |
Precision | 0.8630 | 0.7939 | 0.8815 | 0.9147 | 0.9369 | 0.7661 | 0.9450 | 0.8560 | 0.7866 | 0.9354 | 0.8141 |
f1 | 0.8680 | 0.8456 | 0.7778 | 0.9182 | 0.9327 | 0.8127 | 0.9534 | 0.8814 | 0.8329 | 0.8970 | 0.8282 |
Accuracy | 0.9139 | 0.8855 | 0.8633 | 0.9498 | 0.9334 | 0.9136 | 0.9965 | 0.9088 | 0.8901 | 0.9298 | 0.8678 |

Logistic regression, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8047 | 0.8865 | 0.6886 | 0.8127 | 0.8674 | 0.8448 | 0.5744 | 0.9187 | 0.8298 | 0.8230 | 0.8007 |
Precision | 0.7880 | 0.6732 | 0.6889 | 0.9396 | 0.9094 | 0.7525 | 0.6274 | 0.8654 | 0.7544 | 0.9069 | 0.7626 |
f1 | 0.7810 | 0.7355 | 0.6845 | 0.8461 | 0.8869 | 0.7815 | 0.5848 | 0.8859 | 0.7759 | 0.8500 | 0.7787 |
Accuracy | 0.8930 | 0.8563 | 0.8621 | 0.9241 | 0.9206 | 0.9019 | 0.9801 | 0.9041 | 0.8538 | 0.9099 | 0.8175 |

Nearest centroid, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.5993 | 0.7743 | 0.4764 | 0.6343 | 0.6694 | 0.6142 | 0.3243 | 0.6858 | 0.5246 | 0.6146 | 0.6748 |
Precision | 0.6368 | 0.5265 | 0.5397 | 0.7988 | 0.8078 | 0.6410 | 0.3333 | 0.7056 | 0.5889 | 0.7333 | 0.6925 |
f1 | 0.5783 | 0.4959 | 0.4745 | 0.6388 | 0.7131 | 0.6229 | 0.3288 | 0.6881 | 0.5373 | 0.6171 | 0.6663 |
Accuracy | 0.7225 | 0.7336 | 0.6916 | 0.6846 | 0.7465 | 0.7336 | 0.9673 | 0.7041 | 0.6105 | 0.6819 | 0.6713 |

KNN, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.7444 | 0.7899 | 0.6139 | 0.6997 | 0.7902 | 0.8362 | 0.7170 | 0.8014 | 0.6891 | 0.7733 | 0.7337 |
Precision | 0.6866 | 0.6812 | 0.5275 | 0.8462 | 0.7877 | 0.6892 | 0.6584 | 0.6848 | 0.5623 | 0.7537 | 0.6751 |
f1 | 0.6924 | 0.7152 | 0.5613 | 0.7179 | 0.7768 | 0.7187 | 0.6629 | 0.7241 | 0.5944 | 0.7606 | 0.6921 |
Accuracy | 0.8113 | 0.7967 | 0.7921 | 0.8236 | 0.8411 | 0.8692 | 0.9743 | 0.7520 | 0.7216 | 0.7988 | 0.7439 |

Naive Bayes, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.4667 | 0.6566 | 0.4712 | 0.4133 | 0.5837 | 0.3711 | 0.2094 | 0.3866 | 0.4824 | 0.6464 | 0.4461 |
Precision | 0.5069 | 0.6213 | 0.3833 | 0.6864 | 0.6816 | 0.3423 | 0.0703 | 0.6008 | 0.5448 | 0.5561 | 0.5819 |
f1 | 0.3863 | 0.4956 | 0.3211 | 0.4515 | 0.5215 | 0.2334 | 0.0258 | 0.4455 | 0.4266 | 0.4690 | 0.4733 |
Accuracy | 0.4075 | 0.3131 | 0.3259 | 0.4825 | 0.5666 | 0.4428 | 0.0280 | 0.5228 | 0.4596 | 0.4374 | 0.4959 |

Passive aggressive, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.7635 | 0.8064 | 0.6500 | 0.8176 | 0.8404 | 0.8115 | 0.5243 | 0.8209 | 0.7827 | 0.8013 | 0.7798 |
Precision | 0.7390 | 0.6711 | 0.6693 | 0.9260 | 0.8913 | 0.7189 | 0.4811 | 0.7614 | 0.6653 | 0.8486 | 0.7566 |
f1 | 0.7405 | 0.7160 | 0.6542 | 0.8585 | 0.8639 | 0.7442 | 0.5012 | 0.7815 | 0.7001 | 0.8203 | 0.7649 |
Accuracy | 0.8621 | 0.8236 | 0.8470 | 0.9206 | 0.9054 | 0.8902 | 0.9790 | 0.8175 | 0.7906 | 0.8386 | 0.8082 |

SVM, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8510 | 0.9188 | 0.8862 | 0.8687 | 0.8682 | 0.9142 | 0.6167 | 0.8801 | 0.8815 | 0.8497 | 0.8261 |
Precision | 0.7718 | 0.7572 | 0.8786 | 0.9036 | 0.8391 | 0.7360 | 0.6283 | 0.7308 | 0.6849 | 0.8779 | 0.6816 |
f1 | 0.7889 | 0.8203 | 0.8777 | 0.8752 | 0.8471 | 0.7752 | 0.5960 | 0.7779 | 0.7393 | 0.8584 | 0.7213 |
Accuracy | 0.8826 | 0.8703 | 0.8808 | 0.9416 | 0.8855 | 0.9065 | 0.9836 | 0.8456 | 0.8433 | 0.8807 | 0.7883 |

Ridge, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8569 | 0.8843 | 0.8494 | 0.8429 | 0.8500 | 0.9152 | 0.8523 | 0.8728 | 0.8341 | 0.8316 | 0.8366 |
Precision | 0.7486 | 0.6342 | 0.6987 | 0.9263 | 0.8017 | 0.6988 | 0.7676 | 0.7513 | 0.6712 | 0.7789 | 0.7574 |
f1 | 0.7721 | 0.7096 | 0.7541 | 0.8725 | 0.8162 | 0.7318 | 0.7591 | 0.7759 | 0.7178 | 0.7989 | 0.7852 |
Accuracy | 0.8707 | 0.8259 | 0.8493 | 0.9042 | 0.8902 | 0.8902 | 0.9790 | 0.8363 | 0.8433 | 0.8468 | 0.8421 |

Perceptron, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.7678 | 0.7558 | 0.7060 | 0.8057 | 0.8116 | 0.8619 | 0.4605 | 0.9034 | 0.7361 | 0.8487 | 0.7878 |
Precision | 0.7718 | 0.7170 | 0.6565 | 0.9298 | 0.8491 | 0.7434 | 0.6282 | 0.8424 | 0.7226 | 0.9134 | 0.7159 |
f1 | 0.7539 | 0.7165 | 0.6769 | 0.8358 | 0.8292 | 0.7762 | 0.5121 | 0.8634 | 0.7205 | 0.8731 | 0.7350 |
Accuracy | 0.8696 | 0.7956 | 0.8657 | 0.9159 | 0.8902 | 0.9030 | 0.9404 | 0.8947 | 0.8082 | 0.9076 | 0.7743 |

Multi layer perceptron, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8011 | 0.8458 | 0.5958 | 0.8061 | 0.8816 | 0.8822 | 0.6006 | 0.8855 | 0.8302 | 0.8769 | 0.8064 |
Precision | 0.8003 | 0.7680 | 0.7010 | 0.8904 | 0.8798 | 0.7647 | 0.6645 | 0.8606 | 0.7517 | 0.9264 | 0.7959 |
f1 | 0.7881 | 0.7989 | 0.6233 | 0.8210 | 0.8797 | 0.7999 | 0.6237 | 0.8685 | 0.7672 | 0.8981 | 0.8006 |
Accuracy | 0.8909 | 0.8797 | 0.8470 | 0.9054 | 0.9171 | 0.9112 | 0.9743 | 0.8819 | 0.8480 | 0.9181 | 0.8269 |

Decision tree, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.6853 | 0.8057 | 0.4683 | 0.7559 | 0.7783 | 0.4627 | 0.4834 | 0.8313 | 0.7384 | 0.7693 | 0.7592 |
Precision | 0.7127 | 0.7438 | 0.6296 | 0.8575 | 0.8731 | 0.4390 | 0.5103 | 0.7972 | 0.6589 | 0.8473 | 0.7705 |
f1 | 0.6840 | 0.7505 | 0.4803 | 0.7992 | 0.8163 | 0.4484 | 0.4920 | 0.8089 | 0.6833 | 0.7975 | 0.7633 |
Accuracy | 0.8360 | 0.8037 | 0.6787 | 0.8703 | 0.8703 | 0.8867 | 0.9778 | 0.8491 | 0.7766 | 0.8468 | 0.8000 |

Random forest, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8812 | 0.8462 | 0.8585 | 0.9422 | 0.9472 | 0.6923 | 0.9992 | 0.9081 | 0.9225 | 0.8507 | 0.8451 |
Precision | 0.8052 | 0.5762 | 0.8884 | 0.9392 | 0.8840 | 0.5721 | 0.9454 | 0.8371 | 0.7261 | 0.8934 | 0.7898 |
f1 | 0.8232 | 0.6212 | 0.8664 | 0.9397 | 0.9094 | 0.6047 | 0.9710 | 0.8646 | 0.7792 | 0.8636 | 0.8123 |
Accuracy | 0.9018 | 0.8014 | 0.8820 | 0.9650 | 0.9322 | 0.9089 | 0.9977 | 0.8959 | 0.8830 | 0.8982 | 0.8538 |

Gradient boost, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8329 | 0.8966 | 0.5546 | 0.9365 | 0.9078 | 0.7017 | 0.7494 | 0.9247 | 0.8797 | 0.8897 | 0.8885 |
Precision | 0.7921 | 0.8046 | 0.6795 | 0.9146 | 0.9085 | 0.5733 | 0.7079 | 0.8488 | 0.7453 | 0.9142 | 0.8243 |
f1 | 0.7967 | 0.8378 | 0.5498 | 0.9227 | 0.9058 | 0.6071 | 0.7276 | 0.8787 | 0.7891 | 0.8981 | 0.8504 |
Accuracy | 0.9051 | 0.8680 | 0.8201 | 0.9544 | 0.9276 | 0.9112 | 0.9930 | 0.9099 | 0.8655 | 0.9170 | 0.8842 |

XGBoost, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.8511 | 0.8877 | 0.5871 | 0.8953 | 0.9227 | 0.6929 | 0.9622 | 0.9202 | 0.9033 | 0.8866 | 0.8535 |
Precision | 0.8243 | 0.7813 | 0.7159 | 0.9084 | 0.9425 | 0.5772 | 0.9450 | 0.8560 | 0.7561 | 0.9182 | 0.8428 |
f1 | 0.8250 | 0.8253 | 0.6010 | 0.9003 | 0.9313 | 0.6096 | 0.9534 | 0.8800 | 0.8033 | 0.8990 | 0.8469 |
Accuracy | 0.9120 | 0.8703 | 0.8657 | 0.9439 | 0.9276 | 0.9112 | 0.9965 | 0.9123 | 0.8678 | 0.9427 | 0.8819 |

CatBoost, standardization:

Fold | Mean | 1 / 10 | 2 / 10 | 3 / 10 | 4 / 10 | 5 / 10 | 6 / 10 | 7 / 10 | 8 / 10 | 9 / 10 | 10 / 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Recall | 0.9004 | 0.9225 | 0.7452 | 0.9247 | 0.9302 | 0.9496 | 0.9622 | 0.9249 | 0.9245 | 0.8747 | 0.8457 |
Precision | 0.8630 | 0.7939 | 0.8815 | 0.9147 | 0.9369 | 0.7661 | 0.9450 | 0.8560 | 0.7866 | 0.9354 | 0.8141 |
f1 | 0.8680 | 0.8456 | 0.7778 | 0.9182 | 0.9327 | 0.8127 | 0.9534 | 0.8814 | 0.8329 | 0.8970 | 0.8282 |
Accuracy | 0.9139 | 0.8855 | 0.8633 | 0.9498 | 0.9334 | 0.9136 | 0.9965 | 0.9088 | 0.8901 | 0.9298 | 0.8678 |

Mean cross-validation metrics for each model and preprocessing mode (none = no preprocessing, min-max = min-max normalization, std = standardization):

Model | Recall (none) | Recall (min-max) | Recall (std) | Precision (none) | Precision (min-max) | Precision (std) | macro f1 (none) | macro f1 (min-max) | macro f1 (std) | Accuracy (none) | Accuracy (min-max) | Accuracy (std) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Logistic regression | 0.8036 | 0.8099 | 0.8047 | 0.8065 | 0.76 | 0.788 | 0.7847 | 0.7647 | 0.781 | 0.8818 | 0.8883 | 0.893 |
Nearest centroid | 0.2617 | 0.509 | 0.5993 | 0.3527 | 0.5279 | 0.6368 | 0.2294 | 0.47 | 0.5783 | 0.2788 | 0.6571 | 0.7225 |
KNN | 0.5024 | 0.7256 | 0.7444 | 0.4545 | 0.6351 | 0.6866 | 0.4533 | 0.6514 | 0.6924 | 0.5801 | 0.8257 | 0.8113 |
Naive Bayes | 0.5695 | 0.472 | 0.4667 | 0.5977 | 0.514 | 0.5069 | 0.5253 | 0.3909 | 0.3863 | 0.5894 | 0.4135 | 0.4075 |
Passive aggressive | 0.6284 | 0.8091 | 0.7635 | 0.5668 | 0.7897 | 0.739 | 0.5423 | 0.7791 | 0.7405 | 0.6855 | 0.8565 | 0.8621 |
SVM | 0.4644 | 0.8705 | 0.851 | 0.3018 | 0.7949 | 0.7718 | 0.2953 | 0.8085 | 0.7889 | 0.5012 | 0.8968 | 0.8826 |
Ridge | 0.8586 | 0.8637 | 0.8569 | 0.7492 | 0.752 | 0.7486 | 0.7726 | 0.777 | 0.7721 | 0.8714 | 0.8698 | 0.8707 |
Perceptron | 0.5761 | 0.7992 | 0.7678 | 0.5209 | 0.7845 | 0.7718 | 0.4833 | 0.7613 | 0.7539 | 0.6314 | 0.8676 | 0.8696 |
Multi layer perceptron | 0.8011 | 0.7953 | 0.8011 | 0.7848 | 0.786 | 0.8003 | 0.7782 | 0.7796 | 0.7881 | 0.8642 | 0.8913 | 0.8909 |
Decision tree | 0.6854 | 0.6858 | 0.6853 | 0.713 | 0.7124 | 0.7127 | 0.6843 | 0.6843 | 0.684 | 0.8362 | 0.8361 | 0.836 |
Random forest | 0.883 | 0.8808 | 0.8812 | 0.8067 | 0.8043 | 0.8052 | 0.825 | 0.8224 | 0.8232 | 0.9018 | 0.9015 | 0.9018 |
Gradient boost | 0.8329 | 0.8327 | 0.8329 | 0.7921 | 0.7917 | 0.7921 | 0.7967 | 0.7964 | 0.7967 | 0.9051 | 0.9046 | 0.9051 |
XGBoost | 0.851 | 0.8511 | 0.8511 | 0.8242 | 0.8243 | 0.8243 | 0.8248 | 0.825 | 0.825 | 0.9119 | 0.912 | 0.912 |
CatBoost | 0.9004 | 0.9004 | 0.9004 | 0.863 | 0.863 | 0.863 | 0.868 | 0.868 | 0.868 | 0.9139 | 0.9139 | 0.9139 |
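
The macro f1 column is an average of per-class F1 scores, which is why it can be lower than both the precision and the recall in the same row. The sketch below illustrates this on made-up labels. It assumes that recall and precision are macro-averaged in the same way, which the text does not state; `macro_scores` is a hypothetical helper, and classes with no predictions are scored as 0.

```python
from typing import List, Tuple


def macro_scores(y_true: List[str], y_pred: List[str]) -> Tuple[float, float, float]:
    """Return (macro precision, macro recall, macro F1) over all seen labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1_scores) / n


# Made-up predictions over three of the document's classes.
y_true = ["item", "item", "raw_text", "raw_text", "title"]
y_pred = ["item", "raw_text", "raw_text", "raw_text", "raw_text"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

In this toy case macro precision and macro recall are both 0.5, while the macro F1 is lower, because the rare `title` class contributes an F1 of zero.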

The XGBoost parameter search used the following grid:
- learning rate (`learning_rate`): 0.04, 0.08, 0.1, 0.25, 0.5
- maximum depth (`max_depth`): 3, 4, 5
- number of trees (`n_estimators`): 100, 250, 400, 600, 800
- `colsample_bynode`: 0.5, 0.8, 1
- `colsample_bytree`: 0.5, 0.8, 1
- tree construction method (`tree_method`): hist, approx, exact
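
To give a sense of the size of this search, the grid can be expanded into individual configurations with the standard library. This is only a sketch of the enumeration step; `expand_grid` is a hypothetical helper, and how the project actually ran the search is not stated.

```python
from itertools import product
from typing import Dict, List

# Grid values copied from the list above.
xgb_grid = {
    "learning_rate": [0.04, 0.08, 0.1, 0.25, 0.5],
    "max_depth": [3, 4, 5],
    "n_estimators": [100, 250, 400, 600, 800],
    "colsample_bynode": [0.5, 0.8, 1],
    "colsample_bytree": [0.5, 0.8, 1],
    "tree_method": ["hist", "approx", "exact"],
}


def expand_grid(grid: Dict[str, list]) -> List[dict]:
    """Return one dict per combination of grid values (Cartesian product)."""
    keys = list(grid)
    return [dict(zip(keys, values))
            for values in product(*(grid[key] for key in keys))]


xgb_candidates = expand_grid(xgb_grid)
n_candidates = len(xgb_candidates)  # 5 * 3 * 5 * 3 * 3 * 3 combinations
```

The full grid contains 2025 configurations, each of which has to be trained and scored.
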
learning rate | max depth | n estimators | colsample by node | colsample by tree | tree method | Recall | Precision | macro f1 | Accuracy |
---|---|---|---|---|---|---|---|---|---|
0.25 | 5 | 100 | 0.5 | 0.5 | exact | 0.9087 | 0.8685 | 0.8745 | 0.9177 |
0.04 | 4 | 800 | 0.5 | 0.8 | exact | 0.907 | 0.8726 | 0.8746 | 0.9178 |
0.04 | 5 | 600 | 0.5 | 0.5 | hist | 0.9072 | 0.8702 | 0.8747 | 0.9174 |
0.1 | 5 | 400 | 0.5 | 0.5 | exact | 0.9068 | 0.8686 | 0.8753 | 0.9171 |
0.1 | 4 | 400 | 0.5 | 0.5 | exact | 0.9048 | 0.8712 | 0.8754 | 0.9169 |
0.1 | 4 | 600 | 0.5 | 0.5 | exact | 0.9052 | 0.8702 | 0.8754 | 0.9172 |
0.08 | 4 | 250 | 0.5 | 0.5 | exact | 0.9063 | 0.8721 | 0.8754 | 0.9168 |
0.08 | 4 | 800 | 0.5 | 0.8 | exact | 0.9048 | 0.8734 | 0.8754 | 0.917 |
0.1 | 4 | 600 | 0.5 | 0.5 | approx | 0.9038 | 0.8738 | 0.8757 | 0.917 |
0.1 | 4 | 600 | 0.8 | 0.5 | approx | 0.9038 | 0.8738 | 0.8757 | 0.917 |
0.1 | 4 | 600 | 1.0 | 0.5 | approx | 0.9038 | 0.8738 | 0.8757 | 0.917 |
0.08 | 3 | 400 | 0.5 | 0.5 | exact | 0.9068 | 0.8731 | 0.8759 | 0.9176 |
0.1 | 5 | 250 | 0.5 | 0.5 | exact | 0.9085 | 0.8702 | 0.8761 | 0.9181 |
0.08 | 5 | 400 | 0.5 | 0.5 | exact | 0.906 | 0.8702 | 0.8762 | 0.9174 |
0.1 | 4 | 400 | 0.5 | 0.5 | approx | 0.905 | 0.8757 | 0.8762 | 0.9177 |
0.1 | 4 | 400 | 0.8 | 0.5 | approx | 0.905 | 0.8757 | 0.8762 | 0.9177 |
0.1 | 4 | 400 | 1.0 | 0.5 | approx | 0.905 | 0.8757 | 0.8762 | 0.9177 |

The CatBoost parameter search used the following grid:
- learning rate (`learning_rate`): 0.04, 0.08, 0.1, 0.25, 0.5
- maximum depth (`max_depth`): 2, 3, 4, 5, 6
- number of iterations (`iterations`): 100, 250, 500, 800, 1000, 2000
- L2 leaf regularization (`l2_leaf_reg`): 1, 3, 5, 10, 100

learning rate | max depth | iterations | l2 leaf reg | Recall | Precision | macro f1 | Accuracy |
---|---|---|---|---|---|---|---|
0.08 | 6 | 800 | 3 | 0.9018 | 0.8658 | 0.8704 | 0.9167 |
0.25 | 3 | 2000 | 5 | 0.8988 | 0.87 | 0.8708 | 0.9144 |
0.08 | 5 | 2000 | 5 | 0.9056 | 0.8645 | 0.8712 | 0.9137 |
0.08 | 6 | 2000 | 5 | 0.9064 | 0.8635 | 0.8712 | 0.9147 |
0.1 | 5 | 500 | 1 | 0.9015 | 0.8683 | 0.8715 | 0.9165 |
0.08 | 2 | 2000 | 1 | 0.9006 | 0.8676 | 0.8715 | 0.9149 |
0.08 | 3 | 2000 | 1 | 0.9033 | 0.8681 | 0.8717 | 0.9144 |
0.04 | 5 | 2000 | 3 | 0.9053 | 0.8663 | 0.8718 | 0.9147 |
0.25 | 3 | 800 | 3 | 0.9023 | 0.8687 | 0.8719 | 0.9129 |
0.08 | 6 | 1000 | 3 | 0.9062 | 0.8659 | 0.8724 | 0.9157 |
0.04 | 5 | 2000 | 1 | 0.9069 | 0.8674 | 0.8725 | 0.9161 |
0.04 | 4 | 2000 | 1 | 0.9057 | 0.8677 | 0.8725 | 0.9148 |
0.25 | 3 | 2000 | 3 | 0.8985 | 0.8734 | 0.8729 | 0.9157 |
0.1 | 4 | 2000 | 5 | 0.9051 | 0.8685 | 0.8729 | 0.9153 |
0.1 | 3 | 2000 | 3 | 0.9046 | 0.8693 | 0.8733 | 0.9147 |
0.25 | 3 | 1000 | 3 | 0.9038 | 0.8704 | 0.8735 | 0.914 |
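
Which configuration counts as best depends on the metric used for ranking. The sketch below parses three rows copied from the table above and ranks them by macro f1 and by accuracy; `parse_markdown_table` is a hypothetical helper written for this illustration.

```python
from typing import Dict, List


def parse_markdown_table(text: str) -> List[Dict[str, str]]:
    """Parse a pipe-separated table (header, separator, rows) into dicts."""
    lines = [line.strip() for line in text.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header and the ---|--- separator
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


# Three rows copied verbatim from the table above.
table = """
learning rate | max depth | iterations | l2 leaf reg | Recall | Precision | macro f1 | Accuracy |
---|---|---|---|---|---|---|---|
0.08 | 6 | 800 | 3 | 0.9018 | 0.8658 | 0.8704 | 0.9167 |
0.04 | 5 | 2000 | 1 | 0.9069 | 0.8674 | 0.8725 | 0.9161 |
0.25 | 3 | 1000 | 3 | 0.9038 | 0.8704 | 0.8735 | 0.914 |
"""
rows = parse_markdown_table(table)
best_by_f1 = max(rows, key=lambda row: float(row["macro f1"]))
best_by_accuracy = max(rows, key=lambda row: float(row["Accuracy"]))
```

Among these three rows, the configuration with the highest macro f1 (learning rate 0.25, depth 3, 1000 iterations) is not the one with the highest accuracy (learning rate 0.08, depth 6, 800 iterations).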