Build a model that can rate the sentiment of a Tweet based on its content.

You'll build an NLP model to analyze Twitter sentiment about Apple and Google products. The dataset comes from CrowdFlower via data.world. Human raters labeled the sentiment of over 9,000 Tweets as positive, negative, or neither.

## Aim for a Proof of Concept

There are many approaches to NLP problems, so start with something simple and iterate from there. For example, you could start by limiting your analysis to positive and negative Tweets only, which lets you build a binary classifier. Then you could add the neutral Tweets to build out a multiclass classifier. You may also consider using some of the more advanced NLP methods in the Mod 4 Appendix.

## Evaluation

Evaluating multiclass classifiers can be trickier than evaluating binary classifiers because there are multiple ways to misclassify an observation, and some errors are more problematic than others. Use the business problem that your NLP project sets out to solve to inform your choice of evaluation metrics.
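To make the point about error types concrete, here is a minimal sketch that tallies every (actual, predicted) pair for a three-class problem. The label names and both toy lists are made up for illustration and do not come from the dataset.

```python
from collections import Counter

# Hypothetical labels and predictions, for illustration only.
actual    = ["pos", "neg", "neu", "neg", "pos", "neu", "neg"]
predicted = ["pos", "neu", "neu", "pos", "pos", "neg", "neg"]

# Count every (actual, predicted) pair; pairs that differ are the errors.
confusion = Counter(zip(actual, predicted))
errors = {pair: n for pair, n in confusion.items() if pair[0] != pair[1]}

def recall(label):
    """Share of items with this true label that were predicted correctly."""
    total = sum(n for (a, _), n in confusion.items() if a == label)
    return confusion[(label, label)] / total

for (a, p), n in sorted(errors.items()):
    print(f"actual={a:>3}  predicted={p:>3}  count={n}")
print("recall(neg) =", round(recall("neg"), 3))
```

A negative Tweet predicted as positive may matter more to a brand team than one predicted as neutral, which is the kind of distinction a single accuracy number hides.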

Data: https://data.world/crowdflower/brands-and-product-emotions

# Business Understanding

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split

In [2]:
# nltk related imports
import nltk
from nltk.tokenize import RegexpTokenizer, TweetTokenizer
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
import re

In [3]:
!ls ../../data

judge-1377884607_tweet_product_company.csv


In [4]:
df = pd.read_csv('../../data/judge-1377884607_tweet_product_company.csv', encoding = 'unicode_escape')
df

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,iPhone,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,iPad or iPhone App,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,iPad,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Google,Positive emotion
...,...,...,...
9088,Ipad everywhere. #SXSW {link},iPad,Positive emotion
9089,"Wave, buzz... RT @mention We interrupt your re...",,No emotion toward brand or product
9090,"Google's Zeiger, a physician never reported po...",,No emotion toward brand or product
9091,Some Verizon iPhone customers complained their...,,No emotion toward brand or product


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9093 entries, 0 to 9092
Data columns (total 3 columns):
 #   Column                                              Non-Null Count  Dtype 
---  ------                                              --------------  ----- 
 0   tweet_text                                          9092 non-null   object
 1   emotion_in_tweet_is_directed_at                     3291 non-null   object
 2   is_there_an_emotion_directed_at_a_brand_or_product  9093 non-null   object
dtypes: object(3)
memory usage: 213.2+ KB


In [6]:
first_tweet = df['tweet_text'][0]
first_tweet

'.@wesley83 I have a 3G iPhone. After 3 hrs tweeting at #RISE_Austin, it was dead!  I need to upgrade. Plugin stations at #SXSW.'

In [7]:
#lowercase
first_tweet_lower = first_tweet.lower()
first_tweet_lower

'.@wesley83 i have a 3g iphone. after 3 hrs tweeting at #rise_austin, it was dead!  i need to upgrade. plugin stations at #sxsw.'

In [8]:
#tweet tokenizer
tweet_tknzr = TweetTokenizer(strip_handles=True)
first_tweet_lower_tt = tweet_tknzr.tokenize(first_tweet_lower)
first_tweet_lower_tt

['.',
 'i',
 'have',
 'a',
 '3g',
 'iphone',
 '.',
 'after',
 '3',
 'hrs',
 'tweeting',
 'at',
 '#rise_austin',
 ',',
 'it',
 'was',
 'dead',
 '!',
 'i',
 'need',
 'to',
 'upgrade',
 '.',
 'plugin',
 'stations',
 'at',
 '#sxsw',
 '.']

In [9]:
#turn tokenized words back into tweet
first_tweet_lower_tweet = " ".join(first_tweet_lower_tt)
first_tweet_lower_tweet

'. i have a 3g iphone . after 3 hrs tweeting at #rise_austin , it was dead ! i need to upgrade . plugin stations at #sxsw .'

In [10]:
#use regexptokenizer
pattern = r"(?u)\w{2,}" # select all words with 2 or more characters
         #r"(?u)\b\w\w+\b"
         #r'\w+'     <-REMOVES PUNCTUATION
regexp_tknzr = RegexpTokenizer(pattern)
first_tweet_regexp = regexp_tknzr.tokenize(first_tweet_lower_tweet)
first_tweet_regexp

['have',
 '3g',
 'iphone',
 'after',
 'hrs',
 'tweeting',
 'at',
 'rise_austin',
 'it',
 'was',
 'dead',
 'need',
 'to',
 'upgrade',
 'plugin',
 'stations',
 'at',
 'sxsw']

In [11]:
# create list of stopwords in English
stopwords_list = stopwords.words('english')

#remove stopwords
first_tweet_sw_removed = [word for word in first_tweet_regexp if word not in stopwords_list]
first_tweet_sw_removed

['3g',
 'iphone',
 'hrs',
 'tweeting',
 'rise_austin',
 'dead',
 'need',
 'upgrade',
 'plugin',
 'stations',
 'sxsw']

In [12]:
# create lemma object
lemma = WordNetLemmatizer()
first_tweet_lemma = [lemma.lemmatize(token) for token in first_tweet_sw_removed]
first_tweet_lemma

['3g',
 'iphone',
 'hr',
 'tweeting',
 'rise_austin',
 'dead',
 'need',
 'upgrade',
 'plugin',
 'station',
 'sxsw']
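The steps above (lowercase, strip handles, keep tokens of two or more word characters, drop stopwords) can be sketched without NLTK as follows. This is a simplified stand-in, not the notebook's method: the `STOPWORDS` set is a hand-picked assumption covering only this one tweet, the `@\w+` rule is a rough substitute for `TweetTokenizer(strip_handles=True)`, and lemmatization is left out.

```python
import re

# Small hand-picked stopword set for this one tweet (an assumption; the
# notebook uses NLTK's full English list instead).
STOPWORDS = {"have", "after", "at", "it", "was", "to"}

def simple_tokens(text):
    """Lowercase, drop @handles, keep 2+ character tokens, remove stopwords."""
    text = re.sub(r"@\w+", " ", text.lower())   # rough handle removal
    tokens = re.findall(r"\w{2,}", text)        # words of 2+ characters
    return [t for t in tokens if t not in STOPWORDS]

tweet = (".@wesley83 I have a 3G iPhone. After 3 hrs tweeting at "
         "#RISE_Austin, it was dead!  I need to upgrade. "
         "Plugin stations at #SXSW.")
print(simple_tokens(tweet))
```

On this tweet the sketch yields the same tokens as the stopword-removal cell above; on other tweets the two approaches can differ, for example on e-mail addresses or words missing from the small stopword set.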

# *Doing two train/test splits first to avoid data leakage, then preprocessing each split*

In [13]:
df = df.rename(columns = {'tweet_text': 'Tweet', 
                         'emotion_in_tweet_is_directed_at': 'Product', 
                         'is_there_an_emotion_directed_at_a_brand_or_product': 'Sentiment'})
df.head() #Sanity Check

Unnamed: 0,Tweet,Product,Sentiment
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,iPhone,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,iPad or iPhone App,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,iPad,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Google,Positive emotion


In [14]:
df.shape

(9093, 3)

In [15]:
df['Tweet'].iloc[9092]

'\x8cÏ¡\x8eÏà\x8aü_\x8b\x81Ê\x8b\x81Î\x8b\x81Ò\x8b\x81£\x8b\x81Á\x8bââ\x8b\x81_\x8b\x81£\x8b\x81\x8f\x8bâ_\x8bÛâRT @mention Google Tests \x89ÛÏCheck-in Offers\x89Û\x9d At #SXSW {link}'

In [16]:
df['Tweet'].iloc[6]

nan

In [17]:
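df.drop([6, 9092], inplace=True)  # row 6 has no tweet text; row 9092 is mostly mis-encoded characters
df.drop_duplicates(inplace=True)
df.dropna(subset=['Tweet'], inplace=True)  # drop rows from df itself, not from a temporary Series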

In [18]:
df.isna().sum()

Tweet           0
Product      5787
Sentiment       0
dtype: int64

In [19]:
df['Sentiment'].value_counts()

No emotion toward brand or product    5374
Positive emotion                      2970
Negative emotion                       569
I can't tell                           156
Name: Sentiment, dtype: int64

In [20]:
df['Sentiment'] = df['Sentiment'].apply(lambda x: 1 if x == "Positive emotion" else 0)

In [21]:
X = df[['Tweet']]
y = df['Sentiment']
X_tr, X_test, y_tr, y_test = train_test_split(X, y, test_size=0.10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_tr, y_tr, test_size=0.25, random_state=42)

In [22]:
#BASELINE UNDERSTANDING
y_train.value_counts(normalize=True)

0    0.67636
1    0.32364
Name: Sentiment, dtype: float64
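The class shares above set the bar for every model that follows: a classifier that always predicts 0 would already score about 0.676 on the training split. A minimal sketch of that baseline, using toy labels with a similar split rather than the real data:

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Toy labels with roughly the 68/32 split seen above (not the real data).
toy_labels = [0] * 68 + [1] * 32
print(majority_baseline_accuracy(toy_labels))
```

Validation accuracies in the low 0.70s should therefore be read as a modest improvement over this baseline, not over 0.50.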

In [23]:
# #If we did a multi class
# dict_sent = {'No emotion toward brand or product':1, 
#              'Positive emotion':2,
#              'Negative emotion':0,
#              "I can't tell": 1}
# df['Sentiment'] = df['Sentiment'].map(dict_sent)
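To compare the binary encoding with the commented-out three-class mapping, the sketch below computes the class shares each scheme would produce. The counts are copied from the `value_counts()` output above; the helper name `class_shares` is introduced here for illustration.

```python
# Label counts copied from the value_counts() output above.
counts = {
    "No emotion toward brand or product": 5374,
    "Positive emotion": 2970,
    "Negative emotion": 569,
    "I can't tell": 156,
}

# The two encodings discussed in the notebook.
def binary(label):
    return 1 if label == "Positive emotion" else 0

three_class = {
    "No emotion toward brand or product": 1,
    "Positive emotion": 2,
    "Negative emotion": 0,
    "I can't tell": 1,
}.get

def class_shares(encode):
    """Fraction of rows that land in each encoded class."""
    total = sum(counts.values())
    shares = {}
    for label, n in counts.items():
        key = encode(label)
        shares[key] = shares.get(key, 0.0) + n / total
    return shares

print(class_shares(binary))
print(class_shares(three_class))
```

Under the three-class scheme the negative class holds only about 6% of the rows, so a multiclass model would face a much stronger imbalance than the binary one.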

In [24]:
# #Preprocess targets
# y_train = y_train.apply(lambda x: 1 if x == "Positive emotion" else 0)
# y_val = y_val.apply(lambda x: 1 if x == "Positive emotion" else 0)
# y_test = y_test.apply(lambda x: 1 if x == "Positive emotion" else 0)

In [25]:
X_train.head()

Unnamed: 0,Tweet
2324,@mention Can we make you an iPhone case with T...
5632,RT @mention Come party down with @mention &amp...
1751,#winning #winning - just gave away 5 red mophi...
5799,RT @mention google &amp; facebook have an offi...
3339,Rumor of Google launching their new social net...


In [26]:
X_train.shape

(6121, 1)

In [27]:
X_val.shape

(2041, 1)
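The two shapes above follow from applying the splits in sequence: 10% is held out for testing, then 25% of the remainder becomes the validation set. The sketch below reproduces the arithmetic, assuming a fractional `test_size` is rounded up to a whole number of rows (which is consistent with the shapes printed above). The row total is the sum of the label counts shown earlier.

```python
import math

def split_sizes(n_rows, test_frac, val_frac):
    """Return (train, val, test) sizes for two successive splits."""
    n_test = math.ceil(test_frac * n_rows)   # first split: hold out the test set
    n_rest = n_rows - n_test
    n_val = math.ceil(val_frac * n_rest)     # second split, taken from the remainder
    return n_rest - n_val, n_val, n_test

n_rows = 5374 + 2970 + 569 + 156             # rows left after cleaning
print(split_sizes(n_rows, 0.10, 0.25))
```

In effect the data is divided roughly 67.5% / 22.5% / 10% into training, validation, and test sets.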

In [28]:
#Instantiate necessary tools
tokenizer = RegexpTokenizer(r"(?u)\w{3,}")
stopwords_list = stopwords.words("english")
lemma = WordNetLemmatizer()
tweet_tknzr = TweetTokenizer(strip_handles=True)

In [29]:
def clean_tweets(text):
    no_handle = tweet_tknzr.tokenize(text)
    tweet = " ".join(no_handle) 
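    # Intended to strip http(s) URLs, hashtags, {bracketed} placeholders, HTML entities
    # such as &amp;, www/.com links, {link}, [video], and non-ASCII characters.
    # NOTE: the spaces before each backslash continuation and the indentation of the
    # following line are part of the pattern, so on the single-spaced text built above
    # only the last (non-ASCII) alternative can match; "link" and hashtag text
    # survive in the cleaned output below.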
    #clean = re.sub("((^|\W)@\b([-a-zA-Z0-9._]{3,25})\b) \
        #|(&[a-z]+;)|([^\w\s]) \
    clean = re.sub("(https?:\/\/\S+) \
                   |(#[A-Za-z0-9_]+) \
                   |(\{([a-zA-Z].+)\}) \
                   |(&[a-z]+;) \
                   |(www\.[a-z]?\.?(com)+|[a-z]+\.(com))\
                   |({link})\
                   |(\[video\])\
                   |([^\x00-\x7F]+\ *(?:[^\x00-\x7F]| )*)"," ", tweet)
    lower = clean.lower()
    token_list = tokenizer.tokenize(lower)
    stopwords_removed=[token for token in token_list if token not in stopwords_list]
    lemma_list = [lemma.lemmatize(token) for token in stopwords_removed]
    cleaned_string = " ".join(lemma_list) #Turn the lemma list into a string for the Vectorizer
    return cleaned_string
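Because the pattern in `clean_tweets` is a plain string split with backslash continuations, the surrounding whitespace becomes part of the regex. One way to keep a multi-line layout without that side effect is `re.VERBOSE`, sketched below. This is an alternative under stated assumptions, not the notebook's code: the names `NOISE` and `strip_noise` are hypothetical, the alternatives are simplified, and swapping it in would change the cleaned tweets and every score reported later (for example, hashtag text such as `sxsw` would disappear).

```python
import re

# In VERBOSE mode whitespace and "#" comments inside the pattern are ignored,
# so a literal "#" has to be escaped.
NOISE = re.compile(
    r"""
      https?://\S+        # http(s) URLs
    | www\.\S+            # bare www links
    | \#\w+               # hashtags, including the tag text
    | \{link\}            # the dataset's {link} placeholder
    | \[video\]           # the dataset's [video] placeholder
    | &[a-zA-Z]+;         # HTML entities such as &amp;
    | [^\x00-\x7F]+       # runs of non-ASCII characters
    """,
    re.VERBOSE,
)

def strip_noise(text):
    """Replace every noise match with a space, then collapse whitespace."""
    return re.sub(r"\s+", " ", NOISE.sub(" ", text)).strip()

print(strip_noise("great stuff &amp; more at #SXSW {link} http://t.co/x"))
```

Whether hashtags should be removed entirely or only lose the `#` sign is a modelling choice; here event tags like `#SXSW` appear in nearly every tweet, so they carry little signal either way.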

In [30]:
#Sanity Check
clean_tweets(X_train['Tweet'].iloc[0])

'make iphone case ttye time sxsw want show support'

In [31]:
# .assign() returns a new DataFrame for each split, which avoids pandas' SettingWithCopyWarning
X_train = X_train.assign(Tweet=X_train['Tweet'].apply(clean_tweets))
X_val = X_val.assign(Tweet=X_val['Tweet'].apply(clean_tweets))
X_test = X_test.assign(Tweet=X_test['Tweet'].apply(clean_tweets))

In [32]:
#Sanity Check
X_train

Unnamed: 0,Tweet
2324,make iphone case ttye time sxsw want show support
5632,come party google tonight sxsw link band food ...
1751,winning winning gave away red mophie juice pac...
5799,google facebook official death policy vast maj...
3339,rumor google launching new social network call...
...,...
5702,even security guard austin enjoy ipad time sxs...
8604,attending sxsw want explore austin check austi...
7836,apple popup store sxsw link gonnagetanipad2
7504,putting pop apple store sxsw smart talk unders...


In [33]:
X_val

Unnamed: 0,Tweet
891,hootsuite mobile sxsw update iphone blackberry...
4198,morning hearing google circle today link sxsw
2164,great location choice nice timing ipad launch ...
1885,win ipad sxsw via sxsw link
4700,launching product sxsw plenty else join h4cker...
...,...
1033,racing around sxsw best fueling great local fa...
4186,omg still line new ipad dieing hunger sxsw els...
7735,hour sxsw popup apple store lone security guar...
8211,great app interface example moma target flipbo...


In [34]:
#Sanity Check
y_train

2324    0
5632    1
1751    0
5799    0
3339    0
       ..
5702    1
8604    0
7836    0
7504    1
3536    0
Name: Sentiment, Length: 6121, dtype: int64

In [35]:
y_val

891     0
4198    0
2164    1
1885    1
4700    0
       ..
1033    0
4186    1
7735    0
8211    1
4517    0
Name: Sentiment, Length: 2041, dtype: int64

In [36]:
#DON'T NEED BECAUSE I ADDED A LINE TO THE CLEAN_TWEETS FUNCTION

# X_train["Tweet"] = X_train["Tweet"].str.join(" ")
# X_val["Tweet"] = X_val["Tweet"].str.join(" ")
# X_test["Tweet"] = X_test["Tweet"].str.join(" ")

In [37]:
#X_train.head()

# Vectorize

In [38]:
# Import the relevant vectorizers
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

In [39]:
c_vectorizer = CountVectorizer()
c_vectorizer.fit(X_train['Tweet'])
X_train_c_vec = c_vectorizer.transform(X_train['Tweet'])
X_train_c_vec

<6121x7145 sparse matrix of type '<class 'numpy.int64'>'
	with 63041 stored elements in Compressed Sparse Row format>

In [40]:
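# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() on newer versions
X_train_c_vec_df = pd.DataFrame(X_train_c_vec.toarray(), columns=c_vectorizer.get_feature_names(), 
                              index=X_train.index)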
X_train_c_vec_df

Unnamed: 0,000,0310apple,100,1000,101,106,10am,10pm,10x,10x2,...,zlf,zms,zomb,zombie,zomg,zone,zoom,zuckerberg,zynga,zzzs
2324,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5632,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1751,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5799,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3339,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5702,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8604,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7836,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7504,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [41]:
#Sanity Check
X_train_c_vec_df['sxsw']

2324    1
5632    1
1751    1
5799    1
3339    1
       ..
5702    1
8604    1
7836    1
7504    1
3536    1
Name: sxsw, Length: 6121, dtype: int64

In [42]:
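X_val_c_vec = c_vectorizer.transform(X_val['Tweet'])
# Reuse the training column names and the validation index so both frames line up
X_val_c_vec_df = pd.DataFrame(X_val_c_vec.toarray(), columns=X_train_c_vec_df.columns,
                              index=X_val.index)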
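Fitting the vectorizer on the training tweets only, then calling `transform` on the validation tweets, is what keeps validation vocabulary from leaking into the model. A minimal pure-Python sketch of that idea follows; `fit_vocabulary` and `transform` are hypothetical helpers that split on whitespace, which is simpler than `CountVectorizer`'s real tokenization.

```python
from collections import Counter

def fit_vocabulary(train_docs):
    """Map each training token to a column index (alphabetical order)."""
    tokens = sorted({tok for doc in train_docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(tokens)}

def transform(docs, vocab):
    """Count vocabulary tokens per document; unseen tokens are ignored."""
    rows = []
    for doc in docs:
        counts = Counter(tok for tok in doc.split() if tok in vocab)
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows

train = ["win ipad sxsw", "apple popup store sxsw"]
val = ["new ipad line sxsw sxsw"]

vocab = fit_vocabulary(train)      # learned from training text only
print(list(vocab))
print(transform(val, vocab))       # "new" and "line" have no column
```

Words that appear only in the validation tweets simply get no column, which mirrors what a deployed model would see on new text.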

In [43]:
tfidf_vectorizer = TfidfVectorizer()
    #max_df=.95,  # ignores words that appear in more than 95% of docs
    #min_df=2     # ignores words that appear in fewer than 2 docs
    #max_features=10
tfidf_vectorizer.fit(X_train['Tweet'])
X_train_tfidf_vec = tfidf_vectorizer.transform(X_train['Tweet'])
X_val_tfidf_vec = tfidf_vectorizer.transform(X_val['Tweet'])
X_train_tfidf_vec_df = pd.DataFrame(X_train_tfidf_vec.toarray())
X_val_tfidf_vec_df = pd.DataFrame(X_val_tfidf_vec.toarray())
X_train_tfidf_vec_df.shape

(6121, 7145)
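For intuition about what the TF-IDF matrix contains, here is a small pure-Python sketch of the weighting, assuming `TfidfVectorizer`'s default settings (smoothed idf, `idf = ln((1 + n) / (1 + df)) + 1`, followed by L2 normalization of each row). The function name `tfidf_rows` and the three toy documents are illustrative only.

```python
import math
from collections import Counter

def tfidf_rows(docs):
    """TF-IDF with smoothed idf and L2-normalized rows."""
    n_docs = len(docs)
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        weights = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0
        rows.append([w / norm for w in weights])
    return vocab, rows

docs = ["ipad sxsw", "google sxsw", "ipad line sxsw"]
vocab, rows = tfidf_rows(docs)
for doc, row in zip(docs, rows):
    print(doc, [round(w, 3) for w in row])
```

A term such as `sxsw` that occurs in every document receives the lowest idf, so rarer terms like `ipad` get more weight within the same tweet.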

# Simple Logistic Regression Model w/ Count Vectorizer

In [44]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression(random_state=42)
lr.fit(X_train_c_vec_df, y_train)
print(lr.score(X_train_c_vec_df, y_train))
print(lr.score(X_val_c_vec_df, y_val))

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


0.8921744812939062
0.7398334149926507


# Simple Logistic Regression Model w/ Tfidf Vectorizer

In [45]:
lr_2 = LogisticRegression()
lr_2.fit(X_train_tfidf_vec_df, y_train)
print(lr_2.score(X_train_tfidf_vec_df, y_train))
print(lr_2.score(X_val_tfidf_vec_df, y_val))

0.800359418395687
0.7315041646251838


# Naive Bayes w/ Count Vectorizer

In [46]:
from sklearn.naive_bayes import MultinomialNB
naive_bayes = MultinomialNB()
naive_bayes.fit(X_train_c_vec_df, y_train)
print(naive_bayes.score(X_train_c_vec_df, y_train))
print(naive_bayes.score(X_val_c_vec_df, y_val))

0.8577029897075641
0.7158255756981872


# Naive Bayes w/ Tfidf Vectorizer

In [47]:
naive_bayes_2 = MultinomialNB()
naive_bayes_2.fit(X_train_tfidf_vec_df, y_train)
print(naive_bayes_2.score(X_train_tfidf_vec_df, y_train))
print(naive_bayes_2.score(X_val_tfidf_vec_df, y_val))

0.7892501225289985
0.7123958843704067


# Naive Bayes w/ Tuned Vectorizers

In [48]:
c_vectorizer_2 = CountVectorizer(max_df=.99,min_df=2, max_features=1000)
    #max_df: ignores words that appear in more than this share of docs (0.99 here)
    #min_df=2: ignores words that appear in fewer than 2 docs
c_vectorizer_2.fit(X_train['Tweet'])
X_train_c_vec_2 = c_vectorizer_2.transform(X_train['Tweet'])
X_val_c_vec_2 = c_vectorizer_2.transform(X_val['Tweet'])
X_train_c_vec_df_2 = pd.DataFrame(X_train_c_vec_2.toarray())
X_val_c_vec_df_2 = pd.DataFrame(X_val_c_vec_2.toarray())
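The three vectorizer arguments act as vocabulary filters. The sketch below imitates them in plain Python, assuming an integer `min_df` (a document count), a fractional `max_df` (a share of documents), and `max_features` keeping the terms with the highest total counts. `select_vocabulary` is a hypothetical helper, and its alphabetical tie-breaking is a choice made here, not necessarily scikit-learn's.

```python
from collections import Counter

def select_vocabulary(docs, min_df=2, max_df=0.99, max_features=None):
    """Filter terms by document frequency, then cap the vocabulary size."""
    n_docs = len(docs)
    tokenized = [doc.split() for doc in docs]
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    term_freq = Counter(t for toks in tokenized for t in toks)
    kept = [t for t in doc_freq
            if min_df <= doc_freq[t] <= max_df * n_docs]
    # Most frequent terms first; ties broken alphabetically (a choice made here).
    kept.sort(key=lambda t: (-term_freq[t], t))
    if max_features is not None:
        kept = kept[:max_features]
    return sorted(kept)

docs = ["sxsw ipad line", "sxsw ipad store", "sxsw google party", "sxsw google"]
print(select_vocabulary(docs))
print(select_vocabulary(docs, max_features=1))
```

In this toy corpus `sxsw` is dropped for appearing in every document and the one-off words are dropped by `min_df`, which is the same pruning that shrinks the vocabulary from 7,145 terms to at most 1,000 above.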

In [49]:
naive_bayes_3 = MultinomialNB()
naive_bayes_3.fit(X_train_c_vec_df_2, y_train)
print("naive bayes with tuned count vectorizer")
print(naive_bayes_3.score(X_train_c_vec_df_2, y_train))
print(naive_bayes_3.score(X_val_c_vec_df_2, y_val))

naive bayes with tuned count vectorizer
0.7582094429014867
0.705046545810877


In [50]:
tfidf_vectorizer_2 = TfidfVectorizer(max_df=.99,min_df=0.005, max_features=1000)
tfidf_vectorizer_2.fit(X_train['Tweet'])
X_train_tfidf_vec_2 = tfidf_vectorizer_2.transform(X_train['Tweet'])
X_val_tfidf_vec_2 = tfidf_vectorizer_2.transform(X_val['Tweet'])
X_train_tfidf_vec_df_2 = pd.DataFrame(X_train_tfidf_vec_2.toarray())
X_val_tfidf_vec_df_2 = pd.DataFrame(X_val_tfidf_vec_2.toarray())

In [51]:
naive_bayes_4 = MultinomialNB()
naive_bayes_4.fit(X_train_tfidf_vec_df_2, y_train)
print("naive bayes with tuned tfidf")
print(naive_bayes_4.score(X_train_tfidf_vec_df_2, y_train))
print(naive_bayes_4.score(X_val_tfidf_vec_df_2, y_val))

naive bayes with tuned tfidf
0.7162228394053259
0.7011268985791279


# Logistic Regression with max iter = 1000

In [52]:
lr_3 = LogisticRegression(max_iter=1000)
lr_3.fit(X_train_c_vec_df, y_train)
print("default count vectorizer")
print(lr_3.score(X_train_c_vec_df, y_train))
print(lr_3.score(X_val_c_vec_df, y_val))

default count vectorizer
0.8921744812939062
0.739343459088682


In [53]:
#lr_3 = LogisticRegression(max_iter=1000)
lr_3.fit(X_train_tfidf_vec_df, y_train)
print("default tfidf vectorizer")
print(lr_3.score(X_train_tfidf_vec_df, y_train))
print(lr_3.score(X_val_tfidf_vec_df, y_val))

default tfidf vectorizer
0.800359418395687
0.7315041646251838


In [54]:
#lr_3 = LogisticRegression(max_iter=1000)
lr_3.fit(X_train_tfidf_vec_df_2, y_train)
print("tfidf vectorizer with tuned parameters")
print(lr_3.score(X_train_tfidf_vec_df_2, y_train))
print(lr_3.score(X_val_tfidf_vec_df_2, y_val))

tfidf vectorizer with tuned parameters
0.7288024832543702
0.7158255756981872


# PCA w/ Logistic Regression

In [55]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_c_vec_df) #Use default count vectorizer
X_val_scaled = scaler.transform(X_val_c_vec_df)

In [56]:
# Code to import, instantiate and fit a PCA object
from sklearn.decomposition import PCA

pca = PCA(n_components = .95, random_state=42)
pca.fit(X_train_scaled)
pca.n_components_

2916
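Passing a float such as `0.95` as `n_components` asks PCA to keep as many leading components as are needed to explain that share of the variance, which here is 2,916 of the 7,145 columns. The selection rule can be sketched as below; the ratios are made-up numbers and `components_for_variance` is a hypothetical helper, not part of scikit-learn.

```python
from itertools import accumulate

def components_for_variance(explained_variance_ratio, target):
    """Smallest number of leading components whose cumulative share exceeds target."""
    running = accumulate(explained_variance_ratio)
    for k, cumulative in enumerate(running, start=1):
        if cumulative > target:
            return k
    return len(explained_variance_ratio)

# Made-up explained-variance ratios, sorted from largest to smallest.
ratios = [0.50, 0.30, 0.15, 0.05]
print(components_for_variance(ratios, 0.90))
```

Needing thousands of components to reach 95% suggests the variance in a sparse bag-of-words matrix is spread thinly across many columns, so PCA compresses it far less than it would dense, correlated features.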

In [57]:
from sklearn.pipeline import Pipeline
# Construct a pipeline
pipe_lr = Pipeline([('pca', pca), 
                    ('lr', LogisticRegression(random_state=42, max_iter=1000))])
pipe_lr.fit(X_train_scaled, y_train)
print("PCA with n_components=0.95, default count vectorizer, and logistic regression")
print(pipe_lr.score(X_train_scaled, y_train))
print(pipe_lr.score(X_val_scaled, y_val))

PCA with n_components=0.95, default count vectorizer, and logistic regression
0.9516418885802973
0.6888780009799118


In [58]:
# pipe_mnb = Pipeline([('pca', pca), 
#                     ('mnb', MultinomialNB())])
# pipe_mnb.fit(X_train_scaled, y_train)
# print("PCA with n_components=0.95, default count vectorizer, and naive bayes")
# print(pipe_mnb.score(X_train_scaled, y_train))
# print(pipe_mnb.score(X_val_scaled, y_val))

*This raised an error: MultinomialNB requires non-negative feature values, and the scaled, PCA-transformed data contains negative values.*
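The negative values come from the scaling step itself, before PCA is even applied: standardizing a count column subtracts its mean, so every zero count becomes negative. A minimal sketch with a made-up column of word counts:

```python
from statistics import mean, pstdev

def standardize(column):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]

counts = [0, 0, 1, 0, 2]          # made-up word counts for one vocabulary term
scaled = standardize(counts)
print(scaled)
print("any negative:", min(scaled) < 0)
```

Multinomial Naive Bayes models features as counts, so it fits the raw or TF-IDF matrices used earlier but not standardized or PCA-projected data.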

# PCA w/ Tuned TFIDF Vectorizer

In [59]:
scaler_2 = StandardScaler()
X_train_scaled_2 = scaler_2.fit_transform(X_train_tfidf_vec_df_2) #Use tuned tfidf vectorizer
X_val_scaled_2 = scaler_2.transform(X_val_tfidf_vec_df_2)

In [60]:
pca_2 = PCA(n_components = .90, random_state=42)
pca_2.fit(X_train_scaled_2)
pca_2.n_components_

223

In [61]:
pipe_lr_2 = Pipeline([('pca2', pca_2), 
                    ('lr2', LogisticRegression(random_state=42, max_iter=1000))])
pipe_lr_2.fit(X_train_scaled_2, y_train)
print("PCA with n_components=0.90, tuned tfidf vectorizer, and logistic regression")
print(pipe_lr_2.score(X_train_scaled_2, y_train))
print(pipe_lr_2.score(X_val_scaled_2, y_val))

PCA with n_components=0.90, tuned tfidf vectorizer, and logistic regression
0.7247181833033818
0.7143557079862812


# Pipeline and Cross Validate

In [62]:
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import roc_auc_score, plot_confusion_matrix, plot_roc_curve
from sklearn.ensemble import RandomForestClassifier

In [63]:
pipe_logreg = Pipeline(steps=[
    ('count_vectorizer', CountVectorizer()),
    ('logreg', LogisticRegression(random_state=42))
])
cv = cross_validate(pipe_logreg, X_train['Tweet'], y_train, return_train_score=True, \
                    scoring=['accuracy', 'precision','roc_auc'])
cv

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

{'fit_time': array([0.15934896, 0.13930917, 0.13503003, 0.13716602, 0.27472997]),
 'score_time': array([0.02159619, 0.02020407, 0.02023196, 0.03457999, 0.02587795]),
 'test_accuracy': array([0.74040816, 0.71486928, 0.75408497, 0.71078431, 0.74264706]),
 'train_accuracy': array([0.90093954, 0.90034715, 0.89810088, 0.90320604, 0.90055136]),
 'test_precision': array([0.64259928, 0.58545455, 0.68060837, 0.57      , 0.6539924 ]),
 'train_precision': array([0.93301812, 0.93019608, 0.92093023, 0.93843725, 0.93502377]),
 'test_roc_auc': array([0.75917814, 0.72217647, 0.7512428 , 0.72191419, 0.7299032 ]),
 'train_roc_auc': array([0.96578595, 0.96697679, 0.96357515, 0.96520491, 0.96583773])}
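The per-fold arrays are easier to compare once averaged. The sketch below copies the accuracy arrays from the output above and reports the mean test score, the mean train score, and the gap between them.

```python
from statistics import mean

# Accuracy arrays copied from the cross_validate output above.
cv_scores = {
    "test_accuracy":  [0.74040816, 0.71486928, 0.75408497, 0.71078431, 0.74264706],
    "train_accuracy": [0.90093954, 0.90034715, 0.89810088, 0.90320604, 0.90055136],
}

test_mean = mean(cv_scores["test_accuracy"])
train_mean = mean(cv_scores["train_accuracy"])
gap = train_mean - test_mean

print(f"test {test_mean:.3f}  train {train_mean:.3f}  gap {gap:.3f}")
```

A gap of roughly 0.17 between train and test accuracy points to overfitting, which is consistent with the single train/validation scores reported for the same model earlier.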

In [64]:
pipe_rfc = Pipeline(steps=[
    ('tfidf_vectorizer', TfidfVectorizer(max_df=.99,min_df=0.005, max_features=1000)),
    ('rfc', RandomForestClassifier(random_state=42))
])
cv = cross_validate(pipe_rfc, X_train['Tweet'], y_train, return_train_score=True, \
                    scoring=['accuracy', 'precision','roc_auc'])
cv

{'fit_time': array([1.72083116, 1.4621079 , 2.2294631 , 1.45498419, 1.37938499]),
 'score_time': array([0.10485101, 0.09080291, 0.09856582, 0.0896368 , 0.09111977]),
 'test_accuracy': array([0.71510204, 0.71568627, 0.71650327, 0.72058824, 0.70915033]),
 'train_accuracy': array([0.95118464, 0.94853992, 0.94894834, 0.94813151, 0.94935675]),
 'test_precision': array([0.59677419, 0.59022556, 0.59760956, 0.608     , 0.57936508]),
 'train_precision': array([0.96030116, 0.95997239, 0.96450939, 0.95676047, 0.96650384]),
 'test_roc_auc': array([0.71998169, 0.71112087, 0.70720185, 0.71120474, 0.70237856]),
 'train_roc_auc': array([0.99131925, 0.99003833, 0.99008938, 0.98925492, 0.98986803])}

# Grid Search for Random Forest Classifier

In [68]:
pg_rfc = {
    "rfc__max_depth" :[25, 50, 100],
    "rfc__min_samples_leaf" : [1, 3, 5],
    "rfc__n_estimators": [500, 1000, 1500],
    "rfc__class_weight" :['balanced'],
    "rfc__random_state":[42]
}
grid_rfc = GridSearchCV(estimator = pipe_rfc, param_grid=pg_rfc, scoring='accuracy',
                        return_train_score = True)
grid_rfc.fit(X_train['Tweet'], y_train)

GridSearchCV(estimator=Pipeline(steps=[('tfidf_vectorizer',
                                        TfidfVectorizer(max_df=0.99,
                                                        max_features=1000,
                                                        min_df=0.005)),
                                       ('rfc',
                                        RandomForestClassifier(random_state=42))]),
             param_grid={'rfc__class_weight': ['balanced'],
                         'rfc__max_depth': [25, 50, 100],
                         'rfc__min_samples_leaf': [1, 3, 5],
                         'rfc__n_estimators': [500, 1000, 1500],
                         'rfc__random_state': [42]},
             return_train_score=True, scoring='accuracy')
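To gauge how much work this search does, the grid can be expanded by hand. The sketch below enumerates the parameter combinations with `itertools.product`; the fold count of five is an assumption based on the `split0` to `split4` columns in the results table.

```python
from itertools import product

param_grid = {
    "rfc__max_depth": [25, 50, 100],
    "rfc__min_samples_leaf": [1, 3, 5],
    "rfc__n_estimators": [500, 1000, 1500],
    "rfc__class_weight": ["balanced"],
    "rfc__random_state": [42],
}

# One dict per combination of parameter values.
candidates = [dict(zip(param_grid, values))
              for values in product(*param_grid.values())]

n_folds = 5  # assumed: five splits, matching split0..split4 in the results
print(len(candidates), "candidates,", len(candidates) * n_folds, "cross-validation fits")
```

With forests of up to 1,500 trees per fit, trimming the grid or lowering `n_estimators` during exploration is a reasonable way to shorten the search.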

In [69]:
pd.DataFrame(grid_rfc.cv_results_)

Unnamed: 0,mean_fit_time,std_fit_time,mean_score_time,std_score_time,param_rfc__class_weight,param_rfc__max_depth,param_rfc__min_samples_leaf,param_rfc__n_estimators,param_rfc__random_state,params,...,mean_test_score,std_test_score,rank_test_score,split0_train_score,split1_train_score,split2_train_score,split3_train_score,split4_train_score,mean_train_score,std_train_score
0,2.388841,0.065414,0.100762,0.001182,balanced,25,1,500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.692369,0.00884,9,0.860703,0.865836,0.864407,0.870125,0.869308,0.866076,0.003422
1,4.817236,0.064181,0.198789,0.01214,balanced,25,1,1000,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.693513,0.007407,8,0.86009,0.862569,0.864815,0.868899,0.871554,0.865585,0.004159
2,7.189433,0.435921,0.31614,0.036888,balanced,25,1,1500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.693513,0.008173,7,0.861315,0.864407,0.864203,0.868491,0.872779,0.866239,0.00399
3,1.669266,0.067784,0.09586,0.007585,balanced,25,3,500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.677339,0.009797,18,0.764502,0.769655,0.759853,0.783949,0.781295,0.771851,0.009363
4,3.224799,0.089171,0.175012,0.001245,balanced,25,3,1000,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.679789,0.010623,16,0.763685,0.766592,0.760261,0.783337,0.782928,0.771361,0.009819
5,4.871765,0.153225,0.257637,0.002735,balanced,25,3,1500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.679789,0.011646,17,0.762868,0.767409,0.76067,0.783745,0.783949,0.771728,0.010131
6,1.517088,0.036262,0.090586,0.000336,balanced,25,5,500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.672928,0.007028,27,0.737949,0.746171,0.736778,0.762712,0.759853,0.748693,0.010815
7,3.002232,0.099937,0.17267,0.00445,balanced,25,5,1000,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.673745,0.010299,24,0.7404,0.745559,0.737799,0.763324,0.761078,0.749632,0.010586
8,4.446752,0.072476,0.25323,0.008405,balanced,25,5,1500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.673744,0.012017,25,0.739992,0.74658,0.738615,0.763733,0.761282,0.75004,0.010558
9,4.420471,0.084525,0.138189,0.002086,balanced,50,1,500,42,"{'rfc__class_weight': 'balanced', 'rfc__max_de...",...,0.706257,0.004379,2,0.928309,0.933633,0.93057,0.926486,0.934245,0.930648,0.002988


In [70]:
grid_rfc.best_params_

{'rfc__class_weight': 'balanced',
 'rfc__max_depth': 50,
 'rfc__min_samples_leaf': 1,
 'rfc__n_estimators': 1000,
 'rfc__random_state': 42}

## CatBoost


In [66]:
from catboost import CatBoostClassifier

In [67]:
pipe_cbc = Pipeline(steps=[
    ('tfidf_vectorizer', TfidfVectorizer(max_df=.99,min_df=0.005, max_features=1000)),
    ('cbc', CatBoostClassifier())
])
cv = cross_validate(pipe_cbc, X_train['Tweet'], y_train, return_train_score=True, \
                    scoring=['accuracy', 'precision','roc_auc'])
cv

Learning rate set to 0.0203
0:	learn: 0.6902729	total: 74.7ms	remaining: 1m 14s
1:	learn: 0.6871030	total: 80.3ms	remaining: 40s
2:	learn: 0.6842272	total: 85.6ms	remaining: 28.5s
3:	learn: 0.6814182	total: 91ms	remaining: 22.7s
4:	learn: 0.6787201	total: 97ms	remaining: 19.3s
5:	learn: 0.6760546	total: 103ms	remaining: 17s
6:	learn: 0.6735657	total: 108ms	remaining: 15.3s
7:	learn: 0.6709612	total: 114ms	remaining: 14.1s
8:	learn: 0.6686637	total: 119ms	remaining: 13.1s
9:	learn: 0.6664258	total: 124ms	remaining: 12.3s
10:	learn: 0.6638814	total: 129ms	remaining: 11.6s
11:	learn: 0.6618314	total: 133ms	remaining: 11s
12:	learn: 0.6590975	total: 139ms	remaining: 10.5s
13:	learn: 0.6571945	total: 144ms	remaining: 10.1s
14:	learn: 0.6552338	total: 152ms	remaining: 9.97s
15:	learn: 0.6535929	total: 157ms	remaining: 9.65s
16:	learn: 0.6513700	total: 162ms	remaining: 9.36s
17:	learn: 0.6493916	total: 167ms	remaining: 9.09s
18:	learn: 0.6476792	total: 172ms	remaining: 8.86s
19:	learn: 0.6460

164:	learn: 0.5743828	total: 978ms	remaining: 4.95s
165:	learn: 0.5742294	total: 984ms	remaining: 4.94s
166:	learn: 0.5740444	total: 990ms	remaining: 4.94s
167:	learn: 0.5738296	total: 996ms	remaining: 4.93s
168:	learn: 0.5736574	total: 1s	remaining: 4.92s
169:	learn: 0.5733833	total: 1s	remaining: 4.91s
170:	learn: 0.5731974	total: 1.01s	remaining: 4.9s
171:	learn: 0.5730204	total: 1.02s	remaining: 4.89s
172:	learn: 0.5727775	total: 1.02s	remaining: 4.88s
173:	learn: 0.5726184	total: 1.03s	remaining: 4.87s
174:	learn: 0.5723910	total: 1.03s	remaining: 4.86s
175:	learn: 0.5721736	total: 1.04s	remaining: 4.85s
176:	learn: 0.5719787	total: 1.04s	remaining: 4.85s
177:	learn: 0.5717937	total: 1.05s	remaining: 4.84s
178:	learn: 0.5716016	total: 1.05s	remaining: 4.83s
179:	learn: 0.5713764	total: 1.06s	remaining: 4.83s
180:	learn: 0.5711611	total: 1.07s	remaining: 4.82s
181:	learn: 0.5710123	total: 1.07s	remaining: 4.81s
182:	learn: 0.5707883	total: 1.08s	remaining: 4.81s
183:	learn: 0.57055

[CatBoost verbose training output truncated. Each cross-validation fit logs one line per iteration (iteration index, training loss, elapsed and remaining time). Learning rate set to 0.020302; training loss falls from about 0.69 at iteration 0 to about 0.47 by roughly iteration 900.]

{'fit_time': array([6.25341892, 5.99257088, 5.89370108, 5.49667788, 5.95950818]),
 'score_time': array([0.03642988, 0.02947021, 0.02756095, 0.02810717, 0.03046703]),
 'test_accuracy': array([0.73632653, 0.72794118, 0.73202614, 0.71813725, 0.71078431]),
 'train_accuracy': array([0.79758987, 0.80416582, 0.79354707, 0.80437002, 0.79844803]),
 'test_precision': array([0.73717949, 0.68      , 0.72077922, 0.64912281, 0.62352941]),
 'train_precision': array([0.91468531, 0.92297297, 0.92082111, 0.92307692, 0.92112676]),
 'test_roc_auc': array([0.73420825, 0.7268061 , 0.72506771, 0.7153342 , 0.7136568 ]),
 'train_roc_auc': array([0.88700191, 0.88634704, 0.88297282, 0.88746752, 0.88956476])}
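
The per-fold arrays above are easier to compare once they are reduced to a mean per metric and a train/test gap. The following is a minimal sketch rather than part of the original notebook: the scores are copied by hand from the output above, and `summarize_cv` is a hypothetical helper name introduced only for illustration.

```python
import numpy as np

# Scores copied from the cross_validate output above (5 folds).
cv_results = {
    "test_accuracy": np.array([0.73632653, 0.72794118, 0.73202614, 0.71813725, 0.71078431]),
    "train_accuracy": np.array([0.79758987, 0.80416582, 0.79354707, 0.80437002, 0.79844803]),
    "test_precision": np.array([0.73717949, 0.68, 0.72077922, 0.64912281, 0.62352941]),
    "train_precision": np.array([0.91468531, 0.92297297, 0.92082111, 0.92307692, 0.92112676]),
    "test_roc_auc": np.array([0.73420825, 0.7268061, 0.72506771, 0.7153342, 0.7136568]),
    "train_roc_auc": np.array([0.88700191, 0.88634704, 0.88297282, 0.88746752, 0.88956476]),
}


def summarize_cv(results):
    """Hypothetical helper: mean train/test score, test std, and gap per metric."""
    summary = {}
    for key, test_scores in results.items():
        if not key.startswith("test_"):
            continue  # skip train_* and timing entries
        metric = key[len("test_"):]
        train_scores = results[f"train_{metric}"]
        train_mean = float(np.mean(train_scores))
        test_mean = float(np.mean(test_scores))
        summary[metric] = {
            "train_mean": train_mean,
            "test_mean": test_mean,
            "test_std": float(np.std(test_scores)),
            "gap": train_mean - test_mean,  # a large gap hints at overfitting
        }
    return summary


summary = summarize_cv(cv_results)
for metric, stats in summary.items():
    print(
        f"{metric:>10}: train={stats['train_mean']:.3f} "
        f"test={stats['test_mean']:.3f} (std {stats['test_std']:.3f}) "
        f"gap={stats['gap']:.3f}"
    )
```

On these numbers the mean test accuracy is about 0.73 against roughly 0.80 on the training folds, and the precision gap is wider (about 0.92 on train versus about 0.68 on test), which may indicate that the model is overfitting. Separately, assuming the estimator is a `CatBoostClassifier`, passing `verbose=0` when constructing it should suppress the per-iteration training log.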