## Can you use the following data to build....?
1. A model with an embedding layer and dense layers (but w/ no layers meant for sequential data)
2.  A model using Conv1d Layers
3.  A model with one sequential layer (LSTM or GRU)
4. A model with stacked sequential layers (LSTM or GRU)
5. A model with bidirectional sequential layers 

### After choosing a model, feed it some realistic tweets that are not from your training data to see if it returns meaningful/useful results.






•	Citation of paper providing original dataset:  Shahi, Gautam Kishore, Anne Dirkson, and Tim A. Majchrzak. "An exploratory study of covid-19 misinformation on twitter." Online Social Networks and Media 22 (2021): 100104.

## Import Data

In [2]:
#Source:Fighting an Infodemic: COVID-19 Fake News Dataset, https://github.com/diptamath/covid_fake_news,https://arxiv.org/abs/2011.03327 

import pandas as pd
trainingdata=pd.read_csv("https://raw.githubusercontent.com/diptamath/covid_fake_news/main/data/Constraint_Train.csv", usecols = ['tweet','label'])
testdata=pd.read_csv("https://raw.githubusercontent.com/diptamath/covid_fake_news/main/data/english_test_with_labels.csv", usecols = ['tweet','label'])

trainingdata

Unnamed: 0,tweet,label
0,The CDC currently reports 99031 deaths. In gen...,real
1,States reported 1121 deaths a small rise from ...,real
2,Politically Correct Woman (Almost) Uses Pandem...,fake
3,#IndiaFightsCorona: We have 1524 #COVID testin...,real
4,Populous states can generate large case counts...,real
...,...,...
6415,A tiger tested positive for COVID-19 please st...,fake
6416,???Autopsies prove that COVID-19 is??� a blood...,fake
6417,_A post claims a COVID-19 vaccine has already ...,fake
6418,Aamir Khan Donate 250 Cr. In PM Relief Cares Fund,fake


•	Present examples of tweets from the dataset that demonstrate real information or misinformation.

In [3]:
contains_real = trainingdata['label'].str.contains("real")

#contains_fake = trainingdata['label'].str.contains("fake")

filtered_real = trainingdata[contains_real]
filtered_real.head()

Unnamed: 0,tweet,label
0,The CDC currently reports 99031 deaths. In gen...,real
1,States reported 1121 deaths a small rise from ...,real
3,#IndiaFightsCorona: We have 1524 #COVID testin...,real
4,Populous states can generate large case counts...,real
5,"Covid Act Now found ""on average each person in...",real


In [4]:
contains_fake = trainingdata['label'].str.contains("fake")

filtered_fake = trainingdata[contains_fake]
filtered_fake.head()

Unnamed: 0,tweet,label
2,Politically Correct Woman (Almost) Uses Pandem...,fake
7,Obama Calls Trump’s Coronavirus Response A Cha...,fake
8,"???Clearly, the Obama administration did not l...",fake
9,Retraction—Hydroxychloroquine or chloroquine w...,fake
11,The NBA is poised to restart this month. In Ma...,fake


•	Discuss the dataset in general terms and describe why building a predictive model using this data might be practically useful.  Who could benefit from a model like this? Explain.

The dataset contains 6420 rows and 2 columns. The tweet column contains text information of tweets while the label means the information being tweeted is real or fake.

Building a predictive model using this data will help predict the real and fake information in past and real time tweets. 

Social network content security team will benefit a model like this, since they can closely monitor what is being said on the platform, and stop the spread of misinformation early on to avoid mass distribution. e.g. People might be misguided because of some misinformation on the internet about inaccurate virus information which could harm public health system. Content supervision team will be able to flag inaccurate content more promptly and secure a safer internet environment.

## Define Preprocessor

In [5]:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

# Build vocabulary from training text data
tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(trainingdata.tweet)

# preprocessor tokenizes words and makes sure all documents have the same length
def preprocessor(data, maxlen, max_words):

    sequences = tokenizer.texts_to_sequences(data)

    word_index = tokenizer.word_index
    X = pad_sequences(sequences, maxlen=maxlen)

    return X

## Prepare Train and Test Data

In [6]:
# tokenize and pad X data
X_train = preprocessor(trainingdata.tweet, maxlen=40, max_words=10000)
X_test = preprocessor(testdata.tweet, maxlen=40, max_words=10000)

# ohe encode Y data
y_train = pd.get_dummies(trainingdata.label)
y_test = pd.get_dummies(testdata.label)

In [7]:
print(X_train.shape)
print(X_test.shape)

(6420, 40)
(2140, 40)


## Train Placeholder Model 

In [8]:
from tensorflow.keras.layers import Dense, Embedding,Flatten
from tensorflow.keras.models import Sequential

# replace this model with the architectures from the task description
model = Sequential()
model.add(Embedding(10000, 16, input_length=40))
model.add(Flatten())
model.add(Dense(2, activation='softmax'))

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['acc'])

history = model.fit(X_train, y_train,
                    epochs=10,
                    batch_size=32,
                    validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [9]:
# format y_pred as labels 
y_pred = model.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'real',
 'fake',
 'real',
 'real',
 'real',
 'real']

# build models





In [10]:
# tokenize and pad X data
X_train = preprocessor(trainingdata.tweet, maxlen=40, max_words=10000)
X_test = preprocessor(testdata.tweet, maxlen=40, max_words=10000)

# ohe encode Y data
y_train = pd.get_dummies(trainingdata.label)
y_test = pd.get_dummies(testdata.label)

In [11]:
# Example 1: simple RNN
from tensorflow.keras.layers import SimpleRNN, LSTM
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN

model_1 = Sequential()
model_1.add(Embedding(10000, 32, input_length=40))
model_1.add(SimpleRNN(32))
model_1.add(Dense(2, activation='sigmoid'))

model_1.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_1.fit(X_train, y_train,
                    epochs=10,
                    batch_size=32,
                    validation_split=0.2)

# Small training data.  Increase for model improvement

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [12]:
# format y_pred as labels 
y_pred = model_1.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'real',
 'fake',
 'real',
 'real',
 'real',
 'real']

In [13]:
# Example 2: Stacked RNN layers

model_2 = Sequential()
model_2.add(Embedding(10000, 32, input_length=40))
model_2.add(LSTM(32, return_sequences=True))
model_2.add(SimpleRNN(32, return_sequences=True))
model_2.add(SimpleRNN(32, return_sequences=True))
model_2.add(SimpleRNN(32))
model_2.add(Dense(2, activation='sigmoid'))

model_2.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_2.fit(X_train, y_train,
                    epochs=10,
                    batch_size=32,
                    validation_split=0.2)

# Small training data.  Increase for model improvement

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [14]:
# format y_pred as labels 
y_pred = model_2.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'real',
 'fake',
 'real',
 'fake',
 'real',
 'real']

In [15]:
# Example 3: LSTM layer

model_3 = Sequential()
model_3.add(Embedding(10000, 32, input_length=40))
model_3.add(LSTM(32))
model_3.add(Dense(2, activation='softmax'))

model_3.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_3.fit(X_train, y_train,
                    epochs=10,
                    batch_size=32,
                    validation_split=0.2)

# Small training data.  Increase for model improvement

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [16]:
# format y_pred as labels 
y_pred = model_3.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'real',
 'real',
 'real',
 'fake',
 'real',
 'real']

In [17]:
#Example 4: Bidirectional LSTM
from tensorflow.keras.layers import Embedding, Bidirectional

model_4 = Sequential()
model_4.add(Embedding(10000, 32, input_length=40))
model_4.add(Bidirectional(LSTM(32)))
model_4.add(Dense(2, activation='sigmoid'))

model_4.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_4.fit(X_train, y_train, 
                      epochs=10, 
                      batch_size=128, 
                      validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [18]:
# format y_pred as labels 
y_pred = model_4.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'real',
 'fake',
 'real',
 'real',
 'real',
 'real']

In [19]:
#Example 5: LSTM with dropout layer
model_5 = Sequential()
model_5.add(Embedding(10000, 32, input_length=40))
model_5.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2)) 
model_5.add(Dense(2, activation='sigmoid'))

# try using different optimizers and different optimizer configs

model_5.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_5.fit(X_train, y_train,
                    epochs=15,
                    batch_size=32,
                    validation_split=0.2)


Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


In [20]:
model_5.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_5 (Embedding)      (None, 40, 32)            320000    
_________________________________________________________________
lstm_3 (LSTM)                (None, 128)               82432     
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 258       
Total params: 402,690
Trainable params: 402,690
Non-trainable params: 0
_________________________________________________________________


In [21]:
# format y_pred as labels 
y_pred = model_5.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'real',
 'real',
 'real',
 'fake',
 'real',
 'real']

In [22]:
# Use 1D Conv layer rather than RNN or LSTM or GRU to fit model
# Why? Much lighter model to fit. Here we are training on the full dataset.  If you try
# to build a model using LSTM code after running this one it will be much slower.

from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.layers import SimpleRNN, LSTM,Embedding

model_6 = Sequential()
model_6.add(layers.Embedding(10000, 8, input_length=40))
model_6.add(layers.Conv1D(32, 7, activation='relu')) 
model_6.add(layers.MaxPooling1D(5)) #
model_6.add(layers.Conv1D(32, 6, activation='relu'))
model_6.add(layers.GlobalMaxPooling1D())
model_6.add(layers.Dense(2))

model_6.summary()




Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_6 (Embedding)      (None, 40, 8)             80000     
_________________________________________________________________
conv1d (Conv1D)              (None, 34, 32)            1824      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 6, 32)             0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 1, 32)             6176      
_________________________________________________________________
global_max_pooling1d (Global (None, 32)                0         
_________________________________________________________________
dense_6 (Dense)              (None, 2)                 66        
Total params: 88,066
Trainable params: 88,066
Non-trainable params: 0
__________________________________________________

In [23]:
model_6.compile(optimizer=RMSprop(lr=1e-4),
              loss='binary_crossentropy',
              metrics=['acc'])
history = model_6.fit(X_train, y_train,
                    epochs=50,
                    batch_size=128,
                    validation_split=0.2)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [24]:
# format y_pred as labels 
y_pred = model_6.predict(X_test).argmax(axis=1)
predicted_labels = [y_test.columns[i] for i in y_pred]
predicted_labels[0:10]

['real',
 'fake',
 'fake',
 'real',
 'fake',
 'fake',
 'real',
 'fake',
 'real',
 'real']

Discuss which models performed better and point out relevant hyper-parameter values for successful models: 

My best model is the 5th model; it is LSTM with dropout layer. It has 3 layers in total: Embedding, LSTM and Dense. It has 402,690 parameters. The embedding takes in 10,000 32, and input length of 40 as the parameter. LSTM layer has dropout value of 0.2; and dense layer uses sigmoid activation function. It uses optimizer of rmsprop, and loss function of binary_crossentropy. 

## Submit Model

In [25]:
# install aimodelshare library
%%capture
! pip install aimodelshare --upgrade --extra-index-url https://test.pypi.org/simple/ 

In [26]:
import aimodelshare as ai
from aimodelshare.aimsonnx import model_to_onnx
from aimodelshare.aimsonnx  import instantiate_model


In [27]:
# save preprocessor
ai.export_preprocessor(preprocessor,"")

In [28]:
# save model in onnx format
onnx_model = model_to_onnx(model_5, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("onnx_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

INFO:tensorflow:Assets written to: /tmp/assets


In [29]:
# set credentials for modeltoapi function 
# make sure you have uploaded your credentials.txt file
from aimodelshare.aws import set_credentials
api_url = "https://wvr23l2z9i.execute-api.us-east-1.amazonaws.com/prod/m"

set_credentials(apiurl=api_url,credential_file="credentials.txt", type="submit_model", manual=False)

AI Model Share login credentials set successfully.
AWS credentials set successfully.


In [30]:
# submit model and predictions to competition
ai.submit_model("onnx_model.onnx",
                api_url,
                prediction_submission=predicted_labels,
                preprocessor="preprocessor.zip")

'Your model has been submitted as model version 77'

In [31]:
# check leaderboard
data=ai.get_leaderboard(api_url, verbose=3)
ai.leaderboard.stylize_leaderboard(data)

Unnamed: 0,accuracy,f1_score,precision,recall,ml_framework,transfer_learning,deep_learning,model_type,depth,num_params,bidirectional_layers,conv1d_layers,dense_layers,embedding_layers,flatten_layers,globalmaxpooling1d_layers,lstm_layers,maxpooling1d_layers,simplernn_layers,relu_act,sigmoid_act,softmax_act,tanh_act,loss,optimizer,model_config,username,version
0,95.09%,95.09%,95.07%,95.12%,keras,False,True,Sequential,3,161922,,,1,1,1.0,,,,,,,1.0,,str,RMSprop,"{'name': 'sequential', 'layers...",hpeters,67
1,95.09%,95.09%,95.07%,95.12%,keras,False,True,Sequential,3,161922,,,1,1,1.0,,,,,,,1.0,,str,RMSprop,"{'name': 'sequential', 'layers...",hpeters,66
2,95.00%,94.99%,94.97%,95.02%,keras,False,True,Sequential,5,1081482,1.0,,2,1,,,1.0,,,1.0,,1.0,1.0,str,RMSprop,"{'name': 'sequential_29', 'lay...",kagenlim,61
3,94.86%,94.85%,94.84%,94.87%,keras,False,True,Sequential,5,1035746,,,2,1,,,2.0,,,1.0,,1.0,2.0,str,RMSprop,"{'name': 'sequential_3', 'laye...",kagenlim,19
4,94.77%,94.76%,94.74%,94.78%,keras,False,True,Sequential,9,1313030,,,2,1,1.0,,1.0,,4.0,,3.0,,4.0,str,RMSprop,"{'name': 'sequential_1', 'laye...",kka2120,69
5,94.58%,94.57%,94.57%,94.57%,keras,False,True,Sequential,5,1070202,,,2,1,,,2.0,,,1.0,,1.0,2.0,str,RMSprop,"{'name': 'sequential_4', 'laye...",kagenlim,60
6,94.49%,94.47%,94.47%,94.48%,keras,False,True,Sequential,3,161282,,,1,1,1.0,,,,,,,1.0,,str,RMSprop,"{'name': 'sequential', 'layers...",newusertest,4
7,94.35%,94.34%,94.32%,94.37%,keras,False,True,Sequential,6,148066,,2.0,1,1,1.0,,,1.0,,2.0,,1.0,,str,RMSprop,"{'name': 'sequential_72', 'lay...",prajseth,40
8,94.25%,94.24%,94.24%,94.24%,keras,False,True,Sequential,3,98818,,,1,1,,,1.0,,,,,1.0,1.0,str,RMSprop,"{'name': 'sequential_78', 'lay...",prajseth,41
9,94.21%,94.19%,94.18%,94.21%,keras,False,True,Sequential,3,402690,,,1,1,,,1.0,,,,1.0,,1.0,str,RMSprop,"{'name': 'sequential_5', 'laye...",xc2303_xc,63


In [32]:
 # Get best model architecture and view model summary, change version arg as needed
 
 bestmodel = ai.aimsonnx.instantiate_model(api_url, version=61) 

 bestmodel.summary()

Model: "sequential_29"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_27 (Embedding)     (None, 40, 100)           1000000   
_________________________________________________________________
bidirectional_5 (Bidirection (None, 40, 80)            45120     
_________________________________________________________________
lstm_37 (LSTM)               (None, 60)                33840     
_________________________________________________________________
dense_43 (Dense)             (None, 40)                2440      
_________________________________________________________________
dense_44 (Dense)             (None, 2)                 82        
Total params: 1,081,482
Trainable params: 1,081,482
Non-trainable params: 0
_________________________________________________________________


In [33]:
# Compare two model versions to see diffs
ai.aimsonnx.compare_models(api_url, version_list=[61,62]) 



Unnamed: 0,Model_61_Layer,Model_61_Shape,Model_61_Params,Model_62_Layer,Model_62_Shape,Model_62_Params
0,Embedding,"(None, 40, 100)",1000000,Embedding,"(None, 40, 32)",320000.0
1,Bidirectional,"(None, 40, 80)",45120,LSTM,"(None, 128)",82432.0
2,LSTM,"(None, 60)",33840,Dense,"(None, 2)",258.0
3,Dense,"(None, 40)",2440,,,
4,Dense,"(None, 2)",82,,,


**Explain how the model's structure is different from your best model./ Compare summary of your model with top model **


Performance: 
The difference between the top model (Version 61) and my model (version 62) is that, the accuracy score is much higher with 95.0% accuracy versus mine: 94.2%; and f1_scroe of 94.99% versus mine: 94.19%. 

Layers: 
The best model uses 5 layers in total: Embedding, Bidirectional, LSTM, and 2 Dense layers. The model contains 1,081,482 number of parameters. My model has 3 layers in total: Embedding, LSTM and Dense. It has 402,690 parameters. 

Activation and Optimizer: 
The best model used relu, softmax and tanh activation while my model used sigmoid and tanh activation. Both models used RMSprop as optimizer. 



In [34]:
# Fit the best model from the leader board to training data and evaluate it on test data to complete your report.
bestmodel.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['acc'])

bestmodel.fit(X_train, y_train,batch_size=1,
              epochs = 5, verbose=1,validation_data=(X_test,y_test))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f4c6f9fe390>

In [35]:
import pandas as pd

testdata_own=pd.read_csv("tweets_test.csv", usecols = ['tweets','label'])

testdata_own


Unnamed: 0,tweets,label
0,"Using rtweet package, I was able to find numbe...",fake
1,"n_id<-get_followers(user[i],n=u$followers_coun...",fake
2,"for a given user name. like (""Tarun""), but hav...",fake
3,Any suggestion will be helpful.,fake
4,We can extract number of tweets by user as bel...,fake
5,We can extract number of tweets by user as below,fake
6,This topic was automatically closed 7 days aft...,fake
7,Hello! Looks like you’re enjoying the discussi...,fake
8,"mlverse software, covering topics ranging from...",fake
9,"Patrick Corbin says ""I feel good, I feel healt...",real


In [39]:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

# Build vocabulary from test text data
tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(testdata_own.tweets)

# preprocessor tokenizes words and makes sure all documents have the same length
def preprocessor(data, maxlen, max_words):

    sequences = tokenizer.texts_to_sequences(data)

    word_index = tokenizer.word_index
    X = pad_sequences(sequences, maxlen=maxlen)

    return X


In [40]:
# tokenize and pad X data
X_test_2 = preprocessor(testdata_own.tweets, maxlen=40, max_words=10000)

# ohe encode Y data
y_test_2 = pd.get_dummies(testdata_own.label)

In [43]:
# y_pred 
y_pred = model_5.predict(X_test_2)
y_pred


array([[3.7288666e-04, 9.9964589e-01],
       [6.2756044e-01, 3.9685267e-01],
       [9.8370135e-01, 1.6256303e-02],
       [7.4199677e-01, 2.6254958e-01],
       [7.9996109e-01, 2.0854610e-01],
       [9.2655689e-01, 7.3887110e-02],
       [3.9740026e-01, 6.2416947e-01],
       [3.4831464e-03, 9.9706340e-01],
       [3.3529646e-05, 9.9996978e-01],
       [1.9299984e-04, 9.9983644e-01],
       [9.9999958e-01, 3.6463311e-07],
       [1.1288822e-03, 9.9860704e-01]], dtype=float32)