**Question:** Build and compile a two-layer LSTM model for the Twitter US Airline Sentiment dataset.



**Description:**
    
Load the Twitter US Airline Sentiment dataset, which has 14585 rows and 21 columns.

Select only the `text` and `airline_sentiment` columns for modeling.

Remove rows with the neutral label so that only positive and negative labels remain.

Convert the airline sentiment feature into numeric values.
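The filtering and encoding steps above can be sketched in plain Python. This is only an illustration of the idea behind `factorize()` (integer codes assigned in order of first appearance); the sample labels and the helper name `drop_and_encode` are hypothetical and not part of the dataset or of pandas.

```python
# Hypothetical sample labels; the real values come from the airline_sentiment column.
labels = ["neutral", "positive", "negative", "negative", "positive"]

def drop_and_encode(labels, dropped="neutral"):
    """Drop one label, then assign integer codes in order of first appearance."""
    kept = [label for label in labels if label != dropped]
    uniques = []
    codes = []
    for label in kept:
        if label not in uniques:
            uniques.append(label)
        codes.append(uniques.index(label))
    return codes, uniques

codes, uniques = drop_and_encode(labels)
print(codes)    # [0, 1, 1, 0]
print(uniques)  # ['positive', 'negative']
```

Because codes follow first appearance, which sentiment becomes 0 and which becomes 1 depends on the row order of the filtered data, so it is worth checking the returned uniques before interpreting predictions.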

Tokenize the text feature.
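To make the tokenization step concrete, here is a minimal plain-Python sketch of the general idea: build a word index ordered by frequency, convert each text to a list of integer ids, and left-pad with zeros to a fixed length. The sample texts, the regular expression, and the helper names are assumptions for illustration; the Keras `Tokenizer` applies its own filtering rules, so its output can differ in detail.

```python
import re
from collections import Counter

WORD_PATTERN = r"[a-z0-9']+"  # simplified word rule, assumed for this sketch

def fit_word_index(texts):
    """Map each word to an integer id, most frequent word first, starting at 1."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(WORD_PATTERN, text.lower()))
    # most_common() sorts by count; words with equal counts keep first-seen order.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def to_sequence(text, word_index):
    """Replace each known word with its id; unknown words are skipped."""
    words = re.findall(WORD_PATTERN, text.lower())
    return [word_index[w] for w in words if w in word_index]

def pad_pre(seq, maxlen, value=0):
    """Left-pad with zeros, or keep only the last maxlen ids if the sequence is too long."""
    seq = seq[-maxlen:]
    return [value] * (maxlen - len(seq)) + seq

# Hypothetical sample texts, not taken from the dataset.
texts = ["the flight was late", "great flight great crew"]
word_index = fit_word_index(texts)
padded = [pad_pre(to_sequence(t, word_index), maxlen=6) for t in texts]
print(word_index)
print(padded)  # [[0, 0, 3, 1, 4, 5], [0, 0, 2, 1, 2, 6]]
```

Id 0 is reserved for padding, which is why a vocabulary size is usually taken as the number of indexed words plus one.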

Build a sequential model with an embedding layer (vocabulary size 12975, embedding length 32, input length 200), followed by two stacked LSTM layers with 50 units each, a dropout layer with rate 0.5, and a final dense layer with sigmoid activation.
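As a sanity check on the architecture described above, the number of trainable parameters can be worked out by hand. The sketch below assumes standard LSTM layers with biases and a single-unit dense output; the helper names are hypothetical.

```python
def embedding_params(vocab_size, embed_dim):
    # One embedding vector per vocabulary entry.
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

layers = {
    "embedding": embedding_params(12975, 32),
    "lstm_1": lstm_params(32, 50),   # input is the 32-dimensional embedding
    "lstm_2": lstm_params(50, 50),   # input is the 50-dimensional output of lstm_1
    "dense": dense_params(50, 1),    # assumes a single sigmoid unit
}
total = sum(layers.values())
print(layers)
print(total)  # 452051
```

The dropout layer has no parameters, so it does not appear in the count. Most of the parameters sit in the embedding table.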

Compile the model with binary cross-entropy as the loss and Adam as the optimizer.
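For intuition about the loss named above, binary cross-entropy can be computed by hand for a few predictions. This is a minimal sketch with hypothetical labels and probabilities; the clipping constant `eps` is an assumption added to avoid taking the log of zero.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels and predicted probabilities.
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
unsure = binary_cross_entropy([1, 0], [0.5, 0.5])
print(round(confident, 4))  # 0.1054
print(round(unsure, 4))     # 0.6931
```

Confident, correct predictions give a loss close to zero, while a prediction of 0.5 for every example gives a loss of ln 2.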




**Level:** Hard


**Input format:**
CSV Dataset


**Output format:**
Model output (positive or negative)


**Sample input:**
Twitter US Airline sentiment dataset


**Sample output:**
positive
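A sigmoid output is a probability between 0 and 1, so it has to be mapped back to a label to produce output like the sample above. The sketch below assumes a 0.5 threshold and a hypothetical label order; the helper name `decode_prediction` is illustrative only, and the real order comes from the label encoding step.

```python
def decode_prediction(probability, uniques, threshold=0.5):
    """Return the label whose integer code matches the thresholded probability."""
    code = 1 if probability >= threshold else 0
    return uniques[code]

# Hypothetical label order: code 0 is 'positive', code 1 is 'negative'.
uniques = ["positive", "negative"]
print(decode_prediction(0.08, uniques))  # positive
print(decode_prediction(0.93, uniques))  # negative
```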



**SOLUTION:**


In [None]:
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

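# Load the training CSV and keep only the tweet text and its sentiment label
df = pd.read_csv('/home/metagogy/Tweets-train.csv', sep=',')
tweet_df = df[['text', 'airline_sentiment']]

# Drop neutral tweets so only positive and negative labels remain
tweet_df = tweet_df[tweet_df['airline_sentiment'] != 'neutral']

# Encode the labels as integers; factorize() returns (codes, uniques)
sentiment_label = tweet_df.airline_sentiment.factorize()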

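# Fit the tokenizer on the tweets; num_words limits encoding to the most frequent words
tweet = tweet_df.text.values
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(tweet)
# word_index holds every word seen, so vocab_size covers the full vocabulary (+1 for padding id 0)
vocab_size = len(tokenizer.word_index) + 1
# Convert each tweet to a list of word ids, then pad or truncate to 200 ids
encoded_docs = tokenizer.texts_to_sequences(tweet)
padded_sequence = pad_sequences(encoded_docs, maxlen=200)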

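# Size of each word embedding vector
embedding_vector_length = 32

# Embedding -> two stacked LSTM layers -> dropout -> sigmoid output
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_vector_length, input_length=200),
    # return_sequences=True passes the full sequence to the second LSTM
    tf.keras.layers.LSTM(50, return_sequences=True),
    tf.keras.layers.LSTM(50),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Binary cross-entropy matches the two-class (positive/negative) target
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])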
