In [1]:
import pandas as pd

In [2]:
pd.set_option('display.max_colwidth', None)  # show full review text instead of truncating it

In [3]:
df = pd.read_csv('BA_reviews.csv', index_col=0)
df.head()

Unnamed: 0,reviews,rating
0,"✅ Trip Verified | The entire booking and ticketing experience has been stressful and erroneous. I booked directly with BA as I thought - why go to any other airline when their home headquarters is UK and I need to go to the UK. Mistake. This was months ago. I could not cancel or change my flight without paying them $750 to cancel my flight and have that money sitting for just 1 year on account, otherwise all was lost minus a few hundred in taxes. As whom I am really flying with is American Airlines and another partner I can't check in without being redirected. Then those sites don't recognize me. I don't appear in AA or Aer Lingus applications. I couldn't ask for more time to change planes in London despite all reviews saying how long and hard it is, as that would mean a cancel of plane ticket - loss of all money. I've been on the phone with agents for over an hour on 3 different calls. Their app will not let me in without each time a reset of passcode, a 30 minute wait for the email and then I get the same error.",1
1,Not Verified | British Airways cancelled my flight less than 24 hours before. Automatically rebooked it for 2 days later. I called customer service 3 times trying to change it and they would Not help. My daughter was flying on a different reservation and because I am legally blind I needed to be on the same flight - they didn’t care nor would they help. We eventually bought new tickets on easyJet. When we arrive home I entered a complaint to get my fare refunded. It took them 4 months and then the response was it was cancelled because of a strike in Spain so they won’t refund it. EasyJet didn’t cancel. The whole experience was awful and so disappointing.,1
2,"Not Verified | I wanted to write this review so I could give a huge thank you to one of the staff on Lisbon named Jay Ramphul. She went above and beyond and really helped me in an urgent situation. I had boarded my flight with 20 minutes before take off when I realized I had left my cell phone in the club lounge. I was not going to get permission to deboard and get back on in time for take off. Jay stepped in and made it happened. She literally ran with me a far distance to retrieve my phone with me and get me back on the plane within minutes of take off. This is true service! I don't know if this review will ever get back to her or her management, but I hope it does. I want to again express my gratitude for her help and kindness in this matter.",10
3,"✅ Trip Verified | Check in fast and friendly. Security a breeze. Lounge was busy early evening but comfortable and clean. Flight attendants welcoming. Seat a nightmare it was the reverse/forward with a step over from the window seat, who the hell thought that was a good idea?? Meal were OK but all curry based, like curry and I was on a flight to India so. FA was brilliant as the only flyer awake on an overnight flight. Improving since my last BA flight",7
4,"✅ Trip Verified | This is the first time I have seen the new Club World suite. The seat was comfortable but unlike other airlines, BA has crammed as many seats in business class as is humanly possible so the cabin felt cramped. The crew behaved as though we, the passengers, should have been grateful for them turning up for their shift. They looked scruffy in the new uniforms, clearly the dress code policy has been scrapped. I appreciate that appearance is no measure of service quality (in the UK alone) but the service was appalling. The menu lacked choice and the food was presented as if it fell to the floor and was scooped back onto the plate. The bedding was atrocious, an old scraggly blanket and I’ll fitting seat cover. I was cold and asked for an extra blanket which never arrived. There were no drinks coasters (obviously cutbacks) so my drinks kept spilling. I decided to clean up myself as the crew couldn’t be bothered and didn’t pick up on the fact I was using bathroom hand towels to act as drinks coasters to mop up the mess. The aircraft was old (although retrofitted) and had a leak by the galley wall with lots of blue roll in situ to mop up the mess. Not at all a premium service, this felt more like a low cost carrier doing “business”.",3


### Preprocessing

In [4]:
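import re

import nltk
from nltk.corpus import stopwords, wordnet
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# Note: the tokenizer, stop-word list, WordNet, and POS tagger each need their
# NLTK data package installed (via nltk.download) before the cells below run.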

In [5]:
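def clean_text(text):
    """Lowercase the text, strip punctuation and symbols, and tokenize it."""
    text = text.lower()
    # Drop everything that is neither a word character nor whitespace.
    # This also removes apostrophes, so "can't" becomes "cant".
    text = re.sub(r'[^\w\s]', '', text)

    tokens = word_tokenize(text)

    return tokens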

In [6]:
def stopword_removal(tokens):
    """Remove English stop words from a list of tokens."""
    stop_words = set(stopwords.words('english'))
    tokens = [token for token in tokens if token not in stop_words]

    return tokens
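
Because `clean_text` strips apostrophes before the stop words are removed, a contraction such as "don't" arrives here as `dont`, which no longer matches the apostrophe-bearing entry in the stop-word list. If the intent is to drop those tokens as well, one option is to extend the set with apostrophe-free variants. The sketch below is only an illustration under stated assumptions: `SAMPLE_STOP_WORDS` is a small stand-in for `stopwords.words('english')`, the helper names are hypothetical, and `str.split` stands in for `word_tokenize` so that it runs without NLTK.

```python
import re

# Stand-in subset of a stop-word list (assumption: in the notebook the real
# list comes from stopwords.words('english')).
SAMPLE_STOP_WORDS = {"i", "the", "in", "don't", "couldn't", "won't"}


def with_apostrophe_free_variants(stop_words):
    """Return the stop words plus copies with apostrophes removed."""
    return set(stop_words) | {word.replace("'", "") for word in stop_words}


def strip_punctuation(text):
    """Same normalization as clean_text, without the NLTK tokenizer."""
    return re.sub(r"[^\w\s]", "", text.lower())


tokens = strip_punctuation("I don't appear in the app.").split()
extended = with_apostrophe_free_variants(SAMPLE_STOP_WORDS)

kept_original = [t for t in tokens if t not in SAMPLE_STOP_WORDS]
kept_extended = [t for t in tokens if t not in extended]

print(kept_original)  # 'dont' survives the original stop-word set
print(kept_extended)  # 'dont' is filtered by the extended set
```

Whether negations such as `dont` should be removed at all depends on the downstream task, so this is a choice to make deliberately rather than a default.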

In [7]:
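def lemma(tokens):
    """Lemmatize tokens with the lemmatizer's default part of speech (noun).

    Not used by the pipeline below, which calls custom_lemma instead.
    """
    lemmatizer = WordNetLemmatizer()
    tokens = [lemmatizer.lemmatize(token) for token in tokens]

    return tokens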

In [8]:
lemmatizer = WordNetLemmatizer()
# Custom lemmatization function using WordNet
def custom_lemma(tokens):
    """Lemmatize each token using a WordNet part of speech derived from its POS tag."""
    lemmatized_tokens = []
    # Map the first letter of a Penn Treebank tag to a WordNet part of speech.
    pos_mapping = {
        'N': wordnet.NOUN,  # Noun
        'V': wordnet.VERB,  # Verb
        'R': wordnet.ADV,   # Adverb
        'J': wordnet.ADJ    # Adjective
    }
    for token in tokens:
        # Each token is tagged on its own, so the tagger sees no sentence context.
        pos = nltk.pos_tag([token])[0][1][0].upper()  # First letter of the token's POS tag
        if pos in pos_mapping:
            lemmatized_tokens.append(lemmatizer.lemmatize(token, pos_mapping[pos]))
        else:
            # No mapped WordNet part of speech for this tag: keep the token unchanged.
            lemmatized_tokens.append(token)
    return lemmatized_tokens
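
The mapping inside `custom_lemma` relies on the first letter of the Penn Treebank tag returned by the tagger. A minimal sketch of that lookup in isolation follows. The function name `penn_to_wordnet` is hypothetical, and the WordNet constants are written out as their single-letter values so the example runs without NLTK.

```python
# WordNet's part-of-speech constants are single letters; they are spelled out
# here (rather than imported from nltk.corpus.wordnet) to keep the sketch
# dependency-free.
WORDNET_NOUN, WORDNET_VERB, WORDNET_ADV, WORDNET_ADJ = "n", "v", "r", "a"

PENN_PREFIX_TO_WORDNET = {
    "N": WORDNET_NOUN,
    "V": WORDNET_VERB,
    "R": WORDNET_ADV,
    "J": WORDNET_ADJ,
}


def penn_to_wordnet(tag):
    """Return the WordNet POS for a Penn Treebank tag, or None if unmapped."""
    return PENN_PREFIX_TO_WORDNET.get(tag[:1].upper())


for tag in ["NNS", "VBD", "RB", "JJ", "IN"]:
    print(tag, "->", penn_to_wordnet(tag))
```

A first-letter lookup is coarse: any tag beginning with `R`, including the particle tag `RP`, is treated as an adverb, and unmapped tags such as `IN` leave the token as it is.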

In [9]:
def preprocessed(text):
    """Clean, filter, and lemmatize a review, returning it as a single string."""
    tokens = clean_text(text)
    filtered_tokens = stopword_removal(tokens)
    lemmatized_tokens = custom_lemma(filtered_tokens)
    preprocessed_text = ' '.join(lemmatized_tokens)

    return preprocessed_text

In [10]:
df['clean_reviews'] = df['reviews'].apply(preprocessed)
df.head(2)

Unnamed: 0,reviews,rating,clean_reviews
0,"✅ Trip Verified | The entire booking and ticketing experience has been stressful and erroneous. I booked directly with BA as I thought - why go to any other airline when their home headquarters is UK and I need to go to the UK. Mistake. This was months ago. I could not cancel or change my flight without paying them $750 to cancel my flight and have that money sitting for just 1 year on account, otherwise all was lost minus a few hundred in taxes. As whom I am really flying with is American Airlines and another partner I can't check in without being redirected. Then those sites don't recognize me. I don't appear in AA or Aer Lingus applications. I couldn't ask for more time to change planes in London despite all reviews saying how long and hard it is, as that would mean a cancel of plane ticket - loss of all money. I've been on the phone with agents for over an hour on 3 different calls. Their app will not let me in without each time a reset of passcode, a 30 minute wait for the email and then I get the same error.",1,trip verify entire book ticket experience stressful erroneous book directly ba thought go airline home headquarters uk need go uk mistake month ago could cancel change flight without pay 750 cancel flight money sit 1 year account otherwise lose minus hundred tax really fly american airline another partner cant check without redirect site dont recognize dont appear aa aer lingus application couldnt ask time change plane london despite review say long hard would mean cancel plane ticket loss money ive phone agent hour 3 different call app let without time reset passcode 30 minute wait email get error
1,Not Verified | British Airways cancelled my flight less than 24 hours before. Automatically rebooked it for 2 days later. I called customer service 3 times trying to change it and they would Not help. My daughter was flying on a different reservation and because I am legally blind I needed to be on the same flight - they didn’t care nor would they help. We eventually bought new tickets on easyJet. When we arrive home I entered a complaint to get my fare refunded. It took them 4 months and then the response was it was cancelled because of a strike in Spain so they won’t refund it. EasyJet didn’t cancel. The whole experience was awful and so disappointing.,1,verify british airway cancel flight less 24 hour automatically rebooked 2 day later call customer service 3 time try change would help daughter fly different reservation legally blind need flight didnt care would help eventually bought new ticket easyjet arrive home enter complaint get fare refund take 4 month response cancel strike spain wont refund easyjet didnt cancel whole experience awful disappoint


In [11]:
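# Remove the leftover verification prefix from the start of each cleaned review.
# "Trip Verified" becomes "trip verify"; "Not Verified" becomes "verify" because "not" is a stop word.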
df['clean_reviews'] = df['clean_reviews'].str.replace(r'^trip verify\s*', '', regex=True)
df['clean_reviews'] = df['clean_reviews'].str.replace(r'^verify\s*', '', regex=True)

In [12]:
df.to_csv('Clean_BA_reviews_v2.csv', index=False)