### RyanAir Customer Review Analysis

#### Business Understanding 
In the airline industry, customer reviews and surveys shape how carriers improve the customer experience. Companies such as RyanAir routinely follow up on specific services flagged in customer feedback and make improvements to raise satisfaction.

Customer reviews also carry significant influence over prospective passengers, often swaying the decision to try a particular airline and, ultimately, whether to book with it. In a service-oriented industry like aviation, this feedback both guides service improvements and helps travellers make better-informed choices in the market.

#### Objectives
1. Conduct a sentiment analysis of **RyanAir** customer reviews.
2. Identify the most frequently logged complaints from clients.
3. Assess passenger classes to determine which class receives the highest number of complaints and, conversely, the highest appreciation.  
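
As a rough illustration of the first objective, the sketch below maps a VADER-style compound score to a sentiment label. It assumes the commonly used cut-offs of ±0.05 suggested in the VADER documentation; `label_sentiment` is a hypothetical helper, and the example scores are made up for illustration rather than produced by the analyzer.

```python
def label_sentiment(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# With vaderSentiment installed, the compound score would come from:
#   SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
# The scores below are hypothetical placeholders, not real analyzer output.
example_scores = {"review_a": 0.62, "review_b": 0.0, "review_c": -0.48}

labels = {name: label_sentiment(score) for name, score in example_scores.items()}
print(labels)
```

The threshold is a parameter so that it can be widened or narrowed if too many reviews end up in the neutral bucket.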

In [1]:
from bs4 import BeautifulSoup 
import pandas as pd 
import numpy as np 
import requests 
import re  
import matplotlib.pyplot as plt 
%matplotlib inline 
import seaborn as sns 
import nltk 
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer



In [2]:
base_url = "https://www.airlinequality.com/airline-reviews/ryanair"
pages = 20 
page_size = 100

reviews = []

# Loop over each results page (pages are numbered from 1)
for i in range(1, pages + 1):

    print(f"Scraping page {i}")

    # Create URL to collect links from paginated data
    url = f"{base_url}/page/{i}/?sortby=post_date%3ADesc&pagesize={page_size}"

    # Collect HTML data from this page; fail fast on timeouts and HTTP errors
    response = requests.get(url, timeout=30)
    response.raise_for_status()

    # Parse content
    content = response.content
    parsed_content = BeautifulSoup(content, 'html.parser')
    for para in parsed_content.find_all("div", {"class": "text_content"}):
        reviews.append(para.get_text())
    
    print(f"   ---> {len(reviews)} total reviews")

Scraping page 1
   ---> 100 total reviews
Scraping page 2
   ---> 200 total reviews
Scraping page 3
   ---> 300 total reviews
Scraping page 4
   ---> 400 total reviews
Scraping page 5
   ---> 500 total reviews
Scraping page 6
   ---> 600 total reviews
Scraping page 7
   ---> 700 total reviews
Scraping page 8
   ---> 800 total reviews
Scraping page 9
   ---> 900 total reviews
Scraping page 10
   ---> 1000 total reviews
Scraping page 11
   ---> 1100 total reviews
Scraping page 12
   ---> 1200 total reviews
Scraping page 13
   ---> 1300 total reviews
Scraping page 14
   ---> 1400 total reviews
Scraping page 15
   ---> 1500 total reviews
Scraping page 16
   ---> 1600 total reviews
Scraping page 17
   ---> 1700 total reviews
Scraping page 18
   ---> 1800 total reviews
Scraping page 19
   ---> 1900 total reviews
Scraping page 20
   ---> 2000 total reviews
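
The paginated URL used in the scraping loop can also be assembled with `urllib.parse.urlencode`, which escapes the query string (for example `:` becomes `%3A`) instead of relying on a hand-encoded literal. This is a minimal sketch; `build_page_url` is a hypothetical helper that is not part of the original notebook, and it performs no network requests.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.airlinequality.com/airline-reviews/ryanair"


def build_page_url(page: int, page_size: int = 100, base_url: str = BASE_URL) -> str:
    """Return the URL of one page of reviews, newest first."""
    # urlencode percent-encodes the ":" in the sort key automatically
    query = urlencode({"sortby": "post_date:Desc", "pagesize": page_size})
    return f"{base_url}/page/{page}/?{query}"


# Build the URLs for the first three pages without fetching them
urls = [build_page_url(i) for i in range(1, 4)]
print(urls[0])
```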


In [5]:
df = pd.DataFrame(reviews)
df.to_csv("ryan_reviews.csv")

In [6]:
df2 = pd.read_csv("ryan_reviews.csv")
df2.head()

Unnamed: 0.1,Unnamed: 0,0
0,0,✅ Trip Verified | Really impressed! You get wh...
1,1,✅ Trip Verified | I should like to review my ...
2,2,✅ Trip Verified | Flight left the gate ahead o...
3,3,Not Verified | Booked a fight from Copenhagen ...
4,4,Not Verified | The flight itself is operated ...
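
The `Unnamed: 0` columns in the output above appear because `to_csv` writes the DataFrame index as an extra column by default, and each save/reload cycle adds another one. One way to avoid this, sketched below under the assumption that the index carries no information worth keeping, is to name the column up front and pass `index=False`. The two sample reviews are placeholders, and an in-memory buffer stands in for the CSV file.

```python
import io

import pandas as pd

# Placeholder reviews standing in for the scraped list
sample_reviews = [
    "✅ Trip Verified | Really impressed!",
    "Not Verified | Flight was delayed.",
]

# Name the column at creation time so no rename is needed later
df = pd.DataFrame({"air_reviews": sample_reviews})

# index=False keeps the row index out of the CSV
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Reading the data back yields only the review column
buffer.seek(0)
df_roundtrip = pd.read_csv(buffer)
print(list(df_roundtrip.columns))
```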


In [9]:
# Give the review column a descriptive name and drop the leftover index columns
df2 = df2.rename(columns={'0': 'air_reviews'})
df2 = df2.drop(columns=['Unnamed: 0.1', 'Unnamed: 0'], errors='ignore')

In [11]:
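# Keep the text after the first "|"; reviews without a verification prefix are kept whole
df2['air_reviews'] = df2['air_reviews'].str.split('|', n=1).str[-1].str.strip()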
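
The cell above keeps only the text after the `|` separator. If the verification status is also of interest, it could be retained as a separate field. The sketch below shows one way to do that in plain Python with `str.partition`; `split_review` is a hypothetical helper, and it assumes the first `|` always separates the status from the review body.

```python
from typing import Tuple


def split_review(raw: str) -> Tuple[str, str]:
    """Split a raw review into (verification_status, review_text)."""
    status, sep, text = raw.partition("|")
    if not sep:
        # No "|" found: treat the whole string as review text
        return "", raw.strip()
    # Remove the check-mark emoji and surrounding whitespace from the status
    return status.replace("✅", "").strip(), text.strip()


status, text = split_review("✅ Trip Verified | Really impressed!")
print(status, "->", text)
```

Applied to a pandas column, the same function could be used via `df2['air_reviews'].apply(split_review)` before the column is overwritten.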