# British Airways Data Science Challenge

## Description
My solutions for the Forage program: web scraping, data cleaning, analysis, and visualization to extract business insights. Demonstrates practical data science skills for real-world problem-solving.

## Task 1
### 1. Scrape data from the web
The first step is to scrape review data from [Skytrax](https://www.airlinequality.com/airline-reviews/british-airways/).

I will use Jupyter Notebook for data gathering, cleaning, and analysis.

### 2. Clean the data

In [176]:
# Let's first see what technology the website uses, using the 'builtwith' library
!pip install builtwith



In [177]:
import builtwith

In [178]:
website = "https://www.airlinequality.com"
result = builtwith.parse(website)
print(result)

{'cdn': ['CloudFlare'], 'advertising-networks': ['Google AdSense'], 'font-scripts': ['Google Font API'], 'photo-galleries': ['Lightbox'], 'javascript-frameworks': ['Lightbox', 'Modernizr', 'jQuery'], 'cms': ['WordPress'], 'programming-languages': ['PHP'], 'blogs': ['PHP', 'WordPress'], 'marketing-automation': ['Yoast SEO'], 'web-frameworks': ['ZURB Foundation']}


In [179]:
!pip install python-whois



In [180]:
import whois

In [181]:
print(whois.whois(website))

{
  "domain_name": "AIRLINEQUALITY.COM",
  "registrar": "TUCOWS, INC.",
  "registrar_url": [
    "http://www.tucows.com",
    "http://tucowsdomains.com"
  ],
  "reseller": "Namesco Limited",
  "whois_server": "whois.tucows.com",
  "referral_url": null,
  "updated_date": "2024-01-26 08:58:03",
  "creation_date": "2000-02-24 11:52:16",
  "expiration_date": "2025-02-24 11:52:14",
  "name_servers": [
    "AMIR.NS.CLOUDFLARE.COM",
    "CRUZ.NS.CLOUDFLARE.COM"
  ],
  "status": [
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited"
  ],
  "emails": [
    "domainabuse@tucows.com",
    "transfers@names.co.uk"
  ],
  "dnssec": "unsigned",
  "name": "REDACTED FOR PRIVACY",
  "org": "REDACTED FOR PRIVACY",
  "address": "REDACTED FOR PRIVACY",
  "city": "REDACTED FOR PRIVACY",
  "state": "Greater London",
  "registrant_postal_code": "REDACTED FOR PRIVACY",
  "country": "GB"
}


In [182]:
import requests
from bs4 import BeautifulSoup

In [183]:
# https://www.airlinequality.com/airline-reviews/british-airways . This is the first page with all the reviews.

# what we will need is:
# Title
# Name
# Date
# Review
# Verification

# This means our table will have 5 columns

In [184]:
# Let's start with titles of page 1:
url = "https://www.airlinequality.com/airline-reviews/british-airways/page/1/"
response = requests.get(url) # request for page
response.raise_for_status() # Checks if the page gives successful status (200)
soup = BeautifulSoup(response.text, 'html')
title = [h2.text.strip() for h2 in soup.find_all("h2", class_="text_header")]

In [185]:
print(title)

['"not use British Airways on this route"', '"they still haven\'t replied"', '“food has really gone downhill”', '"thoroughly enjoyed this flight"', '“customer support was terrible”', '"a really enjoyable experience"', '"Very good flight"', '"relatively comfortable elderly plane"', '"70 days chasing BA’s complaints department"', '"BA refused to reimburse me"']


In [186]:
# Great, we can now request, parse, and display text from the web page.
# Let's extract the other fields we need.

In [187]:
name = [span.text.strip() for span in soup.find_all("span", itemprop="name")] # takes all the names for the reviews submitted

In [188]:
print(name)

['C Barton', 'E Vandoon', 'John Prescott', 'A Hashin', 'L Martin', 'Paul Lee', 'Guy Senior', 'Simon Channon', 'R Layne', 'Michael Chastain']


In [189]:
# Let's see if we can get the dates for page 1
date = [(time["datetime"], time.text.strip()) for time in soup.find_all("time", itemprop="datePublished")]

In [190]:
print(date)

[('2025-02-21', '21st February 2025'), ('2025-02-18', '18th February 2025'), ('2025-02-14', '14th February 2025'), ('2025-02-14', '14th February 2025'), ('2025-02-07', '7th February 2025'), ('2025-02-01', '1st February 2025'), ('2025-01-20', '20th January 2025'), ('2025-01-19', '19th January 2025'), ('2025-01-15', '15th January 2025'), ('2025-01-09', '9th January 2025')]
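The `datetime` attribute is already in ISO format, but the display text (for example "21st February 2025") needs parsing before it can be used as a date. A minimal sketch using only the standard library, assuming the display text always follows the "day with ordinal suffix, month name, year" pattern seen above; `parse_review_date` is a hypothetical helper, not part of the notebook.

```python
import re
from datetime import date, datetime

def parse_review_date(text: str) -> date:
    """Convert display text such as '21st February 2025' to a date."""
    # Drop the ordinal suffix (st, nd, rd, th) that directly follows the day number.
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", text.strip())
    return datetime.strptime(cleaned, "%d %B %Y").date()

print(parse_review_date("21st February 2025"))  # 2025-02-21
```

The lookbehind `(?<=\d)` keeps the pattern from touching month names such as "August", which also contain "st".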


In [191]:
# Collect the verification label ('Trip Verified' or 'Not Verified') shown on each review
verification = [a.text.strip() for a in soup.find_all("a", href="https://www.airlinequality.com/verified-reviews/")]

In [192]:
print(verification)

['Trip Verified', 'Trip Verified', 'Trip Verified', 'Trip Verified', 'Trip Verified', 'Trip Verified', 'Not Verified', 'Not Verified', 'Trip Verified', 'Trip Verified']


In [193]:
# Now that we have seen we can get the data we need, we can build a scraper that loops over the review pages (the count is set by `pages` below) to collect our sample

# Web Scraping
We will loop through the review pages on Skytrax (airlinequality.com) and collect the necessary data.

We may not use all of the collected data; however, it may prove useful for later analysis.

In [194]:
# Scraper

base_url = "https://www.airlinequality.com/airline-reviews/british-airways"
pages = 40
titles = []
ratings = []
names = []
dates = []
reviews = []

# let's loop through our pages
for i in range(1, pages+1):
    page_url = f"{base_url}/page/{i}/?sortby=post_date%3ADesc&pagesize=100"
    response = requests.get(page_url, timeout=10) # Make request
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html') # Parsing

    # Getting title data from the page and adding it to the list of titles
    for h2 in soup.find_all("h2", class_="text_header"):
        titles.append(h2.get_text())

    # Getting name data from the page and adding it to the name list.
    for span in soup.find_all("span", itemprop="name"):
        names.append(span.get_text())

    # Getting date data from the page and adding it to the list of dates.
    for time in soup.find_all("time", itemprop="datePublished"):
        dates.append(time.get_text())

    # Getting review data from the page and adding it to the list of reviews.
    for div in soup.find_all("div", itemprop="reviewBody"):
        reviews.append(div.get_text())

    print(f"<--- Reviews in page {i}: {len(reviews)}")

<--- Reviews in page 1: 100
<--- Reviews in page 2: 200
<--- Reviews in page 3: 300
<--- Reviews in page 4: 400
<--- Reviews in page 5: 500
<--- Reviews in page 6: 600
<--- Reviews in page 7: 700
<--- Reviews in page 8: 800
<--- Reviews in page 9: 900
<--- Reviews in page 10: 1000
<--- Reviews in page 11: 1100
<--- Reviews in page 12: 1200
<--- Reviews in page 13: 1300
<--- Reviews in page 14: 1400
<--- Reviews in page 15: 1500
<--- Reviews in page 16: 1600
<--- Reviews in page 17: 1700
<--- Reviews in page 18: 1800
<--- Reviews in page 19: 1900
<--- Reviews in page 20: 2000
<--- Reviews in page 21: 2100
<--- Reviews in page 22: 2200
<--- Reviews in page 23: 2300
<--- Reviews in page 24: 2400
<--- Reviews in page 25: 2500
<--- Reviews in page 26: 2600
<--- Reviews in page 27: 2700
<--- Reviews in page 28: 2800
<--- Reviews in page 29: 2900
<--- Reviews in page 30: 3000
<--- Reviews in page 31: 3100
<--- Reviews in page 32: 3200
<--- Reviews in page 33: 3300
<--- Reviews in page 34: 340
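The loop above requests a fixed number of pages. One possible refinement, sketched here under stated assumptions, is to build the query string with `urllib.parse.urlencode` and stop as soon as a page yields no reviews. The names `build_page_url`, `collect_reviews`, and `fake_fetch` are hypothetical; in the notebook, the fetch callable would wrap `requests.get` and the BeautifulSoup parsing. That the site returns an empty listing past the last page is also an assumption.

```python
from typing import Callable, List
from urllib.parse import urlencode

BASE_URL = "https://www.airlinequality.com/airline-reviews/british-airways"

def build_page_url(page: int, page_size: int = 100) -> str:
    """Build the listing URL for one page, newest reviews first."""
    query = urlencode({"sortby": "post_date:Desc", "pagesize": page_size})
    return f"{BASE_URL}/page/{page}/?{query}"

def collect_reviews(fetch_reviews: Callable[[str], List[str]], max_pages: int) -> List[str]:
    """Fetch pages in order and stop at the first page with no reviews."""
    reviews: List[str] = []
    for page in range(1, max_pages + 1):
        page_reviews = fetch_reviews(build_page_url(page))
        if not page_reviews:
            break  # ran past the last page, so stop requesting
        reviews.extend(page_reviews)
    return reviews

# Stand-in for the real network call: two pages of data, then nothing.
fake_pages = {1: ["review a", "review b"], 2: ["review c"]}

def fake_fetch(url: str) -> List[str]:
    page = int(url.split("/page/")[1].split("/")[0])
    return fake_pages.get(page, [])

print(build_page_url(1))
print(collect_reviews(fake_fetch, max_pages=40))
```

Passing the fetch function as a parameter keeps the paging logic testable without touching the network.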

In [195]:
# Let's build a DataFrame from the values we scraped from the website pages

import pandas as pd
data = pd.DataFrame()
#data["Title"] = titles
# The Title column is commented out because older reviews have no review titles.
data["Reviews"] = reviews

In [196]:
data

Unnamed: 0,Reviews
0,✅ Trip Verified | Prior to boarding a gate a...
1,✅ Trip Verified | I flew from Amsterdam to L...
2,"✅ Trip Verified | First the good news, the clu..."
3,✅ Trip Verified | I have never travelled wit...
4,"✅ Trip Verified | Terrible overall, medium ser..."
...,...
3912,Business LHR to BKK. 747-400. First try back w...
3913,LHR to HAM. Purser addresses all club passenge...
3914,My son who had worked for British Airways urge...
3915,London City-New York JFK via Shannon on A318 b...


# **Data Cleaning**

This is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset.

Scraped data often arrives in a format that is difficult to analyze or manipulate.

## **Text Cleaning**

### **Removing inconsistent data**
We remove the "✅ Trip Verified" and "Not Verified" labels because they do not appear in older reviews.

Additionally, this text would distort our analysis later on, for example by inflating the frequency of the word "verified".

In [197]:
# Let's split the reviews, removing the verification because older reviews lack this data

# creating a boolean mask for rows with "Verified" text
mask_verification = data["Reviews"].str.contains("Verified", na=False)

# We now remove the text that appears before the "|" in the rows with "Verified"
data.loc[mask_verification, "Reviews"] = (data.loc[mask_verification, "Reviews"].str.replace(r"^.*?\|", "", regex=True).str.strip())
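Instead of discarding the label, it could be kept as a separate field for later filtering. The following is a minimal sketch with the standard `re` module rather than the pandas call used in the notebook; `split_verification` and the sample strings are hypothetical and only illustrate the prefix format seen in the output above.

```python
import re
from typing import Optional, Tuple

# Optional symbols, then the label, then "|" separating it from the review body.
PREFIX = re.compile(r"^\W*(Trip Verified|Not Verified)\s*\|\s*", re.IGNORECASE)

def split_verification(review: str) -> Tuple[Optional[str], str]:
    """Return (label, body); label is None when the review has no prefix."""
    match = PREFIX.match(review)
    if match is None:
        return None, review.strip()
    return match.group(1), review[match.end():].strip()

print(split_verification("✅ Trip Verified | I flew from Amsterdam."))
print(split_verification("Business LHR to BKK."))
```

Anchoring the pattern on the two known labels avoids cutting a review that merely contains a "|" character somewhere in its text.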

In [198]:
data

Unnamed: 0,Reviews
0,Prior to boarding a gate agent seemed to pick ...
1,I flew from Amsterdam to Las Vegas with a layo...
2,"First the good news, the club suites are such ..."
3,I have never travelled with British airways be...
4,"Terrible overall, medium service and the fligh..."
...,...
3912,Business LHR to BKK. 747-400. First try back w...
3913,LHR to HAM. Purser addresses all club passenge...
3914,My son who had worked for British Airways urge...
3915,London City-New York JFK via Shannon on A318 b...


### **Converting Text to Lowercase**

Converting text to lowercase is a common preprocessing step in data cleaning and natural language processing.  

It transforms all text characters to their lowercase equivalents, ensuring uniformity and consistency within the dataset.

This is crucial because it treats words with the same spelling but different capitalization as identical, preventing them from being interpreted as distinct entities by algorithms and analytical tools.  

This standardization simplifies tasks like text comparison, search, and analysis, leading to more accurate and efficient results.  

For example, "Flight", "flight", and "FLIGHT" would all be treated as "flight" after lowercasing.
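To make the effect concrete, here is a small sketch that counts words before and after lowercasing. The word list is invented for illustration.

```python
from collections import Counter

words = ["Flight", "flight", "FLIGHT", "Crew"]

# Without normalization, each capitalization is counted separately.
raw_counts = Counter(words)

# After lowercasing, the three variants collapse into a single key.
lower_counts = Counter(word.lower() for word in words)

print(raw_counts["flight"])    # 1
print(lower_counts["flight"])  # 3
```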

In [199]:
data["Cleaned_Reviews"] = data["Reviews"].str.lower()

In [200]:
data

Unnamed: 0,Reviews,Cleaned_Reviews
0,Prior to boarding a gate agent seemed to pick ...,prior to boarding a gate agent seemed to pick ...
1,I flew from Amsterdam to Las Vegas with a layo...,i flew from amsterdam to las vegas with a layo...
2,"First the good news, the club suites are such ...","first the good news, the club suites are such ..."
3,I have never travelled with British airways be...,i have never travelled with british airways be...
4,"Terrible overall, medium service and the fligh...","terrible overall, medium service and the fligh..."
...,...,...
3912,Business LHR to BKK. 747-400. First try back w...,business lhr to bkk. 747-400. first try back w...
3913,LHR to HAM. Purser addresses all club passenge...,lhr to ham. purser addresses all club passenge...
3914,My son who had worked for British Airways urge...,my son who had worked for british airways urge...
3915,London City-New York JFK via Shannon on A318 b...,london city-new york jfk via shannon on a318 b...


### **Removing All Special Characters and Punctuation**

Removing special characters and punctuation from data is a preprocessing step that cleans and standardizes text. These characters, while sometimes meaningful, can often hinder analysis and confuse algorithms, particularly in natural language processing and machine learning.

Removing them creates a more uniform dataset, improving consistency, simplifying analysis, and optimizing the data for tasks like text mining, sentiment analysis, and model training.  

This ensures more accurate and reliable results by focusing on the core textual content.


In [201]:
data["Cleaned_Reviews"] = data["Cleaned_Reviews"].str.replace(r'[^\w\s]', "", regex=True)
data

Unnamed: 0,Reviews,Cleaned_Reviews
0,Prior to boarding a gate agent seemed to pick ...,prior to boarding a gate agent seemed to pick ...
1,I flew from Amsterdam to Las Vegas with a layo...,i flew from amsterdam to las vegas with a layo...
2,"First the good news, the club suites are such ...",first the good news the club suites are such a...
3,I have never travelled with British airways be...,i have never travelled with british airways be...
4,"Terrible overall, medium service and the fligh...",terrible overall medium service and the flight...
...,...,...
3912,Business LHR to BKK. 747-400. First try back w...,business lhr to bkk 747400 first try back with...
3913,LHR to HAM. Purser addresses all club passenge...,lhr to ham purser addresses all club passenger...
3914,My son who had worked for British Airways urge...,my son who had worked for british airways urge...
3915,London City-New York JFK via Shannon on A318 b...,london citynew york jfk via shannon on a318 bu...
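The output above shows a side effect of deleting punctuation outright: "747-400" becomes "747400" and "city-new" becomes "citynew". A possible alternative, sketched here with the standard `re` module, is to replace punctuation with a space and then collapse repeated whitespace. `strip_punctuation` is a hypothetical helper, not the notebook's code.

```python
import re

def strip_punctuation(text: str) -> str:
    """Replace punctuation with a space, then collapse repeated whitespace."""
    spaced = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", spaced).strip()

sample = "london city-new york jfk via shannon on a318"
print(re.sub(r"[^\w\s]", "", sample))  # london citynew york jfk via shannon on a318
print(strip_punctuation(sample))       # london city new york jfk via shannon on a318
```

The trade-off is that contractions are split as well ("wasn't" becomes "wasn t"), so which variant is preferable depends on the downstream analysis.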


## **Word Tokenization**
Word tokenization is the process of splitting a text string into individual words.

It’s a fundamental step in natural language processing (NLP) that allows further analysis such as sentiment analysis, text classification, and keyword extraction.

### *Why Word Tokenization Matters:*
◽ Prepares data for analysis: Converts raw text into manageable word units.

◽ Removes ambiguity: Handles punctuation and special characters.

◽ Facilitates feature extraction: Enables frequency counts, embeddings, and more.
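As a lightweight baseline, text that is already lowercased and free of punctuation can be tokenized by splitting on whitespace. This sketch is not the spaCy tokenizer used in the notebook, which may split some words further; `simple_tokenize` is a hypothetical helper.

```python
from typing import List

def simple_tokenize(text: str) -> List[str]:
    """Split already-cleaned text on runs of whitespace."""
    return text.split()

cleaned = "i flew from amsterdam to las vegas"
print(simple_tokenize(cleaned))
# ['i', 'flew', 'from', 'amsterdam', 'to', 'las', 'vegas']
```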

In [202]:
!pip install spacy
!python -m spacy download en_core_web_sm

Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [203]:
# spaCy is well suited to complex NLP tasks where detailed token information is required
import spacy
nlp = spacy.load("en_core_web_sm")

data["Cleaned_Reviews"] = data["Cleaned_Reviews"].apply(lambda x: [token.text for token in nlp(x)])
data

Unnamed: 0,Reviews,Cleaned_Reviews
0,Prior to boarding a gate agent seemed to pick ...,"[prior, to, boarding, a, gate, agent, seemed, ..."
1,I flew from Amsterdam to Las Vegas with a layo...,"[i, flew, from, amsterdam, to, las, vegas, wit..."
2,"First the good news, the club suites are such ...","[first, the, good, news, the, club, suites, ar..."
3,I have never travelled with British airways be...,"[i, have, never, travelled, with, british, air..."
4,"Terrible overall, medium service and the fligh...","[terrible, overall, medium, service, and, the,..."
...,...,...
3912,Business LHR to BKK. 747-400. First try back w...,"[business, lhr, to, bkk, 747400, first, try, b..."
3913,LHR to HAM. Purser addresses all club passenge...,"[lhr, to, ham, purser, addresses, all, club, p..."
3914,My son who had worked for British Airways urge...,"[my, son, who, had, worked, for, british, airw..."
3915,London City-New York JFK via Shannon on A318 b...,"[london, citynew, york, jfk, via, shannon, on,..."


In [204]:
print(data["Cleaned_Reviews"][3912])

['business', 'lhr', 'to', 'bkk', '747400', 'first', 'try', 'back', 'with', 'ba', 'in', 'about', '5', 'years', 'during', 'which', 'i', 'have', 'flown', 'many', 'other', 'airlines', 'and', 'been', 'impressed', 'with', 'most', 'the', 'only', 'things', 'that', 'impress', 'me', 'with', 'ba', 'are', 'the', 'staff', 'and', 'their', 'ability', 'to', 'pretend', 'this', 'is', 'a', 'quality', 'airline', 'and', 'the', 'fact', 'that', 'people', 'pay', 'ba', 'so', 'much', 'money', 'to', 'be', 'treated', 'like', 'cattle', 'the', 'food', 'is', 'without', 'doubt', 'some', 'of', 'the', 'worst', 'i', 'have', 'had', 'in', 'business', 'anywhere', 'poorly', 'presented', 'badly', 'prepared', 'and', 'barely', 'edible', 'the', 'avod', 'looks', 'like', 'it', 'was', 'installed', '10', 'years', 'ago', 'and', 'then', 'was', 'nt', 'the', 'top', 'of', 'the', 'range', 'screen', 'size', 'is', 'terrible', 'viewing', 'angles', 'and', 'resolution', 'poor', 'the', 'seats', 'work', 'well', 'as', 'a', 'lie', 'flat', 'config

## **Part-of-Speech (POS) Tagging**

This is the process of assigning grammatical categories (such as nouns, verbs, adjectives, and adverbs) to each word in a text.

It helps in understanding the structure and meaning of sentences by identifying how words function in context, enabling deeper language analysis for tasks like sentiment analysis, text summarization, and information extraction.

### *What the POS Tags Mean (Common spaCy POS Tags):*
NOUN: Noun (person, place, thing)

VERB: Verb (action, process)

ADJ: Adjective (describes nouns)

ADV: Adverb (modifies verbs/adjectives)

DET: Determiner (e.g., the, a)

AUX: Auxiliary (helping verb, e.g., is, was)

ADP: Adposition (prepositions like for, in)

PUNCT: Punctuation

PRON: Pronoun (e.g., he, she)

In [205]:
# Apply POS tagging: each review becomes a list of (token, POS tag) pairs
data["POS_Tagged_Reviews"] = data["Cleaned_Reviews"].apply(lambda x: [(token.text, token.pos_) for token in nlp(" ".join(x))])
data

Unnamed: 0,Reviews,Cleaned_Reviews,POS_Tagged_Reviews
0,Prior to boarding a gate agent seemed to pick ...,"[prior, to, boarding, a, gate, agent, seemed, ...","[(prior, ADV), (to, ADP), (boarding, VERB), (a..."
1,I flew from Amsterdam to Las Vegas with a layo...,"[i, flew, from, amsterdam, to, las, vegas, wit...","[(i, PRON), (flew, VERB), (from, ADP), (amster..."
2,"First the good news, the club suites are such ...","[first, the, good, news, the, club, suites, ar...","[(first, ADV), (the, DET), (good, ADJ), (news,..."
3,I have never travelled with British airways be...,"[i, have, never, travelled, with, british, air...","[(i, PRON), (have, AUX), (never, ADV), (travel..."
4,"Terrible overall, medium service and the fligh...","[terrible, overall, medium, service, and, the,...","[(terrible, ADJ), (overall, ADJ), (medium, ADJ..."
...,...,...,...
3912,Business LHR to BKK. 747-400. First try back w...,"[business, lhr, to, bkk, 747400, first, try, b...","[(business, NOUN), (lhr, PROPN), (to, PART), (..."
3913,LHR to HAM. Purser addresses all club passenge...,"[lhr, to, ham, purser, addresses, all, club, p...","[(lhr, PROPN), (to, ADP), (ham, NOUN), (purser..."
3914,My son who had worked for British Airways urge...,"[my, son, who, had, worked, for, british, airw...","[(my, PRON), (son, NOUN), (who, PRON), (had, A..."
3915,London City-New York JFK via Shannon on A318 b...,"[london, citynew, york, jfk, via, shannon, on,...","[(london, PROPN), (citynew, PROPN), (york, PRO..."


## **Stopword Removal**

This is the process of eliminating common words (such as "and," "the," "is," and "in") from text data.

These words typically carry little meaningful information and are removed to reduce noise, improve processing efficiency, and enhance the performance of natural language processing (NLP) tasks like text classification, sentiment analysis, and keyword extraction.
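The idea can be sketched with a plain set lookup. The stopword set below is a tiny hand-picked sample for illustration only; it is an assumption and much smaller than the built-in list spaCy uses in the notebook.

```python
from typing import List

# Tiny illustrative stopword set (not spaCy's list).
STOPWORDS = {"a", "and", "in", "is", "the", "to", "was"}

def remove_stopwords(tokens: List[str]) -> List[str]:
    """Keep only the tokens that are not in the stopword set."""
    return [token for token in tokens if token not in STOPWORDS]

tokens = ["the", "food", "is", "terrible", "and", "the", "flight", "was", "delayed"]
print(remove_stopwords(tokens))  # ['food', 'terrible', 'flight', 'delayed']
```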

In [206]:
# Removing stopwords and assigning the cleaned data to a new column
data["Reviews_No_Stopwords"] = data["Cleaned_Reviews"].apply(lambda x: " ".join([token.text for token in nlp(" ".join(x) if isinstance(x, list) else x) if not token.is_stop]))
data

Unnamed: 0,Reviews,Cleaned_Reviews,POS_Tagged_Reviews,Reviews_No_Stopwords
0,Prior to boarding a gate agent seemed to pick ...,"[prior, to, boarding, a, gate, agent, seemed, ...","[(prior, ADV), (to, ADP), (boarding, VERB), (a...",prior boarding gate agent pick elderly people ...
1,I flew from Amsterdam to Las Vegas with a layo...,"[i, flew, from, amsterdam, to, las, vegas, wit...","[(i, PRON), (flew, VERB), (from, ADP), (amster...",flew amsterdam las vegas layover heathrow nove...
2,"First the good news, the club suites are such ...","[first, the, good, news, the, club, suites, ar...","[(first, ADV), (the, DET), (good, ADJ), (news,...",good news club suites huge improvement old bus...
3,I have never travelled with British airways be...,"[i, have, never, travelled, with, british, air...","[(i, PRON), (have, AUX), (never, ADV), (travel...",travelled british airways time chose travel ba...
4,"Terrible overall, medium service and the fligh...","[terrible, overall, medium, service, and, the,...","[(terrible, ADJ), (overall, ADJ), (medium, ADJ...",terrible overall medium service flight delayed...
...,...,...,...,...
3912,Business LHR to BKK. 747-400. First try back w...,"[business, lhr, to, bkk, 747400, first, try, b...","[(business, NOUN), (lhr, PROPN), (to, PART), (...",business lhr bkk 747400 try ba 5 years flown a...
3913,LHR to HAM. Purser addresses all club passenge...,"[lhr, to, ham, purser, addresses, all, club, p...","[(lhr, PROPN), (to, ADP), (ham, NOUN), (purser...",lhr ham purser addresses club passengers board...
3914,My son who had worked for British Airways urge...,"[my, son, who, had, worked, for, british, airw...","[(my, PRON), (son, NOUN), (who, PRON), (had, A...",son worked british airways urged fly british a...
3915,London City-New York JFK via Shannon on A318 b...,"[london, citynew, york, jfk, via, shannon, on,...","[(london, PROPN), (citynew, PROPN), (york, PRO...",london citynew york jfk shannon a318 nice seat...


## **Lemmatization**

This is the process of converting words to their base or dictionary form, known as the **lemma**. Unlike stemming, lemmatization considers the context and part of speech of a word, ensuring that the transformed word is meaningful. For example, the words *"running"*, *"ran"*, and *"runs"* are all reduced to the lemma *"run"*.

This process is essential in natural language processing (NLP) tasks like text analysis, sentiment analysis, and search optimization, as it helps standardize words for better understanding and comparison.
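The contrast with stemming can be shown with a toy example. Both functions below are deliberately simplistic stand-ins: `naive_stem` strips a few suffixes, and `toy_lemmatize` uses a three-entry lookup table in place of a real lemmatizer's dictionary. Neither reflects how spaCy works internally.

```python
# Toy lookup table standing in for a real lemmatizer's dictionary (illustrative only).
LEMMAS = {"running": "run", "ran": "run", "runs": "run"}

def naive_stem(word: str) -> str:
    """Crude suffix stripping, shown only for contrast with lemmatization."""
    for suffix in ("ning", "ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word

def toy_lemmatize(word: str) -> str:
    """Look the word up in the table; fall back to the word itself."""
    return LEMMAS.get(word, word)

for word in ("running", "ran", "runs"):
    print(word, naive_stem(word), toy_lemmatize(word))
```

Suffix stripping leaves the irregular form "ran" unchanged, while the dictionary lookup maps all three forms to "run".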

**Explanation**

For this project we will lemmatize the Reviews_No_Stopwords column.

The alternative would be to lemmatize POS_Tagged_Reviews. Here is how the two options compare:

For most practical NLP tasks (like sentiment analysis, word clouds, or topic modeling):

Lemmatize Reviews_No_Stopwords (Recommended).

For tasks needing precise grammatical understanding (like syntactic parsing or deep linguistic analysis):

Lemmatize POS_Tagged_Reviews with POS context.


In [None]:
# lemmatization of Reviews_No_Stopwords
data["Lemmatized_Reviews"] = data["Reviews_No_Stopwords"].apply(lambda x: " ".join([token.lemma_ for token in nlp(x)]))
data

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-176-42097b4cfc8f>", line 2, in <cell line: 0>
    data["Lemmatized_Reviews"] = data["Reviews_No_Stopwords"].apply(lambda x: " ".join([token.lemma_ for token in nlp(x)]))
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/pandas/core/series.py", line 4924, in apply
    ).apply()
      ^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/pandas/core/apply.py", line 1427, in apply
    return self.apply_standard()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/pandas/core/apply.py", line 1507, in apply_standard
    mapped = obj._map_values(
             ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/pan