# Sentiment Analysis of Tweets on Apple and Google Products

![](NLP_Image.jpg)

##  1. BUSINESS OVERVIEW  

###  1.1 **Business Understanding**

#### 1.1.1 **What is sentiment analysis?**

**Sentiment analysis**, also known as **sentiment classification**, uses natural language processing to identify the emotional tone behind a piece of text, such as customer feedback, and to categorize it as positive, negative, or neutral.

It is a text classification task: given a phrase or a list of phrases, a classifier decides whether the sentiment behind it is:
- positive
- negative 
- neutral. 

In some cases the neutral class is dropped to keep the task a binary classification problem.

In this project we carry out a sentiment classification task: we analyze tweets, the emotion they express, and whether that emotion is directed at a brand or product, specifically Apple and Google products.

#### 1.1.2 **How is sentiment analysis important to an organization?**

Sentiment analysis helps an organization in the following ways:

- It reveals customer opinions, so that experiences can be improved and concerns addressed proactively.
- It helps to predict customer behavior toward a particular product.
- It can help to test the adaptability of a product.
- It automates the preparation of customer preference reports.

These are only a few of the benefits. In general, sentiment analysis also helps business stakeholders define the business problems surrounding their products.

#### 1.1.3 **An Overview of Apple Products**

Apple is one of the most recognized brands in the world, valued at over $2 trillion in 2021. It is known for its innovative consumer electronics, including the iPhone, iPad, MacBook, and other devices. Apple’s next products may include a virtual reality headset and a self-driving car. The past few product launches have been smaller in scope, like the HomePod and AirPods, and Apple fans are clamouring for the next iPhone. Given this wide range of products, below are some statistics on their performance in 2023, as of September:

- 231 million iPhones, 49 million iPads and 22 million Mac and MacBook units were sold in 2023
- Apple’s home and wearables division declined by 6.5% in 2023
- It sold 75 million AirPods and 38 million Apple Watches in 2023
- Apple Music has 93 million subscribers, Apple TV+ has 47 million

To sustain high revenues, Apple needs to continuously analyze the sentiment around its brand: users have strong emotional reactions to it, which frequently show up as a mix of positive and negative sentiment in the data. Tweets are a useful source of such data; tweets about Apple products can cover, among other things, customer service experiences, software upgrades, or the introduction of new items.


#### 1.1.4 **An Overview of Google Products**

Google offers diverse products designed to enhance productivity, connectivity, and innovation. Key offerings include:
- Google Search
- Gmail
- Google Drive
- Google Workspace for organizing and collaborating
- YouTube
- Google Photos
- Google Play for entertainment
- Google Maps, Waze, and Google Earth for navigation
- Google Ads, Google Analytics, and Google Cloud Platform for businesses
- Firebase and BigQuery for developers

Additionally, smart devices like Pixel phones, Nest home products, and Chromecast provide cutting-edge hardware solutions. Overall, Google's products aim to simplify daily life, empower businesses, and connect the world. Statistics on Google's performance in 2021 showed a revenue of $278.1 billion. As with Apple, sentiment analysis is crucial for Google to keep thriving, for example by tracking customer satisfaction.

#### 1.1.5 **Why Analyze Tweets?**
- **Social Media Influence**: Platforms like Twitter have become primary channels where customers share their feedback, both positive and negative, about brands and products.
- **Volume of Data**: The massive and real-time nature of tweets makes manual analysis impractical, necessitating automated solutions.
- **Business Impact**: Sentiment analysis of tweets can provide actionable insights to enhance customer experience, refine marketing strategies, and maintain a competitive edge.


#### 1.1.6 **Stakeholders**

Sentiment analysis is important to various stakeholders, such as:

- **Business Managers**: Understand customer satisfaction and drive decision-making.
- **Marketing Teams**: Create sentiment-driven marketing campaigns.
- **Customer Service Teams**: Prioritize resolving issues flagged in negative reviews.

#### 1.1.7 **Challenges in Sentiment Analysis**
- **Unstructured Data**: Tweets are often informal, with abbreviations, slang, and emojis, making preprocessing essential.
- **Ambiguity**: Some texts may have mixed sentiments or implicit emotions that are challenging to classify.
- **Scalability**: Handling and processing large datasets efficiently is a significant challenge.

#### 1.1.8 **Proposed Solutions**
##### Approach Methodology:

##### 1. Execution plan for the sentiment analysis:
- Begin with simple approaches like bag-of-words or TF-IDF vectorization.
- Proceed to more complex methods (e.g., word embeddings or transformers).

##### 2. Pre-trained Tools:
* NLP has many libraries and pre-trained models (e.g., spaCy, NLTK, Hugging Face Transformers) for quick text processing.

For example, use:
- TF-IDF + Logistic Regression for a baseline.
- Pre-trained embeddings (e.g., Word2Vec, GloVe) for better results.
- Fine-tuned BERT if there is access to good hardware.
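
A minimal sketch of the TF-IDF + Logistic Regression baseline mentioned above, assuming scikit-learn is installed. The four tweets and their labels are made up purely to show the shape of the pipeline; the variable names and the `ngram_range` and `max_iter` settings are illustrative choices, not the project's final configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up tweets used only to illustrate the pipeline
train_texts = [
    "love my new ipad",
    "the iphone battery is awful",
    "great google maps update",
    "android app keeps crashing",
]
train_labels = ["positive", "negative", "positive", "negative"]

# TF-IDF turns each tweet into a sparse, weighted bag-of-words vector;
# logistic regression then learns one weight per term and class.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)

# Predict the sentiment of two unseen (also made-up) tweets
predictions = baseline.predict(["love the update", "battery is awful"])
print(list(predictions))
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary fitted on the training data only, which avoids leaking test-set terms into the features.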

#### 1.1.9 Projected Conclusion

###  1.2 **Problem statement**

#### 1.2.1  Business Problem:
- In today’s digital world, customer feedback plays a critical role in shaping business decisions. Companies receive large volumes of unstructured textual data in the form of reviews, surveys, and social media posts. Analyzing this data manually is time-consuming and error-prone.

The goal of this project is to build a sentiment analysis model that classifies customer feedback as positive, negative, or neutral. 

This will enable businesses to:
- Identify key areas for improvement.
- Tailor marketing strategies based on customer sentiment.
- Monitor brand reputation over time.

###  1.3 **Objectives**

#### 1. **Primary Objective**:
   - Build a machine learning-based sentiment classification model that categorizes tweets as **positive**, **negative**, or **neutral** towards a brand or product.
   
#### 2. **Secondary Objectives**:
   - Identify whether a tweet contains an emotion directed at a specific brand or product.
   - Preprocess and clean the tweet text to remove noise (e.g., hashtags, mentions, and URLs).
   - Extract key textual features that indicate sentiment and brand-related emotions.
   - Provide actionable insights to help businesses improve customer satisfaction and marketing strategies.   
   
###  1.3.1 **Key Questions to Address**
1. How can we preprocess and clean textual data effectively to extract meaningful insights?
2. What are the best features to use (e.g., word embeddings, TF-IDF, or sentiment lexicons) for classifying tweet sentiment?
3. Which supervised learning models (e.g., Logistic Regression, Random Forest, or BERT) perform best for this task?
4. What level of accuracy, precision, and recall can we achieve for sentiment classification?

###  1.4 **Metrics of Success**

To evaluate the performance of the sentiment classification model, we will use accuracy, precision, recall (sensitivity), F1 score, and the confusion matrix, with the following targets:

#### 1. **Accuracy**:
   - Accuracy is the percentage of correctly classified tweets out of all the tweets classified.
   - Target: **85% or higher**.

#### 2. **Precision**:
   - Precision is the percentage of predictions for a class that are actually correct; for example, it tells us how often the model is right when it predicts a positive sentiment. Precision penalizes false positives, whereas recall penalizes false negatives, so the two are read together to judge the trade-off.
   - Target: **80% or higher** for each class (positive, negative, neutral).

#### 3. **Recall**:
   - Measure the model’s ability to correctly identify all relevant examples of a specific sentiment.
   - Target: **75% or higher** for each class.

#### 4. **F1-Score**:
   - Provide a balanced metric that considers both precision and recall.
   - Target: **80% or higher** overall.

#### 5. **Business Impact**:
   - Improved customer satisfaction through the identification of key negative sentiments.
   - Better marketing strategies based on trends in positive feedback.
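
The accuracy, precision, recall and F1 definitions above can be written out in a few lines of plain Python. This is a sketch for illustration only: the function name `per_class_metrics` and the six labels are made up, and in practice `sklearn.metrics.classification_report` (imported later in this notebook) reports the same quantities.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Return accuracy and {label: (precision, recall, f1)} for the labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed an example of the true class
    accuracy = sum(tp.values()) / len(y_true)
    report = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = (precision, recall, f1)
    return accuracy, report

# Made-up labels, for illustration only
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
accuracy, report = per_class_metrics(y_true, y_pred)

# Compare against the accuracy target stated above
meets_accuracy_target = accuracy >= 0.85
print(round(accuracy, 3), meets_accuracy_target)
```

Reporting the metrics per class matters here because the targets above are set per class, and a model can reach high accuracy while still missing most of the negative tweets.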


### 2. DATA UNDERSTANDING
* Now we load the data and examine its shape, basic statistics, and variable types.
* We write helper functions that load the data and report its shape, info, description, and missing values using df.shape, df.info(), df.describe() and df.isnull().sum().
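
The helpers themselves live in the `Project_Functions` module, which is not shown in this notebook. As a rough sketch of what such a helper might look like (the name `summarize_dataframe` and the tiny sample frame are hypothetical, not the module's actual code):

```python
import pandas as pd

def summarize_dataframe(df):
    """Collect the basic facts checked in this section into one dictionary."""
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isnull().sum().to_dict(),
        "duplicates": int(df.duplicated().sum()),
    }

# Tiny made-up frame standing in for the tweet dataset
sample = pd.DataFrame({
    "tweet_text": ["love my ipad", None, "love my ipad"],
    "label": ["Positive emotion", "Negative emotion", "Positive emotion"],
})
summary = summarize_dataframe(sample)
print(summary)
```

Returning a dictionary rather than printing makes the summary easy to compare before and after each cleaning step.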

#### 2.1 Import Necessary Libraries 

In [209]:
# Importing necessary libraries
import pandas as pd
import numpy as np
import re
import Project_Functions as Pf
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

# Download NLTK resources
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\user\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

#### 2.2 General  Dataset Exploration

In [210]:
# Load and Display the first few rows of the dataset
df = Pf.Load_dataset('judge-1377884607_tweet_product_company.csv')
df

| | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion |
| ... | ... | ... | ... |
| 9088 | Ipad everywhere. #SXSW {link} | iPad | Positive emotion |
| 9089 | Wave, buzz... RT @mention We interrupt your re... | | No emotion toward brand or product |
| 9090 | Google's Zeiger, a physician never reported po... | | No emotion toward brand or product |
| 9091 | Some Verizon iPhone customers complained their... | | No emotion toward brand or product |


In [211]:
# Show the data information
Pf.check_Info(df)

(9093, 3)
Index(['tweet_text', 'emotion_in_tweet_is_directed_at',
       'is_there_an_emotion_directed_at_a_brand_or_product'],
      dtype='object')
tweet_text                                            object
emotion_in_tweet_is_directed_at                       object
is_there_an_emotion_directed_at_a_brand_or_product    object
dtype: object
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9093 entries, 0 to 9092
Data columns (total 3 columns):
 #   Column                                              Non-Null Count  Dtype 
---  ------                                              --------------  ----- 
 0   tweet_text                                          9092 non-null   object
 1   emotion_in_tweet_is_directed_at                     3291 non-null   object
 2   is_there_an_emotion_directed_at_a_brand_or_product  9093 non-null   object
dtypes: object(3)
memory usage: 213.2+ KB
None
tweet_text                                               1
emotion_in_tweet_is_directed_at          

* The dataset comprises 9093 rows and 3 columns: 'tweet_text', 'emotion_in_tweet_is_directed_at', and 'is_there_an_emotion_directed_at_a_brand_or_product'.
* The 'tweet_text' column contains the text of the tweet. The 'emotion_in_tweet_is_directed_at' column shows the brand or product (for example iPhone, iPad, or Google) that the emotion in the tweet is directed at. The last column records whether the tweet expresses a positive emotion, a negative emotion, or no emotion toward the brand or product.
* All the columns are of the object data type.
* There are 5802 missing entries in the 'emotion_in_tweet_is_directed_at' column and one missing entry in the 'tweet_text' column.
* There are 22 duplicated entries.

### 3. DATA PREPARATION
The data understanding section above checked for null values and duplicates to gain surface-level insights. This section prepares the data by transforming it into a format suitable for modelling.
But first we need to do a bit of data cleaning.

####  3.1  Data Cleaning
1. Deal with the missing values in the tweet_text and emotion_in_tweet_is_directed_at columns
2. Deal with duplicates
3. Normalize the text case
4. Further cleaning and transformation: removing specific words and numbers from the text.

#####  3.1.1  Duplicates
* Drop the duplicated rows.

Rationale: there are 22 duplicated rows. We remove them to maintain the integrity of the dataset and to ensure that only unique observations are considered.

In [212]:
df = df.drop_duplicates()

#####  3.1.2  Missing Values
* Drop the row with the missing value in the tweet_text column using the pandas dropna() method.
* The 'emotion_in_tweet_is_directed_at' column needs a closer look because of its contribution to the final model: more than 50% of its values are missing. Rather than simply dropping the column or keeping it as it is, try to recover sensible values for the missing entries.
* Loop through the texts in the tweet_text column and look for a probable brand or product. Store these entries in a new column. To do this, first clean the tweet_text column and create a column for the cleaned text, then extract the possible entries.

In [213]:
# Drop the null value in tweet_text column.
df = df.dropna(subset = ['tweet_text'])

In [214]:
Pf.check_for_missing_values(df)

tweet_text                                               0
emotion_in_tweet_is_directed_at                       5788
is_there_an_emotion_directed_at_a_brand_or_product       0
dtype: int64


In [215]:
df['emotion_in_tweet_is_directed_at'].isna().value_counts(normalize = True)

True     0.638148
False    0.361852
Name: emotion_in_tweet_is_directed_at, dtype: float64

#####  3.1.2.1  Dealing with the text column
1. Basic cleaning: converting to lower case and removing special characters such as ? , ; .
2. Tokenizing the text column
3. Creating a new column with the joined words
4. Removing the stopwords

* First extract the words in the tweet_text column that start with '@' and those that start with '#'. Words starting with @ are mentions of other Twitter users, while words starting with # are hashtags. These words add noise to our dataset.
* Extract the mentions and the hashtags into separate columns (Usernames and Tagged_Names).

In [216]:

import re
# Regular expressions matching @mentions and #hashtags
pattern = r"@\w+"
pattern_2 = r'#\w+'

# Extract the @mentions and #hashtags from the 'tweet_text' column
df['Usernames'] = df['tweet_text'].apply(lambda x: re.findall(pattern, x))
df['Tagged_Names'] = df['tweet_text'].apply(lambda x: re.findall(pattern_2, x))

df.head()

| | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product | Usernames | Tagged_Names |
|---|---|---|---|---|---|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion | [@wesley83] | [#RISE_Austin, #SXSW] |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion | [@jessedee, @fludapp] | [#SXSW] |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion | [@swonderlin] | [#iPad, #SXSW] |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion | [@sxsw] | [#sxsw] |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion | [@sxtxstate] | [#SXSW] |


#####  3.1.2.2  Converting the tweet_text to lower case
* Convert the tweets in the tweet_text column to lower case.
* Remove the usernames and hashtags from the text and store the result in a new column, 'clean_tweet_text'.

In [235]:
# Transform the whole dataset (df[tweet_text]) to lowercase
df["tweet_text"] = df["tweet_text"].str.lower()
# Display full text: uncomment the code below to display the whole texts
#df.style.set_properties(**{'text-align': 'left'})

In [237]:
# Create a function that removes @mentions and #hashtags
def remove_words_with_at(text):
    # Remove any whitespace-delimited word containing "@" and any "#hashtag"
    cleaned_text = re.sub(r'\S*@\S*|#\w+', '', text)
    return cleaned_text

# Create a new column with the cleaned text
df["clean_tweet_text"] = df["tweet_text"].apply(remove_words_with_at)
# Display full text: uncomment the line below to display the whole texts
#df.style.set_properties(**{'text-align': 'left'})
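
To see what this regular expression does to a single tweet, here is a small standalone sketch. The function name `strip_mentions_and_hashtags` and the sample tweet are made up for illustration; the extra whitespace-collapsing step is an assumption added here and is not part of the notebook's own function.

```python
import re

def strip_mentions_and_hashtags(text):
    """Drop @mentions and #hashtags, then collapse the leftover whitespace."""
    without_tags = re.sub(r"\S*@\S*|#\w+", "", text)
    return re.sub(r"\s+", " ", without_tags).strip()

sample = "@jessedee know about @fludapp ? awesome app at #sxsw"
cleaned = strip_mentions_and_hashtags(sample)
print(cleaned)  # know about ? awesome app at
```

Collapsing the whitespace is optional, but it removes the double spaces that are otherwise left behind where a mention or hashtag used to be.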

* Now loop through the clean_tweet_text column and look for a probable brand or product in each tweet. Store the matches in a new column, Tweet_Directed_at.

In [219]:
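# NOTE: `categories` (the list of brand/product keywords) must already be
# defined in an earlier cell; its definition is not shown here.
def extract_category_words(tweet, categories):
    # Collect every category keyword that occurs in the tweet
    # (case-insensitive substring match)
    extracted_words = []
    for category in categories:
        if category.lower() in tweet.lower():
            extracted_words.append(category)
    return " ".join(extracted_words)
df['Tweet_Directed_at'] = df['clean_tweet_text'].apply(lambda x: extract_category_words(x, categories))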
# Check for the value counts in the new category column
df['Tweet_Directed_at'].value_counts()

Google                                   2156
                                         1781
iPad iPad                                1713
Apple                                    1191
iPhone                                    832
iPad Apple iPad                           568
Android                                   240
iPhone App iPhone                         208
iPhone Android                            101
iPad iPad iPhone                           98
Android App Android                        30
Apple Google                               23
iPad iPad Android                          23
Apple iPhone                               22
iPhone App iPhone Android                  17
Google Android                             15
iPad iPad Google                           10
iPad iPad iPhone Android                    8
Apple Android                               7
iPhone Android App Android                  6
iPad iPad iPhone App iPhone                 5
iPhone App iPhone Android App Andr
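
Repeated labels such as `iPad iPad` in the counts above suggest that the keyword list contained repeated or overlapping entries. A possible variant that removes repeats before matching is sketched below; the function name `extract_brand_keywords` and the keyword list are hypothetical, since the notebook's actual `categories` list is not shown.

```python
def extract_brand_keywords(tweet, keywords):
    """Return the distinct keywords found in the tweet, in keyword-list order."""
    # dict.fromkeys removes repeated keywords while preserving their order
    unique_keywords = list(dict.fromkeys(keywords))
    lowered = tweet.lower()
    return [word for word in unique_keywords if word.lower() in lowered]

# Hypothetical keyword list with a deliberate repeat ("iPad")
keywords = ["iPad", "Apple", "iPad", "Google", "iPhone", "Android"]
found = extract_brand_keywords("awesome ipad/iphone app at the apple store", keywords)
print(" ".join(found))  # iPad Apple iPhone
```

With distinct labels, far fewer combinations would need to be listed by hand when the labels are later mapped to a company.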

In [220]:
# Rearranging the dataframe.
df= df[['tweet_text', 'clean_tweet_text','emotion_in_tweet_is_directed_at','Tweet_Directed_at',
       'is_there_an_emotion_directed_at_a_brand_or_product']]
df.head(2)

| | tweet_text | clean_tweet_text | emotion_in_tweet_is_directed_at | Tweet_Directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|---|---|
| 0 | .@wesley83 i have a 3g iphone. after 3 hrs twe... | i have a 3g iphone. after 3 hrs tweeting at ,... | iPhone | iPhone | Negative emotion |
| 1 | @jessedee know about @fludapp ? awesome ipad/i... | know about ? awesome ipad/iphone app that yo... | iPad or iPhone App | iPad iPad iPhone App iPhone | Positive emotion |


* From the value counts above, the Tweet_Directed_at column has 1781 blank entries (empty strings).
* For the rows where Tweet_Directed_at is blank, check whether the 'emotion_in_tweet_is_directed_at' column has a value. If it does, use it to fill in the blank entry.
* Check the value counts of the Tweet_Directed_at column again to verify that the number of blank entries has decreased.
* Proceed to drop the rows that are still blank.

In [221]:
# Rows where emotion_in_tweet_is_directed_at has data but Tweet_Directed_at is blank
condition_a_data_b_blank = (df['emotion_in_tweet_is_directed_at'].notna() & (df['emotion_in_tweet_is_directed_at'] != "") & 
                            (df['Tweet_Directed_at'].isna() | (df['Tweet_Directed_at'] == "")))

# Fill the blank 'Tweet_Directed_at' entries with the values from 'emotion_in_tweet_is_directed_at'
df.loc[condition_a_data_b_blank, 'Tweet_Directed_at'] = df.loc[condition_a_data_b_blank, 'emotion_in_tweet_is_directed_at']
# Uncomment the line below to show the value counts
# df['Tweet_Directed_at'].value_counts()

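* Create a dictionary that maps each Tweet_Directed_at value to one of four groups: Google products, Apple products, Unknown (no brand found), and IRR (tweets that mention both companies). Apply this mapping to create a column showing which company the tweet was directed at.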

In [222]:
# Define mapping for fewer categories
category_mapping = {
    'Google': 'Google Products',
    '': 'Unknown',
    'iPad iPad': 'Apple Products',
    'Apple': 'Apple Products',
    'iPhone': 'Apple Products',
    'iPad Apple iPad': 'Apple Products',
    'Android': 'Google Products',
    'iPhone App iPhone': 'Apple Products',
    'iPad iPad iPhone': 'Apple Products',
    'Android App Android': 'Google Products',
    'Apple Google': 'IRR',
    'iPad iPad Android': 'IRR',
    'Apple iPhone': 'Apple Products',
    'iPhone App iPhone Android': 'IRR',
    'Google Android': 'Google Products',
    'iPad iPad Google': 'IRR',
    'iPad iPad iPhone Android': 'IRR',
    'iPad iPad iPhone App iPhone': 'Apple Products',
    'iPhone App iPhone Android App Android': 'IRR',
    'iPad Apple iPad iPhone': 'Apple Products',
    'iPad iPad Google iPhone': 'IRR',
    'Apple iPhone App iPhone': 'Apple Products',
    'iPad': 'Apple Products',
    'iPad or iPhone App': 'Apple Products',
    'Android App': 'Google Products',
    'Other Google product or service': 'Google Products',
    'Other Apple product or service ': 'Apple Products'  
}

# Apply mapping to the dataframe
df['Company_Product'] = df['Tweet_Directed_at'].map(category_mapping)
df.head()
# Display the result
df['Company_Product'].value_counts(normalize = True)

Apple Products     0.547550
Google Products    0.282949
Unknown            0.159767
IRR                0.009734
Name: Company_Product, dtype: float64

In [223]:
# checking for any null values in the Company_product column
df['Company_Product'].isna().sum()

132

* The 132 null values in the Company_Product column correspond to Tweet_Directed_at values that have no key in the mapping dictionary, so they could not be assigned to a company. Proceed to drop these rows.
* The 'emotion_in_tweet_is_directed_at' column can then be dropped, because Company_Product is a better representation of the same information. The missing values have now been dealt with. Next, prepare the clean_tweet_text column for analysis.
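
One way to avoid unmapped rows altogether is to derive the company from keywords instead of listing every label combination by hand. The sketch below is an alternative to the dictionary approach, not the method used in this notebook; the function name `company_from_label` and the two keyword tuples are assumptions chosen to mirror the groups in the mapping above.

```python
APPLE_KEYWORDS = ("apple", "ipad", "iphone")
GOOGLE_KEYWORDS = ("google", "android")

def company_from_label(label):
    """Map a brand/product label to a company group by keyword lookup."""
    text = label.lower() if isinstance(label, str) else ""
    mentions_apple = any(word in text for word in APPLE_KEYWORDS)
    mentions_google = any(word in text for word in GOOGLE_KEYWORDS)
    if mentions_apple and mentions_google:
        return "IRR"  # both companies mentioned
    if mentions_apple:
        return "Apple Products"
    if mentions_google:
        return "Google Products"
    return "Unknown"  # no brand keyword found (including missing values)

examples = ["iPad Apple iPad", "Google Android", "Apple Google", "",
            "Other Apple product or service"]
results = [company_from_label(label) for label in examples]
print(results)
```

Such a function could be applied with `df['Tweet_Directed_at'].apply(company_from_label)`; because every input falls into one of the four groups, no null values are produced.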

In [224]:
df = df.dropna(subset = ['Company_Product'])

In [251]:
# Drop the rows labelled 'Unknown' or 'IRR'
df = df[(df['Company_Product'] != 'Unknown') & (df['Company_Product'] != 'IRR')]
df = df.reset_index(drop=True)
# Confirm the remaining missing values
Pf.check_for_missing_values(df)

tweet_text                                               0
clean_tweet_text                                         0
emotion_in_tweet_is_directed_at                       4216
Tweet_Directed_at                                        0
is_there_an_emotion_directed_at_a_brand_or_product       0
Company_Product                                          0
text_tokenized                                           0
dtype: int64


In [252]:
df['Company_Product'].isna().sum()

0

####  3.2 Data Cleaning & EDA with NLTK
1. Remove URLs
2. Remove non-alphanumeric characters
3. Remove numbers/digits

#####  3.2.1
* In this section we work on the clean_tweet_text column and apply further cleaning: removal of special characters, URLs, and numbers/digits.
* Create a function that performs all of these removals.
* Test the function on a single tweet first, then apply it to the whole dataframe.

In [302]:
sentence = df['clean_tweet_text'][377]
sentence

'before it even begins,  wins  {link}  '

In [298]:
# Clean text (remove unwanted characters and convert to lowercase)
def clean_text(text):
    if not isinstance(text, str):
        return ''
    text = re.sub(r'@[A-Za-z0-9_]+', '', text)  # Remove mentions (@user)
    text = re.sub(r'http\S+', '', text)  # Remove URLs
    text = re.sub(r'[^a-zA-Z\s]', '', text)  # Remove non-alphabetic characters
    text = text.lower()  # Convert text to lowercase
    return text

#df['cleaned_tweet'] = df['tweet_text'].apply(clean_text)
cleaned_sentence = clean_text(sentence)
cleaned_sentence

'the apple   has taken  and  by storm link   excited to a be a part '

In [283]:
# Import the regexptokenizer
from nltk.tokenize import RegexpTokenizer

basic_token_pattern = r"(?u)\b\w\w+\b"

tokenizer = RegexpTokenizer(basic_token_pattern)
tokenizer.tokenize(sentence)

['gotta',
 'love',
 'this',
 'google',
 'calendar',
 'featuring',
 'top',
 'parties',
 'show',
 'cases',
 'to',
 'check',
 'out',
 'rt',
 'via',
 'gt',
 'http',
 'bit',
 'ly',
 'axzwxb']

* Notice that the RegexpTokenizer splits the text into a list of words and drops punctuation and single-character tokens. URLs are not removed as a whole; they are split into fragments such as 'http', 'bit' and 'ly'. The tokenizer also keeps stopwords (filler words) and numbers of two or more digits.
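
The stopwords and numbers that survive tokenization can be filtered out in a second pass. The sketch below uses Python's `re.findall` with the same token pattern, which gives the same kind of split, and a tiny hand-written stopword set as a stand-in for NLTK's much longer English list; the names `TOKEN_PATTERN`, `STOPWORDS` and `tokenize_and_filter` and the sample sentence are made up for illustration.

```python
import re

TOKEN_PATTERN = r"(?u)\b\w\w+\b"  # same pattern as above: words of 2+ characters
# Tiny illustrative stopword set; NLTK's stopwords.words('english') is far longer
STOPWORDS = {"the", "to", "at", "this", "out", "and", "is", "of"}

def tokenize_and_filter(text):
    """Lowercase, tokenize, then drop stopwords and purely numeric tokens."""
    tokens = re.findall(TOKEN_PATTERN, text.lower())
    return [t for t in tokens if t not in STOPWORDS and not t.isdigit()]

sample = "Check out this Google calendar at 10 http://bit.ly/axzwxb"
tokens = tokenize_and_filter(sample)
print(tokens)
# ['check', 'google', 'calendar', 'http', 'bit', 'ly', 'axzwxb']
```

The URL fragments are still present in the result, which is why URLs are best removed before tokenizing rather than after.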

In [244]:
# Create new column with tokenized data
df["text_tokenized"] = df["clean_tweet_text"].apply(tokenizer.tokenize)
# Display full text
#df.style.set_properties(**{'text-align': 'left'})
df.head(3)

| | tweet_text | clean_tweet_text | emotion_in_tweet_is_directed_at | Tweet_Directed_at | is_there_an_emotion_directed_at_a_brand_or_product | Company_Product | text_tokenized |
|---|---|---|---|---|---|---|---|
| 0 | .@wesley83 i have a 3g iphone. after 3 hrs twe... | i have a 3g iphone. after 3 hrs tweeting at ,... | iPhone | iPhone | Negative emotion | Apple Products | [have, 3g, iphone, after, hrs, tweeting, at, i... |
| 1 | @jessedee know about @fludapp ? awesome ipad/i... | know about ? awesome ipad/iphone app that yo... | iPad or iPhone App | iPad iPad iPhone App iPhone | Positive emotion | Apple Products | [know, about, awesome, ipad, iphone, app, that... |
| 2 | @swonderlin can not wait for #ipad 2 also. the... | can not wait for 2 also. they should sale th... | iPad | iPad | Positive emotion | Apple Products | [can, not, wait, for, also, they, should, sale... |


In [None]:
# Function to remove numerical values using regex
def remove_numbers(text):
    return re.sub(r'\d+','', text)
# Check if the function works with our sentence
clean_sentence = remove_numbers(sentence)
clean_sentence