# Social Media Sentiments Analysis

## Table of Contents  <a id='back'></a> 
- [Project Introduction](#project-introduction)
    - [Analysis Outline](#analysis-outline)
    - [Results](#results)
- [Importing Libraries and Opening Data Files](#importing-libraries-and-opening-data-files)
- [Pre-Processing Data](#pre-processing-data)
    - [Header Style](#header-style)
    - [Duplicates](#duplicates)
    - [Missing Values](#missing-values)
    - [Data Usage and Formatting](#data-usage-and-formatting)
    - [Data Wrangling](#data-wrangling)
- [Data Analysis](#data-analysis)
- [Conclusion](#conclusion)

<a name='headers'></a>

## Project Introduction


### Analysis Outline


### Results

## Importing Libraries and Opening Data Files

In [1]:
# Importing the needed libraries for this assignment
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns

In [2]:
# Loading the dataset, falling back to the hosted path if the local file is missing
try:
    sm = pd.read_csv('sentimentdataset.csv')
except FileNotFoundError:
    sm = pd.read_csv('/datasets/sentimentdataset.csv')

[Back to Table of Contents](#back)

## Pre-Processing Data

### Header Style

In [3]:
# Getting general information about the dataset
sm.info()
sm.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 732 entries, 0 to 731
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Unnamed: 0.1  732 non-null    int64  
 1   Unnamed: 0    732 non-null    int64  
 2   Text          732 non-null    object 
 3   Sentiment     732 non-null    object 
 4   Timestamp     732 non-null    object 
 5   User          732 non-null    object 
 6   Platform      732 non-null    object 
 7   Hashtags      732 non-null    object 
 8   Retweets      732 non-null    float64
 9   Likes         732 non-null    float64
 10  Country       732 non-null    object 
 11  Year          732 non-null    int64  
 12  Month         732 non-null    int64  
 13  Day           732 non-null    int64  
 14  Hour          732 non-null    int64  
dtypes: float64(2), int64(6), object(7)
memory usage: 85.9+ KB


Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,Text,Sentiment,Timestamp,User,Platform,Hashtags,Retweets,Likes,Country,Year,Month,Day,Hour
0,0,0,Enjoying a beautiful day at the park! ...,Positive,2023-01-15 12:30:00,User123,Twitter,#Nature #Park,15.0,30.0,USA,2023,1,15,12
1,1,1,Traffic was terrible this morning. ...,Negative,2023-01-15 08:45:00,CommuterX,Twitter,#Traffic #Morning,5.0,10.0,Canada,2023,1,15,8
2,2,2,Just finished an amazing workout! 💪 ...,Positive,2023-01-15 15:45:00,FitnessFan,Instagram,#Fitness #Workout,20.0,40.0,USA,2023,1,15,15
3,3,3,Excited about the upcoming weekend getaway! ...,Positive,2023-01-15 18:20:00,AdventureX,Facebook,#Travel #Adventure,8.0,15.0,UK,2023,1,15,18
4,4,4,Trying out a new recipe for dinner tonight. ...,Neutral,2023-01-15 19:55:00,ChefCook,Instagram,#Cooking #Food,12.0,25.0,Australia,2023,1,15,19


In [4]:
# Checking whether the column names already use snake_case
sm.columns

Index(['Unnamed: 0.1', 'Unnamed: 0', 'Text', 'Sentiment', 'Timestamp', 'User',
       'Platform', 'Hashtags', 'Retweets', 'Likes', 'Country', 'Year', 'Month',
       'Day', 'Hour'],
      dtype='object')

In [5]:
# Renaming column names to snake_case format
sm = sm.rename(columns={'Unnamed: 0.1': 'unnamed_0.01',
                        'Unnamed: 0': 'unnamed_0',
                        'Text': 'text',
                        'Sentiment': 'sentiment',
                        'Timestamp': 'timestamp',
                        'User': 'user',
                        'Platform': 'platform',
                        'Hashtags': 'hashtags',
                        'Retweets': 'reshared',
                        'Likes': 'likes',
                        'Country': 'country',
                        'Year': 'year',
                        'Month': 'month',
                        'Day': 'day',
                        'Hour': 'hour'})
sm.columns

Index(['unnamed_0.01', 'unnamed_0', 'text', 'sentiment', 'timestamp', 'user',
       'platform', 'hashtags', 'reshared', 'likes', 'country', 'year', 'month',
       'day', 'hour'],
      dtype='object')
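
The explicit mapping above could also be generated programmatically. The following is a minimal sketch, not the notebook's method: `to_snake_case` is a hypothetical helper and `demo` is a made-up frame that only borrows a few of the original header names. Note that it yields `unnamed_0_1` and `retweets`, whereas the manual mapping chose `unnamed_0.01` and `reshared`.

```python
import pandas as pd

# Toy frame that reuses a few of the original header names (no data needed)
demo = pd.DataFrame(columns=['Unnamed: 0.1', 'Unnamed: 0', 'Text', 'Retweets'])


def to_snake_case(columns: pd.Index) -> pd.Index:
    """Lowercase names and collapse runs of non-alphanumerics into one underscore."""
    return (
        columns.str.lower()
        .str.replace(r'[^0-9a-z]+', '_', regex=True)
        .str.strip('_')
    )


demo.columns = to_snake_case(demo.columns)
print(list(demo.columns))
```

A rule-based rename scales better when a dataset has many columns, while the manual dictionary gives full control over individual names.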

### Duplicates

In [6]:
# Checking for duplicates
sm.duplicated().sum()

0
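
One caveat with running this check before cleaning: rows that differ only in whitespace padding are not counted as duplicates. The sketch below illustrates this on a small made-up frame (`demo` and its values are assumptions, not rows from the dataset).

```python
import pandas as pd

# Two rows with the same content but different padding, plus one distinct row
demo = pd.DataFrame({
    'text': [' Good day!   ', ' Good day! ', ' Bad day. '],
    'platform': [' Twitter  ', ' Twitter ', ' Facebook '],
})

raw_dupes = int(demo.duplicated().sum())

# Strip padding in every string column before comparing rows
stripped = demo.apply(lambda col: col.str.strip())
clean_dupes = int(stripped.duplicated().sum())

print(raw_dupes, clean_dupes)
```

Re-running `duplicated()` after the formatting steps would show whether the same effect occurs in the real data.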

### Missing Values

In [7]:
# Checking for null values
sm.isna().sum()

unnamed_0.01    0
unnamed_0       0
text            0
sentiment       0
timestamp       0
user            0
platform        0
hashtags        0
reshared        0
likes           0
country         0
year            0
month           0
day             0
hour            0
dtype: int64
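
`isna()` only flags true nulls, so a string made up entirely of spaces would pass this check. As a hedged complement, the sketch below counts entries that are empty after stripping. The frame `demo` is invented for illustration.

```python
import pandas as pd

# Made-up frame: no nulls, but one whitespace-only and one empty string
demo = pd.DataFrame({
    'text': [' Hello ', '    ', ' Bye '],
    'country': [' USA ', ' UK ', ''],
})

nulls = demo.isna().sum()

# Count entries that are empty once surrounding whitespace is removed
blanks = demo.apply(lambda col: col.str.strip().eq('')).sum()

print(nulls.to_dict(), blanks.to_dict())
```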

### Data Usage and Formatting

In [8]:
sm.info()
sm.sample(10)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 732 entries, 0 to 731
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   unnamed_0.01  732 non-null    int64  
 1   unnamed_0     732 non-null    int64  
 2   text          732 non-null    object 
 3   sentiment     732 non-null    object 
 4   timestamp     732 non-null    object 
 5   user          732 non-null    object 
 6   platform      732 non-null    object 
 7   hashtags      732 non-null    object 
 8   reshared      732 non-null    float64
 9   likes         732 non-null    float64
 10  country       732 non-null    object 
 11  year          732 non-null    int64  
 12  month         732 non-null    int64  
 13  day           732 non-null    int64  
 14  hour          732 non-null    int64  
dtypes: float64(2), int64(6), object(7)
memory usage: 85.9+ KB


Unnamed: 0,unnamed_0.01,unnamed_0,text,sentiment,timestamp,user,platform,hashtags,reshared,likes,country,year,month,day,hour
192,193,195,"Jealousy poisons my thoughts, resentment brew...",Jealousy,2018-08-05 16:30:00,PoisonedMind,Facebook,#Jealousy #Resentment,8.0,15.0,USA,2018,8,5,16
180,181,183,"Resentment festers, poisoning relationships. ...",Resentment,2022-01-15 12:00:00,BrokenTrust,Facebook,#Resentment #BrokenTrust,10.0,20.0,Australia,2022,1,15,12
409,410,414,Awe-struck by the grandeur of an ancient cathe...,Awe,2018-08-18 14:45:00,CathedralVisitor,Facebook,#Awe #ArchitecturalGrandeur,18.0,35.0,Czech Republic,2018,8,18,14
441,442,446,"Avoiding the thorns of regret, walking barefoo...",Regret,2022-04-18 11:30:00,RemorseWalker,Instagram,#Regret #PathOfRemorse,25.0,50.0,Canada,2022,4,18,11
57,58,60,Laughter is the best medicine—enjoying a come...,Joy,2023-02-13 19:30:00,ComedyFan,Facebook,#Joy #ComedyShow,22.0,45.0,Canada,2023,2,13,19
605,606,610,Savoring the flavors of a home-cooked meal. Si...,Contentment,2023-06-14 19:30:00,HomeChefSenior,Instagram,#HappinessInFood #SeniorLife,25.0,50.0,USA,2023,6,14,19
317,318,322,"Dismissive gestures, a curtain drawn to shiel...",Dismissive,2018-08-18 14:40:00,CurtainShield,Instagram,#Dismissive #IndifferencePerformance,12.0,24.0,Australia,2018,8,18,14
650,651,655,Bonding with friends over the latest K-pop sen...,Joy,2023-08-09 22:30:00,KpopFangirlHighSchool,Facebook,#KpopFangirl #HighSchoolMusic,20.0,40.0,USA,2023,8,9,22
724,725,729,Creating a beautiful mural with fellow art ent...,Happy,2023-10-22 20:45:00,MuralCreationHighSchool,Instagram,#ArtCollaboration #HighSchoolCreativity,22.0,43.0,UK,2023,10,22,20
200,201,203,"Gazing at the sunset, a melancholic longing f...",Melancholy,2019-06-18 20:45:00,SunsetDreamer,Instagram,#Melancholy #SunsetMoments,12.0,25.0,Australia,2019,6,18,20


In [9]:
# Both numeric unnamed columns look like row counters that were saved by accident.
# If they carry no other information, they can be removed.
# Note: .count() returns the number of non-null entries, not the number of matching values.

sm['unnamed_0'].isin(sm['unnamed_0.01']).count()

732
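
A more direct way to test whether a column mirrors the row position is to compare it with the index element by element. This is a sketch on a made-up frame: the column names `unnamed_a` and `unnamed_b` are hypothetical stand-ins for the two unnamed columns.

```python
import pandas as pd

# Toy frame: one helper column mirrors the index, the other drifts away from it
demo = pd.DataFrame({
    'unnamed_a': [0, 1, 2, 3],
    'unnamed_b': [0, 1, 3, 4],
})

# Element-wise comparison with the row index, then count the matches
matches_a = int((demo['unnamed_a'].to_numpy() == demo.index.to_numpy()).sum())
matches_b = int((demo['unnamed_b'].to_numpy() == demo.index.to_numpy()).sum())

# A strictly increasing column of unique values still behaves like a row counter
is_counter_b = bool(
    demo['unnamed_b'].is_monotonic_increasing and demo['unnamed_b'].is_unique
)

print(matches_a, matches_b, is_counter_b)
```

In the sample shown earlier, index 192 sits next to the values 193 and 195, so the real columns may resemble `unnamed_b` here: counters that have drifted from the current index rather than exact copies of it.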

In [10]:
# These two columns only track the row position and add nothing to the analysis,
# so they can be dropped to reduce memory usage

sm = sm.drop(columns=['unnamed_0.01', 'unnamed_0'])
sm.head()

Unnamed: 0,text,sentiment,timestamp,user,platform,hashtags,reshared,likes,country,year,month,day,hour
0,Enjoying a beautiful day at the park! ...,Positive,2023-01-15 12:30:00,User123,Twitter,#Nature #Park,15.0,30.0,USA,2023,1,15,12
1,Traffic was terrible this morning. ...,Negative,2023-01-15 08:45:00,CommuterX,Twitter,#Traffic #Morning,5.0,10.0,Canada,2023,1,15,8
2,Just finished an amazing workout! 💪 ...,Positive,2023-01-15 15:45:00,FitnessFan,Instagram,#Fitness #Workout,20.0,40.0,USA,2023,1,15,15
3,Excited about the upcoming weekend getaway! ...,Positive,2023-01-15 18:20:00,AdventureX,Facebook,#Travel #Adventure,8.0,15.0,UK,2023,1,15,18
4,Trying out a new recipe for dinner tonight. ...,Neutral,2023-01-15 19:55:00,ChefCook,Instagram,#Cooking #Food,12.0,25.0,Australia,2023,1,15,19


In [11]:
# Checking the text column

sm['text'].unique()

array([' Enjoying a beautiful day at the park!              ',
       ' Traffic was terrible this morning.                 ',
       ' Just finished an amazing workout! 💪               ',
       ' Excited about the upcoming weekend getaway!        ',
       ' Trying out a new recipe for dinner tonight.        ',
       ' Feeling grateful for the little things in life.    ',
       ' Rainy days call for cozy blankets and hot cocoa.   ',
       ' The new movie release is a must-watch!             ',
       ' Political discussions heating up on the timeline.  ',
       ' Missing summer vibes and beach days.               ',
       ' Just published a new blog post. Check it out!      ',
       ' Feeling a bit under the weather today.             ',
       " Exploring the city's hidden gems.                  ",
       ' New year, new fitness goals! 💪                    ',
       ' Technology is changing the way we live.            ',
       ' Reflecting on the past and looking ahead.       

In [12]:
# Lowercasing the text column, stripping spaces from the front and end of each value,
# and replacing the remaining spaces with underscores

sm['text'] = sm['text'].str.lower().str.strip()
sm['text'] = sm['text'].str.replace(' ', '_', regex=False)
sm['text'].unique()

array(['enjoying_a_beautiful_day_at_the_park!',
       'traffic_was_terrible_this_morning.',
       'just_finished_an_amazing_workout!_💪',
       'excited_about_the_upcoming_weekend_getaway!',
       'trying_out_a_new_recipe_for_dinner_tonight.',
       'feeling_grateful_for_the_little_things_in_life.',
       'rainy_days_call_for_cozy_blankets_and_hot_cocoa.',
       'the_new_movie_release_is_a_must-watch!',
       'political_discussions_heating_up_on_the_timeline.',
       'missing_summer_vibes_and_beach_days.',
       'just_published_a_new_blog_post._check_it_out!',
       'feeling_a_bit_under_the_weather_today.',
       "exploring_the_city's_hidden_gems.",
       'new_year,_new_fitness_goals!_💪',
       'technology_is_changing_the_way_we_live.',
       'reflecting_on_the_past_and_looking_ahead.',
       'just_adopted_a_cute_furry_friend!_🐾',
       'late-night_gaming_session_with_friends.',
       'attending_a_virtual_conference_on_ai.',
       'winter_blues_got_me_feeling_low.',
 

In [13]:
# Checking the sentiment column

sm['sentiment'].unique()

array([' Positive  ', ' Negative  ', ' Neutral   ', ' Anger        ',
       ' Fear         ', ' Sadness      ', ' Disgust      ',
       ' Happiness    ', ' Joy          ', ' Love         ',
       ' Amusement    ', ' Enjoyment    ', ' Admiration   ',
       ' Affection    ', ' Awe          ', ' Disappointed ',
       ' Surprise     ', ' Acceptance   ', ' Adoration    ',
       ' Anticipation ', ' Bitter       ', ' Calmness     ',
       ' Confusion    ', ' Excitement   ', ' Kind         ',
       ' Pride        ', ' Shame        ', ' Confusion ', ' Excitement ',
       ' Shame ', ' Elation       ', ' Euphoria      ', ' Contentment   ',
       ' Serenity      ', ' Gratitude     ', ' Hope          ',
       ' Empowerment   ', ' Compassion    ', ' Tenderness    ',
       ' Arousal       ', ' Enthusiasm    ', ' Fulfillment  ',
       ' Reverence     ', ' Compassion', ' Fulfillment   ', ' Reverence ',
       ' Elation   ', ' Despair         ', ' Grief           ',
       ' Loneliness     

In [14]:
# Lowercasing the sentiment labels, stripping irregular padding, and replacing inner spaces with underscores.
# Converting to the category dtype afterwards lowers memory usage.

sm['sentiment'] = sm['sentiment'].str.lower().str.strip()
sm['sentiment'] = sm['sentiment'].str.replace(' ', '_', regex=False)
sm['sentiment'] = sm['sentiment'].astype('category')
sm['sentiment'].unique()

['positive', 'negative', 'neutral', 'anger', 'fear', ..., 'mischievous', 'sad', 'hate', 'bad', 'happy']
Length: 191
Categories (191, object): ['acceptance', 'accomplishment', 'admiration', 'adoration', ..., 'wonder', 'wonderment', 'yearning', 'zest']
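
The raw output listed some labels more than once because of differing padding (for example `' Confusion    '` and `' Confusion '`). The small sketch below, which uses only those labels rather than the full column, shows how stripping merges such variants.

```python
import pandas as pd

# Raw labels as they appear in the unique() output, padding included
raw = pd.Series([' Confusion    ', ' Confusion ', ' Shame        ', ' Shame '])

before = raw.nunique()
cleaned = raw.str.strip().str.lower()
after = cleaned.nunique()

print(before, after)
print(cleaned.value_counts().to_dict())
```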

In [15]:
# The first few rows show that the timestamp column is stored as strings,
# so we convert it to a datetime type, which also reduces memory usage

sm['timestamp'] = pd.to_datetime(sm['timestamp'], format='%Y-%m-%d %H:%M:%S')
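
Once the column is a datetime, the existing `year`, `month`, `day`, and `hour` columns can be cross-checked against it through the `.dt` accessor. The sketch below does this on two rows copied from the head of the dataset. The dictionary name `consistent` is an assumption for illustration.

```python
import pandas as pd

demo = pd.DataFrame({
    'timestamp': ['2023-01-15 12:30:00', '2023-01-15 08:45:00'],
    'year': [2023, 2023],
    'month': [1, 1],
    'day': [15, 15],
    'hour': [12, 8],
})
demo['timestamp'] = pd.to_datetime(demo['timestamp'], format='%Y-%m-%d %H:%M:%S')

# Compare each stored part with the value derived from the timestamp
parts = ['year', 'month', 'day', 'hour']
consistent = {
    part: bool((getattr(demo['timestamp'].dt, part) == demo[part]).all())
    for part in parts
}
print(consistent)
```

If every part matches in the full dataset, the four integer columns are redundant with `timestamp` and could be derived on demand instead of stored.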

In [16]:
# Checking the user column

sm['user'].unique()

array([' User123      ', ' CommuterX    ', ' FitnessFan   ',
       ' AdventureX   ', ' ChefCook     ', ' GratitudeNow ',
       ' RainyDays    ', ' MovieBuff    ', ' DebateTalk   ',
       ' BeachLover   ', ' BloggerX     ', ' WellnessCheck',
       ' UrbanExplorer', ' FitJourney   ', ' TechEnthusiast',
       ' Reflections  ', ' PetAdopter   ', ' GamerX       ',
       ' TechConference', ' WinterBlues  ', ' Bookworm     ',
       ' VRExplorer   ', ' ProductivityPro', ' FitnessWarrior',
       ' CareerMilestone', ' BrunchBuddy  ', ' LanguageLearner',
       ' BookLover    ', ' MentalHealthMatters', ' ArtistInAction',
       ' RoadTripper  ', ' SunsetWatcher', ' CodeEnthusiast',
       ' WorkshopAttendee', ' WinterSports  ', ' FamilyTime   ',
       ' MusicLover   ', ' MindfulMoments', ' DessertExplorer',
       ' GamingEnthusiast', ' GardenPlanner ', ' BirthdayBash ',
       ' ProductivityWin', ' MovieNight   ', ' ArtExplorer  ',
       ' BookwormX    ', ' VRMeetup     ', ' NatureLove

In [17]:
# Lowercasing the user column and stripping leading and trailing spaces

sm['user'] = sm['user'].str.lower().str.strip()
sm['user'].unique()

array(['user123', 'commuterx', 'fitnessfan', 'adventurex', 'chefcook',
       'gratitudenow', 'rainydays', 'moviebuff', 'debatetalk',
       'beachlover', 'bloggerx', 'wellnesscheck', 'urbanexplorer',
       'fitjourney', 'techenthusiast', 'reflections', 'petadopter',
       'gamerx', 'techconference', 'winterblues', 'bookworm',
       'vrexplorer', 'productivitypro', 'fitnesswarrior',
       'careermilestone', 'brunchbuddy', 'languagelearner', 'booklover',
       'mentalhealthmatters', 'artistinaction', 'roadtripper',
       'sunsetwatcher', 'codeenthusiast', 'workshopattendee',
       'wintersports', 'familytime', 'musiclover', 'mindfulmoments',
       'dessertexplorer', 'gamingenthusiast', 'gardenplanner',
       'birthdaybash', 'productivitywin', 'movienight', 'artexplorer',
       'bookwormx', 'vrmeetup', 'naturelover', 'chefathome',
       'optimisticmindset', 'fitnesschallenge', 'bikeexplorer',
       'socialjustice', 'thrillerfan', 'empathyfirst', 'ecoawareness',
       'proudf

In [18]:
# Checking the platform column

sm['platform'].unique()

array([' Twitter  ', ' Instagram ', ' Facebook ', ' Twitter '],
      dtype=object)

In [19]:
# Lowercasing the platform column and stripping the inconsistent padding

sm['platform'] = sm['platform'].str.lower()
sm['platform'] = sm['platform'].str.strip()
sm['platform'] = sm['platform'].astype('category')
sm['platform'].unique()

['twitter', 'instagram', 'facebook']
Categories (3, object): ['facebook', 'instagram', 'twitter']
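
To see why the category dtype helps for a column with only a few distinct values, the sketch below compares the deep memory footprint of the same data stored both ways. The series is synthetic (three labels repeated 100 times), so the byte counts are illustrative only.

```python
import pandas as pd

# 300 values drawn from three labels, similar in shape to the platform column
as_object = pd.Series(['twitter', 'instagram', 'facebook'] * 100, dtype='object')
as_category = as_object.astype('category')

# deep=True includes the memory held by the Python string objects themselves
object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(object_bytes, category_bytes)
```

The `+` in the `memory usage` line of `sm.info()` signals a lower bound, because object columns are not measured deeply by default. Calling `sm.info(memory_usage='deep')` gives a more accurate figure.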

In [20]:
# Checking the hashtags column

sm['hashtags'].unique()

array([' #Nature #Park                            ',
       ' #Traffic #Morning                        ',
       ' #Fitness #Workout                        ',
       ' #Travel #Adventure                       ',
       ' #Cooking #Food                           ',
       ' #Gratitude #PositiveVibes              ',
       ' #RainyDays #Cozy                         ',
       ' #MovieNight #MustWatch                  ',
       ' #Politics #Debate                       ',
       ' #Summer #BeachDays                      ',
       ' #Blogging #NewPost                      ',
       ' #SickDay #Health                        ',
       ' #CityExplore #HiddenGems                ',
       ' #NewYear #FitnessGoals                  ',
       ' #Tech #Innovation                       ',
       ' #Reflection #Future                     ',
       ' #PetAdoption #FurryFriend               ',
       ' #Gaming #LateNight                      ',
       ' #AI #TechConference                     ',
       

In [21]:
# Lowercasing the hashtags column, stripping leading and trailing spaces,
# and replacing the space between hashtags with an underscore

sm['hashtags'] = sm['hashtags'].str.lower().str.strip()
sm['hashtags'] = sm['hashtags'].str.replace(' ', '_', regex=False)
sm['hashtags'].unique()

array(['#nature_#park', '#traffic_#morning', '#fitness_#workout',
       '#travel_#adventure', '#cooking_#food',
       '#gratitude_#positivevibes', '#rainydays_#cozy',
       '#movienight_#mustwatch', '#politics_#debate',
       '#summer_#beachdays', '#blogging_#newpost', '#sickday_#health',
       '#cityexplore_#hiddengems', '#newyear_#fitnessgoals',
       '#tech_#innovation', '#reflection_#future',
       '#petadoption_#furryfriend', '#gaming_#latenight',
       '#ai_#techconference', '#winterblues_#mood',
       '#reading_#coffeetime', '#vr_#virtualreality',
       '#productivity_#workfromhome', '#fitness_#challengeaccepted',
       '#career_#milestone', '#brunch_#friends',
       '#languagelearning_#personalgrowth', '#reading_#quiettime',
       '#mentalhealth_#selfcare', '#art_#paintinginprogress',
       '#roadtrip_#scenicviews', '#teatime_#sunset',
       '#coding_#enthusiasm', '#inspiration_#workshop',
       '#wintersports_#fun', '#familytime_#weekend',
       '#music_#conce

In [22]:
# Checking the country column

sm['country'].unique()

array([' USA      ', ' Canada   ', ' USA        ', ' UK       ',
       ' Australia ', ' India    ', ' USA    ', 'USA', ' Canada    ',
       ' USA       ', ' USA ', ' Canada  ', ' UK ', ' India     ',
       ' Canada ', ' UK        ', ' India ', ' UK   ', ' UK         ',
       ' USA     ', ' Canada     ', ' USA          ', ' India      ',
       ' Australia  ', ' UK           ', ' Canada       ',
       ' Australia   ', ' Australia    ', ' UK            ', ' USA   ',
       ' India       ', ' UK          ', ' USA  ', ' UK      ',
       ' Canada      ', ' India   ', ' Canada          ',
       ' India        ', ' Australia     ', ' Canada        ',
       ' India         ', ' USA           ', ' USA               ',
       ' Canada            ', ' UK                ',
       ' India              ', ' Australia          ',
       ' France            ', ' Brazil            ',
       ' Japan             ', ' Greece            ',
       ' India             ', ' Brazil           ', ' Franc

In [23]:
# Lowercasing the country column, stripping the inconsistent padding,
# and replacing inner spaces with underscores

sm['country'] = sm['country'].str.lower().str.strip()
sm['country'] = sm['country'].str.replace(' ', '_', regex=False)
sm['country'] = sm['country'].astype('category')
sm['country'].unique()

['usa', 'canada', 'uk', 'australia', 'india', ..., 'ireland', 'jamaica', 'kenya', 'scotland', 'thailand']
Length: 33
Categories (33, object): ['australia', 'austria', 'belgium', 'brazil', ..., 'switzerland', 'thailand', 'uk', 'usa']
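
The same three steps (lowercase, strip, replace spaces) were applied to several columns, so they could be wrapped in one function. This is a sketch under that assumption: `normalize_labels` is a hypothetical helper, not part of the notebook, and `demo` holds a few country values like those shown above.

```python
import pandas as pd


def normalize_labels(series: pd.Series, as_category: bool = False) -> pd.Series:
    """Lowercase, strip outer whitespace, and replace inner spaces with underscores."""
    cleaned = (
        series.str.lower()
        .str.strip()
        .str.replace(' ', '_', regex=False)
    )
    return cleaned.astype('category') if as_category else cleaned


demo = pd.Series([' USA      ', 'USA', ' Czech Republic ', ' UK '])
result = normalize_labels(demo, as_category=True)
print(list(result))
print(list(result.cat.categories))
```

With such a helper, each column would need a single call, for example `sm['country'] = normalize_labels(sm['country'], as_category=True)`.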

In [24]:
sm.info()
sm.head()

# Memory usage has decreased from 85.9 KB to 67.2 KB

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 732 entries, 0 to 731
Data columns (total 13 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   text       732 non-null    object        
 1   sentiment  732 non-null    category      
 2   timestamp  732 non-null    datetime64[ns]
 3   user       732 non-null    object        
 4   platform   732 non-null    category      
 5   hashtags   732 non-null    object        
 6   reshared   732 non-null    float64       
 7   likes      732 non-null    float64       
 8   country    732 non-null    category      
 9   year       732 non-null    int64         
 10  month      732 non-null    int64         
 11  day        732 non-null    int64         
 12  hour       732 non-null    int64         
dtypes: category(3), datetime64[ns](1), float64(2), int64(4), object(3)
memory usage: 67.2+ KB


Unnamed: 0,text,sentiment,timestamp,user,platform,hashtags,reshared,likes,country,year,month,day,hour
0,enjoying_a_beautiful_day_at_the_park!,positive,2023-01-15 12:30:00,user123,twitter,#nature_#park,15.0,30.0,usa,2023,1,15,12
1,traffic_was_terrible_this_morning.,negative,2023-01-15 08:45:00,commuterx,twitter,#traffic_#morning,5.0,10.0,canada,2023,1,15,8
2,just_finished_an_amazing_workout!_💪,positive,2023-01-15 15:45:00,fitnessfan,instagram,#fitness_#workout,20.0,40.0,usa,2023,1,15,15
3,excited_about_the_upcoming_weekend_getaway!,positive,2023-01-15 18:20:00,adventurex,facebook,#travel_#adventure,8.0,15.0,uk,2023,1,15,18
4,trying_out_a_new_recipe_for_dinner_tonight.,neutral,2023-01-15 19:55:00,chefcook,instagram,#cooking_#food,12.0,25.0,australia,2023,1,15,19


[Back to Table of Contents](#back)

### Data Wrangling

In [25]:
# Each value in the hashtags column holds several hashtags joined by underscores.
# Splitting them so that each field holds a single hashtag will make filtering easier later.

sm.head(10)

Unnamed: 0,text,sentiment,timestamp,user,platform,hashtags,reshared,likes,country,year,month,day,hour
0,enjoying_a_beautiful_day_at_the_park!,positive,2023-01-15 12:30:00,user123,twitter,#nature_#park,15.0,30.0,usa,2023,1,15,12
1,traffic_was_terrible_this_morning.,negative,2023-01-15 08:45:00,commuterx,twitter,#traffic_#morning,5.0,10.0,canada,2023,1,15,8
2,just_finished_an_amazing_workout!_💪,positive,2023-01-15 15:45:00,fitnessfan,instagram,#fitness_#workout,20.0,40.0,usa,2023,1,15,15
3,excited_about_the_upcoming_weekend_getaway!,positive,2023-01-15 18:20:00,adventurex,facebook,#travel_#adventure,8.0,15.0,uk,2023,1,15,18
4,trying_out_a_new_recipe_for_dinner_tonight.,neutral,2023-01-15 19:55:00,chefcook,instagram,#cooking_#food,12.0,25.0,australia,2023,1,15,19
5,feeling_grateful_for_the_little_things_in_life.,positive,2023-01-16 09:10:00,gratitudenow,twitter,#gratitude_#positivevibes,25.0,50.0,india,2023,1,16,9
6,rainy_days_call_for_cozy_blankets_and_hot_cocoa.,positive,2023-01-16 14:45:00,rainydays,facebook,#rainydays_#cozy,10.0,20.0,canada,2023,1,16,14
7,the_new_movie_release_is_a_must-watch!,positive,2023-01-16 19:30:00,moviebuff,instagram,#movienight_#mustwatch,15.0,30.0,usa,2023,1,16,19
8,political_discussions_heating_up_on_the_timeline.,negative,2023-01-17 08:00:00,debatetalk,twitter,#politics_#debate,30.0,60.0,usa,2023,1,17,8
9,missing_summer_vibes_and_beach_days.,neutral,2023-01-17 12:20:00,beachlover,facebook,#summer_#beachdays,18.0,35.0,australia,2023,1,17,12


In [26]:
# Splitting the hashtags into two columns

new_hashtag = sm['hashtags'].str.split('_', n=1, expand=True)

# Creating new columns for the newly split hashtags

sm['first_hashtag'] = new_hashtag[0]
sm['second_hashtag'] = new_hashtag[1]

# Removing the old hashtags column

sm.drop(columns=['hashtags'], inplace=True)

sm.head()

Unnamed: 0,text,sentiment,timestamp,user,platform,reshared,likes,country,year,month,day,hour,first_hashtag,second_hashtag
0,enjoying_a_beautiful_day_at_the_park!,positive,2023-01-15 12:30:00,user123,twitter,15.0,30.0,usa,2023,1,15,12,#nature,#park
1,traffic_was_terrible_this_morning.,negative,2023-01-15 08:45:00,commuterx,twitter,5.0,10.0,canada,2023,1,15,8,#traffic,#morning
2,just_finished_an_amazing_workout!_💪,positive,2023-01-15 15:45:00,fitnessfan,instagram,20.0,40.0,usa,2023,1,15,15,#fitness,#workout
3,excited_about_the_upcoming_weekend_getaway!,positive,2023-01-15 18:20:00,adventurex,facebook,8.0,15.0,uk,2023,1,15,18,#travel,#adventure
4,trying_out_a_new_recipe_for_dinner_tonight.,neutral,2023-01-15 19:55:00,chefcook,instagram,12.0,25.0,australia,2023,1,15,19,#cooking,#food
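
An alternative to two fixed columns is a long format with one hashtag per row, which also copes with posts that carry more than two hashtags. The sketch below uses `str.split` followed by `DataFrame.explode` on two rows taken from the head of the dataset. The names `demo` and `long_form` are illustrative assumptions.

```python
import pandas as pd

demo = pd.DataFrame({
    'user': ['user123', 'commuterx'],
    'hashtags': ['#nature_#park', '#traffic_#morning'],
})

# Split each value into a list, then give every hashtag its own row
long_form = (
    demo.assign(hashtag=demo['hashtags'].str.split('_'))
    .explode('hashtag')
    .drop(columns='hashtags')
    .reset_index(drop=True)
)
print(long_form)
```

Both approaches split on `_`, so a hashtag that itself contains an underscore would be cut in two. Splitting on `_#` or splitting before the spaces are replaced would avoid that.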


[Back to Table of Contents](#back)

## Data Analysis

[Back to Table of Contents](#back)

## Conclusion

[Back to Table of Contents](#back)