# Twitter Hashtag Search Notebook
This notebook requests tweets that match a hashtag from the Twitter API, scores their sentiment, and saves the results to a CSV file. It assumes you have a Twitter developer account and active credentials.

## Pre-requisites
For this notebook to work, you need a Twitter developer account and active credentials that authorise access to the Twitter API. If you do not have a Twitter developer account, you can sign up on the Twitter [Developer Platform](https://developer.twitter.com/en/docs/twitter-api/getting-started/getting-access-to-the-twitter-api). If you have a developer account but no active credentials, follow step two of that guide to acquire your API key and secret (also known as the Consumer Key and Secret) and your user Access Token (key) and Secret.

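## Import required packages
Import the required packages and download the VADER lexicon used for sentiment analysis. The packages themselves (`tweepy`, `pandas`, and `nltk`) must already be installed in your environment.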

In [1]:
import tweepy
import pandas as pd
from nltk import download
from nltk.sentiment.vader import SentimentIntensityAnalyzer
download('vader_lexicon');

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /Users/matt/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


## Enter credentials and set up API authorisation
Enter the credentials you set up with your Twitter developer account. If you do not have them yet, see the pre-requisites above.

In [2]:
consumer_key = ''
consumer_secret = ''
access_key = ''
access_secret = ''

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
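
Credentials typed directly into a cell are easy to leak when the notebook is shared. As a minimal sketch, assuming you have exported the four values as environment variables, they could be read at run time instead. The variable names and the `load_credentials` helper below are hypothetical and not defined by Tweepy or Twitter.

```python
import os

# Hypothetical environment variable names; choose whatever suits your setup.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_key": "TWITTER_ACCESS_KEY",
    "access_secret": "TWITTER_ACCESS_SECRET",
}


def load_credentials(env=None):
    """Return the four credentials as a dict, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [var for var in CREDENTIAL_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[var] for name, var in CREDENTIAL_VARS.items()}
```

With this in place, `creds = load_credentials()` gives a dictionary whose entries can be assigned to `consumer_key`, `consumer_secret`, `access_key`, and `access_secret` before the authorisation calls run.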

## Specify search parameters and request tweets
Specify the hashtag to search for, then request up to 1,000 English-language tweets that match it.

In [3]:
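def twitter_hashtag_search(hashtag):
    """Return the full text of up to 1000 English-language tweets matching `hashtag`."""
    tweets = []
    # tweet_mode='extended' makes the API return the untruncated 'full_text' field
    for tweet in tweepy.Cursor(api.search_tweets, q=hashtag, lang="en", tweet_mode='extended').items(1000):
        tweets.append(tweet._json['full_text'])
    return tweets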

hashtag = '#Marriott'
#hashtag = '#Hilton'
#hashtag = '#InterContinental'
tweets = twitter_hashtag_search(hashtag)
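
Retweets mostly repeat the text of the original tweet, so they can inflate the counts in the later sentiment step. The sketch below builds a query string that optionally excludes them with the standard search operator `-filter:retweets`. The `build_query` helper is hypothetical, and whether to drop retweets is an analysis choice that this notebook does not make itself.

```python
def build_query(hashtag, exclude_retweets=True):
    """Build a search query string for a single hashtag."""
    tag = hashtag.strip()
    if not tag:
        raise ValueError("hashtag must not be empty")
    # Accept 'Marriott' as well as '#Marriott'
    if not tag.startswith("#"):
        tag = "#" + tag
    if exclude_retweets:
        # Standard search operator that filters retweets out of the results
        tag += " -filter:retweets"
    return tag


print(build_query("Marriott"))          # #Marriott -filter:retweets
print(build_query("#Hilton", False))    # #Hilton
```

The resulting string could be passed to `twitter_hashtag_search` in place of the bare hashtag. Keep the original `hashtag` variable for naming the output file, since the query string contains a space and an operator.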

## Prepare tweets and analyse sentiment
Remove duplicate tweets, score the sentiment of each remaining tweet, and count how many fall into the positive, negative, and neutral classes.

In [4]:
# Remove duplicates
tweets = list(set(tweets))

# Set up sentiment analyser
scorer = SentimentIntensityAnalyzer()
def predict_sentiment(text_string):
    # 'compound' is VADER's overall score, normalised to the range [-1, 1]
    return scorer.polarity_scores(text_string)['compound']

# Predict sentiment
df_tweets = pd.DataFrame({'tweets' : tweets})
df_tweets['sentiment'] = df_tweets['tweets'].apply(predict_sentiment)

# Create class: 1 = positive, -1 = negative, 0 = neutral
df_tweets['class'] = df_tweets['sentiment'].apply(lambda s: 1 if s>0 else -1 if s<0 else 0)

# Count values
print(df_tweets['class'].value_counts())

 1    11
 0     4
-1     3
Name: class, dtype: int64
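
The class rule above treats any non-zero compound score as positive or negative, so only an exact 0 counts as neutral. The VADER authors suggest a neutral band instead, with scores of 0.05 or more counted as positive and scores of -0.05 or less as negative. The sketch below compares the two rules on a handful of illustrative scores, which are made up for the example and are not output from this notebook. The function names are hypothetical.

```python
def classify_strict(compound):
    """The notebook's rule: only an exact 0 is neutral."""
    return 1 if compound > 0 else -1 if compound < 0 else 0


def classify_with_band(compound, band=0.05):
    """Alternative rule: scores strictly inside (-band, band) are neutral."""
    if compound >= band:
        return 1
    if compound <= -band:
        return -1
    return 0


# Illustrative compound scores, not real output
scores = [0.6249, 0.02, 0.0, -0.03, -0.4404]

print([classify_strict(s) for s in scores])     # [1, 1, 0, -1, -1]
print([classify_with_band(s) for s in scores])  # [1, 0, 0, 0, -1]
```

With the band, weakly scored tweets move into the neutral class, which tends to make the positive and negative counts more conservative.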


## Save tweets and analysis (for download)
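Save the tweets and their sentiment scores to a CSV file whose name combines the hashtag and a timestamp, so that each run produces a unique and identifiable file.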

In [5]:
# Get the current time and format to string
from datetime import datetime
now = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

# Set the filename to {hashtag}_{now}.csv and remove the '#' symbol
filename = f'{hashtag}_{now}.csv'.replace('#','')

# Save CSV to the data directory (../data must already exist)
df_tweets.to_csv(f'../data/{filename}',index=False)
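
The save step above relies on the `../data` directory already existing and on the hashtag containing no characters other than `#` that are awkward in filenames. A minimal sketch of a more defensive version follows. The `build_output_path` helper is hypothetical, and it assumes it is acceptable to create the directory when it is missing.

```python
import re
from datetime import datetime
from pathlib import Path


def build_output_path(hashtag, data_dir="../data", now=None):
    """Return a path of the form {data_dir}/{hashtag}_{timestamp}.csv."""
    now = datetime.now() if now is None else now
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    # Keep only letters, digits, underscores, and hyphens from the hashtag
    safe_tag = re.sub(r"[^A-Za-z0-9_-]", "", hashtag)
    directory = Path(data_dir)
    # Create the directory (and any parents) if it does not exist yet
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"{safe_tag}_{stamp}.csv"
```

It could then be used as `df_tweets.to_csv(build_output_path(hashtag), index=False)`.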