# Exploratory Data and Sentiment Analysis on COVID-19 related tweets through the TwitterAPI

### Author: George Spyrou
### Date: 01/03/2020

The purpose of this project is to use the Twitter API to analyse tweets related to the COVID-19 virus. It started as an exploratory task to learn how the Twitter API can be used to retrieve data (tweets) from the web, and how to use the tweepy and searchtweets packages.

After retrieving the data, I became interested in performing some data analysis on the retrieved tweets. COVID-19, commonly known as Coronavirus, is one of the most discussed topics on Twitter for the period 01/01/2020 - 01/03/2020. Using relevant tweets, we want to perform some exploratory data analysis (find the most common words used in tweets related to COVID-19 and identify the most common bigrams) and then attempt to identify the sentiment of the tweets using a variety of methods.

#### Version 1: The first version was completed on 01/03/2020 and includes analysis of:
- Most common words present in tweets.
- Most common bigrams (i.e. pairs of words that often appear next to each other).
- Sentiment analysis using the Liu Hu opinion lexicon algorithm.
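
To make the first two items concrete, here is a minimal sketch of counting words and bigrams with the standard library. The sample tweets, the `tokenize` and `bigrams` helpers, and the tokenisation rule are assumptions made for illustration and are not the functions used later in this project. Stop-word removal is left out to keep the example short.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    # Keeps inner hyphens/apostrophes so that 'covid-19' stays one token
    return re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", text.lower())


def bigrams(tokens):
    """Return the pairs of tokens that appear next to each other."""
    return list(zip(tokens, tokens[1:]))


# Hypothetical sample tweets (not taken from the retrieved data)
sample_tweets = [
    "Coronavirus cases are rising",
    "New coronavirus cases reported today",
    "Stay safe and wash your hands",
]

word_counts = Counter()
bigram_counts = Counter()
for tweet in sample_tweets:
    tokens = tokenize(tweet)
    word_counts.update(tokens)
    bigram_counts.update(bigrams(tokens))

# Most frequent pair of adjacent words across the sample tweets
top_bigram, top_bigram_count = bigram_counts.most_common(1)[0]
```

In this sample, the pair `("coronavirus", "cases")` is the only bigram that occurs more than once, so it ranks first.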
    
In the first part of the project, we set up the environment required for our analysis.
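
The setup cell reads its settings from a JSON config file that is not shown in this notebook. As a rough sketch of what such a file might contain and how it could be checked before use: the keys `project_directory` and `outputFiles` are the ones the notebook relies on, while the `search_query` key, the `load_config` helper and the temporary folder are assumptions made for illustration only.

```python
import json
import os
import tempfile

# Keys the notebook reads from the config
REQUIRED_KEYS = ("project_directory", "outputFiles")


def load_config(config_path):
    """Load the JSON config and fail early if a required key is missing."""
    with open(config_path) as config_file:
        config = json.load(config_file)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"Missing config keys: {missing}")
    return config


# Build a throwaway config in a temporary folder (illustration only)
with tempfile.TemporaryDirectory() as project_dir:
    sample_config = {
        "project_directory": project_dir,
        "outputFiles": "output",
        "search_query": "coronavirus",  # hypothetical key name
    }
    config_path = os.path.join(project_dir, "twitter_config.json")
    with open(config_path, "w") as config_file:
        json.dump(sample_config, config_file)

    config = load_config(config_path)
    # Same path construction as in the setup cell
    jsonl_folder = os.path.join(config["project_directory"], config["outputFiles"])
```

Checking the keys up front means a typo in the config surfaces as a clear error instead of a `KeyError` halfway through the notebook.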

In [1]:
# Import dependencies
import os
import json
import pandas as pd

# Plots and graphs
import matplotlib.pyplot as plt
import seaborn as sns

# Set up the project environment

# Secure location of the required keys to connect to the API
# This config also contains the search query (in this case 'coronavirus')
json_loc = '/Users/georgiosspyrou/Desktop/config_tweets/Twitter/twitter_config.json'

with open(json_loc) as json_file:
    data = json.load(json_file)

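# Switch to the project folder defined in the config
os.chdir(data["project_directory"])

# Custom helper functions; imported after the working directory change
# so that Python can find the module in the project folder
import twitterCustomFunc as twf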

# Locate the folder that holds the created .jsonl files
jsonl_files_folder = os.path.join(data["project_directory"], data["outputFiles"])
