# Social Media Data Analysis Report

## Overview

This report presents the analysis and insights from a social media data analysis project based on a provided dataset of social media posts and metrics. The project focuses on understanding user engagement and content performance, and on presenting the resulting insights. Time series analysis, natural language processing (NLP) and statistical techniques were used throughout the project.

## Table Of Contents

* [Objectives](#objectives)
* [Data](#data)
* [Methodology](#method)
* [Key Questions Covered in this Project](#questions)
* [Analysis and Insights](#analysis)

<a id = 'objectives'></a>
## Objectives

The main objective of this project is to generate actionable insights from the provided social media data. We aim to answer key questions related to engagement, content performance and user behavior to optimize digital marketing strategies.

<a id = 'data'></a>
## Data

The dataset used in this analysis contains information from two social media platforms, Twitter and LinkedIn. The data includes engagement metrics such as likes, comments, shares, and reactions, as well as post details and user interactions.

<a id = 'method'></a>
## Methodology

To achieve our aims, we took the following steps:
* **Data Preprocessing:** We first removed duplicates, then used NLP and machine learning techniques to fill in the missing values in the data. The features were then converted to the correct data types; for example, the date column was converted to datetime, while the likes and comments features were converted to float.

* **Exploratory Data Analysis:** Data analysis and statistical techniques were used to explore the dataset, which led to the formulation of the key questions.

* **Insights Generation:** Insights were derived from the analysis.

* **Visualization and Reporting:** Informative visualizations were created and compiled into a comprehensive report.
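
The preprocessing step can be illustrated with a minimal sketch that uses only the Python standard library. The project's actual code is not shown in this report, so the field names (`date`, `network`, `likes`, `comments`), the date format and the sample rows below are all assumptions. The sketch covers duplicate removal and type conversion only; the NLP and machine learning imputation of missing values is not reproduced.

```python
from datetime import datetime

# Hypothetical raw rows; field names and the date format are assumptions.
raw_rows = [
    {"date": "2023-01-06 09:30", "network": "Twitter", "likes": "12", "comments": "3"},
    {"date": "2023-01-06 09:30", "network": "Twitter", "likes": "12", "comments": "3"},  # exact duplicate
    {"date": "2023-01-08 18:05", "network": "LinkedIn", "likes": "7", "comments": ""},
]


def preprocess(rows):
    """Drop exact duplicate rows, then convert dates and counts to proper types."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip rows identical to one already kept
        seen.add(key)
        cleaned.append({
            "date": datetime.strptime(row["date"], "%Y-%m-%d %H:%M"),
            "network": row["network"],
            # Empty strings become None so a later imputation step can fill them.
            "likes": float(row["likes"]) if row["likes"] else None,
            "comments": float(row["comments"]) if row["comments"] else None,
        })
    return cleaned


posts = preprocess(raw_rows)
```

Missing values are kept as `None` rather than replaced with zero, so that a later imputation step can tell "no comments" apart from "unknown".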

<a id = 'questions'></a>
## Key Questions Covered in this Project

* [What Are the Top 5 Posts?](#q1)
* [Which Social Media Network is most active?](#q2)
* [When Are Peak Engagement Times?](#q3)
* [What Is the Relationship Between Impressions and Engagement?](#q4)
* [Which Content Types Receive the Most Shares?](#q5)
* [How Effective Are Hashtags in Driving Engagement?](#q6)
* [How Do Different Types of Clicks Correlate with Engagement?](#q7)
* [Is There a Seasonal or Periodic Pattern in Engagement?](#q8)
* [What Is the Overall Engagement Rate and How Does It Vary Across Platforms?](#q9)
* [Most Used Words In the Social Media Posts](#q10)

<a id = 'analysis'></a>
## Analysis and Insights

<a id = 'q1'></a>
## Top Posts By Engagement Rates
![Top%20Posts%20By%20Engagement%20Rates.JPG](attachment:Top%20Posts%20By%20Engagement%20Rates.JPG)

<a id = 'q2'></a>
## Most Active Social Media Platforms

![Question2.JPG](attachment:Question2.JPG)

Two social media platforms are considered in this data. From the chart above, we see that users were far more engaged on Twitter than on LinkedIn.

<a id = 'q3'></a>
## When Are Peak Engagement Times?

![Question3.JPG](attachment:Question3.JPG)

Users of the social networks (Twitter and LinkedIn) in the given dataset were most active on Fridays and Sundays and least active on Wednesdays.

![Question3.2.JPG](attachment:Question3.2.JPG)

Users of the social networks in this dataset were most active in September, followed by April, and were least active in August.
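
Peak-time summaries of this kind come from grouping engagements by a calendar key. The following is a minimal sketch, assuming each post is a `(timestamp, engagements)` pair; the sample values are illustrative and are not taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical (timestamp, engagements) pairs; values are illustrative only.
posts = [
    (datetime(2023, 9, 1), 40),  # Friday
    (datetime(2023, 9, 3), 35),  # Sunday
    (datetime(2023, 9, 6), 5),   # Wednesday
    (datetime(2023, 9, 8), 20),  # Friday
]


def total_by(posts, key):
    """Sum engagements per group, where key maps a timestamp to a group label."""
    totals = defaultdict(int)
    for ts, engagements in posts:
        totals[key(ts)] += engagements
    return dict(totals)


by_weekday = total_by(posts, lambda ts: WEEKDAYS[ts.weekday()])
by_month = total_by(posts, lambda ts: ts.month)

peak_day = max(by_weekday, key=by_weekday.get)
quietest_day = min(by_weekday, key=by_weekday.get)
```

Whether to sum or average engagements per group is a design choice: totals favour days with many posts, while means describe the typical post on that day.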

<a id = 'q4'></a>
## What Is the Relationship Between Impressions and Engagement?

![Question4.JPG](attachment:Question4.JPG)

The regression plot above shows a positive linear relationship between the number of impressions and the number of engagements: as impressions increase, engagements tend to increase as well. However, correlation does not establish a causal relationship, and other factors may also contribute to changes in engagement.
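
A regression line and correlation coefficient like those summarised above can be computed with ordinary least squares. The sketch below is a hand-written illustration with made-up numbers, not the code behind the plot; the variable names are assumptions.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit y = slope * x + intercept, plus Pearson's r."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Hypothetical sample values, for illustration only.
impressions = [100, 200, 300, 400, 500]
engagements = [12, 18, 33, 41, 52]

slope, intercept, r = fit_line(impressions, engagements)
```

A positive `slope` and an `r` close to 1 correspond to the pattern described above; neither quantity says anything about causation.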

<a id = 'q5'></a>
## Which Content Types Receive the Most Shares?

![Question5.JPG](attachment:Question5.JPG)

From the visualisation above, posts that contain photos received the most shares, closely followed by posts that contain text, while posts that contain polls received the fewest shares.

<a id = 'q6'></a>
## How Effective Are Hashtags in Driving Engagement?

![Question6.JPG](attachment:Question6.JPG)

The mean number of engagements is higher for posts without hashtags than for posts with hashtags. In this dataset, hashtags therefore do not appear to be effective in driving engagement.
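
The comparison above amounts to splitting posts by whether their text contains a hashtag and taking the mean engagement of each group. The sketch below assumes that hashtags are detected from the post text with a regular expression; the dataset may instead provide a dedicated column, and the sample posts are invented.

```python
import re
from statistics import mean

HASHTAG = re.compile(r"#\w+")

# Hypothetical (text, engagements) pairs, for illustration only.
posts = [
    ("Visit our new branch today #launch", 10),
    ("Call us to learn more", 30),
    ("Let us help you plan #finance #tips", 14),
    ("New report out today", 22),
]


def mean_engagement_by_hashtag_use(posts):
    """Return (mean with hashtags, mean without hashtags)."""
    with_tags = [e for text, e in posts if HASHTAG.search(text)]
    without_tags = [e for text, e in posts if not HASHTAG.search(text)]
    return mean(with_tags), mean(without_tags)


mean_with, mean_without = mean_engagement_by_hashtag_use(posts)
```

A difference in means on its own does not show that the gap is statistically significant; a two-sample test would be needed to support that claim.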

<a id = 'q7'></a>
## How Do Different Types of Clicks Correlate with Engagement?

We use Spearman rank correlation to measure the association between each type of click and the engagement feature. The results show a strong positive correlation between Post Clicks (All) and Engagement, and since the p-value is less than 0.05, the correlation is statistically significant. An increase in the number of Post Clicks (All) is therefore associated with an increase in the number of engagements.

There is also a strong positive correlation between Post Link Clicks and Engagement and between Other Post Clicks and Engagement, although both are weaker than that of Post Clicks (All). Their p-values are also less than 0.05, which implies statistical significance, so an increase in Post Link Clicks or Other Post Clicks is associated with an increase in the number of engagements.

There is a weak positive correlation between Post Media Clicks and Engagement, and a moderately strong positive correlation between Post Detail Expand Clicks and Engagement and between Profile Clicks and Engagement.

Finally, there is a weak negative correlation between Post Hashtag Clicks and Engagement.

All the p-values are less than 0.05, which implies that every one of these correlations is statistically significant. As with any correlation, these results describe association and do not establish causation.
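
For readers unfamiliar with the statistic, Spearman's rank correlation is Pearson's correlation applied to the ranks of the two variables, with tied values sharing their average rank. The sketch below is a hand-written illustration on invented numbers; it does not compute p-values, for which a statistics library routine such as `scipy.stats.spearmanr` would typically be used.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical sample values, for illustration only.
post_clicks = [5, 9, 14, 20, 31]
engagements = [7, 12, 18, 40, 39]

rho = spearman(post_clicks, engagements)
```

Because it works on ranks, the coefficient captures any monotonic relationship, not only a linear one, and is less sensitive to outliers than Pearson's correlation.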

<a id = 'q8'></a>
## Is There a Seasonal or Periodic Pattern in Engagement?

![Question7.3.JPG](attachment:Question7.3.JPG)

Examining the chart provided, we can observe clear patterns of seasonality within our data. Notably, there is a dip in user engagement levels from February to April, followed by a significant surge in engagement from April to July. Engagement then continues to increase from July to September, followed by a decline from September to November. Finally, there is another upswing in engagement from November to December. We further decompose our data into trend, seasonal and residual components.

![Question7.JPG](attachment:Question7.JPG)

![Question7.2.JPG](attachment:Question7.2.JPG)
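
To make the idea of decomposition concrete, the sketch below implements a simplified classical additive decomposition, in which the series equals trend plus seasonal effect plus residual. It is an illustration under stated assumptions and not the method used to produce the charts above: it handles only odd periods (such as 7 for daily data with a weekly cycle), needs at least two full periods of data, and runs on synthetic values. Libraries such as statsmodels provide a fuller implementation in `seasonal_decompose`.

```python
def decompose_additive(series, period):
    """Simplified additive decomposition for an odd period.

    Returns (trend, seasonal, residual). Trend and residual are None at the
    edges, where the centred moving average is undefined.
    """
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    half = period // 2
    n = len(series)

    # Trend: centred moving average over one full period.
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        trend[t] = sum(window) / period

    # Seasonal: mean detrended value at each position within the period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    centre = sum(means) / period
    pattern = [m - centre for m in means]  # seasonal effects sum to zero

    seasonal = [pattern[t % period] for t in range(n)]
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Synthetic series: linear trend (0, 1, 2, ...) plus a repeating pattern [2, -1, -1].
series = [2, 0, 1, 5, 3, 4, 8, 6, 7]
trend, seasonal, residual = decompose_additive(series, period=3)
```

On this synthetic input the decomposition recovers the linear trend and the repeating pattern, leaving residuals of zero. Real engagement data would leave non-zero residuals.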

<a id = 'q9'></a>
## What Is the Overall Engagement Rate and How Does It Vary Across Platforms?

![Question8.JPG](attachment:Question8.JPG)
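
The report does not state how engagement rate is defined. The sketch below assumes it is total engagements divided by total impressions, computed per platform and overall; the field names and sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows; field names and values are illustrative only.
posts = [
    {"network": "Twitter", "engagements": 30, "impressions": 600},
    {"network": "Twitter", "engagements": 20, "impressions": 400},
    {"network": "LinkedIn", "engagements": 10, "impressions": 500},
]


def engagement_rates(posts):
    """Return engagements / impressions per network, plus an 'overall' rate."""
    totals = defaultdict(lambda: [0, 0])  # network -> [engagements, impressions]
    for p in posts:
        totals[p["network"]][0] += p["engagements"]
        totals[p["network"]][1] += p["impressions"]

    # Networks with zero impressions are skipped to avoid division by zero.
    rates = {net: e / i for net, (e, i) in totals.items() if i > 0}

    all_engagements = sum(e for e, _ in totals.values())
    all_impressions = sum(i for _, i in totals.values())
    if all_impressions > 0:
        rates["overall"] = all_engagements / all_impressions
    return rates


rates = engagement_rates(posts)
```

Dividing summed engagements by summed impressions weights each post by its reach. Averaging per-post rates instead would weight every post equally and can give a different figure.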

<a id = 'q10'></a>
## Most Used Words in Social Media Posts

![Question9.JPG](attachment:Question9.JPG)

From the visualisation above, we can see the most commonly used words in the posts, excluding **stopwords**. Frequently used words include, but are not limited to, "visit", "call", "let" and "today".
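
Word counts of this kind can be produced by tokenising each post, dropping stopwords and counting what remains. The sketch below is a minimal illustration: the stopword list is a small hypothetical one (a real analysis would use a fuller list from an NLP library) and the sample posts are invented.

```python
import re
from collections import Counter

# Small hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "our", "the", "to", "us", "you", "your"}


def top_words(texts, n=5):
    """Return the n most frequent non-stopword tokens as (word, count) pairs."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical post texts, for illustration only.
posts = [
    "Visit our branch today",
    "Call us today for a quote",
    "Let us help you plan your visit",
]

top = top_words(posts, n=2)
```

Lowercasing before counting ensures that "Visit" and "visit" are treated as the same word.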