In [74]:
import pandas as pd

In [75]:
df=pd.read_csv('clean_social_posts.csv')

In [76]:
df.head()

Unnamed: 0,post_id,account_id,account_type,follower_count,media_type,content_category,traffic_source,has_call_to_action,post_datetime,post_date,...,shares,saves,reach,impressions,engagement_rate,followers_gained,caption_length,hashtags_count,performance_bucket_label,calc_engagement_rate
0,IG0000001,7,brand,3551,reel,technology,home feed,1,2024-11-30 06:00:00,2024-11-30,...,7,34,4327,6230,0.0385,899,100,7,medium,0.038523
1,IG0000002,20,creator,31095,image,fitness,hashtags,1,2025-08-15 15:00:00,2025-08-15,...,21,68,7451,8268,0.0663,805,122,5,viral,0.06628
2,IG0000003,15,brand,8167,reel,beauty,reels feed,0,2025-09-11 16:00:00,2025-09-11,...,1,22,1639,2616,0.0531,758,115,8,high,0.053135
3,IG0000004,11,creator,9044,carousel,music,external,0,2025-09-18 03:00:00,2025-09-18,...,7,0,2877,3171,0.0309,402,115,7,medium,0.030905
4,IG0000005,8,creator,15986,reel,technology,profile,0,2025-03-21 09:00:00,2025-03-21,...,5,21,5350,8503,0.0221,155,112,9,low,0.02211


In [77]:
df.nunique()

Unnamed: 0,0
post_id,29999
account_id,20
account_type,2
follower_count,20
media_type,3
content_category,10
traffic_source,6
has_call_to_action,2
post_datetime,8475
post_date,366


In [78]:
df.shape

(29999, 24)

In [79]:
df.isnull().sum()

Unnamed: 0,0
post_id,0
account_id,0
account_type,0
follower_count,0
media_type,0
content_category,0
traffic_source,0
has_call_to_action,0
post_datetime,0
post_date,0


In [80]:
df.duplicated().sum()

np.int64(0)

In [81]:
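# A notebook cell only displays its last expression, so both zero checks
# are combined into a single count of posts with zero impressions or zero reach.
((df['impressions'] == 0) | (df['reach'] == 0)).sum()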

np.int64(0)

In [82]:
(df['reach'] > df['impressions']).sum()

np.int64(0)

In [83]:
df[['likes','comments','shares','saves']].describe()

Unnamed: 0,likes,comments,shares,saves
count,29999.0,29999.0,29999.0,29999.0
mean,287.653588,8.521917,14.426614,42.517284
std,317.647682,10.116505,16.420899,47.808844
min,0.0,0.0,0.0,0.0
25%,104.0,3.0,5.0,15.0
50%,199.0,6.0,10.0,29.0
75%,363.0,11.0,19.0,54.0
max,10632.0,339.0,516.0,1542.0


In [84]:
df['calc_engagement_rate'] = (
    df['likes'] + df['comments'] + df['shares'] + df['saves']
) / df['impressions']

In [85]:
(df['calc_engagement_rate'] - df['engagement_rate']).abs().describe()

Unnamed: 0,0
count,29999.0
mean,2.5e-05
std,1.4e-05
min,0.0
25%,1.2e-05
50%,2.5e-05
75%,3.8e-05
max,5e-05


In [86]:
df[df['impressions'] == 0].shape

(0, 24)

“Impressions represent how many times content was displayed on users’ screens.
Since engagement rate is calculated using impressions as the denominator, I validated that impressions were never zero for published posts to avoid invalid ratios. I also ensured impressions were always greater than or equal to reach, which is a necessary condition in social analytics.”
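
The checks described above can be bundled into one function, sketched here in plain Python so it runs without the dataset. The rows are made-up values, and `validate_post` and its `tol` parameter are hypothetical names. The default tolerance of 5e-5 assumes the reported `engagement_rate` is rounded to four decimals, which is consistent with the maximum difference seen in the comparison above.

```python
def validate_post(row, tol=5e-5):
    """Return a list of problems found in one post record (empty if clean)."""
    problems = []
    if row["impressions"] <= 0:
        problems.append("impressions must be positive")
        return problems  # the rate below would divide by zero
    if row["reach"] > row["impressions"]:
        problems.append("reach exceeds impressions")
    interactions = row["likes"] + row["comments"] + row["shares"] + row["saves"]
    recomputed = interactions / row["impressions"]
    # The tiny epsilon guards against float error right at the rounding boundary.
    if abs(recomputed - row["engagement_rate"]) > tol + 1e-12:
        problems.append("engagement_rate does not match recomputed value")
    return problems


# Made-up rows for illustration only (not taken from the dataset).
rows = [
    {"likes": 200, "comments": 10, "shares": 5, "saves": 25,
     "reach": 4000, "impressions": 6000, "engagement_rate": 0.04},
    {"likes": 50, "comments": 2, "shares": 1, "saves": 3,
     "reach": 900, "impressions": 800, "engagement_rate": 0.07},
    {"likes": 0, "comments": 0, "shares": 0, "saves": 0,
     "reach": 0, "impressions": 0, "engagement_rate": 0.0},
]

for row in rows:
    print(validate_post(row))
```

Returning a list of messages rather than raising on the first failure makes it easy to count how many posts fail each rule.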

In [87]:
df.columns = (
    df.columns
    .str.strip()
    .str.lower()
    .str.replace(" ", "_")
)

In [88]:
cat_cols = [
    'media_type',
    'content_category',
    'traffic_source',
    'account_type',
    'performance_bucket_label'
]

for col in cat_cols:
    df[col] = df[col].str.strip().str.lower()

In [89]:
df.to_csv("clean_social_posts.csv", index=False)

In [90]:
!pip install google-cloud-bigquery



In [91]:
from google.cloud import bigquery

In [92]:
from google.colab import auth
auth.authenticate_user()

In [93]:
client = bigquery.Client(project="instagram-485813")

In [94]:
query = """
SELECT COUNT(*) AS total_rows
FROM `instagram-485813.instagram.ins`
"""
client.query(query).to_dataframe()

Unnamed: 0,total_rows
0,29999


# Which media type drives the highest engagement rate?

In [95]:
query = """
SELECT
  media_type,
  COUNT(*) AS total_posts,
  ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate
FROM `instagram-485813.instagram.ins`
GROUP BY media_type
ORDER BY avg_engagement_rate DESC;
"""

df_media_engagement = client.query(query).to_dataframe()
df_media_engagement

Unnamed: 0,media_type,total_posts,avg_engagement_rate
0,reel,7445,0.0423
1,image,11927,0.0423
2,carousel,10627,0.0418


Media Type Performance
Reel and image posts have the same average engagement rate to four decimals (0.0423), and carousel posts trail by only 0.0005 (0.0418). Static images therefore perform about as well as video in this dataset. The carousel gap may reflect drop-off in multi-slide content, but it is too small to attribute to that without a significance test. Sample sizes are large but not balanced (7,445 reels, 10,627 carousels, 11,927 images), so the averages are stable even though the formats are unevenly represented.
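
Whether a gap as small as 0.0005 is distinguishable from noise depends on the spread of `engagement_rate` within each media type, which the query above does not report. A minimal sketch of a two-sample z-statistic follows. `welch_z` is a hypothetical helper, and `ASSUMED_SD` is a placeholder value, not something measured from the data.

```python
import math


def welch_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """z-statistic for the difference between two independent group means."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Means and post counts come from the query output above.
# ASSUMED_SD is a placeholder, NOT a measured value: replace it with the
# per-media-type STDDEV(engagement_rate) before drawing any conclusion.
ASSUMED_SD = 0.02
z = welch_z(0.0423, ASSUMED_SD, 7445, 0.0418, ASSUMED_SD, 10627)
print(f"reel vs carousel: z = {z:.2f}")
```

As a rule of thumb, an absolute z above 1.96 corresponds to roughly p < 0.05 in a two-sided test. The result is only as good as the standard deviations fed in.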

# Does having a call-to-action (CTA) improve engagement and follower growth?

In [96]:
query = """
SELECT
  has_call_to_action,
  COUNT(*) AS total_posts,
  ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate,
  ROUND(AVG(followers_gained), 2) AS avg_followers_gained
FROM `instagram-485813.instagram.ins`
GROUP BY has_call_to_action
ORDER BY avg_engagement_rate DESC;
"""

df_cta = client.query(query).to_dataframe()
df_cta

Unnamed: 0,has_call_to_action,total_posts,avg_engagement_rate,avg_followers_gained
0,0,19536,0.0422,501.84
1,1,10463,0.0419,502.74


Impact of Call-to-Action (CTA)
Posts with a CTA do not show a higher engagement rate than posts without one (0.0419 vs 0.0422), so explicit prompts do not appear to drive likes or comments. CTA posts gain marginally more followers on average (502.74 vs 501.84), a difference of about 0.2%, so any role CTAs play in converting viewers into followers is small at best.

# What is the optimal posting time for maximizing engagement?

In [97]:
query = """
SELECT
  post_hour,
  COUNT(*) AS total_posts,
  ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate
FROM `instagram-485813.instagram.ins`
GROUP BY post_hour
ORDER BY avg_engagement_rate DESC;
"""

df_hourly = client.query(query).to_dataframe()
df_hourly

Unnamed: 0,post_hour,total_posts,avg_engagement_rate
0,3,1256,0.0434
1,8,1242,0.0433
2,2,1195,0.0433
3,20,1272,0.0427
4,17,1242,0.0427
5,14,1282,0.0426
6,4,1269,0.0424
7,5,1216,0.0424
8,9,1251,0.0422
9,16,1210,0.0422


Engagement rates remain relatively stable across posting hours, with only marginal differences between peak and low-performing times. This suggests that posting time alone is not a dominant driver of engagement. Instead, content characteristics likely play a more significant role in performance, and timing optimizations may offer only incremental gains.

# Does optimal posting time differ by media type?

In [98]:
query = """
SELECT
  media_type,
  post_hour,
  COUNT(*) AS total_posts,
  ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate
FROM `instagram-485813.instagram.ins`
GROUP BY media_type, post_hour
HAVING COUNT(*) > 300
ORDER BY media_type, avg_engagement_rate DESC;
"""

df_media_hour = client.query(query).to_dataframe()
df_media_hour

Unnamed: 0,media_type,post_hour,total_posts,avg_engagement_rate
0,carousel,13,436,0.0436
1,carousel,20,449,0.0436
2,carousel,11,434,0.0432
3,carousel,16,414,0.0429
4,carousel,21,449,0.0428
...,...,...,...,...
61,reel,6,318,0.0407
62,reel,13,327,0.0407
63,reel,12,324,0.0404
64,reel,10,334,0.0402


In [99]:
query = """
WITH ranked_hours AS (
  SELECT
    media_type,
    post_hour,
    COUNT(*) AS total_posts,
    ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate,
    RANK() OVER (
      PARTITION BY media_type
      ORDER BY AVG(engagement_rate) DESC
    ) AS rank_in_media
  FROM `instagram-485813.instagram.ins`
  GROUP BY media_type, post_hour
  HAVING COUNT(*) > 300
)

SELECT *
FROM ranked_hours
WHERE rank_in_media <= 3
ORDER BY media_type, rank_in_media;
"""

df_top_hours = client.query(query).to_dataframe()
df_top_hours

Unnamed: 0,media_type,post_hour,total_posts,avg_engagement_rate,rank_in_media
0,carousel,13,436,0.0436,1
1,carousel,20,449,0.0436,2
2,carousel,11,434,0.0432,3
3,image,8,516,0.0445,1
4,image,12,478,0.0444,2
5,image,3,468,0.0441,3
6,reel,17,330,0.0447,1
7,reel,9,307,0.0443,2
8,reel,4,305,0.0441,3


In [100]:
query = """
WITH ranked_hours AS (
  SELECT
    media_type,
    post_hour,
    COUNT(*) AS total_posts,
    ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate,
    RANK() OVER (
      PARTITION BY media_type
      ORDER BY AVG(engagement_rate) ASC
    ) AS rank_low_engagement
  FROM `instagram-485813.instagram.ins`
  GROUP BY media_type, post_hour
  HAVING COUNT(*) > 300
)

SELECT *
FROM ranked_hours
WHERE rank_low_engagement <= 3
ORDER BY media_type, rank_low_engagement;
"""

df_low_hours = client.query(query).to_dataframe()
df_low_hours


Unnamed: 0,media_type,post_hour,total_posts,avg_engagement_rate,rank_low_engagement
0,carousel,22,443,0.0391,1
1,carousel,23,431,0.0403,2
2,carousel,9,435,0.0405,3
3,image,21,459,0.0405,1
4,image,15,498,0.0406,2
5,image,7,506,0.0406,3
6,reel,1,302,0.04,1
7,reel,10,334,0.0402,2
8,reel,12,324,0.0404,3


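Different media types show slightly different high-performing hours (13 and 20 for carousels, 8 and 12 for images, 17 and 9 for reels), but engagement differences across hours remain modest. This suggests that format-specific scheduling may provide incremental gains, while content quality likely remains the primary performance driver.

The weakest hours also differ by format: 22, 23, and 9 for carousels; 21, 15, and 7 for images; 1, 10, and 12 for reels. No hour falls in the bottom three for more than one media type, and hours 9 and 12 are bottom-three for one format but top-three for another. The late-evening slots (21 to 23) are the closest thing to a shared weak window, for carousels and images. While the absolute differences are modest, these windows may be deprioritized when scheduling flexibility exists.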
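
How far the strong and weak windows overlap between formats can be checked directly from the two ranked outputs. The sketch below copies the top-three and bottom-three hours from those tables into plain Python sets. The variable names are illustrative.

```python
from itertools import combinations

# Top-three and bottom-three posting hours per media type,
# copied from the two ranked query outputs above.
top_hours = {"carousel": {13, 20, 11}, "image": {8, 12, 3}, "reel": {17, 9, 4}}
low_hours = {"carousel": {22, 23, 9}, "image": {21, 15, 7}, "reel": {1, 10, 12}}

# Hours that land in the bottom three for more than one media type.
shared_low = {
    (a, b): low_hours[a] & low_hours[b]
    for a, b in combinations(low_hours, 2)
}

# Hours that are top-three for one media type but bottom-three for another.
conflicts = sorted(
    (hour, good, bad)
    for good, hours in top_hours.items()
    for bad in low_hours
    if bad != good
    for hour in hours & low_hours[bad]
)

print(shared_low)
print(conflicts)
```

With these values every pairwise intersection in `shared_low` is empty. `conflicts` contains hour 9 (top-three for reels, bottom-three for carousels) and hour 12 (top-three for images, bottom-three for reels).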

# Which content categories convert reach into followers most effectively?

In [101]:
query = """
SELECT
  content_category,
  COUNT(*) AS total_posts,
  ROUND(AVG(followers_gained / NULLIF(impressions, 0) * 1000), 2)
    AS followers_per_1k_impressions
FROM `instagram-485813.instagram.ins`
GROUP BY content_category
HAVING COUNT(*) > 300
ORDER BY followers_per_1k_impressions DESC;
"""

df_growth_efficiency = client.query(query).to_dataframe()
df_growth_efficiency

Unnamed: 0,content_category,total_posts,followers_per_1k_impressions
0,music,3003,100.69
1,lifestyle,3017,100.04
2,food,3010,99.01
3,beauty,2953,98.62
4,fitness,3004,98.49
5,fashion,3034,98.14
6,comedy,2950,98.0
7,technology,3025,96.91
8,travel,2968,96.81
9,photography,3035,95.3


**Content Category Conversion Efficiency**
When follower growth is normalized by impressions, content categories differ only modestly in conversion efficiency: the range runs from 95.30 (photography) to 100.69 (music) followers per 1,000 impressions, so the best category sits about 5.7% above the worst. Music and lifestyle convert impressions into followers most effectively, while photography and travel convert least. Differences of this size can still add up at scale, which makes content mix worth reviewing for sustained account growth.
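
The query averages the per-post ratio, so every post carries equal weight regardless of how many impressions it received. A pooled ratio (total followers over total impressions) weights high-impression posts more heavily and can give a different number. The sketch below contrasts the two on made-up posts. Nothing here is taken from the dataset.

```python
# Made-up posts for illustration only: (followers_gained, impressions).
posts = [(90, 500), (100, 1000), (400, 8000)]

# What AVG(followers_gained / impressions * 1000) computes:
# each post's ratio counts equally.
mean_of_ratios = sum(f / i * 1000 for f, i in posts) / len(posts)

# Pooled alternative, SUM(followers_gained) / SUM(impressions) * 1000:
# posts with many impressions dominate the result.
ratio_of_sums = sum(f for f, _ in posts) / sum(i for _, i in posts) * 1000

print(round(mean_of_ratios, 2), round(ratio_of_sums, 2))
```

On these three posts the mean of ratios is about 110 while the pooled ratio is about 62. Which one is appropriate depends on whether the question is about a typical post or about total impressions spent.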

# Engagement vs Growth Trade-off by Content Category

In [102]:
query = """
WITH category_metrics AS (
  SELECT
    content_category,
    COUNT(*) AS total_posts,
    ROUND(AVG(engagement_rate), 4) AS avg_engagement_rate,
    ROUND(
      AVG(followers_gained / NULLIF(impressions, 0) * 1000),
      2
    ) AS followers_per_1k_impressions
  FROM `instagram-485813.instagram.ins`
  GROUP BY content_category
  HAVING COUNT(*) > 300
)

SELECT *
FROM category_metrics
ORDER BY avg_engagement_rate DESC;
"""

df_engagement_vs_growth = client.query(query).to_dataframe()
df_engagement_vs_growth


Unnamed: 0,content_category,total_posts,avg_engagement_rate,followers_per_1k_impressions
0,music,3003,0.0428,100.69
1,fitness,3004,0.0427,98.49
2,fashion,3034,0.0426,98.14
3,beauty,2953,0.0422,98.62
4,food,3010,0.0421,99.01
5,technology,3025,0.042,96.91
6,comedy,2950,0.0419,98.0
7,travel,2968,0.0417,96.81
8,lifestyle,3017,0.0416,100.04
9,photography,3035,0.0415,95.3


Engagement vs Growth Trade-off by Content Category
Engagement rate and follower conversion do not rank categories the same way. Music leads on both (0.0428 engagement rate, 100.69 followers per 1,000 impressions). Fitness and fashion rank second and third on engagement but only fifth and sixth on conversion, while lifestyle ranks ninth on engagement yet second on conversion. The gaps are small in absolute terms, but they suggest aligning content strategy with specific objectives: engagement-driven categories for retention and growth-efficient categories for acquisition-focused campaigns.
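
One way to make the trade-off concrete is to rank each category on both metrics and look at the difference. The sketch below copies the two columns from the output above. `rank_by` and `rank_gap` are illustrative names.

```python
# (avg_engagement_rate, followers_per_1k_impressions) per category,
# copied from the query output above.
metrics = {
    "music": (0.0428, 100.69), "fitness": (0.0427, 98.49),
    "fashion": (0.0426, 98.14), "beauty": (0.0422, 98.62),
    "food": (0.0421, 99.01), "technology": (0.0420, 96.91),
    "comedy": (0.0419, 98.00), "travel": (0.0417, 96.81),
    "lifestyle": (0.0416, 100.04), "photography": (0.0415, 95.30),
}


def rank_by(position):
    """Map category -> rank (1 = highest) on one of the two metrics."""
    ordered = sorted(metrics, key=lambda c: metrics[c][position], reverse=True)
    return {category: rank for rank, category in enumerate(ordered, start=1)}


engagement_rank = rank_by(0)
conversion_rank = rank_by(1)

# Positive gap: the category ranks better on conversion than on engagement.
rank_gap = {c: engagement_rank[c] - conversion_rank[c] for c in metrics}

for category, gap in sorted(rank_gap.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:12s} engagement #{engagement_rank[category]:<2d} "
          f"conversion #{conversion_rank[category]:<2d} gap {gap:+d}")
```

Lifestyle has the largest positive gap (+7) and fitness and fashion the largest negative gaps (-3 each). Music, beauty, comedy, and photography hold the same rank on both metrics.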

# Are high-growth content categories underutilized?

In [103]:
query = """
WITH category_stats AS (
  SELECT
    content_category,
    COUNT(*) AS total_posts,
    ROUND(
      AVG(followers_gained / NULLIF(impressions, 0) * 1000),
      2
    ) AS followers_per_1k_impressions
  FROM `instagram-485813.instagram.ins`
  GROUP BY content_category
),
overall_posts AS (
  SELECT COUNT(*) AS total_posts_all
  FROM `instagram-485813.instagram.ins`
)

SELECT
  c.content_category,
  c.total_posts,
  ROUND(c.total_posts / o.total_posts_all * 100, 2) AS content_share_pct,
  c.followers_per_1k_impressions
FROM category_stats c
CROSS JOIN overall_posts o
ORDER BY followers_per_1k_impressions DESC;
"""

df_category_mix = client.query(query).to_dataframe()
df_category_mix


Unnamed: 0,content_category,total_posts,content_share_pct,followers_per_1k_impressions
0,music,3003,10.01,100.69
1,lifestyle,3017,10.06,100.04
2,food,3010,10.03,99.01
3,beauty,2953,9.84,98.62
4,fitness,3004,10.01,98.49
5,fashion,3034,10.11,98.14
6,comedy,2950,9.83,98.0
7,technology,3025,10.08,96.91
8,travel,2968,9.89,96.81
9,photography,3035,10.12,95.3


Despite differences in follower conversion efficiency across content categories, the current content mix allocates posting effort almost uniformly: each category holds between 9.83% and 10.12% of posts. This points to a possible opportunity to shift content production toward higher-growth categories without increasing total posting volume.

# Does content performance vary by account?

In [104]:
query = """
WITH account_category_growth AS (
  SELECT
    account_id,
    content_category,
    COUNT(*) AS total_posts,
    ROUND(
      AVG(followers_gained / NULLIF(impressions, 0) * 1000),
      2
    ) AS followers_per_1k_impressions
  FROM `instagram-485813.instagram.ins`
  GROUP BY account_id, content_category
  HAVING COUNT(*) > 100
),
ranked_categories AS (
  SELECT
    *,
    RANK() OVER (
      PARTITION BY account_id
      ORDER BY followers_per_1k_impressions DESC
    ) AS rank_in_account
  FROM account_category_growth
)

SELECT *
FROM ranked_categories
WHERE rank_in_account <= 3
ORDER BY account_id, rank_in_account;
"""

df_account_strategy = client.query(query).to_dataframe()
df_account_strategy


Unnamed: 0,account_id,content_category,total_posts,followers_per_1k_impressions,rank_in_account
0,1,technology,170,117.02,1
1,1,comedy,130,110.59,2
2,1,lifestyle,160,103.67,3
3,2,music,150,119.52,1
4,2,technology,135,105.43,2
5,2,fitness,137,105.34,3
6,3,food,161,112.11,1
7,3,fashion,159,108.66,2
8,3,beauty,140,106.18,3
9,4,food,148,114.81,1


# Which content categories are reliably good vs occasionally viral but risky?

engagement = likes + comments + saves + shares

In [105]:
query = """
WITH post_engagement AS (
  SELECT
    content_category,
    (likes + comments + saves + shares) AS engagement
  FROM `instagram-485813.instagram.ins`
),

stats AS (
  SELECT
    content_category,
    COUNT(*) AS post_count,
    AVG(engagement) AS avg_engagement,
    STDDEV(engagement) AS std_engagement
  FROM post_engagement
  GROUP BY content_category
)

SELECT
  content_category,
  post_count,
  ROUND(avg_engagement, 2) AS avg_engagement,
  ROUND(std_engagement, 2) AS std_engagement,
  ROUND(std_engagement / NULLIF(avg_engagement, 0), 2) AS coeff_variation
FROM stats
ORDER BY coeff_variation ASC;
"""
df_category_consistency = client.query(query).to_dataframe()
df_category_consistency


Unnamed: 0,content_category,post_count,avg_engagement,std_engagement,coeff_variation
0,travel,2968,343.22,348.89,1.02
1,beauty,2953,354.5,368.67,1.04
2,fitness,3004,354.66,371.63,1.05
3,technology,3025,340.8,356.38,1.05
4,lifestyle,3017,350.5,367.57,1.05
5,comedy,2950,355.18,375.89,1.06
6,music,3003,358.46,379.61,1.06
7,photography,3035,356.43,413.68,1.16
8,fashion,3034,363.92,428.76,1.18
9,food,3010,353.43,455.04,1.29


Travel, Beauty, Fitness, Technology, and Lifestyle deliver the most consistent engagement relative to their means (coefficient of variation 1.02 to 1.05), which makes them the more predictable choices for scalable growth. Food, Fashion, and Photography are more volatile (1.16 to 1.29), which points to occasional viral spikes rather than steady performance. Every category has a coefficient of variation above 1, so per-post engagement is widely dispersed everywhere and consistency here is relative. High average engagement can therefore be misleading, and content strategies should prioritize consistency over sporadic virality for long-term growth.
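
The coefficient of variation is the standard deviation divided by the mean. The sketch below first recomputes it from the rounded summary columns above as a sanity check, then shows on made-up per-post totals how two series with the same mean can differ sharply in volatility. `coeff_variation` is an illustrative helper.

```python
import statistics


def coeff_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# (avg_engagement, std_engagement) copied from the query output above.
summary = {
    "travel": (343.22, 348.89),
    "photography": (356.43, 413.68),
    "fashion": (363.92, 428.76),
    "food": (353.43, 455.04),
}
cv_from_summary = {c: round(std / avg, 2) for c, (avg, std) in summary.items()}
print(cv_from_summary)

# Made-up per-post engagement totals: same mean (300), very different spread.
steady = [280, 300, 320, 290, 310]
spiky = [40, 60, 50, 70, 1280]
print(round(coeff_variation(steady), 2), round(coeff_variation(spiky), 2))
```

`statistics.stdev` is the sample standard deviation. BigQuery's `STDDEV` is, to my knowledge, an alias for `STDDEV_SAMP`, so the two should agree on the same data.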

# Which categories turn visibility into followers efficiently?

In [106]:
query = """
SELECT
  content_category,
  COUNT(*) AS post_count,
  ROUND(
    AVG(followers_gained / NULLIF(impressions, 0) * 1000),
    2
  ) AS followers_per_1k_impressions
FROM `instagram-485813.instagram.ins`
GROUP BY content_category
ORDER BY followers_per_1k_impressions DESC;
"""
df_category_conversion = client.query(query).to_dataframe()
df_category_conversion


Unnamed: 0,content_category,post_count,followers_per_1k_impressions
0,music,3003,100.69
1,lifestyle,3017,100.04
2,food,3010,99.01
3,beauty,2953,98.62
4,fitness,3004,98.49
5,fashion,3034,98.14
6,comedy,2950,98.0
7,technology,3025,96.91
8,travel,2968,96.81
9,photography,3035,95.3


Music and Lifestyle convert impressions into followers most efficiently, while Photography and Travel show the weakest conversion. Notably, a high-volatility category like Food still converts well (99.01, third highest), which suggests it can be effective for growth spikes but is less reliable for steady scaling. Overall, the strongest long-term strategy is to prioritize categories that combine reasonable consistency with high follower conversion, rather than optimizing for engagement or reach alone.

# Are we getting followers because people LIKE the content, or because they SAVE / SHARE it?

In [107]:
query = """
SELECT
  content_category,
  COUNT(*) AS post_count,

  ROUND(AVG(likes / NULLIF(impressions, 0) * 1000), 2) AS likes_per_1k_impressions,
  ROUND(AVG(comments / NULLIF(impressions, 0) * 1000), 2) AS comments_per_1k_impressions,
  ROUND(AVG(saves / NULLIF(impressions, 0) * 1000), 2) AS saves_per_1k_impressions,
  ROUND(AVG(shares / NULLIF(impressions, 0) * 1000), 2) AS shares_per_1k_impressions,

  ROUND(AVG(followers_gained / NULLIF(impressions, 0) * 1000), 2)
    AS followers_per_1k_impressions

FROM `instagram-485813.instagram.ins`
GROUP BY content_category
ORDER BY followers_per_1k_impressions DESC;
"""
df_engagement_quality = client.query(query).to_dataframe()
df_engagement_quality


Unnamed: 0,content_category,post_count,likes_per_1k_impressions,comments_per_1k_impressions,saves_per_1k_impressions,shares_per_1k_impressions,followers_per_1k_impressions
0,music,3003,34.91,1.05,5.11,1.74,100.69
1,lifestyle,3017,33.93,1.01,4.99,1.72,100.04
2,food,3010,34.25,1.02,5.07,1.73,99.01
3,beauty,2953,34.32,1.04,5.11,1.73,98.62
4,fitness,3004,34.86,1.01,5.1,1.75,98.49
5,fashion,3034,34.68,1.03,5.15,1.75,98.14
6,comedy,2950,34.06,1.0,5.08,1.71,98.0
7,technology,3025,34.27,0.99,5.06,1.7,96.91
8,travel,2968,33.9,1.0,5.05,1.72,96.81
9,photography,3035,33.78,0.99,4.99,1.69,95.3


Conclusion:
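Likes, comments, saves, and shares per 1,000 impressions each vary by only about 3 to 6% across categories, so no single interaction type separates the categories sharply. The table gives only weak support for the idea that saves and shares drive follower growth more than likes do: lifestyle has the joint-lowest save rate (4.99) but the second-highest conversion, and fashion has the highest save rate (5.15) but ranks sixth. Value-driven engagement (saves and shares) remains a plausible growth signal, but confirming it needs post-level analysis rather than ten category averages.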
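
A quick way to see how each interaction type lines up with follower conversion is a correlation across the ten category rows. The sketch below uses the values from the output above and a hand-written `pearson` helper (an illustrative name). With only ten aggregated points, the result describes association between category averages, not causation at the post level.

```python
import math

# Per-1k-impression rates by category, copied from the output above:
# (likes, comments, saves, shares, followers)
rates = {
    "music": (34.91, 1.05, 5.11, 1.74, 100.69),
    "lifestyle": (33.93, 1.01, 4.99, 1.72, 100.04),
    "food": (34.25, 1.02, 5.07, 1.73, 99.01),
    "beauty": (34.32, 1.04, 5.11, 1.73, 98.62),
    "fitness": (34.86, 1.01, 5.10, 1.75, 98.49),
    "fashion": (34.68, 1.03, 5.15, 1.75, 98.14),
    "comedy": (34.06, 1.00, 5.08, 1.71, 98.00),
    "technology": (34.27, 0.99, 5.06, 1.70, 96.91),
    "travel": (33.90, 1.00, 5.05, 1.72, 96.81),
    "photography": (33.78, 0.99, 4.99, 1.69, 95.30),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


followers = [row[4] for row in rates.values()]
correlations = {}
for index, name in enumerate(["likes", "comments", "saves", "shares"]):
    interaction = [row[index] for row in rates.values()]
    correlations[name] = pearson(interaction, followers)
    print(f"{name:9s} r = {correlations[name]:+.2f}")
```

Running the same comparison on post-level rows would be a stronger test than these category averages.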

# What content should the creator STOP doing because it consumes impressions but fails to grow followers?

In [108]:
query = """
WITH category_metrics AS (
  SELECT
    content_category,
    COUNT(*) AS post_count,
    AVG(likes + comments + saves + shares) AS avg_engagement,
    AVG(followers_gained / NULLIF(impressions, 0) * 1000)
      AS followers_per_1k_impressions
  FROM `instagram-485813.instagram.ins`
  GROUP BY content_category
),

overall_avg AS (
  SELECT
    AVG(followers_per_1k_impressions) AS avg_conversion
  FROM category_metrics
)

SELECT
  c.content_category,
  c.post_count,
  ROUND(c.avg_engagement, 2) AS avg_engagement,
  ROUND(c.followers_per_1k_impressions, 2)
    AS followers_per_1k_impressions
FROM category_metrics c
CROSS JOIN overall_avg o
WHERE c.followers_per_1k_impressions < o.avg_conversion
ORDER BY c.followers_per_1k_impressions ASC;
"""


In [109]:
df_stop_doing = client.query(query).to_dataframe()
df_stop_doing


Unnamed: 0,content_category,post_count,avg_engagement,followers_per_1k_impressions
0,photography,3035,356.43,95.3
1,travel,2968,343.22,96.81
2,technology,3025,340.8,96.91
3,comedy,2950,355.18,98.0
4,fashion,3034,363.92,98.14


**“STOP DOING” — Final Conclusion**

Despite generating healthy engagement, the following categories show below-average follower conversion and should not be prioritized as core growth drivers:

Photography – high engagement volatility (coefficient of variation 1.16, third highest after Food and Fashion) and lowest follower conversion

Travel – consistent engagement but weak audience conversion

Technology – stable but underperforms in growth efficiency

Comedy – good engagement, limited follower payoff

Fashion – strong engagement but fails to translate into proportional growth

Key takeaway:
High engagement does not guarantee audience growth. These categories consume impressions and effort but deliver lower growth ROI, making them better suited for occasional reach or branding content rather than sustained follower acquisition.

“Our analysis shows that several high-engagement categories actually underperform on follower growth, so optimizing for engagement alone can lead to false confidence.”

In [110]:
query = """
WITH category_metrics AS (
  SELECT
    content_category,
    COUNT(*) AS post_count,
    AVG(likes + comments + saves + shares) AS avg_engagement,
    AVG(followers_gained / NULLIF(impressions, 0) * 1000)
      AS followers_per_1k_impressions
  FROM `instagram-485813.instagram.ins`
  GROUP BY content_category
),

overall_avg AS (
  SELECT
    AVG(followers_per_1k_impressions) AS avg_conversion
  FROM category_metrics
)

SELECT
  c.content_category,
  c.post_count,
  ROUND(c.avg_engagement, 2) AS avg_engagement,
  ROUND(c.followers_per_1k_impressions, 2)
    AS followers_per_1k_impressions
FROM category_metrics c
CROSS JOIN overall_avg o
WHERE c.followers_per_1k_impressions > o.avg_conversion
ORDER BY c.followers_per_1k_impressions DESC;
"""


In [111]:
df_start_doing = client.query(query).to_dataframe()
df_start_doing


Unnamed: 0,content_category,post_count,avg_engagement,followers_per_1k_impressions
0,music,3003,358.46,100.69
1,lifestyle,3017,350.5,100.04
2,food,3010,353.43,99.01
3,beauty,2953,354.5,98.62
4,fitness,3004,354.66,98.49


Using the same benchmark (the mean of the category-level followers gained per 1k impressions, about 98.2), the following categories sit above the dataset average and should be prioritized:

Music – highest follower conversion and strong engagement

Lifestyle – consistent performance with high conversion efficiency

Food – converts impressions well despite higher volatility

Beauty – stable engagement with above-average growth

Fitness – reliable and growth-oriented

Why these are strong:
All five categories exceed the dataset’s average follower conversion rate, meaning they turn impressions into followers more efficiently than other categories.
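
The benchmark split used in the two queries can be reproduced from the rounded category values. The sketch below copies `followers_per_1k_impressions` from the outputs above, so the benchmark may differ from the SQL result in the last decimals, since the SQL works on unrounded averages. The variable names are illustrative.

```python
from statistics import mean

# followers_per_1k_impressions per category, copied from the outputs above
# (rounded to two decimals, unlike the unrounded values the SQL compares).
conversion = {
    "music": 100.69, "lifestyle": 100.04, "food": 99.01, "beauty": 98.62,
    "fitness": 98.49, "fashion": 98.14, "comedy": 98.00,
    "technology": 96.91, "travel": 96.81, "photography": 95.30,
}

# Benchmark: unweighted mean of the ten category-level averages.
benchmark = mean(conversion.values())

prioritize = sorted(
    (c for c, v in conversion.items() if v > benchmark),
    key=conversion.get, reverse=True,
)
deprioritize = sorted(
    (c for c, v in conversion.items() if v < benchmark),
    key=conversion.get,
)

print(f"benchmark = {benchmark:.2f}")
for category in prioritize + deprioritize:
    print(f"{category:12s} {conversion[category] - benchmark:+.2f}")
```

Printing the margin against the benchmark shows how close the middle categories are to the cut-off: fashion and comedy fall below it by less than a quarter of a point, and fitness clears it by about 0.3.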

This analysis shows that engagement alone is not a reliable indicator of growth. By using followers gained per 1k impressions as the primary metric and benchmarking categories against the overall average, we identified clear winners and losers in content strategy.

Music, Lifestyle, Food, Beauty, and Fitness convert impressions into followers at above-average rates and should be prioritized for scaling. In contrast, Photography, Travel, Technology, Comedy, and Fashion generate engagement but fall below the average on follower conversion, making them less efficient for sustained growth and better suited for occasional reach or branding. The margins near the benchmark are thin (Fashion at 98.14 and Fitness at 98.49 sit on either side of roughly 98.2), so the middle of the ranking is closer to a tie than a verdict.

Overall, the key insight is that high engagement alone does not guarantee growth. Value-driven engagement (saves and shares) is a more plausible growth signal than likes, although the category-level averages here cannot confirm that on their own. A data-backed content strategy should therefore focus on categories that reliably translate visibility into long-term audience growth rather than optimizing for surface-level engagement metrics.