# Tweet Analysis: Public Reaction to ChatGPT’s Launch

![logo.png](attachment:5cc6f0d1-147c-46c7-b2c0-302466c8cd31.png)

## Overview
This project analyzes real Twitter data from the days immediately following the **launch of ChatGPT**, focusing on user engagement and trending discussion topics. The goal is to extract insights that could help product teams understand early public perception, marketing impact, and viral engagement patterns.

## Objective

**Stakeholder**: OpenAI’s marketing and product development teams (hypothetical)  
**Goal**: Identify top hashtags, understand tweet volume trends, and highlight organic discussion points to evaluate public sentiment and engagement post-launch.


## Dataset

- A CSV file containing thousands of tweets mentioning “ChatGPT” shortly after its launch
- Link: https://www.kaggle.com/code/mpwolke/chatgpt-tweets?select=tweets.csv

## Data Cleaning & Processing

- Filled missing `hashtags` by extracting from tweet `text` using regex
- Cleaned tweet text by removing:
  - URLs
  - @mentions
  - Emojis and hashtags
  - Newlines and special characters
- Converted date columns (`user_created`, `created_at`) into datetime format
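
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the helper names (`extract_hashtags`, `clean_text`, `fill_missing_hashtags`, `parse_timestamp`), the regex patterns, the sample tweet, and the timestamp format are all assumptions made for the example.

```python
import re
from datetime import datetime

HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Anything that is not a letter, digit, whitespace, or basic punctuation
# (this also strips emojis).
NON_TEXT_RE = re.compile(r"[^A-Za-z0-9\s.,!?']")


def extract_hashtags(text):
    """Return the hashtags found in a tweet, without the leading '#'."""
    return HASHTAG_RE.findall(text)


def clean_text(text):
    """Remove URLs, @mentions, hashtags, emojis, newlines, and special characters."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = re.sub(r"#\w+", " ", text)
    text = NON_TEXT_RE.sub(" ", text)
    # Collapse newlines and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def fill_missing_hashtags(row):
    """Fill an empty `hashtags` field from the tweet `text` (row is a dict)."""
    if not row.get("hashtags"):
        row = {**row, "hashtags": extract_hashtags(row["text"])}
    return row


def parse_timestamp(value, fmt="%Y-%m-%d %H:%M:%S%z"):
    """Parse a timestamp string; the format is an assumption about the CSV."""
    return datetime.strptime(value, fmt)


# Hypothetical row, not taken from the dataset.
row = {
    "text": "Wow @OpenAI #ChatGPT is amazing 🔥\nTry it https://chat.openai.com #AI",
    "hashtags": None,
    "created_at": "2022-12-05 17:08:20+00:00",
}
row = fill_missing_hashtags(row)
cleaned = clean_text(row["text"])
created = parse_timestamp(row["created_at"])
```

Hashtags are extracted before the text is cleaned, since the cleaning step removes them.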

## Exploratory Data Analysis

![Tweets_over_time.png](attachment:ce9edd5f-4c3e-41d6-b90a-a976bc6101fc.png)

### Tweets over Time
Tweet volume about ChatGPT rose sharply in the first two days after the launch, then gradually declined over the following two weeks.
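
A daily volume series like the one plotted above could be computed along these lines. This is a sketch only: `tweets_per_day` is a hypothetical helper, and the timestamps are made-up placeholders rather than values from the dataset.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps):
    """Count tweets per calendar day, returned in chronological order."""
    counts = Counter(ts.date() for ts in timestamps)
    return dict(sorted(counts.items()))


# Placeholder timestamps standing in for parsed `created_at` values.
sample_timestamps = [
    datetime(2022, 12, 1, 9, 0),
    datetime(2022, 12, 1, 18, 30),
    datetime(2022, 12, 2, 7, 15),
    datetime(2022, 12, 1, 23, 59),
]
daily_counts = tweets_per_day(sample_timestamps)
for day, count in daily_counts.items():
    print(day.isoformat(), count)
```

The resulting mapping can be passed to any plotting library to draw the line chart.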

![Screen Shot 2025-05-09 at 16.11.43.png](attachment:9a64bd54-a2d9-476b-beb2-3cc882e8c345.png)

### Word frequency
The chart above shows the most frequent words in the tweets. The table below lists possible reasons for their usage.
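
One simple way to obtain word counts like these is sketched below. The stop-word list, the tokenization rule, and the sample texts are illustrative assumptions; the project may have used a different tokenizer or stop-word set.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "it", "i", "for", "in"}


def word_frequencies(texts, top_n=10):
    """Return the top_n most common non-stop-words across cleaned tweet texts."""
    counter = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counter.update(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    return counter.most_common(top_n)


# Made-up texts, not taken from the dataset.
sample_texts = [
    "ChatGPT writes code",
    "I asked ChatGPT to write code",
    "ChatGPT is fun",
]
top_words = word_frequencies(sample_texts, top_n=2)
```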

![High usage of ChatGPT for Q&A, writing tasks, and coding — shows functional usage-2.png](attachment:7396d8c0-741a-4714-bc27-24f47ef90b87.png)

### Sentiment labeling

To better understand public reactions to ChatGPT in the days following its release, I manually reviewed a sample of 300 tweets and categorized each into one of three sentiment classes:

- **Positive**: expressing excitement, admiration, or praise
- **Negative**: expressing fear, doubt, or criticism
- **Neutral**: informational or ambiguous tone without a clear emotional stance

Manual labeling was chosen over automated sentiment analysis tools to ensure higher accuracy, especially given the subtlety, irony, and domain-specific language often found in tweets about ChatGPT. This approach allowed for more reliable insights and enhanced the quality of the overall analysis.

**Results**
Most tweets in the sample were neutral, as users primarily shared outputs or examples of their interactions with ChatGPT. Among the tweets that did convey sentiment, positive tweets outnumbered negative ones by more than two to one, indicating an overall favorable early reception of the tool.
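
The class shares and the positive-to-negative ratio mentioned above can be derived from the manual labels as in the sketch below. The function name and the label list are placeholders chosen for illustration; they are not the project's actual labels or results.

```python
from collections import Counter


def sentiment_distribution(labels):
    """Return per-class counts, per-class shares, and the positive/negative ratio."""
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {label: count / total for label, count in counts.items()}
    # The ratio is undefined when there are no negative tweets.
    ratio = counts["positive"] / counts["negative"] if counts["negative"] else None
    return counts, shares, ratio


# Placeholder labels for illustration only.
toy_labels = ["neutral"] * 6 + ["positive"] * 3 + ["negative"] * 1
counts, shares, ratio = sentiment_distribution(toy_labels)
```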

![Sentiment_dist.png](attachment:084111bf-4fd3-4018-af37-d22e297cd9fa.png)

### Word Clouds
To explore the most commonly discussed topics and keywords in tweets about ChatGPT, I generated word clouds—a visual representation of text data where the size of each word reflects its frequency in the dataset.
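
Since word size reflects frequency, each cloud needs a word-to-count mapping per sentiment class. A minimal sketch of building those mappings is shown here; the function name and sample tweets are hypothetical. The source does not say which tool rendered the clouds. If the `wordcloud` package was used, a dictionary of this shape is what `WordCloud.generate_from_frequencies` accepts.

```python
import re
from collections import Counter, defaultdict


def frequencies_by_sentiment(labeled_tweets):
    """Map each sentiment label to a {word: count} dict built from its tweets."""
    by_class = defaultdict(Counter)
    for text, label in labeled_tweets:
        by_class[label].update(re.findall(r"[a-z']+", text.lower()))
    return {label: dict(counter) for label, counter in by_class.items()}


# Made-up (text, label) pairs, not taken from the labeled sample.
toy_tweets = [
    ("love chatgpt", "positive"),
    ("chatgpt scares me", "negative"),
    ("chatgpt love love", "positive"),
]
class_frequencies = frequencies_by_sentiment(toy_tweets)
```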

![positive cloud.png](attachment:63c78efa-ce69-430c-b23e-47ab0d7b26cf.png)

![neutral cloud.png](attachment:898311bb-6778-4412-8728-92aa143297f9.png)

![negative cloud.png](attachment:9ace8a92-67c9-4d1c-96f1-0c2a1cd4bc4c.png)

**Insight:**
Interestingly, the word cloud for negative tweets contains many words that don’t appear inherently negative. 

## Future Improvements

- Add automated **sentiment analysis** (VADER/TextBlob) to categorize the tone of tweets beyond the manually labeled sample
- Build an **interactive dashboard** with Tableau or Streamlit
- Apply topic modeling (e.g., LDA) for deeper NLP analysis

## Summary of Key Findings
**Insights**
Most sampled tweets were neutral, but among those that expressed sentiment, reactions to ChatGPT immediately after launch were largely positive, with users excited by its capabilities. However, there were notable concerns, reflected in keywords related to fear of job automation and AI safety.

This suggests that while the launch generated widespread enthusiasm, public discourse also reflected deeper concerns about the long-term impact of generative AI tools.

**Challenges and Limitations**
Because sentiment was tagged manually on a sample of only 300 tweets, this analysis captures only a snapshot of public opinion and may not represent the full range of perspectives.
