# SMM Paper 2: How Reddit Users React to the Launch of the Optimus Robot

#### Anoushka Shinde (anshinde@iu.edu)



## Data collection from Reddit using PRAW

To install PRAW:

In [1]:
pip install praw

Note: you may need to restart the kernel to use updated packages.


In [2]:
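import os

import praw
import pandas as pd
from datetime import datetime, timedelta

# Initialize the Reddit API client using PRAW.
# Credentials are read from environment variables (the variable names are an
# assumption of this notebook) so that secrets never appear in the source.
reddit = praw.Reddit(client_id=os.environ["REDDIT_CLIENT_ID"],
                     client_secret=os.environ["REDDIT_CLIENT_SECRET"],
                     user_agent=f"Smm_optimus by u/{os.environ['REDDIT_USERNAME']}",
                     username=os.environ["REDDIT_USERNAME"],
                     password=os.environ["REDDIT_PASSWORD"])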
 

In [3]:
from datetime import datetime

# Define the subreddits and keywords
subreddits = ["Tesla", "technology", "robotics", "teslamotors"]
keywords = ["Optimus robot", "Tesla Optimus", "Tesla robot"]
date_ranges = [
    {"name": "Section_A", "start": datetime(2023, 9, 30), "end": datetime(2024, 10, 10)},
    {"name": "Section_B", "start": datetime(2024, 10, 11), "end": datetime(2024, 11, 6)}
]
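
The date filter used later in the notebook can be sketched in isolation as follows. `assign_section` is a hypothetical helper, not part of the original code; it applies the same inclusive comparison to the ranges defined above. Because each bound is a `datetime` at midnight, an inclusive `end` of `datetime(2024, 10, 10)` covers only the first instant of that day, so a timestamp later on 2024-10-10 matches neither range.

```python
from datetime import datetime

# Same ranges as in the cell above.
date_ranges = [
    {"name": "Section_A", "start": datetime(2023, 9, 30), "end": datetime(2024, 10, 10)},
    {"name": "Section_B", "start": datetime(2024, 10, 11), "end": datetime(2024, 11, 6)},
]

def assign_section(timestamp, ranges=date_ranges):
    """Return the name of the first range containing `timestamp`, or None."""
    for section in ranges:
        if section["start"] <= timestamp <= section["end"]:
            return section["name"]
    return None

print(assign_section(datetime(2024, 6, 12, 13, 45)))   # Section_A
print(assign_section(datetime(2024, 10, 12, 8, 0)))    # Section_B
print(assign_section(datetime(2024, 10, 10, 18, 0)))   # None (falls between the two ranges)
```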

In [4]:
# Create a list to store data
data = []

In [5]:
# Retrieve up to `limit` comments from a submission as a list of dicts
def get_comments(submission, limit=35):
    comments_data = []
    # Remove 'MoreComments' placeholders without fetching the comments behind them
    submission.comments.replace_more(limit=0)
    for comment in submission.comments.list()[:limit]:
        comment_data = {
            "Content": comment.body,
            "Timestamp": datetime.utcfromtimestamp(comment.created_utc),
            "Upvotes": comment.score,
            "Type": "comment"
        }
        comments_data.append(comment_data)
    return comments_data

# Fetch posts and comments
for subreddit in subreddits:
    for keyword in keywords:
        for section in date_ranges:
            # Search posts
            for submission in reddit.subreddit(subreddit).search(keyword, limit=10):
                submission_date = datetime.utcfromtimestamp(submission.created_utc)
                # Filter by date range
                if section["start"] <= submission_date <= section["end"]:
                    # Collect post data
                    post_data = {
                        "Content": submission.title + " " + submission.selftext,
                        "Timestamp": submission_date,
                        "Upvotes": submission.score,
                        "Type": "post",
                        "Section": section["name"]
                    }
                    data.append(post_data)
                    
                    # Collect comments data
                    comments_data = get_comments(submission)
                    for comment in comments_data:
                        comment["Section"] = section["name"]  # Add section info to each comment
                        data.append(comment)
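
The loop above runs three keyword searches per subreddit, and a submission that matches more than one keyword can be returned more than once. A minimal sketch of one way to drop such repeats is shown here, assuming each record also stored an `"Id"` field (for example from `submission.id` or `comment.id`). The collection code above does not store that field, and `dedupe_by_id` is a hypothetical helper. Whether repeats actually occur in this data set has not been checked here.

```python
def dedupe_by_id(records, key="Id"):
    """Keep the first record seen for each value of `key`, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record[key] in seen:
            continue
        seen.add(record[key])
        unique.append(record)
    return unique

# Made-up records: "abc" is returned by two different keyword searches.
sample = [
    {"Id": "abc", "Type": "post", "Keyword": "Tesla Optimus"},
    {"Id": "def", "Type": "post", "Keyword": "Tesla Optimus"},
    {"Id": "abc", "Type": "post", "Keyword": "Tesla robot"},
]
print([r["Id"] for r in dedupe_by_id(sample)])  # ['abc', 'def']
```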

In [6]:
# Convert data to DataFrame and save to CSV
df = pd.DataFrame(data)
df.to_csv('reddit_data.csv', index=False)

print("Data collection complete. Check 'reddit_data.csv' for results.")

Data collection complete. Check 'reddit_data.csv' for results.


#### CSV File Structure
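The resulting CSV contains the following columns:

- Content: Text of the post (title plus body) or of the comment.
- Timestamp: Creation time of the post or comment, in UTC.
- Upvotes: Score of the post or comment at collection time; it can be negative.
- Type: Either "post" or "comment".
- Section: "Section_A" covers the period before the event (2023-09-30 to 2024-10-10); "Section_B" covers the day of the event and after (2024-10-11 to 2024-11-06). Comments inherit the section of their parent post.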
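
When the CSV is read back in a later session, the `Timestamp` column is loaded as plain strings unless it is parsed explicitly. The sketch below shows one way to reload the file and check the column layout described above. `load_reddit_csv` and `EXPECTED_COLUMNS` are hypothetical names, and the two sample rows are made up for illustration.

```python
import io

import pandas as pd

EXPECTED_COLUMNS = ["Content", "Timestamp", "Upvotes", "Type", "Section"]

def load_reddit_csv(source):
    """Read the collected data, parse timestamps, and verify the columns."""
    df = pd.read_csv(source, parse_dates=["Timestamp"])
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return df

# Stand-in for 'reddit_data.csv' so the example runs on its own.
sample = io.StringIO(
    "Content,Timestamp,Upvotes,Type,Section\n"
    "Example post text,2024-06-12 13:45:13,0,post,Section_A\n"
    "Example comment text,2024-10-12 08:00:00,5,comment,Section_B\n"
)
loaded = load_reddit_csv(sample)
print(loaded.dtypes)
```

In the notebook itself the call would be `load_reddit_csv('reddit_data.csv')`.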

In [7]:
df.head()

| | Content | Timestamp | Upvotes | Type | Section |
|---|---|---|---|---|---|
| 0 | Tesla has put 2 Optimus robots to work on its ... | 2024-06-12 13:45:13 | 0 | post | Section_A |
| 1 | I feel sorry for the two interns that have to ... | 2024-06-12 14:25:31 | 40 | comment | Section_A |
| 2 | >It is unclear which factory the robots are op... | 2024-06-12 21:23:03 | 15 | comment | Section_A |
| 3 | Just out of curiosity, is there a point to wal... | 2024-06-12 14:12:51 | 8 | comment | Section_A |
| 4 | Don't forget the greed-bot in the CEO-office. | 2024-06-12 14:27:21 | 6 | comment | Section_A |


In [8]:
num_rows = len(df)
print(f"Number of rows in the data: {num_rows}")

Number of rows in the data: 1762


In [9]:
# Count the number of rows in each section
section_counts = df['Section'].value_counts()
print("Number of rows in each section:")
print(section_counts)


Number of rows in each section:
Section_B    898
Section_A    864
Name: Section, dtype: int64


In [10]:
# Count the number of posts and comments
type_counts = df['Type'].value_counts()
print("Number of posts and comments:")
print(type_counts)

Number of posts and comments:
comment    1700
post         62
Name: Type, dtype: int64


In [11]:
# Calculate the number of posts and comments in each section
section_type_counts = df.groupby(['Section', 'Type']).size().unstack(fill_value=0)
print("Post and Comment Counts by Section:")
print(section_type_counts)

Post and Comment Counts by Section:
Type       comment  post
Section                 
Section_A      836    28
Section_B      864    34
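
The counts above can be turned into an average number of collected comments per post for each section. The sketch below does that arithmetic with the figures copied from the table; `comments_per_post` is a hypothetical helper. Since `get_comments` keeps at most 35 comments per post, this ratio is bounded by the collection limit and does not measure the total size of each discussion.

```python
# Counts copied from the table above.
counts = {
    "Section_A": {"comment": 836, "post": 28},
    "Section_B": {"comment": 864, "post": 34},
}

def comments_per_post(section_counts):
    """Average number of collected comments per collected post."""
    return section_counts["comment"] / section_counts["post"]

for name, section_counts in counts.items():
    print(f"{name}: {comments_per_post(section_counts):.1f} comments per post")
# Section_A: 29.9 comments per post
# Section_B: 25.4 comments per post
```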


In [12]:
max_upvotes = df['Upvotes'].max()
max_upvotes

30920

In [13]:
min_upvotes = df['Upvotes'].min()
min_upvotes

-165

In [14]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1762 entries, 0 to 1761
Data columns (total 5 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Content    1762 non-null   object        
 1   Timestamp  1762 non-null   datetime64[ns]
 2   Upvotes    1762 non-null   int64         
 3   Type       1762 non-null   object        
 4   Section    1762 non-null   object        
dtypes: datetime64[ns](1), int64(1), object(3)
memory usage: 69.0+ KB


In [15]:
# Find the post or comment with the maximum upvotes
top_upvoted = df[df['Upvotes'] == df['Upvotes'].max()]

# Find the post or comment with the minimum upvotes (if needed)
least_upvoted = df[df['Upvotes'] == df['Upvotes'].min()]

# Display results
print("Top Upvoted Post or Comment:")
print(top_upvoted[['Section', 'Content', 'Type', 'Upvotes']])

print("\nLeast Upvoted Post or Comment:")
print(least_upvoted[['Section', 'Content', 'Type', 'Upvotes']])


Top Upvoted Post or Comment:
      Section                                            Content  Type  \
20  Section_B  The Optimus robots at Tesla’s Cybercab event w...  post   

    Upvotes  
20    30920  

Least Upvoted Post or Comment:
       Section                                            Content     Type  \
128  Section_A  BD are physical bots designed for mobility, li...  comment   

     Upvotes  
128     -165  
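
Filtering on equality with the maximum or minimum returns only the single most extreme rows. To see a few rows at each end for context, pandas provides `DataFrame.nlargest` and `DataFrame.nsmallest`. The sketch below uses a small made-up frame with the same column names; in the notebook the calls would be applied to `df`.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Content": ["post a", "comment b", "comment c", "comment d"],
    "Type": ["post", "comment", "comment", "comment"],
    "Upvotes": [0, 40, 15, -3],
})

top = toy.nlargest(2, "Upvotes")      # two highest-scoring rows
bottom = toy.nsmallest(2, "Upvotes")  # two lowest-scoring rows

print(top[["Content", "Type", "Upvotes"]])
print(bottom[["Content", "Type", "Upvotes"]])
```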
