In this project, we are predicting trends in technology adoption and interest based on social media (Twitter) data. Specifically, the model aims to forecast the following:

1. **Volume of Discussions**: Predicting the number of tweets or social media posts related to specific technologies, gadgets, or software within a given time frame in the future (e.g., daily, weekly). This serves as an indicator of public interest and awareness levels.

2. **Sentiment Trends**: Forecasting the overall sentiment (positive, negative, neutral) associated with these technologies in the social media discourse. This could involve predicting the average sentiment score or the proportion of tweets falling into each sentiment category for upcoming days.

3. **Combination of Volume and Sentiment**: A more comprehensive approach might involve predicting both the volume of discussion and the sentiment concurrently. This dual prediction can provide a more nuanced understanding of how public interest and perception might evolve over time.

### Example Predictions
- **Before a Product Launch**: If a new gadget is about to be released, the model might predict a rise in discussion volume, and potentially a shift in sentiment, in the days leading up to and following the launch.
- **Emerging Technology Trends**: For emerging tech like augmented reality, blockchain, or new software platforms, the model could forecast how discussions (both in volume and sentiment) about these technologies will trend in the short-term future.

### Purpose of These Predictions
- **Market Insight**: These predictions can provide valuable insights for businesses, marketers, and technologists about consumer interest and sentiment trends, aiding in strategic planning and decision-making.
- **Product Strategy**: For tech companies, understanding how public interest and sentiment are likely to shift can inform product development, marketing strategies, and customer engagement plans.
- **Investment Decisions**: Investors in technology sectors might use these predictions to gauge potential market reactions to new technologies or products.

The predictions, therefore, are not just about the raw data but also about interpreting the data to extract meaningful trends and insights that can inform various strategic decisions in the technology domain.

In [1]:
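# Essential imports
import re
import sqlite3

import nltk
import numpy as np
import pandas as pd
import tweepy
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
from textblob import TextBlob

# One-time download of the NLTK resources used during preprocessing
# (recent NLTK releases may also need 'punkt_tab' for word_tokenize)
for resource in ('punkt', 'stopwords', 'wordnet'):
    nltk.download(resource, quiet=True)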

## 1. Data Collection
- **Sources**: Gather data from social media. We use the Twitter API to search for tweets that contain relevant keywords.
- **Keywords**: Identify relevant keywords for each technology (e.g., "artificial intelligence", "augmented reality", "blockchain").
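
One way to turn a keyword list into Twitter API v2 search queries is sketched below. The helper name `build_query` and the `TECH_KEYWORDS` mapping are illustrative assumptions, not part of the original notebook; the operators it emits (`OR`, quoted phrases, `-is:retweet`, `lang:`) are standard v2 search operators.

```python
def build_query(keywords, lang="en", exclude_retweets=True):
    """Combine keywords into one Twitter API v2 search query string."""
    # Quote multi-word phrases so they are matched exactly
    terms = [f'"{kw}"' if " " in kw else kw for kw in keywords]
    query = "(" + " OR ".join(terms) + ")"
    if exclude_retweets:
        query += " -is:retweet"
    if lang:
        query += f" lang:{lang}"
    return query

# Hypothetical keyword lists, one per technology
TECH_KEYWORDS = {
    "ai": ["artificial intelligence", "machine learning"],
    "ar": ["augmented reality"],
    "blockchain": ["blockchain"],
}

queries = {tech: build_query(kws) for tech, kws in TECH_KEYWORDS.items()}
print(queries["ai"])
# ("artificial intelligence" OR "machine learning") -is:retweet lang:en
```

Excluding retweets matters here because most of the sample results further down are retweets, which would otherwise inflate volume counts with duplicated text.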

In [2]:
# Twitter API credentials
# Read the keys from environment variables rather than hardcoding them in the
# notebook; the variable names below are this notebook's own choice.
import os

# Consumer keys
api_key = os.environ['TWITTER_API_KEY']
api_secret_key = os.environ['TWITTER_API_SECRET_KEY']

# Authentication tokens
bearer_token = os.environ['TWITTER_BEARER_TOKEN']

access_token = os.environ['TWITTER_ACCESS_TOKEN']
access_token_secret = os.environ['TWITTER_ACCESS_TOKEN_SECRET']

# Authenticate
auth = tweepy.OAuthHandler(api_key, api_secret_key)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

client = tweepy.Client(bearer_token=bearer_token)

In [3]:
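# Fetch recent tweets from Twitter
# Note: context_annotations are only returned when requested, e.g. by passing
# tweet_fields=['context_annotations'] to search_recent_tweets
query = 'artificial intelligence'
tweets = client.search_recent_tweets(query=query, max_results=10)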

In [4]:
# Initialize lists for DataFrame
tweet_texts = []
context_annotations = []

# Extract data from tweets (tweets.data is None when the search returns nothing)
for tweet in tweets.data or []:
    # Add tweet text
    tweet_texts.append(tweet.text)

    # Check and add context annotations
    if 'context_annotations' in tweet and len(tweet.context_annotations) > 0:
        context_annotations.append(str(tweet.context_annotations))
    else:
        context_annotations.append(None)

# Create DataFrame
tweets_df = pd.DataFrame({
    'Tweet': tweet_texts,
    'Context Annotations': context_annotations
})

tweets_df.head()

| | Tweet | Context Annotations |
|---|---|---|
| 0 | RT @sercanmatrixai: @traded https://t.co/L43H7... | None |
| 1 | RT @rowancheung: @ibab_ml The EU has become th... | None |
| 2 | RT @Iam_StephenMusk: JUST IN: Security guard r... | None |
| 3 | @soon_verse artificial intelligence is a hot t... | None |
| 4 | RT @sercanmatrixai: @CryptoYusaku https://t.co... | None |


In [5]:
def add_tweets_to_database(tweets_df):
    # Create a SQLite database connection
    conn = sqlite3.connect('tweets_database.db')

    # Write the DataFrame to a SQLite table
    tweets_df.to_sql('tweets', conn, if_exists='replace', index=False)

    # Optionally, read the table back from the database to verify
    tweets_df_from_sql = pd.read_sql('SELECT * FROM tweets', conn)

    # Display the DataFrame read from the database
    print(tweets_df_from_sql)

    # Close the database connection
    conn.close()
    
add_tweets_to_database(tweets_df)

                                               Tweet Context Annotations
0  RT @sercanmatrixai: @traded https://t.co/L43H7...                None
1  RT @rowancheung: @ibab_ml The EU has become th...                None
2  RT @Iam_StephenMusk: JUST IN: Security guard r...                None
3  @soon_verse artificial intelligence is a hot t...                None
4  RT @sercanmatrixai: @CryptoYusaku https://t.co...                None
5  RT @itz_shivvvuuu: artificial intelligence ?\n...                None
6  RT @penpengin2023: 今年6月の記事。長文だけれど、読んでおく価値あり。\n...                None
7       RT @JenningsJa49121: https://t.co/hIBMHGTg00                None
8  Discover the captivating world of Obama DeepFa...                None
9  RT @UpscforAll: What is Digital India BHASHINI...                None
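
Because `if_exists='replace'` overwrites the table on every run, building up a history for time series analysis needs rows to be appended instead. A minimal sketch, assuming each tweet's `id` and `created_at` are also collected (they are not in the DataFrame above) and using a hypothetical `tweets_history` table:

```python
import sqlite3

def append_tweets(conn, rows):
    """Insert (tweet_id, created_at, text) rows, skipping ids already stored.

    Returns the number of rows that were actually added.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tweets_history (
               tweet_id   TEXT PRIMARY KEY,
               created_at TEXT NOT NULL,
               text       TEXT NOT NULL
           )"""
    )
    before = conn.execute("SELECT COUNT(*) FROM tweets_history").fetchone()[0]
    # INSERT OR IGNORE drops rows whose primary key already exists
    conn.executemany(
        "INSERT OR IGNORE INTO tweets_history VALUES (?, ?, ?)", rows
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM tweets_history").fetchone()[0]
    return after - before

# In-memory database and made-up rows, for illustration only
conn = sqlite3.connect(":memory:")
batch_1 = [("1", "2023-12-10T09:00:00Z", "first tweet"),
           ("2", "2023-12-10T10:00:00Z", "second tweet")]
batch_2 = [("2", "2023-12-10T10:00:00Z", "second tweet"),
           ("3", "2023-12-11T08:30:00Z", "third tweet")]
added_1 = append_tweets(conn, batch_1)
added_2 = append_tweets(conn, batch_2)
print(added_1, added_2)  # 2 1
```

Keying on the tweet id means overlapping searches can be re-run safely without double-counting tweets in later volume aggregates.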


In [6]:
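# For fetching the data from the database later
def get_tweets_by_query(query=None):
    conn = sqlite3.connect('tweets_database.db')
    cur = conn.cursor()

    if query is None:
        # No query given: select every stored tweet
        cur.execute("SELECT Tweet FROM tweets")
    else:
        # Select only tweets whose text contains the query
        # (SQLite's LIKE is case-insensitive for ASCII characters)
        cur.execute("SELECT Tweet FROM tweets WHERE Tweet LIKE ?", (f'%{query}%',))
    all_tweets = cur.fetchall()

    print(all_tweets)

    conn.close()
    return all_tweets

get_tweets_by_query()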

[('RT @sercanmatrixai: @traded https://t.co/L43H72ltdJ\n\nDefinitely $MAN\n\nIt is the only #AI project that blends artificial intelligence, block…',),
 ("RT @rowancheung: @ibab_ml The EU has become the first continent to set regulations on AI.\n\nThe new 'Artificial Intelligence Act' aims to en…",),
 ('RT @Iam_StephenMusk: JUST IN: Security guard robots represent a cutting-edge advancement in the field of security and safety. These highly…',),
 ('@soon_verse artificial intelligence is a hot technology as it is been combined with blockchain to build interesting products . what does your project feel about artificial intelligence\n\n0x732e94603E8d45abf73019ee0e3862D67dE3CA2F',),
 ('RT @sercanmatrixai: @CryptoYusaku https://t.co/L43H72ltdJ\n\nHi  Yusaku,\n\n@CryptoYusaku \n\n$MAN is the only AI project that blends artificial…',),
 ('RT @itz_shivvvuuu: artificial intelligence ?\n\n~ meanwhile floridians https://t.co/Sc4jBlLB1B',),
 ('RT @penpengin2023: 今年6月の記事。長文だけれど、読んでおく価値あり。\n\n『どんなに素晴

## 2. Data Preprocessing
- **Cleaning**: Remove irrelevant content, special characters, and URLs.
- **Normalization**: Convert text to a standard format (e.g., lowercasing, stopword removal, lemmatization).


In [7]:
# Build the stopword set and lemmatizer once, not on every call
STOP_WORDS = set(stopwords.words('english'))
LEMMATIZER = WordNetLemmatizer()

# Define the tweet cleaning function
def clean_tweet(tweet):
    # Convert to lowercase
    tweet = tweet.lower()

    # Remove URLs
    tweet = re.sub(r'http\S+|www\S+', '', tweet)

    # Remove @usernames and the '#' symbol (the hashtag word itself is kept)
    tweet = re.sub(r'@\w+|#', '', tweet)

    # Remove punctuation and special characters
    tweet = re.sub(r'[^\w\s]', '', tweet)

    # Tokenize the tweet
    tweet_tokens = word_tokenize(tweet)

    # Remove stopwords
    filtered_words = [word for word in tweet_tokens if word not in STOP_WORDS]

    # Lemmatize each remaining token
    lemmatized_words = [LEMMATIZER.lemmatize(word) for word in filtered_words]

    return " ".join(lemmatized_words)

# Apply the cleaning function to each tweet in the 'Tweet' column
tweets_df['Cleaned_Tweet'] = tweets_df['Tweet'].apply(clean_tweet)

# Display the first few rows of the DataFrame
tweets_df.head()

| | Tweet | Context Annotations | Cleaned_Tweet |
|---|---|---|---|
| 0 | RT @sercanmatrixai: @traded https://t.co/L43H7... | None | rt definitely man ai project blend artificial ... |
| 1 | RT @rowancheung: @ibab_ml The EU has become th... | None | rt eu become first continent set regulation ai... |
| 2 | RT @Iam_StephenMusk: JUST IN: Security guard r... | None | rt security guard robot represent cuttingedge ... |
| 3 | @soon_verse artificial intelligence is a hot t... | None | artificial intelligence hot technology combine... |
| 4 | RT @sercanmatrixai: @CryptoYusaku https://t.co... | None | rt hi yusaku man ai project blend artificial |


In [8]:
# Drop the 'Context Annotations' and 'Tweet' columns, keeping only the cleaned text
tweets_df = tweets_df.drop(columns=['Context Annotations', 'Tweet'])

# Display the DataFrame to confirm the columns are dropped
tweets_df.head()

| | Cleaned_Tweet |
|---|---|
| 0 | rt definitely man ai project blend artificial ... |
| 1 | rt eu become first continent set regulation ai... |
| 2 | rt security guard robot represent cuttingedge ... |
| 3 | artificial intelligence hot technology combine... |
| 4 | rt hi yusaku man ai project blend artificial |


In [9]:
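# Query words
query_words = ['Artificial Intelligence', 'ai']

# Create a boolean mask; the \b word boundaries stop 'ai' from matching
# inside other words (e.g. 'explain')
pattern = r'\b(?:' + '|'.join(re.escape(word) for word in query_words) + r')\b'
mask = tweets_df['Cleaned_Tweet'].str.contains(pattern, case=False, na=False)

# Filter the DataFrame
filtered_tweets_df = tweets_df[mask]

# Display the filtered DataFrame
filtered_tweets_df.head()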

| | Cleaned_Tweet |
|---|---|
| 0 | rt definitely man ai project blend artificial ... |
| 1 | rt eu become first continent set regulation ai... |
| 3 | artificial intelligence hot technology combine... |
| 4 | rt hi yusaku man ai project blend artificial |
| 5 | rt artificial intelligence meanwhile floridian |


## 3. Sentiment Analysis
- **Sentiment Detection Tool**: Use pre-built libraries like TextBlob.
- **Classification**: Classify the sentiment of each piece of text as positive, negative, or neutral.

In [10]:
# Function to apply sentiment analysis
def analyze_sentiment(tweet):
    analysis = TextBlob(tweet)
    polarity = analysis.sentiment.polarity
    if polarity > 0:
        return 'positive', polarity
    elif polarity == 0:
        return 'neutral', polarity
    else:
        return 'negative', polarity

# Apply the function to each tweet
tweets_df['Sentiment'], tweets_df['Polarity'] = zip(*tweets_df['Cleaned_Tweet'].apply(analyze_sentiment))

# Display the first few rows of the DataFrame with sentiment data
tweets_df.head()

| | Cleaned_Tweet | Sentiment | Polarity |
|---|---|---|---|
| 0 | rt definitely man ai project blend artificial ... | negative | -0.3 |
| 1 | rt eu become first continent set regulation ai... | negative | -0.071212 |
| 2 | rt security guard robot represent cuttingedge ... | positive | 0.16 |
| 3 | artificial intelligence hot technology combine... | negative | -0.1125 |
| 4 | rt hi yusaku man ai project blend artificial | negative | -0.6 |
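
Comparing polarity with exactly zero means that any score slightly off zero is labelled positive or negative. One possible refinement, sketched here as an assumption rather than as part of the analysis above, is a small neutral band around zero; the helper name `label_polarity` and the `band=0.05` default are illustrative and not tuned.

```python
def label_polarity(polarity, band=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label.

    Scores within +/- band of zero count as neutral; band=0.05 is an
    illustrative choice, not a tuned value.
    """
    if polarity > band:
        return "positive"
    if polarity < -band:
        return "negative"
    return "neutral"

for score in (-0.3, -0.02, 0.0, 0.16):
    print(score, label_polarity(score))
```

With `band=0` the function reproduces the exact-zero rule used in `analyze_sentiment`.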


## 4. Time Series Analysis
- **Aggregation**: Aggregate sentiment scores over time (daily, weekly).
- **Trends Analysis**: Use time series analysis techniques to identify trends. Libraries like Pandas and statsmodels can be helpful.
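
The aggregation step could look roughly like the sketch below. It assumes each scored tweet also carries a `created_at` timestamp, which the DataFrame built earlier does not yet include; the sample rows are made up for illustration.

```python
import pandas as pd

# Made-up scored tweets; in practice these would come from the sentiment
# step, joined with each tweet's created_at timestamp
scored = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2023-12-10 09:00", "2023-12-10 15:30",
        "2023-12-11 08:00",
        "2023-12-13 12:00", "2023-12-13 18:45", "2023-12-13 23:10",
    ]),
    "Polarity": [0.2, -0.4, 0.1, 0.5, 0.0, -0.2],
})

# Daily tweet count and mean polarity
daily = (
    scored.set_index("created_at")
          .resample("D")["Polarity"]
          .agg(["count", "mean"])
          .rename(columns={"count": "volume", "mean": "mean_polarity"})
)

# A short rolling mean smooths day-to-day noise in the volume series
daily["volume_trend"] = daily["volume"].rolling(window=3, min_periods=1).mean()
print(daily)
```

Days without tweets appear with a volume of 0 and an undefined (NaN) mean polarity, which has to be handled before fitting a forecasting model. Passing `"W"` instead of `"D"` to `resample` gives weekly aggregates.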


## 5. Forecasting
- **Model Selection**: Choose a forecasting model like ARIMA, SARIMA, or LSTM (for deep learning approaches).
- **Prediction**: Use the model to predict future trends in sentiment and discussion volume.
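
Before fitting ARIMA or an LSTM, it can help to have a simple baseline to beat. The sketch below is a seasonal-naive forecast, which repeats the last observed week; it is a stand-in baseline, not one of the models named above. The function name and the synthetic counts are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def seasonal_naive_forecast(series, horizon, season_length=7):
    """Forecast by repeating the last full season of observations."""
    if len(series) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = series.to_numpy()[-season_length:]
    repeats = int(np.ceil(horizon / season_length))
    values = np.tile(last_season, repeats)[:horizon]
    # Future dates start the day after the last observation
    future_index = pd.date_range(
        series.index[-1] + pd.Timedelta(days=1), periods=horizon, freq="D"
    )
    return pd.Series(values, index=future_index, name="forecast")

# Synthetic daily tweet counts with a weekly pattern, for illustration only
history = pd.Series(
    [100, 120, 130, 125, 140, 90, 80] * 3,
    index=pd.date_range("2023-11-20", periods=21, freq="D"),
    name="volume",
)
forecast = seasonal_naive_forecast(history, horizon=10)
print(forecast)
```

A model from the list above is only worth its added complexity if it produces lower error than this baseline on held-out days.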


## 6. Visualization
- **Tools**: Use libraries like Matplotlib or Plotly to visualize trends and forecasts.
- **Dashboard**: Consider building a dashboard using Dash or Streamlit for real-time analysis and visualization.
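
A minimal Matplotlib sketch of such a plot follows. The function name `plot_history_and_forecast` and the sample series are assumptions made for illustration; real input would be the aggregated history and the model's forecast.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_history_and_forecast(history, forecast, title="Daily tweet volume"):
    """Plot observed values and a forecast on one axis; returns the Figure."""
    fig, ax = plt.subplots(figsize=(8, 3))
    ax.plot(history.index, history.to_numpy(), label="observed")
    ax.plot(forecast.index, forecast.to_numpy(), linestyle="--", label="forecast")
    ax.set_title(title)
    ax.set_xlabel("date")
    ax.set_ylabel("tweets per day")
    ax.legend()
    fig.autofmt_xdate()  # slant the date labels so they do not overlap
    return fig

# Made-up series, for illustration only
history = pd.Series(
    [100, 120, 130, 125, 140, 90, 80],
    index=pd.date_range("2023-12-04", periods=7, freq="D"),
)
forecast = pd.Series(
    [100, 120, 130],
    index=pd.date_range("2023-12-11", periods=3, freq="D"),
)
fig = plot_history_and_forecast(history, forecast)
```

Returning the figure rather than calling `plt.show()` keeps the function usable both in a notebook and in a dashboard that renders figures itself.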


## 7. Continuous Improvement and Updating
- **Feedback Loop**: Incorporate new data regularly to update the models.
- **Model Tuning**: Continuously evaluate and tune the models for better accuracy.
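
One common way to evaluate a forecaster as new data arrives is walk-forward (expanding-window) validation: train on everything up to a day, forecast the next step, then move forward. The sketch below uses plain Python lists and two naive forecasters; all names and the synthetic counts are illustrative assumptions.

```python
def walk_forward_mae(values, forecast_fn, min_train=14, horizon=1):
    """Mean absolute error of forecast_fn under an expanding training window.

    forecast_fn(train, horizon) must return a list of `horizon` predictions.
    """
    errors = []
    for end in range(min_train, len(values) - horizon + 1):
        train, actual = values[:end], values[end:end + horizon]
        predicted = forecast_fn(train, horizon)
        errors.extend(abs(p - a) for p, a in zip(predicted, actual))
    if not errors:
        raise ValueError("not enough data for the requested min_train/horizon")
    return sum(errors) / len(errors)

def last_value(train, horizon):
    # Naive forecast: repeat the most recent observation
    return [train[-1]] * horizon

def same_weekday(train, horizon):
    # Seasonal-naive forecast: repeat the value from the same weekday
    # in the last observed week
    return [train[-7 + (h % 7)] for h in range(horizon)]

# Synthetic daily counts with a perfect weekly pattern, for illustration only
counts = [100, 120, 130, 125, 140, 90, 80] * 4
print("last value  :", walk_forward_mae(counts, last_value))
print("same weekday:", walk_forward_mae(counts, same_weekday))
```

Re-running this comparison each time new tweets are added gives a consistent signal for when a model needs retuning.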


## 8. Deployment
- **Web Application**: Deploy as a web application using frameworks like Flask or Django.
- **APIs**: Create APIs for accessing the analysis and forecasts.
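
A forecast API could start as small as the Flask sketch below. The `/forecast` route, the `keyword` query parameter, and the hardcoded `FORECASTS` dictionary are all placeholders chosen for illustration; a real deployment would load model output instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder forecasts; a real deployment would load model output instead
FORECASTS = {
    "artificial intelligence": [
        {"date": "2023-12-11", "volume": 100, "mean_polarity": 0.05},
        {"date": "2023-12-12", "volume": 120, "mean_polarity": 0.02},
    ],
}

@app.route("/forecast")
def get_forecast():
    """Return the stored forecast for ?keyword=..., or 404 if unknown."""
    keyword = request.args.get("keyword", "").lower()
    if keyword not in FORECASTS:
        return jsonify(error=f"no forecast for {keyword!r}"), 404
    return jsonify(keyword=keyword, forecast=FORECASTS[keyword])
```

Saved as a module, this can be served locally with Flask's development server (`flask --app <module_name> run`) and queried at `/forecast?keyword=artificial+intelligence`.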