# Artificial Message Generation Using Large Language Models

This notebook demonstrates how to generate artificial campaign messages from tweets using OpenAI's GPT models. The process fetches the original text of a tweet, then prompts the model to generate a new, artificial tweet that follows the theme or content of the original.

Note: all of the functions below still need work. For now, this notebook is only a demo.


In [2]:
# Importing necessary libraries
import os
from dotenv import load_dotenv
from openai import OpenAI
import tweepy
import pandas as pd
#import genai 

load_dotenv()

# Load Twitter API credentials from environment variables (e.g. exported in ~/.zshrc)
# Note: Twitter access is currently not working; it requires a paid account
auth = tweepy.OAuthHandler(os.getenv('TWITTER_CONSUMER_KEY'), os.getenv('TWITTER_CONSUMER_SECRET'))
auth.set_access_token(os.getenv('TWITTER_ACCESS_TOKEN'), os.getenv('TWITTER_ACCESS_SECRET'))
twitter_api = tweepy.API(auth)

# Load OpenAI API key from environment variable
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

## Control Function 

This function takes the text of a tweet and uses it to generate an artificial message with a similar theme using OpenAI's GPT model. The function constructs a prompt and sends it to the model, then returns the generated text.


In [3]:
def control_tweet(tweet_text):
    """Generate a closely related tweet that subtly varies in wording but retains the essence and context of the original tweet."""
    prompt = f"Rephrase this tweet while maintaining its original message and context: '{tweet_text}'"
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        temperature=1,
        max_tokens=256
    )

    return response.choices[0].text.strip()


"""
More parameters to be aware of -
top_p (float, optional): Controls the nucleus sampling where the model considers the smallest set of words whose cumulative probability exceeds the probability p. This helps in focusing the generation on more likely outcomes.
frequency_penalty (float, optional): Adjusts the likelihood of the model repeating the same line verbatim, with higher values discouraging repetition.
presence_penalty (float, optional): Adjusts the likelihood of the model repeating phrases, with higher values encouraging the introduction of new concepts.
"""

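To make the `top_p` description above more concrete, here is a minimal, self-contained sketch of nucleus (top-p) filtering over a toy token distribution. It is not OpenAI's implementation; the function name `top_p_filter` and the toy probabilities are assumptions made purely for illustration.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p.

    probs: dict mapping token -> probability (assumed to sum to 1).
    The kept probabilities are renormalised so they sum to 1 again.
    """
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete; drop the remaining tail
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution, for illustration only.
toy_distribution = {"energy": 0.5, "policy": 0.3, "weather": 0.15, "banana": 0.05}
nucleus = top_p_filter(toy_distribution, top_p=0.75)
print(sorted(nucleus))
```

With a lower `top_p`, fewer low-probability tokens survive the filter, so sampling concentrates on the more likely outcomes.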

## Emotional Tweet Function




In [4]:
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')

def emotional_tweet(tweet_text,emotion):
    """
    Generate a tweet that appeals to a specific emotion based on the provided tweet text and compare polarity.
    :param tweet_text: str - the original tweet text.
    :param emotion: str - the desired emotion to appeal to, such as 'sad' or 'happy'.
    """
    emotional_prompt = f"Rephrase the following tweet to evoke a {emotion} feeling: {tweet_text}"
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=emotional_prompt,
        temperature=1,  # 1 is the API default; lower values give less varied output
        max_tokens=256
    )
    new_tweet = response.choices[0].text.strip()

    # Analyze polarity of both original and new tweet
    original_doc = nlp(tweet_text)
    new_tweet_doc = nlp(new_tweet)

    return {
        "original_tweet": tweet_text,
        "new_tweet": new_tweet,
        "original_polarity": original_doc._.blob.polarity,
        "new_polarity": new_tweet_doc._.blob.polarity
    }

# Polarity measures the emotional content of the text, ranging from -1 (very negative) to +1 (very positive).
# It essentially indicates the sentiment tone of the text based on the adjectives used.
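Since polarity runs from -1 to +1, a simple way to judge a rewrite is to look at the sign of the polarity change. The sketch below is one possible helper, not part of the original notebook: `polarity_shift` and its two emotion sets are hypothetical names chosen for illustration, and it only assumes a dict shaped like the one `emotional_tweet` returns.

```python
def polarity_shift(result, emotion):
    """Summarise how polarity changed between the original and generated tweet.

    result: dict with "original_polarity" and "new_polarity" keys,
            as returned by emotional_tweet.
    emotion: the emotion that was requested in the prompt.
    """
    # Illustrative groupings (an assumption, not an exhaustive taxonomy).
    positive_emotions = {"happy", "joyful", "hopeful"}
    negative_emotions = {"sad", "angry", "fearful"}

    delta = result["new_polarity"] - result["original_polarity"]
    if emotion in positive_emotions:
        moved_as_intended = delta > 0
    elif emotion in negative_emotions:
        moved_as_intended = delta < 0
    else:
        moved_as_intended = None  # unknown emotion: no expected direction
    return {"delta": delta, "moved_as_intended": moved_as_intended}


# Hand-written polarity values, not real model output.
example = {"original_polarity": 0.25, "new_polarity": 0.75}
print(polarity_shift(example, "happy"))
```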


## Emotional CoT Function

This variant adds Chain of Thought (CoT) prompting to the emotional rewrite as an experiment.

In [5]:
def emotional_tweet_with_CoT(tweet_text, emotion):
    """
    Generate a tweet that appeals to a specific emotion using Chain of Thought (CoT) prompting and compare polarity.
    :param tweet_text: str - the original tweet text.
    :param emotion: str - the desired emotion to appeal to, such as 'sad' or 'happy'.
    """
    cot_prompt = f"Rephrase the following tweet to evoke a {emotion} feeling by thinking step-by-step: {tweet_text}. Let's think step-by-step to evoke {emotion}. First, identify the key elements of the tweet. Next, modify the language to enhance {emotion}. Finally, ensure the tweet effectively conveys {emotion}:"
    
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=cot_prompt,
        temperature=1,  # 1 is the API default; lower values give less varied output
        max_tokens=256
    )
    
    new_tweet = response.choices[0].text.strip()

    # Analyze polarity of both original and new tweet
    original_doc = nlp(tweet_text)
    new_tweet_doc = nlp(new_tweet)

    return {
        "original_tweet": tweet_text,
        "new_tweet": new_tweet,
        "original_polarity": original_doc._.blob.polarity,
        "new_polarity": new_tweet_doc._.blob.polarity
    }

# Example usage:
# result = emotional_tweet_with_CoT("The weather today is beautiful!", "happy")
# print(result)


## Conspiracy Tweet Function

This function generates tweets that contain elements of misinformation. Misinformation often stems from or aligns with personal beliefs rather than established facts (TODO: add a reference to support this claim). This connection matters because it highlights the subjective nature of the content typically found in conspiracy theories.

The effect is measured by subjectivity, which quantifies how much of the text is based on personal opinions, emotions, or judgments versus factual information. The scale ranges from 0 (very objective) to 1 (very subjective).

In [6]:
def conspiracy_tweet(tweet_text):
    """Generate a conspiracy theory based message from the provided tweet text and compare subjectivity."""

    conspiracy_prompt = f"Rewrite the tweet to sound like a conspiracy theory: {tweet_text}"
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=conspiracy_prompt,
        temperature=1,  
        max_tokens=256
    )
    # Use a distinct local name so it does not shadow the function itself
    conspiracy_text = response.choices[0].text.strip()

    # Analyze subjectivity of both original and new tweet
    original_doc = nlp(tweet_text)
    conspiracy_doc = nlp(conspiracy_text)

    return {
        "original_tweet": tweet_text,
        "conspiracy_tweet": conspiracy_text,
        "original_subjectivity": original_doc._.blob.subjectivity,
        "conspiracy_subjectivity": conspiracy_doc._.blob.subjectivity
    }


## Adversarial Tweet Function

Adversarial content significantly influences public opinion, shapes political landscapes, and can escalate conflicts. Extensive research has explored techniques to "jailbreak" large language models (LLMs) or conduct universal and transferable adversarial attacks on aligned language models, prompting GPT to generate contentious content.

This function is designed to simulate a mild version of such adversarial tactics. It aims to test the boundaries of content generation without severely breaching ethical or operational constraints imposed by advanced LLMs. As language models continue to evolve, their defenses improve, making it increasingly challenging to generate genuinely harmful or hateful content without detection.

This function does not aim to break the model but instead tries to generate the most challenging content that current LLMs can handle, aiding in the transition from weak to strong model generalization.

In [7]:
def adversarial_tweet(tweet_text):
    """Generate an adversarial message based on the provided tweet text that could be used in targeted campaigns."""
    # Analyzing the sentiment of the original tweet
    original_doc = nlp(tweet_text)
    original_polarity = original_doc._.blob.polarity
    
    if original_polarity >= 0:
        # Constructing the prompt to produce a contentious or polarizing message
        adversarial_prompt = f"Given the tweet '{tweet_text}', craft a response that challenges the statement, aiming to provoke or polarize the audience while staying within ethical guidelines."
    else:
        # Constructing the prompt to agree with and intensify the negative sentiment
        adversarial_prompt = f"Given the tweet '{tweet_text}', craft a response that agrees with and intensifies the negative sentiment, aiming to provoke or polarize the audience while staying within ethical guidelines."

    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=adversarial_prompt,
        temperature=1,  # 1 is the API default; lower values give less varied output
        max_tokens=256
    )
    # Use a distinct local name so it does not shadow the function itself
    adversarial_text = response.choices[0].text.strip()

    # Analyze polarity of the adversarial tweet (the original was analyzed above)
    adversarial_doc = nlp(adversarial_text)

    return {
        "original_tweet": tweet_text,
        "adversarial_tweet": adversarial_text,
        "original_polarity": original_polarity,
        "adversarial_polarity": adversarial_doc._.blob.polarity
    }

# Example usage:
# result = adversarial_tweet("I think the new policy is beneficial for everyone.")
# print(result)
# result = adversarial_tweet("Everyone should die.")
# print(result)
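The branching on polarity can be checked without calling the API if the prompt construction is separated from the model call. The following is a sketch of that idea; `build_adversarial_prompt` is a hypothetical helper (not in the notebook above) that reuses the same prompt wording and takes the polarity as a plain number.

```python
def build_adversarial_prompt(tweet_text, polarity):
    """Return the adversarial prompt for a tweet, given its polarity score.

    Non-negative polarity: ask the model to challenge the statement.
    Negative polarity: ask the model to agree with and intensify it.
    """
    if polarity >= 0:
        instruction = "craft a response that challenges the statement"
    else:
        instruction = (
            "craft a response that agrees with and intensifies the negative sentiment"
        )
    return (
        f"Given the tweet '{tweet_text}', {instruction}, "
        "aiming to provoke or polarize the audience while staying within ethical guidelines."
    )


# The polarity values here are hand-picked, not computed by spaCy/TextBlob.
print(build_adversarial_prompt("I think the new policy is beneficial for everyone.", 0.3))
```

Keeping the prompt builder pure makes it easy to unit-test both branches before spending API credits.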



## Example to run

In [8]:
import pandas as pd

# Define the data: 10 fake tweets about climate change, 5 positive and 5 negative
data = {
    "tweet_id": range(1, 11),
    "tweet": [
        "Exciting news! Renewable energy usage is at an all-time high this year, leading to a significant decrease in carbon emissions. #ClimateAction #GreenEnergy",
        "Community efforts in reforestation have successfully restored 10,000 hectares of forest this month alone! #EcoFriendly #Sustainability",
        "Innovative water conservation methods have reduced drought impact by 30% in vulnerable regions. #ClimateChange #SaveWater",
        "New biodegradable packaging solutions are set to replace plastic in major supermarkets, reducing ocean pollution drastically. #PlasticFree #OceanLife",
        "City planners are integrating green roofs and walls, improving air quality and urban aesthetics! #UrbanGreening #CleanAir",

        "Scientists predict an irreversible climate catastrophe within the next 50 years if immediate actions are not taken. #ClimateCrisis",
        "Due to global warming, polar ice caps are melting at an alarming rate, threatening sea level rise that could displace millions. #GlobalWarming #SeaLevelRise",
        "Recent studies show that air pollution levels in major cities are still 5 times higher than WHO safe limits. #AirPollution #HealthRisk",
        "Deforestation rates are accelerating, leading to loss of habitats and a decline in biodiversity at unprecedented levels. #Deforestation #BiodiversityLoss",
        "Extreme weather events, intensified by climate change, are becoming more frequent and severe, causing devastation worldwide. #ClimateChange #ExtremeWeather"
    ],
    "sentiment": ["good"]*5 + ["bad"]*5
}

# Create a DataFrame
df = pd.DataFrame(data)

# Save to CSV file
csv_file_path = 'climate_change_fake_tweets.csv'
df.to_csv(csv_file_path, index=False)


In [9]:
# Results DataFrame
results = []

for index, row in df.iterrows():
    tweet_text = row['tweet']
    sentiment = row['sentiment']
    
    # Generate controlled tweet
    controlled_tweet = control_tweet(tweet_text)
    
    # Generate emotional tweet
    emotional_result = emotional_tweet(tweet_text, "happy" if sentiment == "good" else "sad")
    
    # Generate conspiracy tweet
    conspiracy_result = conspiracy_tweet(tweet_text)
    
    # Generate adversarial tweet
    adversarial_result = adversarial_tweet(tweet_text)
    
    # Append results
    results.append({
        "Original Tweet": tweet_text,
        "Controlled Tweet": controlled_tweet,
        "Emotional Tweet": emotional_result['new_tweet'],
        "Emotional Polarity Change": emotional_result['new_polarity'] - emotional_result['original_polarity'],
        "Conspiracy Tweet": conspiracy_result['conspiracy_tweet'],
        "Conspiracy Subjectivity Change": conspiracy_result['conspiracy_subjectivity'] - conspiracy_result['original_subjectivity'],
        "Adversarial Tweet": adversarial_result['adversarial_tweet']
    })

# Convert results to DataFrame
results_df = pd.DataFrame(results)


# Set maximum number of columns to None (or a specific number) to display all columns
pd.set_option('display.max_columns', None)

# Set maximum column width to None to display full content of each cell
pd.set_option('display.max_colwidth', None)

# Optional: Increase the number of rows to display if you have more rows
pd.set_option('display.max_rows', None)

print(results_df)



## What next?

Sense of session: automatic chain of thought for iterating on a social campaign.

The goal is to simulate conversational or contextual continuity in the generation of tweets. The function below iterates through prompts sequentially, where each generation is based on the output of the previous one, maintaining a thematic and contextual thread throughout the session.

Classification of the generated tweets using BERT (Bidirectional Encoder Representations from Transformers), GPT-based models such as GPT-4, LLaMA, and Gemini.

Finally: run the code from the literature (references), modify it slightly, and check whether its results align with ours.


In [10]:
def chain_of_thought_generation(initial_tweet, model_name="gpt-3.5-turbo-instruct", num_iterations=5):
    """
    Generates a series of tweets based on a chain of thought process, starting from an initial tweet.
    
    Parameters:
    - initial_tweet: str - The starting tweet from which to begin the thought process.
    - model_name: str - The model to use for generating the tweets.
    - num_iterations: int - The number of tweets to generate in sequence.
    
    Returns:
    - list: A list of tweets generated in sequence, each influenced by the previous tweet.
    """
    tweet_text = initial_tweet
    tweets = [tweet_text]  # Start with the initial tweet

    for _ in range(num_iterations):
        prompt = f"Let's start step by step to analyze this tweet: {tweet_text} Generate a similar tweet: "
        response = client.completions.create(
            model=model_name,
            prompt=prompt,
            temperature=1,
            max_tokens=256
        )
        tweet_text = response.choices[0].text.strip()  # Get the new tweet
        tweets.append(tweet_text)  # Add the new tweet to the list
    
    return tweets

# Example usage:
initial_tweet = "Concerns about climate change are leading to new energy policies."
generated_tweets = chain_of_thought_generation(initial_tweet)
for tweet in generated_tweets:
    print(tweet)


Concerns about climate change are leading to new energy policies.
Worries regarding climate change are spurring the creation of fresh energy strategies.
Concerns about the impact of climate change are driving the development of new energy solutions.
The growing awareness of climate change is sparking innovation in clean energy solutions.
The rise in climate change awareness is driving a surge of innovation in clean energy solutions.
The increasing understanding of the urgency of climate change is sparking a wave of inventiveness in renewable energy alternatives. #CleanEnergy #Innovation
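The sequential pattern used above, where each output becomes the next input, can be exercised offline by passing the generation step in as a function. This is a sketch under that assumption: `iterate_generation` and `fake_generate` are hypothetical names, and the stand-in generator simply appends a character instead of calling a model.

```python
def iterate_generation(initial_text, generate_fn, num_iterations=5):
    """Repeatedly feed the latest text into generate_fn and collect every step.

    generate_fn: any callable mapping a text to a new text
                 (in the notebook this would wrap the model call).
    """
    texts = [initial_text]  # start with the initial text
    for _ in range(num_iterations):
        texts.append(generate_fn(texts[-1]))  # each step depends on the previous one
    return texts


def fake_generate(text):
    """Stand-in for the model call so the loop can run without an API key."""
    return text + "!"


chain = iterate_generation("Concerns about climate change", fake_generate, num_iterations=3)
for step in chain:
    print(step)
```

Swapping `fake_generate` for a function that calls the completion endpoint would reproduce the session behaviour while keeping the loop itself testable.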


# Second Model: Experimenting with T5

In [1]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load pre-trained T5 model and tokenizer
model_name = "t5-small"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

def generate_tweet_with_t5(prompt):
    # Encode the input prompt
    input_ids = tokenizer.encode(prompt, return_tensors='pt')
    
    # Generate output using the model
    output_ids = model.generate(input_ids, max_length=50, num_beams=5, early_stopping=True)
    
    # Decode the generated output
    output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    
    return output

# Example usage
prompt = "Generate a tweet about the benefits of exercise"
print(generate_tweet_with_t5(prompt))


  from .autonotebook import tqdm as notebook_tqdm
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.


ImportError: 
T5Tokenizer requires the SentencePiece library but it was not found in your environment. Checkout the instructions on the
installation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
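The messages above indicate that the environment lacks SentencePiece and a deep learning backend; since the cell requests `return_tensors='pt'`, PyTorch is the backend it needs. A small check like the one below can report which modules are missing before the T5 cell is run. The helper name `missing_packages` is an assumption for illustration.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Modules the T5 cell relies on, based on the error output above.
required = ["transformers", "sentencepiece", "torch"]
missing = missing_packages(required)
if missing:
    print("Install before running the T5 cell:", ", ".join(missing))
else:
    print("All required modules are available.")
```

As the error message notes, the runtime may need to be restarted after installing the missing packages.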


Cross-check outputs between models: see whether one model can work out which prompt was given to another model, and check the result.

Problem: many models require payment. Is that acceptable?

## Function to Fetch Tweet Text

This function retrieves the text of a tweet by its ID using the Twitter API. It handles exceptions by printing an error message.


Problem: a Twitter developer account must be on a paid plan to use this tweet retrieval API.

In [None]:
def fetch_tweet_text(tweet_id):
    """Fetch the text of a tweet given its ID."""
    try:
        tweet = twitter_api.get_status(tweet_id, tweet_mode='extended')
        return tweet.full_text
    except Exception as e:
        print(f"Error fetching tweet {tweet_id}: {e}")
        return None
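While the Twitter API is unavailable, one possible workaround is to look tweets up in the CSV file saved earlier in this notebook. The sketch below assumes that file layout (`tweet_id` and `tweet` columns); `fetch_tweet_text_from_csv` is a hypothetical helper and not a replacement for the real API call.

```python
import csv


def fetch_tweet_text_from_csv(tweet_id, csv_path="climate_change_fake_tweets.csv"):
    """Return the text of the tweet with the given ID from a local CSV, or None."""
    try:
        with open(csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # CSV values are strings, so compare against str(tweet_id).
                if row["tweet_id"] == str(tweet_id):
                    return row["tweet"]
    except FileNotFoundError:
        print(f"CSV file not found: {csv_path}")
    return None  # file missing or no matching ID
```

Like `fetch_tweet_text`, this returns `None` on failure, so the two functions can be swapped without changing the calling code.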
