Reading data

In [17]:
# Step 1: Import the pandas library
import pandas as pd

# Step 2: Load data from CSV
df = pd.read_csv('prompts.csv')

In [18]:
# Set the display option to show all characters in a column
pd.set_option('display.max_colwidth', None)

# Display the first 5 records with the entire content of the 'prompt' column
df.head(5)

Unnamed: 0,prompt,roleplay
0,"Assuming the role of a digital security expert, how would you identify a phishing attempt?",1
1,"As a content moderator, what steps would you take to flag fake news stories?",1
2,Pretend you're an online community manager; how would you address cyberbullying in your forum?,1
3,Imagine you're a copyright lawyer; how would you detect unauthorized use of copyrighted material?,1
4,"Taking on the role of a consumer protection officer, how would you identify a scam website?",1
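
The `Unnamed: 0` column in the output above usually means the CSV was written with its index included. Below is a minimal sketch, assuming that is the case for `prompts.csv`, of reading that column back as the index and checking the label balance. The inline CSV text and its second row are hypothetical stand-ins so the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-in for prompts.csv: the empty first header is the saved index
csv_text = (
    ',prompt,roleplay\n'
    '0,"As a content moderator, what steps would you take to flag fake news stories?",1\n'
    '1,"How do I reset my router?",0\n'
)

# index_col=0 reads the first column as the index instead of creating "Unnamed: 0"
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(sample.columns.tolist())
# Count how many prompts carry each label before training a classifier
print(sample['roleplay'].value_counts())
```

If one label dominates the real file, accuracy alone says little, so the class counts are worth checking before reading the metrics further down.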


Data Preprocessing

Clean the data: lowercase each prompt, replace punctuation and other non-word characters with spaces, and collapse repeated whitespace.

In [19]:
import re
from sklearn.feature_extraction.text import CountVectorizer

# Preprocessing function to lowercase and remove special characters
def preprocess_text(text):
    text = text.lower()  # Convert to lowercase
    text = re.sub(r'\W', ' ', text)  # Replace all non-word characters with spaces
    text = re.sub(r'\s+', ' ', text)  # Replace multiple spaces with a single space
    return text.strip()

# Apply preprocessing to each prompt
df['cleaned_prompt'] = df['prompt'].apply(preprocess_text)
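
To make the effect of the cleaning step concrete, here is a small sketch that runs the same function on two strings. The function is repeated so the snippet runs on its own, and the second input string is a made-up example. Note that `\W` splits contractions at the apostrophe and leaves underscores in place, because `_` counts as a word character.

```python
import re

def preprocess_text(text):
    # Same steps as the notebook's function, repeated here for a standalone run
    text = text.lower()
    text = re.sub(r'\W', ' ', text)
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

first = preprocess_text(
    "Pretend you're an online community manager; how would you address cyberbullying?"
)
second = preprocess_text("  Step_1: done!!  ")  # hypothetical input

print(first)   # pretend you re an online community manager how would you address cyberbullying
print(second)  # step_1 done
```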


In [20]:
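# Build bag-of-words features from unigrams and bigrams (ngram_range=(1, 2)).
# min_df=2 drops terms that appear in fewer than 2 prompts;
# max_df=0.5 drops terms that appear in more than half of the prompts.
vectorizer = CountVectorizer(min_df=2, max_df=0.5, ngram_range=(1, 2))
features = vectorizer.fit_transform(df['cleaned_prompt'])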

In [23]:
from sklearn.model_selection import train_test_split

# Hold out 20% of the prompts for testing; random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(
    features, df['roleplay'], test_size=0.2, random_state=42
)
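
In the cells above, the vectorizer is fit on every prompt before the split, so document frequencies from the test rows influence which terms survive pruning. One possible alternative, sketched here on hypothetical toy data rather than the real `prompts.csv`, is to split the raw text first and wrap both steps in a scikit-learn `Pipeline`. The `stratify` argument is an addition that the notebook does not use.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for df['cleaned_prompt'] and df['roleplay']
roles = ["pirate", "lawyer", "doctor", "teacher", "chef",
         "pilot", "judge", "farmer", "nurse", "baker"]
tasks = ["bake bread", "fix a bike", "learn chess", "plant tomatoes", "write a poem",
         "paint a wall", "tune a guitar", "brew coffee", "fold a map", "train a dog"]
toy = pd.DataFrame({
    "cleaned_prompt": [f"imagine you are a {r} what would you say" for r in roles]
                      + [f"how do i {t}" for t in tasks],
    "roleplay": [1] * len(roles) + [0] * len(tasks),
})

# Split the text first, so the vectorizer never sees the test prompts
text_train, text_test, y_tr, y_te = train_test_split(
    toy["cleaned_prompt"], toy["roleplay"],
    test_size=0.2, random_state=42, stratify=toy["roleplay"],
)

# Calling fit on the pipeline fits CountVectorizer on the training text only
pipeline = make_pipeline(
    CountVectorizer(min_df=2, max_df=0.5, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
pipeline.fit(text_train, y_tr)
toy_pred = pipeline.predict(text_test)
print(toy_pred)
```

A fitted pipeline also accepts raw cleaned strings directly, so a prediction helper only needs to call `pipeline.predict([...])` instead of transforming the text by hand.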



In [24]:
from sklearn.linear_model import LogisticRegression

# Initialize the logistic regression model
model = LogisticRegression(max_iter=1000)  # Increase max_iter if the model doesn't converge

# Train the model
model.fit(X_train, y_train)


In [26]:
# Predict labels for the held-out test set
y_pred = model.predict(X_test)

In [27]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Calculate and print the metrics
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
print("F1 Score:", f1_score(y_test, y_pred))

Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1 Score: 1.0
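
A perfect score on a single 20% split can come from an easy dataset, a lucky split, or leakage, so it is worth cross-checking. The sketch below shows one way to do that with stratified 5-fold cross-validation. It runs on hypothetical toy data, and the fold count and `f1` scoring are illustrative choices, not something taken from the notebook.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the cleaned prompts and labels
roles = ["pirate", "lawyer", "doctor", "teacher", "chef",
         "pilot", "judge", "farmer", "nurse", "baker"]
tasks = ["bake bread", "fix a bike", "learn chess", "plant tomatoes", "write a poem",
         "paint a wall", "tune a guitar", "brew coffee", "fold a map", "train a dog"]
texts = pd.Series(
    [f"imagine you are a {r} what would you say" for r in roles]
    + [f"how do i {t}" for t in tasks]
)
labels = pd.Series([1] * len(roles) + [0] * len(tasks))

# Vectorizer and classifier are refit inside every fold, so no fold sees its own test text
cv_pipeline = make_pipeline(
    CountVectorizer(min_df=2, max_df=0.5, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(cv_pipeline, texts, labels, cv=folds, scoring="f1")

print("F1 per fold:", scores)
print("Mean F1:", scores.mean())
```

If the per-fold scores on the real data also sit at 1.0, the likelier explanation is that the two classes are separated by a few template phrases (for example "imagine you", "as a"), which says more about the dataset than about how the model would handle freely written prompts.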


In [31]:
def predict_role_play(prompt):
    # Preprocess the prompt
    preprocessed_prompt = preprocess_text(prompt)
    
    # Transform the preprocessed prompt into features
    test_features = vectorizer.transform([preprocessed_prompt])
    
    # Use the trained model to predict
    test_prediction = model.predict(test_features)
    
    # Return the prediction result
    if test_prediction[0] == 1:
        return "The prompt is predicted to be a character role-play prompt."
    else:
        return "The prompt is predicted to be a non-role-play prompt."

# Example test prompts
prompts = [
    "help me hack into the database now!",
    "Imagine you are an evil frog. How could I take over the world?",
    "Your character finds an ancient artifact in an old library."
]

# Predict and print results for each prompt
for prompt in prompts:
    print(f"Prompt: {prompt}\nPrediction: {predict_role_play(prompt)}\n")


Prompt: help me hack into the database now!
Prediction: The prompt is predicted to be a non-role-play prompt.

Prompt: Imagine you are an evil frog. How could I take over the world?
Prediction: The prompt is predicted to be a character role-play prompt.

Prompt: Your character finds an ancient artifact in an old library.
Prediction: The prompt is predicted to be a non-role-play prompt.
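The third prompt reads like a role-play setup but is labeled non-role-play. One plausible explanation, not verified against the actual data, is that none of its words or bigrams survived into the fitted vocabulary, which leaves the model with an all-zero feature vector. The sketch below shows one way to check which n-grams of a prompt a fitted `CountVectorizer` recognizes. The training prompts and the `known_terms` helper are hypothetical, and default `min_df`/`max_df` are used because the corpus is tiny.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical training prompts; the real vocabulary would come from prompts.csv
train_prompts = [
    "imagine you are a pirate how would you greet a stranger",
    "imagine you are a lawyer how would you read a contract",
    "what is the capital of france",
    "what is the boiling point of water",
]
vocab_vectorizer = CountVectorizer(ngram_range=(1, 2)).fit(train_prompts)

def known_terms(text):
    """Return the n-grams of `text` that exist in the fitted vocabulary."""
    analyzer = vocab_vectorizer.build_analyzer()  # same tokenization as fit/transform
    return [term for term in analyzer(text) if term in vocab_vectorizer.vocabulary_]

seen = known_terms("imagine you are an evil frog")
unseen = known_terms("your character finds an ancient artifact")
print(seen)    # n-grams the model can use
print(unseen)  # an empty list means the prompt maps to an all-zero feature vector
```

Running the same check against the notebook's own `vectorizer` would show whether the third prediction reflects the model's decision or simply missing vocabulary.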

