# AI Word Highlighter - Jupyter Notebook Example

This notebook demonstrates how to use the SimpleAIWordHighlighter to identify common AI-generated words and phrases in text.

In [None]:
# Import the SimpleAIWordHighlighter class
from simple_ai_word_highlighter import SimpleAIWordHighlighter

# Import additional libraries for visualization
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import Markdown, display

## Initialize the Highlighter

First, let's create an instance of the SimpleAIWordHighlighter class.

In [None]:
highlighter = SimpleAIWordHighlighter()

## Sample Text for Analysis

Let's create two text samples for comparison: one AI-generated and one human-written.

In [None]:
# Sample AI-generated text
ai_text = """
In this article, we will delve into the fascinating realm of artificial intelligence and its implications for modern content creation. 
It's important to note that AI technology has evolved significantly over the past decade, transforming the landscape of digital marketing and content strategy.

When it comes to SEO, a plethora of techniques have been developed to enhance website visibility. 
Indeed, the field has witnessed a paradigm shift, with machine learning algorithms playing a crucial role in determining search rankings.

Furthermore, content creators must navigate the ethical considerations that arise from utilizing AI-generated text. 
On the other hand, the benefits of leveraging AI tools cannot be overstated, as they provide unprecedented efficiency and scalability.

In conclusion, as we have seen, the journey of integrating AI into content strategy is both challenging and rewarding. 
To summarize, businesses that harness the power of this technology responsibly will likely gain a competitive edge in the digital landscape.
"""

# Sample human-written text on the same topic
human_text = """
AI is changing how we create online content. Over the last ten years, new tools have made it easier to write blog posts, ads, and social media updates. These changes affect how websites rank in Google searches.

SEO experts now use many different methods to improve website rankings. Machine learning has changed how search engines decide which sites to show first. This means writers need to update their strategies.

There are some ethical questions about using AI to write content. Is it okay to publish an article written mostly by a computer? But AI tools do make work faster and help small teams do more.

Companies that learn to use AI tools well will probably do better than their competitors. However, human creativity and oversight still matter. The best approach combines AI efficiency with human judgment.
"""

## Analyze the AI-Generated Text

Now let's analyze the AI-generated text to identify common AI patterns.

In [None]:
# Highlight and analyze the AI-generated text
ai_highlighted_text, _ = highlighter.highlight_text(ai_text)
ai_results = highlighter.analyze_text(ai_text)

# Display the highlighted text (detected AI marker words and phrases appear in bold)
display(Markdown("### Highlighted AI-Generated Text"))
display(Markdown(ai_highlighted_text))
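
The internals of `highlight_text` are not shown in this notebook. As a rough illustration of how bold highlighting of this kind could work, here is a minimal sketch that wraps each match in Markdown bold. The `MARKERS` list and the `bold_markers` helper are hypothetical and are not part of SimpleAIWordHighlighter; the two-value return only mirrors the way `highlight_text` is unpacked above.

```python
import re

# Hypothetical markers for illustration only; the real highlighter ships its own lists.
MARKERS = ["delve", "a plethora of", "paradigm shift"]


def bold_markers(text, markers=MARKERS):
    """Wrap each marker occurrence in Markdown bold, ignoring case."""
    # Longest markers first so a phrase wins over a shorter marker it contains.
    ordered = sorted(markers, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(m) for m in ordered) + r")\b",
        re.IGNORECASE,
    )
    found = []

    def wrap(match):
        # Record the hit in lowercase and keep the original casing in the text.
        found.append(match.group(0).lower())
        return f"**{match.group(0)}**"

    return pattern.sub(wrap, text), found


highlighted_demo, found_demo = bold_markers("We delve into a plethora of ideas.")
print(highlighted_demo)
print(found_demo)
```

The word-boundary anchors (`\b`) keep a short marker from matching inside a longer word.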

In [None]:
# Display analysis results
display(Markdown("### AI Text Analysis Results"))
print(f"Total Words: {ai_results['total_words']}")
print(f"Unique Words: {ai_results['unique_words']}")
print(f"AI Markers Found: {ai_results['ai_markers']}")
print(f"AI Word Percentage: {ai_results['ai_word_percentage']:.2f}%")
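
To make the metrics above more concrete, here is a minimal sketch of one way such numbers could be computed. It is an assumption, not the library's actual method: the marker lists are hypothetical, each phrase hit counts as one marker, and the percentage is markers divided by total words. The real `analyze_text` may count differently.

```python
import re

# Hypothetical marker lists; the real highlighter ships its own.
AI_WORDS = {"delve", "plethora", "crucial"}
AI_PHRASES = ["it's important to note", "when it comes to"]


def count_markers(text):
    """Count marker hits and express them as a share of total words."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)

    word_hits = sum(1 for w in words if w in AI_WORDS)
    phrase_hits = sum(
        len(re.findall(r"\b" + re.escape(p) + r"\b", lowered)) for p in AI_PHRASES
    )

    markers = word_hits + phrase_hits
    percentage = 100 * markers / len(words) if words else 0.0
    return {
        "total_words": len(words),
        "unique_words": len(set(words)),
        "ai_markers": markers,
        "ai_word_percentage": percentage,
    }


sample = "It's important to note that we delve into a plethora of topics."
print(count_markers(sample))
```

For this sample the sketch finds 12 words and 3 markers (one phrase and two words), which gives 25%.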

In [None]:
# Visualize the most common AI words and phrases found in the AI text
display(Markdown("### Top AI Words Found"))
if ai_results['word_counts']:
    word_counts = pd.DataFrame({
        'Word': list(ai_results['word_counts'].keys()),
        'Count': list(ai_results['word_counts'].values())
    }).sort_values('Count', ascending=False)
    
    plt.figure(figsize=(10, 6))
    sns.barplot(x='Count', y='Word', data=word_counts.head(10))
    plt.title('Top AI Words Detected')
    plt.tight_layout()
    plt.show()
else:
    print("No AI words detected.")

In [None]:
# Visualize the most common AI phrases found in the AI text
display(Markdown("### Top AI Phrases Found"))
if ai_results['phrase_counts']:
    phrase_counts = pd.DataFrame({
        'Phrase': list(ai_results['phrase_counts'].keys()),
        'Count': list(ai_results['phrase_counts'].values())
    }).sort_values('Count', ascending=False)
    
    plt.figure(figsize=(10, 6))
    sns.barplot(x='Count', y='Phrase', data=phrase_counts.head(10))
    plt.title('Top AI Phrases Detected')
    plt.tight_layout()
    plt.show()
else:
    print("No AI phrases detected.")

## Analyze the Human-Written Text

For comparison, let's analyze the human-written text.

In [None]:
# Highlight and analyze the human-written text
human_highlighted_text, _ = highlighter.highlight_text(human_text)
human_results = highlighter.analyze_text(human_text)

# Display the highlighted text
display(Markdown("### Highlighted Human-Written Text"))
display(Markdown(human_highlighted_text))

In [None]:
# Display analysis results
display(Markdown("### Human Text Analysis Results"))
print(f"Total Words: {human_results['total_words']}")
print(f"Unique Words: {human_results['unique_words']}")
print(f"AI Markers Found: {human_results['ai_markers']}")
print(f"AI Word Percentage: {human_results['ai_word_percentage']:.2f}%")

## Compare the Results

Now let's compare the AI and human text analysis results side by side.

In [None]:
# Create comparison dataframe
comparison_data = {
    'Metric': ['Total Words', 'Unique Words', 'AI Markers', 'AI Word Percentage'],
    'AI Text': [
        ai_results['total_words'],
        ai_results['unique_words'],
        ai_results['ai_markers'],
        f"{ai_results['ai_word_percentage']:.2f}%"
    ],
    'Human Text': [
        human_results['total_words'],
        human_results['unique_words'],
        human_results['ai_markers'],
        f"{human_results['ai_word_percentage']:.2f}%"
    ]
}

comparison_df = pd.DataFrame(comparison_data)
display(comparison_df)

In [None]:
# Visualize AI markers comparison.
# The two metrics have different units (a count and a percentage),
# so each one gets its own subplot and y-axis.
metrics = [
    ('AI Markers', 'ai_markers', 'Count'),
    ('AI Word Percentage', 'ai_word_percentage', 'Percentage (%)'),
]
labels = ['AI Text', 'Human Text']

fig, axes = plt.subplots(1, len(metrics), figsize=(12, 6))

for ax, (title, key, ylabel) in zip(axes, metrics):
    ax.bar(labels, [ai_results[key], human_results[key]], color=['C0', 'C1'])
    ax.set_title(title)
    ax.set_ylabel(ylabel)

fig.suptitle('AI vs Human Text Comparison')
plt.tight_layout()
plt.show()

## Add Custom Words and Phrases

You can extend the highlighter by adding your own custom AI words and phrases.

In [None]:
# Add custom words
highlighter.add_word("unprecedented", frequency=8)
highlighter.add_word("effortlessly", frequency=7)
highlighter.add_word("seamlessly", frequency=7)

# Add custom phrases
highlighter.add_phrase("it is essential to", frequency=9)
highlighter.add_phrase("a game-changer", frequency=8)
highlighter.add_phrase("at the end of the day", frequency=7)

print("Custom words and phrases added to the highlighter.")

## Export and Import Word Lists

You can export your word and phrase lists to CSV files and import them later or share them with others.

In [None]:
# Export words and phrases to CSV
highlighter.export_words_to_csv("ai_words.csv")
highlighter.export_phrases_to_csv("ai_phrases.csv")

print("Words and phrases exported to CSV files.")

In [None]:
# Create a new highlighter and import the words and phrases
new_highlighter = SimpleAIWordHighlighter()
success, message = new_highlighter.import_words_from_csv("ai_words.csv")
print(message)

success, message = new_highlighter.import_phrases_from_csv("ai_phrases.csv")
print(message)
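
The CSV layout written by `export_words_to_csv` is not shown in this notebook. As a sketch of what such a round trip can look like, the example below assumes a two-column `marker,frequency` layout. The helpers `export_markers` and `import_markers` are hypothetical stand-ins, and the extra third return value is an addition for illustration.

```python
import csv
import io


def export_markers(markers, handle):
    """Write a {marker: frequency} mapping as CSV rows with a header."""
    writer = csv.writer(handle)
    writer.writerow(["marker", "frequency"])
    for marker, frequency in sorted(markers.items()):
        writer.writerow([marker, frequency])


def import_markers(handle):
    """Read the CSV back and return (success, message, markers)."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != ["marker", "frequency"]:
        return False, "Unexpected CSV header.", {}

    markers = {}
    for row in reader:
        try:
            markers[row["marker"].strip().lower()] = int(row["frequency"])
        except (TypeError, ValueError):
            # A missing or non-numeric frequency aborts the import.
            return False, f"Bad row: {row}", {}
    return True, f"Imported {len(markers)} markers.", markers


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
export_markers({"seamlessly": 7, "a game-changer": 8}, buffer)
buffer.seek(0)
ok, info, loaded = import_markers(buffer)
print(info)
```

Checking the header before reading rows means a file with the wrong layout is rejected with a message rather than partially imported.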

## Practical Use Case: Improving an AI-Generated Text

Now let's see how you can use this highlighter to improve an AI-generated text.

In [None]:
# Create a new text to improve
text_to_improve = """
In this article, we will delve into the fascinating realm of cryptocurrency and its implications for modern finance. 
It's important to note that blockchain technology has evolved significantly over the past decade, transforming the landscape of digital transactions.

When it comes to Bitcoin, a plethora of factors influence its market value. 
Indeed, the cryptocurrency space has witnessed a paradigm shift, with decentralized finance playing a crucial role in reshaping traditional banking models.
"""

# Analyze and highlight the text
highlighted, found_items = highlighter.highlight_text(text_to_improve)
results = highlighter.analyze_text(text_to_improve)

# Display the highlighted text
display(Markdown("### Text to Improve (AI Markers Highlighted)"))
display(Markdown(highlighted))

In [None]:
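import re

# Create a dictionary of words and phrases to replace (keys in lowercase)
replacements = {
    "delve into": "explore",
    "fascinating realm": "interesting world",
    "it's important to note that": "notably,",
    "when it comes to": "regarding",
    "plethora": "variety",
    "indeed": "certainly",
    "witnessed": "seen",
    "paradigm shift": "major change",
    "crucial role": "key part"
}

# Function to replace words and phrases in text.
# Matching ignores case so that sentence-initial markers ("Indeed", "When it comes to")
# are caught; the replacement is capitalized when the original match was.
def replace_ai_markers(text, replacements):
    improved_text = text
    for old, new in replacements.items():
        def swap(match, new=new):
            return new[0].upper() + new[1:] if match.group(0)[0].isupper() else new
        improved_text = re.sub(re.escape(old), swap, improved_text, flags=re.IGNORECASE)
    return improved_text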

# Improve the text
improved_text = replace_ai_markers(text_to_improve, replacements)

# Analyze the improved text
improved_highlighted, _ = highlighter.highlight_text(improved_text)
improved_results = highlighter.analyze_text(improved_text)

# Display the improved text
display(Markdown("### Improved Text (AI Markers Highlighted)"))
display(Markdown(improved_highlighted))
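
A replacement that never matches fails silently, for example when a marker opens a sentence and is therefore capitalized. A minimal sketch of a check for this, using a hypothetical helper `leftover_markers` that is not part of the highlighter, lists the replacement keys that are still present after the substitution:

```python
def leftover_markers(text, replacements):
    """Return replacement keys still present in text, ignoring case."""
    lowered = text.lower()
    return [old for old in replacements if old.lower() in lowered]


demo_replacements = {"indeed": "certainly", "plethora": "variety"}
demo_original = "Indeed, a plethora of factors matter."

# Plain str.replace is case-sensitive, so the capitalized "Indeed" is skipped.
demo_improved = demo_original
for old, new in demo_replacements.items():
    demo_improved = demo_improved.replace(old, new)

print(demo_improved)
print(leftover_markers(demo_improved, demo_replacements))
```

An empty list means every key was replaced wherever it occurred. The check uses plain substring matching, so it can also flag a key that only appears inside a longer word.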

In [None]:
# Compare the original and improved texts
comparison_data = {
    'Metric': ['Total Words', 'AI Markers', 'AI Word Percentage'],
    'Original Text': [
        results['total_words'],
        results['ai_markers'],
        f"{results['ai_word_percentage']:.2f}%"
    ],
    'Improved Text': [
        improved_results['total_words'],
        improved_results['ai_markers'],
        f"{improved_results['ai_word_percentage']:.2f}%"
    ]
}

comparison_df = pd.DataFrame(comparison_data)
display(comparison_df)

## Conclusion

This notebook has demonstrated how to use the SimpleAIWordHighlighter to:

1. Identify common AI-generated words and phrases in text
2. Compare AI-generated and human-written content
3. Add custom AI markers to the highlighter's word and phrase lists
4. Export and import word and phrase lists
5. Improve AI-generated text by replacing common AI markers

By using this tool, you can make AI-generated content read more like human-written text, which may improve its performance in contexts where AI detection is a concern, such as SEO or academic submissions.