<a href="https://colab.research.google.com/github/k-dinakaran/automation-of-wordpress-post-publication-using-AI-tools/blob/main/Developing_a_model_for_AI_Driven_content_Briefs_and_Outlines.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install --upgrade pip setuptools wheel

Collecting pip
  Using cached pip-24.2-py3-none-any.whl.metadata (3.6 kB)
Collecting setuptools
  Using cached setuptools-75.1.0-py3-none-any.whl.metadata (6.9 kB)
Using cached pip-24.2-py3-none-any.whl (1.8 MB)
Using cached setuptools-75.1.0-py3-none-any.whl (1.2 MB)
Installing collected packages: setuptools, pip
  Attempting uninstall: setuptools
    Found existing installation: setuptools 71.0.4
    Uninstalling setuptools-71.0.4:
      Successfully uninstalled setuptools-71.0.4
  Attempting uninstall: pip
    Found existing installation: pip 24.1.2
    Uninstalling pip-24.1.2:
      Successfully uninstalled pip-24.1.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.34.0 requires jedi>=0.16, which is not installed.
Successfully installed pip-24.2 setuptools-75.1.0


In [None]:
!pip install transformers torch




In [None]:
import pandas as pd
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

# Load pre-trained GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# GPT-2 defines no pad token, so reuse the EOS token ID for padding
model.config.pad_token_id = model.config.eos_token_id

# Step 1: Load and clean the dataset
data = pd.read_csv('/content/content_brief.csv.csv')

# Strip leading/trailing whitespace from every string column
data_clean = data.apply(lambda x: x.str.strip() if x.dtype == "object" else x)

# Step 2: Function to generate content brief and outline from the dataset
def generate_content_brief_from_dataset(topic):
    topic_lower = topic.lower()  # Convert topic to lowercase for matching

    # Literal (non-regex), case-insensitive substring match on the topic column
    matching_row = data_clean[data_clean['Topic (Primary Input)'].str.lower().str.contains(topic_lower, regex=False, na=False)]

    if not matching_row.empty:
        row = matching_row.iloc[0]  # Get the first matching row
        content_brief = f"""
        CONTENT BRIEF:
              - Title: {row['Title Tag']}
              - Meta Description: {row['Meta Description']}
              - Target Audience: {row['Target Audience']}
              - Keywords: {row['Keywords']}
        OUTLINE:
              - Introduction: {row['H1']}
              - Main Points: {row['Questions']}
        """
        return content_brief
    else:
        return None

# Step 3: Function to generate content brief using GPT-2 for new topics
def generate_content_brief_with_gpt(topic):
    prompt = (
        f"Write a well-structured content brief for a blog post on the topic '{topic}'. "
        "The content brief should include the following sections:\n\n"
        "1. **Title**: Give the post a compelling title.\n"
        "2. **Meta Description**: Write a meta description of 1-2 sentences.\n"
        "3. **Target Audience**: Specify the ideal audience for this post.\n"
        "4. **Keywords**: List 4-5 primary keywords.\n"
        "5. **Outline**:\n"
        "   - Introduction\n"
        "   - Main Points (3-4 key discussion points)\n\n"
        "Content Brief:\n"
    )

    inputs = tokenizer.encode(prompt, return_tensors="pt")

    outputs = model.generate(
        inputs,
        max_length=250,  # Cap total length (prompt + generated tokens)
        num_return_sequences=1,
        no_repeat_ngram_size=3,
        pad_token_id=model.config.eos_token_id,
        do_sample=True,
        temperature=0.3,  # Lower temperature for more focused output
        top_p=0.85
    )

    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)

    split_output = generated_text.split("Content Brief:", 1)[-1]

    # Fall back to a fixed template if the generated text has no "Title" section
    if not split_output or "Title" not in split_output:
        return f"""
        CONTENT BRIEF:
              - Title: How {topic} is Changing the Industry
              - Meta Description: This post explores the key impacts of {topic} on modern businesses, technology, and society.
              - Target Audience: Business leaders, technologists, and enthusiasts interested in {topic}.
              - Keywords: {topic}, industry impact, technology trends
        CONTENT OUTLINE:
              - Introduction: Understanding {topic}
              - Main Points:
                1. Key areas where {topic} is making an impact.
                2. The challenges and opportunities it presents.
                3. How industries are adapting to {topic}.
        """
    else:
        return split_output.strip()

# Step 4: User input for a new topic
new_topic = input("Please enter the topic: ").strip()

# Step 5: Check if the topic exists in the dataset; if not, use GPT-2
content_brief = generate_content_brief_from_dataset(new_topic)

if content_brief:
    print(f"\nContent Brief and Outline for '{new_topic}' (from dataset):")
    print(content_brief)
else:
    print(f"\nGenerating content brief for '{new_topic}' using GPT-2:")
    gpt_output = generate_content_brief_with_gpt(new_topic)
    print(gpt_output)




Please enter the topic: artificial intelligence in sport


The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.



Generating content brief for 'artificial intelligence in sport' using GPT-2:

        CONTENT BRIEF:
              - Title: How artificial intelligence in sport is Changing the Industry
              - Meta Description: This post explores the key impacts of artificial intelligence in sport on modern businesses, technology, and society.
              - Target Audience: Business leaders, technologists, and enthusiasts interested in artificial intelligence in sport.
              - Keywords: artificial intelligence in sport, industry impact, technology trends
        CONTENT OUTLINE:
              - Introduction: Understanding artificial intelligence in sport
              - Main Points: 
                1. Key areas where artificial intelligence in sport is making an impact.
                2. The challenges and opportunities it presents.
                3. How industries are adapting to artificial intelligence in sport.
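
The attention-mask warning in the output above appears because `generate` receives only token IDs. Below is a minimal sketch of one way to address it, assuming the GPT-2 `tokenizer` and `model` objects loaded earlier are passed in. Calling the tokenizer directly returns both `input_ids` and `attention_mask`, and `max_new_tokens` bounds only the generated part rather than prompt plus output. The function name `generate_with_attention_mask` and its default of 150 new tokens are illustrative choices, not part of the original notebook.

```python
def generate_with_attention_mask(prompt, tokenizer, model, max_new_tokens=150):
    """Generate a continuation of `prompt`, passing an explicit attention mask.

    `tokenizer` and `model` are assumed to be the GPT-2 objects loaded above.
    """
    # Calling the tokenizer (rather than .encode) returns input_ids AND attention_mask.
    encoded = tokenizer(prompt, return_tensors="pt")
    prompt_len = len(encoded["input_ids"][0])

    outputs = model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        max_new_tokens=max_new_tokens,  # counts generated tokens only
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
        temperature=0.3,
        top_p=0.85,
    )

    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = outputs[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With the notebook's objects this would be called as `generate_with_attention_mask(prompt, tokenizer, model)`. Because only the new tokens are decoded, the `split("Content Brief:")` step is not needed on its result.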
        
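
The dataset lookup relies on `str.contains`, which treats its pattern as a regular expression unless told otherwise, so a topic containing characters such as `.` or `(` can match the wrong rows or raise an error. The sketch below shows literal matching on a tiny made-up frame. The helper name `find_topic_rows` and the demo rows are hypothetical; only the column name `Topic (Primary Input)` comes from the notebook.

```python
import pandas as pd

def find_topic_rows(df, topic, column="Topic (Primary Input)"):
    """Return rows whose `column` contains `topic` as a literal, case-insensitive substring."""
    needle = topic.strip().lower()
    if not needle:
        # An empty string is a substring of everything, so treat it as "no match".
        return df.iloc[0:0]
    mask = df[column].str.lower().str.contains(needle, regex=False, na=False)
    return df[mask]

# Tiny made-up frame, for illustration only.
demo = pd.DataFrame({"Topic (Primary Input)": ["API design", "A.I. in sport", None]})
print(find_topic_rows(demo, "a.i")["Topic (Primary Input)"].tolist())
```

Under regex matching, the pattern `a.i` would also match "api design", because `.` stands for any character. With `regex=False` only the row that literally contains "a.i" is returned.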

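
The fallback decision in `generate_content_brief_with_gpt` checks only whether the word "Title" appears in the generated text. A slightly stricter check, sketched here under the assumption that the five section names from the prompt are the ones that matter, reports which sections are absent. `REQUIRED_SECTIONS` and `missing_sections` are illustrative names, not part of the original notebook.

```python
REQUIRED_SECTIONS = ("Title", "Meta Description", "Target Audience", "Keywords", "Outline")

def missing_sections(text, required=REQUIRED_SECTIONS):
    """Return the section names from `required` that do not appear in `text`."""
    lowered = text.lower()
    return [name for name in required if name.lower() not in lowered]

# Made-up sample text, for illustration only.
sample = "Title: AI in Sport\nKeywords: ai, sport, analytics"
print(missing_sections(sample))
```

A non-empty result could then trigger the fallback template. This remains a loose substring test (for example, "Subtitle" would count as "Title"), so it is a heuristic rather than a parser.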