# Gundry MD SEO Analytics Project

* *Author*: Mher Arabian<br>
* *Date*: 2024-02-12

## SEO and Digital Marketing Platforms
***

### What are they?

Tools like Ahrefs, SEMrush, and DataForSEO are comprehensive SEO (Search Engine Optimization) and digital marketing platforms. They provide tools and analytics designed to help businesses improve their online visibility, track their search engine rankings, research competitors, and optimize their marketing strategies. Here's an overview of the core functionality these platforms offer:

1. **Keyword Research**
    
    These tools offer in-depth keyword research capabilities, allowing users to find relevant keywords for their content and SEO strategies. This includes data on keyword search volume, keyword difficulty, related keywords, and the intent behind searches. Understanding which keywords to target can help in creating content that drives more organic traffic.


2. **Competitor Analysis**

    They enable businesses to analyze their competitors' online strategies, from the keywords they rank for to their backlink profiles and paid advertising strategies. This insight can help businesses identify opportunities and gaps in their own strategies and make informed decisions to gain a competitive edge.

3. **SEO Audits**

    SEO audit features allow users to evaluate their websites for common SEO issues that might be affecting their search engine rankings. This can include checks for broken links, improper use of tags, page speed issues, mobile usability, and more. Audits provide actionable insights to improve website health and performance.


4. **Rank Tracking**

    Rank tracking features enable businesses to monitor where their pages rank on search engine results pages (SERPs) for targeted keywords over time. This is crucial for understanding the effectiveness of SEO efforts and adjusting strategies as needed.

5. **Backlink Analysis**

    Backlink tools analyze the quantity and quality of external sites linking back to a user's website. Since backlinks are a significant factor in search engine ranking algorithms, understanding and improving your backlink profile is essential for SEO.

6. **Content Marketing Tools**

    Some platforms offer tools to help with content creation and optimization, suggesting ways to improve content for better engagement and rankings. This can include content templates, tone analysis, and recommendations for optimizing for specific keywords.


7. **Social Media and Paid Advertising Analysis**

    Beyond organic search, these tools often extend into analyzing social media presence and performance, as well as paid advertising campaigns across platforms like Google Ads and social media ads. They provide insights into ad performance, audience targeting, and ROI.

### Popular Tools

1. **SEMrush**: SEMrush is a widely used tool for SEO and digital marketing. It offers an API (for Business plan subscribers) that allows you to access their extensive dataset, including keyword research, site audits, competitor analysis, and backlink tracking.

2. **Ahrefs**: Ahrefs is another popular SEO tool that provides an API for accessing their data, including site audits, backlink analysis, and keyword research. The API is available to subscribers, and pricing depends on the volume of data you need.

3. **Moz**: Moz offers an API for accessing their SEO data, including link metrics, keyword research, and domain information. They offer both free and paid access, with the paid API providing more data and higher limits.

4. **Majestic**: Majestic is known for its backlink analysis tools, and their API gives access to their comprehensive backlink data. It's a good choice if your focus is on link building and backlink analysis.

5. **Serpstat**: Serpstat provides an API for accessing their SEO platform, including keyword research, competitor analysis, and backlink analysis. They offer various subscription plans based on usage levels.

6. **SpyFu**: SpyFu offers insights into competitors' keyword strategies and PPC campaigns. Their API allows you to access this data programmatically, which can be useful for integrating competitive intelligence into your tools.

7. **Google Search Console**: While not a direct alternative in the same vein as the others, Google Search Console's API provides valuable data directly from Google about your site's performance in search results, including search traffic, impressions, clicks, and position. It's free to use for verified site owners.

8. **DataForSEO**: An alternative that offers a wide range of SEO data through its API, catering to developers and companies who need to integrate SEO data into their applications or services. DataForSEO is particularly known for its comprehensive and flexible APIs tailored to various SEO tasks. The next section gives a brief overview of what it offers.

### Why DataForSEO?


**Features and Services:**

* **Keywords Data API**: Provides extensive data on keywords, including search volume, CPC, competition, and SERP features. It's useful for keyword research and SEO strategy planning.

* **SERP API**: Allows you to retrieve search engine results pages (SERP) data for a specific query, which is crucial for analyzing search trends, competitor rankings, and understanding Google's SERP layouts.

* **OnPage API**: Offers detailed analysis of webpage content and structure for SEO optimization, including checks for meta tags, headings, and other on-page elements.

* **Backlink API**: Provides data on backlinks for any domain, which is vital for backlink analysis and understanding a website's link profile.

* **Traffic Analytics API**: Gives insights into web traffic, including sources, volumes, and audience behavior. This can help in understanding the market and competitors.

* **Rank Tracker API**: Enables you to track the rankings of specific keywords over time across different search engines and locations.

* **New Backlink API**: Backlinks API is DataForSEO’s newest service that provides up-to-date backlink analysis data on billions of domains.

**Advantages:**

* **Comprehensive Data**: DataForSEO offers a wide array of SEO data points, making it a versatile tool for different SEO needs.

* **Flexible Pricing**: They provide a pay-as-you-go pricing model, which can be more cost-effective for users with varying usage requirements. This model allows for greater flexibility and control over costs.

* **High Scalability**: The API is designed to handle requests at scale, making it suitable for businesses of all sizes, from startups to large enterprises.

* **Regular Updates**: DataForSEO frequently updates its APIs to ensure accuracy and relevance of data, adapting to the ever-changing SEO landscape.

## Definitions
***

### Campaign

In the context of SEO and digital marketing, a "campaign" refers to a coordinated series of efforts aimed at improving the visibility and ranking of a website or specific web pages in search engine results. A campaign typically focuses on a set of targeted keywords and employs various SEO strategies, such as content creation, on-page optimization, and backlink building, to improve rankings for those keywords.


### SERP



SERP stands for Search Engine Results Page. This is the page that users see after they search for a term on a search engine like Google. A SERP contains a list of results that the search engine has determined are relevant to the user's query. These results can include website pages, local business listings, images, videos, featured snippets, and more. In SEO analysis, attention is often focused on the ranking order of these results, as higher-ranked results are more likely to be clicked on by users.

A "positive SERP" refers to a result that is favorable for the website in question—meaning that the website or one of its pages appears within the top 10 results for a given keyword. Being in the top 10 is considered highly beneficial because these results get the majority of clicks from users.


### Search Volume

Search volume is a metric that indicates the number of times a specific keyword or phrase is searched for in a search engine over a given period of time. Typically, search volume is represented as an average monthly search count. This metric is crucial in Search Engine Optimization (SEO) and Search Engine Marketing (SEM) as it helps marketers understand the popularity and demand for certain keywords.



### Keywords

Keywords are the terms and phrases people type into search engines. For example, the keywords related to Dr. Gundry and his products, such as "bio complete 3," are the specific terms people use when they're looking for information on his dietary supplements, health advice, or other related products. These keywords reflect the interests and concerns of potential customers or followers of Dr. Gundry's health protocols. For instance, someone searching for "bio complete 3" is likely looking for information on this particular supplement, its benefits, reviews, or where to purchase it.

### Competition

Competition is a measure of how many advertisers are bidding on a keyword for their ads. "High" competition means many advertisers are interested in the keyword, which could make it harder or more expensive to rank for or bid on. "Low" means fewer advertisers, and "Medium" is in between.

### Backlink

A backlink in the context of SEO (Search Engine Optimization) is a link from one website to another. It's considered an important factor for improving a website's search engine rankings, as search engines view backlinks as a form of endorsement or vote of confidence from one site to another. The quality, relevance, and number of backlinks can significantly influence a website's visibility and ranking on search engine results pages (SERPs).





### Microsites

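A microsite is a focused page or small group of pages built around a single theme, product category, or audience. In this project, the term refers to goal-specific pages on the Gundry MD site, for example: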
```python
microsites = ['https://gundrymd.com/health-goal/weight/', 'https://gundrymd.com/health-goal/energy/']
```

Microsites allow Gundry MD to:

* **Target Specific Keywords**: Each microsite can be optimized for specific keywords related to its theme, such as "weight management supplements" for the weight goal page or "lectin-free food" for the lectin-free food shop. This helps in ranking for these targeted keywords.
* **Serve Niche Audiences**: Microsites cater to specific interests or needs of different segments of the audience, providing tailored content that directly addresses their concerns or goals.
* **Enhance User Experience**: By creating separate microsites for different health goals and product categories, users can more easily find the information or products they are looking for, improving navigation and engagement.
* **Boost SEO**: Microsites can generate additional entry points for organic search traffic, linking back to the main site and improving the overall domain's authority and search engine ranking.

### Affiliate Marketing

Affiliate marketing is a marketing arrangement by which an online retailer pays commission to an external website for traffic or sales generated from its referrals. This can often involve creating content, such as articles, blog posts, videos, or other forms of media that include affiliate links. When someone clicks on these links and makes a purchase, the content creator earns a commission from the sale.

## Imports
***

In [165]:
import json
import base64
import os
from pprint import pprint
from datetime import date
from dotenv import load_dotenv
import requests
import pandas as pd
import numpy as np 
from openai import OpenAI # openai==1.12.0
from client import RestClient # python client for Ope



In [166]:
# make sure using correct python virtual environment
import sys
sys.executable

'/Users/marabian/Projects/gundry-md-seo-project/.venv/bin/python3'

## Useful Links

* [DataForSEO API Explorer](https://app.dataforseo.com/api-explorer)

* [DataForSEO V3 API Official Documentation](https://docs.dataforseo.com/v3/?bash)

## DataForSEO API Setup
***

To follow good security practices, place your *DataForSEO* credentials in a `.env` file in the same directory as this notebook.

Example `.env` file:

```
DATAFORSEO_USERNAME=<Your Username>
DATAFORSEO_PASSWORD=<Your Password>
```

Let's make sure the *DataForSEO* API is working. A successful request returns status code 20000.

In [217]:
# Load environment variables from .env file
load_dotenv()

# Fetching credentials from environment variables
dataforseo_username = os.environ.get('DATAFORSEO_USERNAME')
dataforseo_password = os.environ.get('DATAFORSEO_PASSWORD')

# Ensuring credentials were successfully fetched
if not dataforseo_username or not dataforseo_password:
    raise ValueError("DataForSEO credentials are not set in environment variables.")

client = RestClient(dataforseo_username, dataforseo_password)

# Making a connection request to check available SERP API endpoints
response = client.get("/v3/serp/endpoints")

# Selectively printing key details from the response
print(f"Status Code: {response.get('status_code')}")
print(f"Status Message: {response.get('status_message')}")
print(f"Cost: {response.get('cost')} USD")
tasks = response.get('tasks', [])
if tasks:
    print(f"Tasks Count: {len(tasks)}")
    first_task_result = tasks[0].get('result', [])
    if first_task_result:
        print("Sample Endpoint:", first_task_result[0][0])  # Printing the first endpoint as a sample
else:
    print("No tasks found in the response.")

Status Code: 20000
Status Message: Ok.
Cost: 0 USD
Tasks Count: 1
Sample Endpoint: v3/serp/baidu/languages
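
The `RestClient` used above comes from a local `client.py` that is not shown in this notebook. As a rough sketch of what such a client presumably does under the hood, the snippet below builds an HTTP Basic `Authorization` header by base64-encoding `username:password`. The function name `basic_auth_header` and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only
header = basic_auth_header("user@example.com", "secret")
print(header)
```

With `requests`, passing `auth=(username, password)` produces the same kind of header, so it rarely needs to be built by hand.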


## GPT API Setup
***

Now let's make sure our *OpenAI* credentials are working. If you don't have credentials, follow these steps to obtain them:

1. Go to https://platform.openai.com/signup and create an account.
2. Go to https://platform.openai.com/
3. On the left sidebar, click on API keys and create a new secret key.


There is a free tier, but you may need to add your payment information to your OpenAI account to use their API. You can do this by going to *Settings > Billing* on the left sidebar.

Once you have your API key, make sure to place it in your `.env` file like this:

```
OPENAI_API_KEY=<Your Key>
```

In [226]:
# This loads the environment variables from .env
load_dotenv() 

# Now, you can access OPENAI_API_KEY
openai_api_key = os.getenv('OPENAI_API_KEY')

openai_client = OpenAI(
  api_key=openai_api_key
)


### Helper Functions

Let's define some helper functions which will allow us to interact with the GPT API.


In [227]:
# Sends conversation history to the OpenAI API and gets a model-generated response.
def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=1):
    response = openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature, # this is the degree of randomness of the model's output
    )

    #print(str(response.choices[0].message))
    return response.choices[0].message.content


# Convert a JSON string that represents a list of dictionaries into an actual Python list of dictionaries
def read_string_to_list(input_string):
    if input_string is None:
        return None

    try:
        input_string = input_string.replace("'", "\"")  # Replace single quotes with double quotes for valid JSON
        data = json.loads(input_string)
        return data
    except json.JSONDecodeError:
        print("Error: Invalid JSON string")
        return None   
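
One caveat with `read_string_to_list`: replacing every single quote also rewrites apostrophes inside values, so a keyword such as `dr gundry's` turns into invalid JSON. A possible alternative, sketched below under the assumption that the model returns either valid JSON or a Python-style literal, tries `json.loads` first and falls back to `ast.literal_eval`. The name `parse_model_list` is hypothetical.

```python
import ast
import json


def parse_model_list(text):
    """Parse a JSON string or Python-style literal; return None on failure."""
    if text is None:
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    try:
        # literal_eval only evaluates literals (lists, dicts, strings, numbers)
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


# Valid JSON is parsed directly
print(parse_model_list('[{"group": "Brand", "keywords": ["dr gundry\'s"]}]'))
# A single-quoted, Python-style reply is handled by the fallback
print(parse_model_list("[{'group': 'Brand', 'keywords': [\"dr gundry's\"]}]"))
```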

## Campaign Keyword Analysis
***

### Get Top 100 Keywords

Let's use *DataForSEO's* **keywords_for_site** API to get the top 100 keywords by **search volume** (Google) related to *Dr. Gundry's* [website](https://gundrymd.com/).

Then we will load these into a python dictionary, where the key is the keyword itself, and the value is the search volume.

In [180]:
# Endpoint URL for the "Keywords for Site" live task
url = "https://api.dataforseo.com/v3/keywords_data/google_ads/keywords_for_site/live"
# Prepare the request payload targeting "https://gundrymd.com/"
payload = json.dumps([
    {
        "location_code": 2840,  # Location code for the United States
        "language_code": "en",  # English language
        "target": "https://gundrymd.com/"
    }
])

# Initialize an empty dictionary to store the keywords and their search volumes
keywords_dict = {}

# Make the POST request to DataForSEO using the prepared payload
response = requests.post(url, auth=(dataforseo_username, dataforseo_password), data=payload, headers={"Content-Type": "application/json"})
data = response.json()


# Check if the request was successful (status_code == 20000)
if data['status_code'] == 20000:
    # Loop through each task in the response
    for task in data['tasks']:
        # A failed task can have 'result' set to None, so fall back to an empty list
        for result in task.get('result') or []:
            # Map keyword -> search volume; treat a missing volume as 0 so sorting works
            keywords_dict[result['keyword']] = result.get('search_volume') or 0

# After collecting all keywords, sort the dictionary by search volume (descending) and get the top 100
# Note: Dictionaries are ordered (by insertion order) in Python 3.7+
sorted_keywords = dict(sorted(keywords_dict.items(), key=lambda item: item[1], reverse=True)[:100])

# Now 'sorted_keywords' contains the top 100 keywords and their search volumes
# print(sorted_keywords)


Let's print the **top 20 keywords associated with Gundry MD**.

In [182]:
# Print the first 20 items
for key, value in list(sorted_keywords.items())[:20]:
    print(f"{key}: {value}")

vitamins shop: 823000
pro biotics: 301000
thevitamin shoppe: 201000
vitamin c: 165000
vitamin shoppe near me: 90500
best women's probiotic: 60500
best womens probiotics: 60500
top rated women's probiotic: 60500
gut health: 49500
doctor gundry: 40500
probiotic foods: 40500
best probiotics: 40500
best prebiotics: 40500
supplement store near me: 40500
probiotic diet: 40500
best tablets for weight loss: 40500
gundry doctor: 40500
foods with probiotics in them: 40500
best weight loss capsules: 40500
steven gundry dr: 40500


Now let's store just the keywords in a Python list, ordered by ascending search volume.

In [183]:
# Sort the dictionary based on values in ascending order and store only the keys
ordered_keywords = [key for key, _ in sorted(sorted_keywords.items(), key=lambda item: item[1])]

In [184]:
print(f"Total number of keywords: {len(ordered_keywords)}")

Total number of keywords: 100


### Generative AI Use-case 1: Group Keywords

Grouping keywords is a common and effective practice in SEO. It helps in organizing your SEO strategy around thematic clusters or topics, improving content relevance, and enhancing user experience on your website. This approach aligns with search engines' focus on topic relevance and intent matching, potentially boosting your site's rankings for related searches. Grouping keywords can also streamline content creation and optimization efforts, making it easier to target multiple related keywords within a single piece of content or across a series of interconnected pages.

Let's use *OpenAI's* **gpt-3.5-turbo** to group these 100 keywords. We will ask the model to identify natural clusters or groupings based on thematic similarity, relevance, or any other criteria it deems appropriate.

In [185]:
# Convert the list of keywords to a string format for the prompt
keywords_string = ', '.join(ordered_keywords)

# Craft the prompt for the model to group keywords
prompt = {
    "role": "user",
    "content": (
        "Please analyze the following list of keywords and group them into thematic clusters: "
        f"{keywords_string}. "
        "Provide a title for each group and list the keywords belonging to it. "
        "The output format should be a JSON string that represents a list of dictionaries. "
        "Make sure to only return the JSON string in the output, nothing else."
    )
}

# Use the helper function to send the prompt to the model and get the response
# temperature=0 reduces randomness, making the output more deterministic
model_response = get_completion_from_messages([prompt], temperature=0)

Let's print the **group names**, and the number of groups found by the model.

In [186]:
# Convert JSON string to Python dictionary
keyword_groups_dict = json.loads(model_response)
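
Chat models sometimes wrap JSON in a Markdown code fence even when asked not to, which makes a bare `json.loads` call raise. A minimal sketch of a more forgiving loader follows; `load_model_json` is a hypothetical helper, and the fence-stripping pattern is an assumption about how such replies typically look.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable


def load_model_json(text: str):
    """Parse JSON from a model reply, tolerating an optional Markdown code fence."""
    cleaned = text.strip()
    pattern = rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$"
    match = re.match(pattern, cleaned, flags=re.DOTALL)
    if match:
        # Keep only the text between the opening and closing fences
        cleaned = match.group(1)
    return json.loads(cleaned)


plain = '{"Group 1: Example": ["keyword a", "keyword b"]}'
fenced = FENCE + "json\n" + plain + "\n" + FENCE
print(load_model_json(plain) == load_model_json(fenced))
```

The function still raises `json.JSONDecodeError` when the reply is not JSON at all, so the caller can decide whether to retry the request.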

In [188]:
# Print all the group names
for key in keyword_groups_dict.keys():
    print(key)

Group 1: Dr. Gundry Products
Group 2: Weight Loss Supplements
Group 3: Probiotics and Gut Health
Group 4: Vitamin and Supplement Store
Group 5: Other Health-related Keywords


First, let's strip the "Group N:" prefix from each group name.

In [189]:
# Remove "Group Number:" part from each key
keyword_groups_dict = {key.split(': ', 1)[1]: value for key, value in keyword_groups_dict.items()}

Let's look at some of these groups.

In [190]:
# Group: Dr. Gundry Products
keyword_groups_dict['Dr. Gundry Products']

['dr gundry products',
 'dr steven gundry products',
 'dr gundry diet',
 'steven gundry md products',
 'steve gundry com',
 'total restore',
 'the plant paradox',
 "dr gundry's",
 'dr gundry md',
 'doctor gundry md',
 'gundry md california',
 'gundrymd',
 'dr gundry md',
 'gundry doctor',
 'steven gundry dr',
 'stevengundrymd',
 'dr steven gundry md',
 'dr steven r gundry',
 'dr steven r gundry md']

In [191]:
keyword_groups_dict['Weight Loss Supplements']

['supplements for weight loss',
 'fat burning supplements',
 'best over the counter weight loss supplements',
 'vitamins that aid in weight loss',
 'fat reduction supplements',
 'protein food supplements',
 'best supplements for weight loss',
 'best weight loss supplements for women',
 'best weight loss products for women',
 'best foods for gut health',
 'best dietary supplements for weight loss',
 'good supplements for weight loss',
 'diet supplements for weight loss',
 'dietary pills',
 'weight control supplements',
 'loss weight supplement',
 'best tablets for weight loss',
 'best weight loss capsules']

## SERP Analysis
***

### Get SERPs for Keywords

Let's focus on **Group 1: Dr. Gundry Products**, which was previously identified by GPT (Generative AI Use-case 1).

These keywords are directly related to the brand and its product offerings, making them critical for maintaining a positive online presence. Users searching these terms are likely looking for information about Dr. Gundry's products, potentially with the intent to purchase. Ensuring that the search engine results pages (SERPs) for these keywords reflect **positive sentiment** and **relevant information** is essential for brand reputation management. 

Let's **retrieve the SERP data for a single keyword** using **DataForSEO's SERP API**. The same loop also works for a list of keywords.

In [220]:
# Define the endpoint URL
url = "https://api.dataforseo.com/v3/serp/google/organic/live/regular"


# Testing with just 1 keyword so the code runs fast, but can pass entire list of keywords
keywords = [keyword_groups_dict['Dr. Gundry Products'][0]]

print(f"Keywords: {keywords}")

# Initialize a list to store SERP data
serp_data = []

# Iterate over keywords to fetch SERP data
for keyword in keywords:
    payload = json.dumps([{
        "language_code": "en",
        "location_code": 2840,  # Location code for the United States
        "keyword": keyword
    }])
    # requests builds the Basic auth header from `auth`, so only Content-Type is set here
    headers = {"Content-Type": "application/json"}
    response = requests.post(url, auth=(dataforseo_username, dataforseo_password), headers=headers, data=payload)
    result = response.json()

    # Process the response
    print(result['status_code'])
    if result['status_code'] == 20000:
        task_data = result['tasks'][0]['result'][0]['items']
        for item in task_data:
            if item['type'] in ['organic', 'paid']:
                serp_data.append({
                    "keyword": keyword,
                    "position": item.get('rank_absolute', None),
                    "type": item['type'],
                    "url": item.get('url', None),
                    "title": item.get('title', None)
                })




Keywords: ['dr gundry products']
20000
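
The loop above indexes straight into `result['tasks'][0]['result'][0]['items']`, which raises if a task fails and its `result` is empty or `None`. The sketch below shows one defensive way to walk the same structure. It assumes the response shape used in this notebook (`tasks`, then `result`, then `items`); `extract_serp_rows` is a hypothetical helper, and the sample payload is made up for illustration.

```python
def extract_serp_rows(response_json, keyword, wanted_types=("organic", "paid")):
    """Collect SERP rows from a DataForSEO-style response, skipping empty tasks."""
    rows = []
    for task in response_json.get("tasks") or []:
        for result in task.get("result") or []:
            for item in result.get("items") or []:
                if item.get("type") in wanted_types:
                    rows.append({
                        "keyword": keyword,
                        "position": item.get("rank_absolute"),
                        "type": item.get("type"),
                        "url": item.get("url"),
                        "title": item.get("title"),
                    })
    return rows


# Made-up payload: one failed task (result is None) and one successful task
sample = {
    "status_code": 20000,
    "tasks": [
        {"result": None},
        {"result": [{"items": [
            {"type": "organic", "rank_absolute": 1,
             "url": "https://example.com/", "title": "Example"},
            {"type": "people_also_ask", "rank_absolute": 2},
        ]}]},
    ],
}
print(extract_serp_rows(sample, "dr gundry products"))
```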


Let's take a look at some of the rows in our SERP data.

In [221]:
# At this point, serp_data contains the collected SERP information
# for further analysis or storage, let's convert this list of dictionaries
# into a pandas DataFrame

serp_df = pd.DataFrame(serp_data)
print(serp_df.shape)
serp_df.head()

(99, 5)


| | keyword | position | type | url | title |
|---|---|---|---|---|---|
| 0 | dr gundry products | 2 | organic | https://gundrymd.com/ | Dr. Gundry Supplements and Wellness Resources |
| 1 | dr gundry products | 4 | organic | https://gundrymd.com/shop/supplements/ | Shop All Gundry MD Supplements |
| 2 | dr gundry products | 5 | organic | https://www.amazon.com/dr-steven-gundry-produc... | Dr Steven Gundry Products |
| 3 | dr gundry products | 6 | organic | https://drgundry.com/groceries/ | Groceries - Dr Gundry |
| 4 | dr gundry products | 7 | organic | https://www.amazon.com/Gundry-Restore-Lining-S... | Amazon.com: Gundry MD® Total Restore® Gut Heal... |


Let's look at the URL for one of these.

In [222]:
# Access the first row
first_row = serp_df.iloc[0]

# get URL column
url = first_row["url"]

print(url)

https://gundrymd.com/


### Interpreting Returned SERP Data



* **Keyword**: This is the search term or phrase you've analyzed. You see multiple rows for the same keyword because the SERP analysis fetches multiple results (URLs) that rank for this keyword. Each row represents a different webpage that ranks in the search results for the given keyword.

* **Position**: This indicates the ranking position of the URL for the specified keyword in Google's search results. Position 1 is the top-ranking result, position 2 is the second, and so forth. It's crucial for SEO as websites aim to rank as high as possible, ideally on the first page of search results, to increase visibility.

* **Type**: This field specifies the type of search result. In the rows shown above, all entries are "organic," meaning they are natural search results as opposed to "paid" results (like advertisements). Organic search results are ranked by the relevance of the webpage to the search query and other SEO factors, rather than by paid placement.

* **URL**: This is the web address of the page that ranks in the search results for the keyword. Each row has a different URL because each position in the search results corresponds to a different webpage. This column is essential for understanding which pages are your competition for a keyword and analyzing the content and SEO strategies that may contribute to their high ranking.

* **Title**: This is the title of the webpage as it appears in the search results. It's often the \<title\> tag of the HTML document. The title is crucial for SEO and user engagement, as it's one of the first things a user sees in the search results and can influence whether they click on a link. A well-crafted title should be relevant to the keyword, engaging, and accurately reflect the content of the page.
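
Using the definition of a "positive SERP" from earlier (a page from the site appears in the top 10 results), the returned rows can be checked with a few lines of standard-library Python. This is only a sketch: `has_positive_serp` and its `brand_domains` argument are hypothetical, and the sample rows are abbreviated from the output shown above.

```python
from urllib.parse import urlparse


def has_positive_serp(rows, brand_domains, top_n=10):
    """Return True if any row within the top_n positions belongs to a brand domain."""
    for row in rows:
        position = row.get("position")
        host = urlparse(row.get("url") or "").netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if position is not None and position <= top_n and host in brand_domains:
            return True
    return False


rows = [
    {"keyword": "dr gundry products", "position": 2, "url": "https://gundrymd.com/"},
    {"keyword": "dr gundry products", "position": 6, "url": "https://drgundry.com/groceries/"},
]
print(has_positive_serp(rows, {"gundrymd.com"}))  # True
```

Passing `serp_data` (the list of dictionaries built in the loop above) in place of `rows` would apply the same check to the full result set.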

### Generative AI Use-case 2: Sentiment Analysis and Relevance

This code iterates over each SERP entry obtained from the previous step, sends the title and URL for sentiment and relevance analysis through a GPT-based function, and updates the DataFrame with the results.

Essentially we want to make sure the SERPs for these keywords (from Group 1: Dr. Gundry Products) express positive sentiment, and are directly relevant to the brand/campaign.

Let's define some helper functions below to accomplish this task.

In [223]:
def analyze_sentiment(title):
    prompt = f"Given the title '{title}', determine if the sentiment is positive, negative, or neutral."
    response = get_completion_from_messages([{"role": "user", "content": prompt}])
    # Assuming the model's response is straightforward, like "The sentiment is positive."
    # Extracting sentiment from the response
    try:
        sentiment = response.split('sentiment is ')[1].split('.')[0].strip()
        return sentiment
    except IndexError:
        return "neutral"  # Default fallback

In [233]:
def check_relevance(title, url, keyword):
    prompt = (
        f"Given the title '{title}', URL '{url}', and the keyword '{keyword}', "
        "is the content likely relevant to Gundry MD? Make sure to consider external websites like Amazon. "
        "Answer with yes or no."
    )
    response = get_completion_from_messages([{"role": "user", "content": prompt}])
    # Accept replies such as "Yes" or "Yes." by checking only how the answer starts
    return response.strip().lower().startswith("yes")


Let's test the functions above to make sure they work.

In [234]:
# Example usage
title_example = "10 Best Supplements for Heart Health - Gundry MD"
url_example = "https://gundrymd.com/supplements/heart-health"
keyword_example = "heart health supplements"
sentiment = analyze_sentiment(title_example)
relevance = check_relevance(title_example, url_example, keyword_example)
print(f"Sentiment: {sentiment}, Relevance to Gundry MD?: {relevance}")

Sentiment: neutral, Relevance to Gundry MD?: True


Let's run the functions above on the first 5 entries of our SERP. We will add the columns "Sentiment" and "Relevance" with their corresponding values returned by GPT. Then we will print the final dataframe.

In [235]:
# Create a copy of the first 5 rows from serp_df
serp_df_temp = serp_df.loc[:4].copy()

# Applying the sentiment analysis function
serp_df_temp['Sentiment'] = serp_df_temp['title'].apply(analyze_sentiment)

# Applying the relevance check function
# Assuming 'keyword' is used for relevance check, adjust as necessary
serp_df_temp['Relevance'] = serp_df_temp.apply(lambda row: check_relevance(row['title'], row['url'], row['keyword']), axis=1)

In [236]:
serp_df_temp

| | keyword | position | type | url | title | Sentiment | Relevance |
|---|---|---|---|---|---|---|---|
| 0 | dr gundry products | 2 | organic | https://gundrymd.com/ | Dr. Gundry Supplements and Wellness Resources | neutral | True |
| 1 | dr gundry products | 4 | organic | https://gundrymd.com/shop/supplements/ | Shop All Gundry MD Supplements | neutral | True |
| 2 | dr gundry products | 5 | organic | https://www.amazon.com/dr-steven-gundry-produc... | Dr Steven Gundry Products | neutral | True |
| 3 | dr gundry products | 6 | organic | https://drgundry.com/groceries/ | Groceries - Dr Gundry | neutral | True |
| 4 | dr gundry products | 7 | organic | https://www.amazon.com/Gundry-Restore-Lining-S... | Amazon.com: Gundry MD® Total Restore® Gut Heal... | neutral | True |


So for the keyword "dr gundry products", GPT judged these SERP entries to be neutral in sentiment and relevant to Gundry MD.

**Ideas for Improvement**:

1. Instead of using only the URL and title of each SERP entry, scrape the page and use its HTML content for sentiment and relevance analysis.
2. Construct more sophisticated prompts for further analysis of SERP data.
3. Put guardrails in place to detect incorrect or hallucinated output from the GPT model.
4. Use GPT-4, which makes far fewer mistakes. It is much more powerful, but it also costs more.
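
As one possible starting point for the guardrails in item 3, the sketch below normalizes a free-text model reply into one of a fixed set of labels and returns `None` when the reply is ambiguous, rather than silently defaulting to "neutral". `normalize_label` is a hypothetical helper and not part of the notebook.

```python
import re


def normalize_label(reply, allowed=("positive", "negative", "neutral")):
    """Map a model reply to exactly one allowed label, or None if unclear."""
    if not reply:
        return None
    words = set(re.findall(r"[a-z]+", reply.lower()))
    matches = [label for label in allowed if label in words]
    # Accept only unambiguous replies that mention exactly one label
    return matches[0] if len(matches) == 1 else None


print(normalize_label("The sentiment is Positive."))    # positive
print(normalize_label("Could be positive or neutral"))  # None
print(normalize_label("I cannot tell."))                # None
```

A `None` result can then be logged or retried, which makes parsing failures visible instead of blending them into the "neutral" bucket.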

## Competitor Analysis 
***

### Getting SERPs for Keywords

Let's focus on another keyword group related to Gundry MD. Remember our groups were:

In [238]:
# Print all the group names
for key in keyword_groups_dict.keys():
    print(key)

Dr. Gundry Products
Weight Loss Supplements
Probiotics and Gut Health
Vitamin and Supplement Store
Other Health-related Keywords


Let's look at the "Weight Loss Supplements" group.

In [242]:
keyword_groups_dict['Weight Loss Supplements']

['supplements for weight loss',
 'fat burning supplements',
 'best over the counter weight loss supplements',
 'vitamins that aid in weight loss',
 'fat reduction supplements',
 'protein food supplements',
 'best supplements for weight loss',
 'best weight loss supplements for women',
 'best weight loss products for women',
 'best foods for gut health',
 'best dietary supplements for weight loss',
 'good supplements for weight loss',
 'diet supplements for weight loss',
 'dietary pills',
 'weight control supplements',
 'loss weight supplement',
 'best tablets for weight loss',
 'best weight loss capsules']

Let's pick the keyword "supplements for weight loss" from this group and get its SERP results:

In [252]:
keyword_of_interest = keyword_groups_dict['Weight Loss Supplements'][0]
print(keyword_of_interest)

supplements for weight loss


In [253]:
# Define the endpoint URL
url = "https://api.dataforseo.com/v3/serp/google/organic/live/regular"


# Testing with just 1 keyword so the code runs fast, but can pass entire list of keywords
keywords = [keyword_groups_dict['Weight Loss Supplements'][0]]

print(f"Keywords: {keywords}")

# Initialize a list to store SERP data
serp_data = []

# Iterate over keywords to fetch SERP data
for keyword in keywords:
    payload = json.dumps([{
        "language_code": "en",
        "location_code": 2840,  # Location code for the United States
        "keyword": keyword
    }])
    # requests builds the base64-encoded Basic auth header from `auth`,
    # so only the content type needs to be set manually here
    headers = {"Content-Type": "application/json"}
    response = requests.post(url, auth=(dataforseo_username, dataforseo_password), headers=headers, data=payload)
    result = response.json()

    # Process the response
    print(result['status_code'])
    if result['status_code'] == 20000:
        task_data = result['tasks'][0]['result'][0]['items']
        for item in task_data:
            if item['type'] in ['organic', 'paid']:
                serp_data.append({
                    "keyword": keyword,
                    "position": item.get('rank_absolute', None),
                    "type": item['type'],
                    "url": item.get('url', None),
                    "title": item.get('title', None)
                })

Keywords: ['supplements for weight loss']
20000


Let's look at the SERP data for the keyword "supplements for weight loss".

In [254]:
# At this point, serp_data contains the collected SERP information
# for further analysis or storage, let's convert this list of dictionaries
# into a pandas DataFrame

serp_df = pd.DataFrame(serp_data)
print(serp_df.shape)
serp_df.head()

(99, 5)


Unnamed: 0,keyword,position,type,url,title
0,supplements for weight loss,1,organic,https://www.webmd.com/vitamins-and-supplements...,11 Supplements and Herbs for Weight Loss Expla...
1,supplements for weight loss,3,organic,https://www.forbes.com/health/weight-loss/top-...,A Guide To The Top Weight Loss Supplements In ...
2,supplements for weight loss,4,organic,https://www.mayoclinic.org/healthy-lifestyle/w...,Dietary supplements for weight loss
3,supplements for weight loss,5,organic,https://ods.od.nih.gov/factsheets/WeightLoss-H...,Dietary Supplements for Weight Loss
4,supplements for weight loss,6,organic,https://www.gnc.com/weight-management/calorie-...,Shop Fat Burner Supplements & Thermogenics


Now let's get the top 20 results. We will analyze these.

In [261]:
serp_subset = serp_df.head(20)
serp_subset

Unnamed: 0,keyword,position,type,url,title
0,supplements for weight loss,1,organic,https://www.webmd.com/vitamins-and-supplements...,11 Supplements and Herbs for Weight Loss Expla...
1,supplements for weight loss,3,organic,https://www.forbes.com/health/weight-loss/top-...,A Guide To The Top Weight Loss Supplements In ...
2,supplements for weight loss,4,organic,https://www.mayoclinic.org/healthy-lifestyle/w...,Dietary supplements for weight loss
3,supplements for weight loss,5,organic,https://ods.od.nih.gov/factsheets/WeightLoss-H...,Dietary Supplements for Weight Loss
4,supplements for weight loss,6,organic,https://www.gnc.com/weight-management/calorie-...,Shop Fat Burner Supplements & Thermogenics
5,supplements for weight loss,7,organic,https://www.healthline.com/nutrition/weight-lo...,Weight Loss Medications: Do They Work?
6,supplements for weight loss,9,organic,https://healthmatch.io/weight-management/suppl...,7 Supplements For Weight Loss (That Actually W...
7,supplements for weight loss,10,organic,https://ods.od.nih.gov/factsheets/WeightLoss-C...,Dietary Supplements for Weight Loss - Consumer
8,supplements for weight loss,11,organic,https://www.amazon.com/Best-Sellers-Fat-Burner...,Best Fat Burner Supplements
9,supplements for weight loss,12,organic,https://www.mayoclinic.org/healthy-lifestyle/w...,Prescription weight-loss drugs: Can they help ...


### Generative AI Use-case 3: Analyzing Themes for SEO Competitors 

Understanding the content themes of competitors ranking for similar keywords is important for strategizing content creation and improving search visibility. Generative AI, particularly models like GPT, can provide useful insights by analyzing the themes present in the titles and URLs of top-ranking competitor pages.

Let's use the "url" and "title" fields returned by the SERP API to analyze themes. The results would likely improve if we scraped the page behind each SERP entry and gave its content to the model.

In [297]:
def analyze_theme(title, url):
    # Trailing spaces keep the concatenated sentences from running together
    prompt = (
        f"Given the title '{title}' from the URL '{url}', "
        "analyze and summarize the main theme or topic of the content. Provide a concise theme or topic summary. "
        "Don't prepend the theme string itself with anything, restrict your answers to short terms for each theme."
    )
    response = get_completion_from_messages([{"role": "user", "content": prompt}])
    # Assuming the model directly responds with a short theme string
    return response


In [285]:
# Create a copy of the first 5 rows from serp_df for demonstration
serp_df_temp = serp_df.iloc[:5].copy()

# Applying the theme analyzing function
serp_df_temp['Theme'] = serp_df_temp.apply(lambda row: analyze_theme(row['title'], row['url']), axis=1)





In [287]:
serp_df_temp

Unnamed: 0,keyword,position,type,url,title,Theme
0,supplements for weight loss,1,organic,https://www.webmd.com/vitamins-and-supplements...,11 Supplements and Herbs for Weight Loss Expla...,Supplements and herbs for weight loss
1,supplements for weight loss,3,organic,https://www.forbes.com/health/weight-loss/top-...,A Guide To The Top Weight Loss Supplements In ...,Weight Loss Supplements
2,supplements for weight loss,4,organic,https://www.mayoclinic.org/healthy-lifestyle/w...,Dietary supplements for weight loss,Dietary supplements for weight loss
3,supplements for weight loss,5,organic,https://ods.od.nih.gov/factsheets/WeightLoss-H...,Dietary Supplements for Weight Loss,Dietary Supplements for Weight Loss
4,supplements for weight loss,6,organic,https://www.gnc.com/weight-management/calorie-...,Shop Fat Burner Supplements & Thermogenics,Weight management


**Ideas for Improvement**:

* Scrape the page behind each of these URLs to get more information, then feed it to GPT for better theme analysis/categorization.
* **SEO Competitor vs Industry Competitor**: Use the model to extract the industry competitors first, then perform theme analysis/categorization.
* Specify a list of themes, and have the model categorize these SERP results based on this list.
* Restrict the analysis to industry competitors of Gundry MD.
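The idea of categorizing against a fixed list of themes could be sketched as follows. This is an illustration under stated assumptions, not the notebook's method: the entries in `THEMES` are hypothetical placeholders, and `build_theme_prompt` and `match_theme` are hypothetical helper names. The prompt would be sent through the same `get_completion_from_messages` helper used above; `match_theme` then maps the raw reply back onto the list so that free-form answers do not leak into the "Theme" column.

```python
# Hypothetical theme list; a real one should come from SEO stakeholders
THEMES = [
    "Informational guide",
    "Product listing",
    "Medical reference",
    "News or review",
    "Other",
]


def build_theme_prompt(title, url, themes=THEMES):
    """Build a prompt that asks the model to pick exactly one listed theme."""
    options = ", ".join(themes)
    return (
        f"Given the title '{title}' from the URL '{url}', "
        f"choose the single best-fitting theme from this list: {options}. "
        "Answer with the theme name only."
    )


def match_theme(raw_output, themes=THEMES, fallback="Other"):
    """Map the model's reply onto the allowed list, case-insensitively.

    Anything that is not an exact match (after trimming whitespace, quotes,
    and a trailing period) falls back to `fallback`.
    """
    cleaned = raw_output.strip().strip(".'\"").lower()
    for theme in themes:
        if theme.lower() == cleaned:
            return theme
    return fallback
```

Constraining the answer space this way makes the results easier to aggregate, since near-duplicates such as "Dietary supplements for weight loss" and "Dietary Supplements for Weight Loss" in the table above would collapse into one category.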

### Generative AI Use-case 4: Extract Industry Competitors from SERPs


Let's go through our SERP results, and extract the ones that are direct industry competitors to Gundry MD. We will add a new column "Industry Competitor" which specifies whether or not the website in each corresponding SERP entry (row) is a direct industry competitor to Gundry MD.

In [294]:
def extract_competitor(title, url):
    # Trailing spaces keep the concatenated sentences from running together
    prompt = (
        f"Given the title '{title}' from the URL '{url}', "
        "check if this represents a company, and if it does, check if it is a direct competitor to Gundry MD. "
        "Don't prepend the answer string itself with anything, restrict your answer to Yes or No only. "
        "Yes, if it is a direct industry competitor, No if not a competitor."
    )
    response = get_completion_from_messages([{"role": "user", "content": prompt}])
    # Assuming the model directly responds with "Yes" or "No"
    return response


In [298]:
# Create a copy of the first 5 rows from serp_df for demonstration
serp_df_temp = serp_df.iloc[:5].copy()

# Applying the competitor extraction function
serp_df_temp['Industry Competitor'] = serp_df_temp.apply(lambda row: extract_competitor(row['title'], row['url']), axis=1)

In [299]:
serp_df_temp

Unnamed: 0,keyword,position,type,url,title,Industry Competitor
0,supplements for weight loss,1,organic,https://www.webmd.com/vitamins-and-supplements...,11 Supplements and Herbs for Weight Loss Expla...,No
1,supplements for weight loss,3,organic,https://www.forbes.com/health/weight-loss/top-...,A Guide To The Top Weight Loss Supplements In ...,No
2,supplements for weight loss,4,organic,https://www.mayoclinic.org/healthy-lifestyle/w...,Dietary supplements for weight loss,No
3,supplements for weight loss,5,organic,https://ods.od.nih.gov/factsheets/WeightLoss-H...,Dietary Supplements for Weight Loss,No
4,supplements for weight loss,6,organic,https://www.gnc.com/weight-management/calorie-...,Shop Fat Burner Supplements & Thermogenics,No


**Ideas for Improvement**:

* Use GPT-4.
* Scrape the website and use the HTML content for better results.
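The `extract_competitor` function assumes the model replies with exactly "Yes" or "No". A small check on that assumption could look like the sketch below. The helper name `parse_yes_no` is hypothetical; it converts the reply to a boolean and returns `None` for anything unexpected, so that malformed or hallucinated answers can be flagged for review instead of being stored as-is.

```python
def parse_yes_no(raw_output):
    """Convert a model reply to True/False, or None if it is not a clean Yes/No."""
    if not isinstance(raw_output, str):
        return None
    # Tolerate surrounding whitespace, quotes, and trailing punctuation
    cleaned = raw_output.strip().strip(".!'\"").lower()
    if cleaned == "yes":
        return True
    if cleaned == "no":
        return False
    return None
```

Applied to the "Industry Competitor" column, rows where the result is `None` are the ones worth re-prompting or inspecting by hand.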

## Microsite Analysis

### Generative AI Use-case 5: Analyzing Microsites for Competitor Analysis




In [None]:
microsites = [
    "https://gundrymd.com/health-goal/weight/",
    "https://gundrymd.com/health-goal/energy/",
    "https://gundrymd.com/health-goal/digestion-and-gut-health/",
    "https://gundrymd.com/health-goal/immune-support/",
    "https://gundrymd.com/health-goal/heart-health/",
    "https://gundrymd.com/health-goal/mens-health/",
    "https://gundrymd.com/health-goal/womens-health/",
    "https://gundrymd.com/health-goal/muscles-joints/",
    "https://gundrymd.com/health-goal/brain-vision/",
    "https://gundrymd.com/health-goal/polyphenol-support/",
    "https://gundrymd.com/health-goal/lectin-free-diet-food/",
    "https://gundrymd.com/health-goal/hair-skin-nails/",
    "https://gundrymd.com/shop/supplements/",
    "https://gundrymd.com/shop/lectin-free-food/",
    "https://gundrymd.com/shop/skincare/",
    "https://gundrymd.com/product-category/accessories/",
    "https://gundrymd.com/p/magazine/",
    "https://gundrymd.com/ingredients/antioxidants/",
    "https://gundrymd.com/ingredients/coq10/",
    "https://gundrymd.com/ingredients/glucosamine/",
    "https://gundrymd.com/ingredients/greens-algae/",
    "https://gundrymd.com/ingredients/mct/"
]


I didn't get to complete this one, but the idea is the same as above: get the keywords associated with these microsites, then get SERP results for those keywords and analyze them (sentiment analysis, categorization, abstract themes, etc.).
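As one hypothetical starting point for this unfinished step, a rough seed phrase can be derived from the last segment of each microsite URL. This is only an assumption about how seeds might be obtained, not what the notebook did: `slug_to_seed` is an illustrative name, and real keyword data would still need to come from an SEO platform.

```python
from urllib.parse import urlparse


def slug_to_seed(url):
    """Turn the last path segment of a URL into a space-separated phrase.

    Returns an empty string when the URL has no path segments.
    """
    path = urlparse(url).path.strip("/")
    if not path:
        return ""
    last_segment = path.split("/")[-1]
    return last_segment.replace("-", " ")
```

For example, `slug_to_seed("https://gundrymd.com/health-goal/digestion-and-gut-health/")` gives `"digestion and gut health"`, which could be used as a seed when looking up related keywords.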

## Affiliate Marketing
***


### Generative AI Use-case 6: Content Creation and Optimization


* Use GPT to generate SEO-optimized content for blogs, FAQs, and landing pages that target high-volume keywords.
* Ensure the content matches user intent and includes calls-to-action (can use GPT to validate with a special prompt).
* Feed the model past articles/content, and tell it to mimic their writing style when generating new content.

## Other Ideas
***



### Generative AI Use-case 7: Analyze YouTube Videos

Videos are often embedded into websites. Generative AI can use the visual content and script of the video (which can be obtained by transcribing voice to text) to categorize, perform sentiment analysis, and identify attributes.

### Generative AI Use-case 8: Analyze Website Carousel

Carousels are often found on websites. Generative AI can use the visual content and text of the carousel to categorize, perform sentiment analysis, and identify attributes.

### Generative AI Use-case 9: Assist in A/B Testing

A/B testing, also known as split testing, is a method where two versions of a web page or content piece (A and B) are compared to determine which one performs better in terms of a predefined metric, such as click-through rate, conversion rate, or engagement levels.

Generative AI can:

1. **Personalization**: By analyzing data on user preferences and behaviors, generative AI can create personalized content variations for A/B testing. This means developing content that resonates more closely with different segments of Gundry MD’s audience, potentially leading to higher engagement and conversion rates.
2. **Predictive Analysis**: Generative AI can predict the outcome of A/B tests by analyzing past performance data and identifying patterns. This predictive capability can help Golden Hippo make informed decisions about which variations to test and prioritize, speeding up the optimization process and reducing the resources spent on less promising options.
3. **Automated Testing and Optimization**: Generative AI can automate the process of setting up, running, and analyzing A/B tests. This includes selecting the most relevant KPIs (Key Performance Indicators) for each test, distributing traffic among different test variations, and providing insights on performance.
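Whichever variations are generated, the comparison between A and B still comes down to a statistical test on the chosen metric. The sketch below shows one common option, a pooled two-proportion z-test on conversion counts, using only the standard library. It is classical statistics rather than generative AI, it is not taken from the notebook, and the function name and inputs are illustrative.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    conv_a / conv_b: number of conversions; n_a / n_b: number of visitors.
    Returns (z, p_value), where p_value is two-sided.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("Standard error is zero; rates are all 0% or all 100%.")
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A call such as `two_proportion_z_test(100, 1000, 150, 1000)` returns a positive `z` when B converts better than A, together with the p-value used to decide whether the difference is likely to be more than noise.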

### Generative AI Use-case 10: Analyze Images on Websites

Incorporating Generative AI for analyzing images on websites can significantly enhance SEO strategies by automating the optimization of images for search engines. By categorizing images and identifying their content and sentiment, AI can generate descriptive, keyword-rich alt texts and file names, making the images more discoverable to search engines. This process not only helps in accurately indexing the images but also improves the overall user experience by ensuring relevant images appear in search results, thereby increasing website visibility and traffic.
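Before asking a model to write alt text, it helps to know which images actually lack it. The following is a minimal sketch, assuming the page HTML is already available as a string; `MissingAltFinder` and `images_missing_alt` are hypothetical names, and only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every <img> whose alt attribute is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src") or "")


def images_missing_alt(html):
    """Return the src values of images without usable alt text, in page order."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

The resulting list of image sources is the set of candidates to send to an image-capable model for alt-text suggestions. Note that an empty `alt` is sometimes intentional for purely decorative images, so the output is best treated as a review list rather than a list of errors.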

## Final Thoughts + HCI

I can't think of more use cases for generative AI + SEO right now, but that is because I am not an SEO expert. By brainstorming with SEO experts, I can identify the needs of the employees and customers at Golden Hippo and develop custom generative AI solutions for each of them.

I took a course titled "Human-Computer Interaction" (HCI) at Georgia Tech, where I learned about the **full design-lifecycle**, which essentially can be boiled down into these steps:

1. **User Research and Needfinding**

    a. Methods
   
    b. Data Inventory/User Types
   
2. **Design Alternatives**

    a. Brainstorming (individual and group)
   
    b. Personas
   
    c. User Profiles
   
    d. Timelines
   
    e. Scenarios/Storyboards
   
    f. User Modeling
   
3. **Prototyping**

    a. Low fidelity
   
    b. High fidelity
4. **User Evaluation**

    a. Qualitative evaluation

    b. Empirical evaluation
   
    c. Predictive evaluation

Before introducing AI technologies into Golden Hippo’s workflow, it is important to understand the specific needs, challenges, and opportunities within the company through user research. Only then can we begin brainstorming design alternatives, prototyping, and evaluating our prototypes. By following the design lifecycle above, we can ensure that the AI solutions we build are closely aligned with Golden Hippo’s operational goals and user needs, leading to more effective and efficient marketing strategies, improved customer experiences, and better business outcomes through targeted innovation and optimization.

