# Opportunity Watchlist

This notebook builds a prioritized queue of pages to optimize for search, using the `seo_page_daily` table.

### Quick start (beginner-friendly)
1. Run the **Setup (run once)** cell.
2. Run the remaining cells from top to bottom.
3. Review the watchlist table and chart for pages to prioritize.

### Links
- GitHub repo: [github.com/aidanm-lla/lla-data](https://github.com/aidanm-lla/lla-data)
- Open this notebook in Colab: [Opportunity Watchlist](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/seo/04_opportunity_watchlist.ipynb)

### Other notebooks
- [Search Contribution Overview](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/seo/01_search_contribution_overview.ipynb)
- [Top Pages Search Performance](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/seo/02_top_pages_search_performance.ipynb)
- [Query Drivers by Page](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/seo/03_query_drivers_by_page.ipynb)
- [Top Pages (Last 7 Days)](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/top_pages_last_7_days.ipynb)
- [Traffic Source Quality](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/traffic_sources.ipynb)
- [Time Patterns for Crisis-Related Pages](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/time_patterns.ipynb)
- [Crisis Support Funnel](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/crisis_funnel.ipynb)
- [Analysis Template](https://colab.research.google.com/github/aidanm-lla/lla-data/blob/main/notebooks/templates/analysis_template.ipynb)

By default, the watchlist keeps pages with:
- high impressions (at least 100 in the query window)
- low CTR (below 5%)
- mid rankings (impression-weighted average position from 6 to 20, where small improvements can produce clicks)
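
The default criteria above can be sketched on a small hand-made table. This is a minimal illustration, not the notebook's own code path: the function name `filter_watchlist`, its parameter names, and the sample rows are hypothetical, while the thresholds mirror the filter used later in the notebook (CTR below 5%, average position from 6 to 20).

```python
import pandas as pd


def filter_watchlist(df, max_ctr=0.05, min_position=6, max_position=20):
    """Keep low-CTR pages ranked in the mid positions, busiest pages first.

    Hypothetical helper for illustration; defaults mirror the notebook's filter.
    """
    mask = (df["ctr"] < max_ctr) & df["avg_position"].between(min_position, max_position)
    return df[mask].sort_values(["impressions", "ctr"], ascending=[False, True])


# Made-up sample rows, one per page.
sample = pd.DataFrame(
    {
        "page_path": ["/a", "/b", "/c", "/d"],
        "impressions": [5000, 1200, 800, 300],
        "ctr": [0.02, 0.08, 0.03, 0.01],
        "avg_position": [8.5, 7.0, 25.0, 12.0],
    }
)

watch = filter_watchlist(sample)
print(watch["page_path"].tolist())  # ['/a', '/d']
```

Here `/b` drops out because its CTR is already above 5%, and `/c` drops out because it ranks below position 20.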

In [None]:
#@title Setup (run once)
import sys
import os

if "google.colab" in sys.modules:
    from google.colab import auth
    auth.authenticate_user()
    if not os.path.exists("lla-data"):
        !git clone -q https://github.com/aidanm-lla/lla-data.git
    repo = os.path.abspath("lla-data")
    if repo not in sys.path:
        sys.path.insert(0, repo)
    !pip install -q db-dtypes google-cloud-bigquery kaleido plotly
else:
    # Local run: make the parent directories importable so repo modules resolve.
    for p in ("..", "../.."):
        ap = os.path.abspath(p)
        if ap not in sys.path:
            sys.path.insert(0, ap)

import plotly.express as px

import lifeline_theme
from lla_data import config
from lla_data.bq import build_date_params, default_query_window, get_client, run_query

lifeline_theme.inject_fonts()

client = get_client()
window = default_query_window(config.DEFAULT_DAYS_BACK)

In [None]:
query = f"""
SELECT
  page_path,
  SUM(gsc_clicks) AS clicks,
  SUM(gsc_impressions) AS impressions,
  SAFE_DIVIDE(SUM(gsc_clicks), SUM(gsc_impressions)) AS ctr,
  SAFE_DIVIDE(SUM(gsc_avg_position * gsc_impressions), SUM(gsc_impressions)) AS avg_position,
  SUM(organic_sessions) AS organic_sessions
FROM `{config.PROJECT_ID}.{config.SEARCHCONSOLE_DATASET}.seo_page_daily`
WHERE report_date BETWEEN DATE(@start_date) AND DATE(@end_date)
GROUP BY page_path
HAVING impressions >= 100
ORDER BY impressions DESC
"""

df_watch = run_query(client, query, params=build_date_params(window))

# Default thresholds: CTR below 5% and average position from 6 to 20 (inclusive).
df_watchlist = df_watch[
    (df_watch["ctr"] < 0.05)
    & df_watch["avg_position"].between(6, 20)
].sort_values(["impressions", "ctr"], ascending=[False, True])

df_watchlist.head(30)
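
The query computes `avg_position` as an impression-weighted mean of the daily values rather than a simple mean, so days with more impressions count for more. A small sketch of the difference, using made-up daily figures and a hypothetical helper name `weighted_avg_position`:

```python
def weighted_avg_position(rows):
    """Impression-weighted mean position.

    rows: list of (avg_position, impressions) pairs, one per day.
    Returns None when there are no impressions, similar to SAFE_DIVIDE returning NULL.
    """
    total_impressions = sum(imp for _, imp in rows)
    if total_impressions == 0:
        return None
    return sum(pos * imp for pos, imp in rows) / total_impressions


# Made-up example: a busy day at position 4 and a quiet day at position 20.
daily = [(4.0, 900), (20.0, 100)]

weighted = weighted_avg_position(daily)
simple = sum(pos for pos, _ in daily) / len(daily)
print(weighted, simple)
```

In this example the weighted value is about 5.6, while the simple mean is 12.0. Under the default position range of 6 to 20, the simple mean would place the page on the watchlist and the weighted value would not, which is why the choice of average matters.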

In [None]:
fig = px.scatter(
    df_watchlist,
    x="avg_position",
    y="ctr",
    size="impressions",
    hover_name="page_path",
    template="lifeline",
    title="Opportunity Watchlist: Low CTR + Mid Position + High Impressions",
)
fig.update_yaxes(tickformat=".0%")
lifeline_theme.add_lifeline_logo(fig)
fig.show()