# Blog

Active contributors: Saksham, Ravi

The blog is an AI-generated, SEO-optimised content system for 360 Ghar. Perplexity powers article generation, SEO fields (meta title, meta description, focus keyword, canonical URL, OG image, reading time, word count) are auto-computed from the post body, and a daily APScheduler job discovers fresh Gurugram real estate news, writes full HTML articles, and publishes them with categories and tags.

## Directory layout

```
app/api/api_v1/endpoints/
└── blog.py                          # blog post, category, tag CRUD + public read endpoints
app/services/
├── blog.py                          # blog post CRUD, SEO helpers, scheduled publish
├── blog_auto_publish.py             # DailyPerplexityBlogPublisher: discover + generate + publish
├── blog_auto_publish_scheduler.py   # registers daily cron on shared scheduler
└── blog_service/
    └── generator.py                 # _perplexity_generate: Perplexity API call for topic → HTML
app/models/
└── blogs.py                         # BlogPost, BlogCategory, BlogTag, BlogPostCategory, BlogPostTag
app/schemas/
└── blog.py                          # BlogPostCreate, BlogPostUpdate, BlogSource, BlogSEOMetadata
```

## Key abstractions

| Abstraction | File | Role |
|---|---|---|
| `create_blog_post` | `app/services/blog.py` | Persists post, auto-computes SEO fields, syncs `active` with `status` |
| `_auto_meta_title` | `app/services/blog.py` | Truncates title to 57 chars with ellipsis at word boundary |
| `_auto_meta_description` | `app/services/blog.py` | Builds 157-char description from excerpt or stripped body |
| `_compute_word_count` / `_compute_reading_time` | `app/services/blog.py` | Word count (HTML stripped) and 200 wpm reading time |
| `_serialize_sources` / `_serialize_seo_metadata` | `app/services/blog.py` | Normalises source/SEO objects to JSONB-storable dicts |
| `DailyPerplexityBlogPublisher` | `app/services/blog_auto_publish.py` | Discovers same-day news, generates HTML, dedupes, publishes |
| `_perplexity_generate` | `app/services/blog_service/generator.py` | Calls Perplexity chat completions API with SEO copywriter system prompt |
| `publish_scheduled_posts` | `app/services/blog.py` | Publishes posts whose `scheduled_at` has passed |
| `start_auto_blog_publish_scheduler` | `app/services/blog_auto_publish_scheduler.py` | Registers daily cron job on shared scheduler |
| `BLOG_CATEGORIES` | `app/services/blog_service/generator.py` | 13 content category prompts for the AI copywriter |

## How it works

Manual blog CRUD lives in `app/services/blog.py`. On create or update, SEO fields are auto-computed when not explicitly provided: `_auto_meta_title` truncates the title to 57 chars, `_auto_meta_description` builds a 157-char summary from the excerpt or stripped body, `_compute_word_count` strips HTML tags and counts words, and `_compute_reading_time` divides by 200 wpm (minimum 1). Sources are serialised to a JSONB list of `{url, ...}` dicts and `seo_metadata` to a JSONB dict. The legacy `active` boolean is kept in sync with `status == published`. The `BlogPostStatus` enum (`draft`, `published`, `archived`) drives visibility.

```mermaid
graph TD
    Sched[AsyncIOScheduler AUTO_BLOG_CRON] -->|daily Asia/Kolkata| Job[_job_wrapper]
    Job --> Sched2[publish_scheduled_posts]
    Sched2 --> Pub1[posts with scheduled_at <= now]
    Job --> Pub2[DailyPerplexityBlogPublisher.publish_daily_posts]
    Pub2 --> Disc[discover same-day Gurugram real estate news]
    Disc -->|Perplexity web-grounded| Topics[structured topic list]
    Topics --> Dedup[exclude recent published slugs]
    Dedup --> Gen[_perplexity_generate per topic]
    Gen -->|system: SEO copywriter| API[Perplexity chat/completions]
    API -->|JSON title + content_html| Post[BlogPostCreate]
    Post --> Persist[create_blog_post]
    Persist --> SEO[auto meta_title, meta_description, reading_time, word_count]
    Persist --> Cats[DEFAULT_CATEGORIES, DEFAULT_TAGS attached]
    Persist --> DB[(BlogPost, BlogPostCategory, BlogPostTag)]
    Client -->|GET /blog| EP[app/api/.../blog.py]
    EP --> List[keyset paginated published posts]
```

Auto-publish is the more interesting flow.
`start_auto_blog_publish_scheduler` registers a single cron job (configurable via `AUTO_BLOG_CRON`, default daily, timezone `AUTO_BLOG_TIMEZONE` defaulting to Asia/Kolkata) on the shared `AsyncIOScheduler` if `AUTO_BLOG_ENABLED` is true. The job wrapper first calls `publish_scheduled_posts` to flip any posts with `scheduled_at <= now` to `published`, then runs `DailyPerplexityBlogPublisher.publish_daily_posts`.

The publisher uses two Perplexity calls. The discovery call uses `DISCOVERY_SYSTEM_PROMPT` (an "automated news desk" persona that returns only same-day real estate and Gurugram stories, excluding recent overlaps and social posts) to surface topics. The generation call uses `GENERATION_SYSTEM_PROMPT` (a "factual real-estate blog writer" persona that produces H2/H3-structured HTML with key takeaways, Gurgaon-specific context, and 360 Ghar value props woven in) to write each article. `BLOCKED_SOURCE_DOMAINS` filters out social media sources; `STOP_WORDS` and `RECENT_POST_LOOKBACK_DAYS` (7) drive deduplication. Each generated post is created via `create_blog_post` with `DEFAULT_CATEGORIES` (`Real Estate`, `Gurugram`, `News`) and `DEFAULT_TAGS`, which attaches `BlogCategory` and `BlogTag` rows through the `BlogPostCategory` and `BlogPostTag` join tables. In serverless mode (`SERVERLESS_ENABLED=True`) the scheduler is skipped and the job must be moved to Railway cron.

## Integration points

- **Scheduler**: the cron job registers on the shared `AsyncIOScheduler` from `app/infrastructure/scheduler.py` (see [infrastructure](systems--infrastructure.md)).
- **HTTP client**: `_perplexity_generate` uses `get_blog_client()` (120s default timeout) from `app/core/http.py`.
- **Cache**: blog list responses are decorated with `@cached` from the [cache subsystem](systems--cache-subsystem.md).
- **DB resilience**: list queries go through `execute_with_transient_retry`, so transient DB errors are retried.
- **Serverless**: in serverless mode the scheduler is skipped; auto-publish must run as a Railway cron job.

## Entry points for modification

SEO field computation lives in the `_auto_*` helpers in `app/services/blog.py`; adjust there to change truncation lengths or reading speed. Auto-publish discovery and generation prompts live in `app/services/blog_auto_publish.py` (`DISCOVERY_SYSTEM_PROMPT`, `GENERATION_SYSTEM_PROMPT`); category/tag defaults live in the same file. New blog categories or tags must be created through `BlogCategory` / `BlogTag` rows and attached via the join tables. The Perplexity API integration in `blog_service/generator.py` requires `PERPLEXITY_API_KEY` in settings.

## Key source files

| File | Purpose |
|---|---|
| `app/api/api_v1/endpoints/blog.py` | Blog REST endpoints (21.5 KB) |
| `app/services/blog.py` | Blog service + SEO helpers (930 lines) |
| `app/services/blog_auto_publish.py` | DailyPerplexityBlogPublisher (638 lines) |
| `app/services/blog_auto_publish_scheduler.py` | Cron registration |
| `app/services/blog_service/generator.py` | Perplexity generation (346 lines) |
| `app/models/blogs.py` | BlogPost, BlogCategory, BlogTag, join tables |
| `app/schemas/blog.py` | BlogPostCreate, BlogSource, BlogSEOMetadata |
| `app/models/enums.py` | `BlogPostStatus` |
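
## Appendix: SEO helper sketch

For anyone adjusting truncation lengths or reading speed, the SEO rules described under "How it works" can be sketched roughly as below. This is a minimal illustration, not the code in `app/services/blog.py`: every function name here (`strip_html`, `truncate_at_word`, `meta_title`, `meta_description`, `word_count`, `reading_time_minutes`) is hypothetical. It also assumes that the ellipsis is three ASCII dots appended after the 57- or 157-character cut, and that reading time rounds up; the real helpers may differ on both points.

```python
import math
import re
from html import unescape
from typing import Optional

META_TITLE_CUT = 57         # title cut length, as stated in the doc
META_DESCRIPTION_CUT = 157  # description cut length, as stated in the doc
WORDS_PER_MINUTE = 200      # reading speed, as stated in the doc
ELLIPSIS = "..."            # assumption: three ASCII dots added after the cut

_TAG_RE = re.compile(r"<[^>]+>")


def strip_html(html: str) -> str:
    """Remove tags, decode entities, and collapse runs of whitespace."""
    text = unescape(_TAG_RE.sub(" ", html))
    return " ".join(text.split())


def truncate_at_word(text: str, limit: int) -> str:
    """Return text unchanged if it fits; otherwise cut at a word boundary."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # If the cut lands inside a word, back up to the previous space.
    if text[limit] != " " and " " in cut:
        cut = cut.rsplit(" ", 1)[0]
    return cut.rstrip(" ,;:-") + ELLIPSIS


def meta_title(title: str) -> str:
    return truncate_at_word(title, META_TITLE_CUT)


def meta_description(excerpt: Optional[str], body_html: str) -> str:
    """Prefer the excerpt; fall back to the body with HTML stripped."""
    source = excerpt.strip() if excerpt and excerpt.strip() else strip_html(body_html)
    return truncate_at_word(source, META_DESCRIPTION_CUT)


def word_count(body_html: str) -> int:
    return len(strip_html(body_html).split())


def reading_time_minutes(words: int) -> int:
    # Assumption: round up; the doc only states 200 wpm with a minimum of 1.
    return max(1, math.ceil(words / WORDS_PER_MINUTE))


if __name__ == "__main__":
    body = "<h2>Gurugram update</h2><p>Two new sectors were notified this week.</p>"
    print(meta_title("Gurugram update"))
    print(meta_description(None, body))
    print(word_count(body), reading_time_minutes(word_count(body)))
```

Keeping the three limits as module-level constants means a change to truncation length or reading speed touches one line each, which mirrors the advice under "Entry points for modification".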