feat(reader): add Reader Simulation Agent module #135
shenminglinyi merged 2 commits into shenminglinyi:master from
Conversation
📝 Walkthrough
Adds a reader-simulation subsystem: Pydantic schemas and DTOs, an async service that queries chapters and an LLM, a DB migration and repository for persistence, new FastAPI routes to run/list simulations and churn alerts, and unit tests for schemas/DTOs.
Sequence Diagram
sequenceDiagram
    participant Client
    participant API as "API /reader"
    participant Service as "ReaderSimulationService"
    participant Repo as "ReaderSimulationRepository"
    participant DB as "Database"
    participant LLM as "LLM Client"
    Client->>API: POST /reader/novels/{id}/chapters/{n}/simulate
    API->>Service: simulate(novel_id, chapter_number)
    Service->>Repo: fetch chapter & context
    Repo->>DB: SELECT chapter / summaries
    DB-->>Repo: chapter data
    Repo-->>Service: context
    Service->>Service: build prompt (3 personas)
    Service->>LLM: generate(prompt)
    LLM-->>Service: raw JSON text
    Service->>Service: sanitize & parse JSON
    Service->>Service: validate schema -> DTOs
    Service-->>API: ChapterReaderReportDTO
    API->>Repo: save(report) (best-effort)
    Repo->>DB: INSERT reader_simulations
    DB-->>Repo: record_id
    API-->>Client: 200 { report / meta }
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 5
🧹 Nitpick comments (4)
infrastructure/persistence/database/reader_simulation_repository.py (2)
36-73: Consider `datetime.now(timezone.utc)` over deprecated `utcnow()`.
`datetime.utcnow()` is deprecated in Python 3.12+. While it still functions, consider using `datetime.now(timezone.utc)` for forward compatibility.
♻️ Suggested change
+from datetime import datetime, timezone
 ...
-            datetime.utcnow().isoformat(),
+            datetime.now(timezone.utc).isoformat(),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/persistence/database/reader_simulation_repository.py` around lines 36 - 73, The save method in reader_simulation_repository uses the deprecated datetime.utcnow(); replace that call with datetime.now(timezone.utc) to produce an aware UTC timestamp before inserting (update the import to include timezone if needed) so in function save you should change datetime.utcnow().isoformat() to datetime.now(timezone.utc).isoformat() and ensure the module imports timezone from datetime.
109-120: Consider returning only the latest record per chapter for churn alerts.
Currently this returns all historical records above the threshold, which could include multiple records for the same chapter. If the intent is to show "which chapters currently have high churn risk," consider filtering to latest-per-chapter similar to `list_by_novel()`.
♻️ Suggested query to return latest-per-chapter only
 def get_high_churn_chapters(
     self, novel_id: str, threshold: float = 60.0,
 ) -> List[Dict]:
     """获取劝退风险高的章节(用于告警)。"""
     db = self._get_db()
     cursor = db.execute(
-        """SELECT * FROM reader_simulations
-           WHERE novel_id = ? AND avg_churn_risk >= ?
-           ORDER BY avg_churn_risk DESC""",
-        (novel_id, threshold),
+        """SELECT rs.* FROM reader_simulations rs
+           INNER JOIN (
+               SELECT novel_id, chapter_number, MAX(created_at) as max_created
+               FROM reader_simulations
+               WHERE novel_id = ?
+               GROUP BY novel_id, chapter_number
+           ) latest ON rs.novel_id = latest.novel_id
+               AND rs.chapter_number = latest.chapter_number
+               AND rs.created_at = latest.max_created
+           WHERE rs.avg_churn_risk >= ?
+           ORDER BY rs.avg_churn_risk DESC""",
+        (novel_id, threshold),
     )
     return [dict(row) for row in cursor.fetchall()]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/persistence/database/reader_simulation_repository.py` around lines 109 - 120, The method get_high_churn_chapters currently returns all historical rows above the threshold; change it to return only the latest record per chapter (like list_by_novel does) by modifying the SQL to filter to the most recent row per chapter_id (e.g., using a subquery or join that selects MAX(created_at) per novel_id+chapter_id) while still applying novel_id and avg_churn_risk >= threshold and ordering by avg_churn_risk DESC; update the method get_high_churn_chapters accordingly so it returns one dict per chapter (latest) instead of multiple historical records.
application/reader/services/reader_simulation_service.py (1)
97-105: Log the exception instead of silently passing.
Static analysis correctly flags the bare `except: pass`. While fetching `prev_summary` is optional, logging at DEBUG level aids troubleshooting.
♻️ Suggested change
 if self._knowledge_repo and prev_chapter:
     try:
         knowledge = self._knowledge_repo.get_by_novel_id(novel_id)
         if knowledge:
             ch_sum = knowledge.get_chapter(chapter_number - 1)
             if ch_sum:
                 prev_summary = ch_sum.summary or ""
-    except Exception:
-        pass
+    except Exception as e:
+        logger.debug("Failed to fetch prev chapter summary: %s", e)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@application/reader/services/reader_simulation_service.py` around lines 97 - 105, The bare except is swallowing errors when fetching prev_summary; replace it with catching Exception as e and log the exception at DEBUG so failures are visible. Update the block around self._knowledge_repo.get_by_novel_id(novel_id) / knowledge.get_chapter(chapter_number - 1) to use "except Exception as e:" and call the existing logger (e.g., self._logger.debug(...) or module logger via logging.getLogger(__name__).debug(...)) including context like novel_id/chapter_number and the exception, while leaving prev_summary behavior unchanged if an error occurs.
interfaces/api/v1/reader/__init__.py (1)
31-35: Narrow the knowledge-repo fallback and log the downgrade path.
The broad catch at Lines 34-35 masks unexpected runtime failures and silently drops knowledge context.
💡 Suggested refactor
 from interfaces.api.dependencies import get_chapter_repository, get_llm_service
 try:
     from interfaces.api.dependencies import get_knowledge_repository
     knowledge_repo = get_knowledge_repository()
-except Exception:
+except ImportError:
     knowledge_repo = None
+except Exception as e:
+    logger.warning("知识库依赖初始化失败,降级为无知识库模式: %s", e)
+    knowledge_repo = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interfaces/api/v1/reader/__init__.py` around lines 31 - 35, The current broad except hides real errors when importing or calling get_knowledge_repository; narrow the fallback by catching only expected exceptions (e.g., ImportError, ModuleNotFoundError and the specific error(s) you expect from get_knowledge_repository such as AttributeError or a custom RepoNotAvailableError) and set knowledge_repo = None only in those cases, while re-raising any other Exception; also emit a log entry when falling back (include the exception message and context) so the downgrade path is visible—update the import/call around get_knowledge_repository and the knowledge_repo assignment accordingly and use the module logger (or logging.getLogger(__name__)) to record the fallback with exception details.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/persistence/database/reader_simulation_repository.py`:
- Around line 25-34: The ensure_table method currently calls
db.executescript(sql) but DatabaseConnection does not expose executescript;
modify ensure_table to call executescript on the underlying sqlite3.Connection
by using self._get_db().get_connection().executescript(sql) (locate the
ensure_table function and replace the db.executescript call with
get_connection().executescript). Keep the existing migration_path read and
encoding logic, and ensure you still call get_connection().executescript only
when migration_path.exists().
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 7801d62d-09ec-41d5-8c9e-9967113767ce
📒 Files selected for processing (11)
- application/reader/__init__.py
- application/reader/dtos/__init__.py
- application/reader/dtos/reader_feedback_dto.py
- application/reader/schema.py
- application/reader/services/__init__.py
- application/reader/services/reader_simulation_service.py
- infrastructure/persistence/database/migrations/add_reader_simulations.sql
- infrastructure/persistence/database/reader_simulation_repository.py
- interfaces/api/v1/reader/__init__.py
- interfaces/main.py
- tests/unit/application/services/test_reader_simulation_service.py
    if _repo is None:
        _repo = ReaderSimulationRepository()
        try:
            _repo.ensure_table()
        except Exception as e:
            logger.warning("读者模拟表初始化失败(首次使用时会自动重试): %s", e)
    return _repo
_get_repo() won’t actually retry table initialization after first failure.
At Line 48, _repo is set before ensure_table(). If Line 50 fails once, future calls skip initialization entirely, contradicting the retry message at Line 52.
💡 Suggested fix
def _get_repo() -> ReaderSimulationRepository:
global _repo
if _repo is None:
- _repo = ReaderSimulationRepository()
+ candidate = ReaderSimulationRepository()
try:
- _repo.ensure_table()
+ candidate.ensure_table()
except Exception as e:
logger.warning("读者模拟表初始化失败(首次使用时会自动重试): %s", e)
- return _repo
+ raise
+ _repo = candidate
+    return _repo
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    if _repo is None:
        candidate = ReaderSimulationRepository()
        try:
            candidate.ensure_table()
        except Exception as e:
            logger.warning("读者模拟表初始化失败(首次使用时会自动重试): %s", e)
            raise
        _repo = candidate
    return _repo
🧰 Tools
🪛 Ruff (0.15.10)
[warning] 51-51: Do not catch blind exception: Exception
(BLE001)
[warning] 52-52: String contains ambiguous ( (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?
(RUF001)
[warning] 52-52: String contains ambiguous ) (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?
(RUF001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interfaces/api/v1/reader/__init__.py` around lines 47 - 53, The current
_get_repo() sets the module-global _repo before calling
ReaderSimulationRepository.ensure_table(), so if ensure_table() raises once
subsequent calls will return the already-set _repo and never retry; to fix,
instantiate the repository into a local variable (e.g., repo =
ReaderSimulationRepository()), call repo.ensure_table() inside the try/except,
and only assign the module-global _repo = repo after ensure_table() succeeds (or
alternatively, on exception reset _repo to None) so future calls will retry
initialization; reference the symbols _get_repo, _repo, and
ReaderSimulationRepository.ensure_table when locating the change.
    try:
        repo = _get_repo()
        repo.save(
            novel_id=novel_id,
            chapter_number=chapter_number,
            overall_readability=report.overall_readability,
            chapter_hook_strength=report.chapter_hook_strength,
            pacing_verdict=report.pacing_verdict,
            avg_scores=report._compute_avg_scores(),
            feedbacks_json=json.dumps(
                [f.to_dict() for f in report.feedbacks],
                ensure_ascii=False,
            ),
        )
    except Exception as e:
        logger.warning("读者模拟结果持久化失败: %s", e)

    return {"success": True, "data": report.to_dict()}

except Exception as e:
    logger.error("读者模拟失败 novel=%s ch=%d: %s", novel_id, chapter_number, e)
    raise HTTPException(status_code=500, detail=f"读者模拟分析失败: {e}")
POST simulate reports success even when DB persistence fails.
At Lines 89-92, save failures are logged but ignored, so the API returns success with non-persisted data.
💡 Suggested fix
# 持久化
try:
repo = _get_repo()
repo.save(
novel_id=novel_id,
chapter_number=chapter_number,
overall_readability=report.overall_readability,
chapter_hook_strength=report.chapter_hook_strength,
pacing_verdict=report.pacing_verdict,
avg_scores=report._compute_avg_scores(),
feedbacks_json=json.dumps(
[f.to_dict() for f in report.feedbacks],
ensure_ascii=False,
),
)
except Exception as e:
- logger.warning("读者模拟结果持久化失败: %s", e)
+ logger.error("读者模拟结果持久化失败 novel=%s ch=%d: %s", novel_id, chapter_number, e)
+ raise HTTPException(status_code=500, detail="读者模拟完成,但保存失败,请稍后重试") from e
return {"success": True, "data": report.to_dict()}
+ except HTTPException:
+ raise
except Exception as e:
logger.error("读者模拟失败 novel=%s ch=%d: %s", novel_id, chapter_number, e)
        raise HTTPException(status_code=500, detail=f"读者模拟分析失败: {e}")
🧰 Tools
🪛 Ruff (0.15.10)
[warning] 89-89: Do not catch blind exception: Exception
(BLE001)
[warning] 94-94: Do not catch blind exception: Exception
(BLE001)
[warning] 96-96: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interfaces/api/v1/reader/__init__.py` around lines 75 - 97, The current code
swallows persistence errors in the inner try/except (repo = _get_repo();
repo.save(...)) by only logging via logger.warning, causing the API (return
{"success": True, "data": report.to_dict()}) to report success even when save
failed; change the inner except to propagate the failure (either re-raise the
exception or raise an HTTPException) so callers see the error: locate the
repo.save call and its surrounding try/except in
interfaces/api/v1/reader/__init__.py and replace the logger.warning branch with
code that raises the exception (or raises HTTPException with a 500 and message)
after logging, ensuring persistent save failures are not treated as successful
responses.
    except Exception as e:
        logger.error("读者模拟失败 novel=%s ch=%d: %s", novel_id, chapter_number, e)
        raise HTTPException(status_code=500, detail=f"读者模拟分析失败: {e}")
Avoid exposing internal exception text in HTTP 500 responses.
Returning str(e) / interpolated {e} at Lines 96/138/168/208 can leak backend internals. Use a stable client message and chain the original exception.
💡 Suggested fix pattern
- except Exception as e:
- logger.error("获取读者模拟记录失败: %s", e)
- raise HTTPException(status_code=500, detail=str(e))
+ except Exception as e:
+ logger.exception("获取读者模拟记录失败")
+    raise HTTPException(status_code=500, detail="服务器内部错误,请稍后重试") from e
Apply the same pattern to the other generic except Exception blocks in this file.
Also applies to: 136-138, 166-168, 206-208
🧰 Tools
🪛 Ruff (0.15.10)
[warning] 94-94: Do not catch blind exception: Exception
(BLE001)
[warning] 96-96: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interfaces/api/v1/reader/__init__.py` around lines 94 - 97, The handlers log
the full exception and then interpolate its text into the HTTP 500 response
(e.g., logger.error(..., e) and raise HTTPException(... detail=f"...{e}")),
which leaks internals; instead, keep detailed error information in server logs
by using logger.exception(...) or logger.error(..., exc_info=True) and raise
HTTPException with a stable client-facing message (e.g., detail="读者模拟分析失败")
while chaining the original exception (raise HTTPException(status_code=500,
detail="读者模拟分析失败") from e). Apply this change to all generic except Exception
blocks that use the variable e in this file (the blocks around the logger.error
/ HTTPException uses).
        feedbacks_raw = r.get("feedbacks_json", "[]")
        try:
            feedbacks = json.loads(feedbacks_raw)
        except (json.JSONDecodeError, TypeError):
            feedbacks = []
        # 提取痛点汇总
        all_pain_points = []
        for fb in feedbacks:
            all_pain_points.extend(fb.get("pain_points", []))
Guard feedbacks_json shape before iterating pain points.
At Line 191, fb.get(...) assumes every item is a dict. Malformed-but-valid JSON can still trigger a 500 here.
💡 Suggested fix
for r in records:
feedbacks_raw = r.get("feedbacks_json", "[]")
try:
- feedbacks = json.loads(feedbacks_raw)
+ parsed = json.loads(feedbacks_raw)
except (json.JSONDecodeError, TypeError):
- feedbacks = []
+ parsed = []
+ feedbacks = [fb for fb in parsed if isinstance(fb, dict)] if isinstance(parsed, list) else []
# 提取痛点汇总
all_pain_points = []
for fb in feedbacks:
- all_pain_points.extend(fb.get("pain_points", []))
+ pain_points = fb.get("pain_points", [])
+ if isinstance(pain_points, list):
+            all_pain_points.extend(pain_points)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interfaces/api/v1/reader/__init__.py` around lines 183 - 191, The code
assumes each item in feedbacks is a dict and calls fb.get(...), which can raise
when malformed JSON yields non-dict entries; before iterating, validate that
feedbacks is a list and inside the loop check each fb with isinstance(fb, dict)
(or skip/convert non-dicts to {}) and ensure fb.get("pain_points", []) returns a
list before extending all_pain_points; update the block that assigns feedbacks
(from feedbacks_raw) and the loop over feedbacks to perform these type guards
using the variables feedbacks_raw, feedbacks, fb, and all_pain_points.
Thanks to @JamesGoslings for the contribution! The Reader Simulation Agent module is well designed and fully matches the project's four-layer DDD architecture:
The reader-retention engine is a core feature direction of the project, so this PR is highly valuable. However, it cannot be merged yet due to merge conflicts. Please:
We will keep following this PR and look forward to merging it. Thanks again!
Simulate three reader personas (hardcore fan, casual reader, nitpicker)
evaluating each chapter across four dimensions (see the score roll-up sketch after this list):
- Suspense retention (悬疑保持度)
- Thrill score (爽感评分)
- Churn risk (劝退风险)
- Emotional resonance (情感共鸣度)
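To make the scoring concrete, here is a minimal roll-up sketch; the dataclass is an illustrative stand-in for the PR's DTOs, while the field names, the 30.0 churn default, and the `_compute_avg_scores()` role follow the review discussion below.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DimensionScores:
    """Illustrative stand-in for ReaderDimensionScoresDTO."""
    suspense_retention: float = 50.0   # 悬疑保持度
    thrill_score: float = 50.0         # 爽感评分
    churn_risk: float = 30.0           # 劝退风险
    emotional_resonance: float = 50.0  # 情感共鸣度


def compute_avg_scores(per_persona: list[DimensionScores]) -> dict[str, float]:
    """Average each dimension across the reader personas."""
    dims = ("suspense_retention", "thrill_score", "churn_risk", "emotional_resonance")
    if not per_persona:
        return {f"avg_{d}": 0.0 for d in dims}
    return {f"avg_{d}": mean(getattr(p, d) for p in per_persona) for d in dims}


# avg_churn_risk is the value the churn-alerts endpoint compares to its threshold.
scores = [
    DimensionScores(churn_risk=20.0),  # hardcore fan
    DimensionScores(churn_risk=65.0),  # casual reader
    DimensionScores(churn_risk=70.0),  # nitpicker
]
print(compute_avg_scores(scores)["avg_churn_risk"])  # ~51.7
```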
New files:
- application/reader/ - Core service, schema, DTOs
- infrastructure/persistence/database/reader_simulation_repository.py
- infrastructure/persistence/database/migrations/add_reader_simulations.sql
- interfaces/api/v1/reader/ - REST API (4 endpoints)
- tests/unit/application/services/test_reader_simulation_service.py
API endpoints (example client call below):
- POST /api/v1/reader/novels/{id}/chapters/{n}/simulate
- GET /api/v1/reader/novels/{id}/chapters/{n}/simulation
- GET /api/v1/reader/novels/{id}/simulations
- GET /api/v1/reader/novels/{id}/churn-alerts
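A hypothetical client session against these endpoints; the base URL, novel id, and timeout are placeholders, while the `{"success": true, "data": ...}` envelope and the report fields match the router code quoted later in this review.

```python
import httpx

BASE = "http://localhost:8000/api/v1"  # placeholder host/port

# Run a simulation for chapter 3 (an LLM call, so allow a generous timeout).
resp = httpx.post(f"{BASE}/reader/novels/novel-123/chapters/3/simulate", timeout=120.0)
resp.raise_for_status()
report = resp.json()["data"]
print(report["overall_readability"], report["chapter_hook_strength"])

# Chapters whose simulations crossed the churn-risk threshold.
alerts = httpx.get(f"{BASE}/reader/novels/novel-123/churn-alerts").json()
```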
Force-pushed 4b20d3f to 539da91
Actionable comments posted: 4
♻️ Duplicate comments (1)
infrastructure/persistence/database/reader_simulation_repository.py (1)
25-34: ⚠️ Potential issue | 🔴 Critical
Call `executescript()` on the SQLite connection, not the wrapper.
At Line 34, `_get_db()` returns the `DatabaseConnection` wrapper from `infrastructure/persistence/database/connection.py`, and that type does not expose `executescript()`. This will raise `AttributeError` the first time table initialization runs. Use `db.get_connection().executescript(sql)` and commit on that connection instead.
Proposed fix
 def ensure_table(self) -> None:
     """确保表存在(首次调用时自动建表)。"""
     from pathlib import Path
     migration_path = (
         Path(__file__).parent / "migrations" / "add_reader_simulations.sql"
     )
     db = self._get_db()
     if migration_path.exists():
         sql = migration_path.read_text(encoding="utf-8")
-        db.executescript(sql)
+        conn = db.get_connection()
+        conn.executescript(sql)
+        conn.commit()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/persistence/database/reader_simulation_repository.py` around lines 25 - 34, ensure_table is calling executescript on the DatabaseConnection wrapper returned by _get_db(), which lacks executescript; instead call executescript on the underlying sqlite3 connection via db.get_connection().executescript(sql) and then commit that connection (e.g., db.get_connection().commit()) to persist the migration; update ensure_table to read the SQL, call executescript on db.get_connection(), and commit on that same connection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interfaces/api/v1/reader/__init__.py`:
- Around line 67-92: The handler currently persists and returns success even when ReaderSimulationService.simulate() returns a fallback empty DTO (e.g., _empty_report() for "chapter not found" or "empty content"), which creates fake history; change the flow after calling _get_service().simulate(...) to detect the fallback (by using an explicit flag or method on ChapterReaderReportDTO such as is_empty(), or by comparing to _empty_report()) and in that case do NOT call repo.save(...) and instead return an appropriate 4xx response (or raise an HTTP error) indicating the simulation failed; only perform repo.save(...) and return {"success": True, "data": report.to_dict()} when the report is a real analysis result.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 5e8c5c01-46be-441c-8b3a-f76de4fa30b4
📒 Files selected for processing (11)
- application/reader/__init__.py
- application/reader/dtos/__init__.py
- application/reader/dtos/reader_feedback_dto.py
- application/reader/schema.py
- application/reader/services/__init__.py
- application/reader/services/reader_simulation_service.py
- infrastructure/persistence/database/migrations/add_reader_simulations.sql
- infrastructure/persistence/database/reader_simulation_repository.py
- interfaces/api/v1/reader/__init__.py
- interfaces/main.py
- tests/unit/application/services/test_reader_simulation_service.py
✅ Files skipped from review due to trivial changes (1)
- infrastructure/persistence/database/migrations/add_reader_simulations.sql
🚧 Files skipped from review as they are similar to previous changes (1)
- interfaces/main.py
    @field_validator(
        "suspense_retention", "thrill_score", "churn_risk", "emotional_resonance",
        mode="before",
    )
    @classmethod
    def clamp_score(cls, value: object) -> float:
        """将评分归一到 0-100 范围。"""
        if value is None:
            return 50.0
        try:
            v = float(value)
        except (TypeError, ValueError):
            return 50.0
        return max(0.0, min(100.0, v))
Keep the churn_risk fallback aligned with its declared default.
This shared validator returns 50.0 for None/invalid input on every dimension, but churn_risk is declared with a default of 30.0. A malformed LLM value now gets interpreted as elevated churn, which can create false positives in the alert endpoint and persisted averages. Give churn_risk its own 30.0 fallback or split the validators by field.
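A minimal way to implement the suggested split, assuming Pydantic v2; the 50.0/30.0 fallbacks mirror the declared field defaults, and everything else is a sketch rather than the PR's exact schema.

```python
from pydantic import BaseModel, field_validator


def _clamp(value: object, fallback: float) -> float:
    """Clamp to 0-100; return the field's own default on None/invalid input."""
    if value is None:
        return fallback
    try:
        v = float(value)
    except (TypeError, ValueError):
        return fallback
    return max(0.0, min(100.0, v))


class ReaderDimensionScores(BaseModel):
    suspense_retention: float = 50.0
    thrill_score: float = 50.0
    churn_risk: float = 30.0
    emotional_resonance: float = 50.0

    @field_validator("suspense_retention", "thrill_score", "emotional_resonance", mode="before")
    @classmethod
    def _clamp_default_50(cls, value: object) -> float:
        return _clamp(value, 50.0)

    @field_validator("churn_risk", mode="before")
    @classmethod
    def _clamp_default_30(cls, value: object) -> float:
        return _clamp(value, 30.0)


assert ReaderDimensionScores(churn_risk="bogus").churn_risk == 30.0
```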
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@application/reader/schema.py` around lines 34 - 47, The shared validator
clamp_score (decorated on "suspense_retention", "thrill_score", "churn_risk",
"emotional_resonance") always returns 50.0 for None/invalid values, but
churn_risk should use its declared default 30.0; update the validation so
churn_risk gets 30.0 on None/invalid inputs—either by splitting validators
(create a separate clamp_churn_risk validator that returns 30.0 and apply it
only to "churn_risk") or by making clamp_score inspect the field name (via the
validator context or a separate decorated method) and return 30.0 when
validating "churn_risk" while keeping 50.0 for the others; adjust decorator
targets accordingly so other fields keep the 50.0 fallback.
        feedbacks = []
        for fb in payload.feedbacks:
            feedbacks.append(ReaderFeedbackDTO(
                persona=fb.persona,
                persona_label=PERSONA_LABELS.get(fb.persona, fb.persona),
                scores=ReaderDimensionScoresDTO(
                    suspense_retention=fb.scores.suspense_retention,
                    thrill_score=fb.scores.thrill_score,
                    churn_risk=fb.scores.churn_risk,
                    emotional_resonance=fb.scores.emotional_resonance,
                ),
                one_line_verdict=fb.one_line_verdict,
                highlights=list(fb.highlights),
                pain_points=list(fb.pain_points),
                suggestions=list(fb.suggestions),
            ))

        # 确保三个人设都有(缺失时填默认)
        existing_personas = {f.persona for f in feedbacks}
        for persona_key in ("hardcore", "casual", "nitpicker"):
            if persona_key not in existing_personas:
                feedbacks.append(ReaderFeedbackDTO(
                    persona=persona_key,
                    persona_label=PERSONA_LABELS.get(persona_key, persona_key),
                    scores=ReaderDimensionScoresDTO(),
                    one_line_verdict="(该读者视角的反馈未能生成)",
                ))
Normalize to exactly one feedback per canonical persona.
This loop preserves duplicate and unknown persona values from the LLM output, then only fills missing personas afterward. The result can contain 4+ feedback blocks, and ChapterReaderReportDTO._compute_avg_scores() will average over those extras, skewing both the API response and the persisted scores. Build the result from the three allowed keys (hardcore, casual, nitpicker) and ignore or overwrite anything outside that set.
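A self-contained sketch of that normalization, with the DTOs simplified to dicts; the first match per canonical persona wins and everything else is dropped.

```python
CANONICAL_PERSONAS = ("hardcore", "casual", "nitpicker")


def normalize_feedbacks(raw_feedbacks: list[dict]) -> list[dict]:
    """Return exactly one feedback per canonical persona, in canonical order."""
    by_persona: dict[str, dict] = {}
    for fb in raw_feedbacks:
        persona = fb.get("persona")
        if persona in CANONICAL_PERSONAS and persona not in by_persona:
            by_persona[persona] = fb  # duplicates and unknown personas are ignored
    return [
        by_persona.get(p, {"persona": p, "one_line_verdict": "(该读者视角的反馈未能生成)"})
        for p in CANONICAL_PERSONAS
    ]


# Two "hardcore" entries plus an unknown persona collapse into the three canonical slots.
sample = [{"persona": "hardcore"}, {"persona": "hardcore"}, {"persona": "mystery"}]
assert [f["persona"] for f in normalize_feedbacks(sample)] == list(CANONICAL_PERSONAS)
```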
🧰 Tools
🪛 Ruff (0.15.11)
[warning] 296-296: Comment contains ambiguous ( (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?
(RUF003)
[warning] 296-296: Comment contains ambiguous ) (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?
(RUF003)
[warning] 304-304: String contains ambiguous ( (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?
(RUF001)
[warning] 304-304: String contains ambiguous ) (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?
(RUF001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@application/reader/services/reader_simulation_service.py` around lines 279 -
305, The current loop copies all payload.feedbacks (including duplicates and
unknown personas) into feedbacks and then appends missing defaults, which can
produce >3 entries and skew ChapterReaderReportDTO._compute_avg_scores();
instead, iterate only the canonical persona keys
("hardcore","casual","nitpicker") and for each key find the first matching fb in
payload.feedbacks (ignore others), build a single ReaderFeedbackDTO using that
fb's fields (persona_label via PERSONA_LABELS, scores via
ReaderDimensionScoresDTO) or a default ReaderDimensionScoresDTO/one_line_verdict
when none found, thereby ensuring exactly one feedback per canonical persona and
preventing duplicates/unknown personas from affecting averages.
    def get_high_churn_chapters(
        self, novel_id: str, threshold: float = 60.0,
    ) -> List[Dict]:
        """获取劝退风险高的章节(用于告警)。"""
        db = self._get_db()
        cursor = db.execute(
            """SELECT * FROM reader_simulations
               WHERE novel_id = ? AND avg_churn_risk >= ?
               ORDER BY avg_churn_risk DESC""",
            (novel_id, threshold),
Only alert on the latest simulation per chapter.
This query returns every historical row whose avg_churn_risk crosses the threshold. After a chapter is re-simulated, the API can surface duplicate chapters or stale alerts even if the newest run is healthy. Mirror list_by_novel() here and filter on the latest row per chapter_number before applying the threshold.
Proposed fix
def get_high_churn_chapters(
self, novel_id: str, threshold: float = 60.0,
) -> List[Dict]:
"""获取劝退风险高的章节(用于告警)。"""
db = self._get_db()
cursor = db.execute(
- """SELECT * FROM reader_simulations
- WHERE novel_id = ? AND avg_churn_risk >= ?
- ORDER BY avg_churn_risk DESC""",
- (novel_id, threshold),
+ """SELECT rs.* FROM reader_simulations rs
+ INNER JOIN (
+ SELECT novel_id, chapter_number, MAX(created_at) AS max_created
+ FROM reader_simulations
+ WHERE novel_id = ?
+ GROUP BY novel_id, chapter_number
+ ) latest ON rs.novel_id = latest.novel_id
+ AND rs.chapter_number = latest.chapter_number
+ AND rs.created_at = latest.max_created
+ WHERE rs.avg_churn_risk >= ?
+ ORDER BY rs.avg_churn_risk DESC, rs.chapter_number ASC""",
+ (novel_id, threshold),
)
    return [dict(row) for row in cursor.fetchall()]
🧰 Tools
🪛 Ruff (0.15.11)
[warning] 112-112: Docstring contains ambiguous ( (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?
(RUF002)
[warning] 112-112: Docstring contains ambiguous ) (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?
(RUF002)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@infrastructure/persistence/database/reader_simulation_repository.py` around
lines 109 - 118, get_high_churn_chapters currently returns every historical row
above the threshold, causing duplicate/stale alerts; change it to only consider
the latest simulation per chapter_number (same behavior as list_by_novel) by
filtering/joining on the max simulation_time (or max id if used as monotonic)
per (novel_id, chapter_number) before applying avg_churn_risk >= threshold and
ordering; update the SQL/query in get_high_churn_chapters to select rows whose
(novel_id, chapter_number) match the per-chapter latest timestamp, then apply
the threshold and ORDER BY avg_churn_risk DESC so only the newest run per
chapter can trigger an alert.
Fixes raised in PR shenminglinyi#135 review:
- 🔴 Critical: DatabaseConnection has no executescript() attribute. Get raw sqlite3 connection via get_connection() and call executescript on it. Wrap in try/except since CREATE TABLE IF NOT EXISTS in migration makes repeated calls safe; remaining failures are logged not silenced.
- 🟠 Major: silent LLM failure masquerading as success. Previously any LLM/parse failure returned HTTP 200 with default-value 'feedback'. Now:
  * Service catches LLM exceptions and returns fallback report tagged with is_fallback=True + error_message.
  * Empty-feedbacks parse result (LLM refusal / empty response) also marked as fallback with response preview for debugging.
  * Router rejects persistence of fallback reports (no more dirty data).
  * Router distinguishes client errors (400) from upstream LLM errors (502) instead of masking both as 200.
- 🟠 Major: persistence failures silently swallowed. Now exposed via meta.persisted/persist_error in response so clients can retry.
New DTO fields: ChapterReaderReportDTO.is_fallback, error_message.
Tests: 3 new cases covering fallback semantics. Total 19/19 passing.
Manually verified against running server:
- Non-existent chapter -> HTTP 400 (was silently 200)
- LLM failure -> HTTP 502 (was silently 200 with fake scores)
- executescript warning gone from logs
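A sketch of the fixed `ensure_table()` described in the commit message; the project's `DatabaseConnection` wrapper is replaced by a plain `sqlite3` stand-in here, so treat the class shell as illustrative.

```python
import logging
import sqlite3
from pathlib import Path

logger = logging.getLogger(__name__)


class ReaderSimulationRepositorySketch:
    """Simplified stand-in for the PR's repository class."""

    def __init__(self, db_path: str = "novel.db") -> None:  # path is a placeholder
        self._db_path = db_path

    def ensure_table(self) -> None:
        """Run the migration script; safe to call repeatedly."""
        migration_path = Path(__file__).parent / "migrations" / "add_reader_simulations.sql"
        if not migration_path.exists():
            return
        sql = migration_path.read_text(encoding="utf-8")
        try:
            conn = sqlite3.connect(self._db_path)
            # executescript() lives on sqlite3.Connection, not on the project's
            # DatabaseConnection wrapper, hence the raw-connection detour.
            conn.executescript(sql)
            conn.commit()
        except sqlite3.Error:
            # CREATE TABLE IF NOT EXISTS makes retries safe; log, don't silence.
            logger.warning("reader_simulations migration failed", exc_info=True)
```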
@shenminglinyi Hi, the merge conflicts are resolved, and CodeRabbit's 🔴 Critical and 🟠 Major issues have been fixed as well (see 442500a):
Verified locally: the startup warning is gone. Please re-review; looking forward to the merge 🙏
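A sketch of the 400-vs-502 split and the no-persist rule described above; the PR documents `is_fallback` and `error_message`, while the `error_kind` field and the service stub below are hypothetical glue added purely for illustration.

```python
from dataclasses import dataclass

from fastapi import APIRouter, HTTPException

router = APIRouter()


@dataclass
class Report:
    is_fallback: bool = False
    error_kind: str = ""    # hypothetical: "client" (bad chapter) or "llm" (upstream)
    error_message: str = ""

    def to_dict(self) -> dict:
        return {"is_fallback": self.is_fallback}


async def run_simulation(novel_id: str, chapter_number: int) -> Report:
    return Report()  # stand-in for ReaderSimulationService.simulate()


@router.post("/novels/{novel_id}/chapters/{chapter_number}/simulate")
async def simulate(novel_id: str, chapter_number: int) -> dict:
    report = await run_simulation(novel_id, chapter_number)
    if report.is_fallback:
        # Fallback reports are never persisted, so no dirty history rows.
        if report.error_kind == "llm":
            raise HTTPException(status_code=502, detail="上游 LLM 调用失败")
        raise HTTPException(status_code=400, detail=report.error_message or "章节不可用")
    return {"success": True, "data": report.to_dict()}
```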
♻️ Duplicate comments (5)
interfaces/api/v1/reader/__init__.py (3)
227-236: ⚠️ Potential issue | 🟡 Minor
Guard `feedbacks_json` shape before iterating pain points.
Line 236 assumes every item in `feedbacks` is a dict. Malformed JSON could cause a 500 if an item isn't a dict or if `pain_points` isn't a list.
Proposed fix
 # 提取痛点汇总
 all_pain_points = []
 for fb in feedbacks:
-    all_pain_points.extend(fb.get("pain_points", []))
+    if isinstance(fb, dict):
+        pain_points = fb.get("pain_points", [])
+        if isinstance(pain_points, list):
+            all_pain_points.extend(pain_points)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interfaces/api/v1/reader/__init__.py` around lines 227 - 236, Validate and guard the shape of parsed feedbacks before iterating: ensure that after json.loads(feedbacks_raw) the variable feedbacks is a list (fallback to [] if not), and when iterating use type checks on each fb (e.g., ensure isinstance(fb, dict)) and only extend all_pain_points with fb.get("pain_points", []) when pain_points is a list (or coerce/skip non-list values); update the loop that builds all_pain_points (variables: records, feedbacks_raw, feedbacks, all_pain_points, fb) to perform these checks to avoid TypeError/AttributeError on malformed JSON items.
49-57: ⚠️ Potential issue | 🟠 Major
`_get_repo()` won't retry table initialization after first failure.
At line 52, `_repo` is assigned before `ensure_table()`. If line 54 fails, subsequent calls skip initialization entirely because `_repo` is already set.
Proposed fix
 def _get_repo() -> ReaderSimulationRepository:
     global _repo
     if _repo is None:
-        _repo = ReaderSimulationRepository()
+        candidate = ReaderSimulationRepository()
         try:
-            _repo.ensure_table()
+            candidate.ensure_table()
         except Exception as e:
             logger.warning("读者模拟表初始化失败(首次使用时会自动重试): %s", e)
+            raise
+        _repo = candidate
     return _repo
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interfaces/api/v1/reader/__init__.py` around lines 49 - 57, _get_repo currently assigns the global _repo before calling ReaderSimulationRepository.ensure_table, so if ensure_table() raises the partially-initialized _repo remains set and future calls won't retry; fix by instantiating a local variable (e.g., repo = ReaderSimulationRepository()), call repo.ensure_table() inside the try, and only assign the global _repo = repo after ensure_table() succeeds; keep the existing except branch to log via logger.warning and do not overwrite _repo on failure so subsequent calls will retry initialization.
181-183: ⚠️ Potential issue | 🟡 Minor
Avoid exposing internal exception text in HTTP 500 responses.
Lines 183, 213, and 253 return `str(e)` or interpolated `{e}` in the HTTP response detail, which can leak backend internals to clients.
Proposed fix pattern
-except Exception as e:
-    logger.error("获取读者模拟记录失败: %s", e)
-    raise HTTPException(status_code=500, detail=str(e))
+except Exception as e:
+    logger.exception("获取读者模拟记录失败")
+    raise HTTPException(status_code=500, detail="服务器内部错误,请稍后重试") from e
Apply the same pattern to all generic `except Exception` handlers in this file.
Also applies to: 211-213, 251-253
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interfaces/api/v1/reader/__init__.py` around lines 181 - 183, Multiple generic except Exception handlers in this module currently expose internal exception text via HTTPException(detail=str(e)); update each handler (the ones that log and raise HTTPException around the reader endpoints) to log the full exception (e.g., logger.exception(...) or logger.error(..., exc_info=True)) but return a generic client-safe message in the HTTP response (for example raise HTTPException(status_code=500, detail="Internal server error while processing request") or similar), and apply this change consistently to the three handlers that currently use str(e) so logs keep full details while responses do not leak internals.
application/reader/services/reader_simulation_service.py (1)
307-333: ⚠️ Potential issue | 🟠 Major
Normalize to exactly one feedback per canonical persona.
This loop preserves duplicate and unknown `persona` values from LLM output, then fills missing personas afterward. If the LLM returns multiple feedbacks with the same persona (e.g., two "hardcore" entries), the result can contain 4+ feedback blocks, skewing `_compute_avg_scores()`.
Consider iterating only the canonical keys and taking the first match:
Proposed fix
-        feedbacks = []
-        for fb in payload.feedbacks:
-            feedbacks.append(ReaderFeedbackDTO(
-                persona=fb.persona,
-                persona_label=PERSONA_LABELS.get(fb.persona, fb.persona),
-                scores=ReaderDimensionScoresDTO(
-                    suspense_retention=fb.scores.suspense_retention,
-                    thrill_score=fb.scores.thrill_score,
-                    churn_risk=fb.scores.churn_risk,
-                    emotional_resonance=fb.scores.emotional_resonance,
-                ),
-                one_line_verdict=fb.one_line_verdict,
-                highlights=list(fb.highlights),
-                pain_points=list(fb.pain_points),
-                suggestions=list(fb.suggestions),
-            ))
-
-        # 确保三个人设都有(缺失时填默认)
-        existing_personas = {f.persona for f in feedbacks}
-        for persona_key in ("hardcore", "casual", "nitpicker"):
-            if persona_key not in existing_personas:
-                feedbacks.append(ReaderFeedbackDTO(
-                    persona=persona_key,
-                    persona_label=PERSONA_LABELS.get(persona_key, persona_key),
-                    scores=ReaderDimensionScoresDTO(),
-                    one_line_verdict="(该读者视角的反馈未能生成)",
-                ))
+        # Build exactly one feedback per canonical persona
+        persona_fb_map = {fb.persona: fb for fb in reversed(payload.feedbacks)}
+        feedbacks = []
+        for persona_key in ("hardcore", "casual", "nitpicker"):
+            fb = persona_fb_map.get(persona_key)
+            if fb:
+                feedbacks.append(ReaderFeedbackDTO(
+                    persona=persona_key,
+                    persona_label=PERSONA_LABELS.get(persona_key, persona_key),
+                    scores=ReaderDimensionScoresDTO(
+                        suspense_retention=fb.scores.suspense_retention,
+                        thrill_score=fb.scores.thrill_score,
+                        churn_risk=fb.scores.churn_risk,
+                        emotional_resonance=fb.scores.emotional_resonance,
+                    ),
+                    one_line_verdict=fb.one_line_verdict,
+                    highlights=list(fb.highlights),
+                    pain_points=list(fb.pain_points),
+                    suggestions=list(fb.suggestions),
+                ))
+            else:
+                feedbacks.append(ReaderFeedbackDTO(
+                    persona=persona_key,
+                    persona_label=PERSONA_LABELS.get(persona_key, persona_key),
+                    scores=ReaderDimensionScoresDTO(),
+                    one_line_verdict="(该读者视角的反馈未能生成)",
+                ))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@application/reader/services/reader_simulation_service.py` around lines 307 - 333, The current builder collects all payload.feedbacks (allowing duplicates/unknown personas) then pads missing personas, which can create >3 entries and break _compute_avg_scores(); change the logic in the ReaderFeedbackDTO construction to iterate canonical persona keys ("hardcore","casual","nitpicker") and for each persona_key select the first matching entry from payload.feedbacks (match on fb.persona == persona_key), mapping it into ReaderFeedbackDTO (use PERSONA_LABELS and ReaderDimensionScoresDTO as before); if no match found, append a default ReaderFeedbackDTO with empty ReaderDimensionScoresDTO and the placeholder one_line_verdict — this guarantees exactly one feedback per canonical persona and ignores duplicates/unknowns.
infrastructure/persistence/database/reader_simulation_repository.py (1)
133-144: ⚠️ Potential issue | 🟠 Major
Only alert on the latest simulation per chapter.
This query returns every historical row whose `avg_churn_risk` crosses the threshold, which can surface duplicate/stale alerts after re-simulation. Mirror the `list_by_novel()` approach by filtering on the latest row per `chapter_number` before applying the threshold.
Proposed fix
 def get_high_churn_chapters(
     self, novel_id: str, threshold: float = 60.0,
 ) -> List[Dict]:
     """获取劝退风险高的章节(用于告警)。"""
     db = self._get_db()
     cursor = db.execute(
-        """SELECT * FROM reader_simulations
-           WHERE novel_id = ? AND avg_churn_risk >= ?
-           ORDER BY avg_churn_risk DESC""",
-        (novel_id, threshold),
+        """SELECT rs.* FROM reader_simulations rs
+           INNER JOIN (
+               SELECT novel_id, chapter_number, MAX(created_at) AS max_created
+               FROM reader_simulations
+               WHERE novel_id = ?
+               GROUP BY novel_id, chapter_number
+           ) latest ON rs.novel_id = latest.novel_id
+               AND rs.chapter_number = latest.chapter_number
+               AND rs.created_at = latest.max_created
+           WHERE rs.avg_churn_risk >= ?
+           ORDER BY rs.avg_churn_risk DESC""",
+        (novel_id, threshold),
     )
     return [dict(row) for row in cursor.fetchall()]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/persistence/database/reader_simulation_repository.py` around lines 133 - 144, The current get_high_churn_chapters method returns all historical rows exceeding the threshold; change it to only consider the latest simulation per chapter by filtering to the most recent row per chapter_number (mirroring list_by_novel). Modify the SQL in get_high_churn_chapters to join or subquery on reader_simulations grouped by chapter_number (e.g., using MAX(simulation_time) or MAX(id)) to select only the latest row for each chapter_number, then apply the avg_churn_risk >= ? filter and ORDER BY avg_churn_risk DESC so stale duplicates are excluded.
🧹 Nitpick comments (3)
infrastructure/persistence/database/reader_simulation_repository.py (1)
93-93: `datetime.utcnow()` is deprecated in Python 3.12+.
Consider using `datetime.now(timezone.utc)` for forward compatibility.
Proposed fix
+from datetime import datetime, timezone
-from datetime import datetime
 ...
-            datetime.utcnow().isoformat(),
+            datetime.now(timezone.utc).isoformat(),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/persistence/database/reader_simulation_repository.py` at line 93, Replace the deprecated datetime.utcnow() call with a timezone-aware timestamp: change the occurrence of datetime.utcnow().isoformat() in reader_simulation_repository (where the timestamp is generated) to use datetime.now(timezone.utc).isoformat() and update imports to include timezone (e.g., import or add from datetime import timezone). Ensure any tests or callers expecting UTC strings still work with the ISO8601 output.
application/reader/services/reader_simulation_service.py (2)
342-342: `datetime.utcnow()` is deprecated in Python 3.12+.
Same as in the repository file, consider using `datetime.now(timezone.utc)` for forward compatibility.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@application/reader/services/reader_simulation_service.py` at line 342, Replace the deprecated naive UTC timestamp creation used for the analyzed_at field (datetime.utcnow()) with an aware UTC datetime: use datetime.now(timezone.utc) and update imports to include timezone (e.g., from datetime import datetime, timezone) so analyzed_at is timezone-aware; ensure any downstream code expecting naive datetimes is adjusted or cast accordingly in reader_simulation_service.py (look for the analyzed_at assignment).
104-105: Silent exception swallowing hides knowledge repository issues.
The `try-except-pass` pattern completely swallows errors from `get_knowledge_repository()`. Consider at least logging at DEBUG level so issues are discoverable during troubleshooting.
Proposed fix
 if self._knowledge_repo and prev_chapter:
     try:
         knowledge = self._knowledge_repo.get_by_novel_id(novel_id)
         if knowledge:
             ch_sum = knowledge.get_chapter(chapter_number - 1)
             if ch_sum:
                 prev_summary = ch_sum.summary or ""
     except Exception:
-        pass
+        logger.debug("获取知识仓储前章摘要失败,已忽略", exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@application/reader/services/reader_simulation_service.py` around lines 104 - 105, The bare "except: pass" swallowing exceptions from get_knowledge_repository() should be replaced with catching the exception (e.g., "except Exception as e:") and logging it at DEBUG level so issues are discoverable; update the exception block in the ReaderSimulationService (where get_knowledge_repository() is called) to call an available logger (e.g., self.logger.debug or process_logger.debug) and include the exception details (use exc_info=True or format the exception) instead of silently passing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 95ba0fb2-3ae0-4cb2-97c8-755b4b79a4c5
📒 Files selected for processing (5)
- application/reader/dtos/reader_feedback_dto.py
- application/reader/services/reader_simulation_service.py
- infrastructure/persistence/database/reader_simulation_repository.py
- interfaces/api/v1/reader/__init__.py
- tests/unit/application/services/test_reader_simulation_service.py
Related to #75
Implements the first phase of the reader-retention engine: the Reader Simulation Agent,
covering evaluation from three reader perspectives, four-dimension scoring, hook-strength analysis, and churn-risk alerts.
Hard/soft constraint tiers and the debt mechanism will be implemented in follow-up PRs.
Change type
feat (new feature)
Change description
Adds a Reader Simulation Agent module that uses an LLM to simulate three reader types (hardcore fan / casual reader / nitpicker) and evaluate chapters across multiple dimensions, giving authors reader-perspective quality feedback. This capability underpins the #75 reader-retention engine; hard/soft constraint tiers and the debt mechanism can be layered on top in follow-up PRs.
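An illustrative sketch of the three-persona prompt assembly; `PERSONA_LABELS` and the feedback field names come from the PR, but the prompt wording itself is an assumption.

```python
PERSONA_LABELS = {"hardcore": "硬核粉", "casual": "休闲读者", "nitpicker": "挑刺党"}
DIMENSIONS = ("suspense_retention", "thrill_score", "churn_risk", "emotional_resonance")


def build_prompt(chapter_text: str, prev_summary: str = "") -> str:
    """Ask the LLM to score one chapter from all three reader perspectives."""
    personas = "、".join(PERSONA_LABELS.values())
    header = (
        f"请分别以{personas}三类读者的视角阅读下面的章节,"
        f"对 {', '.join(DIMENSIONS)} 四个维度打分(0-100),"
        "并以 JSON 数组返回;每个元素包含 persona、scores、one_line_verdict、"
        "highlights、pain_points、suggestions 字段。\n"
    )
    context = f"前章摘要:{prev_summary}\n" if prev_summary else ""
    return header + context + f"章节正文:\n{chapter_text}"
```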
Architecture impact
- New module across the application / infrastructure / interfaces layers
- New reader_simulations table (migration: add_reader_simulations.sql; the table is created automatically on the first API call)
- 4 new endpoints under the /api/v1/reader/ prefix
Tests