Conversation
98f26ba to
d1e4a65
Compare
There was a problem hiding this comment.
Pull request overview
This PR migrates session tracking from the X-Session-ID header to an x-session-id cookie, centralizing request parsing and adding utilities to resolve/create users & sessions based on that cookie across API endpoints and services.
Changes:
- Introduce
extract_session_cookie()/extract_origin_from_request()helpers and apply them across routers + middleware. - Add
resolve_user_and_session()utility and a new/user_and_sessionendpoint that sets the session cookie. - Update data-collection plumbing, tests, and environment/configuration to support cookie-based session IDs and an
ENVsetting.
Reviewed changes
Copilot reviewed 21 out of 24 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/app/user/utils/utils.py | Adds async helper to resolve/create user+session from a session UUID. |
| src/app/user/api/router.py | Adds cookie-setting endpoint and refactors bookmark routes to use cookie-derived identity. |
| src/app/tutor/api/router.py | Switches session ID sourcing from header to cookie. |
| src/app/search/api/router.py | Switches session ID sourcing from header to cookie. |
| src/app/api/api_v1/endpoints/chat.py | Switches session ID sourcing from header to cookie for data-collection. |
| src/app/api/api_v1/endpoints/metric.py | Switches session ID sourcing from header to cookie for syllabus download tracking. |
| src/app/middleware/monitor_requests.py | Switches request monitoring session ID from header to cookie. |
| src/app/shared/utils/requests.py | New shared request helpers for cookie + origin extraction. |
| src/app/shared/domain/constants.py | Defines cookie name and TTL constant for session cookie. |
| src/app/services/sql_db/queries_user.py | Adjusts session lookup signature and adds new exception/logging behavior. |
| src/app/services/data_collection.py | Changes session_id typing to UUID and removes string-to-UUID conversion. |
| src/app/api/api_v1/api.py | Updates user router import wiring. |
| src/app/core/config.py | Adds required ENV setting for environment detection. |
| src/app/tests/test_utils.py | Adds unit tests for resolve_user_and_session. |
| src/app/tests/api/api_v1/test_user.py | Updates user API tests to use cookies and new bookmark routes. |
| src/app/tests/api/api_v1/test_search.py | Updates search API tests to send session via Cookie header. |
| pytest.ini | Sets ENV=test for test runs. |
| k8s/welearn-api/values.dev.yaml | Provides ENV config for dev deployment. |
| k8s/welearn-api/values.staging.yaml | Provides ENV config for staging deployment. |
| k8s/welearn-api/values.prod.yaml | Provides ENV config for production deployment. |
| .env.example | Documents ENV for local development. |
Comments suppressed due to low confidence (2)
src/app/user/api/router.py:135
- The route path contains a literal ":document_id" segment, but
document_idis not declared as a path parameter (FastAPI uses{document_id}), so it will be treated as a query param and the URL will literally include:document_id. Either change the route to use a proper path param (/bookmarks/{document_id}) or remove:document_idfrom the path and keepdocument_idas a query parameter.
src/app/user/api/router.py:162 - Same issue as above: the route path includes a literal
:document_idsegment. This is easy to confuse with a real path parameter and will likely break clients expecting/bookmarks/<uuid>. Consider switching to/bookmarks/{document_id}(path param) or/bookmarkswithdocument_idonly as a query parameter.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| with session_maker() as s: | ||
| session = s.execute(select(Session).where(Session.id == session_id)).first() | ||
| if session: | ||
| logger.info( | ||
| "Valid session. user_id=%s session_id=%s", | ||
| session[0].inferred_user_id, | ||
| session_id, | ||
| ) | ||
| return session[0].inferred_user_id |
There was a problem hiding this comment.
get_user_from_session_id currently treats any session row as valid and does not check Session.end_at > now (unlike get_or_create_session_sync). This can make expired sessions resolve to a user indefinitely, especially problematic now that cookies can persist long-term. Consider filtering by end_at in the query (and/or deleting/invalidating expired sessions).
| user_id = await run_in_threadpool(get_user_from_session_id, session_uuid) | ||
|
|
||
| if not user_id: | ||
| logger.info("No user found. Creating new user and session.") | ||
| user_id = await run_in_threadpool(get_or_create_user_sync, None, referer) | ||
| session_uuid = await run_in_threadpool( | ||
| get_or_create_session_sync, user_id, None, host, referer | ||
| ) |
There was a problem hiding this comment.
resolve_user_and_session creates a new user+session whenever get_user_from_session_id returns falsy. When used from endpoints like bookmarks, an invalid/unknown cookie will silently create a brand-new user and then read/write bookmarks on that new user, which is likely unintended. Consider splitting this into two flows (e.g., resolve_existing_user_from_session that 401s on missing/invalid session vs. an explicit "create user/session" endpoint).
| else: | ||
| HTTPException( | ||
| status_code=404, detail=f"Session with id {session_id} not found" | ||
| ) |
There was a problem hiding this comment.
get_user_from_session_id instantiates an HTTPException in the else branch but never raises it, so the exception is effectively ignored and the function falls through returning None. Either raise the exception or remove it and just return None (and let callers decide how to handle missing sessions).
| else: | |
| HTTPException( | |
| status_code=404, detail=f"Session with id {session_id} not found" | |
| ) |
d1e4a65 to
a3bd736
Compare
0daa9cf to
b04928e
Compare
Description
Why?
How?
Screenshots (if appropriate):
Types of changes
Checklist: