Fix AioStreams metadata disambiguation using watch history and year #47
Conversation
…metadata matching
Co-authored-by: willtho89 <4933503+willtho89@users.noreply.github.com>
Pull request overview
Improves AioStreams metadata selection for ambiguous TV titles by using a user’s watch history (episode continuity + IDs) and ensuring year-in-filename influences search and candidate choice.
Changes:
- Made `_select_candidate_for_entry()` async and added `db`/`user_id` so selection can consult watch history (signature sketched below).
- Added `_check_series_continuity()` to prefer candidates that match previously watched episodes (ID-based).
- Updated tests to call the new async selector and added a continuity test case.
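For orientation, a rough sketch of the new selector signature, inferred from how the updated tests call it (parameter types are not shown in the diff, so none are assumed here):

```python
async def _select_candidate_for_entry(db, user_id, entry, candidates):
    """Now async and db/user_id-aware so selection can consult the user's watch history."""
    ...  # candidate scoring, including the new _check_series_continuity() check
```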
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| backend/src/librarysync/jobs/aiostreams_import.py | Adds async candidate selection and DB-backed continuity logic to disambiguate same-title TV shows. |
| backend/tests/test_aiostreams_import.py | Updates tests for async signature and adds coverage for continuity-based selection. |
```python
# First, find all TV shows the user has watched
# Use a subquery to get the max season and episode per show
# Note: This may return max_season from one episode and max_episode from another,
# but it's acceptable for our continuity check as we're looking for a general pattern
result = await db.execute(
    select(
        MediaItem,
        func.max(EpisodeItem.season_number).label("max_season"),
        func.max(EpisodeItem.episode_number).label("max_episode"),
    )
    .join(EpisodeItem, EpisodeItem.show_media_item_id == MediaItem.id)
    .join(WatchedItem, and_(
        WatchedItem.episode_item_id == EpisodeItem.id,
        WatchedItem.user_id == user_id
    ))
    .where(MediaItem.media_type == "tv")
    .group_by(MediaItem.id)
```
Computing max_season and max_episode independently can produce an impossible (season, episode) pair (e.g., max season from one row and max episode from another). This can incorrectly reject true continuations (e.g., user watched S03E01 but max_episode comes from S01E22, causing the nearby/continuation checks to fail). Consider selecting the latest watched episode per show using a single ordering over (season_number DESC, episode_number DESC) (e.g., a window function / DISTINCT ON in Postgres), or a two-step aggregate (max season, then max episode constrained to that season).
Suggested change:

```python
# First, find the latest watched episode per show for this user, in a way that
# guarantees (max_season, max_episode) comes from a real episode row.
#
# Step 1: find the max season per show that the user has watched.
max_season_per_show_subq = (
    select(
        EpisodeItem.show_media_item_id.label("show_id"),
        func.max(EpisodeItem.season_number).label("max_season"),
    )
    .join(
        WatchedItem,
        and_(
            WatchedItem.episode_item_id == EpisodeItem.id,
            WatchedItem.user_id == user_id,
        ),
    )
    .group_by(EpisodeItem.show_media_item_id)
    .subquery()
)
# Step 2: within that max season, find the max episode the user has watched per show.
max_episode_in_max_season_subq = (
    select(
        EpisodeItem.show_media_item_id.label("show_id"),
        func.max(EpisodeItem.episode_number).label("max_episode"),
    )
    .join(
        max_season_per_show_subq,
        and_(
            max_season_per_show_subq.c.show_id == EpisodeItem.show_media_item_id,
            max_season_per_show_subq.c.max_season == EpisodeItem.season_number,
        ),
    )
    # Restrict to this user's watched episodes, matching Step 1.
    .join(
        WatchedItem,
        and_(
            WatchedItem.episode_item_id == EpisodeItem.id,
            WatchedItem.user_id == user_id,
        ),
    )
    .group_by(EpisodeItem.show_media_item_id)
    .subquery()
)
# Step 3: join the aggregates back to MediaItem to get per-show latest season/episode.
result = await db.execute(
    select(
        MediaItem,
        max_season_per_show_subq.c.max_season.label("max_season"),
        max_episode_in_max_season_subq.c.max_episode.label("max_episode"),
    )
    .join(
        max_season_per_show_subq,
        max_season_per_show_subq.c.show_id == MediaItem.id,
    )
    .join(
        max_episode_in_max_season_subq,
        max_episode_in_max_season_subq.c.show_id == MediaItem.id,
    )
    .where(MediaItem.media_type == "tv")
```
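For completeness, the window-function variant mentioned in the comment could look roughly like this. It is a sketch only, written as if it sat inside the same async helper (so `db`, `user_id`, and the `MediaItem`/`EpisodeItem`/`WatchedItem` models are assumed to be in scope), not the PR's actual code:

```python
from sqlalchemy import and_, func, select

# Number each watched episode within its show, newest (season, episode) first,
# so rank 1 is always a real (season, episode) pair taken from a single row.
latest_watched = (
    select(
        EpisodeItem.show_media_item_id.label("show_id"),
        EpisodeItem.season_number.label("max_season"),
        EpisodeItem.episode_number.label("max_episode"),
        func.row_number()
        .over(
            partition_by=EpisodeItem.show_media_item_id,
            order_by=[
                EpisodeItem.season_number.desc(),
                EpisodeItem.episode_number.desc(),
            ],
        )
        .label("rn"),
    )
    .join(
        WatchedItem,
        and_(
            WatchedItem.episode_item_id == EpisodeItem.id,
            WatchedItem.user_id == user_id,
        ),
    )
    .subquery()
)

result = await db.execute(
    select(MediaItem, latest_watched.c.max_season, latest_watched.c.max_episode)
    .join(latest_watched, latest_watched.c.show_id == MediaItem.id)
    .where(latest_watched.c.rn == 1, MediaItem.media_type == "tv")
)
```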
```python
result = await db.execute(
    select(
        MediaItem,
        func.max(EpisodeItem.season_number).label("max_season"),
        func.max(EpisodeItem.episode_number).label("max_episode"),
    )
    .join(EpisodeItem, EpisodeItem.show_media_item_id == MediaItem.id)
    .join(WatchedItem, and_(
        WatchedItem.episode_item_id == EpisodeItem.id,
        WatchedItem.user_id == user_id
    ))
    .where(MediaItem.media_type == "tv")
    .group_by(MediaItem.id)
)
```
Despite computing title_key, the query fetches all TV shows a user has watched and only later filters by title in Python. For users with large histories this can be unnecessarily expensive. Consider pushing at least a coarse title filter into SQL (e.g., ilike / lower(title) equality on entry.title) so you only aggregate candidate-relevant shows, then keep the stricter _normalize_title_key check in Python if needed.
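For example, a coarse filter could be appended to the statement quoted above. This is a sketch only, assuming `entry.title` holds the parsed title; the stricter `_normalize_title_key` comparison would still run in Python on the smaller result set:

```python
result = await db.execute(
    select(
        MediaItem,
        func.max(EpisodeItem.season_number).label("max_season"),
        func.max(EpisodeItem.episode_number).label("max_episode"),
    )
    .join(EpisodeItem, EpisodeItem.show_media_item_id == MediaItem.id)
    .join(WatchedItem, and_(
        WatchedItem.episode_item_id == EpisodeItem.id,
        WatchedItem.user_id == user_id
    ))
    .where(MediaItem.media_type == "tv")
    # Coarse SQL-side title filter so only candidate-relevant shows are aggregated.
    .where(func.lower(MediaItem.title) == entry.title.strip().lower())
    .group_by(MediaItem.id)
)
```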
```python
selected = asyncio.run(
    aiostreams_import._select_candidate_for_entry(db, "test_user", entry, candidates)
)
```
Using asyncio.run() inside unittest.TestCase can break if tests are executed under an environment that already has a running event loop. Prefer unittest.IsolatedAsyncioTestCase with async def test_... methods (and await ...) for these async tests.
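A sketch of that shape; the import path is assumed from the repository layout, and `build_entry_and_candidates()` is a hypothetical stand-in for the fixture setup the current test already performs:

```python
import unittest
from unittest.mock import AsyncMock

from librarysync.jobs import aiostreams_import  # assumed import path


class SelectCandidateContinuityTest(unittest.IsolatedAsyncioTestCase):
    async def test_select_candidate_awaited_directly(self):
        db = AsyncMock()  # or the async session fixture used elsewhere in the suite
        entry, candidates = build_entry_and_candidates()  # hypothetical helper, not a real API
        selected = await aiostreams_import._select_candidate_for_entry(
            db, "test_user", entry, candidates
        )
        self.assertIsNotNone(selected)
```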
```python
# 1. Are TV shows
# 2. Have a matching normalized title
# 3. Have episode watches by this user
from sqlalchemy import and_, func
```
Minor cleanup: importing and_/func inside the function makes dependencies less discoverable; prefer module-level imports for consistency. Also, the if entry.season_number is not None: check is redundant because the function already returns early when season_number is None.
```python
if entry.season_number is not None:
    is_continuation = (
```
Minor cleanup: importing and_/func inside the function makes dependencies less discoverable; prefer module-level imports for consistency. Also, the if entry.season_number is not None: check is redundant because the function already returns early when season_number is None.
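Concretely, the cleanup could look like the sketch below; the continuity condition itself is not shown in the diff, so it is elided here:

```python
# Module level, alongside the other SQLAlchemy imports:
from sqlalchemy import and_, func

# Inside _check_series_continuity(): the function already returns early when
# entry.season_number is None, so the wrapping `if` can be dropped and the
# assignment can sit directly in the function body.
is_continuation = (
    ...  # same condition as before, unchanged
)
```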
AioStreams provides only filenames without media IDs, causing ambiguous metadata matches when multiple shows share titles (e.g., "Fallout" TV show vs "Fallout" anime).
Changes

- Series continuity check - query the user's watch history to disambiguate TV show candidates.
- Year extraction - verify that the existing PTT year parsing flows correctly into search: `["Fallout 2024", "Fallout"]` (see the sketch at the end of this description).

Impact

Implementation

- Made `_select_candidate_for_entry()` async, added `db`/`user_id` params
- Added `_check_series_continuity()` with a SQLAlchemy query over watch history
- Uses `max_season`/`max_episode` in the continuity logic
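For illustration, the year-in-filename behaviour amounts to something like the helper below; `build_search_queries` is hypothetical, and only the resulting `["Fallout 2024", "Fallout"]` ordering is taken from the description:

```python
def build_search_queries(title: str, year: int | None) -> list[str]:
    """Hypothetical sketch: try the year-qualified title first, then fall back to the bare title."""
    queries = [f"{title} {year}"] if year is not None else []
    queries.append(title)
    return queries


assert build_search_queries("Fallout", 2024) == ["Fallout 2024", "Fallout"]
assert build_search_queries("Fallout", None) == ["Fallout"]
```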