Skip to content

feat: Add competition seasons endpoint#2

Merged
eskobar95 merged 20 commits intomainfrom
feature/competition-seasons-endpoint
Dec 1, 2025
Merged

feat: Add competition seasons endpoint#2
eskobar95 merged 20 commits intomainfrom
feature/competition-seasons-endpoint

Conversation

@eskobar95
Copy link
Copy Markdown
Owner

@eskobar95 eskobar95 commented Dec 1, 2025

📋 Summary

This PR adds a new endpoint to retrieve all available seasons for a specific competition, and enhances the existing clubs endpoint with full support for national team competitions.

🎯 Changes

New Endpoint

  • GET /competitions/{competition_id}/seasons - Returns all available seasons for a competition

Enhanced Endpoint

  • GET /competitions/{competition_id}/clubs - Now fully supports national team competitions
    • Automatically detects national team competitions (World Cup, EURO, Copa America, etc.)
    • Uses participants page (/teilnehmer/) for national teams instead of record winners
    • Correctly handles season_id for national teams (year-1 in URL)
    • Limits results to expected tournament size to exclude non-qualified teams

New Files

  • app/schemas/competitions/seasons.py - Pydantic schemas for season data

    • CompetitionSeason - Individual season schema with season_id, season_name, start_year, end_year
    • CompetitionSeasons - Response schema with competition info and list of seasons
  • app/services/competitions/seasons.py - Service implementation

    • TransfermarktCompetitionSeasons - Scrapes season data from competition pages
    • Handles both cross-year (e.g., "25/26" → 2025-2026) and single-year (e.g., "2025" → 2025-2025) formats
    • Correctly parses historical seasons (e.g., "99/00" → 1999-2000)
    • Extracts seasons from dropdown/table structure on competition pages

Modified Files

  • app/api/endpoints/competitions.py - Added new seasons endpoint, updated clubs endpoint
  • app/schemas/competitions/clubs.py - Made season_id optional
  • app/schemas/competitions/__init__.py - Exported new schemas
  • app/services/competitions/clubs.py - Added national team competition support
    • Detection logic for 6 known national team competitions
    • URL construction for participants pages
    • XPath expressions for participants table
    • Tournament size limits to exclude non-qualified teams
  • app/utils/xpath.py - Added XPath expressions for season dropdown and participants table
  • Dockerfile - Fixed PYTHONPATH configuration

🔍 Features

Competition Seasons Endpoint

  • ✅ Supports cross-year competitions (e.g., Superligaen: "25/26", "24/25")
  • ✅ Supports single-year competitions (e.g., Allsvenskan: "2025", "2024")
  • ✅ Correctly handles historical seasons crossing century boundaries (e.g., "99/00" → 1999-2000)
  • ✅ Returns structured data with season_id, season_name, start_year, and end_year
  • ✅ Seasons sorted by start_year descending (newest first)
  • ✅ Robust XPath expressions with multiple fallback patterns

National Team Competitions Support

  • ✅ Automatic detection of national team competitions (FIWC, EURO, COPA, AFAC, GOCU, AFCN)
  • ✅ Uses participants page (/teilnehmer/pokalwettbewerb/) for accurate participant lists
  • ✅ Correct season_id handling (tournament year vs URL season_id)
  • ✅ Tournament size validation (32 for World Cup, 24 for EURO, etc.)
  • ✅ Excludes non-qualified teams from results
  • ✅ Backward compatible with existing league endpoints

📝 Example Responses

Seasons Endpoint

{
  "id": "DK1",
  "name": "Superliga",
  "seasons": [
    {
      "seasonId": "2025",
      "seasonName": "25/26",
      "startYear": 2025,
      "endYear": 2026
    }
  ],
  "updatedAt": "2025-12-01T17:34:55.269244"
}

Clubs Endpoint (National Team)

{
  "id": "FIWC",
  "name": "World Cup",
  "seasonId": "2006",
  "clubs": [
    {
      "id": "3439",
      "name": "Brazil"
    },
    {
      "id": "3262",
      "name": "Germany"
    }
  ],
  "updatedAt": "2025-12-01T18:37:02.313924"
}

✅ Testing

Seasons Endpoint

  • ✅ Superligaen (DK1) - Cross-year format, 36 seasons
  • ✅ Allsvenskan (SE1) - Single-year format, 30 seasons
  • ✅ World Cup (FIWC) - Single-year format, 23 seasons
  • ✅ Historical seasons (e.g., "99/00" correctly parsed as 1999-2000)

National Team Competitions

  • ✅ World Cup (FIWC, 2006) - 32 participants
  • ✅ EURO (2024) - 24 participants
  • ✅ Copa America (2019) - 12 participants
  • ✅ Asian Cup (2023) - 24 participants
  • ✅ Gold Cup (2023) - 16 participants
  • ✅ Africa Cup (2024) - 24 participants
  • ✅ Normal leagues (DK1, 2024) - Still works correctly

🔗 Related

This endpoint complements the existing /competitions/{competition_id}/clubs endpoint by providing season information that can be used to query clubs for specific seasons. The enhanced clubs endpoint now provides accurate participant lists for national team competitions, not just record winners.

Summary by CodeRabbit

  • New Features

    • National teams support across club and competition APIs; new club competitions and competition seasons endpoints.
  • Improvements

    • More robust parsing for varied page structures and aligned/normalized player/club lists.
    • Smarter season detection/parsing and configurable tournament-size handling.
    • Schemas made more tolerant/optional to accommodate national teams.
  • Documentation

    • Added API memory guide and changelog updates.
  • Chores

    • CI runner update, ignore-rule additions, and container path adjustment.
  • Tests

    • Added tests for tournament size configuration and validation.

✏️ Tip: You can customize this high-level summary in your review settings.

- Add full support for national teams across all club endpoints
- Add new /clubs/{club_id}/competitions endpoint to retrieve club competitions
- Add isNationalTeam field to Club Profile response schema
- Make Club Profile fields optional to accommodate national teams
- Enhance Club Players endpoint to handle national team HTML structure
- Update XPath expressions to support both club and national team structures
- Add intelligent detection logic for national teams
- Maintain backward compatibility with existing club endpoints

This update enables the API to work seamlessly with both regular clubs
and national teams, providing a unified interface for all club-related
data retrieval.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 1, 2025

Warning

Rate limit exceeded

@eskoubar95 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 18 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 085db81 and cd83f20.

📒 Files selected for processing (5)
  • app/services/clubs/players.py (8 hunks)
  • app/services/competitions/clubs.py (4 hunks)
  • app/services/competitions/seasons.py (1 hunks)
  • app/utils/xpath.py (2 hunks)
  • tests/competitions/test_competitions_clubs.py (2 hunks)

Walkthrough

Adds national-team support across schemas, services, and endpoints; introduces club competitions and competition seasons services/endpoints; expands XPath selectors and parsing to handle alternate page structures; updates CI, Dockerfile, docs, settings, and tests; adjusts several schemas to allow optional fields and adds tournament-size configuration.

Changes

Cohort / File(s) Change Summary
CI & Repo Ignore
​.github/workflows/check.yml, ​.gitignore
Update workflow runner to ubuntu-latest and pin python-version: "3.9"; add OpenMemory IDE rules and .DS_Store to .gitignore.
Changelog & Docs
CHANGELOG.md, openmemory.md, README.md
Document Unreleased national-team support, new endpoints/services/schemas; add OpenMemory guide; add TOURNAMENT_SIZE_* env vars and notes.
API Endpoints
app/api/endpoints/clubs.py, app/api/endpoints/competitions.py
Add GET /clubs/{club_id}/competitions and GET /competitions/{competition_id}/seasons; adjust response_model_exclude_none=True; instantiate new services and use heuristic national-team detection for players.
Club Schemas
Club schema exports
app/schemas/clubs/__init__.py, app/schemas/clubs/profile.py, app/schemas/clubs/competitions.py
Expose ClubCompetitions; add ClubCompetition and ClubCompetitions models; make several ClubProfile fields optional and add is_national_team; change ClubSquad.national_team_players to Optional[int].
Competition Schemas
Competition schema exports
app/schemas/competitions/__init__.py, app/schemas/competitions/seasons.py, app/schemas/competitions/clubs.py, app/schemas/competitions/search.py
Re-export CompetitionSeasons; add CompetitionSeason/CompetitionSeasons; make CompetitionClubs.season_id optional; make CompetitionSearchResult fields country, clubs, players optional.
Club Services
app/services/clubs/profile.py, app/services/clubs/players.py, app/services/clubs/competitions.py
Profile: detect national teams and set isNationalTeam; Players: national-team-specific parsing and list padding/alignment; add TransfermarktClubCompetitions service to fetch/parse club competitions.
Competition Services
app/services/competitions/search.py, app/services/competitions/seasons.py, app/services/competitions/clubs.py
Search: pad lists for alignment; add TransfermarktCompetitionSeasons with multi-strategy season parsing and year logic; extend competition clubs to support national-team competitions (URL/name/season handling, validation, tournament-size heuristics).
XPath Utilities
app/utils/xpath.py
Expand union/fallback XPath selectors and add selectors for competitions, participants, and season controls to support national-team and alternate page structures.
Build / Runtime
Dockerfile, app/settings.py
Set PYTHONPATH to fixed /app; add TOURNAMENT_SIZE_* settings, validator, and get_tournament_size helper.
Tests
tests/competitions/test_competitions_clubs.py
Add tests for tournament size configuration, validation, and warning behavior.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as "API Endpoint"
    participant Service as "Service Layer"
    participant Parser as "HTML Parser/XPath"
    participant Response

    rect rgb(200,220,255)
    note over Client,Response: Club Competitions Flow
    Client->>API: GET /clubs/{club_id}/competitions?season_id=...
    API->>Service: instantiate TransfermarktClubCompetitions(club_id, season_id)
    Service->>Service: __post_init__: build URL, fetch page
    Service->>Parser: locate RECORD_TABLE via XPath
    Parser-->>Service: competition rows/links
    Service->>Service: __parse_club_competitions(): extract id,name,url
    Service-->>API: dict {id, seasonId, competitions[]}
    API-->>Response: serialized ClubCompetitions JSON
    Response-->>Client: 200 OK
    end
Loading
sequenceDiagram
    participant Client
    participant API as "API Endpoint"
    participant Service as "Service Layer"
    participant Parser as "HTML Parser/XPath"
    participant Response

    rect rgb(220,255,220)
    note over Client,Response: Competition Seasons Flow
    Client->>API: GET /competitions/{competition_id}/seasons
    API->>Service: instantiate TransfermarktCompetitionSeasons(competition_id)
    Service->>Service: __post_init__: build URL, fetch page
    Service->>Parser: extract season options via multiple XPath strategies
    Parser-->>Service: season strings/options
    Service->>Service: __parse_seasons(): detect type & parse to start/end years
    Service-->>API: dict {id, name, seasons[]}
    API-->>Response: serialized CompetitionSeasons JSON
    Response-->>Client: 200 OK
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Files to focus on:
    • app/services/clubs/players.py — national-team parsing, padding/alignment edge cases
    • app/services/competitions/seasons.py — season string parsing, century/edge-year logic, deduplication
    • app/services/competitions/clubs.py — national-team URL/name/season handling, tournament-size heuristics, validation
    • app/services/clubs/competitions.py — URL construction, table detection, error handling
    • app/utils/xpath.py — XPath unions and fallback ordering

Poem

🐰
I hopped through tables, links, and years,
Found nations tucked among club cheers.
I padded rows and chased each name,
Kept seasons tidy, URLs tame.
A little hop — the data clears.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 79.41% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat: Add competition seasons endpoint' accurately describes the primary change—adding a new GET /competitions/{competition_id}/seasons endpoint. It is concise, clear, and directly reflects the main feature introduced.

Comment @coderabbitai help to get the list of available commands and usage tips.

- Add GET /competitions/{competition_id}/seasons endpoint
- Implement TransfermarktCompetitionSeasons service to scrape season data
- Add CompetitionSeason and CompetitionSeasons Pydantic schemas
- Support both cross-year (e.g., 25/26) and single-year (e.g., 2025) seasons
- Handle historical seasons correctly (e.g., 99/00 -> 1999-2000)
- Extract seasons from competition page dropdown/table structure
- Return season_id, season_name, start_year, and end_year for each season
- Sort seasons by start_year descending (newest first)

Closes #[issue-number]
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (9)
.github/workflows/check.yml (1)

12-13: Consider updating GitHub Actions to the latest stable versions.

Actionlint flagged that actions/checkout@v3 and actions/setup-python@v4 are outdated. While still functional, updating to v4 and v5 respectively will ensure compatibility with current GitHub Actions infrastructure and receive bug fixes and security updates.

Apply this diff to modernize the actions:

-      - uses: actions/checkout@v3
-      - uses: actions/setup-python@v4
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
.gitignore (1)

165-167: Consider grouping with other IDE-specific rules.

The OpenMemory/Cursor rules could be grouped with the PyCharm .idea/ section (around line 161) for better organization, since both are IDE-specific ignores.

openmemory.md (1)

1-90: Consider adding competition seasons documentation.

This project memory guide documents Club Profile and Club Competitions services but is missing documentation for the new Competition Seasons service (app/services/competitions/seasons.py) mentioned in the PR objectives. Consider adding a section similar to the Club Competitions Service section.

app/services/competitions/search.py (1)

41-41: Consider padding name for consistency.

The name list is not padded like the other fields. If the NAMES XPath returns fewer elements than URLS, the zip will silently truncate results. For consistency with other fields, consider padding name as well.

app/services/clubs/players.py (1)

115-120: Consider padding players_names and players_positions for robustness.

The base_length alignment is well-implemented for most fields. However, players_names (line 57) and players_positions (line 58, used in the non-national-team path) are not padded. If these XPath queries return fewer elements than players_ids, the zip will silently truncate results.

+        # Ensure players_names and players_positions are aligned
+        if len(players_names) != base_length:
+            players_names = (players_names + [None] * base_length)[:base_length]
+        if not is_national_team and len(players_positions) != base_length:
+            players_positions = (players_positions + [None] * base_length)[:base_length]
+
         players_nationalities = [nationality.xpath(Clubs.Players.NATIONALITIES) for nationality in page_nationalities]
app/services/competitions/seasons.py (2)

19-27: Verify URL domain consistency with other services.

This service uses https://www.transfermarkt.com/... while TransfermarktClubCompetitions uses https://www.transfermarkt.us/.... If the rest of the codebase standardizes on one domain (often .us here), consider aligning this URL to avoid inconsistent behavior, redirects, or geo‑specific differences.


140-275: Season extraction and response structure look robust overall.

The multi‑stage fallback in __parse_seasons (table cell → dropdown → generic list/text search) plus deduping and filtering logic yields a clean seasons list, and get_competition_seasons wires it into the response in a way consistent with other services (id, name, seasons, populated via XPath and helpers).

Aside from the year‑normalization details noted elsewhere, this structure and flow look good.

Also applies to: 277-289

app/services/clubs/competitions.py (2)

23-31: Revisit default season_id heuristic for “current season”.

self.season_id defaults to datetime.now().year. On many competitions Transfermarkt uses the start year of a season (e.g., 2024 for season 2024/25). With this heuristic:

  • In Jan–Jun 2025, the “current” active season is 2024/25, but datetime.now().year would give 2025, which might not exist yet or might represent 2025/26.

If you want “current active season”, consider something like:

  • If month >= 7 → year, else year - 1.

Or make the default explicit in the API/docs (e.g., “defaults to calendar year, not necessarily current active season”) to avoid surprises.


32-38: Minor style: prefer top‑level imports for datetime and HTTPException.

Importing datetime and HTTPException inside __post_init__ is safe but non‑idiomatic for this codebase; moving them to module scope will simplify the function and avoid repeated imports if this method is ever called frequently.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3388e4f and fbeced8.

📒 Files selected for processing (17)
  • .github/workflows/check.yml (1 hunks)
  • .gitignore (1 hunks)
  • CHANGELOG.md (1 hunks)
  • app/api/endpoints/clubs.py (2 hunks)
  • app/api/endpoints/competitions.py (2 hunks)
  • app/schemas/clubs/__init__.py (1 hunks)
  • app/schemas/clubs/competitions.py (1 hunks)
  • app/schemas/clubs/profile.py (2 hunks)
  • app/schemas/competitions/__init__.py (1 hunks)
  • app/schemas/competitions/seasons.py (1 hunks)
  • app/services/clubs/competitions.py (1 hunks)
  • app/services/clubs/players.py (1 hunks)
  • app/services/clubs/profile.py (1 hunks)
  • app/services/competitions/search.py (2 hunks)
  • app/services/competitions/seasons.py (1 hunks)
  • app/utils/xpath.py (3 hunks)
  • openmemory.md (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
app/schemas/competitions/__init__.py (1)
app/schemas/competitions/seasons.py (1)
  • CompetitionSeasons (13-16)
app/schemas/clubs/__init__.py (1)
app/schemas/clubs/competitions.py (1)
  • ClubCompetitions (12-15)
app/services/clubs/players.py (3)
app/utils/xpath.py (6)
  • Clubs (105-191)
  • Clubs (223-236)
  • Players (1-102)
  • Players (148-184)
  • Past (179-184)
  • Present (172-177)
app/services/base.py (1)
  • get_list_by_xpath (132-150)
app/utils/utils.py (1)
  • safe_regex (65-84)
app/schemas/competitions/seasons.py (1)
app/schemas/base.py (2)
  • AuditMixin (10-11)
  • TransfermarktBaseModel (14-90)
app/schemas/clubs/competitions.py (1)
app/schemas/base.py (2)
  • AuditMixin (10-11)
  • TransfermarktBaseModel (14-90)
app/services/clubs/competitions.py (3)
app/services/base.py (2)
  • TransfermarktBase (16-223)
  • request_url_page (104-117)
app/utils/utils.py (1)
  • extract_from_url (19-46)
app/utils/xpath.py (4)
  • Clubs (105-191)
  • Clubs (223-236)
  • Competitions (186-191)
  • Competitions (194-236)
app/api/endpoints/competitions.py (2)
app/services/competitions/seasons.py (2)
  • TransfermarktCompetitionSeasons (10-289)
  • get_competition_seasons (277-289)
app/schemas/competitions/seasons.py (1)
  • CompetitionSeasons (13-16)
app/services/competitions/seasons.py (4)
app/services/base.py (4)
  • TransfermarktBase (16-223)
  • request_url_page (104-117)
  • raise_exception_if_not_found (119-130)
  • get_text_by_xpath (152-206)
app/utils/utils.py (1)
  • trim (49-62)
app/utils/xpath.py (5)
  • Competitions (186-191)
  • Competitions (194-236)
  • Profile (18-57)
  • Profile (106-138)
  • Profile (195-210)
app/api/endpoints/competitions.py (1)
  • get_competition_seasons (28-31)
app/services/competitions/search.py (3)
app/services/base.py (1)
  • get_list_by_xpath (132-150)
app/utils/xpath.py (5)
  • Competitions (186-191)
  • Competitions (194-236)
  • Search (59-70)
  • Search (140-146)
  • Search (212-221)
app/utils/utils.py (1)
  • extract_from_url (19-46)
🪛 actionlint (1.7.9)
.github/workflows/check.yml

12-12: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


13-13: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 GitHub Actions: Code Check
app/schemas/competitions/seasons.py

[error] 1-17: Ruff reported multiple issues: F401 imported but unused (Optional), W391 extra newline at end of file, and other formatting problems on this file. Fix imports, trailing newline, and spacing.

app/services/competitions/seasons.py

[error] 1-290: Ruff reported linting errors in this file (multiple issues: unused imports F401, trailing/blank line warnings W391/W293, line length E501, and formatting/compression issues). Please run ruff --fix and address all listed items.

app/utils/xpath.py

[error] 199-210: Line length (E501) and related formatting issues detected in xpath definitions. Please shorten long lines or split expressions and re-run linting.

🔇 Additional comments (15)
.github/workflows/check.yml (2)

10-10: Solid updates for runner and Python environment consistency.

Upgrading to ubuntu-latest and explicitly pinning Python 3.9 ensures the workflow runs on stable, current infrastructure and maintains consistent test environments. This aligns well with the new Pydantic schemas and type hints introduced in this PR.

Also applies to: 14-15


15-15: Verify that Python 3.9 is the intended minimum version.

Python 3.9 reached end-of-life in October 2025. Consider whether the project should target a newer minimum version (e.g., 3.10 or 3.11) for an extended support window and access to newer language features.

Please confirm:

  1. Is Python 3.9 the intentional minimum version for this project?
  2. If not, which Python version(s) should be supported?
app/schemas/clubs/__init__.py (1)

1-4: LGTM!

The re-export follows the established pattern and correctly exposes ClubCompetitions at the package level.

app/schemas/clubs/profile.py (2)

11-11: LGTM!

Making national_team_players optional correctly handles national teams that don't have this field.


40-46: LGTM!

Making stadium_name, stadium_seats, and current_transfer_record optional, and adding the is_national_team flag appropriately supports national teams while maintaining backward compatibility with regular clubs.

app/schemas/clubs/competitions.py (1)

1-15: LGTM!

The schema structure is clean, follows existing patterns with TransfermarktBaseModel and AuditMixin, and appropriately models club competitions data.

app/utils/xpath.py (3)

154-164: LGTM!

The XPath unions correctly handle both regular club structures and national team structures, enabling broader data extraction coverage.


186-191: LGTM!

The new Clubs.Competitions class provides well-structured XPath constants for extracting competition data from the Record table.


224-236: LGTM!

The comments clearly explain the purpose of the union XPaths, and the patterns correctly match both regular clubs and national teams.

app/services/clubs/profile.py (1)

73-97: Approve with verification note on edge cases.

The national team detection logic is reasonable, combining structural indicators (no legal form/stadium) with competition-based checks. The None-safety for league_name at line 86 is correct.

Consider whether legitimate clubs with incomplete profile data (missing legal form or stadium) participating in international tournaments might be incorrectly flagged. The dual condition (structural + competition) mitigates this, but verify with edge cases like smaller national associations or youth national teams.

app/api/endpoints/clubs.py (1)

35-39: LGTM!

The new endpoint follows the established patterns: consistent naming, optional season_id parameter, and proper response model configuration matching other endpoints.

app/api/endpoints/competitions.py (1)

27-31: LGTM!

The new seasons endpoint is well-structured and correctly uses response_model_exclude_none=True to handle optional fields cleanly. The implementation follows the established service pattern.

app/schemas/competitions/__init__.py (1)

3-3: LGTM!

The re-export follows the established pattern and properly exposes the new CompetitionSeasons schema.

app/services/clubs/players.py (1)

59-114: Solid national team parsing implementation.

The row-by-row extraction with URL matching effectively handles the different HTML structure for national teams. The seen_urls set correctly prevents duplicates.

One consideration: the detection heuristic at line 61 (PAGE_INFOS empty + players exist) might trigger for edge cases where a regular club's page has an unusual structure. This is a pragmatic approach, but consider if additional signals could strengthen the detection.

app/services/clubs/competitions.py (1)

39-72: Club competitions parsing and response wiring look solid.

  • Using Clubs.Competitions.RECORD_TABLE as a guard ensures you only parse when the expected structure is present.
  • Extracting href, deriving comp_id via extract_from_url, and building {id, name, url} objects per row is straightforward and resilient (skips rows without a competition link or missing id/name).
  • get_club_competitions returns {id, seasonId, competitions} using the shared response dict from TransfermarktBase, which aligns with existing patterns.

No functional issues spotted here.

Also applies to: 74-86

Comment thread app/schemas/competitions/seasons.py Outdated
Comment on lines +1 to +17
from typing import Optional

from app.schemas.base import AuditMixin, TransfermarktBaseModel


class CompetitionSeason(TransfermarktBaseModel):
season_id: str
season_name: str
start_year: int
end_year: int


class CompetitionSeasons(TransfermarktBaseModel, AuditMixin):
id: str
name: str
seasons: list[CompetitionSeason]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix linting issues flagged by pipeline.

The pipeline reports unused import and trailing newline issues. Apply these fixes:

-from typing import Optional
-
 from app.schemas.base import AuditMixin, TransfermarktBaseModel


 class CompetitionSeason(TransfermarktBaseModel):
     season_id: str
     season_name: str
     start_year: int
     end_year: int


 class CompetitionSeasons(TransfermarktBaseModel, AuditMixin):
     id: str
     name: str
     seasons: list[CompetitionSeason]
-
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from typing import Optional
from app.schemas.base import AuditMixin, TransfermarktBaseModel
class CompetitionSeason(TransfermarktBaseModel):
season_id: str
season_name: str
start_year: int
end_year: int
class CompetitionSeasons(TransfermarktBaseModel, AuditMixin):
id: str
name: str
seasons: list[CompetitionSeason]
from app.schemas.base import AuditMixin, TransfermarktBaseModel
class CompetitionSeason(TransfermarktBaseModel):
season_id: str
season_name: str
start_year: int
end_year: int
class CompetitionSeasons(TransfermarktBaseModel, AuditMixin):
id: str
name: str
seasons: list[CompetitionSeason]
🧰 Tools
🪛 GitHub Actions: Code Check

[error] 1-17: Ruff reported multiple issues: F401 imported but unused (Optional), W391 extra newline at end of file, and other formatting problems on this file. Fix imports, trailing newline, and spacing.

🤖 Prompt for AI Agents
In app/schemas/competitions/seasons.py around lines 1 to 17, remove the unused
import "Optional" from the typing imports and ensure the file ends with a single
trailing newline (no extra blank lines) so the linter stops flagging an unused
import and trailing-newline issues; save and re-run the pipeline to confirm the
fixes.

Comment on lines +54 to +65
def pad_list(lst, length):
"""
Pad a list to a specified length with None values.

Args:
lst: The list to pad.
length: The target length for the list.

Returns:
The original list if it's already the target length, otherwise a padded list.
"""
return lst if len(lst) == length else lst + [None] * (length - len(lst))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Handle lists longer than base_length.

If any extracted list is longer than base_length, the padding logic returns the original list unchanged (since [None] * negative is empty). This would cause a length mismatch in the zip. Consider truncating:

-            return lst if len(lst) == length else lst + [None] * (length - len(lst))
+            if len(lst) >= length:
+                return lst[:length]
+            return lst + [None] * (length - len(lst))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def pad_list(lst, length):
"""
Pad a list to a specified length with None values.
Args:
lst: The list to pad.
length: The target length for the list.
Returns:
The original list if it's already the target length, otherwise a padded list.
"""
return lst if len(lst) == length else lst + [None] * (length - len(lst))
def pad_list(lst, length):
"""
Pad a list to a specified length with None values.
Args:
lst: The list to pad.
length: The target length for the list.
Returns:
The original list if it's already the target length, otherwise a padded list.
"""
if len(lst) >= length:
return lst[:length]
return lst + [None] * (length - len(lst))
🤖 Prompt for AI Agents
In app/services/competitions/search.py around lines 54 to 65, pad_list currently
returns the original list when it's longer than the target length which breaks
downstream zipping; change it so that if len(lst) > length it returns
lst[:length], if len(lst) < length it returns lst + [None] * (length -
len(lst)), and if len(lst) == length return lst; ensure you handle non-positive
length defensively (e.g., treat length <= 0 as an empty list) to avoid negative
repeats.

Comment thread app/services/competitions/seasons.py Outdated
Comment thread app/services/competitions/seasons.py
Comment thread app/utils/xpath.py Outdated
Comment thread CHANGELOG.md
Comment on lines +5 to +35
## [Unreleased]

### Added
- **National Teams Support**: Full support for national teams across all club endpoints
- Added `isNationalTeam` boolean field to Club Profile response schema
- Club Profile endpoint now automatically detects and indicates if a club is a national team
- Club Players endpoint now supports fetching players from national teams
- Club Competitions endpoint supports national teams

- **New Endpoint**: `/clubs/{club_id}/competitions`
- Retrieves all competitions a club participates in for a given season
- Supports both regular clubs and national teams
- Returns competition ID, name, and URL for each competition
- Defaults to current season if `season_id` is not provided

### Changed
- **Club Profile Schema**: Made several fields optional to accommodate national teams
- `stadium_name`, `stadium_seats`, `current_transfer_record` are now optional
- `squad.national_team_players` is now optional
- `league` fields can now be `None` for national teams

- **Club Players Endpoint**: Enhanced to handle different HTML structures
- Automatically detects national teams and uses appropriate parsing logic
- Handles different table structures between clubs and national teams
- Ensures all player data lists are properly aligned for correct parsing

### Technical Details
- Updated XPath expressions to support both club and national team HTML structures
- Added intelligent detection logic for national teams based on HTML structure
- Improved error handling and data validation for edge cases
- All changes maintain backward compatibility with existing club endpoints
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Missing documentation for competition seasons endpoint.

The PR objectives describe a new /competitions/{competition_id}/seasons endpoint, but the changelog only documents /clubs/{club_id}/competitions. Consider adding an entry for the competition seasons feature:

 - **New Endpoint**: `/clubs/{club_id}/competitions`
   - Retrieves all competitions a club participates in for a given season
   - Supports both regular clubs and national teams
   - Returns competition ID, name, and URL for each competition
   - Defaults to current season if `season_id` is not provided

+- **New Endpoint**: `/competitions/{competition_id}/seasons`
+  - Retrieves all available seasons for a competition
+  - Returns season ID, name, start year, and end year
+  - Seasons sorted by start year descending
+  - Handles cross-year (25/26), single-year (2025), and century-crossing (99/00) formats
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## [Unreleased]
### Added
- **National Teams Support**: Full support for national teams across all club endpoints
- Added `isNationalTeam` boolean field to Club Profile response schema
- Club Profile endpoint now automatically detects and indicates if a club is a national team
- Club Players endpoint now supports fetching players from national teams
- Club Competitions endpoint supports national teams
- **New Endpoint**: `/clubs/{club_id}/competitions`
- Retrieves all competitions a club participates in for a given season
- Supports both regular clubs and national teams
- Returns competition ID, name, and URL for each competition
- Defaults to current season if `season_id` is not provided
### Changed
- **Club Profile Schema**: Made several fields optional to accommodate national teams
- `stadium_name`, `stadium_seats`, `current_transfer_record` are now optional
- `squad.national_team_players` is now optional
- `league` fields can now be `None` for national teams
- **Club Players Endpoint**: Enhanced to handle different HTML structures
- Automatically detects national teams and uses appropriate parsing logic
- Handles different table structures between clubs and national teams
- Ensures all player data lists are properly aligned for correct parsing
### Technical Details
- Updated XPath expressions to support both club and national team HTML structures
- Added intelligent detection logic for national teams based on HTML structure
- Improved error handling and data validation for edge cases
- All changes maintain backward compatibility with existing club endpoints
## [Unreleased]
### Added
- **National Teams Support**: Full support for national teams across all club endpoints
- Added `isNationalTeam` boolean field to Club Profile response schema
- Club Profile endpoint now automatically detects and indicates if a club is a national team
- Club Players endpoint now supports fetching players from national teams
- Club Competitions endpoint supports national teams
- **New Endpoint**: `/clubs/{club_id}/competitions`
- Retrieves all competitions a club participates in for a given season
- Supports both regular clubs and national teams
- Returns competition ID, name, and URL for each competition
- Defaults to current season if `season_id` is not provided
- **New Endpoint**: `/competitions/{competition_id}/seasons`
- Retrieves all available seasons for a competition
- Returns season ID, name, start year, and end year
- Seasons sorted by start year descending
- Handles cross-year (25/26), single-year (2025), and century-crossing (99/00) formats
### Changed
- **Club Profile Schema**: Made several fields optional to accommodate national teams
- `stadium_name`, `stadium_seats`, `current_transfer_record` are now optional
- `squad.national_team_players` is now optional
- `league` fields can now be `None` for national teams
- **Club Players Endpoint**: Enhanced to handle different HTML structures
- Automatically detects national teams and uses appropriate parsing logic
- Handles different table structures between clubs and national teams
- Ensures all player data lists are properly aligned for correct parsing
### Technical Details
- Updated XPath expressions to support both club and national team HTML structures
- Added intelligent detection logic for national teams based on HTML structure
- Improved error handling and data validation for edge cases
- All changes maintain backward compatibility with existing club endpoints

@eskobar95 eskobar95 force-pushed the feature/competition-seasons-endpoint branch from fbeced8 to 93929b3 Compare December 1, 2025 17:44
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
app/services/competitions/seasons.py (1)

61-70: 2-digit year parsing heuristic may misparse historical seasons.

The threshold of 90 means seasons like "85/86" would be parsed as 2085–2086 instead of 1985–1986. Consider using a dynamic pivot based on current year (e.g., years > current_year % 100 + 10 map to 1900s).

🧹 Nitpick comments (1)
app/services/clubs/players.py (1)

115-168: Padding logic provides robustness, but be aware of silent truncation.

The normalization pattern (list + [default] * base_length)[:base_length] handles short lists well but will silently truncate lists longer than base_length. This is likely acceptable as a defensive measure against malformed HTML, but consider logging when truncation occurs for debugging purposes.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbeced8 and 93929b3.

📒 Files selected for processing (7)
  • app/api/endpoints/competitions.py (2 hunks)
  • app/schemas/competitions/__init__.py (1 hunks)
  • app/schemas/competitions/search.py (1 hunks)
  • app/schemas/competitions/seasons.py (1 hunks)
  • app/services/clubs/players.py (1 hunks)
  • app/services/competitions/seasons.py (1 hunks)
  • app/utils/xpath.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • app/schemas/competitions/init.py
  • app/schemas/competitions/seasons.py
🧰 Additional context used
🧬 Code graph analysis (3)
app/api/endpoints/competitions.py (2)
app/services/competitions/seasons.py (1)
  • get_competition_seasons (282-294)
app/schemas/competitions/seasons.py (1)
  • CompetitionSeasons (11-14)
app/services/clubs/players.py (3)
app/utils/xpath.py (6)
  • Clubs (105-191)
  • Clubs (227-240)
  • Players (1-102)
  • Players (148-184)
  • Past (179-184)
  • Present (172-177)
app/services/base.py (1)
  • get_list_by_xpath (132-150)
app/utils/utils.py (1)
  • safe_regex (65-84)
app/services/competitions/seasons.py (4)
app/services/base.py (4)
  • TransfermarktBase (16-223)
  • request_url_page (104-117)
  • raise_exception_if_not_found (119-130)
  • get_text_by_xpath (152-206)
app/utils/utils.py (1)
  • trim (49-62)
app/utils/xpath.py (5)
  • Competitions (186-191)
  • Competitions (194-240)
  • Profile (18-57)
  • Profile (106-138)
  • Profile (195-214)
app/api/endpoints/competitions.py (1)
  • get_competition_seasons (28-31)
🔇 Additional comments (7)
app/schemas/competitions/search.py (1)

9-11: Relaxing country/clubs/players to optional fits partial-data/national-team cases

Making these fields optional with a None default is consistent with search results that may not have those counts or a country (e.g., national teams or incomplete data). Just double‑check key consumers of CompetitionSearchResult (sorting, aggregations, formatting) to ensure they don’t assume clubs/players are always non‑null integers or country always a non‑empty string, and handle None explicitly where needed.

app/api/endpoints/competitions.py (1)

27-31: LGTM!

The new endpoint follows the established pattern used by get_competition_clubs, with appropriate response model and service integration.

app/services/competitions/seasons.py (1)

282-294: LGTM!

The public API follows the established pattern used by other Transfermarkt services, with clear documentation and consistent response structure.

app/utils/xpath.py (4)

154-164: LGTM – Union patterns correctly support both club and national team structures.

The XPath unions allow graceful fallback between posrela structure (regular clubs) and alternative patterns (national teams). The formatting is clean and within line limits.


186-192: LGTM!

The new Clubs.Competitions nested class provides well-structured XPath constants for competition record extraction.


198-214: Season dropdown selectors look good.

The multi-line formatting addresses previous line-length concerns, and the fallback patterns provide robustness for varying HTML structures.


227-240: LGTM – National team support added to club URLs and names.

The inline comments clearly document the purpose of each union branch.

Comment on lines +59 to +106

# Detect if this is a national team (no posrela structure)
is_national_team = len(self.page.xpath(Clubs.Players.PAGE_INFOS)) == 0 and len(players_names) > 0

# Handle national teams: extract data from rows directly
if is_national_team:
# Get all player rows (may contain duplicates for home/away)
all_player_rows = self.page.xpath(
"//div[@id='yw1']//tbody//tr[.//td[@class='hauptlink']//a[contains(@href, '/profil/spieler')]]",
)

# Extract data from rows, matching with player URLs to avoid duplicates
players_positions = []
players_dobs_raw = []
seen_urls = set()

for url in self.get_list_by_xpath(Clubs.Players.URLS):
# Skip if we've already processed this URL
if url in seen_urls:
continue
seen_urls.add(url)

# Find first row containing this player URL
for row in all_player_rows:
row_urls = row.xpath(".//td[@class='hauptlink']//a[contains(@href, '/profil/spieler')]/@href")
if url in row_urls:
tds = row.xpath(".//td")
# Position is in TD[4] for national teams
if len(tds) > 4:
pos_text = "".join(tds[4].xpath(".//text()")).strip()
players_positions.append(pos_text if pos_text else None)
else:
players_positions.append(None)

# DOB/Age is in TD[5] for national teams
if len(tds) > 5:
dob_text = "".join(tds[5].xpath(".//text()")).strip()
players_dobs_raw.append(dob_text if dob_text else None)
else:
players_dobs_raw.append(None)
break
else:
# URL not found in any row
players_positions.append(None)
players_dobs_raw.append(None)

players_dobs = [safe_regex(dob_age, REGEX_DOB, "dob") for dob_age in players_dobs_raw]
players_ages = [safe_regex(dob_age, REGEX_DOB, "age") for dob_age in players_dobs_raw]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential data misalignment between players_ids and national team parsed lists.

When is_national_team is true, the loop (lines 75-103) skips duplicate URLs via seen_urls, producing players_positions and players_dobs_raw with only unique entries. However, players_ids (line 56) is built from all URLs before this check, potentially including duplicates.

This causes players_ids to have more elements than players_positions/players_dobs, leading to incorrect player-to-attribute mapping even after padding normalization.

Consider deduplicating players_ids for national teams:

+        # For national teams, deduplicate players_ids to match positions/dobs
+        if is_national_team:
+            unique_ids = []
+            seen_ids = set()
+            for pid in players_ids:
+                if pid not in seen_ids:
+                    seen_ids.add(pid)
+                    unique_ids.append(pid)
+            players_ids = unique_ids
+
         # Ensure all lists have the same length as players_ids
         base_length = len(players_ids)

Alternatively, rebuild players_ids using the same seen_urls set from the national team parsing loop.

Committable suggestion skipped: line range outside the PR's diff.

Comment thread app/services/competitions/seasons.py Outdated
- Detect national team competitions (FIWC, EURO, COPA, AFAC, GOCU, AFCN)
- Use /teilnehmer/pokalwettbewerb/ URL for national team competitions
- Handle season_id correctly (year-1 for national teams in URL)
- Add XPath expressions for participants table
- Limit participants to expected tournament size to exclude non-qualified teams
- Make season_id optional in CompetitionClubs schema
- Update Dockerfile PYTHONPATH configuration
@eskobar95 eskobar95 force-pushed the feature/competition-seasons-endpoint branch from 27b61c7 to c3f2410 Compare December 1, 2025 20:45
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
app/services/competitions/seasons.py (1)

108-142: Consider extracting shared year-normalization logic to reduce duplication.

The 2-digit year to full year conversion logic (lines 128-135) duplicates the logic in __parse_season_to_years (lines 62-69). Extracting a helper method would improve maintainability and prevent future divergence.

def __normalize_two_digit_year(self, two_digit: int) -> int:
    """Convert a 2-digit year to a full year using the 90-threshold heuristic."""
    if two_digit >= 90:
        return 1900 + two_digit
    return 2000 + two_digit

Then use it in both methods:

-                    if start_num >= 90:
-                        return str(1900 + start_num)
-                    else:
-                        return str(2000 + start_num)
+                    return str(self.__normalize_two_digit_year(start_num))
app/services/competitions/clubs.py (1)

162-190: Minor: Unreachable fallback code.

At lines 176-177, the fallback if url_season_id: inside the except ValueError block is unreachable because int(url_season_id) at line 172 would only fail if url_season_id is not a valid integer, but line 169 already checked url_season_id.isdigit(). The try/except is still reasonable for defensive coding, but the inner fallback condition is redundant.

             if url_season_id and url_season_id.isdigit():
                 try:
                     # Convert back to tournament year (add 1)
                     tournament_year = str(int(url_season_id) + 1)
                     self.response["seasonId"] = tournament_year
                 except ValueError:
-                    # Fallback to original season_id if conversion fails
-                    if url_season_id:
-                        self.response["seasonId"] = url_season_id
+                    # This should not happen since isdigit() was checked
+                    pass
             elif self.season_id:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93929b3 and 25cbc7b.

📒 Files selected for processing (7)
  • .gitignore (1 hunks)
  • Dockerfile (1 hunks)
  • app/api/endpoints/competitions.py (2 hunks)
  • app/schemas/competitions/clubs.py (2 hunks)
  • app/services/competitions/clubs.py (4 hunks)
  • app/services/competitions/seasons.py (1 hunks)
  • app/utils/xpath.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • .gitignore
  • app/api/endpoints/competitions.py
🧰 Additional context used
🧬 Code graph analysis (2)
app/services/competitions/clubs.py (3)
app/services/base.py (3)
  • TransfermarktBase (16-223)
  • get_list_by_xpath (132-150)
  • get_text_by_xpath (152-206)
app/utils/utils.py (2)
  • extract_from_url (19-46)
  • trim (49-62)
app/utils/xpath.py (7)
  • Competitions (186-191)
  • Competitions (194-265)
  • Clubs (105-191)
  • Clubs (232-265)
  • Profile (18-57)
  • Profile (106-138)
  • Profile (195-219)
app/services/competitions/seasons.py (4)
app/services/base.py (4)
  • TransfermarktBase (16-223)
  • request_url_page (104-117)
  • raise_exception_if_not_found (119-130)
  • get_text_by_xpath (152-206)
app/utils/utils.py (1)
  • trim (49-62)
app/utils/xpath.py (5)
  • Competitions (186-191)
  • Competitions (194-265)
  • Profile (18-57)
  • Profile (106-138)
  • Profile (195-219)
app/api/endpoints/competitions.py (1)
  • get_competition_seasons (28-31)
🔇 Additional comments (12)
Dockerfile (1)

4-4: Clean PYTHONPATH configuration for containerized environment.

The change to set PYTHONPATH=/app explicitly is appropriate for a containerized application. This is cleaner than concatenating with $PYTHONPATH, which would be empty in the base Python image anyway. The explicit absolute path ensures consistent module discovery for the new services, schemas, and endpoints added in this PR.

Verify that the import paths work correctly by ensuring that app/main.py and all new modules can properly resolve imports like from app.services.competitions.seasons import ... with this PYTHONPATH setup.

app/schemas/competitions/clubs.py (1)

1-15: LGTM!

Making season_id optional with Optional[str] = None properly accommodates national team competitions where the season identifier may not be available or applicable.

app/utils/xpath.py (4)

154-164: LGTM!

The union-based XPath expressions appropriately handle multiple HTML structures for player data extraction, providing fallback paths for different page layouts.


186-192: LGTM!

The new Clubs.Competitions nested class provides well-structured XPath selectors for extracting competition records and links.


197-219: LGTM!

The season-related XPath selectors provide comprehensive fallback patterns to handle various HTML structures on competition pages. The multi-line string concatenation is properly formatted.


232-265: LGTM!

The updated XPath selectors with union patterns properly handle both regular clubs and national teams. The inline comments explaining the rationale are helpful for maintainability.

Minor consideration: The PARTICIPANTS_URLS and PARTICIPANTS_NAMES selectors depend on the heading text "Clubs starting into tournament", which could break if Transfermarkt changes the heading. This is acceptable for scraping but worth noting for future maintenance.

app/services/competitions/seasons.py (3)

144-223: LGTM with a note on the robust fallback strategy.

The multiple XPath fallback patterns provide good resilience against HTML structure variations. The cascading approach (table cell → dropdown → alternative patterns → generic text extraction) is well-organized.


242-283: LGTM!

The season filtering logic properly handles deduplication, filters out non-season text entries, and validates season formats. Sorting by startYear descending provides a good user experience with the most recent seasons first.


285-297: LGTM!

The public API method is clean and returns a well-structured response with competition metadata and the parsed seasons list.

app/services/competitions/clubs.py (3)

8-16: LGTM!

The NATIONAL_TEAM_COMPETITIONS mapping provides a clean way to identify and handle national team competitions. The URL slugs align with Transfermarkt's URL structure.


126-149: LGTM!

The name cleaning logic handles various page title formats appropriately. The cascading string replacements and the fallback in the next() call provide good robustness.


70-85: LGTM!

The __post_init__ method cleanly branches between national team and regular league URL construction, maintaining good separation of concerns.

Comment thread app/services/competitions/clubs.py Outdated
Comment on lines +53 to +95
if is_cross_year:
# Cross-year format: "25/26" -> start_year: 2025, end_year: 2026
# Handle both "25/26" and "2024/25" formats, and old seasons like "99/00"
parts = season_name.split("/")
if len(parts) == 2:
start_part = parts[0].strip()
end_part = parts[1].strip()

# Parse start year
if len(start_part) == 2:
start_num = int(start_part)
# If start is >= 90, it's likely 1900s (e.g., "99/00" = 1999/2000)
# Otherwise assume 2000s (e.g., "25/26" = 2025/2026)
if start_num >= 90:
start_year = 1900 + start_num
else:
start_year = 2000 + start_num
else:
start_year = int(start_part)

# Parse end year
if len(end_part) == 2:
end_num = int(end_part)
start_num = int(start_part) if len(start_part) == 2 else (start_year % 100)
# If end is less than start (e.g., "99/00"), it means we crossed century
# So if start is 1999, end should be 2000
if end_num < start_num:
# Crossed century boundary - end is in next century
if start_year >= 2000:
end_year = 2000 + end_num
else:
# Start is in 1900s, end is in 2000s
end_year = 2000 + end_num
else:
# Same century as start
if start_year >= 2000:
end_year = 2000 + end_num
else:
end_year = 1900 + end_num
else:
end_year = int(end_part)

return (start_year, end_year)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing exception handling for malformed cross-year season strings.

The int() calls at lines 63, 71, 75, 76, and 93 can raise ValueError if the season string contains non-numeric parts (e.g., "25/abc"). While the single-year path (lines 98-103) has a try/except, the cross-year path does not.

Apply this diff to wrap the cross-year parsing in a try-except:

         if is_cross_year:
             # Cross-year format: "25/26" -> start_year: 2025, end_year: 2026
             # Handle both "25/26" and "2024/25" formats, and old seasons like "99/00"
             parts = season_name.split("/")
             if len(parts) == 2:
-                start_part = parts[0].strip()
-                end_part = parts[1].strip()
-
-                # Parse start year
-                if len(start_part) == 2:
-                    start_num = int(start_part)
+                try:
+                    start_part = parts[0].strip()
+                    end_part = parts[1].strip()
+
+                    # Parse start year
+                    if len(start_part) == 2:
+                        start_num = int(start_part)
+                        # ... rest of parsing logic ...
+                    # ... complete cross-year parsing ...
+                    return (start_year, end_year)
+                except ValueError:
+                    return (None, None)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In app/services/competitions/seasons.py around lines 53 to 95, the cross-year
parsing block performs multiple int() conversions that can raise ValueError for
malformed inputs; wrap the entire cross-year parsing logic in a try/except
ValueError block, catch ValueError, handle it the same way as the single-year
path (e.g., log or swallow the error and return None / appropriate failure
value), and ensure no uncaught ValueError propagates out of this function.

- Add length validation for ids and names before zip() to prevent silent data loss
- Raise descriptive ValueError with logging if ids and names mismatch
- Simplify seasonId assignment logic for national teams
- Remove unnecessary try/except block (isdigit() prevents ValueError)
- Clean up unreachable fallback code
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
app/services/competitions/clubs.py (1)

106-119: Hardcoded tournament sizes remain outdated.

This issue was flagged in a previous review and remains unaddressed. The values for FIWC (32) and COPA (12) don't reflect current/upcoming tournament formats:

  • World Cup expands to 48 teams in 2026
  • Copa America has been 16 teams in recent editions (2016, 2024)

Consider making this configuration-aware or deriving expected size dynamically, since tournament formats change over time:

             expected_sizes = {
-                "FIWC": 32,  # World Cup has 32 participants
+                "FIWC": 48,  # World Cup: 48 teams from 2026
                 "EURO": 24,  # EURO has 24 participants
-                "COPA": 12,  # Copa America typically has 12 participants
+                "COPA": 16,  # Copa America: 16 teams (recent format)
                 "AFAC": 24,  # Asian Cup has 24 participants
                 "GOCU": 16,  # Gold Cup has 16 participants
                 "AFCN": 24,  # Africa Cup has 24 participants
             }
🧹 Nitpick comments (2)
app/services/competitions/clubs.py (2)

60-66: Redundant try/except block.

The isdigit() check on line 61 already guarantees the string can be parsed as an integer, so int(url_season_id) cannot raise ValueError. The try/except is dead code.

         url_season_id = self.season_id
         if url_season_id and url_season_id.isdigit():
-            try:
-                year = int(url_season_id)
-                url_season_id = str(year - 1)
-            except ValueError:
-                pass
+            url_season_id = str(int(url_season_id) - 1)

148-161: Fragile string manipulation for name cleaning.

The sequential string replacements and splits depend on specific text patterns from Transfermarkt pages, which could break if the site changes its formatting. The magic number 3 on line 158 is unclear.

Consider documenting the expected input formats and using a more explicit approach:

-            # Take the part that doesn't contain "Participants" or is the longest
-            name = next((p for p in parts if "Participants" not in p and len(p) > 3), parts[0])
+            # Filter out short fragments and "Participants" text
+            MIN_NAME_LENGTH = 4  # Exclude fragments like "20", "WC", etc.
+            name = next((p for p in parts if "Participants" not in p and len(p) >= MIN_NAME_LENGTH), parts[0])

Also, the trim(name) on line 150 is redundant since get_text_by_xpath already trims the result.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25cbc7b and c412735.

📒 Files selected for processing (1)
  • app/services/competitions/clubs.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
app/services/competitions/clubs.py (3)
app/services/base.py (3)
  • TransfermarktBase (16-223)
  • get_list_by_xpath (132-150)
  • get_text_by_xpath (152-206)
app/utils/utils.py (2)
  • extract_from_url (19-46)
  • trim (49-62)
app/utils/xpath.py (2)
  • Competitions (186-191)
  • Competitions (194-265)
🔇 Additional comments (3)
app/services/competitions/clubs.py (3)

11-19: LGTM!

The national team competition mapping is clear and well-structured. The URL slugs align with Transfermarkt's URL patterns.


127-136: Good defensive validation.

The length mismatch check with detailed error logging is a solid safeguard against silent data corruption from XPath parsing issues.


175-200: Season ID handling is well-structured with appropriate fallbacks.

The year conversion logic correctly inverts the URL construction (subtracts 1 for URL, adds 1 for response). The fallback chain provides resilience.

One consideration: line 186-188 uses url_season_id as-is when it exists but isn't a digit. Verify this is intentional and won't propagate unexpected values.

- Add tournament size configuration to Settings class with environment variable support
- Replace hardcoded dict with settings.get_tournament_size() method
- Add warning logging when tournament size is not configured (instead of silent truncation)
- Proceed without truncation when size is unavailable (no silent data loss)
- Add validation for tournament sizes (must be positive integers)
- Add comprehensive unit tests for both configured and fallback paths
- Update README.md with new environment variables documentation

This prevents silent truncation when tournament sizes change (e.g., World Cup expanding to 48)
and allows easy configuration via environment variables.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
app/services/competitions/clubs.py (1)

100-127: CRITICAL: Outdated tournament size defaults will truncate valid participant data.

The past review comment correctly identified that tournament sizes are outdated. While the PR moved these to configuration (which is good), the default values remain incorrect:

  • World Cup (FIWC): Default is 32, but expanding to 48 teams starting in 2026. This will truncate 16 valid participants for 2026+ tournaments.
  • Copa America (COPA): Default is 12, but recent editions (2016, 2024) have 16 teams. This will truncate 4 valid participants.

Update the defaults in app/settings.py (lines 16-18) and README.md (lines 63, 65):

-    TOURNAMENT_SIZE_FIWC: Optional[int] = Field(default=32, description="World Cup expected participants")
+    TOURNAMENT_SIZE_FIWC: Optional[int] = Field(default=48, description="World Cup expected participants")
-    TOURNAMENT_SIZE_COPA: Optional[int] = Field(default=12, description="Copa America expected participants")
+    TOURNAMENT_SIZE_COPA: Optional[int] = Field(default=16, description="Copa America expected participants")

Consider making truncation more robust.

The current truncation simply takes the first N items (urls[:expected_size]), assuming they appear in the correct order. If the HTML structure changes or the XPath captures items differently, this could truncate actual participants while keeping non-participants. Consider:

  1. Only truncating when a clear "not qualified" marker is present in the HTML
  2. Deriving the expected size from the page content rather than hardcoding
  3. Adding validation that the truncated items match expected patterns (e.g., team URLs contain expected path segments)
README.md (1)

63-69: Update default values to match current tournament formats.

The documented defaults are outdated and should be updated along with the code changes in app/settings.py:

-| `TOURNAMENT_SIZE_FIWC`   | Expected number of participants for World Cup              | `32`         |
+| `TOURNAMENT_SIZE_FIWC`   | Expected number of participants for World Cup              | `48`         |
-| `TOURNAMENT_SIZE_COPA`   | Expected number of participants for Copa America           | `12`         |
+| `TOURNAMENT_SIZE_COPA`   | Expected number of participants for Copa America           | `16`         |

This aligns with the critical issue raised in the app/services/competitions/clubs.py review.

tests/competitions/test_competitions_clubs.py (1)

94-104: Update test assertions when fixing tournament size defaults.

Lines 98 and 100 assert the current (outdated) default values. When you update the defaults in app/settings.py to FIWC=48 and COPA=16, remember to update these test assertions:

-    assert settings.get_tournament_size("FIWC") == 32
+    assert settings.get_tournament_size("FIWC") == 48
-    assert settings.get_tournament_size("COPA") == 12
+    assert settings.get_tournament_size("COPA") == 16
app/settings.py (1)

16-21: CRITICAL: Update default tournament sizes to match current formats.

The default values are outdated and will cause data truncation:

-    TOURNAMENT_SIZE_FIWC: Optional[int] = Field(default=32, description="World Cup expected participants")
+    TOURNAMENT_SIZE_FIWC: Optional[int] = Field(default=48, description="World Cup expected participants")
-    TOURNAMENT_SIZE_COPA: Optional[int] = Field(default=12, description="Copa America expected participants")
+    TOURNAMENT_SIZE_COPA: Optional[int] = Field(default=16, description="Copa America expected participants")

Context:

  • World Cup expanded to 48 teams starting in 2026
  • Copa America recent editions (2016, 2024) have 16 teams

This is the root cause of the truncation issue flagged across multiple files.

🧹 Nitpick comments (1)
app/services/competitions/clubs.py (1)

178-204: Season ID extraction logic is correct but could be simplified.

The logic correctly handles the year conversion for national teams (subtracts 1 for URL, adds 1 back for response). The fallback chain is reasonable.

Consider simplifying the conditional nesting for better readability:

if self.is_national_team:
    # Extract URL season_id and convert to tournament year (add 1)
    url_season_id = extract_from_url(
        self.get_text_by_xpath(Competitions.Profile.URL),
        "season_id",
    )
    if url_season_id and url_season_id.isdigit():
        self.response["seasonId"] = str(int(url_season_id) + 1)
    elif url_season_id:
        self.response["seasonId"] = url_season_id
    elif self.season_id:
        self.response["seasonId"] = self.season_id
else:
    # For regular leagues, extract season_id normally
    season_id = extract_from_url(
        self.get_text_by_xpath(Competitions.Profile.URL),
        "season_id",
    )
    if season_id:
        self.response["seasonId"] = season_id
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c412735 and c6488f8.

📒 Files selected for processing (4)
  • README.md (1 hunks)
  • app/services/competitions/clubs.py (4 hunks)
  • app/settings.py (2 hunks)
  • tests/competitions/test_competitions_clubs.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
app/services/competitions/clubs.py (3)
app/utils/utils.py (2)
  • extract_from_url (19-46)
  • trim (49-62)
app/utils/xpath.py (7)
  • Competitions (186-191)
  • Competitions (194-265)
  • Clubs (105-191)
  • Clubs (232-265)
  • Profile (18-57)
  • Profile (106-138)
  • Profile (195-219)
app/settings.py (1)
  • get_tournament_size (38-56)
tests/competitions/test_competitions_clubs.py (1)
app/settings.py (2)
  • Settings (7-56)
  • get_tournament_size (38-56)
🔇 Additional comments (6)
app/services/competitions/clubs.py (2)

1-20: LGTM!

The imports and national team competition mapping are well-structured and clearly documented.


131-138: Excellent defensive validation!

The length mismatch check prevents silent data loss and provides clear diagnostics. This is a great safeguard against XPath parsing issues.

tests/competitions/test_competitions_clubs.py (2)

42-73: Well-structured tests with good coverage.

The tests for tournament size configuration, warning logging, and environment variable overrides are comprehensive and properly isolated using mocks.


107-113: Good validation testing.

The tests properly verify that invalid tournament sizes (negative and zero) are rejected with appropriate error messages.

app/settings.py (2)

23-36: Excellent validation logic.

The validator ensures tournament sizes are positive when provided, with clear error messages. The use of @field_validator with multiple fields is efficient.


38-56: Clean and well-documented accessor method.

The get_tournament_size method provides a clear interface for retrieving tournament sizes with proper type hints and documentation. The mapping approach is straightforward and maintainable.

Comment on lines +48 to +72
def __get_national_team_url(self) -> str:
"""
Construct the URL for national team competition participants page.

Returns:
str: The URL for the participants page.
"""
slug = NATIONAL_TEAM_COMPETITIONS.get(self.competition_id)
if not slug:
raise ValueError(f"Unknown national team competition: {self.competition_id}")

# For national teams, season_id in URL is the year BEFORE the tournament
# E.g., 2006 World Cup uses saison_id=2005
url_season_id = self.season_id
if url_season_id and url_season_id.isdigit():
try:
year = int(url_season_id)
url_season_id = str(year - 1)
except ValueError:
pass

return (
f"https://www.transfermarkt.com/{slug}/teilnehmer/"
f"pokalwettbewerb/{self.competition_id}/saison_id/{url_season_id}"
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, locate the file and examine the context around it
find . -type f -name "clubs.py" | head -5

Repository: eskoubar95/transfermarkt-api

Length of output: 173


🏁 Script executed:

#!/bin/bash
# Find the NATIONAL_TEAM_COMPETITIONS definition
rg -n "NATIONAL_TEAM_COMPETITIONS" --type py -B 2 -A 10

Repository: eskoubar95/transfermarkt-api

Length of output: 2563


🏁 Script executed:

#!/bin/bash
# Check for tests related to this method
fd -e "test*.py" -o -e "*_test.py" | xargs rg -l "__get_national_team_url\|national_team" 2>/dev/null | head -5

Repository: eskoubar95/transfermarkt-api

Length of output: 188


🏁 Script executed:

#!/bin/bash
# Examine the full context of the method and surrounding code
rg -n "def __get_national_team_url" --type py -B 5 -A 30

Repository: eskoubar95/transfermarkt-api

Length of output: 2781


🏁 Script executed:

#!/bin/bash
# Search for tests related to competitions/clubs
find . -type f -name "*.py" -path "*/test*" | xargs rg -l "clubs\|competition" 2>/dev/null | head -10

Repository: eskoubar95/transfermarkt-api

Length of output: 54


🏁 Script executed:

#!/bin/bash
# Search for any test files in the repo
fd -e "test*.py" -o -e "*_test.py"

Repository: eskoubar95/transfermarkt-api

Length of output: 188


🏁 Script executed:

#!/bin/bash
# Look for any comments or documentation about URL construction patterns
rg -n "saison_id\|season_id.*URL\|year.*tournament" --type py -B 2 -A 2

Repository: eskoubar95/transfermarkt-api

Length of output: 54


🏁 Script executed:

#!/bin/bash
# Check if there are any other methods that construct URLs to understand the pattern
rg -n "def __get.*url\|transfermarkt.com" --type py -B 1 -A 8

Repository: eskoubar95/transfermarkt-api

Length of output: 54


🏁 Script executed:

#!/bin/bash
# Try to understand if there's a specific pattern by checking the entire file structure
wc -l app/services/competitions/clubs.py
cat -n app/services/competitions/clubs.py | head -100

Repository: eskoubar95/transfermarkt-api

Length of output: 4417


Remove the redundant try/except block.

Lines 63-67 are unreachable. The isdigit() check on line 62 guarantees the string contains only digits, so int() cannot raise ValueError.

         url_season_id = self.season_id
         if url_season_id and url_season_id.isdigit():
-            try:
-                year = int(url_season_id)
-                url_season_id = str(year - 1)
-            except ValueError:
-                pass
+            year = int(url_season_id)
+            url_season_id = str(year - 1)

Regarding the year-1 URL logic: The pattern is consistently applied to all six national team competitions (FIWC, EURO, COPA, AFAC, GOCU, AFCN) as documented. Verification that this pattern applies correctly across different tournament years requires testing against the actual TransferMarkt site, which cannot be done in this environment.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def __get_national_team_url(self) -> str:
"""
Construct the URL for national team competition participants page.
Returns:
str: The URL for the participants page.
"""
slug = NATIONAL_TEAM_COMPETITIONS.get(self.competition_id)
if not slug:
raise ValueError(f"Unknown national team competition: {self.competition_id}")
# For national teams, season_id in URL is the year BEFORE the tournament
# E.g., 2006 World Cup uses saison_id=2005
url_season_id = self.season_id
if url_season_id and url_season_id.isdigit():
try:
year = int(url_season_id)
url_season_id = str(year - 1)
except ValueError:
pass
return (
f"https://www.transfermarkt.com/{slug}/teilnehmer/"
f"pokalwettbewerb/{self.competition_id}/saison_id/{url_season_id}"
)
def __get_national_team_url(self) -> str:
"""
Construct the URL for national team competition participants page.
Returns:
str: The URL for the participants page.
"""
slug = NATIONAL_TEAM_COMPETITIONS.get(self.competition_id)
if not slug:
raise ValueError(f"Unknown national team competition: {self.competition_id}")
# For national teams, season_id in URL is the year BEFORE the tournament
# E.g., 2006 World Cup uses saison_id=2005
url_season_id = self.season_id
if url_season_id and url_season_id.isdigit():
year = int(url_season_id)
url_season_id = str(year - 1)
return (
f"https://www.transfermarkt.com/{slug}/teilnehmer/"
f"pokalwettbewerb/{self.competition_id}/saison_id/{url_season_id}"
)
🤖 Prompt for AI Agents
In app/services/competitions/clubs.py around lines 48 to 72, the try/except that
wraps int(url_season_id) is redundant because url_season_id.isdigit() already
guarantees numeric input; remove the try/except and directly compute the
previous year by converting to int and subtracting one (e.g., url_season_id =
str(int(url_season_id) - 1)) while preserving the existing isdigit() guard and
comments.

- Remove extra HTTP request to fetch club profile just to read isNationalTeam
- Set is_national_team=None to let TransfermarktClubPlayers use DOM heuristics
- Remove broad except Exception that silently swallowed all errors
- Improve performance by eliminating redundant network call
- Players class already has robust DOM-based detection for national teams
- Move datetime and HTTPException imports from method level to module level
- Improves code readability and marginally improves performance
- Follows Python best practices for import organization
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6488f8 and 424b974.

📒 Files selected for processing (1)
  • app/api/endpoints/clubs.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
app/api/endpoints/clubs.py (2)
app/services/clubs/competitions.py (2)
  • TransfermarktClubCompetitions (9-86)
  • get_club_competitions (74-86)
app/schemas/clubs/competitions.py (1)
  • ClubCompetitions (12-15)
🔇 Additional comments (2)
app/api/endpoints/clubs.py (2)

6-6: LGTM!

The import follows the established pattern and is necessary for the new endpoint.


37-41: LGTM!

The new endpoint follows the established pattern in the file perfectly. The implementation is clean, type hints are correct, and the service integration matches the existing endpoints.

Comment on lines +30 to +32
# Let TransfermarktClubPlayers use its DOM heuristics to detect national teams
# This avoids an extra HTTP request and relies on the players page structure
tfmkt = TransfermarktClubPlayers(club_id=club_id, season_id=season_id, is_national_team=None)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find the TransfermarktClubPlayers class definition
find . -type f -name "*.py" | xargs grep -l "class TransfermarktClubPlayers"

Repository: eskoubar95/transfermarkt-api

Length of output: 103


🏁 Script executed:

# Search for the class and its initialization logic
rg -n "class TransfermarktClubPlayers" -A 30

Repository: eskoubar95/transfermarkt-api

Length of output: 2494


🏁 Script executed:

# Look for is_national_team parameter handling
rg -n "is_national_team" -B 3 -A 3

Repository: eskoubar95/transfermarkt-api

Length of output: 7065


🏁 Script executed:

# Check if TransfermarktClubPlayers is a dataclass and see full class definition
head -50 app/services/clubs/players.py

Repository: eskoubar95/transfermarkt-api

Length of output: 2170


🏁 Script executed:

# Check the parent class TransfermarktBase
rg -n "class TransfermarktBase" -A 20

Repository: eskoubar95/transfermarkt-api

Length of output: 2338


🏁 Script executed:

# Check if is_national_team is defined anywhere in TransfermarktClubPlayers
rg -n "is_national_team" app/services/clubs/players.py

Repository: eskoubar95/transfermarkt-api

Length of output: 212


🏁 Script executed:

# Search for tests that call get_club_players or TransfermarktClubPlayers
rg -n "get_club_players\|TransfermarktClubPlayers" tests/ --type py

Repository: eskoubar95/transfermarkt-api

Length of output: 54


🏁 Script executed:

# Check git log to see recent changes to clubs/players.py
git log --oneline -20 app/services/clubs/players.py

Repository: eskoubar95/transfermarkt-api

Length of output: 148


🏁 Script executed:

# Check if there are any **kwargs handling in the dataclass or parent
git show HEAD:app/services/clubs/players.py | head -80

Repository: eskoubar95/transfermarkt-api

Length of output: 3621


Remove the invalid is_national_team=None parameter.

The TransfermarktClubPlayers class does not accept an is_national_team parameter. Passing is_national_team=None will cause a TypeError at runtime. The DOM heuristics for detecting national teams work independently within the __parse_club_players() method (line 61 checks len(self.page.xpath(Clubs.Players.PAGE_INFOS)) == 0), so the parameter is unnecessary. Change line 32 to:

tfmkt = TransfermarktClubPlayers(club_id=club_id, season_id=season_id)
🤖 Prompt for AI Agents
In app/api/endpoints/clubs.py around lines 30 to 32, the call to
TransfermarktClubPlayers incorrectly passes an unsupported parameter
is_national_team=None which will raise a TypeError at runtime; remove that
argument so the constructor is invoked as
TransfermarktClubPlayers(club_id=club_id, season_id=season_id) and rely on the
class's internal DOM heuristics for national team detection.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
app/services/clubs/competitions.py (3)

26-31: Calendar-year default may not match “current season” semantics

self.season_id = str(datetime.now().year) treats the current season as the current calendar year, which can diverge from competition seasons that span years (e.g., 24/25 vs 2025). If API consumers expect “current season” in league terms rather than pure year, consider delegating to a shared “current season” resolver (e.g., from the seasons service or config) instead of datetime.now().year.


38-71: Parsing logic is solid; consider refining type hints and XPath reuse

The overall extraction from RECORD_TABLE via COMPETITION_LINKS is straightforward and defensively coded (skips rows without links, requires both id and name before appending).

A couple of small polish points:

  • The return annotation can be made more precise, e.g. list[dict[str, str]], which helps static checking and downstream consumers.
  • Since app.utils.xpath.Clubs.Competitions already exposes COMPETITION_NAME / COMPETITION_URL, consider using those instead of inlining ".//text()" and "href" here for better centralization if the DOM ever changes.
  • If you ever observe duplicate competitions in this table, you might want to de‑duplicate by comp_id before returning, but that can be deferred until there’s a concrete need.

11-36: Tighten dataclass typing to use Optional for nullable fields

The type hints club_id: str = None and season_id: str = None conflict with the declared types and will trip static analyzers. Use Optional instead: club_id: Optional[str] = None and season_id: Optional[str] = None, or make club_id required without a default since it's essential to form the URL. This aligns with the existing pattern used elsewhere in the codebase (e.g., competitions/seasons.py uses competition_id: Optional[str] = None).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 424b974 and 6b07ff9.

📒 Files selected for processing (1)
  • app/services/clubs/competitions.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
app/services/clubs/competitions.py (4)
app/services/base.py (2)
  • TransfermarktBase (16-223)
  • request_url_page (104-117)
app/utils/utils.py (1)
  • extract_from_url (19-46)
app/utils/xpath.py (2)
  • Competitions (186-191)
  • Competitions (194-265)
app/api/endpoints/clubs.py (1)
  • get_club_competitions (38-41)
🔇 Additional comments (1)
app/services/clubs/competitions.py (1)

73-85: Public getter wiring looks clean and consistent

Populating self.response with id, seasonId, and competitions and returning it directly is clear and matches the intended schema; no issues from a correctness perspective as long as TransfermarktBase initializes response as a dict.

- Move datetime and HTTPException imports from method level to module level
- Improves code readability and marginally improves performance
- Follows Python best practices for import organization
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
app/services/clubs/players.py (3)

86-113: Consider pre-indexing rows by URL for O(n) lookup.

The current nested loop has O(n×m) complexity. For pages with many players, consider building a URL-to-row map first:

-            # Iterate over the same URL list used to create players_ids to maintain order and count
-            for url in players_urls:
-                # Find first row containing this player URL
-                found = False
-                for row in all_player_rows:
-                    row_urls = row.xpath(".//td[@class='hauptlink']//a[contains(@href, '/profil/spieler')]/@href")
-                    if url in row_urls:
+            # Build URL-to-row map for O(1) lookups
+            url_to_row = {}
+            for row in all_player_rows:
+                row_urls = row.xpath(".//td[@class='hauptlink']//a[contains(@href, '/profil/spieler')]/@href")
+                for row_url in row_urls:
+                    if row_url not in url_to_row:
+                        url_to_row[row_url] = row
+
+            # Iterate over the same URL list used to create players_ids to maintain order and count
+            for url in players_urls:
+                row = url_to_row.get(url)
+                if row is not None:
                     tds = row.xpath(".//td")
                     # ... rest of extraction logic

94-106: Consider extracting TD indices as named constants.

The hardcoded indices (4, 5) for position and DOB columns are fragile. For consistency with other XPath selectors in app/utils/xpath.py, consider defining these as constants:

# In Clubs.Players or similar
NATIONAL_TEAM_POSITION_TD_INDEX = 4
NATIONAL_TEAM_DOB_TD_INDEX = 5

125-178: Silent truncation may cause data loss.

The padding pattern (list + [default] * base_length)[:base_length] silently truncates lists longer than base_length. If a list has more elements due to parsing quirks, those extra entries are discarded without warning.

Consider logging a warning when truncation occurs to aid debugging:

+import logging
+
+logger = logging.getLogger(__name__)
+
+def _normalize_list(data: list, base_length: int, default, name: str) -> list:
+    if len(data) > base_length:
+        logger.warning(f"{name} has {len(data)} elements, truncating to {base_length}")
+    return (data + [default] * base_length)[:base_length]

This also reduces the repetitive normalization code throughout the method.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b07ff9 and 085db81.

📒 Files selected for processing (2)
  • app/services/clubs/players.py (4 hunks)
  • app/services/competitions/search.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/services/competitions/search.py
🧰 Additional context used
🧬 Code graph analysis (1)
app/services/clubs/players.py (2)
app/services/base.py (1)
  • get_list_by_xpath (132-150)
app/utils/utils.py (2)
  • extract_from_url (19-46)
  • safe_regex (65-84)
🔇 Additional comments (4)
app/services/clubs/players.py (4)

18-25: LGTM!

The is_national_team optional flag is well-documented with clear behavior for when it's None (falling back to DOM-based detection).


62-64: Good fix for data alignment.

Storing players_urls as the canonical source and deriving players_ids from it ensures consistent ordering across all player attribute lists. This addresses the previously flagged alignment issue.


68-74: LGTM!

The detection logic is clean: uses explicit flag when provided, otherwise falls back to a sensible DOM heuristic. The local variable shadowing of self.is_national_team is intentional and improves readability within the method.


180-215: LGTM!

The zip-based return structure is clean. With proper normalization above, all lists will have matching lengths, ensuring correct player-to-attribute mapping.

- Keep imports at module level in clubs/competitions.py (from CodeRabbit review)
- Preserve is_national_team flag logic in clubs/players.py
- Keep name padding in competitions/search.py
- Add .DS_Store to .gitignore
@eskobar95 eskobar95 force-pushed the feature/competition-seasons-endpoint branch from be652e8 to 1383dc9 Compare December 1, 2025 21:57
- Remove whitespace from blank lines (W293)
- Add missing trailing commas (COM812)
- Split long XPath lines to comply with E501 line length limit
- Format XPath strings to comply with line length
- Format list comprehensions
- Format is_season condition
@eskobar95 eskobar95 merged commit 3e4c6b7 into main Dec 1, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant