Skip to content

feat(ui): add MEF search and preview views#205

Open
rerowep wants to merge 1 commit intorero:stagingfrom
rerowep:wep-ui
Open

feat(ui): add MEF search and preview views#205
rerowep wants to merge 1 commit intorero:stagingfrom
rerowep:wep-ui

Conversation

@rerowep
Copy link
Copy Markdown
Contributor

@rerowep rerowep commented Mar 24, 2026

  • Add dedicated list and preview pages for agents, concepts, places, and all_mef. Improve search UX with sort and size controls, richer facets, and entity-specific result templates with thumbnails.
  • Add all_mef API and alias handling support, including setup/CLI hooks to create the alias and route compatibility for all/mef item access.
  • Refine detail rendering and navigation: hide data.rero.ch identifiedBy values, simplify header/button labels, and preserve return navigation to the originating search page.

Summary by CodeRabbit

  • New Features

    • Unified "all" MEF search API and per-record redirect; list and preview UI pages for agents, concepts, places; /robots.txt added.
    • CLI command and app startup hook to create/manage the combined "all_mef" search alias.
  • Behavior Changes

    • Simplified deleted-entity filtering and adjusted search defaults; monitoring now reports per-category DB/ES counts with guarded diffs.
  • Tests

    • Many new/updated API/UI tests, webpack-manifest test stub, WSGI middleware test, and monitoring/unit test coverage.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds cross-index "all_mef" support: new AllMefSearch class, alias creation/management (init + CLI), API endpoints and UI views for aggregated MEF search/preview, deleted-handling filter/query changes, monitoring enhancements, and many tests/fixtures.

Changes

Cohort / File(s) Summary
Search class
rero_mef/all_mef.py
Adds AllMefSearch (subclass of invenio_search.api.RecordsSearch) with Meta configuring index="all_mef", doc_types=None, fields=("*",), facets={}, and default_filter=None.
Alias management (init & CLI)
rero_mef/ext.py, rero_mef/cli.py
Adds REROMEFAPP.ensure_all_mef_alias(app) invoked from init_app and utils all_mef_alias CLI; both resolve targets mef, concepts_mef, places_mef to concrete indices and create/update the all_mef alias via current_search_client, handling NotFoundError.
API views & routing
rero_mef/views.py
Adds /all/mef search and /all/mef/<pid_value> redirect endpoints, _ensure_all_mef_alias() and _endpoint_for_all_mef_pid(), composes AllMefSearch via and_search_factory, normalizes hits, maps _index→entity, builds links/pagination, and handles NotFoundError/InvalidQueryRESTError.
Theme UI & display logic
rero_mef/theme/views.py
Adds /robots.txt, list and preview routes for agents, concepts, places, and /all; schema-label loaders, display-row builders, identifier/link formatting helpers, ENTITY_CONFIG, and many presentation helpers for templates.
Filters & query changes
rero_mef/filter.py, rero_mef/query.py
Removes multi_exists_filter/deleted_entities_agg; adds and_terms_filter() and all_mef_entity_filter(); simplifies deleted-exclusion in _default_parser to a single exclude("exists", field="deleted").
Monitoring
rero_mef/monitoring/api.py
Adds Monitoring.info_category(mef_class, endpoints) producing per-doc_type DB/ES counts with guarded arithmetic and NotFoundError handling for missing ES indices; also adjusts check_mef() behavior.
Tests & fixtures
tests/api/..., tests/ui/..., tests/api/conftest.py, tests/ui/conftest.py
Adds comprehensive tests: all-mef REST (tests/api/test_all_mef_rest.py), entity serializers, theme UI views, extension alias logic, monitoring, tasks, identifier/formatting unit tests, WSGI API prefix middleware, and an autouse webpack manifest fixture.
REST test updates
tests/api/test_agents_mef_rest.py, tests/api/test_concepts_mef_rest.py, tests/api/test_places_mef_rest.py
Updates aggregation expectations: adds authorized_access_point (and country_associated where applicable) and removes prior deleted_entities assertions.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FlaskApp as Flask App
    participant AliasMgr as Alias Manager
    participant ES as Elasticsearch
    participant RecordsREST as Records-REST

    Client->>FlaskApp: GET /all/mef?q=...&page=1
    FlaskApp->>AliasMgr: _ensure_all_mef_alias()
    AliasMgr->>ES: indices.get_alias / indices.exists (resolve targets)
    ES-->>AliasMgr: alias mappings / exists result
    AliasMgr->>ES: indices.put_alias / indices.update_aliases (create/update all_mef)
    AliasMgr-->>FlaskApp: alias ready
    FlaskApp->>FlaskApp: build search via and_search_factory + AllMefSearch
    FlaskApp->>ES: execute search (AllMefSearch)
    ES-->>FlaskApp: hits (include _index, total)
    FlaskApp->>FlaskApp: normalize hits, map _index→entity, build links/pagination
    Client->>FlaskApp: GET /all/mef/<pid>
    FlaskApp->>RecordsREST: _endpoint_for_all_mef_pid(pid)
    RecordsREST-->>FlaskApp: endpoint or null
    alt endpoint found
        FlaskApp-->>Client: 302 redirect to concrete item URL
    else not found
        FlaskApp-->>Client: 404 JSON
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

Suggested reviewers

  • Garfield-fr
  • jma
  • PascalRepond

Poem

🐰 I hopped through aliases, neat and spry,

I stitched agents, concepts, places in one sky.
I found the indices, tied them in a loop,
made searches quick — then munched a carrot soup.
Hooray for one alias, and one tidy group!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(ui): add MEF search and preview views' accurately summarizes the main change: adding UI pages for MEF search and preview functionality across multiple entity types.
Docstring Coverage ✅ Passed Docstring coverage is 88.15% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
rero_mef/theme/views.py (1)

162-165: Referrer comparison may not match when ports differ.

The check referrer_parts.netloc == request.host compares the full netloc (host:port) against request.host. If the referrer URL lacks an explicit port while request.host includes one (or vice versa), the comparison fails even for same-origin requests. The fallback to list_url is safe, so this is not a bug—just potentially unexpected behavior where same-origin referrers might not be preserved.

♻️ Optional: Compare hostnames only
     if request.referrer:
         referrer_parts = urlparse(request.referrer)
-        if not referrer_parts.netloc or referrer_parts.netloc == request.host:
+        referrer_host = referrer_parts.hostname or ""
+        request_host = request.host.rsplit(":", 1)[0] if request.host else ""
+        if not referrer_parts.netloc or referrer_host == request_host:
             back_url = request.referrer
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 162 - 165, The referrer-origin check
using referrer_parts.netloc vs request.host can fail when ports differ; update
the logic around request.referrer/urlparse (where back_url is set) to compare
hostnames instead of netlocs: use referrer_parts.hostname and extract the
hostname from request.host (e.g., split on ':' or use a helper) and then check
if not referrer_parts.hostname or referrer_hostname == request_hostname before
assigning back_url (keeping the existing fallback to list_url intact).
rero_mef/views.py (1)

177-186: Per-request alias verification is inefficient.

_ensure_all_mef_alias() performs multiple Elasticsearch API calls (exists_alias, get_alias, exists, update_aliases) on every /all/mef request. Since the alias is also created during app initialization in ext.py, this redundant verification adds unnecessary latency.

Consider caching the alias existence check or removing the per-request verification entirely, relying on the startup initialization.

♻️ Option 1: Simple caching with module-level flag
+_all_mef_alias_verified = False
+
 def _ensure_all_mef_alias():
     """Ensure all_mef alias exists and points to available MEF indexes."""
+    global _all_mef_alias_verified
+    if _all_mef_alias_verified:
+        return True
+
     alias_name = "all_mef"
     # ... existing logic ...
+
+    result = current_search_client.indices.exists_alias(name=alias_name)
+    if result:
+        _all_mef_alias_verified = True
+    return result
♻️ Option 2: Trust startup initialization
 `@api_blueprint.route`("/all/mef", strict_slashes=False)
 `@api_blueprint.route`("/all/mef/", strict_slashes=False)
 def all_mef_search():
     # ...
-    if not _ensure_all_mef_alias():
-        return jsonify(
-            {
-                "status": 404,
-                "message": (
-                    "The all_mef alias is missing and no MEF indexes were found. "
-                    "Index/load mef, concepts_mef and places_mef first."
-                ),
-            }
-        ), 404
+    # Alias is ensured at app startup via ext.py
+    # NotFoundError below handles the case where it's missing
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/views.py` around lines 177 - 186, The per-request call to
_ensure_all_mef_alias() is causing redundant ES calls; either remove this
runtime verification and rely on the alias creation performed at startup in
ext.py, or implement a simple module-level cache flag (e.g.,
ALL_MEF_ALIAS_READY) that _ensure_all_mef_alias() sets to True on successful
creation so subsequent /all/mef requests skip the ES checks; update the view to
check the flag (or drop the call) and ensure ext.py sets the flag after creating
the alias so the view no longer invokes
exists_alias/get_alias/exists/update_aliases on every request.
rero_mef/ext.py (1)

64-116: Consider consistency with other alias management code.

This method uses put_alias in a loop, while rero_mef/views.py (line 82) and rero_mef/cli.py (line 1179) use update_aliases with batch actions. The update_aliases approach is atomic—either all changes succeed or none do—whereas iterative put_alias calls can result in partial state if one fails mid-loop.

For app initialization, the current approach is acceptable since failures are logged and the app can still start. However, for consistency across the codebase, consider aligning on one approach.

♻️ Optional: Use update_aliases for atomicity
         for target in target_aliases:
             try:
                 # Resolve alias -> concrete indexes.
                 alias_mapping = search_client.indices.get_alias(name=target)
             except Exception:
                 alias_mapping = {}

             index_names = list(alias_mapping.keys())
             if not index_names:
                 # Fallback: target can itself be a concrete index name.
                 index_names = [target]

-            for index_name in index_names:
-                try:
-                    search_client.indices.put_alias(
-                        index=index_name,
-                        name=alias_name,
-                    )
-                except Exception as err:
-                    app.logger.warning(
-                        "Could not attach alias %s to %s: %s",
-                        alias_name,
-                        index_name,
-                        err,
-                    )
+            actions.extend(
+                {"add": {"index": idx, "alias": alias_name}}
+                for idx in index_names
+            )
+
+        if actions:
+            try:
+                search_client.indices.update_aliases(body={"actions": actions})
+            except Exception as err:
+                app.logger.warning("Could not update %s alias: %s", alias_name, err)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` around lines 64 - 116, The ensure_all_mef_alias method
currently calls search_client.indices.put_alias in a loop which can leave
partial state; instead, collect add-alias actions for each resolved concrete
index for targets in target_aliases and call
search_client.indices.update_aliases once (using the same alias_name) to apply
them atomically; update error handling to catch exceptions from update_aliases
and log a single warning (referencing ensure_all_mef_alias, target_aliases,
alias_name, and search_client.indices.update_aliases) so behavior matches the
batch approach used in rero_mef/views.py and rero_mef/cli.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/views.py`:
- Around line 71-78: The code currently swallows exceptions in two try/except
blocks (around the earlier block and the one checking
current_search_client.indices.exists) which hides errors; replace both "except
Exception: pass" with "except Exception as e:" and log the exception at debug
level including context (alias_name and target) and the exception info so
behavior remains the same but failures are visible; specifically update the
blocks that call current_search_client.indices.exists(...) and any preceding try
that currently passes to call your module logger (e.g., logger.debug or the
existing module logger) with a message like "alias resolution failed for
alias_name=%s target=%s: %s" and include e (or use logger.exception at debug
level) to preserve stack info.

---

Nitpick comments:
In `@rero_mef/ext.py`:
- Around line 64-116: The ensure_all_mef_alias method currently calls
search_client.indices.put_alias in a loop which can leave partial state;
instead, collect add-alias actions for each resolved concrete index for targets
in target_aliases and call search_client.indices.update_aliases once (using the
same alias_name) to apply them atomically; update error handling to catch
exceptions from update_aliases and log a single warning (referencing
ensure_all_mef_alias, target_aliases, alias_name, and
search_client.indices.update_aliases) so behavior matches the batch approach
used in rero_mef/views.py and rero_mef/cli.py.

In `@rero_mef/theme/views.py`:
- Around line 162-165: The referrer-origin check using referrer_parts.netloc vs
request.host can fail when ports differ; update the logic around
request.referrer/urlparse (where back_url is set) to compare hostnames instead
of netlocs: use referrer_parts.hostname and extract the hostname from
request.host (e.g., split on ':' or use a helper) and then check if not
referrer_parts.hostname or referrer_hostname == request_hostname before
assigning back_url (keeping the existing fallback to list_url intact).

In `@rero_mef/views.py`:
- Around line 177-186: The per-request call to _ensure_all_mef_alias() is
causing redundant ES calls; either remove this runtime verification and rely on
the alias creation performed at startup in ext.py, or implement a simple
module-level cache flag (e.g., ALL_MEF_ALIAS_READY) that _ensure_all_mef_alias()
sets to True on successful creation so subsequent /all/mef requests skip the ES
checks; update the view to check the flag (or drop the call) and ensure ext.py
sets the flag after creating the alias so the view no longer invokes
exists_alias/get_alias/exists/update_aliases on every request.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b84704d6-ed29-48dc-b560-b9c6105a8fa8

📥 Commits

Reviewing files that changed from the base of the PR and between 9b291a0 and 8b913ad.

⛔ Files ignored due to path filters (17)
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (7)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/theme/views.py
  • rero_mef/views.py

@coveralls
Copy link
Copy Markdown
Collaborator

coveralls commented Mar 30, 2026

Coverage Status

coverage: 87.549% (+0.9%) from 86.621%
when pulling 6b83350 on rerowep:wep-ui
into 9b291a0 on rero:staging.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (6)
rero_mef/theme/views.py (2)

47-58: Redundant Disallow rules in robots.txt.

With Disallow: / at the root, all subsequent Disallow rules are redundant as the root rule already blocks everything. Either remove the specific paths or change the root rule to be more selective if partial crawling is intended.

♻️ Proposed fix (if blocking all paths is intended)
 def robots():
     """Serve robots.txt disallowing all UI paths."""
     content = (
         "User-agent: *\n"
         "Disallow: /\n"
-        "Disallow: /agents\n"
-        "Disallow: /concepts\n"
-        "Disallow: /places\n"
-        "Disallow: /all\n"
     )
     return Response(content, mimetype="text/plain")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 47 - 58, The robots() view builds a
robots.txt that contains "Disallow: /" plus additional specific Disallow lines
which are redundant; either remove the specific path lines so content only
contains "User-agent: *\nDisallow: /\n" (if you intend to block everything), or
replace the root "Disallow: /" with the selective paths (e.g., keep "Disallow:
/agents\nDisallow: /concepts\n..." and drop "Disallow: /") if you intend only to
block those endpoints; update the content string inside the robots() function
accordingly.

393-411: Consider logging schema load failures.

While the graceful fallback to an empty dict is appropriate, silently catching all exceptions could hide configuration issues (e.g., malformed JSON schemas, permission problems). Consider logging at debug level.

♻️ Proposed fix
+from flask import current_app
+
 def _load_schema_labels(entity_name):
     """Load property titles from local common JSON schema files."""
     schema_name = _SCHEMA_NAME_MAP.get(entity_name)
     if not schema_name:
         return {}

     schema_path = (
         Path(__file__).resolve().parents[1] / "jsonschemas" / "common" / schema_name
     )
     try:
         with schema_path.open("r", encoding="utf-8") as fp:
             schema = json.load(fp)
-    except Exception:
+    except Exception as err:
+        current_app.logger.debug("Could not load schema %s: %s", schema_path, err)
         return {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 393 - 411, The _load_schema_labels
function currently swallows all exceptions when opening/parsing schema files;
update it to log the failure at debug (or appropriate) level so configuration
issues are visible: wrap the schema file open/json.load in the existing except
block as "except Exception as e:" and call the module logger (e.g.,
logging.getLogger(__name__) or the module's logger name) to log the schema_path
and exception information (including traceback or str(e)) before returning {}.
Ensure you reference the same symbols: _load_schema_labels, schema_path, and the
exception variable (e) so the log includes which schema file failed and why.
rero_mef/views.py (3)

62-105: Consider consolidating alias management logic.

There are now three implementations of alias management:

  1. rero_mef/ext.py:ensure_all_mef_alias (at app init)
  2. rero_mef/views.py:_ensure_all_mef_alias (at request time)
  3. rero_mef/cli.py:all_mef_alias (CLI command)

While each serves a different context, the core logic is duplicated. Consider extracting the shared alias-creation logic into a utility function that all three can call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/views.py` around lines 62 - 105, The _ensure_all_mef_alias function
in rero_mef/views.py duplicates alias-management logic found in
rero_mef/ext.py:ensure_all_mef_alias and rero_mef/cli.py:all_mef_alias; extract
the shared logic (building alias actions, checking existing aliases/indexes,
calling indices.update_aliases and returning existence) into a single utility
function (e.g., a new helper like build_or_apply_all_mef_alias or
ensure_all_mef_alias_core) and have _ensure_all_mef_alias,
ext.ensure_all_mef_alias and cli.all_mef_alias call that helper to avoid
duplication; keep the same behavior (logging and exception handling) but
centralize the core steps: determine targets, construct actions, call
update_aliases when needed, and return alias existence.

204-213: Unhandled exceptions from and_search_factory.

Per the context snippet from rero_mef/query.py, and_search_factory can raise exceptions beyond InvalidQueryRESTError, including AttributeError or IndexError if the search object lacks expected attributes. While AllMefSearch should always have _index populated, consider wrapping this in a broader try-except to return a 500 with a helpful message rather than an unhandled server error.

♻️ Proposed fix
     search = AllMefSearch()
     try:
         search, _ = and_search_factory(None, search)
     except InvalidQueryRESTError:
         return jsonify(
             {
                 "status": 400,
                 "message": "The syntax of the search query is invalid.",
             }
         ), 400
+    except Exception as err:
+        current_app.logger.exception("Unexpected error in search factory")
+        return jsonify(
+            {
+                "status": 500,
+                "message": "An unexpected error occurred while processing the search.",
+            }
+        ), 500
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/views.py` around lines 204 - 213, and_search_factory may raise
exceptions other than InvalidQueryRESTError (e.g., AttributeError/IndexError);
update the try/except around the call to and_search_factory (where
AllMefSearch() is created) to keep the existing except InvalidQueryRESTError
handling but also add a broader except Exception as e clause that returns a 500
JSON response (e.g., {"status":500, "message":"Internal server error while
building search", "detail": str(e)}) so the server does not crash on unexpected
errors and callers receive a helpful message; reference and_search_factory,
AllMefSearch, and InvalidQueryRESTError when locating the change.

236-247: Index name detection is fragile.

Using substring matching ("concepts_mef" in index_name) to determine entity type will break if index naming conventions change (e.g., versioned indices like concepts_mef-v2). Consider a more robust approach, such as checking against a mapping or using the alias information.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/views.py` around lines 236 - 247, The current substring checks on
index_name (e.g., "concepts_mef" in index_name) are fragile; in the function
that sets entity, self_link and preview_link (the block referencing index_name,
entity, self_link, preview_link and url_for calls to
invenio_records_rest.comef_item / plmef_item / mef_item) replace the ad-hoc
contains logic with a deterministic mapping lookup: normalize the index base by
splitting index_name on "-" (index_base = index_name.split("-", 1)[0]) or by
resolving the index alias, then use a dict like {"concepts_mef": "concepts",
"places_mef": "places", "agents_mef": "agents"} (or alias->entity mapping) to
pick entity and then construct self_link/preview_link via the appropriate
url_for route mapping (map entity to route names comef_item/plmef_item/mef_item)
so versioned indices (e.g., concepts_mef-v2) resolve correctly.
rero_mef/ext.py (1)

92-102: Add debug logging for silent exception handling.

When get_alias fails at line 96, the exception is swallowed silently without any logging, making it difficult to diagnose issues. The subsequent put_alias block (lines 110-116) correctly logs failures. For consistency and debuggability, consider adding similar logging here.

♻️ Proposed fix
             try:
                     # Resolve alias -> concrete indexes.
                     alias_mapping = search_client.indices.get_alias(name=target)
-                except Exception:
+                except Exception as err:
+                    app.logger.debug(
+                        "Could not resolve alias %s: %s",
+                        target,
+                        err,
+                    )
                     alias_mapping = {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` around lines 92 - 102, The except block that swallows
exceptions from search_client.indices.get_alias(name=target) should log the
error for debuggability; update the except in the loop over target_aliases
(where alias_mapping is set) to call the same logger used in the put_alias block
and include the target name and exception details (stack/trace) so failures are
visible; reference search_client.indices.get_alias, target (from target_aliases)
and alias_mapping when adding the logging call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@rero_mef/ext.py`:
- Around line 92-102: The except block that swallows exceptions from
search_client.indices.get_alias(name=target) should log the error for
debuggability; update the except in the loop over target_aliases (where
alias_mapping is set) to call the same logger used in the put_alias block and
include the target name and exception details (stack/trace) so failures are
visible; reference search_client.indices.get_alias, target (from target_aliases)
and alias_mapping when adding the logging call.

In `@rero_mef/theme/views.py`:
- Around line 47-58: The robots() view builds a robots.txt that contains
"Disallow: /" plus additional specific Disallow lines which are redundant;
either remove the specific path lines so content only contains "User-agent:
*\nDisallow: /\n" (if you intend to block everything), or replace the root
"Disallow: /" with the selective paths (e.g., keep "Disallow: /agents\nDisallow:
/concepts\n..." and drop "Disallow: /") if you intend only to block those
endpoints; update the content string inside the robots() function accordingly.
- Around line 393-411: The _load_schema_labels function currently swallows all
exceptions when opening/parsing schema files; update it to log the failure at
debug (or appropriate) level so configuration issues are visible: wrap the
schema file open/json.load in the existing except block as "except Exception as
e:" and call the module logger (e.g., logging.getLogger(__name__) or the
module's logger name) to log the schema_path and exception information
(including traceback or str(e)) before returning {}. Ensure you reference the
same symbols: _load_schema_labels, schema_path, and the exception variable (e)
so the log includes which schema file failed and why.

In `@rero_mef/views.py`:
- Around line 62-105: The _ensure_all_mef_alias function in rero_mef/views.py
duplicates alias-management logic found in rero_mef/ext.py:ensure_all_mef_alias
and rero_mef/cli.py:all_mef_alias; extract the shared logic (building alias
actions, checking existing aliases/indexes, calling indices.update_aliases and
returning existence) into a single utility function (e.g., a new helper like
build_or_apply_all_mef_alias or ensure_all_mef_alias_core) and have
_ensure_all_mef_alias, ext.ensure_all_mef_alias and cli.all_mef_alias call that
helper to avoid duplication; keep the same behavior (logging and exception
handling) but centralize the core steps: determine targets, construct actions,
call update_aliases when needed, and return alias existence.
- Around line 204-213: and_search_factory may raise exceptions other than
InvalidQueryRESTError (e.g., AttributeError/IndexError); update the try/except
around the call to and_search_factory (where AllMefSearch() is created) to keep
the existing except InvalidQueryRESTError handling but also add a broader except
Exception as e clause that returns a 500 JSON response (e.g., {"status":500,
"message":"Internal server error while building search", "detail": str(e)}) so
the server does not crash on unexpected errors and callers receive a helpful
message; reference and_search_factory, AllMefSearch, and InvalidQueryRESTError
when locating the change.
- Around line 236-247: The current substring checks on index_name (e.g.,
"concepts_mef" in index_name) are fragile; in the function that sets entity,
self_link and preview_link (the block referencing index_name, entity, self_link,
preview_link and url_for calls to invenio_records_rest.comef_item / plmef_item /
mef_item) replace the ad-hoc contains logic with a deterministic mapping lookup:
normalize the index base by splitting index_name on "-" (index_base =
index_name.split("-", 1)[0]) or by resolving the index alias, then use a dict
like {"concepts_mef": "concepts", "places_mef": "places", "agents_mef":
"agents"} (or alias->entity mapping) to pick entity and then construct
self_link/preview_link via the appropriate url_for route mapping (map entity to
route names comef_item/plmef_item/mef_item) so versioned indices (e.g.,
concepts_mef-v2) resolve correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 49c7f533-6e5a-4056-b012-97ee130e42ea

📥 Commits

Reviewing files that changed from the base of the PR and between 8b913ad and 54ed022.

⛔ Files ignored due to path filters (18)
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (10)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_places_mef_rest.py
✅ Files skipped from review due to trivial changes (1)
  • tests/api/test_places_mef_rest.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • rero_mef/cli.py
  • rero_mef/filter.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
rero_mef/views.py (1)

102-105: Silent exception at line 104 should be logged.

The exists_alias check at line 102-104 silently swallows exceptions without logging, unlike the other exception handlers in this function which log at debug level.

♻️ Proposed fix to add logging
     try:
         return current_search_client.indices.exists_alias(name=alias_name)
-    except Exception:
+    except Exception as err:
+        current_app.logger.debug("Could not check alias %s: %s", alias_name, err)
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/views.py` around lines 102 - 105, The exists_alias call currently
swallows all exceptions; update the exception handler around
current_search_client.indices.exists_alias(name=alias_name) to log the caught
exception at debug level (matching the other handlers) before returning False —
include the exception object and a short contextual message in the debug log so
failures to check the alias are visible while still returning False on error.
tests/ui/conftest.py (1)

41-55: Unused tmp_path_factory parameter in fixture.

The tmp_path_factory parameter is declared but never used. The fixture writes to the instance path derived from INVENIO_INSTANCE_PATH, not a temporary directory.

♻️ Remove unused parameter
 `@pytest.fixture`(scope="session", autouse=True)
-def webpack_manifest(tmp_path_factory):
+def webpack_manifest():
     """Create a stub webpack manifest so UI templates render without assets."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/conftest.py` around lines 41 - 55, The fixture webpack_manifest
declares an unused parameter tmp_path_factory; remove tmp_path_factory from the
function signature so the fixture becomes def webpack_manifest(): and update any
tests that might have relied on injecting tmp_path_factory (none expected) —
keep the rest of the body intact referencing _WEBPACK_MANIFEST_STUBS and
manifest_path.
rero_mef/ext.py (1)

92-102: Silent exception handling at line 96 should log for consistency.

The exception at line 96 silently falls back to an empty mapping without any logging, while line 110 properly logs a warning. For consistency and debuggability, consider logging when alias resolution fails.

♻️ Proposed fix to add logging
             try:
                 # Resolve alias -> concrete indexes.
                 alias_mapping = search_client.indices.get_alias(name=target)
-            except Exception:
+            except Exception as err:
+                app.logger.debug(
+                    "Could not resolve alias %s: %s",
+                    target,
+                    err,
+                )
                 alias_mapping = {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` around lines 92 - 102, The except block around
search_client.indices.get_alias silently swallows errors; change it to capture
the exception (e.g., except Exception as e) and log a warning like the later
block does so failures resolving alias->indexes are visible; reference the same
logger used at line 110 and include the target alias and exception details in
the log (symbols: target_aliases, search_client.indices.get_alias,
alias_mapping, index_names, target).
rero_mef/theme/views.py (1)

310-338: User input flows into URL construction.

In _get_identifier_url, the value parameter is concatenated directly into URLs (e.g., line 318: f"https://www.isbn-search.org/search/?q={value_str}"). While this is for display purposes (generating clickable links), consider URL-encoding the value to handle special characters safely.

🛡️ Optional: URL-encode identifier values
+from urllib.parse import quote_plus
+
 def _get_identifier_url(identifier_type, value):
     """Generate external URL for identifier types."""
     if not value:
         return None

     value_str = str(value).strip()
+    encoded_value = quote_plus(value_str)

     if identifier_type == "isbn":
-        return f"https://www.isbn-search.org/search/?q={value_str}"
+        return f"https://www.isbn-search.org/search/?q={encoded_value}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 310 - 338, The function
_get_identifier_url currently interpolates raw value_str into external URLs;
update it to URL-encode identifier values before constructing links: import
urllib.parse and use urllib.parse.quote_plus(value_str) for query-parameter
targets (isbn, ean, ismn) and urllib.parse.quote(value_str, safe='/') for
path-like IDs such as doi (so slashes are preserved), while keeping the url/uri
branch behavior (returning value_str only when it already starts with http/https
and not from internal domain). Ensure you replace direct f-string insertions
with the encoded variable (still named value_str or encoded_value) so all
clickable links are safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@rero_mef/ext.py`:
- Around line 92-102: The except block around search_client.indices.get_alias
silently swallows errors; change it to capture the exception (e.g., except
Exception as e) and log a warning like the later block does so failures
resolving alias->indexes are visible; reference the same logger used at line 110
and include the target alias and exception details in the log (symbols:
target_aliases, search_client.indices.get_alias, alias_mapping, index_names,
target).

In `@rero_mef/theme/views.py`:
- Around line 310-338: The function _get_identifier_url currently interpolates
raw value_str into external URLs; update it to URL-encode identifier values
before constructing links: import urllib.parse and use
urllib.parse.quote_plus(value_str) for query-parameter targets (isbn, ean, ismn)
and urllib.parse.quote(value_str, safe='/') for path-like IDs such as doi (so
slashes are preserved), while keeping the url/uri branch behavior (returning
value_str only when it already starts with http/https and not from internal
domain). Ensure you replace direct f-string insertions with the encoded variable
(still named value_str or encoded_value) so all clickable links are safe.

In `@rero_mef/views.py`:
- Around line 102-105: The exists_alias call currently swallows all exceptions;
update the exception handler around
current_search_client.indices.exists_alias(name=alias_name) to log the caught
exception at debug level (matching the other handlers) before returning False —
include the exception object and a short contextual message in the debug log so
failures to check the alias are visible while still returning False on error.

In `@tests/ui/conftest.py`:
- Around line 41-55: The fixture webpack_manifest declares an unused parameter
tmp_path_factory; remove tmp_path_factory from the function signature so the
fixture becomes def webpack_manifest(): and update any tests that might have
relied on injecting tmp_path_factory (none expected) — keep the rest of the body
intact referencing _WEBPACK_MANIFEST_STUBS and manifest_path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ca9ee1a8-1105-4363-b6a4-b1387967f9d7

📥 Commits

Reviewing files that changed from the base of the PR and between 54ed022 and 838c24b.

⛔ Files ignored due to path filters (18)
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (12)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/conftest.py
  • tests/ui/test_theme_views.py
✅ Files skipped from review due to trivial changes (1)
  • tests/api/test_agents_mef_rest.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/api/test_places_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • rero_mef/filter.py
  • rero_mef/cli.py
  • rero_mef/monitoring/api.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
tests/ui/test_theme_views.py (1)

264-269: Unused variable value should be prefixed with underscore.

The unpacked value is never used in this test. Use _ to indicate it's intentionally ignored.

🧹 Fix unused variable
 def test_format_value_with_url_url_external():
     """Url key with a plain HTTPS URL returns it as the external URL."""
     from rero_mef.theme.views import _format_value_with_url

-    value, url = _format_value_with_url("url", "https://example.com")
+    _, url = _format_value_with_url("url", "https://example.com")
     assert url == "https://example.com"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/test_theme_views.py` around lines 264 - 269, The test unpacks
(value, url) from _format_value_with_url but never uses value; rename the unused
variable to _ (e.g., (_, url) = _format_value_with_url(...)) inside
test_format_value_with_url_url_external to signal intentional ignore and satisfy
linter; locate the call to _format_value_with_url in
test_format_value_with_url_url_external and change the left-hand tuple
accordingly.
tests/ui/conftest.py (1)

41-55: Unused fixture parameter tmp_path_factory.

The tmp_path_factory parameter is declared but never used in the fixture body. The fixture writes to a fixed instance path instead.

🧹 Remove unused parameter
 `@pytest.fixture`(scope="session", autouse=True)
-def webpack_manifest(tmp_path_factory):
+def webpack_manifest():
     """Create a stub webpack manifest so UI templates render without assets."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/conftest.py` around lines 41 - 55, The fixture webpack_manifest
declares an unused parameter tmp_path_factory; remove tmp_path_factory from the
fixture signature (i.e., change def webpack_manifest(tmp_path_factory): to def
webpack_manifest():) or, if intended, use tmp_path_factory to create the
manifest directory instead of writing to the fixed INVENIO_INSTANCE_PATH—update
the function name webpack_manifest accordingly and adjust manifest path creation
and os.makedirs/manifest writing to use the tmp_path_factory-created path if
choosing the latter.
rero_mef/theme/views.py (1)

402-406: Consider logging schema load failures for debugging.

The broad except Exception silently swallows all errors. While returning {} is a safe fallback for the UI, logging the exception would help diagnose issues with missing or malformed schema files.

🔧 Suggested improvement
+    from flask import current_app
     try:
         with schema_path.open("r", encoding="utf-8") as fp:
             schema = json.load(fp)
-    except Exception:
+    except (FileNotFoundError, json.JSONDecodeError, PermissionError) as exc:
+        current_app.logger.debug(f"Could not load schema {schema_name}: {exc}")
         return {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 402 - 406, The current try/except
around schema_path.open(...) and json.load(...) silently swallows all errors and
returns {}, so modify the except to capture the exception (e.g., `except
Exception as e:`) and log the failure before returning the fallback; use the
module/logger used elsewhere (or create one with logging.getLogger(__name__))
and call logger.exception or logger.error with context including schema_path and
the exception to aid debugging while still returning {} as the safe fallback for
the UI.
rero_mef/ext.py (1)

92-102: Inconsistent exception handling: alias resolution failure is not logged.

Line 96 silently catches and ignores exceptions from get_alias, while line 110 properly logs failures from put_alias. For consistency and debuggability, consider logging the alias resolution failure.

♻️ Proposed fix for consistent logging
             for target in target_aliases:
                 try:
                     # Resolve alias -> concrete indexes.
                     alias_mapping = search_client.indices.get_alias(name=target)
-                except Exception:
+                except Exception as err:
+                    app.logger.debug(
+                        "Could not resolve alias %s: %s",
+                        target,
+                        err,
+                    )
                     alias_mapping = {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` around lines 92 - 102, The alias resolution loop swallows
exceptions from search_client.indices.get_alias (for target in target_aliases)
without logging—make the except block capture the exception (except Exception as
e) and log it using the same logger used when handling put_alias failures
(include the target name and exception details) so that alias_mapping handling
remains consistent and debuggable; update the except block around get_alias to
log the error and then set alias_mapping = {} as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/monitoring/api.py`:
- Around line 219-226: The code passes a possibly-empty index to get_es_count
and computes "db-es" without fully guarding types; change the logic in the block
that builds result for each doc_type so that you first check if
endpoints[doc_type].get("search_index") is truthy before calling get_es_count
(if empty set es_count = ""), and compute "db-es" only when both db_count and
es_count are ints (otherwise set to ""), while leaving mef_linked
(mef_search().filter("exists", field=entity_cls.name).count()) unchanged; update
the result assignment that references get_db_count, get_es_count, mef_search and
entity_cls.name accordingly.

In `@rero_mef/views.py`:
- Around line 204-213: and_search_factory currently requires a `self` REST view
parameter but never uses it (it references `request` and `current_app`
directly), so update the function signature and all call sites to remove the
unused `self` parameter: change and_search_factory to a module-level or
`@staticmethod` that accepts only the necessary arguments (e.g., the initial
search object), update callers like the view that constructs AllMefSearch and
invokes and_search_factory (currently calling and_search_factory(None, search))
to call the new signature, and keep error handling for InvalidQueryRESTError
unchanged; ensure imports and tests reflect the new signature.

In `@tests/api/conftest.py`:
- Around line 36-38: The current check using path.startswith(self.prefix) can
rewrite unrelated routes like "/apiary"; change the guard to only match exact
prefix or prefix followed by a slash — e.g., replace the condition with one that
ensures path == self.prefix or path.startswith(self.prefix + "/") (or use a
regex like ^{re.escape(self.prefix)}(?=/|$)) before setting
environ["SCRIPT_NAME"] and adjusting environ["PATH_INFO"] so only "/api" or
"/api/..." are rewritten.

In `@tests/ui/test_monitoring.py`:
- Around line 221-235: The test currently builds endpoints = {"aidref": {...},
"aggnd": {...}} and calls mon.info_category(AgentMefRecord, endpoints) but only
asserts results for "aidref"; add equivalent assertions for "aggnd" to ensure it
appears in result and its nested keys ("db", "es", "mef"/"db-es" as appropriate)
and that result["aggnd"]["index"] == "agents_gnd", mirroring the checks used for
"aidref" and "mef" to catch mapping/iteration regressions in info_category.

---

Nitpick comments:
In `@rero_mef/ext.py`:
- Around line 92-102: The alias resolution loop swallows exceptions from
search_client.indices.get_alias (for target in target_aliases) without
logging—make the except block capture the exception (except Exception as e) and
log it using the same logger used when handling put_alias failures (include the
target name and exception details) so that alias_mapping handling remains
consistent and debuggable; update the except block around get_alias to log the
error and then set alias_mapping = {} as before.

In `@rero_mef/theme/views.py`:
- Around line 402-406: The current try/except around schema_path.open(...) and
json.load(...) silently swallows all errors and returns {}, so modify the except
to capture the exception (e.g., `except Exception as e:`) and log the failure
before returning the fallback; use the module/logger used elsewhere (or create
one with logging.getLogger(__name__)) and call logger.exception or logger.error
with context including schema_path and the exception to aid debugging while
still returning {} as the safe fallback for the UI.

In `@tests/ui/conftest.py`:
- Around line 41-55: The fixture webpack_manifest declares an unused parameter
tmp_path_factory; remove tmp_path_factory from the fixture signature (i.e.,
change def webpack_manifest(tmp_path_factory): to def webpack_manifest():) or,
if intended, use tmp_path_factory to create the manifest directory instead of
writing to the fixed INVENIO_INSTANCE_PATH—update the function name
webpack_manifest accordingly and adjust manifest path creation and
os.makedirs/manifest writing to use the tmp_path_factory-created path if
choosing the latter.

In `@tests/ui/test_theme_views.py`:
- Around line 264-269: The test unpacks (value, url) from _format_value_with_url
but never uses value; rename the unused variable to _ (e.g., (_, url) =
_format_value_with_url(...)) inside test_format_value_with_url_url_external to
signal intentional ignore and satisfy linter; locate the call to
_format_value_with_url in test_format_value_with_url_url_external and change the
left-hand tuple accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 69a12f4a-f884-4091-b1fb-8c7b23c8ccde

📥 Commits

Reviewing files that changed from the base of the PR and between 838c24b and 095d218.

⛔ Files ignored due to path filters (18)
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (15)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/conftest.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/api/test_concepts_mef_rest.py
  • rero_mef/filter.py
  • rero_mef/cli.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (7)
tests/ui/concepts/test_concepts_api.py (1)

381-393: Remove redundant imports inside test function.

The imports on line 383 are redundant since ConceptIdrefRecord and ConceptMefRecord are already imported at the module level (lines 22-27).

♻️ Proposed fix
 def test_concept_record_delete(app, concept_idref_data):
     """ConceptRecord.delete removes the ref from linked MEF records."""
-    from rero_mef.concepts import ConceptIdrefRecord, ConceptMefRecord
-
     idref_record, _ = ConceptIdrefRecord.create_or_update(
         data=deepcopy(concept_idref_data), dbcommit=True, reindex=True
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/concepts/test_concepts_api.py` around lines 381 - 393, Remove the
redundant local import statements inside the test_concept_record_delete
function: delete the "from rero_mef.concepts import ConceptIdrefRecord,
ConceptMefRecord" line and rely on the module-level imports already present;
keep the rest of test_concept_record_delete unchanged so it continues to use
ConceptIdrefRecord and ConceptMefRecord as referenced in the test.
tests/ui/places/test_places_api.py (1)

260-273: Consider moving deepcopy import to module level.

The deepcopy import inside the function is valid but inconsistent with the module-level import pattern used in test_concepts_api.py (line 19). Moving it to the imports section would improve consistency across test files.

♻️ Proposed fix

Add to module imports (after line 18):

from copy import deepcopy

Then simplify the test:

 def test_place_record_delete(app, place_idref_data, place_mef_idref_data):
     """PlaceRecord.delete removes the ref from linked MEF records."""
-    from copy import deepcopy
-
     idref_record, _ = PlaceIdrefRecord.create_or_update(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/places/test_places_api.py` around lines 260 - 273, Move the local
import of deepcopy out of the test body to the module imports to match the
project's import style: add "from copy import deepcopy" at the top of the test
module and remove the "from copy import deepcopy" line inside the
test_place_record_delete function so the test uses the module-level deepcopy
import when calling deepcopy(place_idref_data).
tests/ui/test_theme_views.py (1)

264-270: Prefix unused value variable with underscore.

The unpacked value on line 268 is not used in the assertion.

♻️ Proposed fix
 def test_format_value_with_url_url_external():
     """Url key with a plain HTTPS URL returns it as the external URL."""
     from rero_mef.theme.views import _format_value_with_url

-    value, url = _format_value_with_url("url", "https://example.com")
+    _value, url = _format_value_with_url("url", "https://example.com")
     assert url == "https://example.com"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/test_theme_views.py` around lines 264 - 270, The test
test_format_value_with_url_url_external unpacks (value, url) but never uses
value; change the unpacking to prefix the unused variable with an underscore
(e.g., _value, url) in the call to _format_value_with_url so linters won’t flag
an unused variable; update the tuple assignment in the test to use _value and
keep the assertion on url unchanged.
tests/ui/conftest.py (1)

41-55: Unused tmp_path_factory parameter and potential test isolation issue.

  1. The tmp_path_factory parameter is declared but never used.
  2. The fixture writes to a persistent location (INVENIO_INSTANCE_PATH or .venv/var/instance) rather than a temporary directory. This could cause:
    • Test pollution between runs
    • Issues in parallel test execution
    • Leftover files in the development environment

Consider either using tmp_path_factory for proper isolation or removing the unused parameter if the persistent location is intentional (e.g., required by Invenio's asset resolution).

♻️ Option 1: Remove unused parameter if persistent location is intentional
 `@pytest.fixture`(scope="session", autouse=True)
-def webpack_manifest(tmp_path_factory):
+def webpack_manifest():
     """Create a stub webpack manifest so UI templates render without assets."""
♻️ Option 2: Use tmp_path_factory for test isolation
 `@pytest.fixture`(scope="session", autouse=True)
 def webpack_manifest(tmp_path_factory):
     """Create a stub webpack manifest so UI templates render without assets."""
-    dist_dir = os.path.join(
-        os.environ.get(
-            "INVENIO_INSTANCE_PATH", os.path.join(os.getcwd(), ".venv/var/instance")
-        ),
-        "static",
-        "dist",
-    )
+    dist_dir = tmp_path_factory.mktemp("static") / "dist"
     os.makedirs(dist_dir, exist_ok=True)
     manifest_path = os.path.join(dist_dir, "manifest.json")
-    if not os.path.exists(manifest_path):
-        with open(manifest_path, "w", encoding="utf-8") as f:
-            json.dump(_WEBPACK_MANIFEST_STUBS, f)
+    with open(manifest_path, "w", encoding="utf-8") as f:
+        json.dump(_WEBPACK_MANIFEST_STUBS, f)
+    # Set env var so Invenio finds the stub
+    os.environ["INVENIO_INSTANCE_PATH"] = str(dist_dir.parent.parent)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/conftest.py` around lines 41 - 55, The fixture webpack_manifest
declares an unused tmp_path_factory and writes a manifest.json into a persistent
path computed from INVENIO_INSTANCE_PATH (or .venv/var/instance), which can
cause test pollution and parallel-run issues; either remove the tmp_path_factory
parameter if writing to the real instance path is intentional, or modify the
fixture to create a temporary dist directory via tmp_path_factory.mktemp and
write manifest.json there (using _WEBPACK_MANIFEST_STUBS and the same
manifest_path variable logic) and, if necessary, patch any code that resolves
static paths to point to that temporary dist for the test session so tests
remain isolated.
tests/ui/agents/test_agents_viaf_api.py (1)

346-353: Remove redundant import; consider stronger assertions.

  1. AgentMefRecord is already imported at line 29 - the import on line 348 is redundant.
  2. The test only checks return types. Consider adding value assertions to verify the fixtures' VIAF PIDs are correctly handled.
♻️ Proposed fix
 def test_get_all_missing_viaf_pids(app, agent_viaf_record, agent_mef_record):
     """get_all_missing_viaf_pids returns pids from VIAF not in MEF and vice versa."""
-    from rero_mef.agents.mef.api import AgentMefRecord
-
     missing, non_existing = AgentMefRecord.get_all_missing_viaf_pids()
     # agent_mef_record has a viaf_pid that matches agent_viaf_record so neither list has it
     assert isinstance(missing, list)
     assert isinstance(non_existing, dict)
+    # Verify the matched viaf_pid is excluded from both collections
+    viaf_pid = agent_viaf_record.get("pid")
+    if viaf_pid:
+        assert viaf_pid not in missing
+        assert viaf_pid not in non_existing.values()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/agents/test_agents_viaf_api.py` around lines 346 - 353, Remove the
redundant local import of AgentMefRecord in test_get_all_missing_viaf_pids (the
class is already imported at top-level) and strengthen assertions: call
AgentMefRecord.get_all_missing_viaf_pids() and assert not only that missing is a
list and non_existing a dict but also that the VIAF PID from the
agent_viaf_record fixture appears (or does not appear) in the appropriate result
and that the pid from agent_mef_record is handled as expected; use the fixtures'
viaf_pid values and assert membership/absence in missing and keys/values of
non_existing to validate behavior.
rero_mef/ext.py (1)

64-116: Duplicated alias logic with rero_mef/views.py could diverge.

The alias resolution logic here differs from _ensure_all_mef_alias() in views.py (lines 62-105):

  • ext.py: Uses get_alias() with fallback to target name
  • views.py: Explicitly checks exists_alias() then exists() before adding

Both accomplish the same goal but could behave differently in edge cases. Consider extracting a shared utility function to ensure consistent behavior.

The broad Exception catches (lines 96, 110) are acceptable here since Elasticsearch can throw various exception types during startup when the cluster may be unavailable or indexes not yet created.

♻️ Consider extracting shared alias utility
# In a shared module like rero_mef/utils.py or rero_mef/search_utils.py

def ensure_all_mef_alias(search_client, logger, alias_name="all_mef"):
    """Ensure all_mef alias points to all MEF indexes.
    
    Returns True if alias was created/updated, False otherwise.
    """
    targets = ("mef", "concepts_mef", "places_mef")
    actions = []
    
    for target in targets:
        try:
            if search_client.indices.exists_alias(name=target):
                mapping = search_client.indices.get_alias(name=target)
                actions.extend(
                    {"add": {"index": idx, "alias": alias_name}}
                    for idx in mapping
                )
            elif search_client.indices.exists(index=target):
                actions.append({"add": {"index": target, "alias": alias_name}})
        except Exception as err:
            logger.debug("Alias resolution failed for %s: %s", target, err)
    
    if actions:
        try:
            search_client.indices.update_aliases(body={"actions": actions})
            return True
        except Exception as err:
            logger.warning("Could not update %s alias: %s", alias_name, err)
    return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` around lines 64 - 116, Duplicate alias-resolution logic
between ensure_all_mef_alias (in ext.py) and _ensure_all_mef_alias (in views.py)
can diverge; extract a single shared utility (e.g., ensure_all_mef_alias in a
new rero_mef/search_utils.py) that accepts a search_client and logger, uses
indices.exists_alias and indices.exists to resolve targets
("mef","concepts_mef","places_mef"), builds actions list and calls
indices.update_aliases, returns a boolean success, and replace the bodies of
ensure_all_mef_alias and _ensure_all_mef_alias to call this utility
(propagating/logging errors consistently).
rero_mef/theme/views.py (1)

253-255: Cache schema labels instead of reparsing the same file per section.

_build_display_rows() reloads the schema on every call. Preview rendering does that once for the main record and again for each source section, so the same JSON file can be opened several times in one request. A small cache on _load_schema_labels() would remove that repeated disk I/O.

Also applies to: 436-454

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 253 - 255, _build_display_rows
currently calls _load_schema_labels repeatedly causing repeated file reads; add
a small module-level cache inside _load_schema_labels (e.g., a dict keyed by
entity_name) to store and return parsed labels on subsequent calls instead of
reparsing the JSON file each time, and update _load_schema_labels to check the
cache first and populate it after the first load; apply the same caching
approach where _load_schema_labels is used elsewhere (notably the code around
lines 436-454) so preview rendering reuses the parsed schema.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/theme/views.py`:
- Around line 213-218: The referrer validation currently accepts
request.referrer when referrer_parts.netloc is empty, which allows non-HTTP
schemes like javascript:; update the logic around list_url/back_url (variables
entity_name, list_url, back_url) that parses request.referrer with urlparse into
referrer_parts to also require referrer_parts.scheme to be "http" or "https"
before assigning back_url = request.referrer (keeping the existing
same-host/netloc checks); ensure you reference request.referrer, urlparse,
referrer_parts.scheme and only accept the referrer if scheme is one of the
allowed values.

In `@rero_mef/views.py`:
- Around line 62-104: The current _ensure_all_mef_alias() function performs
alias mutations (building actions and calling
current_search_client.indices.update_aliases) on every request; remove the write
logic so the function only checks alias/index existence using
current_search_client.indices.exists_alias and
current_search_client.indices.exists and returns the boolean, and do not call
update_aliases here; also change the final exception handling around the
exists_alias check to log/propagate transport/auth errors distinctly rather than
collapsing them into a misleading "alias missing" response (use
current_search_client.indices.exists_alias and exists exceptions to decide
logging vs re-raising).
- Around line 264-279: The pagination link builder currently calls
request.args.to_dict(flat=True) which collapses repeated query keys; change the
call in the block that builds links (where args is defined and later passed to
url_for for "api_blueprint.all_mef_search") to request.args.to_dict(flat=False)
so list values are preserved and url_for will render repeated query params; keep
the existing modifications (args["size"] = str(size) and args.pop("page", None))
so size and page handling remain the same when constructing self/next/prev
links.

---

Nitpick comments:
In `@rero_mef/ext.py`:
- Around line 64-116: Duplicate alias-resolution logic between
ensure_all_mef_alias (in ext.py) and _ensure_all_mef_alias (in views.py) can
diverge; extract a single shared utility (e.g., ensure_all_mef_alias in a new
rero_mef/search_utils.py) that accepts a search_client and logger, uses
indices.exists_alias and indices.exists to resolve targets
("mef","concepts_mef","places_mef"), builds actions list and calls
indices.update_aliases, returns a boolean success, and replace the bodies of
ensure_all_mef_alias and _ensure_all_mef_alias to call this utility
(propagating/logging errors consistently).

In `@rero_mef/theme/views.py`:
- Around line 253-255: _build_display_rows currently calls _load_schema_labels
repeatedly causing repeated file reads; add a small module-level cache inside
_load_schema_labels (e.g., a dict keyed by entity_name) to store and return
parsed labels on subsequent calls instead of reparsing the JSON file each time,
and update _load_schema_labels to check the cache first and populate it after
the first load; apply the same caching approach where _load_schema_labels is
used elsewhere (notably the code around lines 436-454) so preview rendering
reuses the parsed schema.

In `@tests/ui/agents/test_agents_viaf_api.py`:
- Around line 346-353: Remove the redundant local import of AgentMefRecord in
test_get_all_missing_viaf_pids (the class is already imported at top-level) and
strengthen assertions: call AgentMefRecord.get_all_missing_viaf_pids() and
assert not only that missing is a list and non_existing a dict but also that the
VIAF PID from the agent_viaf_record fixture appears (or does not appear) in the
appropriate result and that the pid from agent_mef_record is handled as
expected; use the fixtures' viaf_pid values and assert membership/absence in
missing and keys/values of non_existing to validate behavior.

In `@tests/ui/concepts/test_concepts_api.py`:
- Around line 381-393: Remove the redundant local import statements inside the
test_concept_record_delete function: delete the "from rero_mef.concepts import
ConceptIdrefRecord, ConceptMefRecord" line and rely on the module-level imports
already present; keep the rest of test_concept_record_delete unchanged so it
continues to use ConceptIdrefRecord and ConceptMefRecord as referenced in the
test.

In `@tests/ui/conftest.py`:
- Around line 41-55: The fixture webpack_manifest declares an unused
tmp_path_factory and writes a manifest.json into a persistent path computed from
INVENIO_INSTANCE_PATH (or .venv/var/instance), which can cause test pollution
and parallel-run issues; either remove the tmp_path_factory parameter if writing
to the real instance path is intentional, or modify the fixture to create a
temporary dist directory via tmp_path_factory.mktemp and write manifest.json
there (using _WEBPACK_MANIFEST_STUBS and the same manifest_path variable logic)
and, if necessary, patch any code that resolves static paths to point to that
temporary dist for the test session so tests remain isolated.

In `@tests/ui/places/test_places_api.py`:
- Around line 260-273: Move the local import of deepcopy out of the test body to
the module imports to match the project's import style: add "from copy import
deepcopy" at the top of the test module and remove the "from copy import
deepcopy" line inside the test_place_record_delete function so the test uses the
module-level deepcopy import when calling deepcopy(place_idref_data).

In `@tests/ui/test_theme_views.py`:
- Around line 264-270: The test test_format_value_with_url_url_external unpacks
(value, url) but never uses value; change the unpacking to prefix the unused
variable with an underscore (e.g., _value, url) in the call to
_format_value_with_url so linters won’t flag an unused variable; update the
tuple assignment in the test to use _value and keep the assertion on url
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f54f330f-05c6-41a2-a61b-a8737c1ec9c6

📥 Commits

Reviewing files that changed from the base of the PR and between 095d218 and 98ab907.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (18)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
✅ Files skipped from review due to trivial changes (1)
  • tests/ui/test_monitoring.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_places_mef_rest.py
  • tests/api/conftest.py
  • rero_mef/cli.py
  • tests/api/test_agents_mef_rest.py

@rerowep rerowep force-pushed the wep-ui branch 7 times, most recently from 8c73a9a to 0c4a423 Compare March 31, 2026 14:20
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
rero_mef/ext.py (1)

72-72: Consider extracting shared constant for MEF target aliases.

The tuple ("mef", "concepts_mef", "places_mef") is duplicated in rero_mef/cli.py:1157. When new MEF entity types are added, both locations must be updated separately.

♻️ Suggested extraction (optional)

Create a shared constant in a config or utils module:

# e.g., in rero_mef/constants.py or config.py
MEF_INDEX_TARGETS = ("mef", "concepts_mef", "places_mef")

Then import and use it in both ext.py and cli.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` at line 72, Extract the duplicated tuple ("mef",
"concepts_mef", "places_mef") into a shared constant (e.g., MEF_INDEX_TARGETS)
in a common module (like rero_mef/constants.py or rero_mef/config.py), then
replace the inline tuple assigned to target_aliases in ext.py and the duplicate
literal in cli.py with an import of MEF_INDEX_TARGETS; ensure both places use
the same symbol so future additions only require updating the constant.
tests/ui/test_theme_views.py (1)

110-127: Add a negative-referrer regression test for the preview back-link.

Current coverage validates accepted same-host referrers but not rejected unsafe schemes. Add one case to ensure javascript:/data: referrers are ignored and fallback list URL is used.

✅ Suggested test case
+def test_agents_preview_ignores_unsafe_referrer(
+    client,
+    agent_mef_record,
+    agent_viaf_record,
+    agent_gnd_record,
+    agent_idref_record,
+    agent_rero_record,
+):
+    """Unsafe Referer schemes must not be reused as back URL."""
+    pid = agent_mef_record.get("pid")
+    res = client.get(
+        url_for("rero_mef.agents_preview", pid_value=pid),
+        headers={"Referer": "javascript:alert(1)"},
+    )
+    assert res.status_code == 200
+    body = res.get_data(as_text=True)
+    assert "javascript:alert(1)" not in body
+    assert url_for("rero_mef.agents_list") in body
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/test_theme_views.py` around lines 110 - 127, Add a negative
regression test alongside test_agents_preview_with_local_referrer to assert that
unsafe referrers with schemes like "javascript:" or "data:" are rejected and the
preview back-link falls back to the list URL; specifically, create a new test
(e.g., test_agents_preview_with_unsafe_referrer) that calls
url_for("rero_mef.agents_preview", pid_value=...) with headers={"Referer":
"javascript:alert(1)"} (and/or "data:text/html,..." ), assert res.status_code ==
200 and assert the fallback list URL (the same target returned by
url_for("rero_mef.agents") or the known list route) appears in
res.get_data(as_text=True) instead of the unsafe referrer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/monitoring/api.py`:
- Line 222: The unguarded call mef_search().filter("exists",
field=entity_cls.name).count() can raise NotFoundError if the MEF index/alias is
missing; replace this direct .count() with the same safe helper used elsewhere:
build the exists query via mef_search().filter("exists", field=entity_cls.name)
and pass that search/query into self.get_es_count(...) to obtain mef_linked so
NotFoundError is handled consistently (refer to mef_search(), entity_cls.name,
mef_linked, and self.get_es_count()).

In `@rero_mef/theme/views.py`:
- Around line 499-503: The schema loading block that uses schema_path.open(... )
and json.load(fp) currently swallows all exceptions; change it to catch only
expected errors (e.g., FileNotFoundError, PermissionError, OSError, and
json.JSONDecodeError) and handle them by returning {} (and optionally logging
the exception), while allowing any other unexpected exceptions to propagate;
update the except clause to "except (FileNotFoundError, PermissionError,
OSError, json.JSONDecodeError) as e" (or similar) and include the exception
variable for logging/debugging instead of a blind "except Exception".
- Around line 597-608: The identifier-rendering loop that builds HTML chips
currently injects id_item["url"], id_item["text"], and display directly into
Markup (see the loop over id_item in id_strs and the chips.append(Markup(...))
calls); fix this by importing escape from markupsafe, escaping id_item["text"]
and display with escape() before composing Markup, validating id_item.get("url")
to allow only safe schemes (http/https) and fall back to plain escaped text when
invalid, and build the final Markup using parameterized Markup templates (e.g.,
Markup('<a href="{}">{}</a>').format(escaped_url, escaped_text)) instead of
f-strings so no unescaped user input is marked safe. Ensure chips.append uses
the escaped/validated values and that no raw user value is passed into Markup.

In `@tests/ui/test_theme_views.py`:
- Around line 268-269: The test assigns an unused variable by unpacking the
tuple returned from _format_value_with_url into value and url; update the test
to avoid the unused binding by discarding the first element (e.g., replace the
left-hand unpack with a throwaway name like _ or only assign url from the
returned tuple) so only url is bound and the assertion remains url ==
"https://example.com".

---

Nitpick comments:
In `@rero_mef/ext.py`:
- Line 72: Extract the duplicated tuple ("mef", "concepts_mef", "places_mef")
into a shared constant (e.g., MEF_INDEX_TARGETS) in a common module (like
rero_mef/constants.py or rero_mef/config.py), then replace the inline tuple
assigned to target_aliases in ext.py and the duplicate literal in cli.py with an
import of MEF_INDEX_TARGETS; ensure both places use the same symbol so future
additions only require updating the constant.

In `@tests/ui/test_theme_views.py`:
- Around line 110-127: Add a negative regression test alongside
test_agents_preview_with_local_referrer to assert that unsafe referrers with
schemes like "javascript:" or "data:" are rejected and the preview back-link
falls back to the list URL; specifically, create a new test (e.g.,
test_agents_preview_with_unsafe_referrer) that calls
url_for("rero_mef.agents_preview", pid_value=...) with headers={"Referer":
"javascript:alert(1)"} (and/or "data:text/html,..." ), assert res.status_code ==
200 and assert the fallback list URL (the same target returned by
url_for("rero_mef.agents") or the known list route) appears in
res.get_data(as_text=True) instead of the unsafe referrer.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 75ae4171-2a35-42ff-8073-52f3ec374ede

📥 Commits

Reviewing files that changed from the base of the PR and between 095d218 and 0c4a423.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (22)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_all_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
✅ Files skipped from review due to trivial changes (3)
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/test_monitoring.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/api/test_places_mef_rest.py
  • tests/ui/conftest.py
  • rero_mef/filter.py
  • tests/api/conftest.py
  • rero_mef/cli.py
  • tests/ui/places/test_places_api.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
rero_mef/theme/views.py (1)

597-609: ⚠️ Potential issue | 🔴 Critical

display is still marked safe without escaping.

display is built from record data, then line 609 wraps the joined string in Markup. If authorized_access_point or source contains HTML, it will be rendered whenever identifiedBy is present. Escape display and join Markup fragments directly instead of converting everything back to str.

🔒 Proposed fix
-                lines = [display] if display else []
-                lines.extend(str(c) for c in chips)
-                return Markup("<br>".join(lines))
+                lines = [escape(display)] if display else []
+                lines.extend(chips)
+                return Markup("<br>").join(lines)
As per coding guidelines, "**/*.py: This project uses Ruff for linting and formatting. Focus on: - Security vulnerabilities and correctness issues - Performance problems and edge cases - Logic errors and architectural improvements**"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 597 - 609, The code returns
Markup("<br>".join(lines)) while leaving display unescaped, risking HTML
injection; update the assembly so display is escaped (use escape(display) or
Markup.escape(display)) and build lines as Markup fragments (e.g., keep chips as
Markup and append an escaped Markup for display) then join using
Markup("<br>").join(...) or concatenate Markup-safe pieces directly; locate the
loop over id_strs, the variables display and chips, and the final return
statement and ensure display is escaped before joining with the Markup fragments
so no raw record HTML is rendered.
🧹 Nitpick comments (1)
tests/api/test_all_mef_rest.py (1)

129-163: Add a repeated-query-param pagination regression test.

The view now preserves multi-value args, but these tests only cover scalar size / page. A case like ?entity=agents&entity=places&page=2 would lock in the flat=False behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/test_all_mef_rest.py` around lines 129 - 163, Update the pagination
tests to include a repeated query-parameter case so multi-value args are
preserved: in the tests around test_all_mef_search_pagination_next and
test_all_mef_search_pagination_prev (or add a new test function), call
client.get with a query string containing repeated params (e.g.
"?entity=agents&entity=places&page=2&size=5") while keeping the same mocks for
_ensure_all_mef_alias, AllMefSearch, and and_search_factory; then assert the
returned body["links"]["next"] and ["prev"] (as appropriate) include both
entity=agents and entity=places (i.e. both values are present in the generated
link) to catch the regression that flattened multi-value args.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/filter.py`:
- Around line 60-81: The entity facet aggregation logic is misclassifying
concept/place docs as agents; update the aggregation wired to the entity facet
(the code referenced as the "entity" aggregation in rero_mef/config.py) to use
the same index-pattern mapping as all_mef_entity_filter() and check for the
specific patterns "concepts_mef*" and "places_mef*" before the generic "mef*"
(or match exact prefixes rather than contains('mef')), so buckets align with the
filter; in short, make the aggregation use the same index detection order/logic
as all_mef_entity_filter() (agents -> "mef*", concepts -> "concepts_mef*",
places -> "places_mef*") or derive both from a single shared mapping to avoid
divergence.

---

Duplicate comments:
In `@rero_mef/theme/views.py`:
- Around line 597-609: The code returns Markup("<br>".join(lines)) while leaving
display unescaped, risking HTML injection; update the assembly so display is
escaped (use escape(display) or Markup.escape(display)) and build lines as
Markup fragments (e.g., keep chips as Markup and append an escaped Markup for
display) then join using Markup("<br>").join(...) or concatenate Markup-safe
pieces directly; locate the loop over id_strs, the variables display and chips,
and the final return statement and ensure display is escaped before joining with
the Markup fragments so no raw record HTML is rendered.

---

Nitpick comments:
In `@tests/api/test_all_mef_rest.py`:
- Around line 129-163: Update the pagination tests to include a repeated
query-parameter case so multi-value args are preserved: in the tests around
test_all_mef_search_pagination_next and test_all_mef_search_pagination_prev (or
add a new test function), call client.get with a query string containing
repeated params (e.g. "?entity=agents&entity=places&page=2&size=5") while
keeping the same mocks for _ensure_all_mef_alias, AllMefSearch, and
and_search_factory; then assert the returned body["links"]["next"] and ["prev"]
(as appropriate) include both entity=agents and entity=places (i.e. both values
are present in the generated link) to catch the regression that flattened
multi-value args.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6470828e-df70-4ecc-aa11-724f92bf01f2

📥 Commits

Reviewing files that changed from the base of the PR and between 0c4a423 and 188daed.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (22)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_all_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
✅ Files skipped from review due to trivial changes (6)
  • tests/api/test_concepts_mef_rest.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • rero_mef/cli.py
  • tests/ui/places/test_places_api.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_agents_mef_rest.py
  • tests/ui/conftest.py
  • tests/api/conftest.py
  • tests/ui/test_theme_views.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
rero_mef/theme/views.py (1)

594-609: ⚠️ Potential issue | 🔴 Critical

Escape display before marking the joined chips as safe.

The new escaping only covers id_strs. display is still built from record data, appended unescaped at Line 607, and the whole string is wrapped in Markup at Line 609. A malicious authorized_access_point or source value will therefore render as trusted HTML.

🔒 Suggested fix
-                lines = [display] if display else []
-                lines.extend(str(c) for c in chips)
-                return Markup("<br>".join(lines))
+                lines = [escape(display)] if display else []
+                lines.extend(chips)
+                return Markup("<br>").join(lines)

As per coding guidelines, **/*.py: "Focus on: Security vulnerabilities and correctness issues; Performance problems and edge cases; Logic errors and architectural improvements."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 594 - 609, The variable display is
currently included raw into the safe Markup output causing XSS risk; before
building lines, escape display (e.g., replace use of display in lines with an
escaped version like escape(display) or Markup(escape(display))) so any
record-derived text (authorized_access_point/source) is sanitized, and then join
with the already-escaped/Markup chips as before; update the block that
constructs lines (the display/lines list) to use the escaped display value
instead of the raw display variable.
rero_mef/ext.py (1)

64-116: ⚠️ Potential issue | 🟠 Major

Reconcile all_mef membership instead of only appending aliases.

Line 106 only adds the currently resolved indexes to all_mef; nothing removes indices that no longer sit behind mef, concepts_mef, or places_mef. After a reindex/repoint, the alias can keep stale concrete indices and /all/mef will return duplicate or outdated hits. Build the desired target set first, then apply add/remove actions in one update_aliases() call.

As per coding guidelines, **/*.py: "Focus on: Security vulnerabilities and correctness issues; Performance problems and edge cases; Logic errors and architectural improvements."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/ext.py` around lines 64 - 116, The function ensure_all_mef_alias
currently only calls put_alias to add resolved concrete indexes to alias_name
("all_mef"), which leaves stale indexes attached; instead compute the desired
set by resolving each target in target_aliases
("mef","concepts_mef","places_mef") into concrete index names, compare it with
the current concrete indexes for alias_name (use
search_client.indices.get_alias(name=alias_name) to get existing members), then
build a single bulk change list of remove actions for indexes present but not
desired and add actions for desired but not present, and call
search_client.indices.update_aliases(body={"actions": [...]}) once to reconcile
membership atomically; use the existing try/except patterns around get_alias and
update_aliases and log warnings on failures (reference ensure_all_mef_alias,
target_aliases, alias_name, search_client.indices.get_alias,
search_client.indices.put_alias -> replace with
search_client.indices.update_aliases).
🧹 Nitpick comments (3)
rero_mef/monitoring/api.py (1)

215-236: Consider using .items() for cleaner iteration.

The loop iterates over endpoints keys and then accesses endpoints[doc_type] separately. Using .items() is slightly more efficient and idiomatic.

♻️ Suggested refactor
-        for doc_type in endpoints:
+        for doc_type, endpoint in endpoints.items():
             entity_cls = get_entity_class(doc_type)
             if not entity_cls:
                 continue
-            index = endpoints[doc_type].get("search_index", "")
+            index = endpoint.get("search_index", "")
             db_count = self.get_db_count(doc_type)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/monitoring/api.py` around lines 215 - 236, The loop over endpoints
iterates keys then indexes back into endpoints, which is non-idiomatic; change
the loop to iterate over endpoints.items() (e.g., for doc_type, cfg in
endpoints.items()) and then use cfg.get("search_index", "") instead of
endpoints[doc_type]. Also keep usages of get_entity_class(doc_type),
self.get_db_count(doc_type), self.get_es_count(index), mef_search() and
mef_index the same—just replace the inner access pattern to use the unpacked cfg
to make the code clearer and slightly more efficient.
tests/ui/test_ext.py (1)

35-43: Consider using a context manager for safer extension state restoration.

The manual save/restore pattern works but could be cleaner with a helper or unittest.mock.patch.dict.

♻️ Optional: Use patch.dict for safer extension manipulation
+from unittest.mock import patch
+
 def test_ensure_all_mef_alias_no_search_ext(app):
     """ensure_all_mef_alias skips when invenio-search is not loaded."""
     ext = REROMEFAPP.__new__(REROMEFAPP)
-    saved = app.extensions.pop("invenio-search", None)
-    try:
+    # Temporarily remove invenio-search extension
+    with patch.dict(app.extensions, {}, clear=False):
+        app.extensions.pop("invenio-search", None)
         ext.ensure_all_mef_alias(app)
-    finally:
-        if saved is not None:
-            app.extensions["invenio-search"] = saved
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/test_ext.py` around lines 35 - 43, The test
test_ensure_all_mef_alias_no_search_ext manually pops and restores
app.extensions which is error-prone; replace that pattern with a context manager
such as unittest.mock.patch.dict or a custom helper to temporarily modify
app.extensions around the call to REROMEFAPP.ensure_all_mef_alias(app) so the
original extensions dict is always restored even on error — locate the test
function and the use of REROMEFAPP.__new__(REROMEFAPP), the app.extensions
pop/restore block, and wrap the call to ensure_all_mef_alias(app) with
patch.dict(app.extensions, {...}, clear=False) or an equivalent context manager.
tests/api/test_all_mef_rest.py (1)

302-316: Move pytest import to module level.

pytest is imported inside functions on lines 304 and 321. While functional, module-level imports are conventional and avoid redundant import overhead.

♻️ Move pytest to module-level imports
 """Tests for all-MEF search and item API views (rero_mef/views.py)."""

 import json
+import pytest
 from unittest import mock

 # ... rest of file ...

 def test_ensure_all_mef_alias_index_check_raises(app):
     """Re-raise unexpected errors from the index existence check."""
-    import pytest
-
     from rero_mef.views import _ensure_all_mef_alias

Also applies to: 319-331

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/test_all_mef_rest.py` around lines 302 - 316, Move the local pytest
imports to the module level: remove the inline "import pytest" statements inside
test_ensure_all_mef_alias_index_check_raises and the nearby test function in the
same file, and add a single "import pytest" at the top-level imports of
tests/api/test_all_mef_rest.py so both tests use the module-level pytest import
instead of importing inside the function bodies.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/views.py`:
- Around line 41-59: The helper _endpoint_for_all_mef_pid currently probes
AgentMefRecord, ConceptMefRecord, PlaceMefRecord in order and returns the first
match, causing ambiguous PIDs (e.g., pid "1") to be routed incorrectly; change
it to collect all matching endpoints (call
record_cls.get_record_by_pid(pid_value) for each of AgentMefRecord,
ConceptMefRecord, PlaceMefRecord), and if exactly one match exists return that
endpoint, otherwise return None (fail closed) so duplicates do not cause silent
misrouting; do not rely on ordering of the endpoint_map.

In `@tests/ui/test_theme_views.py`:
- Around line 383-394: The test calls _add_row_from_data with only four
arguments but the function signature requires five (rows, key, value, labels,
entity_name); update the test_add_row_from_data_identified_by_skips_rero_ch test
to pass a valid entity_name (e.g., a string like "item" or None if allowed) as
the fifth argument when invoking _add_row_from_data so the call matches the
_add_row_from_data signature.

---

Duplicate comments:
In `@rero_mef/ext.py`:
- Around line 64-116: The function ensure_all_mef_alias currently only calls
put_alias to add resolved concrete indexes to alias_name ("all_mef"), which
leaves stale indexes attached; instead compute the desired set by resolving each
target in target_aliases ("mef","concepts_mef","places_mef") into concrete index
names, compare it with the current concrete indexes for alias_name (use
search_client.indices.get_alias(name=alias_name) to get existing members), then
build a single bulk change list of remove actions for indexes present but not
desired and add actions for desired but not present, and call
search_client.indices.update_aliases(body={"actions": [...]}) once to reconcile
membership atomically; use the existing try/except patterns around get_alias and
update_aliases and log warnings on failures (reference ensure_all_mef_alias,
target_aliases, alias_name, search_client.indices.get_alias,
search_client.indices.put_alias -> replace with
search_client.indices.update_aliases).

In `@rero_mef/theme/views.py`:
- Around line 594-609: The variable display is currently included raw into the
safe Markup output causing XSS risk; before building lines, escape display
(e.g., replace use of display in lines with an escaped version like
escape(display) or Markup(escape(display))) so any record-derived text
(authorized_access_point/source) is sanitized, and then join with the
already-escaped/Markup chips as before; update the block that constructs lines
(the display/lines list) to use the escaped display value instead of the raw
display variable.

---

Nitpick comments:
In `@rero_mef/monitoring/api.py`:
- Around line 215-236: The loop over endpoints iterates keys then indexes back
into endpoints, which is non-idiomatic; change the loop to iterate over
endpoints.items() (e.g., for doc_type, cfg in endpoints.items()) and then use
cfg.get("search_index", "") instead of endpoints[doc_type]. Also keep usages of
get_entity_class(doc_type), self.get_db_count(doc_type),
self.get_es_count(index), mef_search() and mef_index the same—just replace the
inner access pattern to use the unpacked cfg to make the code clearer and
slightly more efficient.

In `@tests/api/test_all_mef_rest.py`:
- Around line 302-316: Move the local pytest imports to the module level: remove
the inline "import pytest" statements inside
test_ensure_all_mef_alias_index_check_raises and the nearby test function in the
same file, and add a single "import pytest" at the top-level imports of
tests/api/test_all_mef_rest.py so both tests use the module-level pytest import
instead of importing inside the function bodies.

In `@tests/ui/test_ext.py`:
- Around line 35-43: The test test_ensure_all_mef_alias_no_search_ext manually
pops and restores app.extensions which is error-prone; replace that pattern with
a context manager such as unittest.mock.patch.dict or a custom helper to
temporarily modify app.extensions around the call to
REROMEFAPP.ensure_all_mef_alias(app) so the original extensions dict is always
restored even on error — locate the test function and the use of
REROMEFAPP.__new__(REROMEFAPP), the app.extensions pop/restore block, and wrap
the call to ensure_all_mef_alias(app) with patch.dict(app.extensions, {...},
clear=False) or an equivalent context manager.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6e49d320-b685-457b-ae4c-cb4b8ed1dbdc

📥 Commits

Reviewing files that changed from the base of the PR and between 188daed and 12a43dc.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (22)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_all_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
🚧 Files skipped from review as they are similar to previous changes (9)
  • tests/ui/agents/test_agents_tasks.py
  • tests/api/test_concepts_mef_rest.py
  • tests/ui/conftest.py
  • tests/ui/test_monitoring.py
  • tests/ui/places/test_places_api.py
  • tests/api/test_agents_mef_rest.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/api/test_entity_serializers.py
  • rero_mef/cli.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
rero_mef/theme/views.py (1)

607-609: ⚠️ Potential issue | 🔴 Critical

Unsafe final Markup composition still allows HTML injection via display.

display is inserted unescaped, then the whole joined string is wrapped as trusted HTML (Markup(...)). That reintroduces XSS risk even after escaping individual chips.

🔧 Suggested fix
-                lines = [display] if display else []
-                lines.extend(str(c) for c in chips)
-                return Markup("<br>".join(lines))
+                lines = [escape(display)] if display else []
+                lines.extend(chips)
+                return Markup("<br>").join(lines)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 607 - 609, The current return wraps a
joined string with Markup while inserting the raw variable display, allowing
HTML injection; update the logic in the function that builds lines (the
variables lines, display, chips, and the final Markup call) to ensure display is
properly escaped (e.g., use markupsafe.escape/Markup.escape on display) or
construct the final result by joining already-safe Markup parts (avoid wrapping
an unescaped string with Markup); in short, escape display before including it
in lines and only mark as safe the pieces you know are escaped/safe before
returning Markup("<br>".join(...)).
🧹 Nitpick comments (1)
tests/api/test_all_mef_rest.py (1)

131-165: Assert the pagination URLs, not just key presence.

rero_mef/views.py:263-274 builds these links with url_for(..., _external=True), so the new /api middleware can break the actual values without changing whether "next"/"prev" exist. Verifying the returned URLs here would catch wrong page numbers or a missing /api prefix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/test_all_mef_rest.py` around lines 131 - 165, The tests
test_all_mef_search_pagination_next and test_all_mef_search_pagination_prev only
check for presence of "next"/"prev" keys but not the actual URL values, so add
assertions that verify the returned link strings include the correct /api prefix
and correct page query (e.g. next should point to page=2 for the first test, and
prev should point to page=1 and next to page=3 for the second), either by
comparing body["links"]["next"/"prev"] to an expected URL built with Flask's
url_for(..., _external=True, page=...) or by asserting the exact path/query
substring ("/api/all/mef/?size=5&page=..."); this ensures the link-building code
in rero_mef.views (the function that constructs links with url_for) is producing
correct external URLs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/cli.py`:
- Around line 1159-1183: The current logic only builds "add" actions and never
removes stale concrete indices from the alias; update the behavior in the block
using current_search_client.indices.get_alias(name=alias_name) to read existing
members of alias_name, compute the set difference against the resolved targets
(from targets where exists_alias or exists succeeded), and construct a single
actions list containing both {"remove": {"index": old_index, "alias":
alias_name}} for any old_index not in the desired set and {"add": {"index":
index_name, "alias": alias_name}} for desired indices, then call
current_search_client.indices.update_aliases(body={"actions": actions}) once so
alias_name is reconciled atomically; use the existing symbols
current_search_client.indices.exists_alias,
current_search_client.indices.exists, current_search_client.indices.get_alias
and update_aliases to locate and modify the code.

In `@rero_mef/ext.py`:
- Around line 93-110: The current code swallows all exceptions from
search_client.indices.get_alias and put_alias (variables: alias_mapping, target,
alias_name), turning connection/auth/backend errors into benign fallbacks;
change the error handling to catch only the Elasticsearch "not found" exception
(e.g. elasticsearch.exceptions.NotFoundError or the client's equivalent) for
get_alias and treat that case as alias_mapping = {}, but let any other Exception
propagate or be logged and re-raised so real failures surface; similarly, for
put_alias catch and handle only expected resource-not-found or alias-exists
errors, and otherwise re-raise the error instead of swallowing it.

In `@tests/ui/conftest.py`:
- Around line 53-55: The current logic only writes _WEBPACK_MANIFEST_STUBS when
manifest_path does not exist, leaving existing but incomplete manifest.json
unchanged; change the behavior in the manifest creation block to open
manifest_path if it exists, load its JSON (fall back to {} on parse errors),
merge any missing keys from _WEBPACK_MANIFEST_STUBS into that dict, and then
write the merged result back to manifest_path (using the same encoding and
json.dump) so existing entries are preserved while stub keys are added;
reference manifest_path and _WEBPACK_MANIFEST_STUBS to locate and update the
code.
- Around line 44-50: The fallback for INVENIO_INSTANCE_PATH in conftest.py can
leak state; instead create a session-local temp instance path via pytest's
tmp_path_factory before computing dist_dir: use tmp_path_factory.mktemp or
tmp_path_factory.getbasetemp to produce a directory, set
os.environ["INVENIO_INSTANCE_PATH"] (or os.environ.setdefault) to that temp
path, then build dist_dir from that env var as done now (references: dist_dir,
INVENIO_INSTANCE_PATH, tmp_path_factory). Ensure this setup runs at session
scope so tests share the isolated instance path and avoid using
os.getcwd()/.venv/var/instance.

---

Duplicate comments:
In `@rero_mef/theme/views.py`:
- Around line 607-609: The current return wraps a joined string with Markup
while inserting the raw variable display, allowing HTML injection; update the
logic in the function that builds lines (the variables lines, display, chips,
and the final Markup call) to ensure display is properly escaped (e.g., use
markupsafe.escape/Markup.escape on display) or construct the final result by
joining already-safe Markup parts (avoid wrapping an unescaped string with
Markup); in short, escape display before including it in lines and only mark as
safe the pieces you know are escaped/safe before returning
Markup("<br>".join(...)).

---

Nitpick comments:
In `@tests/api/test_all_mef_rest.py`:
- Around line 131-165: The tests test_all_mef_search_pagination_next and
test_all_mef_search_pagination_prev only check for presence of "next"/"prev"
keys but not the actual URL values, so add assertions that verify the returned
link strings include the correct /api prefix and correct page query (e.g. next
should point to page=2 for the first test, and prev should point to page=1 and
next to page=3 for the second), either by comparing body["links"]["next"/"prev"]
to an expected URL built with Flask's url_for(..., _external=True, page=...) or
by asserting the exact path/query substring ("/api/all/mef/?size=5&page=...");
this ensures the link-building code in rero_mef.views (the function that
constructs links with url_for) is producing correct external URLs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b50de19a-88d9-4592-abb0-4e34ee8e510b

📥 Commits

Reviewing files that changed from the base of the PR and between 12a43dc and 268af7a.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (22)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_all_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
✅ Files skipped from review due to trivial changes (6)
  • tests/ui/test_monitoring.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/test_ext.py
  • rero_mef/filter.py
  • tests/ui/test_theme_views.py
  • rero_mef/views.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/ui/agents/test_agents_tasks.py
  • tests/api/test_concepts_mef_rest.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/api/test_places_mef_rest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_entity_serializers.py

@rerowep rerowep force-pushed the wep-ui branch 2 times, most recently from 0ac827f to 04b336c Compare March 31, 2026 18:17
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
tests/ui/conftest.py (1)

53-55: ⚠️ Potential issue | 🟡 Minor

Patch existing manifest.json instead of only creating it when missing.

Line 53 only initializes stubs on first creation. If manifest.json already exists but misses keys, tests can still fail to render templates.

Suggested fix
     manifest_path = os.path.join(dist_dir, "manifest.json")
-    if not os.path.exists(manifest_path):
-        with open(manifest_path, "w", encoding="utf-8") as f:
-            json.dump(_WEBPACK_MANIFEST_STUBS, f)
+    manifest = {}
+    if os.path.exists(manifest_path):
+        try:
+            with open(manifest_path, encoding="utf-8") as f:
+                manifest = json.load(f)
+        except json.JSONDecodeError:
+            manifest = {}
+    for key, value in _WEBPACK_MANIFEST_STUBS.items():
+        manifest.setdefault(key, value)
+    with open(manifest_path, "w", encoding="utf-8") as f:
+        json.dump(manifest, f)
As per coding guidelines, `**/*.py`: “Focus on: Security vulnerabilities and correctness issues; Performance problems and edge cases; Logic errors and architectural improvements.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/conftest.py` around lines 53 - 55, Currently the code only writes
_WEBPACK_MANIFEST_STUBS to manifest_path when the file is missing; instead
always open manifest_path (create if missing), load its JSON if present, merge
any missing keys from _WEBPACK_MANIFEST_STUBS into the loaded dict, and write
the merged dict back using json.dump with encoding="utf-8"; update the logic
around manifest_path and _WEBPACK_MANIFEST_STUBS so the existing manifest is
patched (not overwritten) to include missing stub keys while preserving existing
values.
rero_mef/theme/views.py (1)

607-609: ⚠️ Potential issue | 🔴 Critical

Escape display before producing safe HTML output.

Line 607 includes unescaped record content in lines, and Line 609 marks the full joined string as safe HTML. That allows HTML/script injection through metadata values.

Suggested fix
-                lines = [display] if display else []
-                lines.extend(str(c) for c in chips)
-                return Markup("<br>".join(lines))
+                lines = [escape(display)] if display else []
+                lines.extend(chips)
+                return Markup("<br>").join(lines)
In MarkupSafe, does `Markup("<br>".join(lines))` mark all joined content as trusted, and is `Markup("<br>").join([escaped_or_markup_items])` the recommended safe pattern?

As per coding guidelines, **/*.py: “Focus on: Security vulnerabilities and correctness issues; Performance problems and edge cases; Logic errors and architectural improvements.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 607 - 609, The current code collects
raw values in lines (using display and chips) then marks the joined string as
safe via Markup("<br>".join(lines)), which allows HTML/script injection;
instead, import and use markupsafe.escape to escape display and each chip (e.g.,
call escape(display) and escape(str(c)) when building the list), then join the
escaped items with Markup("<br>").join(escaped_lines) so the separator is
trusted but content remains escaped; update the logic around the lines variable,
the display handling, the generator for chips, and the final Markup join
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rero_mef/all_mef.py`:
- Line 22: Replace the mutable class attribute facets = {} with an immutable
mapping to avoid cross-instance mutation: import types and set Meta.facets =
types.MappingProxyType({}) (or use an equivalent immutable mapping) in
rero_mef.all_mef.Meta (and apply the same pattern to the other Meta classes that
define facets) so that consumers can read facets but cannot mutate the shared
mapping.

In `@rero_mef/cli.py`:
- Around line 1177-1181: The code currently catches all exceptions around
current_search_client.indices.get_alias(name=alias_name), which hides real
errors; change the except Exception to catch the Elasticsearch "NotFoundError"
(elasticsearch.exceptions.NotFoundError) when the alias does not exist, import
that exception, set current_members = set() in that specific except block, and
let other exceptions propagate or be logged/raised so real backend failures
aren't swallowed (reference current_search_client.indices.get_alias, alias_name,
and current_members).

---

Duplicate comments:
In `@rero_mef/theme/views.py`:
- Around line 607-609: The current code collects raw values in lines (using
display and chips) then marks the joined string as safe via
Markup("<br>".join(lines)), which allows HTML/script injection; instead, import
and use markupsafe.escape to escape display and each chip (e.g., call
escape(display) and escape(str(c)) when building the list), then join the
escaped items with Markup("<br>").join(escaped_lines) so the separator is
trusted but content remains escaped; update the logic around the lines variable,
the display handling, the generator for chips, and the final Markup join
accordingly.

In `@tests/ui/conftest.py`:
- Around line 53-55: Currently the code only writes _WEBPACK_MANIFEST_STUBS to
manifest_path when the file is missing; instead always open manifest_path
(create if missing), load its JSON if present, merge any missing keys from
_WEBPACK_MANIFEST_STUBS into the loaded dict, and write the merged dict back
using json.dump with encoding="utf-8"; update the logic around manifest_path and
_WEBPACK_MANIFEST_STUBS so the existing manifest is patched (not overwritten) to
include missing stub keys while preserving existing values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 285fbe11-e0e2-478d-8478-36b230f53f3b

📥 Commits

Reviewing files that changed from the base of the PR and between 268af7a and 0ac827f.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (22)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_all_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
✅ Files skipped from review due to trivial changes (6)
  • tests/ui/concepts/test_concepts_api.py
  • tests/api/test_concepts_mef_rest.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/api/test_entity_serializers.py
  • tests/ui/test_monitoring.py
  • tests/api/test_all_mef_rest.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/ui/agents/test_agents_tasks.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/places/test_places_api.py
  • rero_mef/filter.py
  • rero_mef/ext.py
  • rero_mef/views.py
  • rero_mef/monitoring/api.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
rero_mef/theme/views.py (1)

606-609: ⚠️ Potential issue | 🟠 Major

Escape display before joining with Markup.

The display variable (from Line 594) contains user-controlled values (access_point, source) that are not escaped. Marking the entire joined string as Markup at Line 609 allows XSS.

Proposed fix
-                lines = [display] if display else []
+                lines = [escape(display)] if display else []
                 lines.extend(str(c) for c in chips)
                 return Markup("<br>".join(lines))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rero_mef/theme/views.py` around lines 606 - 609, The code currently builds
lines = [display] if display else [] and then returns
Markup("<br>".join(lines)), which marks unescaped user-controlled display as
safe and can lead to XSS; update this to escape display before joining (e.g.,
use escape(display) when populating lines) so that the string passed into Markup
is already escaped; reference the variables/functions display, chips, lines and
the Markup call when making the change.
🧹 Nitpick comments (1)
tests/ui/test_theme_views.py (1)

110-127: Add a negative Referer test to prevent redirect/reflection regressions.

This currently validates only the local Referer case. Add one case with an external Referer (for example, https://evil.example) and assert fallback behavior (and/or that the external URL is not surfaced in the rendered response). That will protect the security boundary this test is trying to enforce.

🔒 Suggested test addition
+def test_agents_preview_with_external_referrer_is_ignored(
+    client,
+    agent_mef_record,
+    agent_viaf_record,
+    agent_gnd_record,
+    agent_idref_record,
+    agent_rero_record,
+):
+    """External Referer must not be used as back URL."""
+    pid = agent_mef_record.get("pid")
+    external_referrer = "https://evil.example/phishing"
+    res = client.get(
+        url_for("rero_mef.agents_preview", pid_value=pid),
+        headers={"Referer": external_referrer},
+    )
+    assert res.status_code == 200
+    assert external_referrer not in res.get_data(as_text=True)

As per coding guidelines, focus on security vulnerabilities and correctness issues.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/ui/test_theme_views.py` around lines 110 - 127, Add a negative Referer
test alongside test_agents_preview_with_local_referrer to ensure external
Referer headers are not reflected: create a new test (e.g.,
test_agents_preview_with_external_referrer) that calls
client.get(url_for("rero_mef.agents_preview", pid_value=pid),
headers={"Referer": "https://evil.example"}), assert res.status_code == 200, and
assert that the external URL is not present in res.get_data(as_text=True) and
that the response falls back to the safe/default back-URL behavior used by the
view (i.e., the same fallback value the view uses when no local Referer is
provided).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@rero_mef/theme/views.py`:
- Around line 606-609: The code currently builds lines = [display] if display
else [] and then returns Markup("<br>".join(lines)), which marks unescaped
user-controlled display as safe and can lead to XSS; update this to escape
display before joining (e.g., use escape(display) when populating lines) so that
the string passed into Markup is already escaped; reference the
variables/functions display, chips, lines and the Markup call when making the
change.

---

Nitpick comments:
In `@tests/ui/test_theme_views.py`:
- Around line 110-127: Add a negative Referer test alongside
test_agents_preview_with_local_referrer to ensure external Referer headers are
not reflected: create a new test (e.g.,
test_agents_preview_with_external_referrer) that calls
client.get(url_for("rero_mef.agents_preview", pid_value=pid),
headers={"Referer": "https://evil.example"}), assert res.status_code == 200, and
assert that the external URL is not present in res.get_data(as_text=True) and
that the response falls back to the safe/default back-URL behavior used by the
view (i.e., the same fallback value the view uses when no local Referer is
provided).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d769238f-3444-4e9f-91c3-bc86b318132e

📥 Commits

Reviewing files that changed from the base of the PR and between 0ac827f and 04b336c.

⛔ Files ignored due to path filters (19)
  • .github/workflows/continuous-integration-test.yml is excluded by none and included by none
  • pyproject.toml is excluded by none and included by none
  • rero_mef/config.py is excluded by !rero_mef/config.py and included by rero_mef/**/*.py
  • rero_mef/theme/assets/scss/rero_mef/mef.scss is excluded by none and included by none
  • rero_mef/theme/static/templates/invenio_search_ui/searchbar.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/agents_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/all_mef_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/concepts_results.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/count_inline.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/facets.html is excluded by none and included by none
  • rero_mef/theme/static/templates/rero_mef/places_results.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/footer.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/frontpage.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/header.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_detail.html is excluded by none and included by none
  • rero_mef/theme/templates/rero_mef/mef_search.html is excluded by none and included by none
  • scripts/setup is excluded by none and included by none
  • scripts/test is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (22)
  • rero_mef/all_mef.py
  • rero_mef/cli.py
  • rero_mef/ext.py
  • rero_mef/filter.py
  • rero_mef/monitoring/api.py
  • rero_mef/query.py
  • rero_mef/theme/views.py
  • rero_mef/views.py
  • tests/api/conftest.py
  • tests/api/test_agents_mef_rest.py
  • tests/api/test_all_mef_rest.py
  • tests/api/test_concepts_mef_rest.py
  • tests/api/test_entity_serializers.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_tasks.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/conftest.py
  • tests/ui/places/test_places_api.py
  • tests/ui/test_ext.py
  • tests/ui/test_monitoring.py
  • tests/ui/test_theme_views.py
💤 Files with no reviewable changes (1)
  • rero_mef/query.py
✅ Files skipped from review due to trivial changes (4)
  • tests/ui/agents/test_agents_tasks.py
  • tests/api/test_places_mef_rest.py
  • tests/ui/agents/test_agents_viaf_api.py
  • tests/api/test_entity_serializers.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • tests/ui/places/test_places_api.py
  • tests/ui/concepts/test_concepts_api.py
  • tests/ui/test_monitoring.py
  • tests/ui/conftest.py
  • rero_mef/monitoring/api.py
  • rero_mef/filter.py
  • tests/api/test_agents_mef_rest.py

* Add dedicated list and preview pages for agents, concepts, places,
  and all_mef. Improve search UX with sort and size controls, richer
  facets, and entity-specific result templates with thumbnails.
* Add all_mef API and alias handling support, including setup/CLI hooks
  to create the alias and route compatibility for all/mef item access.
* Refine detail rendering and navigation: hide data.rero.ch identifiedBy
  values, simplify header/button labels, and preserve return navigation
  to the originating search page.

Co-Authored-by: Peter Weber <peter.weber@rero.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants