feat: allow folder-specific filters #9

Merged
robin-cls merged 3 commits into main from query_layout_filters on May 13, 2026

Conversation

@robin-cls
Collaborator

Until now, list_files, query and filter_values did not support folder-specific filters. However, some datasets carry metadata in their folder names that is not present in the file name. This is the case for the L4-KaRIn dataset, which has the version number in the folder name but not in the file name.

This PR allows the intermediate nodes declared in the layouts (these nodes represent folders) to be used as filters in the list_files, map, query and filter_values methods. Instead of using only the fields of the file name convention, we now aggregate the fields from all conventions declared in the layouts.
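As a rough illustration of the aggregation described above, the sketch below collects the filterable fields from every convention of every layout. The `Convention` and `Layout` classes, the helper names, and the field names (`version`, `cycle`, `pass`) are hypothetical stand-ins chosen for this example; they are not the library's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Convention:
    """Hypothetical stand-in: the fields parsed from one path level."""
    name: str
    fields: frozenset


@dataclass(frozen=True)
class Layout:
    """Hypothetical stand-in: folder conventions plus a file name convention."""
    folders: tuple
    filename: Convention


def filterable_fields(layouts):
    """Union of the fields declared by every convention of every layout."""
    fields = set()
    for layout in layouts:
        for convention in (*layout.folders, layout.filename):
            fields |= convention.fields
    return fields


def folder_only_fields(layouts):
    """Fields declared by a folder convention but by no file name convention."""
    in_folders, in_filenames = set(), set()
    for layout in layouts:
        in_filenames |= layout.filename.fields
        for convention in layout.folders:
            in_folders |= convention.fields
    return in_folders - in_filenames


# Two layouts sharing a file name convention; only one has a version folder.
filename = Convention("file", frozenset({"cycle", "pass"}))
versioned = Layout((Convention("version_dir", frozenset({"version"})),), filename)
flat = Layout((), filename)
```

With these assumptions, `filterable_fields([versioned, flat])` contains `version` even though only one layout declares it, which is why the per-layout cases listed in this description are needed.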

Until now, layouts were assumed to share the file name convention, which meant that every filtering field could be applied to every layout. This assumption no longer holds. We need new behaviors for filters that target intermediate nodes when those nodes do not exist. The following cases were implemented:

  • A folder-specific filter is given but the layouts are not enabled -> the filter is ignored and a warning is raised
  • A folder-specific filter is given, the layouts are enabled and the filter is present in all the layouts -> the query should return successfully
  • A folder-specific filter is given, the layouts are enabled and the filter is not present in all layouts -> only the layouts that declare the field are selected. If the query is constrained to the part of the file system covered by the selected layouts, it should return successfully. If the query is not sufficiently constrained, the layout decimation will lead to a LayoutMismatchError. In this case, the exception is annotated with the folder-specific filters that may have broken the query.
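The three cases above can be sketched roughly as follows. This is a simplified model, not the merged implementation: layouts are reduced to a dict mapping a hypothetical layout name to the folder fields it declares, `select_layouts` and `check_entry` are made-up helper names, and `LayoutMismatchError` is redefined locally as a stand-in for the library's exception.

```python
import warnings


class LayoutMismatchError(Exception):
    """Local stand-in for the error named above, not the library's own class."""


def select_layouts(layouts, folder_filters, layouts_enabled=True):
    """Pick the layouts usable with the given folder-specific filters.

    layouts: dict mapping a layout name to the set of folder fields it declares.
    folder_filters: dict mapping a folder field to the requested value.
    Returns (selected layout names, folder filters that remain active).
    """
    if not layouts_enabled:
        # Case 1: without layouts, folder fields cannot be parsed; warn and ignore.
        if folder_filters:
            warnings.warn(
                f"Layouts are disabled, ignoring folder filters: {sorted(folder_filters)}"
            )
        return [], {}
    # Cases 2 and 3: keep only the layouts that declare every requested field.
    wanted = set(folder_filters)
    selected = [name for name, fields in layouts.items() if wanted <= fields]
    return selected, dict(folder_filters)


def check_entry(path, layout_name, selected, folder_filters):
    """Raise an annotated error when a scanned path belongs to a dropped layout."""
    if layout_name not in selected:
        raise LayoutMismatchError(
            f"{path} does not match any selected layout; "
            f"folder-specific filters that may be responsible: {sorted(folder_filters)}"
        )
    return path
```

In this model, a query restricted to paths of the selected layouts never reaches the `raise`, while an unconstrained scan that walks into a dropped layout fails with a message naming the suspect filters.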

@robin-cls robin-cls merged commit c0d9366 into main May 13, 2026
7 checks passed
@robin-cls robin-cls deleted the query_layout_filters branch May 26, 2026 14:43