Reopen - Option to include security_repo_status in list_models API for bulk queries

**Is your feature request related to a problem? Please describe.**

Our team needs to programmatically assess the security status of a large number of Hugging Face models every day. The list_models API in huggingface_hub does not return the security_repo_status for the models it lists; the field is left at its default value of None ([source](https://github.com/huggingface/huggingface_hub/issues/2649?content_ref=currently+the+default+value+none+is+returned+but+i+would+like+to+have+security_repo_status+included+and+returned)). This was previously discussed in issue #2649, which was closed as "not planned" ([source](https://github.com/huggingface/huggingface_hub/issues/2649?content_ref=closed+this+as+not+plannedon+nov+4+2024)).

The suggested workaround is to iterate through the models returned by list_models and call hf_api.model_info(model_id, securityStatus=True) for each one to retrieve its security_repo_status ([source](https://github.com/huggingface/huggingface_hub/issues/2649?content_ref=for+now+you+can+work+around+this+by+iterating+through+the+list+of+models+and+call+hfapi+model_info+your_model_id+securitystatus+true+for+each+model)). This works for a small number of models, but it is not feasible for our use case, which involves scanning a very large volume of models daily. This approach leads to:

- Significant performance degradation: Making thousands of individual API calls is extremely slow.
- Increased risk of rate limiting: Bulk individual calls are likely to hit API rate limits, disrupting our daily scans.
- Inefficient resource utilization: Each model costs a full request and response just to read a single field.

**Describe the solution you'd like**

We would like to reopen the request for an option within the list_models API to directly include the security_repo_status for each model in the response. Ideally, this could be an optional parameter, for example include_security_status=True, to maintain backward compatibility and allow users to request this information only when needed.

This would be similar to how model_info can retrieve this information with the securityStatus=True parameter ([source](https://github.com/huggingface/huggingface_hub/issues/2649?content_ref=however+the+individual+model+endpoint+info+api+models+model_id+includes+security+status+when+requested+with+securitystatus+true)).

**Describe alternatives you've considered**

The primary alternative, as mentioned, is to iterate through the list of models from list_models() and then call model_info() for each one.

```python
from huggingface_hub import HfApi

hf_api = HfApi()

# list_models returns a lazy iterator; this can yield thousands of models
models = hf_api.list_models(filter="some_filter")

security_statuses = {}
for model in models:
    try:
        # One extra HTTP request per model: slow and prone to rate limits at scale
        model_details = hf_api.model_info(model.id, securityStatus=True)
        security_statuses[model.id] = model_details.security_repo_status
    except Exception as e:
        # Handle errors, retries, rate limits etc.
        security_statuses[model.id] = f"Error fetching status: {e}"

# Process security_statuses
```

This approach is not scalable for our daily operational needs due to the performance and rate-limiting issues described above.
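
For completeness, the loop above can be softened with bounded concurrency and retry with exponential backoff. The following is only a minimal sketch under stated assumptions: the helper names (fetch_with_retry, fetch_security_statuses, make_hub_fetcher) and the default worker, retry, and delay values are hypothetical choices made for illustration, not part of huggingface_hub. It still issues one request per model, so it reduces wall-clock time but does not remove the rate-limit exposure.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(fetch_one, model_id, max_retries=3, base_delay=1.0):
    """Call fetch_one(model_id), retrying with exponential backoff on any error."""
    for attempt in range(max_retries + 1):
        try:
            return fetch_one(model_id)
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: let the caller record the failure
            time.sleep(base_delay * (2 ** attempt))


def fetch_security_statuses(model_ids, fetch_one, max_workers=4,
                            max_retries=3, base_delay=1.0):
    """Return {model_id: status}, or {model_id: Exception} for models that kept failing."""
    def safe_fetch(model_id):
        try:
            return fetch_with_retry(fetch_one, model_id, max_retries, base_delay)
        except Exception as exc:
            return exc  # keep going; one bad model should not abort the scan

    ids = list(model_ids)
    # The pool size bounds how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_fetch, ids))
    return dict(zip(ids, results))


def make_hub_fetcher():
    """Build a fetch_one callable backed by the Hub (one HTTP request per call)."""
    from huggingface_hub import HfApi  # lazy import keeps the helpers above dependency-free

    api = HfApi()
    return lambda model_id: api.model_info(
        model_id, securityStatus=True
    ).security_repo_status
```

A caller would pass the model ids from list_models together with make_hub_fetcher() to fetch_security_statuses. Failed lookups come back as exception objects in the result dict, so they can be counted or retried later instead of silently becoming strings.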

**Additional context**

We understand from the previous discussion in #2649 that including security_repo_status directly in the /api/models response was deemed infeasible because the field is retrieved on-demand and could slow down the API or trigger rate limiting ([source](https://github.com/huggingface/huggingface_hub/issues/2649)).

However, for users and organizations that regularly perform security assessments across a broad set of models, the lack of a bulk retrieval mechanism for security status is a significant operational challenge. If the feature is opt-in, we believe the need to gather this information efficiently for a large number of models outweighs those concerns.

We believe that an optional parameter to include this information would let users with bulk processing needs benefit from it, while users who do not need it would not be affected. We urge the Hugging Face team to reconsider this feature request, perhaps by exploring backend optimizations or alternative implementations that could make it feasible for bulk queries, even if that means a slightly different endpoint or a paginated or asynchronous way of fetching this specific data in bulk.

The ability to efficiently assess model security at scale is crucial for maintaining a secure AI ecosystem.

Reference to previous issue: [#2649 (Option to get security status with hf_api)](https://github.com/huggingface/huggingface_hub/issues/2649)
