Add group sparsity support

## Description
Implement group sparsity so that entire structures (conv filters, attention heads, channels) are pruned as a unit, rather than individual weights.

## Motivation
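Structured pruning maps better onto hardware than unstructured pruning. Removing whole structures leaves smaller dense tensors that ordinary dense kernels can run directly, whereas pruning individual weights leaves large sparse tensors that only speed up when sparse kernels are available.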
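To make the size argument concrete, here is a small arithmetic sketch in plain Python. The helper names `conv_weight_count` and `compact_filters` are hypothetical, and the figure of 32 pruned filters is purely illustrative, not a measured result.

```python
def conv_weight_count(in_channels, out_channels, kernel_size):
    """Number of weights in a square-kernel conv layer, ignoring bias."""
    return out_channels * in_channels * kernel_size * kernel_size


def compact_filters(filters, gates):
    """Drop filters whose gate is exactly zero; what remains is a smaller dense list."""
    return [f for f, g in zip(filters, gates) if g != 0.0]


# Illustrative numbers: a 64 -> 128 conv with 3x3 kernels, 32 filters pruned.
full = conv_weight_count(64, 128, 3)         # 128 * 64 * 3 * 3
pruned = conv_weight_count(64, 128 - 32, 3)  # 96 * 64 * 3 * 3

# Two of three toy filters survive; the survivors stay dense.
kept = compact_filters([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [1.0, 0.0, 0.5])
```

The pruned layer is still an ordinary dense convolution, just with fewer output filters, so no sparse storage format is needed.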

## Proposed Implementation
- Group gates for conv filters: a single gate per filter, shared by all of that filter's weights
- Group gates for attention heads: a single gate per head
- Group gates for channels: a single gate per channel
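The per-group gating above could be sketched roughly as follows, in plain Python so it does not depend on the library's internals. This assumes the hard concrete gate from the L0 regularization literature with commonly used defaults (stretch limits -0.1 and 1.1, temperature 2/3); those values and the names `deterministic_gate`, `expected_group_l0`, and `gate_filters` are assumptions for illustration, not part of the proposed API.

```python
import math

# Assumed hard concrete hyperparameters (common defaults, not taken from this issue).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def deterministic_gate(log_alpha):
    """Test-time gate value in [0, 1] for one group."""
    stretched = sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA
    return min(1.0, max(0.0, stretched))


def expected_group_l0(log_alphas, group_size):
    """Expected number of active weights: P(gate != 0) times weights per group."""
    shift = BETA * math.log(-GAMMA / ZETA)
    return sum(sigmoid(la - shift) * group_size for la in log_alphas)


def gate_filters(filters, log_alphas):
    """Scale every weight in filter i by the single gate belonging to filter i."""
    gates = [deterministic_gate(la) for la in log_alphas]
    return [[w * g for w in f] for f, g in zip(filters, gates)]


# Three flattened toy filters, one gate parameter per filter.
filters = [[0.5, -1.0], [2.0, 3.0], [1.0, 1.0]]
log_alphas = [-6.0, 6.0, 0.0]
gated = gate_filters(filters, log_alphas)
```

Because one gate multiplies the whole group, a gate of zero removes the entire filter at once. The penalty counts `group_size` weights per open gate, which is what pushes whole groups toward zero rather than scattered individual weights.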

## Example
```python
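# `structured` is the proposed new argument; it selects which group shares one gate.

# Prune entire conv filters (one gate per filter)
conv = L0Conv2d(64, 128, 3, structured='filter')

# Prune entire attention heads (one gate per head, 8 gates here)
attn = L0MultiheadAttention(embed_dim=512, num_heads=8, structured='head')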
```

## References
- Section 3.2 of the L0 paper on structured sparsity
