Feature Request: Tagging System for File Collections (Group & User Workspaces) #282

Open
@paullizer

Summary

Implement a comprehensive tagging system for organizing and filtering file collections in both group and user workspaces. Tags are stored as objects containing both the tag name and the associated group or user ID, ensuring clear scoping and efficient filtering. Tags are assignable to both documents and their chunks, and are always scoped to their creator (user or group).


Motivation

As the number of documents and chunks increases, users and groups need a scalable way to organize, search, and filter files. Tagging enables logical grouping and focused queries, reducing backend load and improving frontend usability. This is especially important as data sets grow and documents are shared across groups or managed by individual users.


Requirements & Code Integration

1. Data Model & Storage

  • Tag Structure:

    • Tags are stored as objects in a tags array within each document and each chunk, e.g.:

      "tags": [
        { "tag_name": "ProjectX", "group_id": "abc123" }
      ]

      or

      "tags": [
        { "tag_name": "Urgent", "user_id": "user456" }
      ]
    • Tags are always scoped to their creator (user or group) and are not shared.

    • User documents and their chunks are only associated with users; group documents and their chunks are only associated with groups.

  • Containers & Indexes:

    • User documents are stored in the DOCUMENTS Cosmos container and indexed in the user-index.
    • Group documents are stored in the GROUP_DOCUMENTS Cosmos container and indexed in the group-index.
    • Chunks for user documents are indexed in the user-index; chunks for group documents are indexed in the group-index.
  • Tag Management:

    • Group tags are managed in the group's Cosmos record; user tags are managed in the user's profile record.
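A minimal sketch of how the tag objects described above might be built and attached, written in Python on the assumption that the backend is Python. The helper names `make_tag` and `add_tag` are hypothetical and not part of the existing codebase; the only rule they encode from this issue is that a tag carries a `tag_name` plus exactly one of `user_id` or `group_id`.

```python
from typing import Optional


def make_tag(tag_name: str, *, user_id: Optional[str] = None,
             group_id: Optional[str] = None) -> dict:
    """Build a tag object scoped to exactly one owner (user or group)."""
    name = tag_name.strip()
    if not name:
        raise ValueError("tag_name must not be empty")
    # A tag belongs to a user or to a group, never both and never neither.
    if (user_id is None) == (group_id is None):
        raise ValueError("provide exactly one of user_id or group_id")
    if user_id is not None:
        return {"tag_name": name, "user_id": user_id}
    return {"tag_name": name, "group_id": group_id}


def add_tag(item: dict, tag: dict) -> dict:
    """Return a copy of a document or chunk with the tag appended at most once."""
    tags = list(item.get("tags", []))
    if tag not in tags:
        tags.append(tag)
    return {**item, "tags": tags}
```

Returning a copy from `add_tag` keeps the original record untouched until the caller decides to persist it; whether tags should be de-duplicated this way is a design choice the issue does not specify.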

2. API Endpoints

3. Frontend Integration

4. Settings & Permissions

5. Filtering & Querying

6. Migration & Backward Compatibility

  • Provide migration scripts for both user and group Cosmos DB containers to add the new tag object structure to documents and chunks.
  • Ensure all existing document and chunk endpoints remain backward compatible.
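One possible shape for the migration step, sketched in Python under the assumption that backfilling means giving every existing record an empty `tags` array. The functions `ensure_tags_field` and `migrate` are hypothetical; `items` and `save` stand in for whatever read and write calls the real script would use against the user and group containers.

```python
from typing import Callable, Iterable


def ensure_tags_field(item: dict) -> bool:
    """Add an empty tags array if it is missing. Returns True when the item changed."""
    if isinstance(item.get("tags"), list):
        return False  # already migrated, leave existing tags alone
    item["tags"] = []
    return True


def migrate(items: Iterable[dict], save: Callable[[dict], None]) -> int:
    """Backfill tags on every item that lacks it and return the number updated."""
    updated = 0
    for item in items:
        if ensure_tags_field(item):
            save(item)  # hypothetical write-back, e.g. an upsert into the container
            updated += 1
    return updated
```

Because already-migrated records are skipped, the script can be re-run safely, and readers that ignore the `tags` field keep working, which is one way to approach the backward-compatibility requirement.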

7. Testing

  • Add/extend tests for tag CRUD, assignment, filtering, and permissions for both user and group document and chunk flows.

Example Data Model (Cosmos DB)

User document:

{
  "document_id": "doc101",
  "user_id": "userA",
  "tags": [
    { "tag_name": "Supple", "user_id": "userA" }
  ]
}

User document chunk:

{
  "chunk_id": "chunk123",
  "document_id": "doc101",
  "tags": [
    { "tag_name": "Supple", "user_id": "userA" }
  ]
}

Group document:

{
  "document_id": "doc789",
  "group_id": "groupA",
  "tags": [
    { "tag_name": "Finance", "group_id": "groupA" }
  ]
}

Group document chunk:

{
  "chunk_id": "chunk456",
  "document_id": "doc789",
  "tags": [
    { "tag_name": "Finance", "group_id": "groupA" }
  ]
}
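To illustrate how records shaped like the examples above could be filtered by tag within one scope, here is a small in-memory sketch in Python. `filter_by_tag` is a hypothetical helper, and exact matching on the whole tag object is an assumption made for simplicity.

```python
from typing import Iterable, List


def filter_by_tag(items: Iterable[dict], tag_name: str,
                  scope_key: str, scope_id: str) -> List[dict]:
    """Return the items carrying the given tag for the given user or group."""
    if scope_key not in ("user_id", "group_id"):
        raise ValueError("scope_key must be 'user_id' or 'group_id'")
    # A tag matches only if both its name and its owner match.
    wanted = {"tag_name": tag_name, scope_key: scope_id}
    return [item for item in items if wanted in item.get("tags", [])]
```

For example, `filter_by_tag(chunks, "Finance", "group_id", "groupA")` would keep the group chunk shown above and drop a chunk tagged "Finance" by a different group or by a user.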

Mermaid Diagram

erDiagram
    USER {
      string user_id
    }
    GROUP {
      string group_id
    }
    DOCUMENT {
      string document_id
      string user_id
    }
    GROUP_DOCUMENT {
      string document_id
      string group_id
    }
    DOCUMENT_CHUNK {
      string chunk_id
      string document_id
    }
    GROUP_DOCUMENT_CHUNK {
      string chunk_id
      string document_id
    }
    USER_TAG {
      string tag_name
      string user_id
    }
    GROUP_TAG {
      string tag_name
      string group_id
    }

    USER ||--o{ DOCUMENT : "owns"
    GROUP ||--o{ GROUP_DOCUMENT : "owns"
    DOCUMENT ||--o{ DOCUMENT_CHUNK : "has"
    GROUP_DOCUMENT ||--o{ GROUP_DOCUMENT_CHUNK : "has"
    DOCUMENT ||--o{ USER_TAG : "has"
    GROUP_DOCUMENT ||--o{ GROUP_TAG : "has"
    DOCUMENT_CHUNK ||--o{ USER_TAG : "has"
    GROUP_DOCUMENT_CHUNK ||--o{ GROUP_TAG : "has"

Acceptance Criteria

  • Admin can enable/disable tagging globally via [admin_settings.html](https://github.com/microsoft/simplechat/issues/application/single_app/templates/admin_settings.html:1).
  • Group tags are managed and assigned from group workspace; user tags from user workspace.
  • Tags are stored as objects with tag_name and group_id or user_id in Cosmos DB, with group and user documents and chunks in separate containers/indexes.
  • Documents and chunks can be filtered by tags, with queries scoped to the active group/user and their respective containers and AI Search indexes.
  • UI/UX is intuitive for tag management and assignment at both document and chunk level.
  • Backend endpoints are secured and validate their input.
  • All new/updated endpoints and UI components are covered by tests.
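As a sketch of the scoped filtering criterion on the Cosmos DB side, the Python helper below builds a parameterized SQL query that matches documents owned by the active user or group and carrying a given tag. `build_tag_query` is a hypothetical name, and the top-level `user_id` / `group_id` fields are assumed to exist as in the example data model; the equivalent AI Search filter is not covered here.

```python
from typing import Dict, List, Tuple


def build_tag_query(scope_key: str, scope_id: str,
                    tag_name: str) -> Tuple[str, List[Dict[str, str]]]:
    """Build a Cosmos DB SQL query and parameters for a tag filter in one scope."""
    # Only whitelisted field names are interpolated; values go through parameters.
    if scope_key not in ("user_id", "group_id"):
        raise ValueError("scope_key must be 'user_id' or 'group_id'")
    query = (
        f"SELECT * FROM c WHERE c.{scope_key} = @scope_id "
        "AND EXISTS(SELECT VALUE t FROM t IN c.tags "
        f"WHERE t.tag_name = @tag_name AND t.{scope_key} = @scope_id)"
    )
    parameters = [
        {"name": "@scope_id", "value": scope_id},
        {"name": "@tag_name", "value": tag_name},
    ]
    return query, parameters
```

The returned pair is in the shape accepted by the `query` and `parameters` arguments of the Azure Cosmos DB Python SDK's `query_items`; which container it runs against (DOCUMENTS or GROUP_DOCUMENTS) would follow from the scope.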

Key Files to Update
