Refactor Collection Creation for no delay #243

nishika26 · 2025-06-19T12:46:00Z

Summary

Target issue is #217

Refactored POST /collections/create to first insert the collection in the DB with status='processing' before triggering background tasks.
Async task now handles vector store + assistant creation and updates the collection with LLM details and status (success/failed).
Reused existing CollectionCrud and DocumentCollectionCrud methods for updates and associations.
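
The flow described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the in-memory `DB` dict, the `Collection` dataclass, and the `schedule` and `provision` callables are all hypothetical stand-ins. `schedule` is assumed to have the same call shape as FastAPI's `BackgroundTasks.add_task(func, *args)`.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Collection:
    id: str
    status: str = "processing"
    # Nullable until the background step fills them in.
    llm_service_id: Optional[str] = None
    llm_service_name: Optional[str] = None


# Hypothetical in-memory stand-in for the collection table.
DB: Dict[str, Collection] = {}


def build_llm_resources(
    collection_id: str, provision: Callable[[], Tuple[str, str]]
) -> None:
    """Background step: create remote resources, then record the outcome."""
    collection = DB[collection_id]
    try:
        service_id, service_name = provision()
    except Exception:
        # Any failure while provisioning marks the row as failed.
        collection.status = "failed"
        return
    collection.llm_service_id = service_id
    collection.llm_service_name = service_name
    collection.status = "success"


def create_collection(schedule: Callable, provision: Callable) -> Collection:
    """Request handler: insert the row first, then hand off the slow work."""
    collection = Collection(id=str(uuid.uuid4()))
    DB[collection.id] = collection
    schedule(build_llm_resources, collection.id, provision)
    return collection  # the caller gets an id that already exists in the DB
```

Because the row exists before the background step starts, a client that immediately looks up the returned id would see `status="processing"` rather than a missing resource.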

NOTES

Logs a few important parameters, for example: INFO - Collection created: 425111c7-25a3-42d3-b373-c97202d9aad5 | Time: 7.514757871627808s | Files: 1 |Sizes:[137.57] KB |Types: ['txt']
To log the size(s) of uploaded files, the file size is calculated first; this logic was added to core/cloud/storage.py
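
For illustration, a log line of this shape could be assembled as below. This is a sketch under assumptions: the function names are hypothetical, sizes are assumed to be converted from bytes with 1 KB = 1024 bytes, and the file type is assumed to come from the file extension. The actual logic in core/cloud/storage.py may differ.

```python
import os
from typing import List


def size_in_kb(size_bytes: int) -> float:
    """Convert a byte count to kilobytes, rounded to two decimals (1 KB = 1024 B assumed)."""
    return round(size_bytes / 1024, 2)


def format_creation_log(
    collection_id: str,
    elapsed_seconds: float,
    sizes_bytes: List[int],
    file_names: List[str],
) -> str:
    """Build a message loosely modeled on the example log line above."""
    sizes_kb = [size_in_kb(size) for size in sizes_bytes]
    # File type taken from the extension, without the leading dot.
    types = [os.path.splitext(name)[1].lstrip(".") for name in file_names]
    return (
        f"Collection created: {collection_id} | Time: {elapsed_seconds}s | "
        f"Files: {len(file_names)} | Sizes: {sizes_kb} KB | Types: {types}"
    )
```

The returned string would then be passed to a logger at INFO level.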

codecov · 2025-06-19T15:10:01Z

Codecov Report

Attention: Patch coverage is 90.55118% with 24 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
backend/app/api/routes/collections.py	73.91%	12 Missing ⚠️
backend/app/core/cloud/storage.py	16.66%	5 Missing ⚠️
backend/app/crud/collection.py	78.57%	3 Missing ⚠️
.../api/routes/collections/test_create_collections.py	96.96%	2 Missing ⚠️
backend/app/models/organization.py	50.00%	1 Missing ⚠️
backend/app/tests/utils/utils.py	92.30%	1 Missing ⚠️

AkhileshNegi · 2025-06-20T10:26:38Z

backend/app/alembic/versions/75b5156a28fd_add_organization_id_project_id_status_.py

+        "collection", "llm_service_id", existing_type=sa.VARCHAR(), nullable=True
+    )
+    op.alter_column(
+        "collection", "llm_service_name", existing_type=sa.VARCHAR(), nullable=True
+    )


These columns were already in the table; why do we need them again?

They are not being added again; they are being altered from non-nullable to nullable columns.

AkhileshNegi · 2025-06-20T10:28:07Z

backend/app/api/routes/collections.py

+        assistant = assistant_crud.create(
+            vector_store.id, **dict(request.extract_super_type(AssistantOptions))
+        )


avirajsingh7 · 2025-06-20T10:01:49Z

backend/app/api/deps.py

+    )
+
+
+CurrentUserOrgproject = Annotated[UserProjectOrg, Depends(get_current_user_org_project)]


While CurrentUserOrgProject is perfectly fine and clear, you could also consider a more concise alternative like CurrentUserContext or CurrentUserScope.

Likewise, UserProjectOrg could be renamed to UserContext or UserScope.

You are right about naming it this way, but let's keep this for later.

avirajsingh7 · 2025-06-20T10:20:09Z

backend/app/tests/api/routes/collections/test_collection_info.py

+    api_key_headers: dict[str, str],
+):
+    user = get_user_from_api_key(db, api_key_headers)
+    collection = create_collection(db, user, status="processing")


When writing test cases, make sure to delete any entries created in the database during the test.
We want to maintain a consistent and clean database state across all tests.

This issue is currently present throughout many of our test cases and needs to be addressed to ensure reliable and isolated testing.

Maybe you can refer to this and use a teardown function.

Ideally, we should seed necessary data at the start of the test (we can use seed_script), and any additional data created during the test should be deleted once it has been used.
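
One way to apply this suggestion is to register the delete at the moment the row is created, so cleanup runs even when the test fails. The sketch below uses the standard library's `unittest` and an in-memory SQLite table purely for illustration; the schema, helper names, and test are hypothetical, and the project's own tests would use its session and CRUD helpers instead.

```python
import sqlite3
import unittest

# Stand-in for a database shared by the whole test run (hypothetical schema).
shared_db = sqlite3.connect(":memory:")
shared_db.execute("CREATE TABLE collection (id TEXT PRIMARY KEY, status TEXT)")


def count_rows() -> int:
    """Number of collection rows currently in the shared database."""
    return shared_db.execute("SELECT COUNT(*) FROM collection").fetchone()[0]


class CollectionInfoTest(unittest.TestCase):
    def create_collection(self, collection_id: str, status: str) -> None:
        shared_db.execute(
            "INSERT INTO collection VALUES (?, ?)", (collection_id, status)
        )
        shared_db.commit()
        # Register the teardown right away so it runs even if the test fails.
        self.addCleanup(self.delete_collection, collection_id)

    def delete_collection(self, collection_id: str) -> None:
        shared_db.execute("DELETE FROM collection WHERE id = ?", (collection_id,))
        shared_db.commit()

    def test_processing_collection_is_visible(self) -> None:
        self.create_collection("c-1", "processing")
        row = shared_db.execute(
            "SELECT status FROM collection WHERE id = ?", ("c-1",)
        ).fetchone()
        self.assertEqual(row[0], "processing")


def run_suite() -> unittest.TestResult:
    """Run the test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CollectionInfoTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

After `run_suite()` finishes, `count_rows()` is back to zero, so the next test starts from the same database state.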

avirajsingh7 · 2025-06-20T10:28:55Z

backend/app/tests/utils/utils.py

    return user.id


+def get_real_api_key_headers(db: Session) -> dict[str, str]:


Currently, every time the get_real_api_key_headers function is called, it creates a new API key, project, and organization.

Instead, you can seed this data once at the beginning of the test session and reuse the same API key across test cases—similar to how normal_user_token and super_user_token are handled.

This approach reduces redundant setup, improves test performance, and ensures consistency across tests.
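
A framework-neutral sketch of the "seed once, reuse everywhere" idea is shown below. Everything in it is an assumption for illustration: the function names, the key format, and the `X-API-KEY` header name are hypothetical, and the real helper would create the organization, project, and API key through the project's own CRUD layer. In pytest, the same effect is usually achieved with a fixture declared with `scope="session"`.

```python
import functools
import secrets
from typing import Dict, List

# Records each time the expensive seeding runs; only here to make the effect visible.
seed_log: List[str] = []


@functools.lru_cache(maxsize=None)
def seeded_api_key() -> str:
    """Create the organization, project, and API key once (simulated here)."""
    key = "ApiKey " + secrets.token_urlsafe(16)
    seed_log.append(key)
    return key


def get_api_key_headers() -> Dict[str, str]:
    """Return headers that reuse the single seeded key on every call."""
    return {"X-API-KEY": seeded_api_key()}
```

Every call to `get_api_key_headers()` returns the same key, and the seeding step runs only on the first call.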

backend/app/models/collection.py

nishika26 added 5 commits June 19, 2025 18:12

routes and deps

47a9b89

small fix

f3494ec

logging

e30f5d8

adding org id and project in crud tests

60c1afd

migration file

c0ae5a3

nishika26 changed the title ~~routes and deps~~ Refactor Collection Creation for no delay Jun 19, 2025

nishika26 self-assigned this Jun 19, 2025

nishika26 added the bug, dalgo-migration, and ready-for-review labels Jun 19, 2025

nishika26 added this to Dev Priorities Jun 19, 2025

nishika26 moved this to In Progress in Dev Priorities Jun 19, 2025

nishika26 linked an issue Jun 19, 2025 that may be closed by this pull request

resource key returned in the async api has a delay in creation #217

Closed

nishika26 marked this pull request as ready for review June 19, 2025 15:12

nishika26 requested review from AkhileshNegi and avirajsingh7 and removed request for AkhileshNegi June 19, 2025 17:58

AkhileshNegi removed the status in Dev Priorities Jun 20, 2025

nishika26 and others added 5 commits June 20, 2025 14:52

test cases

8a2b80f

test cases

e15ed29

Merge branch 'main' into bug/response_delay

e961d10

formatting

56c5166

test case fix

230d6f1

AkhileshNegi requested changes Jun 20, 2025

avirajsingh7 requested changes Jun 20, 2025

nishika26 added 3 commits June 20, 2025 18:15

migration and pascalcase

c89ead5

pr review fixes

f08dbfe

vector store fix

0a8f0e0

AkhileshNegi approved these changes Jun 20, 2025

avirajsingh7 approved these changes Jun 20, 2025

Merge branch 'main' into bug/response_delay

803c534

AkhileshNegi merged commit 36650dd into main Jun 21, 2025
1 check passed

AkhileshNegi deleted the bug/response_delay branch June 21, 2025 05:02

github-project-automation bot moved this to Closed in Dev Priorities Jun 21, 2025
