Index Cleanup - Job #3133

Merged
merged 8 commits into from
Apr 11, 2023

Conversation

djabarovgeorge
Contributor

What change does this PR introduce?

Why? (Context)

We want to remove as many inefficient indexes and queries as possible.

Why was this change needed?

Stop relying on single-field indexes declared on individual Mongoose schema keys
Add compound indexes for common queries and patterns, following MongoDB index best practices
Remove unused indexes from MongoDB Atlas after implementing the new compound indexes

Other information (Screenshots)


linear bot commented Apr 3, 2023

Comment on lines -78 to -86
public async findInAppsForDigest(organizationId: string, transactionId: string, subscriberId: string) {
return await this.find({
_organizationId: organizationId,
type: ChannelTypeEnum.IN_APP,
_subscriberId: subscriberId,
transactionId,
});
}

Contributor Author

Removed because this query is not in use

Contributor Author

This is an initial proposal. I still want to review it again and see whether we can get by with fewer indexes.

Contributor Author

I left the notes so I'll have a reference for the decision making. I'll update them once I finish.

Contributor

p-fernandez left a comment

🌟

libs/dal/src/repositories/job/job.schema.ts (outdated)
Comment on lines 137 to 140
/*
* This index was initially created to optimize:
* apps/api/src/app/events/usecases/add-job/add-delay-job.usecase.ts
*/
Contributor

Leaving the explanations for the different index optimisations is very helpful. 🙌🏻

Contributor

I think it would be even better to provide some query examples here, because this comment is already inaccurate now that @LetItRock moved this file :P

Query examples stay accurate even when the code moves.

Contributor Author

Good point. I added the original query and the function name as context.

Comment on lines 142 to 143
transactionId: 1,
_subscriberId: 1,
Contributor

Because of the combined cardinality of those two fields, I feel we don't need the other properties here (_templateId and _environmentId) unless this could become a covered query. So having transactionId, _subscriberId, status should be enough, in my opinion.
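
To make the covered-query caveat concrete, here is a minimal sketch of the check involved. It assumes a simplified model (top-level fields only, no multikey indexes, and `_id` excluded from the projection unless it is in the index). `isCoveredBy` and the projected fields are hypothetical and not taken from this PR.

```typescript
// Simplified model: a query can be covered only if every field it filters on
// and every field it returns is part of the same index.
function isCoveredBy(
  indexFields: string[],
  filterFields: string[],
  projectedFields: string[],
): boolean {
  const indexed = new Set(indexFields);
  return [...filterFields, ...projectedFields].every((field) => indexed.has(field));
}

// The narrower index suggested in the comment above.
const narrowIndex = ["transactionId", "_subscriberId", "status"];

// Filtering on and returning only indexed fields: can be answered from the index alone.
console.log(isCoveredBy(narrowIndex, ["transactionId", "_subscriberId"], ["status"])); // true

// Returning _templateId requires fetching the document, so the query is not covered.
console.log(isCoveredBy(narrowIndex, ["transactionId", "_subscriberId"], ["_templateId"])); // false
```

If no query on this collection projects only indexed fields, the extra keys buy nothing for coverage, which is the argument for the shorter index.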

Comment on lines 155 to 156
transactionId: 1,
_subscriberId: 1,
Contributor

Following the top comment regarding the above query, I think this can be omitted?

Contributor

I agree, the next index below is a copy of this one.
From our discussion with MongoDB, the order of the fields according to the ESR (Equality, Sort, Range) rule is what matters most. MongoDB should be smart enough to use the index below if this one is removed.
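
For reference, the ESR guideline orders compound index keys as Equality fields first, then Sort fields, then Range fields. A rough sketch of that ordering follows. `orderByEsr` is a hypothetical helper, and the role assigned to each field is an assumption for illustration, not the actual query shape in this PR.

```typescript
type Role = "equality" | "sort" | "range";

interface FieldUsage {
  field: string;
  role: Role;
  direction?: 1 | -1; // defaults to ascending
}

// Builds an index key spec with equality fields first, then sort, then range.
// Array.prototype.sort is stable, so fields with the same role keep their input order.
function orderByEsr(fields: FieldUsage[]): Record<string, 1 | -1> {
  const rank: Record<Role, number> = { equality: 0, sort: 1, range: 2 };
  const ordered = [...fields].sort((a, b) => rank[a.role] - rank[b.role]);
  const spec: Record<string, 1 | -1> = {};
  for (const f of ordered) {
    spec[f.field] = f.direction ?? 1;
  }
  return spec;
}

// Assumed roles: exact match on subscriber and status, range filter on updatedAt.
const esrKeys = orderByEsr([
  { field: "updatedAt", role: "range" },
  { field: "_subscriberId", role: "equality" },
  { field: "status", role: "equality" },
]);

console.log(Object.keys(esrKeys)); // key order: _subscriberId, status, updatedAt
```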

Comment on lines 193 to 351
_templateId: 1,
_subscriberId: 1,
_templateId: 1,
Contributor Author

Switched _subscriberId with _templateId because it has a higher cardinality.

Contributor Author

djabarovgeorge commented Apr 4, 2023

@scopsy

What do you think of changing
jobSchema.index({
_subscriberId: 1,
_templateId: 1,
status: 1,
type: 1,
transactionId: 1,
});

to

jobSchema.index({
_subscriberId: 1,
_templateId: 1,
_environmentId: 1,
type: 1,
status: 1,
});

That will mean that we could delete it because we already have:

jobSchema.index({
_subscriberId: 1,
_templateId: 1,
_environmentId: 1,
type: 1,
status: 1,
updatedAt: 1,
transactionId: 1,
});

At the moment they look really similar.
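
One way to sanity-check this is MongoDB's index-prefix behavior: a compound index can also serve queries that use a leading subset of its keys, so an index whose keys form an exact prefix of another is usually redundant. A minimal sketch of that check, using the key lists from this comment. `isPrefixOf` is a hypothetical helper, not part of the codebase.

```typescript
type IndexKey = [string, 1 | -1];

// True when `candidate` matches the leading keys of `existing`, in order and direction.
function isPrefixOf(candidate: IndexKey[], existing: IndexKey[]): boolean {
  if (candidate.length > existing.length) return false;
  return candidate.every(([field, direction], i) => {
    const key = existing[i];
    return key !== undefined && key[0] === field && key[1] === direction;
  });
}

const currentIndex: IndexKey[] = [
  ["_subscriberId", 1], ["_templateId", 1], ["status", 1], ["type", 1], ["transactionId", 1],
];
const proposedIndex: IndexKey[] = [
  ["_subscriberId", 1], ["_templateId", 1], ["_environmentId", 1], ["type", 1], ["status", 1],
];
const existingIndex: IndexKey[] = [
  ["_subscriberId", 1], ["_templateId", 1], ["_environmentId", 1], ["type", 1], ["status", 1],
  ["updatedAt", 1], ["transactionId", 1],
];

console.log(isPrefixOf(currentIndex, existingIndex)); // false: diverges at the third key
console.log(isPrefixOf(proposedIndex, existingIndex)); // true: the longer index already covers it
```

The check only confirms key order. Whether the query performs as well on the longer index still needs to be verified with an explain plan.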

* ...(digestKey && { [`payload.${digestKey}`]: digestValue }),
* }
*/
jobSchema.index({
Contributor

I think this index may be "overridden" by the first query.
When you run this query, can you run cursor.explain() (https://www.mongodb.com/docs/manual/reference/method/cursor.explain/) to see which index the query above uses?
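
When reading the explain output, the chosen index shows up as the `indexName` of an `IXSCAN` stage inside `queryPlanner.winningPlan`. A small sketch that walks a plan tree and collects those names follows. The `PlanStage` type covers only the fields used here, the sample plan and index name are made up, and the exact output shape can vary by server version.

```typescript
// Minimal subset of an explain plan stage; real output has many more fields.
interface PlanStage {
  stage: string;
  indexName?: string;
  inputStage?: PlanStage;
  inputStages?: PlanStage[];
}

// Recursively collects the index names of all IXSCAN stages in a plan tree.
function usedIndexes(stage: PlanStage): string[] {
  const names: string[] = [];
  if (stage.stage === "IXSCAN" && stage.indexName !== undefined) {
    names.push(stage.indexName);
  }
  if (stage.inputStage !== undefined) {
    names.push(...usedIndexes(stage.inputStage));
  }
  for (const child of stage.inputStages ?? []) {
    names.push(...usedIndexes(child));
  }
  return names;
}

// Made-up winning plan: a FETCH stage fed by a single index scan.
const sampleWinningPlan: PlanStage = {
  stage: "FETCH",
  inputStage: {
    stage: "IXSCAN",
    indexName: "transactionId_1__subscriberId_1_status_1",
  },
};

console.log(usedIndexes(sampleWinningPlan));
```

An empty result for a plan whose only stage is `COLLSCAN` would mean no index was used at all.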

Contributor

scopsy commented Apr 5, 2023

@djabarovgeorge we also need to add a single-key index to support _notificationId queries. This is used by the activity feed populate from Mongoose.

@djabarovgeorge djabarovgeorge added this pull request to the merge queue Apr 11, 2023
Merged via the queue into next with commit 6d2cf1a Apr 11, 2023
11 checks passed
@djabarovgeorge djabarovgeorge deleted the NV-1938/job-collection-indexes-refactor branch April 11, 2023 08:05

4 participants