-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(api): performance service #2878
Conversation
Why are you not just using OTLP? |
I guess you are referring to OpenTelemetry, if not I don't know what you might be talking about. Acronyms are hard. To be honest I didn't think about it because this had a clear purpose of measuring execution times and was not meant to be able to export data. Mostly to add a quick tool for @tsssdev and his request about how to be able to measure Trigger Event execution performance. |
Yes, I am about open telemetry. The most important primitive of OTLP is the span. It represents a unit of work and you annotate your code to track spans. For example method calls, db calls, http requests and so on. Spans have a correlation Ids to represent the parent span and there is a protocol to add this correlation IDs to http requests for distributed tracing. Most implementations for like this:
So what I would do:
You can then get rid of a lot of code and the performance data can also be used for self hosting, e.g. in Azure Application Insights or Google Stackdriver and so on. |
@p-fernandez I agree, I think this will be fine for now. However in the long run I agree with @SebastianStehle that we should be doing open-telemetry, However doing this in the trigger engine and mail parser is much harder then the nest based apps as there are easy OS modules for it. https://github.com/pragmaticivan/nestjs-otel @SebastianStehle I will adding open-telemetry into our backlog. |
public buildDigestFilterStepsMark( | ||
transactionId: string, | ||
templateId: string, | ||
notificationId: string, | ||
subscriberId: string | ||
): IMark { | ||
const mark = { | ||
id: `${MarkFunctionNameEnum.DIGEST_FILTER_STEPS}:event:${transactionId}:template:${templateId}:notification:${notificationId}:subscriber:${subscriberId}:steps`, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to pass the arguments here if we only calculate the averages by the MarkFunctionNameEnum?
couldn't we just generate the random id if the concurrent requests is the problem here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are needed for having individual marks and to help to debug different executions if needed. And to have the right data as they are being accumulated I needed to make it unique, if not they would get overriden everytime it was executed. I saw your mention regarding generating a random UUID but that would make more difficult to debug any of the executions to be honest.
Right now the publishing of the individual marks is disabled as it was cluttering the log output and don't need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't be the request scope random id the solution here?
@@ -69,6 +71,8 @@ export class EventsController { | |||
@UserSession() user: IJwtPayload, | |||
@Body() body: TriggerEventRequestDto | |||
): Promise<TriggerEventResponseDto> { | |||
const mark = this.performanceService.buildEndpointTriggerEventMark(body.transactionId as string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should hold all this logic in the decorator? I can imagine it like this:
@MeasurePerformance('EventsController.trackEvent')
async trackEvent(
then I don't have to deal with the service
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LetItRock This is something nest-otel can cover for us
} | ||
|
||
@Injectable() | ||
export class EventsPerformanceService { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we need to create an events-specific performance service, I feel like all this might be generic code, that can be reused in any context. This way we won't end up writing per context performance services. So I see it with the decorator as I mentioned before.
The publishResults
function can be defined as the helper function that will be called from anywhere we need to get the stats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to think of the different modules as if they were its own microservice, so making this domain related service allows to:
- Keep isolated in the domain the specific mark generation.
- Have their own kind of calculations and what publish and what not (think in a different domain we might want to have different calculations, instead of average other algorithm, publish different results, etc)
That doesn't mean that's going to happen, to be reused elsewhere, so I can understand why you are suggesting something more generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know to me it would be more to maintain... but yeah maybe let's start with something simple and then adjust to the new requirements in the future...
return this.setStart(mark); | ||
} | ||
|
||
public buildEndpointTriggerEventMark(transactionId: string): IMark { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these all won't be needed if we will create generic function with mark like:
const mark = {
id: `${functionName}:${uuid}`,
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing this comment here: #2878 (comment)
77f524f
to
70cc09c
Compare
@p-fernandez How we generate load? do we have load testing infrastructure in place? Under decent to high load only we can see real issues with locking or with out locking.. if we have Dev/QA type of environment that can take decent load then we can implement quick load tests with K6(https://github.com/grafana/k6 ) or any other similar tool. |
@tsssdev Our testing bed is dev.web.novu.co and we are looking to add k6 load tests in the api e2e test folder. If you want to create a pr for that we would be more then happy to look at it and merge it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good start, lets get it merged in and we can go from here.
Co-authored-by: Paweł Tymczuk <LetItRock@users.noreply.github.com>
Co-authored-by: Paweł Tymczuk <LetItRock@users.noreply.github.com>
9198951
to
e890ef6
Compare
What change does this PR introduce?
Implements a service that allows to gather performance data and Node.js internals data for certain areas of the API.
Also adds a performance test to understand the duration of different executions under different loads.
Why was this change needed?
Will be helpful to study performance of the sensible parts of the Trigger Event and what is the overhead of the distributed locking mechanism. Also can help to set the base for load performance test suites (in a way).
Other information (Screenshots)