Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PDS pipethrough optimizations #2770

Merged
merged 47 commits into from
Sep 19, 2024
Merged

PDS pipethrough optimizations #2770

merged 47 commits into from
Sep 19, 2024

Conversation

matthieusieben
Copy link
Contributor

No description provided.

@matthieusieben matthieusieben force-pushed the msieben/micro-optimizations branch 4 times, most recently from 7d9350c to f629fc7 Compare September 2, 2024 18:39
@matthieusieben matthieusieben force-pushed the msieben/micro-optimizations branch from 956c637 to 343dc79 Compare September 6, 2024 14:28
@matthieusieben matthieusieben changed the title Micro optimizations in PDS PDS pipethrough optimizations Sep 6, 2024
@matthieusieben matthieusieben force-pushed the msieben/micro-optimizations branch from 904663b to 10da838 Compare September 6, 2024 21:34
@matthieusieben
Copy link
Contributor Author

matthieusieben commented Sep 6, 2024

Still need to make sure tests pass and add some more

@@ -43,15 +43,6 @@ export function ssrfFetchWrap<C = FetchContext>({
}

if (url.protocol === 'http:' || url.protocol === 'https:') {
// @ts-expect-error non-standard option
if (request.dispatcher) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dispatcher was never exposed on Request


if (allowUnknownTld !== true && !isValidDomain(url.hostname)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be performed through the lookup fn

})
}
if (isHandlerPipeThroughBuffer(outputUnvalidated)) {
setHeaders(res, outputUnvalidated)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clean

}
if (!Object.values(v['headers']).every(isString)) {
return false
const createValidator =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we avoided zod here because it can be kinda heavy duty for validation of stuff that is often invalid. I know it used to handle this by throwing and catching errors which can be rough on Node so we tried to keep it out of the happy/critical path of our server stuff.

But I also have some vague memory that they fixed this & made it faster? Any idea? Or any idea how these compare?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, validating here is way overkill. Since we know the input is of type HandlerOutput, all we need is to discriminate based on that. We can even make some of the checks we currently do disappear !

const createValidator =
<T>(schema: ZodSchema<T>) =>
(v: unknown): v is T =>
schema.safeParse(v).success
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a helper function for this that we use in most places in common-web/check

check.is(v, schema)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a check.create(schema) as the semantics I needed are a bit different

return nsid.endsWith('/') ? nsid.slice(0, -1) : nsid // trim trailing slash
// /!\ Hot path

const originalUrl = ('originalUrl' in req && req.originalUrl) || req.url
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we rewrite as ternary? may be a bit easier to read on first glance

@@ -23,42 +30,52 @@ export const getLocalLag = (local: LocalRecords): number | undefined => {
return Date.now() - new Date(oldest).getTime()
}

export const handleReadAfterWrite = async <T>(
export const pipethroughReadAfterWrite = async <T>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dig this fn a lot

const safeAgent = new undici.Agent({
allowH2: cfg.fetch.allowHTTP2, // This is experimental
headersTimeout: cfg.fetch.headersTimeout,
maxResponseSize: cfg.fetch.maxResponseSize,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we cool re-using these config values for both the safeFetch & safeAgent?

only issue I could think of is if we have different tradeoffs on maxResponseSize

@@ -256,12 +263,36 @@ export class AppContext {
appviewCdnUrlPattern: cfg.bskyAppView?.cdnUrlPattern,
})

const safeAgent = new undici.Agent({
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost want to call this proxyAgent or safeProxyAgent or something similar to give the sense of what it's for since it's only use there

reqInit: RequestInit,
): Promise<Response> => {
let res: Response
type Accept = [name: string, flags: { q: string }]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on

type Accept = {
  name: string
  q: string
}

instead of the array syntax? seems more clear & type-safe

Copy link
Contributor Author

@matthieusieben matthieusieben Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of this code is to be agnostic of the accept-* header being negotiated.

Consider for example:

Accept-Patch: text/example;charset=utf-8, text/example

In that case, the Accept type could be generalized as such:

type Accept = [name: string, flags: Record<string, string>]

When there is an obvious case for generalization, I like to write generic code. I guess this boils down to coding style preferences.

I can make this code more straight forward if you prefer or if you think this is too much YAGNI ?

What type safety difference are you thinking of ?

Copy link
Collaborator

@dholms dholms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some small tweaks/questions, but broadstrokes this looks awesome 💪 super stoked to get this in

@matthieusieben matthieusieben force-pushed the msieben/micro-optimizations branch from 205bbae to f1639a5 Compare September 16, 2024 14:04
@dholms dholms merged commit a07b211 into main Sep 19, 2024
11 checks passed
@dholms dholms deleted the msieben/micro-optimizations branch September 19, 2024 23:24
@github-actions github-actions bot mentioned this pull request Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants