-
Notifications
You must be signed in to change notification settings - Fork 11
feat: add function to calculate dynamic default Actor memory #570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: add function to calculate dynamic default Actor memory #570
Conversation
|
Asking for review 👇 @barjin & @B4nan - Best practices regarding Thank you 🙏 |
…ed environments (#573) Worker-based environments (e.g. `jest` tests) do not have access to `globalThis.require()` (while calling `require()` instead works). Deeper investigation shows that `tsup` doesn't transpile this in a dangerous way, so we can use this here. Unblocks apify/apify-client-js#782 (tested) Unblocks failing tests in #570
package.json
Outdated
| "jest": "^29.7.0", | ||
| "lerna": "^9.0.0", | ||
| "lint-staged": "^16.0.0", | ||
| "mathjs": "^15.1.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to add this to the package.json of the utilities package, root package.json is only for dev dependencies (and this is not a dev dependency)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be removed
tobice
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there, really sorry for the delayed review.
The general approach makes sense 👍
Three main points:
- Not sure about the package name.
- Not sure about using the very specific
LruCache. - The decisions we make around
Math.jsshould be better documented.
Besides that, I left my usual nits and improvement suggestions.
|
|
||
| import type { LruCache } from '../../datastructures/src/lru_cache'; | ||
|
|
||
| type ActorRunOptions = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have this type already somewhere?
| input: Record<string, unknown>; | ||
| } | ||
|
|
||
| export const DEFAULT_MEMORY_MBYTES_MAX_CHARS = 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| export const DEFAULT_MEMORY_MBYTES_MAX_CHARS = 1000; | |
| export const DEFAULT_MEMORY_MBYTES_EXPRESSION_MAX_LENGTH = 1000; |
- I think max length is a common term and more self-explanatory.
- I already got myself confused by the name of this constant.
MBYTES_MAX_CHARS... I was like what the hell is this 😅 Max length only makes sense on an expression, so let's mention it.
| * A Set of allowed keys from ActorRunOptions that can be used in | ||
| * the {{runOptions.variable}} syntax. | ||
| */ | ||
| const ALLOWED_RUN_OPTION_KEYS = new Set<keyof ActorRunOptions>([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you define the type for ActorRunOptions the way we usually do (or just manage to reuse an existing type), you won't need this.
Copy-pasted definition for ACTOR_PERMISSION_LEVEL:
export const ACTOR_PERMISSION_LEVEL = {
FULL_PERMISSIONS: 'FULL_PERMISSIONS',
LIMITED_PERMISSIONS: 'LIMITED_PERMISSIONS',
};
export type ACTOR_PERMISSION_LEVEL = ValueOf<typeof ACTOR_PERMISSION_LEVEL>;You should be able to get the keys with just Object.keys(ACTOR_PERMISSION_LEVEL).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this one? 👋
| export const calculateDefaultMemoryFromExpression = ( | ||
| defaultMemoryMbytes: string, | ||
| context: MemoryEvaluationContext, | ||
| options: { cache: LruCache<EvalFunction> } | undefined = undefined, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that we only support this very specific, synchronous cache. What if we want to cache it in Redis? Can we use a more generic interface?
package.json
Outdated
| "jest": "^29.7.0", | ||
| "lerna": "^9.0.0", | ||
| "lint-staged": "^16.0.0", | ||
| "mathjs": "^15.1.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be removed
| "include": ["src/**/*"], | ||
| "compilerOptions": { | ||
| "module": "node20", | ||
| "moduleResolution": "node16", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you really need module: node20, the resolution should be also node20
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wonder if this is really needed, you only modified this config, which is used for your IDE, not for building the package (so no effect on users/production build)
tobice
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes 👍
I noticed some of my comments were left without a reaction and unless I missed something, they don't seem to be addressed. Pls take look.
| * A Set of allowed keys from ActorRunOptions that can be used in | ||
| * the {{runOptions.variable}} syntax. | ||
| */ | ||
| const ALLOWED_RUN_OPTION_KEYS = new Set<keyof ActorRunOptions>([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this one? 👋
| // extra (has functional impact) | ||
| // We disable evaluate to prevent users from calling it inside their expressions. | ||
| // For example: defaultMemoryMbytes = "evaluate('2 + 2')" | ||
| evaluate() { throw new Error('Function evaluate is disabled.'); }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compile is no longer here? Is that intentional?
| evaluate() { throw new Error('Function evaluate is disabled.'); }, | ||
| parse() { throw new Error('Function parse is disabled.'); }, | ||
| simplify() { throw new Error('Function simplify is disabled.'); }, | ||
| derivative() { throw new Error('Function derivative is disabled.'); }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder: Why do we have to disable these? We have never installed them 😄 Or all these sort of "default", as in, always included?
| * | ||
| * All `input.*` values are accepted, while `runOptions.*` are validated (7 variables from ALLOWED_RUN_OPTION_KEYS). | ||
| * | ||
| * Note: this approach allows developers to use a consistent double-brace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * Note: this approach allows developers to use a consistent double-brace | |
| * Note: While not really needed for Math.js, this approach allows developers | |
| * to use a consistent double-brace templating syntax `{{runOptions.timeoutSecs}}` | |
| * across the Apify platform. We also want to avoid compiling the expression with the | |
| * actual values as that would make caching less effective. |
(opt) I added some more context.
| return variableName; | ||
| } | ||
|
|
||
| // 3. Throw error for unrecognized variables (e.g. {{someVariable}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said, IMHO this particular error will never be thrown as the first check already takes care of that.
I'd remove the first check (not just the 1 number 😛 ).
| @@ -0,0 +1,22 @@ | |||
| import type { EvalFunction } from 'mathjs'; | |||
|
|
|||
| export type ActorRunOptions = { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As asked before, don't we have this as an existing type?
| input: Record<string, unknown>; | ||
| } | ||
|
|
||
| export type CompilationCache = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we make it async to make it more future proof? You won't be able to use e.g. Redis with the current interface.
This PR introduces the logic to calculate a dynamic default Actor memory limit based on a provided expression string.
TL;DR:
{{variable}}placeholders with strict validation:input.*paths are allowed, whilerunOptions.*keys are strictly whitelisted(only 7 variables allowed).get(obj, path, default)helper to safely access nested properties without crashing on missing paths (e.g.,get(input, 'startUrls.length', 0)).Note: when using this function in our API, we'll use it in
worker_thread. I didn't add worker_thread to this package, because I suppose it can be used from client-side as well.More context: Dynamic default Actor run memory notion spec