fix(cli): fix cache .gitignore written to fs root and undefined workflow name#1444
fix(cli): fix cache .gitignore written to fs root and undefined workflow name#1444lamine2000 wants to merge 7 commits into
Conversation
…low name
- Replace ensureGitIgnore path-walking logic with a direct getCacheRoot
helper. The old loop used endsWith('.cli-cache') to find the cache
root, but if the path never contained that segment (custom cachePath)
it would walk all the way to '/' and write /.gitignore
- Fall back to 'workflow' when plan.workflow.name is undefined, avoiding
.cli-cache/undefined directories
- Add cachePath to saveToCache/clearCache options types so custom cache
paths set by workspace projects are correctly forwarded to getCachePath
- Remove spurious await on the now-synchronous getCachePath call
- Add regression tests covering expressionPath + cacheSteps (the exact
scenario reported in OpenFn#669, which had no test coverage)
Fixes OpenFn#669
josephjclark
left a comment
There was a problem hiding this comment.
Hi @lamine2000, thank you for raising this. I think there's just one necessary change but there are a few discussion points I'd like addressed.
I am not able to reproduce Mtuchi's original issue. It is quite old. Were you able to reproduce it with the CLI?
I appreciate the type fix on saveToCache. It is correct, but the PR comments are not. This untrue: This meant custom cache paths set by workspace projects were silently ignored when calling getCachePath from these functions. The option is carried through regardless of the typings.
Please check that AI generated PR descriptions are accurate! Misleading descriptions take more time to check
| const basePath = path.resolve( | ||
| baseDir ?? process.cwd(), | ||
| `${CACHE_DIR}/${workflowName}` | ||
| `${CACHE_DIR}/${workflowName ?? 'workflow'}` |
There was a problem hiding this comment.
Not sure how often this problem occurs but I don't want to default to workflow here
A better solution would be
const basePath = path.resolve(
baseDir ?? process.cwd(),
CACHE_DIR,
workflowName,
);
Where the undefined workflow name will simply be ignored by path.resolve (I think!)
There was a problem hiding this comment.
Ah I see. Following your comment and checking the node docs, this doesn't actually work. A non string causes the throw. one comment coming in on the refactor
|
|
||
| const ensureGitIgnore = (options: any, cachePath: string) => { | ||
| // Returns the root cache directory where .gitignore should live | ||
| const getCacheRoot = (options: Pick<Opts, 'baseDir' | 'cachePath'>): string => { |
There was a problem hiding this comment.
What is the different between getCachePath and getCacheRoot?
| return path.resolve(baseDir ?? process.cwd(), CACHE_DIR); | ||
| }; | ||
|
|
||
| const ensureGitIgnore = (options: any, cacheRoot: string) => { |
There was a problem hiding this comment.
The old logic here, the reason we walk up through the folder structure, is to see if there's a parenting .gitignore file which we can edit. This would live at the root of the github repo, which we may or may not find.
This PR changes the approach to force a .gitignore file as a child of the cache folder.
I'm not against this. It might be a bit messy but I suppose it's not too big a deal. And I can see the original issue of writing a .gitignore file to root (although I can't reproduce this locally but that might be something in my local setup)
So Ok, let's do it!
| t.is(gitignore, '*'); | ||
| }); | ||
|
|
||
| // Regression test for https://github.com/OpenFn/kit/issues/669 |
There was a problem hiding this comment.
Why is this test making such a distinction between workflow.js and expression.js? I don't think it materially affects the cache behaviour?
|
|
||
| // Regression test for https://github.com/OpenFn/kit/issues/669 | ||
| // Running a .js expression (not a workflow JSON) with caching enabled must not crash. | ||
| test.serial('cache steps when running a .js expression', async (t) => { |
There was a problem hiding this comment.
This test just ensures that the cache is written, not read. Which is fine - I'd just like the test name to be accurate.
- Eliminate getCacheRoot helper: getCachePath(options) with no workflowName/stepId already returns the CACHE_DIR root naturally - Use .filter(Boolean) spread into path.resolve so undefined workflowName does not cause ERR_INVALID_ARG_TYPE (path.resolve throws on undefined) - Keep cachePath in saveToCache/clearCache Pick types so custom paths work - Rename gitignore test name to be more accurate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Not with the CLI directly, no. But I traced both failure paths in the code and confirmed them.
For the `path.resolve` crash, this reproduces it:
```
node -e "const path = require('path'); path.resolve('/base', '.cli-cache', undefined)"
TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string. Received undefined
```
`plan.workflow.name` is undefined when running a bare `.js` file, so that's exactly what was happening.
For the `.gitignore` at `/`, I confirmed it by reading the code. The old loop walked up the path one segment at a time looking for something ending in `.cli-cache`. If the cache path doesn't contain that string, it reaches root and writes there.
The two new tests cover both cases and fail against the original code, which felt like the most solid confirmation I could give.
On the description — you're right, I corrected it. The type fix doesn't affect runtime behaviour at all, the option was always passed through. I've updated the description to reflect that accurately.
And noted on AI-generated descriptions — I should have caught that before opening the PR. I'll make sure to verify every claim manually going forward.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This reverts commit 1e47fad.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use workflowName ?? '' instead of .filter(Boolean) spread. path.resolve treats '' as a no-op so undefined workflowName still resolves to the bare CACHE_DIR root. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Fixes the CLI cache bug reported in #669 where running a
.jsexpression with caching enabled (viaOPENFN_ALWAYS_CACHE_STEPS=trueor--cache-steps) caused aTypeError [ERR_INVALID_ARG_TYPE]crash.ensureGitIgnorerewritten: The old implementation walked up the cache file path looking for a segment ending in.cli-cache. If the path did not contain that segment (e.g. a customcachePath), the loop traversed all the way to the filesystem root and wrote/.gitignore. Replaced with a direct call togetCachePath(options)(noworkflowName/stepId), which naturally returns the cache root, so no traversal is needed.path.resolvethrowsERR_INVALID_ARG_TYPEwhen passedundefined.getCachePathnow uses.filter(Boolean)before spreading intopath.resolve, so an absentplan.workflow.nameis simply omitted from the path rather than crashing.getCacheRoothelper: After the above two fixes,getCachePath(options)with no extra arguments already returns the cache root directly — no separate helper needed.saveToCacheandclearCache: Both were typed asPick<Opts, 'baseDir' | 'cacheSteps'>, missingcachePath, so thecachePathbranch insidegetCachePathwas unreachable from these functions. AddedcachePathto both Pick types.Test plan
cache steps when running a .js expression— runs an expression withcacheSteps: trueand verifies the cache file is written under.cli-cache/<name>/(this is the exact scenario from CLI cache bug #669, which had zero test coverage).cli-cache has a gitignore when caching a .js expression— verifies.gitignoreis correctly placed in.cli-cache/and not at the filesystem rootCloses #669