New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(gatsby-source-filesystem): Upgrade XState and refactor its usage #17192
refactor(gatsby-source-filesystem): Upgrade XState and refactor its usage #17192
Conversation
a8b6eac
to
4e78531
Compare
I get this on CI:
But ain't sure what I should do to fix it, or even try this locally. |
Reason for this is mostly that initial node creation (during bootstrap) may result in thousands of File nodes, so this would be very spammy. Node creation after bootstrap is finished is result of file changes and for most use cases this will be single new file / file changes at a time (editing files locally) I'll take a look on this (why it fails in CI) - I agree this is a lot cleaner than current implementation
This is quirk of our hacky utility for testing changes in packages ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome! I left few notes, but foundation is very solid
}, | ||
}, | ||
}, | ||
// TODO: those two were not restricted to READY state, but maybe they should? or even maybe it should queue in NOT_READY? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very nice illustration of why using actions
in state machine definition is so great (opposed to what we currently have) - it made it very clear that this is not handled properly. Those definitely need to have a bit different handling in CHOKIDAR.NOT_READY
and CHOKIDAR.READY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, this might be not in scope of this PR (it preserves current handling) - so I'll leave this up to you if you want to attack this. If you don't - that's ok, let's just leave this TODO here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer keeping this as is now. I can try to fix this in a followup PR though - if only you could give me hints on how the handling should differ. Should those be queued until ready?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, let's try to minimize scope of the PR. This will allow faster iteration.
}, | ||
// TODO: those two were not restricted to READY state, but maybe they should? or even maybe it should queue in NOT_READY? | ||
on: { | ||
CHOKIDAR_CHANGE: { actions: `createAndProcessNode` }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be handled in exactly same way as CHOKIDAR_ADD
event (so queue in NOT_READY and process immediately in READY)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe even CHOKIDAR_ADD
/ CHOKIDAR_CHANGE
event can be merged because as far as plugin is concerned there's not much difference between creating and updating nodes (gatsby core takes care of this stuff).
But I don't think that verbosity of events is any problem and this might be easier to read and understand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so you have actually answered here my previous question 😅 Still would be in favor of keeping current logic for now, I can prepare a PR with that queuing immediately after this would get merged in.
As to verbosity of events - I would be in favor of keeping separate names.
// TODO: those two were not restricted to READY state, but maybe they should? or even maybe it should queue in NOT_READY? | ||
on: { | ||
CHOKIDAR_CHANGE: { actions: `createAndProcessNode` }, | ||
CHOKIDAR_UNLINK: { actions: `deleteNode` }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this need similar handling as CHOKIDAR_ADD
/ CHOKIDAR_CHANGE
:
We should queue it when it's not ready. I think we do need single queue for additions, changes and deletions - so potential queue could look like this:
[
{
type: 'add',
path: '<some_path>'
},
{
type: 'del',
path: '<some_path>'
}
]
so then flushing queue would handle those in proper order
deleteNode({ node }) | ||
} | ||
}, | ||
flushPathQueue(_, { resolve, reject }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent embarrassingly long time looking where resolve
/ reject
is coming from (initially I thought this is how xstate
handles async action, before I found that resolve
/reject
is send in fsMachine.send({ type: `CHOKIDAR_READY`, resolve, reject })
I do think we can rework it to be more inline with idiomatic xstate by:
- creating new state in
CHOKIDAR
(sibling toCHOKIDAR.NOT_READY
) - let's call itFLUSHING
states: {
NOT_READY: {
on: {
- CHOKIDAR_READY: `READY`,
+ CHOKIDAR_READY: `FLUSHING`,
CHOKIDAR_ADD: { actions: `queueNodeProcessing` },
},
},
+ FLUSHING: {
+ invoke: {
+ src: () => {
+ return flushPathQueue()
+ },
+ onDone: {
+ target: `READY`,
+ },
+ },
+ },
READY: {
on: {
CHOKIDAR_ADD: { actions: `createAndProcessNode` },
},
},
},
see https://xstate.js.org/docs/guides/communication.html#invoking-services for docs about invoke
property
- change the promise returned from
sourceNodes
a bit, letchokidar.on(`ready`)
just send event to machine, and let's wait for machine to reachCHOKIDAR.READY
to resolve that promise
Making it this way also shows that there is another potential problem (it's good thing) - what happens currently when chokidar
emits event while flushing is in progress? Right now we would not queue work, but instead would process it immediately - making order of processing potentially incorrect. So I think while we are flushing, we should queue chokidar events and when flushing finishes check if queue is empty (then we can go to CHOKIDAR.READY
state, and if it's not - flush again and repeat that until queue is empty)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that 👍
One question though - flushPathQueue
might reject and this would stay unhandled. I have no idea under what circumstances it could reject though? So not sure how this should be handled and for what to await for in ready
listener to reject returned promise there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And again - maybe we could pursue fixing this in a followup PR? Or do you prefer keeping it as part of this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, my idea here is certainly not fully fleshed out (just starting point).
And also agree let's not scope creep this PR - let's make this PR 1 to 1 refactor (keeping current behaviour with all its flaws). All those edge cases that were uncovered can be incrementally handled in future pull requests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I believe it's 1 to 1 right now if I haven't made any stupid mistake. I still don't know how to test this properly though - so I would appreciate you taking a look at this or giving me guidance.
@pieh thanks for the initial review! |
4e78531
to
d67e1b4
Compare
I added basic test suite (that tests I'll merge this in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank for cleaning this up! It's much easier to follow now!
Holy buckets, @Andarist — we just merged your PR to Gatsby! 💪💜 Gatsby is built by awesome people like you. Let us say “thanks” in two ways:
If there’s anything we can do to help, please don’t hesitate to reach out to us: tweet at @gatsbyjs and we’ll come a-runnin’. Thanks again! |
@pieh cool, I’ll work on fixing those issues discovered during the PR in following days |
* Upgrade XState and refactor its usage in gatsby-source-filesystem * test: add basic test suite for sourceNodes
* Upgrade XState and refactor its usage in gatsby-source-filesystem * test: add basic test suite for sourceNodes
Description
I was exploring the codebase to see how you are using XState and I've encountered this old machine using barebones Machine API from older version (v3) of XState. So I've decided to refactor this to use Interpreter (which you are also using in other parts of the codebase), mainly as an exercise for myself but I believe that it's cleaner now and gives better visualization than the previous version:
I strongly believe this should behave exactly the same as the previous implementation, BUT I have no idea how to test this properly. I could use a hint regarding that from you 😉
I've also left one TODO comment which should be resolved before merging (even if resolving just means removing it - I was unsure what's the proper logic for this one, I've decided to keep it like it was for now). In addition to that I've noticed that processed nodes before bootstrap is don are not reported - shouldn't maybe those logs be queued and flushed later?