Watch for file notifications #31

Open
clarete opened this issue Feb 21, 2021 · 11 comments
Labels
enhancement New feature or request

Comments


clarete commented Feb 21, 2021

The build process currently consists of scanning the list of all websites in weblorg--sites, scanning all the routes within each site, and running the pipeline once. That must work well and reasonably fast, since it's the main feature of weblorg. However, while developing themes, or even while writing the content, it'd be quite helpful if we triggered the build of only the files that changed.

It'd be great if weblorg-export could take a :watch parameter that, when set to true, makes it block forever and re-generate the website on any file change.

At the end of our well-known publish.el files we'd do something like:

(weblorg-export :watch (getenv "WATCH"))

At the terminal we'd do something like this:

WATCH=t emacs --script publish.el

That'd hang the terminal and print out events every time files got changed.

Notice that this is pretty limited, but if it serves us well, it can be expanded in two ways: 1. being smarter about what it watches, maybe by taking something like :watch-pattern and :watch-exclude, and 2. trying to re-generate only the pieces related to the modification. Notice that 2 might get a bit more complicated: if we receive a notification about a change to a template, it's not enough to look for :template entries in routes, because the changed template might not be listed there and might instead be referenced by the extends of another template.
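As a rough illustration of the first point, here is a minimal sketch of how a changed file could be checked against include and exclude patterns. The function name `sketch-watch-file-relevant-p` and the choice of shell wildcards as the pattern syntax are assumptions made for this example; neither :watch-pattern nor :watch-exclude exists in weblorg today.

```elisp
(require 'seq)

(defun sketch-watch-file-relevant-p (file patterns excludes)
  "Return non-nil if FILE matches one of PATTERNS and none of EXCLUDES.
PATTERNS and EXCLUDES are lists of shell wildcards (an assumption of
this sketch) matched against the base name of FILE."
  (let ((name (file-name-nondirectory file)))
    (and (seq-some (lambda (pattern)
                     (string-match-p (wildcard-to-regexp pattern) name))
                   patterns)
         (not (seq-some (lambda (pattern)
                          (string-match-p (wildcard-to-regexp pattern) name))
                        excludes)))))

;; Example: react to Org and HTML files, but ignore Emacs lock files.
(sketch-watch-file-relevant-p "posts/hello.org" '("*.org" "*.html") '(".#*"))
```

Matching on the base name keeps the sketch small; matching on paths relative to the site root would be needed to exclude whole directories.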

clarete added the enhancement label Feb 21, 2021

nanzhong commented Feb 22, 2021

I actually have something like this implemented outside of emacs/weblorg. I've built a docker container that expects the weblorg site base dir to be mounted in. When started, it builds the site (it also supports pre and post build scripts via file naming conventions), starts up a web server, and continuously monitors for changes to trigger rebuilds.

The hard part feels like the determination of what needs to be rebuilt (especially when templates are modified). My container naively rebuilds everything on any file change.

I can clean that up and provide an example in the next few days.


clarete commented Feb 22, 2021

Hi @nanzhong! That would be lovely. Even if we add the file watcher in pure Emacs Lisp, having a docker setup that people could copy from would be pretty awesome. Off the top of my head, I can think of creating a new directory under examples in the root directory and adding something like with-docker or something along those lines. I'd also try to fit it into the documentation somewhere.

What do you think?


nanzhong commented Mar 4, 2021

Yup, for sure. Sorry for the delay, it's been a busy week. I'll try and have something up in the next few days.


clarete commented Mar 5, 2021

No rush @nanzhong! thanks for the heads up and take your time! I hope the week is going great 😄


nanzhong commented Mar 8, 2021

@clarete I've put together #40 with a small change to demonstrate what I've thrown together in https://github.com/nanzhong/weblorg-docker.


clarete commented Mar 8, 2021

This is very exciting. Thank you so much for putting the docker image together and for integrating the build of the example with it!
I think this will accelerate the development of themes quite a lot, and whenever we create a command line tool (#16), its init command should also create a Makefile like the one you put together alongside the publish.el file! And as I mentioned in the PR, it'd be very cool to get weblorg's documentation website to also use this, because it's quite boring to run all those watch;build;serve commands separately in multiple shells (hehehe, yeah, I have quite a poor workflow for this right now).

Thank you so much for the PR @nanzhong!


clarete commented Mar 8, 2021

On the native file change notification, I did get a little proof of concept going locally but I've noticed that although it works very well while Emacs is in interactive mode, it doesn't really seem to work when I try the very same code under emacs --script. My initial debugging indicates that Emacs doesn't seem to be running the file descriptor checks that react to filesystem notification events; I did confirm the filesystem events are properly registered in inotify (I'm on Linux). I'll continue the investigation and report here accordingly.


nanzhong commented Mar 9, 2021

Regarding the CLI tool: I actually have another version of this tooling that's written in Go. It removes the dependency on inotify-tools and is in general much easier to extend than the bash script. If the intent is to provide a CLI tool to accompany weblorg, that might be a potential starting point. I'm not sure what your language preferences are for things like this.

I also took a stab at exploring a native file watcher in weblorg, but didn't make much progress. I also realized that file-notify-add-watch did not work recursively which means additional work would have to be done.
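For illustration, the "additional work" could look roughly like the sketch below: list every directory under the site root and add one watch per directory. `file-notify-add-watch` and `directory-files-recursively` are standard Emacs functions; the `sketch-` names are hypothetical and this is not code from either branch mentioned in this thread.

```elisp
(require 'filenotify)
(require 'seq)

(defun sketch-list-directories (root)
  "Return ROOT and every directory below it as a list of names."
  (cons (file-name-as-directory (expand-file-name root))
        (seq-filter #'file-directory-p
                    ;; An empty regexp matches every entry; the third
                    ;; argument asks for directories to be included.
                    (directory-files-recursively root "" t))))

(defun sketch-watch-recursively (root callback)
  "Watch ROOT and all of its subdirectories, calling CALLBACK on changes.
Return a hash table mapping each directory to its watch descriptor, so
that watches can later be removed with `file-notify-rm-watch'."
  (let ((watches (make-hash-table :test #'equal)))
    (dolist (dir (sketch-list-directories root))
      (puthash dir (file-notify-add-watch dir '(change) callback) watches))
    watches))
```

CALLBACK receives an event of the form (DESCRIPTOR ACTION FILE ...). Directories created after the initial scan are not covered by this sketch; the callback would have to add watches for them as they appear.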

I ended up going back and forth on the usability argument for native file watching. This is just my opinion, but I feel like introducing this watcher loop or async watching logic in weblorg might not be the ideal approach unless it can work via emacs --script; the mostly single-threaded nature of emacs makes me not want to run weblorg in the same emacs process or daemon that I use for other things like editing the site content itself.

If weblorg supported incremental builds where it only rebuilt the pages that needed to be rebuilt (there are a bunch of ways this could potentially work, but I haven't spent too much time thinking about which would make the most sense yet), having a companion CLI tool that handles the file watching and triggers the rebuilds feels like a pretty good compromise. That CLI tool would have an emacs dependency, but that feels like a reasonable expectation given how weblorg works.


clarete commented Mar 12, 2021

Hi @nanzhong!

Sorry for taking a bit to answer such a thoroughly written message, and thanks for sharing such cool ideas! Thank you for spending some time thinking through all these options. I agree with you that the Emacs Lisp solution is mostly appealing if it works with emacs --script, and while we don't get there, I think any tool that works well (even more than one) could be recommended.

But also, even when we achieve good usability with file watchers written purely in Emacs Lisp, I think this docker setup is still invaluable for weblorg, because a lot of people have docker installed on their computers and it becomes friction-free to get started from there.

Regarding your thoughts on incremental builds, I agree with you that this isn't exactly a trivial issue. If we get the watchers to work with emacs --script, I think it'd be worth pursuing a more sophisticated solution, like installing watchers recursively within the website directory being watched, then enqueueing and debouncing change notifications (this debouncing needs to be per file).
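To make "debouncing per file" concrete, here is a minimal sketch built on Emacs timers. Each file gets its own pending timer, and a new change to the same file cancels and restarts only that timer. All `sketch-` names are hypothetical; nothing like this exists in weblorg yet.

```elisp
(defvar sketch-debounce-timers (make-hash-table :test #'equal)
  "Map from file name to the pending timer for that file.")

(defun sketch-debounce--fire (file fn)
  "Forget the pending timer for FILE and call FN with FILE."
  (remhash file sketch-debounce-timers)
  (funcall fn file))

(defun sketch-debounce (file delay fn)
  "Call FN with FILE once DELAY seconds pass without new changes to FILE."
  (let ((pending (gethash file sketch-debounce-timers)))
    ;; A newer change to the same file replaces the older pending call.
    (when (timerp pending)
      (cancel-timer pending)))
  (puthash file
           (run-with-timer delay nil #'sketch-debounce--fire file fn)
           sketch-debounce-timers))
```

A file-notify callback could then call something like `(sketch-debounce (nth 2 event) 0.5 #'some-rebuild-function)`, where `some-rebuild-function` stands in for whatever rebuilds the pages affected by that file.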

Once we receive a change notification, we need to sort out the type of the changed file, parse it, and find extends tags in the case of templates, or include statements if it's an org-mode file. We can't be that smart for any other type of file, but we can allow people to hook up arbitrary commands to patterns, e.g. *.scss + sass {{ post.file }}, so when we detect changes to files with such extensions, we trigger the recorded command.
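The parsing step could start out as simple as two regular expressions. The sketch below assumes templates use a Jinja-like `{% extends "name" %}` tag with double quotes and that Org files reference other files with a quoted `#+INCLUDE: "file"` keyword; other spellings are not handled, and the function names are hypothetical.

```elisp
(defun sketch-collect-matches (regexp text)
  "Return the first capture group of every match of REGEXP in TEXT."
  (let ((start 0)
        (found '()))
    (while (string-match regexp text start)
      (push (match-string 1 text) found)
      (setq start (match-end 0)))
    (nreverse found)))

(defun sketch-template-parents (text)
  "Return the template names that TEXT extends.
Assumes the Jinja-like form {% extends \"name\" %}."
  (sketch-collect-matches
   "{%[ \t]*extends[ \t]+\"\\([^\"]+\\)\"[ \t]*%}" text))

(defun sketch-org-includes (text)
  "Return the file names that TEXT pulls in with #+INCLUDE keywords.
Assumes the file name is written in double quotes."
  (let ((case-fold-search t))
    (sketch-collect-matches
     "^[ \t]*#\\+include:[ \t]+\"\\([^\"]+\\)\"" text)))
```

A real implementation would more likely reuse the parsers that templatel and Org already have instead of regular expressions, but this shows the kind of information that needs to be extracted.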

These of course are just thoughts until we understand why registering file notifications works but the inotify file descriptor within Emacs (inotifyfd within src/inotify.c) doesn't seem to be watched. I don't know much about the Emacs source code, but I'll continue the research there and report here! Meanwhile, I pushed my progress on the Emacs Lisp file watcher here if you want to take a look: https://github.com/emacs-love/weblorg/compare/file-watch

I will dive in this weekend and figure out why my build didn't work. I noticed that I didn't have weblorg installed within the container, though templatel was there. That was where I left the debugging, but I'll get to the bottom of it soon and report! 😄

Thank you once again for all your thoughts and time on this! Have a great weekend! 🙇🏾

nanzhong commented

I had a bit of spare time today to revisit this. My understanding of emacs internals and emacs lisp is very superficial, so take what I say with a giant grain of salt 😛.

I think the reason you're having trouble with the file watches is because of the way emacs batch mode handles events (it doesn't have the usual event loop as in interactive mode) and emacs' threading model (cooperative threads, but still single-threaded). What I think is happening on your branch is that you are either deadlocking or livelocking between the worker and main thread, and not actually allowing the file watch handler to run.

I've taken a stab at this and have something basic working (only tested on linux). The general idea is:

  • Maintain two pieces of state: a hashtable of all the directories being watched and whether a re-export is needed.
  • On (weblorg-watch), set up directory watches for all the relevant dirs (we only care about directories because we can get file events by watching a dir).
  • When a file change event is detected: check if we need to update (add/remove) directory watches, mark whether a re-export is needed.
  • Use a timer to debounce exports.

The sequence of things is:

  1. (weblorg-watch) is called
  2. Set up all the relevant directory watches
  3. Start the re-export timer that only exports if needed (and resets the state after export)
  4. Hook into the event loop via (read-event) (this feels like a really big hack... we're essentially blocking on events in batch mode triggering emacs idleness so that the file watch event handlers can do their thing)
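The sequence above can be sketched roughly as follows. This is not the code from the branch linked in this comment; the `sketch-` names, the one-second timer interval, and the timeout passed to `read-event` are assumptions, and the blocking loop has not been verified under emacs --script.

```elisp
(require 'filenotify)

(defvar sketch-export-needed nil
  "Non-nil when a change arrived since the last export.")

(defun sketch-on-change (_event)
  "File notification handler: only mark that a re-export is needed."
  (setq sketch-export-needed t))

(defun sketch-export-if-needed (export-fn)
  "Call EXPORT-FN if a change is pending, then reset the state.
Resetting after the export follows the description above, which is
also why changes arriving during an export can be lost."
  (when sketch-export-needed
    (funcall export-fn)
    (setq sketch-export-needed nil)))

(defun sketch-watch (dirs export-fn)
  "Watch DIRS and call EXPORT-FN after changes.  Never returns."
  (dolist (dir dirs)
    (file-notify-add-watch dir '(change) #'sketch-on-change))
  ;; Debounce naively: check for pending work once per second.
  (run-with-timer 1 1 #'sketch-export-if-needed export-fn)
  ;; Block on input so Emacs goes idle and can run timers and
  ;; file notification handlers.
  (while t
    (read-event nil nil 1)))
```

Calling `(sketch-watch (list "/path/to/site") #'weblorg-export)` would be the intended shape of use, with the directory path being a placeholder.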

Generally this seems to work, but there are some really obvious issues:

  • Because we're just tracking "dirty" state and debouncing naively via the timer, change events to additional files during an export (which can take a while) are going to be lost.
    • I think we can address this one by tracking more information:
      • instead of a single "dirty" state, we should track a list of change events that include file changed and time of change
      • cache more information about when an export took place for a given input file of a route
    • Using that information together with when the re-export started, we should be able to determine if we've captured all the pending changes.
  • Exporting everything is a big hammer that is not specific enough.
    • I think we can address this by building a dependency graph of what a route's input file depends on (e.g. the org file itself, the theme's path, etc.) and only selectively exporting the transitive dependents of the file change that we detect.
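As a sketch of the first idea, tracking a list of change events instead of a single "dirty" flag, the code below records each change with a timestamp and, when an export starts, takes only the changes that happened up to that moment. Anything recorded later stays pending for the next export. The names and the data layout are assumptions made for this example.

```elisp
(defvar sketch-changes '()
  "Pending change events as (FILE . TIME) pairs, newest first.")

(defun sketch-record-change (file &optional time)
  "Record that FILE changed at TIME (seconds; defaults to now)."
  (push (cons file (or time (float-time))) sketch-changes))

(defun sketch-take-changes-before (start)
  "Remove and return the files changed at or before START.
Changes recorded after START stay in `sketch-changes', so an export
that began at START does not silently swallow them."
  (let ((taken '())
        (kept '()))
    (dolist (change sketch-changes)
      (if (<= (cdr change) start)
          (push (car change) taken)
        (push change kept)))
    (setq sketch-changes (nreverse kept))
    (delete-dups taken)))
```

After an export finishes, a non-empty `sketch-changes` means another export is still needed.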

I've pushed my basic implementation to https://github.com/nanzhong/weblorg/tree/watch. Feel free to take that and improve on it.

cc @clarete


clarete commented Sep 2, 2021

hi hi @nanzhong this is pretty awesome to say the least. Thank you so much for taking the time and for putting together something so fun to play with! What a great hack with the timers! 😄 I've just had some time to look more closely at the code, and the watching does seem to work well enough to be worth exploring a more serious implementation of this feature.

Now that we have a direction on such a deep issue, I think building the dependency graph sounds like a good next task, since without it we can't really provide a good implementation of what to do when a notifier event arrives. That might be something that we have to add to templatel first (to be able to map template inheritance into that graph).
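To give the dependency graph a concrete shape, here is a minimal sketch that stores, for each file, the files that directly depend on it, and then walks that table to find everything affected by a change. The representation (a hash table of reverse edges) and the `sketch-` names are assumptions; how templatel would expose template inheritance is still open.

```elisp
(defun sketch-add-dependency (graph file dependency)
  "Record in GRAPH that FILE depends on DEPENDENCY.
GRAPH is a hash table mapping a file to its direct dependents."
  (puthash dependency
           (cons file (gethash dependency graph))
           graph))

(defun sketch-dependents (graph changed)
  "Return every file in GRAPH that depends on CHANGED.
Both direct and transitive dependents are included.  Files already
seen are skipped, so a cycle cannot cause an endless loop."
  (let ((queue (list changed))
        (seen '()))
    (while queue
      (dolist (file (gethash (pop queue) graph))
        (unless (member file seen)
          (push file seen)
          (push file queue))))
    (nreverse seen)))

;; Example: a post is rendered with post.html, which extends base.html.
(let ((graph (make-hash-table :test #'equal)))
  (sketch-add-dependency graph "post.html" "base.html")
  (sketch-add-dependency graph "posts/hello.org" "post.html")
  (sketch-dependents graph "base.html"))
```

With such a table in place, a change notification for a template would map to the list of inputs that need to be re-exported, instead of the whole site.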

The other random thought I had was to generate HTML on the fly instead of re-exporting the whole site to disk, and then add some hot reloading script to integrate nicely with the browser development experience. Hehehe, of course every word I wrote in this last paragraph is worth 50h of Emacs Lisp hacking, so it might not happen very soon, but it's still worth thinking about 😺

Fun times! \o/
