Reuse namespace #43
Comments
Another idea that I want to throw out here. I tried the following snippet in
It takes 65s to run, without caching. This suggests that if we have a better dependency manager, then we can require a bunch of files at once, saving the cost of (The running time reduces even further to 28s after excluding |
If you can make Pollen go faster, great. Making it go faster while preserving its features is the hard part, I have found.
This is what I thought when I added parallel rendering. It was less true than I hoped. 🤯

Removing steps from an expensive computation is a great way to save time. But it’s only “free” if you know for sure that skipping those steps never leads to incorrect results. Attaching permanent caveats (“it works, if you know that X, Y, and Z are true”) leads to despair, which is not free.

IIRC the reason fresh namespaces were necessary was to support dynamic re-evaluation during an interactive project server session. Otherwise,

You may be right, however, that certain simplifications are possible during a non-interactive session (say, when using |
For instance. The reason, say, Scribble can be faster on large documents is that all the component source files are pulled into one master source; this one source is compiled & evaluated, and then multiple pages are emitted as output. Pollen, by contrast, has one source per output file, each separately evaluated.

OTOH Scribble can do this because it exerts more control over how the document is structured. You can import your own functions to a Scribble source. But it doesn’t permit the granularity of control that Pollen does. Costs vs. benefits.

I’ve considered, at least, whether Pollen could similarly “gang” files together and consolidate evaluations. For instance, by packing a number of source files into another as submodules. But I don’t see why this would change anything, aside from repositioning the pieces on the board. A module has the same evaluation costs regardless of whether it’s a submodule or a standalone source file.

As a middle approach, I’ve also considered whether Pollen could introduce a concept of a one-to-many page. This would be faster to evaluate, because it would be a single evaluation (like a Scribble source). But it would be a distinct concept within Pollen from the current preprocessor / Markdown / markup files. The problem with a one-to-many file type is that it makes dynamic refresh annoying, because now you have to refresh a possibly huge source in order to refresh one small part. |
The other issue with one-to-many page generation is that I have never wanted this once for my own work. For me, the value of Pollen is exactly that it is so luxuriously indulgent. Every page triggers a full program evaluation! Where else can you get this? Nowhere. By contrast, the one-to-many publishing model is well covered by other tools — Scribble, or Frog, or a zillion other static-site generators beyond. So, though I am always interested in making Pollen faster, it only makes sense if the new technique supports the core theory of operation. Which is why, so far, I have focused more on file-based caching and more recently parallel processing. I’m sure there are other good ideas yet to be discovered. |
What would be a test case that demonstrates this behavior? The fix in #49 doesn’t break any existing Pollen tests, nor any of my own projects. Moreover, Pollen doesn’t guarantee a clean namespace for rendering — like I say, it’s more of a necessity to support dynamic refresh during an interactive session. My hunch is that the situation doesn’t arise much in the wild, because Racket naturally deters use of global variables and mutation. |
Consider:
Prior to the namespace reuse,
After the namespace reuse,
|
I think I would call this a case of nondeterministic compilation, in which case Pollen’s guarantees needn’t be any stronger than Racket’s. For instance, if we convert these files to Racket modules, we’d get the same weird behavior:
Suppose these all live in collection |
I think mutation like this is quite common when one wants to communicate across tags. E.g., making footnotes. There's a way to make it work by dealing with things in the
And this would work prior to namespace reuse, with

Note that I'm not saying that producing "1 or 2" and "3 and 4" is wrong. It's an acceptable behavior, but there should be a way to produce "1 or 2" and "1 and 2".

One easy way to fix this problem is to create a tag named

One feature that I think will be very useful is some sort of

However, this means
Then:
will deterministically produce:
But as you can see, I need to wrap everything in

But OK, perhaps a macro is too demanding; then another possibility is thunking. That is,
But users are allowed to override
|
Note: I edited the above comment a lot. You might want to read it from GitHub instead of email. |
Yes — moreover, this is the Rackety way to go about it, and using fresh namespaces would be both perverse and slow.
The idea of a function named, say, |
|
I’m not averse to something like |
Would this be acceptable? |
Why not try moving |
Just chiming in to say I do use mutable hash tables in my However, I’m not complaining or asking to revert. I am persuaded that the new way has benefits. I just want to understand what the implications are right now for state that I want preserved between tag function calls but not across pages when doing parallel renders. Are parameters no longer sufficient for this purpose? I understand that refactoring so that dealing with everything inside |
Right: you’ll need to manage the state for each page explicitly, rather than relying on that behavior as a side effect of fresh namespaces. In general, using

Concatenating the keys would work, though it makes per-page queries a little messy. One could also convert a footnote hash into a hash with subhashes: the top level is indexed by

```racket
#lang racket
(require pollen/core)

;; Top-level hash: one entry per page, keyed by the page's source path.
(define fn-hash (make-hash))

(define (fn txt)
  ;; Path of the page currently being rendered.
  (define page-path (hash-ref (current-metas) 'here-path))
  ;; Footnotes for this page only; the subhash is created on first use.
  (define fn-hash-page (hash-ref! fn-hash page-path make-hasheq))
  ;; Number footnotes from 1 within each page.
  (define fn-count (add1 (hash-count fn-hash-page)))
  (hash-set! fn-hash-page fn-count txt)
  (format "~a is fn ~a" txt fn-count))
```
|
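To see the per-page numbering without a Pollen project, here is a minimal sketch of the same nested-hash idea. It assumes a stand-in parameter, `current-page-path`, in place of `(hash-ref (current-metas) 'here-path)`; that parameter and the page names are hypothetical and exist only for this illustration.

```racket
#lang racket/base

;; Hypothetical stand-in for (hash-ref (current-metas) 'here-path).
(define current-page-path (make-parameter "unknown.html.pm"))

;; Top level: page path -> subhash of that page's footnotes.
(define fn-hash (make-hash))

(define (fn txt)
  (define fn-hash-page (hash-ref! fn-hash (current-page-path) make-hasheqv))
  (define fn-count (add1 (hash-count fn-hash-page)))
  (hash-set! fn-hash-page fn-count txt)
  (format "~a is fn ~a" txt fn-count))

;; Simulate rendering two pages in the same namespace.
(define page-a
  (parameterize ([current-page-path "a.html.pm"])
    (list (fn "one") (fn "two"))))

(define page-b
  (parameterize ([current-page-path "b.html.pm"])
    (list (fn "three"))))
```

Here `page-a` numbers its notes 1 and 2, and `page-b` starts again at 1 even though both share `fn-hash`, because each page writes to its own subhash.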
I think it's the same reason why |
I’ve been testing and working to ensure I fully understand the implications of this change. Tell me if I have this correct:
And finally:
As to this last bit, consider this MVE. I could not find a sequence of |
The fresh namespace is only necessary in the project-server context, because that’s the only way to make sure that all updated source files (incl

Your description of the behavior seems right except for the last point. It would be more accurate to say that after this change, a Pollen source may or may not be evaluated in its own namespace, just as currently, it may or may not be evaluated in parallel. In both cases, the programming should not depend on any side effects of these environments.

That said, one can still avoid parallel processing, which is possibly useful for projects that want a guaranteed evaluation order. Likewise, I could add a command-line switch or

Your code example depends on mutation of a global variable, which is always going to be troublesome. |
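One way to avoid a mutable global, sketched here under assumptions rather than as something Pollen provides: keep the state in a Racket parameter and bind it to a fresh hash around each unit of work. The names `current-footnotes` and `call-with-fresh-footnotes` are hypothetical, and where such a wrapper would be called in a real project (for example, from the function that processes a whole page) is left open.

```racket
#lang racket/base

;; Hypothetical parameter holding the footnotes of the page being rendered.
(define current-footnotes (make-parameter #f))

(define (fn txt)
  (define notes (current-footnotes))
  (define n (add1 (hash-count notes)))
  (hash-set! notes n txt)
  (format "~a is fn ~a" txt n))

;; Hypothetical wrapper: run `thunk` with an empty footnote table.
(define (call-with-fresh-footnotes thunk)
  (parameterize ([current-footnotes (make-hasheqv)])
    (thunk)))

;; Two "pages" evaluated one after the other in the same namespace.
(define page-a
  (call-with-fresh-footnotes
   (lambda () (list (fn "one") (fn "two")))))

(define page-b
  (call-with-fresh-footnotes
   (lambda () (list (fn "three")))))
```

Because every call to `call-with-fresh-footnotes` installs a new hash, the numbering in `page-b` does not depend on what `page-a` did, whether or not the two share a namespace.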
I did an experiment by modifying Pollen to use the same namespace instead of creating a new one for every file. For `pollen-tfl` with one thread, the rendering time after `raco pollen reset` reduces from 332s to 121s.

Of course, the behavior would be different. In particular, if files like `pollen.rkt` have a side-effect (say, mutate a global variable), then the side-effect would persist across rendering multiple files. However, for projects that don't have side-effects (which are probably the majority?), I think this is a performance boost for free?

Perhaps there should be an option to allow using the same namespace?
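The behavioral difference can be reproduced in plain Racket, independent of Pollen. The sketch below is only an illustration: the `counter` module is a hypothetical stand-in for a `pollen.rkt` with a side effect, and `render-page` is not Pollen's renderer, just a call into that module from a given namespace.

```racket
#lang racket/base

;; Hypothetical stand-in for a pollen.rkt that mutates module-level state.
(define counter-module
  '(module counter racket/base
     (provide next!)
     (define n 0)
     (define (next!) (set! n (add1 n)) n)))

;; Build a namespace in which the counter module is declared.
(define (make-render-namespace)
  (define ns (make-base-namespace))
  (parameterize ([current-namespace ns])
    (eval counter-module))
  ns)

;; "Render" one page: instantiate the module if needed, then call into it.
(define (render-page ns)
  (parameterize ([current-namespace ns])
    ((dynamic-require ''counter 'next!))))

;; A fresh namespace per page: the module is instantiated anew each time.
(define fresh-results
  (for/list ([page '("a" "b" "c")])
    (render-page (make-render-namespace))))

;; One shared namespace: the module instance, and its state, is reused.
(define shared-ns (make-render-namespace))
(define shared-results
  (for/list ([page '("a" "b" "c")])
    (render-page shared-ns)))
```

With fresh namespaces every page sees the counter at 1; with the shared namespace the pages see 1, 2, 3. The shared case also skips the repeated instantiation, which is where the time saving in the experiment would come from.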