Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sonars - pools of lightweight path-aware reactions #165

Closed
darwin opened this issue Aug 1, 2015 · 26 comments
Closed

Sonars - pools of lightweight path-aware reactions #165

darwin opened this issue Aug 1, 2015 · 26 comments

Comments

@darwin
Copy link

darwin commented Aug 1, 2015

During implementation of Plastic I ran into a performance issue. I'm using re-frame and reagent and for complex input files my code generated thousands of reactions to watch changes in my app-db (which is RAtom).

When profiling I discovered that my re-frame subscriptions were responsible for generating hundreds of reactions similar to this one (just paths were different):

(make-reaction (fn [] ((let [interesting-value (get-in @app-db [:some :specific :path])]
                         (do-something-with interesting-value)))))

What is the problem?

There are two in fact:

  1. Each reaction registers watchers on all derefed things it finds. In this case my app-db ended up with thousands of watch handers registered to it.
  2. Each reaction does identical? test as an optimisation to prevent reactions on changes with identical values (see also discussion here). But this test is not path-aware. It checks identity of content of derefed atom, but doesn't test identity for the path actually used. In my case app-db is pretty busy and changes every time anyone touches anything deep inside (which is promoted design pattern with re-frame and in general I guess).

My solution

I have implemented a light-weight path-aware reaction (sonar-reaction) which solves the problem 2. It exactly knows what source it derefers and which path is then fetched. So it can allow identical? test for the path.

Also I have implemented a pool for such reactions (sonar), which does add-watch subscription on source atom only once to solve the problem 1. It also does all identical? checks on behalf of individual sonar-reactions in one go in an efficient way (by pruning the paths tree top-to-bottom):
https://github.com/darwin/plastic/blob/5cbb01f210b29cb706f442d646a7ddab5adb49e7/cljs/src/main/plastic/reagent/sonar.cljs#L29-L36

Here is current implementation:
https://github.com/darwin/plastic/blob/master/cljs/src/common/plastic/reagent/sonar.cljs

The code does also management of sonars. They go away when last sonar reaction in the pool is disposed.

As a side note. This solution could be generalised to handle:

  1. multiple paths per single sonar-reaction
  2. multiple derefed sources per single sonar-reaction

But that wasn't my use-case.

Why opening this issue?

I wanted to share this solution, because it interoperates seamlessly with the rest of reagent. Keep in mind that sonar reactions are just special optimised reactions so other reactions can deref them. It may be an optimisation step for people who run into similar issues.

Also this could be an inspiration for future reagent optimisation in more general case. Reagent could pool reactions by deref-source to reduce number of add-watch calls. Also implementation of RCursor could implement similar optimisation as I outlined here.

@seantempesta
Copy link

That's pretty cool. I've been thinking about doing something like this for a while, but haven't hit a sufficient number of reactions to warrant the optimization. Personally, I think it'd be rad if ratom had a path aware option built in, since it seems like the most common case.

@GetContented
Copy link

@seantempesta Isn't "a path aware option" exactly what a cursor is?

@darwin
Copy link
Author

darwin commented Aug 1, 2015

@JulianLeviston currently, cursor is just a wrapped (get-in @ratom path) reaction:
https://github.com/reagent-project/reagent/blob/master/src/reagent/ratom.cljs#L119

AFAIK it does not help you reduce number of ratom watchers or optimize them for path lookup. Actually that was my first trial to use RCursor to optimize my performance bottleneck, but soon I realised that is not possible.

@GetContented
Copy link

Ah ok. Damn that probably explains why I had to do a lot of crazy things in my app to mitigate the large amount of rerendering that was happening. I ended up using a debouncing core async channel as a queue.

Also explains why sometimes things get INCREDIBLY slow sometimes. You wouldn't notice it but for the fact that my app has huge numbers of component instances in it, like yours.

I wonder if you could integrate the optimisation into reagent standard? Possibly by registering a single reaction that also has many of your path aware reactions hanging off it, so that all subsequent calls to reaction actually return a lightweight path aware "reaction" which is just a registration into that one central pool that only has one observer on it per ratom. Without modifying the API?

Haha sorry I'm half asleep. I just realised that that's precisely what you're doing :) But it would be good if you could just integrate it into the current reaction API, no?

@darwin
Copy link
Author

darwin commented Aug 1, 2015

I think the current reaction API is quite smart from UX perspective, it uses deref feature to feel familiar and convenient. But it has two drawbacks, first it cannot be path-aware (you would need to pass path parameter when doing deref).

And IMO the second drawback is that reactions have to "scan" for derefs in the code by running it. Each invocation can return different set of "derefables" used and reaction has to watch for changes and do book-keeping around that dynamic list (register/unregister watchers on underlying derafables, etc.). Again this is convenient for people who don't want to care about guts of reaction machinery, but this is unnecessary in most cases. I believe my code does not branch to fetch values from different atoms/paths based on some dynamic value.

SonarReaction is simplified to the point that it has only one static derefable it depends on so there is no book-keeping needed.

@darwin
Copy link
Author

darwin commented Aug 1, 2015

@JulianLeviston have you tried to profile your code in Chrome Devtools? It provides pretty powerful tools to see where is your app spending time.

@danielcompton
Copy link
Contributor

@darwin I've been thinking about this recently for re-frame. Every time you subscribe to a subscription, you get a new Reaction created. If re-frame returned the same Reaction each time (by storing it in an atom) instead of creating a new one this would cut down on the watcher burden.

I haven't fully thought through the implications of this though.

@GetContented
Copy link

@darwin yes, I have... but not recently... I've only recently finished the rewrite from coffescript/Ember (because Ember is too magic, it's not declarative or functional and Ember Data wasn't ready) to cljs/Om. And then from cljs/Om to cljs/Reagent (because when doing serious applications, Om's syntax gets in the way too much, and the pre-om-next atom/cursor story was a little half-baked and makes doing some complicated things incredibly hard - obviously not to detract from David Nolen. I deeply respect him.)

Also, I had some other things that were slowing the system down (mostly the fact that I'm still stuck on Rails on the backend), but also the fact that I have a big iFrame previewer of HTML generated from my editor which on every keystroke was causing it to re-render (and go back to the server and ask for all the assets, which was causing the server to run out of memory). Basically it's been a big old mess, and the Reagent re-rendering issue hasn't even been the worst proplem.

I also kind of "black boxed" Reagent's stuff up to a degree, up to this point. I guess I'd hoped / assumed that it'd be built very efficiently, but I guess for most people's use-cases this efficiency just doesn't matter. Mind you, that's why Rails is so slow (roughly 100 req/sec on modern hardware).

Time to run it again :)

@mike-thompson-day8
Copy link
Member

@darwin You provide this code for discussion:

(make-reaction (fn [] ((let [interesting-value (get-in @app-db [:some :specific :path])]
                         (do-something-with interesting-value)))))

Your make-reaction wraps two steps:

  1. a value is extracted from the (single source of truth) ratom, via a path
  2. some computation happens using this value

In your code above, step 1 is:
(let [interesting-value (get-in @app-db [:some :specific :path])
and step 2 is:
(do-something-with interesting-value)

For the sake of efficiency, we'd like for step 2 to only happen when step 1 returns a different value, where "different" is judged by identical?.

But when both steps are wrapped in the same reaction step 1 and step 2 are bonded together. We'd like a scheme which allowed them to act independently.

The easiest way to achieve this is via chained reactions (For simplicity, I'll switch away from using make-reaction and use the reaction macro here):

(let [interesting-value (reaction  (get-in @app-db [:some :specific :path]))]    ;; step 1
  (reaction  (do-something-with  @interesting-value)))                    ;; step 2

With this two-reaction arrangement, step 2 computation will never happen unless it's inputs change. If the input @interesting-value yields an identical value, then step 2 won't be recomputed, and the "propagation" of changes/recomputations are short circuited at that point (and Views won't rerender unnecessarily)

In many cases, this two step arrangement will give you all the performance you need. But there are ways to make things more efficient again, if you want to go to the effort.

@GetContented
Copy link

@mike-thompson-day8 would it be safe to say that adding this or even a more efficient version into reagent would be worthwhile?

@seantempesta
Copy link

I'm still kinda new to this world (which is awesome btw), so correct me if
I'm wrong:

  1. The ratom knows when data inside of it changes, and exactly which paths
    those changes occur in
  2. This ratom could record a list of callback functions to fire when a
    specified path changes
  3. Reactions could specify a path into the ratom (thereby registering
    their callback)

@JulianLeviston https://github.com/JulianLeviston: no, I hadn't thought
of using cursors, but this does seem like the perfect mechanism to register
#3?

Sean

On Sun, Aug 2, 2015 at 9:45 AM, JulianLeviston notifications@github.com
wrote:

@mike-thompson-day8 https://github.com/mike-thompson-day8 would it be
safe to say that adding this or even a more efficient version into reagent
would be worthwhile?


Reply to this email directly or view it on GitHub
#165 (comment)
.

@GetContented
Copy link

@seantempesta not a callback, IMHO - just an observer. I hadn't realised that every cursor adds an additional reaction on the main atom. I'd assumed it already behaved in the efficient way that "sonar-reaction" solves.

My main interest was whether @darwin and others would suggest/allow putting this into reagent proper, coz it seems like it should function that way.

Also keen to hear what @mike-thompson-day8 's changes that are even more efficient are. :)

@mike-thompson-day8
Copy link
Member

@seantempesta as the self-appointed president of the Cursor Cynics Society, I'd encourage you to resist the siren call of Cursors. Your architecture will thank you for it. (I'm actually not that against read-only cursors, but I'm vociferously anti read/write cursors).

To me, FRP is a much more useful and powerful concept.

Realise that in my example previously, the creation of step1 and step2 as separate steps was just the process of creating a two step FRP Signal graph.

@JulianLeviston there's not a lot to add to Reagent, other than perhaps better explanation. If you have a LOT of reactions referencing the same path, and it sounds like @darwin does, then perhaps cache the signal to that path, so you stop duplication. Sketch of the code (untested) ...

A cache of existing signals for certain paths:

(def path-signal-cache  (atom {}))   ;; maps from path to Signal 
(defn signal-for-path 
  [app-db path]
  (if-let [cached-signal (get  @path-signal-cache path)]
    cached-signal
    (let [sig  (reaction (get-in @app-db path) 
                         :on-dispose #(swap! path-signal-cache dissoc path))]   ;; <-- notice
      (swap! path-signal-cache assoc  path sig)
      sig)))

When a cached signal ends up not used anymore, it will be disposed of, and the on-dispose callback will remove it from the cache.

Using this setup, the previous 2 step becomes:

(let [interesting-value (signal-for-path  app-db  [:some :specific :path])]    ;; step 1
  (reaction  (do-something-with  @interesting-value)))                            ;; step 2

Even if you had 500 references to that path, the signal graph for all 500 would share the one Signal node. And further more, if that path hadn't changed, then none of the 500 dependent reactions would ever re-compute.

@JulianLeviston if yu did want to add something to reagent, I think it would be a declarative way to create a signal graph. I'm thinking of something like Prismatic's Graph library. But that's another day for me :-)

@mike-thompson-day8
Copy link
Member

BTW, if you are using re-frame, I'd recommend this approach to Debugging (what's getting rendered, what subscriptions are firing, etc). It is still something of a work in progress, but it has progressed far enough to give you interesting information.

https://github.com/Day8/re-frame/wiki/Debugging

@GetContented
Copy link

@mike-thompson-day8 agreed about FRP. Couldn't agree more. I'm not using re-frame.

@GetContented
Copy link

@mike-thompson-day8 It's a little disappointing that code runs when it isn't necessary, though. For instance, my medium sized pages end up with about 3000 watchers for some reason. Does that mean every single time my one ratom changes, it runs 3000 functions each time? GUH that's abysmal. I guess it's the price you pay for not being in a truly lazy language.

I guess I'm probably using Reagent somewhat incorrectly... I should most likely be building pure functions as components, rather than building a lot of cursors... that would probably limit the amount of reactions created, and by consequence limit the sheer amount of code that runs. I need to experiment! but, first I have to rebuild my server software, get more clients, and add more features to our system. LOL

@seantempesta
Copy link

@mike-thompson-day8 https://github.com/mike-thompson-day8: hahaha. The
siren song is so beautiful though! Actually, I started learning
Clojurescript by learning Om (where I was introduced to cursors) and also
found the concept of the r/w cursor to be rather limiting. You'll be happy
to know that I'm using re-frame at the moment and am really liking it!

@JulianLeviston https://github.com/JulianLeviston: I'm also disappointed
that every time my app-db changes all of my reactions are re-run. It just
seems...unnecessary. The ratom knows which paths have changed, so it
should only notify the relevant reactions to run.

So even if I were to cache the signals, each signal will run anytime the
app-db changes? A "reaction" just re-runs anytime any deref'ed atoms
change, right?

Sean

On Sun, Aug 2, 2015 at 2:33 PM, Mike Thompson notifications@github.com
wrote:

@seantempesta https://github.com/seantempesta as the self-appointed
president of the Cursor Cynics Society, I'd encourage you to resist the
siren call of Cursors. Your architecture will thank you for it. (I'm
actually not that against read-only cursors, but I'm vociferously anti
read/write cursors).

To me, FRP is a much more useful and powerful concept.

Realise that in my example previously, the creation of step1 and step2 as
separate steps was just the process of creating a two step FRP Signal
graph.

@JulianLeviston https://github.com/JulianLeviston there's not a lot to
add to Reagent, other than perhaps better explanation. If you have a LOT of
reactions referencing the same path, and it sounds like @darwin
https://github.com/darwin does, then perhaps cache the signal to that
path, so you stop duplication. Sketch of the code ...

A cache of existing signals for certain paths:

(def path-signal-cache (atom {})) ;; maps from path to Signal

(defn signal-for-path
[app-db path](if-let [cached-signal %28get @path-signal-cache path%29]
cached-signal
%28let [sig %28reaction %28get-in @App-DB path%29
:on-dispose #%28dissoc path-signal-cache path%29%29] ;; notice on-dispose
%28swap! assoc path-signal-cache path sig%29
sig%29))

When a cached signal ends up not used anymore, it will be disposed of, and
the on-dispose callback will remove it from the cache.

Using this setup, our two previous 2 step becomes:

(let [interesting-value (signal-for-path app-db [:some :specific :path])] ;; step 1
(reaction (do-something-with @interesting-value))) ;; step 2

Even if you had 500 references to that path, the signal graph for all 500
would share the one Signal node. And further more, if that path hadn't
changed, then none of the 500 dependent reactions would ever re-compute.

@JulianLeviston https://github.com/JulianLeviston if yu did want to add
something to reagent, I think it would be a declarative way to create a
signal graph. I'm thinking of something like Prismatic's Graph library. But
that's another day for me :-)


Reply to this email directly or view it on GitHub
#165 (comment)
.

@danielcompton
Copy link
Contributor

@seantempesta have a go reading Mike's comment again and https://github.com/Day8/re-frame/blob/master/README.md#a-more-efficient-signal-graph. While all of your top level reactions will be rerun when you change app-db, if you make your top level reactions just derefs of app-db, you will only be running a very cheap reaction every time. When a path hasn't changed, propagation will stop at the top level deref.

So, is your method more efficient? It almost certainly is because it's doing less work. In practice, it would be good to see some numbers showing how much cheaper it is. I imagine that would go a long way to convincing people that it's worth merging this with Reagent.

@seantempesta
Copy link

@danielcompton: Yeah, I get that. I've been writing multi-level reactions in the app I'm building and it's been going great. I know that it isn't that expensive to re-check all of the top-level reactions if it's just a get-in on a deref'ed ratom -- which is why I decided not to worry about this until I hit a performance problem.

But checking the same data over and over again (which I guess is the React way) isn't necessary with this kind of architecture. We know exactly what changed when we centralize the app-state in an ratom.

I guess it just feels inelegant.

@danielcompton
Copy link
Contributor

Sure, I agree too. And the other side of this coin is not recreating multiple reactions in re-frame for the same subscription. Both of these would speed things up.

@GetContented
Copy link

lol @mike-thompson-day8 I didn't quite understand where you were coming from with your "signal for path" example the other day until today... I just came back to this problem today because one of our clients hit this problem hard on our app... couldn't load the page. So I have to do something about it, faste. So I actually (re-)invented your "signal for path" example just now, then it reminded me of what you were saying, so I came and looked at your code and you have the exact code I'm writing. Very nice :-) 👍 And I did some tests before to check if it'd be fast, and 3000 atom swaps happens incredibly fast, so I'll implement this shortly. :)

Tho... I'm not entirely convinced, because it seems your code just wraps a reaction around the app-db again for each signal path, which according to my understanding of the way reactions work, will make each trigger every single time the app-db changes (which is what's causing my app slowness, if I understand correctly). I'll persist with my idea and see where it takes me. Off the top of my head, I'd thought to hold a set of ratoms and have a single reaction which updates all of the ratoms in the "path signal cache" by swap. Anyway... we'll see :)

LOL I'm going to read the source for ratom and reaction so I make sure I really understand how they work properly.

@sebluy
Copy link

sebluy commented Aug 12, 2015

I've been playing around, trying to solve similar problems. Here is my attempt:
https://github.com/sebluy/sigsub
I just added some basic documentation so take a look if your interested.

@GetContented
Copy link

@sebluy yeah, I've been doing an experiment, too. Yours looks very much like Hoplon's cell system to me.

I've got a working system in my app, now, however my subscriptions and their relationships are much more complex than a simple tree: I often have subscriptions that rely on multiple separate data sets, and they're not small data sets. Also, many components rely on the same data and I don't want to send that data around through the app. The shape of my data isn't a simple tree. I keep running into browser speed issues when I run it on slower machines because when I load my data it's loading in paginated to be kind to my server, which causes multiple computation passes of the subscriptions in the client as it constantly recomputes all the dependencies.

So, I'm currently rewriting my data-loading functions so that they collate all the data first, then swap it in finally when it's all in the client. This will make things much faster.

Having hundreds or thousands of watches on my main model atom was causing my app to grind to a halt before, but it's much snappier now, as I have each subscription having its own cached data r/atom. Any time I have something updating my main model, this busts the caches that need busting (subscription specifiers can take a predicate which also gets used to check if the cache needs busting across some data), and does a recomputation of the subscriptions which are interested in the new data. Works very nicely, and much more efficient than using r/cursors or r/atoms directly.

@mike-thompson-day8
Copy link
Member

I'm tempted to close this issue. Thoughts?

This issue feels more about documenting techniques. It feels like something that should be written up in the Wiki? Anyone up for that? I'd like to get more contributed docs into the Wiki.

@GetContented
Copy link

Yeah, I agree.

It's something that the industry has been, an continues to grapple with ongoing. I'm not convinced there's a clear, great path here. Elm's architecture (https://github.com/evancz/elm-architecture-tutorial/), Om-Next, Reagent using various architectures, etc. all have their positives and negatives. Anyone building non-trivial apps that are backed by a database or other backend needs to be aware of these issues (intermediate Reagent programmers), but a github "bug-tracker issue" really isn't the right spot for it.

@darwin
Copy link
Author

darwin commented Aug 24, 2015

Ok, no problem. Closing.

@darwin darwin closed this as completed Aug 24, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants