Facets are actually implementation leak #240

Closed
ivan-kleshnin opened this issue May 22, 2015 · 64 comments

Comments

@ivan-kleshnin
Contributor

I've been thinking about this... Facets are described as "views over data". But the same can be said about cursors. They are also "views over data". The difference is that one kind of data is "static" and the other is "dynamic". But this means nothing. If we have a rule c = f(b), we never conclude that c has a different nature than b. Derived and initial data are expressed in the same syntax and are equal for consumers. The whole of math and computer science is based on that.

Unfortunately, this is not the case with facets. Client code must be aware of this artificial separation:

let foo = state.facets.foo;
vs
let foo = state.select("foo").get()

or

@branch({
  cursors: ...
  vs 
  facets: ...
})

This seems wrong to me. I shouldn't be concerned about such private details of the data in the client code. Is it "static" or "dynamic"? I don't care. I shouldn't have to ask. But now the client code and the implicit rules about our data are coupled, and we can't simply switch between a <- b and b <- a causalities. That is an implementation leak.

Unless I'm missing something, I propose merging the cursor and facet concepts into one more powerful abstraction (keeping the cursor name). Cursors could then be expressed in terms of other cursors and static data.

But... having said that... I'm afraid we are actually reinventing the wheel here.
These issues that @christianalfoni raised:
Yomguithereal/baobab-react#44
#180
push me even further toward the thought that Baobab would benefit from being built on smarter abstraction(s).
Event emitters are too primitive. We want to control initial states, and we want movable parts while keeping a single app state. We want more and more complex primitives to express relations between data in facets, like filters of all kinds...

It sounds like... RxJS Observables could handle this better.
Or CSP channels.

There are attempts to bind React and Rx... with more or less luck.
https://github.com/fdecampredon/rx-flux
https://github.com/r3dm/thundercats

None of them adopts the concept of a single app state; they basically follow the Flux path with distinct Stores. But in everything else... we are moving in the same direction. Any state (including Baobab trees, of course) can be expressed in terms of a temporal reduce function named scan. The difference from the familiar reduce is that scan broadcasts every new state to the observers instead of returning one final value (because most data sources never finish). I wonder if it's possible to drop all that poor event-emitter machinery and rebuild everything on something more powerful and more suitable to our big, big, big list of requirements. Sounds scary, I know.
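
To make the reduce/scan distinction concrete, here is a rough sketch in plain JavaScript. The `scan` below is a hypothetical synchronous helper written for illustration, not the RxJS operator, and the `counter` reducer and its event types are assumptions.

```javascript
// Hypothetical helper: a synchronous stand-in for an observable `scan`.
// It folds events into state like `reduce`, but reports every intermediate state.
function scan(events, reducer, seed, onState) {
  let state = seed;
  for (const event of events) {
    state = reducer(state, event);
    onState(state); // broadcast each new state to the observer
  }
  return state;
}

// A reducer describing how one event changes the state (illustrative names).
function counter(state, event) {
  switch (event.type) {
    case "increment":
      return { count: state.count + 1 };
    case "decrement":
      return { count: state.count - 1 };
    default:
      return state;
  }
}

const events = [{ type: "increment" }, { type: "increment" }, { type: "decrement" }];

// reduce: only the final state is visible to the caller.
const finalState = events.reduce(counter, { count: 0 });

// scan: every intermediate state is delivered to the observer.
const seen = [];
scan(events, counter, { count: 0 }, (state) => seen.push(state.count));
```

With a real observable the event list would never end, which is why only the `scan` form is usable there.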

I'd like to think I overcomplicate things and there is a well-defined outline of what Baobab should and shouldn't do. Somewhere. But I'm afraid I'm not.

🐫

@christianalfoni
Contributor

Hi @ivan-kleshnin ,

I agree with your conclusion that separating static state and dynamic state into two completely different concepts is not ideal. As you say, they are really the same.

I am working on a project called Cerebral where I did this:

https://github.com/christianalfoni/cerebral/blob/master/API.md#compose-state

It allows functions inside the state tree. The state tree is traversed on initialization, the functions are replaced with their "initial state value", and the behaviour is added "behind the scenes". So the functions are never actually part of the tree. It has the same basic types; it's just that some of the state is dynamic.

I think this works rather well actually.

@ivan-kleshnin
Contributor Author

I agree with your conclusion that separating static state and dynamic state into two completely different concepts is not ideal. As you say, they are really the same.

Yes. I forgot to mention two more obvious examples. SQL views are the facet equivalent in the backend world: they are queried like ordinary tables. And Flux aggregate stores. There are more, but that's enough for illustration.

Baobab aims to be a DB for the frontend (at least that's how I see it). There is a project called DataScript which is ClojureScript-only (hard to grasp without Clojure knowledge) but very, very interesting. It also positions itself as a reactive database. It has an immutable data concept (all history is kept by default) and powerful queries over data. Has anybody analyzed it?

I want to encourage people here to think and to discuss. That's the best we can do at this point.

I am working on a project called Cerebral

Great! I see sound Baobab influence 😄 I will check it out.

@Yomguithereal
Owner

Hello @ivan-kleshnin, @christianalfoni.
Interesting stuff here. I was thinking, when designing the facets, of integrating them into the tree itself as functions but couldn't find a way of doing so without being too misleading: how do you differentiate functions which are actually your state and functions that are meant to be run to solve the state (I know this can be solved by saying you shouldn't store functions in the tree anyway)? how do you define dependencies without falling into misleading heuristics when walking the tree?

@ivan-kleshnin, I am a bit unclear about the difference you draw between observables and event emitting. Don't observables use event emitting under the hood? Isn't Baobab a kind of observable in a sense? I'll need to read up a bit more on reactive programming.

@Yomguithereal
Owner

On a side note, I stumbled upon the concept of "computed observable" lately and I must say facets look quite like that.

@ivan-kleshnin
Contributor Author

I was thinking, when designing the facets, of integrating them into the tree itself as functions but couldn't find a way of doing so without being too misleading: how do you differentiate functions which are actually your state and functions that are meant to be run to solve the state (I know this can be solved by saying you shouldn't store functions in the tree anyway)? how do you define dependencies without falling into misleading heuristics when walking the tree?

Interesting and important questions here. Sure, it's hard to give precise advice out of context.
But... to speculate... There are now two quite mainstream directions in JS. Two schools, one may say.
The first school tries to add more and more OOP sugar on top of JS. Some people even reimplement JS constructor mechanics to make them more "powerful": add private / public props, emulate multiple inheritance, etc. As an example of this I often show stampit.

They want to transform JS into "real" or "better" OOP dynamic language and continue to rely on duck-typing.

The second school goes in the opposite direction. People here try to never use objects (i.e. methods and the this keyword). "Anything can be better expressed with native values and functions" is their motto.
I belong to this second school completely. We believe you never need objects so all this stuff about prototypes goes straight to the trash can. We believe the functional paradigm proved itself as superior and OOP should be marked and removed as one of the biggest mistakes.

It's even harder to be stuck somewhere in between, because you never know what to rely upon.
I freely use the typeof and instanceof operators instead of crazy magic shit because I know there will be only native types. Never custom types. My models look like this:

// Object(String, *) -> Object(String, *)
function User(data) {
  ...
  // `defaults` holds the default field values (`default` is a reserved word in JS)
  return merge(data, defaults); // output type <=> input type
}

So, back to the topic. My subjective opinion is that, from the API point of view, facets should be invisible to the end user.

state.select("users").get("id");
state.select("users").get("some-dynamic-data");

From the implementation point of view, you should declare that Baobab contains only pure native data.
Don't bother supporting anything else. Baobab state should be serializable to JSON, so it's better to say NO to generators and functions right from the start. Then you just use instanceof Array or instanceof Function, and that is a good heuristic in your context no matter what some OOP freaks will say. If someone complains to me about frames and other exotic shit not being supported, I remind them about the RFC full of the most stupid limitations ever. Send your resentment to them first.
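
A minimal sketch of what "facets invisible to the end user" could look like, in plain JavaScript. `createStore` and its `computed` argument are hypothetical names for illustration only; this is not Baobab's API.

```javascript
// Hypothetical mini-store: static and derived data are declared separately
// but read through one and the same `get`.
function createStore(data, computed) {
  return {
    get(key) {
      // Derived entries are resolved on read; the caller cannot tell the difference.
      if (Object.prototype.hasOwnProperty.call(computed, key)) {
        return computed[key](data);
      }
      return data[key];
    },
  };
}

const store = createStore(
  {
    users: [
      { name: "John", active: true },
      { name: "Jack", active: false },
    ],
  },
  {
    // Derived from `users`; stays JSON-free because it lives outside the data.
    activeUsers: (data) => data.users.filter((user) => user.active),
  }
);

const all = store.get("users");          // static data
const active = store.get("activeUsers"); // derived data, same call
```

Because the derived definitions live outside `data`, the state itself remains plain JSON-serializable values.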


I am a bit unclear about the difference you draw between observables and event emitting. Don't observables use event emitting under the hood? Isn't Baobab a kind of observable in a sense?

The difference is in the power of the primitives. As soon as you need to combine the results of 2+ event emitters or to do something with the timeline itself (like delaying, throttling, buffering), you're in trouble. Reimplementing all this is possible, of course, but you'll have a tough time.
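
As an illustration of the bookkeeping involved, here is a hand-rolled "combine latest" over two emitters. `createEmitter` is a deliberately tiny stand-in for any event-emitter implementation, and `combineLatest` is a hypothetical helper; observable libraries typically ship this as a single operator.

```javascript
// Minimal emitter, standing in for any event-emitter implementation.
function createEmitter() {
  const listeners = [];
  return {
    on(listener) {
      listeners.push(listener);
    },
    emit(value) {
      listeners.forEach((listener) => listener(value));
    },
  };
}

// Hand-written combination of two sources: remember the latest value of each
// and notify only once both have produced something.
function combineLatest(a, b, onBoth) {
  let latestA;
  let latestB;
  let hasA = false;
  let hasB = false;
  const notify = () => {
    if (hasA && hasB) onBoth(latestA, latestB);
  };
  a.on((value) => {
    latestA = value;
    hasA = true;
    notify();
  });
  b.on((value) => {
    latestB = value;
    hasB = true;
    notify();
  });
}

const filter = createEmitter();
const sort = createEmitter();
const queries = [];
combineLatest(filter, sort, (f, s) => queries.push(`${f}/${s}`));

filter.emit("active");  // nothing yet: `sort` has no value
sort.emit("name");      // pushes "active/name"
filter.emit("blocked"); // pushes "blocked/name"
```

This covers only two sources and no unsubscription or error handling, which is roughly where the "tough time" starts.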

Observables are on a different level of power and inner complexity. If we're gradually going in that direction with facets, it may be better to jump over this pit completely by embracing one of the existing tools.
Heavy meta-tasks like counting / measuring big collections become blockers without some kind of throttling.

I'm not sure about CSP-JS. I never tried this, it's just one more possibility to consider.
CSP and Rx are two quite different approaches providing similar benefits in the end.
Link to comparison.

@christianalfoni
Contributor

Hi guys,

Yeah, I agree that the tree should only support types that can be transferred "over the wire"... meaning JSON. Functions define behaviour. They are like "triggers". If Baobab meets a function it will run it and expect some description that defines the value to put at the path. The description can also state which other paths should trigger an update on the current path. But most importantly the description contains a "getter" method that will be mapped to the path. So if a select matches a "getter path" it will redirect to the getter method and run it, instead of returning the actual value of the path.

I think Baobab needs to make a choice too, about what it should be. Now it is just holding state, emitting updates on paths and allows to "compose state" (with facets)... and does that very well :-) I have little experience with functional reactive programming, but built https://github.com/christianalfoni/R which uses those concepts. So in my not so very strong opinion, I think FRP is a different beast. You do not "hold state" in the same way, it just flows through the system rather. But yeah, not much experience :-)

@ivan-kleshnin
Contributor Author

So in my not so very strong opinion, I think FRP is a different beast. You do not "hold state" in the same way, it just flows through the system rather. But yeah, not much experience :-)

That's not true. FRP holds state within the closure of the scan function.
Both RxJS and Bacon have it. Maybe there are more options but I met this one everywhere.

https://twitter.com/dan_abramov/status/595538554459189249
https://gist.github.com/gaearon/c02f3eb38724b64ab812

It's interesting how reduce, the most powerful collection handler in FP, becomes
scan, the most powerful observable handler in FRP. This function has state in both versions, and that turns out to be enough to solve just about every case requiring state.

@christianalfoni
Contributor

Ah, yeah, but you cannot just access it. You have to listen to changes to be able to grab it. With Baobab, by contrast, you can just point to the tree and grab the state. Observables do not have "get()"; you have to listen to changes. Some of them give the latest result and some only deliver future values. As I understand it.

@ivan-kleshnin
Contributor Author

Ah, yeah, but you cannot just access it. You have to listen to changes to be able to grab it. With Baobab, by contrast, you can just point to the tree and grab the state. Observables do not have "get()"; you have to listen to changes. Some of them give the latest result and some only deliver future values. As I understand it.

That's the whole point, and it is very useful for eliminating concurrency bugs as a class. As soon as you get a value, it may become outdated at any moment. And as soon as there is at least one command between that get and the direct use of the value that may affect it, you'll get a bug. Not to mention the work of choosing the "right" names for old, one, transitional, etc. value holders.

The alternative is to try to never read from Baobab into variables (which is not always possible), but then the code is twice as long. I tried both approaches a lot. Observables give a much cleaner and more predictable picture.

@christianalfoni
Contributor

Yeah, FRP is really interesting that way. What I also noticed is how extremely easy it is to use immutable data. It just "fits right in there".

But the fact that you cannot just grab state makes FRP difficult to grasp, I think. Normally you can just grab the state and mutate it, but in FRP you have to "merge a state flow into other state flows", like merging a button click observable into an observable that produces "remove from array mutation using ID" merged into an existing "scan" observable that runs different mutations on that array. It's just VERY different :-)

I think we will see more developer-friendly abstractions on this in the near future. RxJS is just impossible to get a hold of for the common developer. It would be great to have an RxJS light or something, with just the basic methods and method names that say exactly what they do. "reduce" and "scan" do not make any sense when you compare them to names like "add" and "remove", which are sooo explicit.

I tried to make something like that with https://github.com/christianalfoni/observable-state. It is a lot more user-friendly, but it is probably very inefficient and not completely FRP :-)

@ivan-kleshnin
Contributor Author

Ah, yeah, but you cannot just access it. You have to listen to changes to be able to grab it. With Baobab, by contrast, you can just point to the tree and grab the state. Observables do not have "get()"; you have to listen to changes. Some of them give the latest result and some only deliver future values. As I understand it.

That's the whole point, and it is very useful for eliminating concurrency bugs as a class. As soon as you get a value, it may become outdated at any moment. And as soon as there is at least one command between that get and the direct use of the value that may affect it, you'll get a bug. Not to mention the work of choosing the "right" names for old, one, transitional, etc. value holders.

This may sound like "I'll never get such cases". But everyone gets them... just as mutability bugs are expected in a mutable environment... concurrency bugs are expected in a concurrent environment.
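
A small sketch of the stale-read problem described above, in plain JavaScript. `createState` is a hypothetical stand-in for any state holder with `get`, `set` and change subscription; it is not Baobab's API.

```javascript
// Hypothetical state holder with direct reads and change subscription.
function createState(initial) {
  let value = initial;
  const listeners = [];
  return {
    get: () => value,
    set(next) {
      value = next;
      listeners.forEach((listener) => listener(value));
    },
    subscribe(listener) {
      listener(value); // deliver the current value first, then future ones
      listeners.push(listener);
    },
  };
}

const page = createState(1);

// Interactive style: read once into a variable, use it later.
const currentPage = page.get();
page.set(2); // some command runs in between
const staleLabel = `Page ${currentPage}`; // built from the outdated value

// Reactive style: declare the dependency once and let updates flow.
let liveLabel;
page.subscribe((p) => {
  liveLabel = `Page ${p}`;
});
page.set(3);
```

After this runs, `staleLabel` still refers to page 1 while `liveLabel` follows the state to page 3.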

The alternative is to try to never read from Baobab into variables (which is not entirely possible), but then the code is twice as long. I tried both approaches a lot. Observables give a much cleaner and more predictable picture.

Imagine a URL query. You have to parse it into filters, sorts and other derivatives. You have to pass all of them down through the program layers. But nothing prevents you from updating one of them and forgetting to update another. Then you'll get a potential (very probable) bug. The query updates on URL changes. You have to reevaluate the filters manually every time or make them reactive. No other choice. Every bit you keep interactive leaves the possibility of getting out of sync. It leaves a temporal hole in your code because there are no interactive mechanics for declaring temporal dependencies.

I have thought (and keep thinking) a lot about why Reactive beat Interactive in every example I tried.
I have met no good explanation yet. From the practical point of view the benefits are tangible. My experience unambiguously says Reactive is better. But why?! What about theory? I believe the reason can be explained with these diagrams:

Interactive dataflow chunk:

  => a
x => b
  => c

Reactive dataflow chunk:

a <-
b <- x
c <-

Reactive describes what is Now in terms of what we have in the Past.
Interactive tries to set the Future from what is Now.

The first game-changing difference is that the Past is established and can be expressed in terms
of exact code. The Future, on the contrary, is a pure abstraction and you have to keep it in your brain.
Load and unload. Load and unload. The brain is limited, and so is energy.

Now to the second crucial difference.

The most common dataflow has the form of a rotated pyramid.

Program dataflow
x1 >
x2 > y1 
x3 > y2 > z
x4 > y3
x5 >

To prove it, just recall any common function. The number of inputs is nearly always >= the number of outputs. You may pass a lot of arguments to get a single number (like a count). The opposite cases, like passing a number to get a bunch of Lorem Ipsum paragraphs, are very rare. The same is true for an entire program.

With the Reactive paradigm you just declare dependencies between inputs and outputs.
In most cases you use several inputs to construct one output.
Notice that the form of the Reactive dataflow chunk is the same as the form of the Program dataflow. Just a chunk of it.
Reactive dataflow chunks are consolidating from the point of view of the timeline.

The form of the Interactive dataflow chunk is reversed!
Interactive dataflow chunks are deconsolidating from the point of view of the timeline.

Does this explanation clear things up or obscure them instead? 😃
I'm thinking about writing an article on the subject.

@christianalfoni
Contributor

@ivan-kleshnin Haha, you should definitely write an article on this :-)

I discussed the subject with some colleagues and their reaction was... "It is not possible to handle the complexity of web applications with FRP". I do see their point, as FRP examples very often create small specific flows. It would be really interesting to see examples of common patterns in complex web applications solved with FRP. Something more than a TODO list.

Some things to consider:

  • Ajax requests
  • Polling
  • Optimistic updates
  • Lookups (Like you have a userId on a project and need to find the user in your users state)
  • etc.?

@ivan-kleshnin
Contributor Author

But the fact that you cannot just grab state makes FRP difficult to grasp, I think. Normally you can just grab the state and mutate it, but in FRP you have to "merge a state flow into other state flows", like merging a button click observable into an observable that produces "remove from array mutation using ID" merged into an existing "scan" observable that runs different mutations on that array. It's just VERY different :-)

It is. But that's just because it's unfamiliar. When you were learning programming, everything was like this.
Loops over arrays were "tough tasks" 😄 Recursion was "OMG?!"

I think we will see more developer-friendly abstractions on this in the near future. RxJS is just impossible to get a hold of for the common developer. It would be great to have an RxJS light or something, with just the basic methods and method names that say exactly what they do. "reduce" and "scan" do not make any sense when you compare them to names like "add" and "remove", which are sooo explicit.

RxJS is really big but, as always, you need only a small number of its operators for most tasks.
I would say about ten of them cover 90%.

I discussed the subject with some colleagues and their reaction was... "It is not possible to handle the complexity of web applications with FRP". I do see their point, as FRP examples very often create small specific flows. It would be really interesting to see examples of common patterns in complex web applications solved with FRP. Something more than a TODO list.

I dream of such an example as well. I'm going to finish my React-Ultimate example, gain more experience, identify the weak spots and reimplement the same thing on CycleJS. With all the points you enumerated.
I believe the picture is quite the opposite: it's not possible to maintain a big, bullet-proof interactive program.

@christianalfoni
Contributor

You know what would be a great angle on an article? "RxJS for the common developer". Take these 10 "operators", explain them and use them with common examples in state handling. That would be an AWESOME article!

@ivan-kleshnin
Contributor Author

👍 I'll try to make this happen.

@scabbiaza

I don't have as much experience as you guys, but I also want to add my notes on this.

I think that when working with data:

  • we should not think about its implementation
  • we should not break structure/namespaces

Here is an example from my code:

let State = new Baobab({
  messages: {
    models: undefined,
    id: undefined,
  }
}, {
  facets: {
    newMessages: {
      cursors: {
        messages: "messages",
      },
      get: function (data) {
        return filter(data.messages.models, (model) => model.unread);
      }
    },
  }
});

When binding this data to a component, I need to remember which are cursors and which are facets:

@branch({
  cursors: {
    messages: "messages",
  },
  facets: {
    newMessages: "newMessages",
  },
})

Second, each time I get or set this data, I again need to remember my structure:

State.select("messages").get("models");
State.select("newMessages").get();

It would be good to have everything in one place, like so

State.select("messages").get("models");
State.select("messages").get("new");

@christianalfoni
Contributor

@ivan-kleshnin
Contributor Author

@christianalfoni I did go through this tutorial 😄
Unfortunately there were so many errors that I got very frustrated.

@Yomguithereal
Owner

@scabbiaza, this is an interesting point indeed. It would be more comfortable to select both raw and computed data through the same API. I see two problems with this however:

  • Isn't it necessary that the user explicitly knows whether they are manipulating raw or computed data? Isn't it too magic to blur the line between the two?
  • How should we define computed data in the tree? Should we define the facets as they are now but take the risk of them being overridden by the tree's data? Should we consider functions in the state as computed data? How would we define their dependencies then, etc.?

Do you have ideas API-wise on how it could work for the state definition?

@scabbiaza

I associate facets with SQL views.
You can work with one as with an ordinary table. When you work with or write an ORM, this can be important.
Would it be too magic? No. This is the way to hide complexity.

The other association that comes to my mind is the @property decorator in Python.

I think facets should be read-only, as is the case with SQL views and @property decorators.

I don't have answers to the other questions now, but I will think about them.
Thank you!

@ivan-kleshnin
Contributor Author

You can work with one as with an ordinary table. When you work with or write an ORM, this can be important.
Would it be too magic? No. This is the way to hide complexity.

Yes. Think about the backend when in doubt. When you SELECT from a "view" table you get data. When you INSERT into a view table you get an error. The same should hold in Baobab: I'd like to get an exception when I try to replace a facet function with some data, because such cases are application design issues. They should never be allowed or silenced.

The other association that comes to my mind is the @property decorator in Python.

Same thing with ES6 get / set. Implementation is hidden behind the same interface.
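
For illustration, a minimal sketch of the ES6 accessor idea: a derived value that reads like a plain property and rejects writes loudly. The `messages` object and its fields are made up for this example.

```javascript
const messages = {
  models: [
    { id: 1, unread: true },
    { id: 2, unread: false },
  ],
  // Read like a plain property, recomputed on every access.
  get unread() {
    return this.models.filter((model) => model.unread);
  },
  // Writes are rejected explicitly instead of being silently ignored.
  set unread(value) {
    throw new TypeError("unread is derived from models and cannot be assigned");
  },
};

const before = messages.unread.length; // 1
messages.models.push({ id: 3, unread: true });
const after = messages.unread.length; // 2

let writeError = null;
try {
  messages.unread = [];
} catch (error) {
  writeError = error;
}
```

The explicit setter is a design choice here: a getter-only property would throw on assignment only in strict mode and be ignored otherwise.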

@Aetet

Aetet commented Jun 9, 2015

I think GraphQL is just another interpretation of facets. Maybe it would be useful to transpile it to generate facets for Baobab.

@Yomguithereal
Owner

@scabbiaza @ivan-kleshnin I totally agree that facets would produce readonly paths. But how would you define your state then? Those are my hypotheses and they feel somewhat clunky:

// How to define dependencies?
// This is not good
const tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: function() {
    return _.zip(this.get('names'), this.get('surnames'));
  }
});

// Nasty heuristics
const tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: {
    cursors: {
      names: ['names'],
      surnames: ['surnames']
    },
    get: function({names, surnames}) {
      return _.zip(names, surnames);
    }
  }
});

// Somewhat clunky
const tree = new Baobab({
  names: ['John', 'Jack'],
  surnames: ['Blue', 'White'],
  fullNames: {
    $facet: {
      cursors: {
        names: ['names'],
        surnames: ['surnames']
      },
      get: function({names, surnames}) {
        return _.zip(names, surnames);
      }
    }
  }
});

Any idea?

@Yomguithereal
Owner

Hello @Aetet. Could you elaborate a bit more, please?

@Aetet

Aetet commented Jun 9, 2015

So GraphQL just represents common useful functions as a declarative tree, and after compilation it expands to plain JSON. Here's an example of the parser output. As for me, I really don't know which approach is closer to me. But in the two weeks I have been using Baobab, it has made me happy. I think this is the right direction for state handling.

@ivan-kleshnin
Contributor Author

But how would you define your state then.

Keep current declaration syntax?! I'm concerned mostly about access (read / write) syntax.

@Yomguithereal
Owner

Sure, current declaration would stay but how would you define data dependencies for computed state?

@Yomguithereal
Owner

I guess keeping the $ in the path would be the way to go. Especially because it would be easier internally to guess whether a path involves facets by just checking the strings.

@Aetet

Aetet commented Jun 11, 2015

Now we have pure functions that always produce the same output when we pass the same arguments. I agree that this would be useful:

{
  $index: {
    users: ['data', $somefacet, 'users'],
    get: function({users}) {}
  }
}

But I don't like the idea of a dynamic facet inside get, because it creates or requests new data at runtime. We cannot test such a function, and it looks like the "good" old this. So here get loses its purity:

{
  $index: {
    users: ['data', 'users'],
    get: function({users}) {
       tree.addFacet(['pathToFacet']);
    }
  }
}

@Aetet

Aetet commented Jun 11, 2015

I think prepending $ would be better, because it makes it clearer that we have a facet here.

@Yomguithereal
Owner

I agree facets should be pure functions @Aetet. addFacets and createFacets would disappear anyway if this makes it to v2.

@Yomguithereal
Owner

So I thought about this a little bit more, specifically about an implementation, and I must say it's not as easy as it sounds :-). The main problem here is that gets will require me to walk the tree down to leaf level from where the user has requested data, to solve facets if needed. This is costly, and I'll probably need to keep a hashmap register of where the facets are in the tree, etc. Does anyone already have something of the kind, or know of an optimal way to achieve all this?

@ivan-kleshnin
Contributor Author

Can you provide some code samples to make it clearer?

@Yomguithereal
Owner

What I mean is the following:

// Considering the following tree
const tree = new Baobab({
  data: {
    users: [
      {id: 0, name: 'John'},
      {id: 1, name: 'Jack'}
    ],
    $index: {
      cursors: {
        users: ['users']
      },
      get: function({users}) {
        return _.indexBy(users, 'name');
      }
    }
  }
});

// I assume that getting 'data':
tree.get('data');
// would produce the following:
>>> {
  users: [
    {id: 0, name: 'John'},
    {id: 1, name: 'Jack'}
  ],
  $index: {
    John: {id: 0, name: 'John'},
    Jack: {id: 1, name: 'Jack'}
  }
}

This means that, on get, I actually need to walk the tree down to the leaves to be sure I've solved any computed data on the part of the tree the user requested. Which is costly.

To solve the problem partially, I need to hold an index of hashed paths leading to computed data. This means less performant writes but preserves good best cases for reads, because I can avoid walking the tree when I don't need to, and I can cut the walk short by storing references in my index.

In the end this means less performant writes overall and approximately the same read performance for the tree.
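
One way such an index could work, as a rough sketch in plain JavaScript. `createComputedIndex`, `register` and `needsResolution` are hypothetical names, and the linear scan over registered paths is a simplification; a real implementation would more likely use a hash map or a trie keyed by path.

```javascript
// True when `prefix` is a leading segment of `path` (or equal to it).
function isPrefix(prefix, path) {
  return prefix.length <= path.length && prefix.every((step, i) => step === path[i]);
}

// Hypothetical registry of paths that lead to computed data.
function createComputedIndex() {
  const computedPaths = [];
  return {
    // Called on writes that add a computed node: this is the extra write cost.
    register(path) {
      computedPaths.push(path);
    },
    // A read must resolve computed data only if a computed node sits
    // at, below, or above the requested path.
    needsResolution(path) {
      return computedPaths.some(
        (computed) => isPrefix(path, computed) || isPrefix(computed, path)
      );
    },
  };
}

const index = createComputedIndex();
index.register(["data", "$index"]);

const wholeBranch = index.needsResolution(["data"]);                  // true
const plainLeaf = index.needsResolution(["data", "users"]);           // false
const unrelated = index.needsResolution(["settings"]);                // false
const insideFacet = index.needsResolution(["data", "$index", "John"]); // true
```

Reads of paths that the index reports as unaffected can return the stored value directly, without walking the subtree.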


Another parallel question would be whether computed data should be part of a serialized version of the tree (I guess not).


Now, re-reading the beginning of the issue @ivan-kleshnin, I wonder whether you can help me answer the following questions:

  • Where would you place Baobab with respect to FRP? What could be done here to go towards "more" FRP or "less"?
  • Are there examples of usage of a centralized state with FRP?

The bottom line here is to know what I can learn from other paradigms that could help me better the library in some way?

@ivan-kleshnin
Contributor Author

This means that, on get, I actually need to walk the tree down to the leaves to be sure I've solved any computed data on the part of the tree the user requested. Which is costly.

Why is it costly? It's just a few additional function calls, no?
User data can be 1-2-3-4 levels deep but not 100 or 1000 levels.

To solve the problem partially, I need to hold an index of hashed paths leading to computed data. This means less performant writes but preserves good best cases for reads, because I can avoid walking the tree when I don't need to, and I can cut the walk short by storing references in my index.

You're operating on memory, not on a hard drive. Your (our) main concern should be memory usage, not performance, which should be great unless you do something really weird (which you don't).
Am I wrong?

Another parallel question would be whether computed data should be part of a serialized version of the tree (I guess not).

I guess not. Normalization is the best default choice.

Now, re-reading the beginning of the issue @ivan-kleshnin, I wonder whether you can help me answer the following questions... The bottom line here is to know what I can learn from other paradigms that could help me better the library in some way?

Complex questions. I'm afraid I don't have enough experience to judge. I need to think it over.
Right now, besides everything we've already discussed, I can add two directions to the pot.
They are probably for the distant future but may help to choose between the current options.

Query builder

Drop that weird MongoDB-like React rudiment and implement a LINQ-like builder instead.

cursor.select("user").where({activated: true}).do();
cursor.delete("user").where({deleted: true}).do();

This API is just an example. I would inspect http://sqlkorma.com/ to begin with.
Seems very cool.

Purely functional API

Sooner or later people will ask about custom operators. Then you'll have a hard time with namespacing, an issue that OOP always introduces. Monkeypatching is dangerous because of shared mutables and is more or less suitable only for app code (not library code).

See how the Highland devs were blocked by it.

Possible API, requires curryable functions:

import {pipe} from "ramda";
import BB from "baobab";

pipe(
  BB.select("user"),
  BB.where({activated: true}),
  BB.get,
)(tree);

pipe(
  BB.select("user"),
  BB.where({activated: true, payed: false}),
  BB.set({blocked: true}),
)(tree);

// without namespacing: better to watch for reserved words upfront
pipe(
  select("user"),
  where({activated: true, payed: false}),
  set({blocked: true}),
)(tree);
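
To make the "curryable functions" idea concrete, here is a minimal sketch that runs on a plain object rather than on a real Baobab tree. The operators `select`, `where` and `set` below are hypothetical stand-ins written for this example (they are not Baobab functions), and `pipe` is a local helper that behaves like ramda's `pipe` for unary steps. Each operator takes its configuration first and the data last, which is what makes it composable.

```javascript
// Left-to-right composition of unary functions (local stand-in for ramda's pipe).
const pipe = (...fns) => (input) => fns.reduce((acc, fn) => fn(acc), input);

// Hypothetical operators: configuration first, data last.
const select = (key) => (tree) => tree[key];
const where = (criteria) => (items) =>
  items.filter((item) =>
    Object.entries(criteria).every(([k, v]) => item[k] === v));
// Returns updated copies; it does not write back into the tree.
const set = (patch) => (items) => items.map((item) => ({ ...item, ...patch }));

const tree = {
  user: [
    { name: "ann", activated: true, payed: false },
    { name: "bob", activated: false, payed: false },
    { name: "cid", activated: true, payed: true },
  ],
};

const blocked = pipe(
  select("user"),
  where({ activated: true, payed: false }),
  set({ blocked: true })
)(tree);

console.log(blocked.map((u) => u.name)); // [ 'ann' ]
```

Because the operators are plain functions, a custom operator is just another function with the same shape, so no namespace on a cursor object has to be extended. A real implementation would also need a way to write the result back into the tree, which this sketch leaves out.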

@Yomguithereal
Owner

@ivan-kleshnin, @christianalfoni: https://gist.github.com/staltz/868e7e9bc2a7b8c1f754 Is this introduction to reactive programming better than what you both came across?

@ivan-kleshnin
Contributor Author

@Yomguithereal, yes.

The problem with the topic is that while it's quite old (late '90s), most of the information
still exists in the form of academic papers, relatively inaccessible to an average person.

The best explanations I met come from Evan Czaplicki, the author of Elm.
This guy is definitely a genius and I highly recommend bearing with him.

Understanding formulations of FRP:
https://www.youtube.com/watch?v=Agu6jipKfYw

Very detailed explanations of the design decisions behind Elm:
http://elm-lang.org/papers/concurrent-frp.pdf

Very useful material to shape your reasoning about related subjects like RxJS, CSP etc.

After watching and reading this I tend to think that the approach RxJS provides really is overcomplicated. But, in any case, it's a pure win against React, where asynchronous setState() basically blocks any attempt to make something more serious, like an interactive game.

I would also like to get useful links from other people.

@AutoSponge

@Yomguithereal, I'm using BB (with some success) as an intermediary step between imperative and FRP for junior developers. I can provide "data pumps" from cursors (and even complicated, calculated versions via facets) which other developers use to power views/templates.

I think the EventEmitter is exactly the right implementation because it can be wrapped by an Observable implementation if needed (for instance to throttle/debounce a stream). I imagine CSP would grow an enormous memory footprint with no real performance gain.
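
As a rough illustration of wrapping an emitter in an observable-like interface, the sketch below uses Node's built-in `EventEmitter` as a stand-in for a cursor or facet. `createStream` and `fromEvent` are hypothetical helpers written for this example, and the `{ data: ... }` event shape is an assumption. It shows `map` and `filter` only, to keep the example synchronous; a throttle or debounce operator would follow the same pattern with a timer.

```javascript
const { EventEmitter } = require("events");

// A stream is just a subscribe function plus operators that derive new streams.
function createStream(subscribe) {
  return {
    subscribe,
    map: (fn) => createStream((next) => subscribe((v) => next(fn(v)))),
    filter: (pred) =>
      createStream((next) => subscribe((v) => { if (pred(v)) next(v); })),
  };
}

// Wrap an emitter event; subscribe() returns an unsubscribe function.
function fromEvent(emitter, eventName) {
  return createStream((next) => {
    emitter.on(eventName, next);
    return () => emitter.off(eventName, next);
  });
}

const emitter = new EventEmitter();
const received = [];

const unsubscribe = fromEvent(emitter, "update")
  .map((event) => event.data)
  .filter((n) => n % 2 === 0)
  .subscribe((n) => received.push(n));

[1, 2, 3, 4].forEach((n) => emitter.emit("update", { data: n }));
unsubscribe();
emitter.emit("update", { data: 6 }); // ignored: the listener was removed

console.log(received); // [ 2, 4 ]
```

The emitter itself stays untouched, so consumers that only want plain events pay nothing for the stream layer.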

The only opportunity I see would be to leverage the browser where web workers are available for doing calculations in facets without blocking the UI which is good for perceived performance. The async nature lends itself to the API and may improve real performance.

@Yomguithereal
Owner

baobab@dev now implements computed data within the tree #278.

@Yomguithereal
Owner

I guess we can close this since facets are no longer an implementation leak :).

@oresmus

oresmus commented Aug 29, 2015

I am coming late to this discussion, since I am just learning about these libraries (and about web programming and js in general), but OTOH I have thought about related issues for some time, know something about FRP, etc.

I just want to describe a use case to support the view that there should be no enforced distinction (even by strict naming convention) between computed and "directly stored" state in your interfaces for accessing state.

use case: data with more than one possible representation

Consider an interface which wants to reveal data in either of two representations. E.g. temperature in degrees F or C, or a 3d model in either VRML or as a .obj file (or whatever).

We would like two accessor methods, one for each representation that is desired, so the functions can have a typed return value (whether formally or just in their documentation).

In a typical implementation, the backing store (I mean the state tree, like Baobab, plus whatever layers around it help with loading or computing data) will keep the data in one format, and compute the other one when requested, perhaps caching it or not.

But a change in implementation of the store might change the decision of which format is stored vs computed; or this might differ for different individual objects, depending on their source; it might change over the lifetime of a single object; it might even keep both forms for speed. But the code that does the accesses should not have to know about any of this. It should be able to use the same interface whether the data it requests will turn out to be directly stored or computed.
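
A minimal sketch of this use case, using the temperature example: the store below keeps the value in one unit and computes the other on demand, and callers cannot tell which is which. `createTemperatureStore` and its method names are hypothetical and exist only for this illustration.

```javascript
function createTemperatureStore(initial) {
  // Private detail: which unit is actually stored. Callers never see this.
  let stored = { unit: initial.unit, value: initial.value };

  const toCelsius = (f) => (f - 32) * 5 / 9;
  const toFahrenheit = (c) => c * 9 / 5 + 32;

  return {
    // Same accessor shape whether the result is stored or computed.
    celsius: () =>
      (stored.unit === "C" ? stored.value : toCelsius(stored.value)),
    fahrenheit: () =>
      (stored.unit === "F" ? stored.value : toFahrenheit(stored.value)),
    // Setting in either format may change which representation is stored.
    setCelsius(value) { stored = { unit: "C", value }; },
    setFahrenheit(value) { stored = { unit: "F", value }; },
  };
}

const store = createTemperatureStore({ unit: "C", value: 100 });
console.log(store.fahrenheit()); // 212 (computed from the stored Celsius value)

store.setFahrenheit(32);         // the store now keeps Fahrenheit instead
console.log(store.celsius());    // 0 (now Celsius is the computed one)
```

The calling code is identical before and after the store switches its internal format, which is the property argued for above.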

but don't we need to know whether it's computed in practice?

Now, indeed there are use cases where it feels like you need to know the difference. But if you examine them more closely, what you really need is something else, which can be expressed by either a fancier data type for the return value, or a fancier interface permitting a succession of values, or both. (And Baobab already gives you the "succession of values" by default, so all we need to add here is the metainfo about each value.)

For example, suppose some data is computed and this computation might take a long time, or fail, or be non-deterministic, or depend on the client platform. Then at the very least you need to account for the final result not being available right away, and ideally you might want to get more info than just the value, like a loading flag, an error message, a series of partial results (so the user can see that part of the data that's already loaded), warnings that it's non-deterministic, etc. Then your UI has the option of displaying something that depends on that meta-info about the ordinary value, like a "loading ..." message.

What you need then is for the return type from the access function to have that extra info, as well as the ordinary value. (Or alternatively, one accessor for the ordinary value and one for just the associated metainfo.)

If you want the usage to be simpler, you can always convert that into a simpler type with some standard wrapper, which might return a promise, throw an error if the final value is not yet ready, return the best current approximation to the ordinary value, or whatever.

You might think that you only need those things for computed data, not for stored data. But suppose your application starts storing too much data to fit on the client, and you want to revise it to load some of it lazily from the server? Then many of those same things might happen, and you need to account for them. Effectively you are revising the implementation of stored data and replacing it with computed data (thinking of "loading from server" as a form of computing).

So indeed you might need to start using a different interface then (and you might want a naming convention to correspond to which interface you're using), but it's not because of whether the data is computed or stored, but because of complexities in the nature of the data you actually want to access and display. And though this is correlated in practice with whether the data is computed or stored, it's not the same thing -- some computation is trivial enough to ignore, and some stored data can be so slow to access that it might as well be computed.

(To keep things non-confusing, you probably do want some kind of naming convention about the interface, e.g. data_name returns the ordinary value vs $data_name returns all the fancy metainfo too. But the distinction indicated by the name is about the interface, not the store implementation. By convention you might provide both methods for everything. Of course you'd want a non-boilerplaty way of implementing that.)
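
The two-interface idea can be sketched as follows, assuming a hypothetical store where `get(name)` returns the best current value and `$get(name)` returns the value together with its metainfo. The record fields (`value`, `loading`, `error`) and the `strict` wrapper are illustrative choices, not an existing API.

```javascript
function createStore() {
  const entries = new Map();
  const empty = { value: undefined, loading: false, error: null };

  return {
    // Record a change to the value and/or its metainfo.
    put(name, patch) {
      entries.set(name, { ...(entries.get(name) || empty), ...patch });
    },
    // Plain accessor: best current approximation of the value.
    get(name) {
      return (entries.get(name) || empty).value;
    },
    // "$" accessor: a copy of the full record, metainfo included.
    $get(name) {
      return { ...(entries.get(name) || empty) };
    },
  };
}

// One possible standard wrapper: insist on a final value or throw.
function strict(record) {
  if (record.error) throw new Error(record.error);
  if (record.loading) throw new Error("value not ready yet");
  return record.value;
}

const store = createStore();
store.put("userCount", { loading: true, value: 10 }); // partial result
console.log(store.get("userCount"));           // 10
console.log(store.$get("userCount").loading);  // true

store.put("userCount", { loading: false, value: 42 });
console.log(strict(store.$get("userCount")));  // 42
```

Nothing in either accessor says whether `userCount` was stored, computed locally, or loaded from a server; the name only tells the caller which interface it is using.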

what about raising an error if you try to set computed data?

There is also the issue of computed data being an error to "set", but if you're using this one-way-flow pattern, then even your ordinary stored data should be an error to set from its access interface. So this is not a real difference either.

(And to continue the examples above, you could provide methods elsewhere capable of setting the data; you could even have a set method for each format, if necessary converting the provided value to the format it wanted to store, or changing its mind at that time about which format it did want to store for that object.)

@Yomguithereal
Owner

Hello @oresmus. This is very interesting. Thanks. I feel that some points you raise here are at the center of Cerebral's philosophy (@christianalfoni). I am very happy to see you vouch for the homogeneity of access interface for stored/computed data because I was currently doubting some things about this.

Concerning the final part, this is what I am trying to achieve with the current v2 implementation, and it makes sense to me too.

@Yomguithereal
Owner

I will take some time to ponder your text some more and will be back with more feedback if you want to develop on this discussion.

@AutoSponge

I'm using cursors and facets interchangeably in my views (where I don't need to set). This is all I needed to do: const data = (e && e.data) ? e.data.data : e.target.get(); If you want to unify the interface, just start with that but IMO it's trivial to wrap it.

Having built an entire app with Baobab and something other than React, I can say I prefer having the distinction between facets and cursors. I know cursors are inherently "unsafe" because they can be a source of mutation. My views only declare cursors for data they can update (from user intents). Otherwise, it's common to see something like this at the top of a view: const {user$, entry$, page$} = appModel.facets;

Lastly, as I mentioned before, eventemitters are the backbone of streams and therefore can become streams if the implementor wishes. There's no reason to force streams/observables. Just create a wrapping lib that delivers streams or promises or whatever.

@Yomguithereal
Owner

Is that to say you prefer to keep the facets outside the state altogether then?

@AutoSponge

@Yomguithereal, I think facets are a convenience. They wrap, sometimes naively, 1 or more cursors and emit updates of their own. It's very helpful and a pleasure to code around, but I could just as easily write it myself with possibly better results.

For instance, I have a very complicated facet in my application that gets updates from 7 other facets. When the app bootstraps, much of the data is not present and I have to short-circuit the calculations for performance. There's nothing in Baobab to say that a dependent cursor or facet is required for calculation.

This also causes "false" update events: as data comes into the model asynchronously, the short-circuited facet emits an update. My next step is to wrap that with a filter (could be Rx, a promise, another emitter, it doesn't matter) that holds the previous value (or just noops when the data is empty). If the new value differs, it's an update, otherwise don't emit. This saves downstream calculations and possibly layout thrashing while views try to render.
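
Such a filter can be sketched as a plain listener wrapper, independent of Rx, promises or any particular emitter. `onlyWhenChanged` is a hypothetical helper name, and the sketch assumes that "empty" means `null` or `undefined` and that values can be compared with `Object.is`.

```javascript
// Wrap a listener so it fires only for non-empty values that differ
// from the last value it forwarded.
function onlyWhenChanged(listener, isEmpty = (v) => v == null) {
  let hasPrevious = false;
  let previous;

  return function filtered(value) {
    if (isEmpty(value)) return;                            // noop while data is missing
    if (hasPrevious && Object.is(previous, value)) return; // unchanged: skip
    hasPrevious = true;
    previous = value;
    listener(value);
  };
}

const seen = [];
const listener = onlyWhenChanged((total) => seen.push(total));

// Simulated sequence of facet results while data arrives asynchronously.
[null, null, 3, 3, 5, 5, 3].forEach(listener);

console.log(seen); // [ 3, 5, 3 ]
```

For facets that return a freshly built object on every recomputation, identity comparison would treat every result as new, so a deep or key-based equality check would have to be passed in instead.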

I guess my point is that the wrapping of cursors might seem obvious to one person but holds lots of nuance for someone else based on their implementation. I like having small, powerful building blocks. If you want to extend this, I applaud you but I'd hope that you use a plugin/adapter system rather than change what's currently working.

@Yomguithereal
Owner

So you'd prefer the current v1 system for facets over the v2 one then? The thing is, I am currently stalling v2's release so as not to rush anything concerning facets and to be sure everything is done correctly. The only thing that v2 changes concerning facets is that they now sit within the tree itself rather than in their own compartment. But this is the main point that should be solved here.

@AutoSponge

@Yomguithereal when I'm done with this project, I'll change to v2 and try it out. From personal experience, having facets segregated in the model definition and clearly separated in views/controllers made it easy for me to inform other developers which data points were read-only and which could be updated by the controller.

@Yomguithereal
Owner

@AutoSponge, this problem should be partially solved by the fact that, by convention, computed nodes in the tree should have a key starting with $ so that, by looking at the key, there should not be any confusion. But I agree this is not perfect.

@oresmus

oresmus commented Aug 31, 2015

Thinking more about this, and after reading the subsequent comments, I want to say some things on the other side of what I said earlier. (My true position is somewhere in between -- I'm still trying to synthesize all this.)

  • it may be that alternate representations of "definitive" (independently and directly settable) state are somewhat of a special case -- important but perhaps not typical.
  • certainly there are uses of facets for views of the data which are inherently only computable, i.e. which could not sensibly be definitive in any implementation -- for example, a count of the number of users with some property.
  • even for definitive data (which is settable), it is useful to have a kind of interface to it which (unlike a cursor) can't possibly be used to set it. (This reminds me of Mark Miller's thoughts on "capabilities", which anyone interested in this topic should certainly read about. BTW, the older community that discussion is part of uses the term "facet" for a different concept.)
  • it is important, when writing code which uses an interface to some state, to have a clear understanding of which things that seem syntactically like different pieces of state "overlap" (so sets of one affect gets of the other). In complicated cases this can only be documented, not perfectly formalized, but naming conventions can help make it clear, and interface differences might be justified if they help avoid bugs or misunderstandings here. OTOH they might make legitimate changes in implementation harder, as in the example I gave before.
  • I still think it's useful to imagine that in general, for any "variable" (named thing you can ask to get or in some cases set), there is a simple standard way to "get the best approximation to the current value", vs. another standard way (with a noticeably different look in the client code) to "get all the public metainfo you have about this variable, in a standard format which may be more or less complete for different individual objects", which could include things like whether it's loaded yet, or partly loaded, associated warnings or errors about this value, modtime, etc. And that both interfaces might be useful for the same variables, and for both definitive and computed variables.
  • Finally, I'm sure FRP (functional reactive programming, for any new reader coming into this) has a lot that should be listened to about this, though if its lessons can be incorporated into a more familiar-seeming interface, that will be a big win. (I could be wrong about that, if its main lesson turns out to be "never think in the old way or you will make mistakes".)

There are different opinions about how to design a specific interface which one could derive from all this, and I'm not yet ready to argue for or against anything specific. In fact, I would need more experience actually using these things before I should try to argue anything like that.

digression: another system of interpretive dependency tracking

I do have some experience implementing and using a different style of this kind of thing (in NanoEngineer-1 and in some personal programs). That was a system in which you can compute any expression of definitive variables, and it will automatically track which ones were used, and let you subscribe to the first future change to any of those variables, so you know when a recompute might be needed -- you use this to maintain a "dirty" flag on each computed value, used to recompute it as needed before returning from each access.

This can be made very efficient in terms of number of updates -- a computed value marked dirty stops listening to changes to anything, but is not recomputed until it's next needed. So total time is proportional to rate at which things need to be recomputed (since at least one input changed), times average number of variables they depend on (and those can be other computed things). In particular, in any one "update cycle" (analogous to an "animation frame" in a browser or game) nothing can be recomputed more than once, even if there are many dependency paths between it and things which changed, and nothing can be recomputed unless it was both needed and some of its ultimate dependencies were different.
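
A minimal sketch of this kind of scheme, written from the description above rather than from NanoEngineer-1 or Garnet: reads are tracked automatically while a computed value is being evaluated, subscriptions are one-shot, and a dirty flag defers recomputation until the next access. The names `variable`, `computed` and `recomputeCount` are hypothetical.

```javascript
// The computed value currently being evaluated, if any.
let activeComputation = null;

function variable(initial) {
  let value = initial;
  const subscribers = new Set(); // one-shot: cleared on the first change

  return {
    get() {
      if (activeComputation) subscribers.add(activeComputation); // track usage
      return value;
    },
    set(next) {
      if (Object.is(next, value)) return;
      value = next;
      const toNotify = [...subscribers];
      subscribers.clear();
      toNotify.forEach((c) => c.invalidate());
    },
  };
}

function computed(fn) {
  let dirty = true;
  let cached;
  const subscribers = new Set();

  const self = {
    recomputeCount: 0,
    get() {
      if (activeComputation) subscribers.add(activeComputation);
      if (dirty) {
        const outer = activeComputation;
        activeComputation = self; // reads inside fn() register on self
        try {
          cached = fn();
        } finally {
          activeComputation = outer;
        }
        dirty = false;
        self.recomputeCount += 1;
      }
      return cached;
    },
    invalidate() {
      if (dirty) return; // already dirty: nothing more to propagate
      dirty = true;
      const toNotify = [...subscribers];
      subscribers.clear();
      toNotify.forEach((c) => c.invalidate());
    },
  };
  return self;
}

// Diamond-shaped dependency: sum depends on `a` through two paths.
const a = variable(1);
const double = computed(() => a.get() * 2);
const triple = computed(() => a.get() * 3);
const sum = computed(() => double.get() + triple.get());

console.log(sum.get());          // 5
a.set(2);
a.set(3);                        // everything is already dirty: no extra work
console.log(sum.get());          // 15
console.log(sum.recomputeCount); // 2
```

Two writes to `a` cost a single recomputation of `sum`, and `sum` is recomputed once even though it is reachable from `a` through both `double` and `triple`. This sketch does not handle writes made during a computation or cyclic dependencies.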

But it has some efficiency problems too, some obvious and some not -- this is getting kind of long, so I'll leave out the details.

In ease of writing correct code, if you use it correctly it's very good -- no need to declare specific dependencies, and no need for them to be the same (for any one computed variable) on each update cycle. But if you use it wrongly it can be hard to debug.

I learned about that scheme from a Lisp UI system made at CMU in the 80's, called Garnet. (That scheme itself was from an even older subsystem called "KR" which stood for "Knowledge Representation" and was someone's thesis.) It doesn't seem to be widely known.

I have thought of some efficiency improvements to that scheme, and implemented some but not others. Ultimately I think a thing like this should have compiler support. I also think it needs more flexibility. All that leads me in the direction of FRP rather than continuing to try to do this kind of thing interpretively.

But for a typical web app, as opposed to a CAD program, the computation overhead for this might be trivial compared to the DOM updates and the programmer time, so the CPU time inefficiency of interpretive dependency-tracking might be a nonissue. Thus something like Baobab, which I see as packaging up a specific (simple but useful) kind of dependency tracking, might be very good.

(I am interested in Cerebral too, but have not looked at it closely enough to make any comparison.)

@Yomguithereal
Owner

Thanks again for your insight @oresmus. This is very interesting. It is true that this kind of dependency system introduces performance issues to tackle, but I hope I have achieved a reasonably performant implementation for Baobab.

On a practical level, therefore, do you think computed data nodes should indeed enter the tree, as is planned and currently implemented for v2, or do you think a further separation of concerns should be kept?

@oresmus

oresmus commented Sep 1, 2015

You're welcome @Yomguithereal, and of course thanks for providing Baobab which is both useful and discussion-stimulating. But as for the question about which specific interface is better for Baobab (v1 or v2 or something else), I wish I could give advice on that, but I have not yet used it in practice or even fully read the documentation -- I am still in the process of deciding which tools/libraries to use for my first web app, and it seems like I am finding more things to investigate every day. So I don't think I can give proper feedback on that, compared to actual users. If I become one, I won't hesitate to provide more pointed feedback!
