This repository has been archived by the owner on Mar 15, 2018. It is now read-only.

High-level: What about Store(s)? #9

Open
robcolburn opened this issue Apr 6, 2015 · 17 comments
@robcolburn

Just curious what becomes of Stores in this paradigm? Flux introduced the idea of Stores as a way to keep data consistent across your different views (if notifications change in one view, the state change pushes to all views). Relay solves this with a single store and an understanding of different "entities".

In general, Transmit seems cool - easier to follow and a statically analyzable way to handle pre-loading on the server. 👍

@RickWong
Owner

RickWong commented Apr 6, 2015

Transmit offers the following benefits of the Relay/GraphQL paradigm:

  • Query co-location: Write data queries close to where you use the data. This is really a clever invention by Facebook's Relay team. It improves the comprehensibility of a React component and the maintainability of a React codebase at large. Transmit also allows you to co-locate queries and sub-queries in several places through the composability of Promises. (see Promise.all). Read more about query co-location in the Unofficial Relay FAQ.
  • Declarative queries and centralized execution: In React you write your components as declarations. That is, when you call React.createClass() nothing is actually updated or rendered yet. Only later, when you call React.render(), is the whole component tree mounted and executed by React with its virtual DOM optimizations et cetera. Relay provides an identical infrastructure for data querying. Queries are written declaratively, composed hierarchically, and executed later as a single query in a centralized place. This means you can do pre/post-processing in the root query, for example caching or query flattening to prevent over-fetching (Transmit does not include these mechanisms, as explained below). When you want to use the data, you only have to know that it is passed down as props to your component. No imperative code is necessary to wire the data to a component, unlike Flux; so it's very convenient. Transmit uses the asynchronicity and composability of Promises to resolve the queries in a centralized place, where the data can then be passed down as props on render. Read more about Declarative programming.
  • Query separation from component lifecycles: Relay queries are not tightly coupled to the components and their lifecycles. And since they're declared statically in containers, they can be executed separately. Perfect for isomorphic apps. Via a setQueryParams() method in props you can hook into the component lifecycle if necessary. Transmit mimics the Relay API, and also provides the renderToString() and render() methods for server-side rendering. Read more about Isomorphic Relay.

On top of this, being able to declare queries not just in GraphQL but with technically generic and powerful Promises makes Transmit theoretically limitless. The powerful thing about Promises is that they can do more than just querying. What if you want to convert fetched data into Immutable.js Records? You're in control. What if you don't want data from the backend, but rather from a CORS request to a third party? No problem. Transmit lets you do that.
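For instance, a co-located query in this model is just a Promise-returning function, so composition and post-processing is ordinary Promise code. A minimal sketch (fetchUser, fetchPosts, and profileQuery are invented stand-ins for illustration, not Transmit APIs):

```javascript
// Stand-ins for real network calls; in practice these would use fetch().
const fetchUser = (id) => Promise.resolve({ id, name: "Ada" });
const fetchPosts = (userId) => Promise.resolve([{ userId, title: "Hello" }]);

// A co-located "profile" query composes two sub-queries with Promise.all
// and reshapes the raw results into whatever the component wants as props.
function profileQuery({ userId }) {
  return Promise.all([fetchUser(userId), fetchPosts(userId)])
    .then(([user, posts]) => ({ user, posts }));
}

profileQuery({ userId: 1 }).then((props) => {
  console.log(props.user.name, props.posts.length);
});
```

The `.then()` at the end is exactly where you could convert results into Immutable.js Records, or swap the stand-ins for a third-party CORS request.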

However, there are some undocumented Relay + GraphQL data-management capabilities that Transmit hasn't implemented yet. Mostly the way Relay pre-processes queries: query flattening and hitting caches with a complete query or parts of it. Both could answer your question about how identical data can be queried from, and propagated into, different components. Little is known about these fairly important Relay mechanisms, only that they are the reason Relay and GraphQL will probably be released together as one package, because GraphQL probably exposes certain query attributes that make pre-processing easier. I believe these mechanisms are already worth exploring, so for that purpose I'm using a combination of Transmit + Flux-y Stores. I haven't figured it out yet. To be completely honest, I'm not even certain whether Promises alone are sufficient. A possible weakness of Promises is that they cannot be flattened/combined automatically (like GraphQL queries) by Transmit without extra annotations.

@josephsavona

It's great to see people experimenting in this space and with the ideas of Relay.

To be completely honest I'm not even certain whether Promises alone are sufficient. A possible weakness of Promises is that they cannot be flattened/combined automatically (like GraphQL queries) by Transmit without extra annotations.

This is the key point. Promise-returning functions look declarative, but they're actually still imperative and completely opaque to the framework. Unless you annotate them, as you mentioned.

@RickWong
Owner

It's great to see people experimenting in this space and with the ideas of Relay.

Thanks for inventing Relay in the first place. Its concept really complements the React architecture beautifully.

This is the key point. Promise-returning functions look declarative, but they're actually still imperative and completely opaque to the framework. Unless you annotate them, as you mentioned.

I'm still imagining ways to properly, and if possible declaratively, annotate Promises-as-queries. A generic solution that does not rely on a custom QL (but would also work in harmony with one). Perhaps a centralization layer on top of the standard Fetch API, one that groups API requests by URL or by a user function, might work... Ah well, I'll explore it as soon as I have time.
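One rough sketch of what such a centralization layer might look like (createBatcher and batchTransport are invented names; a real version would sit on top of fetch): it collects individual requests during one tick and issues one batched call per endpoint.

```javascript
// Hypothetical sketch: collect requests per endpoint for one tick,
// then flush each endpoint's queue as a single batched transport call.
function createBatcher(batchTransport) {
  const pending = {}; // endpoint -> [{ id, resolve }]
  return function request(endpoint, id) {
    return new Promise((resolve) => {
      if (!pending[endpoint]) {
        pending[endpoint] = [];
        // Flush this endpoint's queue in a microtask, after all
        // synchronous request() calls in the current tick have queued.
        Promise.resolve().then(() => {
          const queue = pending[endpoint];
          delete pending[endpoint];
          batchTransport(endpoint, queue.map((q) => q.id)).then((results) => {
            queue.forEach((q, i) => q.resolve(results[i]));
          });
        });
      }
      pending[endpoint].push({ id, resolve });
    });
  };
}

// Usage: two separate component queries to the same endpoint
// become one transport call.
const calls = [];
const request = createBatcher((endpoint, ids) => {
  calls.push(ids); // record each batched transport call
  return Promise.resolve(ids.map((id) => ({ id })));
});
Promise.all([request("/stories", 1), request("/stories", 2)]).then((stories) => {
  console.log(calls.length, stories.length); // one batched call, two results
});
```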

@josephsavona Two questions: In Relay/GraphQL, if two separate Relay components ask for the same graph node (by primary key), is that node queried once (flattened) or twice (the second time hitting cache) server-side, given that the two components don't have a common parent but only a common ancestor? Also, even if the query does run twice (for example, if a primary key wasn't used), does Relay have some sort of node repository to make sure that results with identical primary keys are eventually merged, and then propagated only once?

@josephsavona

I'm still imagining ways to properly and if possible declaratively annotate Promises-as-queries

I'm curious what you come up with here. We found promises and URIs to be insufficient, hence GraphQL ;-)

...is that node queried once...or twice...

We do everything we can to avoid duplicate work while constructing and resolving queries. There are several aspects to this, and we hope to describe them in more detail in future blog posts or talks.

does Relay have some sort of nodes repository

Yes, see the blog post.

@myndzi

myndzi commented Sep 19, 2015

The fundamental problem, for this purpose, with promises is that they represent the result of an operation. This means they are inherently not declarative. You can compose them, sure, but each individual promise represents an individual operation that has already been kicked off. You'll never be able to compose the "information you want to receive" into a single request like GraphQL does.
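That contrast can be made concrete in a few lines (storyQuery and dedupe are hypothetical names): a query described as plain data can be inspected and deduplicated before anything executes, while a promise cannot, because its work has already been kicked off.

```javascript
// Eager: by the time you hold this promise, the request is already in flight.
// Two components asking for the same story cause two fetches.
const eager = (id) => { /* a fetch would be kicked off right here */ return Promise.resolve({ id }); };

// Declarative: components return plain descriptions of what they want instead.
const storyQuery = (id) => ({ type: "story", id });

// Because the descriptions are data, a framework can merge duplicates
// before executing anything -- impossible with opaque promises.
function dedupe(queries) {
  const seen = new Set();
  return queries.filter((q) => {
    const key = q.type + ":" + q.id;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

console.log(dedupe([storyQuery(1), storyQuery(1), storyQuery(2)]).length); // 2
```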

@primozs

primozs commented Oct 7, 2015

what about with observables?

@josephsavona

@primozs Observables have the same problem as promises: they represent opaque commands. See Thinking in Relay for more details on why Relay describes queries as data.

@primozs

primozs commented Oct 7, 2015

@josephsavona Yes, thanks for the link.

@tejacques

What about returning a Query object? You could do something similar to Redux to declare how queries should be combined, and how the results of queries should be interpreted and operated on.

It would look something like this (extremely rough).

// dataFetchers.js
export function batchFetch(state) {
    return fetch('https://some/api/batch', {
        method: 'post',
        body: JSON.stringify(state)
    }).then(function(response) {
        return response.json();
    });
}

// dataReducers.js
const initialBatchState = {
    storyIds: []
};
export function storyReducer(state = initialBatchState, action) {
    switch(action.type) {
    case STORY:
        return Object.assign({}, state, {
            storyIds: [
                ...state.storyIds,
                action.storyId
            ]
        });
    }
    return state;
}

// story.js
// ... Story implementation
export default Transmit.createContainer(Story, {
    queries: {
        story({ storyId }) {
            return {
                type: STORY,
                storyId: storyId
            };
        }
        }
    }
});

// newsfeed.js
// .. Newsfeed implementation

export default Transmit.createContainer(Newsfeed, {
    initialVariables: {
        storyIds: [1, 2, 3]  // Default variable.
    },
    queries: {
        // Fragment names become the Transmit prop names.
       stories ({storyIds}) {
            // This "stories" query returns an array of story query descriptions.
            return storyIds.map(storyId => Story.queries.story({storyId}));
        }
    }
});

// initialize.js
import { Transmit } from 'react-transmit';
import { batchFetch } from './dataFetchers.js';
import { storyReducer } from './dataReducers.js';

Transmit.setReducers([storyReducer]);
Transmit.setFetchers([batchFetch]);

This is obviously extremely rough as I just typed it out in about 5 mins, but the idea is this:

  • define what a query looks like for each component
  • define how queries should be combined
  • define how queries or combined queries fetch data
  • define how results of fetch data should be expanded/interpreted (not displayed)

Some care would need to be taken to do this correctly, but the advantage is that you end up with a more generic and extensible version of Relay that would allow you to plug in whatever you need. You want to batch calls? Great add a reducer and a batch fetcher. You want to cache queries and results? No problem, add a fetcher that checks a cache. You want to expand results? No problem add a result transformation step. etc. etc.

@robcolburn
Author

Personally, I've built something similar to Transmit (hopefully it'll get near open-sourceable; it's too coupled to application logic at this point).

I ended up pouring promise results into a Redux store, and receiving the results in HOC containers. So it's like throwing Redux in the middle of Transmit. The actual promises are a query language on top of JSON-API.


A few lessons learned.

  • You want a store in the mix. It enables hydration for server-side rendering, and some extensibility.
  • While we are far ahead of vanilla REST in terms of optimizing queries, client expressiveness, and standardized queries/views, it's not infinitely composable the way GraphQL is.
  • Client-side batching of queries is not a thing. You need an API that defines a mechanism for batching. JSON-API defines mechanisms for sub-/related resources, but not a mechanism for building a query based on the result of another query. That's a mouthful… here's how GraphQL can make code prettier:

JSON-API

myAPI("shows/5", {
  fields: {
    shows: "title,description",
    images: "url,width,height"
  },
  include: "coverImage"
}).then(payload => ({
  show: payload.data,
  episodes: myAPI("videos", {
    fields: {
      videos: "title,airdate",
      images: "url,width,height"
    },
    filter: {
      show: payload.data.id,
      type: "Full Episode"
    },
    include: "thumbnail",
    sort: "airdate",
    range: 10
  })
}));

GraphQL

shows(id: 5) {
  title,
  description,
  coverImage {
    url,
    width,
    height
  },
  episodesConnection(sort: airdate, limit: 10) {
    edges {
      title,
      airdate,
      thumbnail {
        url,
        width,
        height
      }
    }
  }
}

Relay enhances this by understanding the data model and intrinsically knowing how to do things like pagination.

@tejacques

I've thought about this fairly significantly, and I've changed my mind about how it might actually work best.

I think keeping the API as-is actually works out quite well, but there can be HOCs and libraries for data fetching that perform the magic. For example, imagine you have a client JS API library that communicates with your backend: api.stories.get(id) queues a request for a data ID and returns a promise with the story data. You could hook this up to a Redux container or whatever, and specify, via a plugin or hook of some kind, that React-Transmit should:

  • Transmit.preFetch() { api.queueRequests(); dispatch(gettingData()) } -- called before fetching in forceFetch. API now queues requests and deals with batching via some internal stuff
  • Transmit.onDispatched() { api.flush() } -- called after creating the fetchPromise in forceFetch. All requests have been dispatched and the API batches and flushes. (Theoretically there could be other chains of requests that happen, but that would be atypical, and just means they wouldn't be batched, which is fine)
  • Transmit.onFetch(newData) { dispatch(updateData(newData)) } -- API updates store by dispatching a Redux action with the new data.

It would be up to your API/fetching library to support batching. It could also be non-batching, communicating via regular XHR requests or websockets, in which case all preFetch/onFetch would need to do is dispatch the appropriate actions. The API or fetching library can also be responsible for handling duplicate requests, etc.

I think this handles the most common use cases where Relay really shines over React-Transmit. The tradeoff is that users will need to provide this API library themselves, and probably batch endpoints as well, but I think it's much easier and more approachable to do that than to create a GraphQL endpoint and include Relay.
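The proposed hook sequence might wire together roughly like this (every name below is hypothetical, mirroring the hooks sketched above rather than any real Transmit API):

```javascript
// Hypothetical API library with a queue/flush cycle matching the
// preFetch -> onDispatched -> onFetch hook sequence proposed above.
function createApi() {
  let queue = [];
  return {
    queueRequests() { queue = []; },       // start collecting requests
    get(id) {
      // Each get() parks a pending request instead of firing immediately.
      return new Promise((resolve) => queue.push({ id, resolve }));
    },
    flush() {
      // One batched "round trip" for everything queued between
      // preFetch and onDispatched; here the transport is faked locally.
      const batch = queue;
      queue = [];
      batch.forEach((req) => req.resolve({ id: req.id }));
      return batch.length;
    },
  };
}

const api = createApi();
api.queueRequests();                               // Transmit.preFetch()
const p = Promise.all([api.get(1), api.get(2)]);   // queries queue requests
const flushed = api.flush();                       // Transmit.onDispatched()
p.then((newData) => {
  // Transmit.onFetch(newData) -> dispatch(updateData(newData))
  console.log(flushed, newData.length);
});
```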

@RickWong
Owner

Thanks for the input so far guys. It really helps in finding the best way to build this part of Transmit. I'm currently working on fetch-rest and looking into Facebook's dataloader for batching.

I no longer think complete data management belongs in Transmit, as users must stay in control of their Promises. The problem is that Promises are either read-only, or write+read coupled together in one Promise. So it's hard to get inside or between Promises to do batching/unbatching or any other kind of manipulation. Therefore Transmit cannot manage what the Promises actually resolve; the users must have an API to tell Transmit.

So I think @tejacques brings up a great point: Transmit could provide new hooks to let the users signal and listen to Transmit from the outside. onFetch was a great example of listening from the outside. It made Transmit so much more useful in read-only situations. preFetch could potentially do the same for write+read coupled situations.

I'll keep these ideas in mind while working on the next versions of Transmit and fetch-rest.

@tejacques

One other thing I forgot to mention: I think Relay with GraphQL can take advantage of performing child fetches and transformations in the initial request. In order to do the same with a data-only library, a traditional promise approach does not work, because it determines the subsequent requests using the results of the first, which are unknown outside of that function.

Example:

getNextUsers(page) {
  return api.getNextUserids(page).then(userids =>
    Promise.all(userids.map(userid => api.getUser(userid))));
}

What I proposed gets part of the way there -- any top-level request could batch all calls on that level only. If you want something that performs more like GraphQL, it needs to pull and flatten the requests, so that they can be batched and performed all at once, looking more like this:

getNextUsers(page) {
  return api.pipeline(page, api.getNextUserids, api.map(api.getUser));
}

Now at the top level, we know exactly what's going to happen, and the API knows how to chain things together.

To fully take advantage of this, it might make sense to change from a fragment which takes a piece of information and returns a promise, to a factory which creates that function:

// ... in UserList (or something)
getNextUsers(): (page: number) => Promise<User[]> {
  // This returns a function that takes a page, and returns a promise containing a list of users
  return api.compose(api.getNextUserids, api.map(User.query.getUser));
}
// ... in User
getUser(): (userid: number) => Promise<User> {
  return api.getUser;
}

Now each component still specifies how it gets its own data, but everything is flattened to the top level and exposed to the API library. So Transmit can call container.forceFetch(data), which will create the fetch function at the appropriate level, and the API can batch everything as necessary.

I think I might try to proof-of-concept this using something like isomorphine and possibly a fork of Transmit that works with the changes mentioned here.

This modification always grabs all of the data, because it's done on the server in a single batched, transformed request. The client can only take advantage of local caching if it waits to see what data it will need. Some combination of the two may be best for real-world performance. I need to take a closer look at Relay to see what they do. I'm not sure if this actually covers everything, or if there are other lurking issues that might crop up; would love to hear from @RickWong or @josephsavona!

@josephsavona

I think Relay with GraphQL can take advantage of performing child fetches and transformations in the initial request.

@tejacques Yes. Relay supports fetching all of the data for an entire view hierarchy up front, in a single round trip. This approach is described in-depth in our guide Thinking in Relay. Especially relevant to this GitHub issue is this section (emphasis from the original):

[fetching data for a view hierarchy] is the perfect use-case for GraphQL because it provides a syntax for describing data-dependencies as data, without dictating any particular API. Note that Promises and Observables are often suggested as alternatives, but they represent opaque commands and preclude various optimizations such as query batching.

I'd definitely encourage you to check out Relay if you're looking for something to help fetch data more efficiently. You can even use Relay without a GraphQL server via relay-local-schema. This lets you write a thin GraphQL schema over your REST API, inject it into Relay, and then get all the benefits of Relay's caching, batching, etc.

@tejacques

@josephsavona thanks for the link and confirmation (and explanation!). I've used Relay in small example projects, but haven't dug around deeply in the implementation. Honestly, I think GraphQL with Relay is already a nearly ideal solution; the main issue I have with it is its file size. The latest react-relay release on npm, when bundled with webpack, has a minified file size of 202KB (56KB gzipped). react-transmit currently adds 45KB on top of React (13KB gzipped).

If you have an app and are using React Native, or all your users are on desktops/laptops with good internet connections, then it's not really an issue; however, on older mobile devices on 3G connections that takes an extra 1-2 seconds to download, parse, and execute. The combined cost of React + Relay is 324KB (88KB gzipped), which, in addition to other libraries, can be a nonstarter for a lot of projects. I believe most of the size of Relay comes from dealing with GraphQL, so ironically the thing that makes Relay so declarative, composable, and friendly to developers makes it less friendly to users.

This can be greatly improved with server-side rendering and techniques like asynchronously loading JavaScript from the head, combined with flushing the head as soon as possible, but it's still a cost that would be nice to avoid, and you're still paying the parse and execution time that can lock up the UI.

As you pointed out, you don't need to have a GraphQL server, but writing a local schema over a REST API is a similar amount of work you need to do on the server anyway, so it might as well be done on the server to take advantage of it for server side rendering and smaller client bundle size. I actually see this as an advantage of Relay, since GraphQL is one of if not the best data requesting interface I've used, and the upfront cost pays off quickly.

Finally, Relay's server-side rendering support isn't quite production-ready, last I checked. Even if you get past some of the current hiccups by using Isomorphic-React-Relay or similar, you still end up with a paradigm that is less than ideal. Here's what I mean -- the current state of the art for React server-side rendering works like this:

  • Request comes in
  • Isomorphic Router specifies component to render and bundle entry
  • The initial HTML is sent to the client with deferred script loading of the bundle entry point / styles / etc.
  • Fetch all data needed for components somehow (Router + Relay, React-Transmit, etc.)
  • Put data into a store when it's all retrieved
  • Create root React element with store
  • rootElement.renderToString(), and inject into appropriate place in the body
  • serialize store into JSON, and hydrate client store in a script tag
  • client bootstraps React; now when routes change, we pull new data into our store (ideally batched in a single request like Relay) and re-render.
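The server-side steps above can be sketched minimally like so (fetchAllData and renderApp are stand-ins for the real data layer and React's renderToString):

```javascript
// Minimal sketch of the server-side flow described above.
// fetchAllData stands in for the data layer (Router + Relay,
// React-Transmit, etc.); renderApp stands in for renderToString.
function fetchAllData() {
  return Promise.resolve({ stories: [{ id: 1, title: "Hi" }] });
}
function renderApp(store) {
  return "<div>" + store.stories.map((s) => s.title).join("") + "</div>";
}

function handleRequest() {
  return fetchAllData().then((store) => {
    const html = renderApp(store);
    // Serialize the store so the client can hydrate without refetching.
    const bootstrap =
      "<script>window.__INITIAL_STATE__ = " + JSON.stringify(store) + "</script>";
    return html + bootstrap;
  });
}

handleRequest().then((page) => { /* write page to the response */ });
```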

That's pretty great, but these days people are paying much closer attention to things like time to first paint, or time to see content above the fold (I just call it time-to-first-page-paint, or TTFPP, because everyone knows acronyms make things look more official and important). Because this model waits for all data to be available before synchronously rendering everything, it comes at the cost of a slower TTFPP.

If you're trying to minimize TTFPP, you may want to batch data requests more intelligently than all-at-once, and render and stream everything to the client as soon as it is ready. I'm not aware of anyone providing anything like this right now for React components, and if Relay has thought about this or is planning on supporting it (or I'm just that dumb that I missed that it can do it already), then I would be extremely happy and have nothing left to add.

To try to avoid the XY problem: what I really want is an isomorphic solution that minimizes TTFPP with the smallest total JavaScript size I can get away with on the client. Ideally it would be highly declarative, with data dependencies specified locally on components (a la GraphQL/Relay), a small library size (<50KB on top of React), and support for streaming rendering on the server (think https://github.com/aickin/react-dom-stream combined with Relay or React-Transmit).

Here's where the above differs from the current state:

  • Data-fetching layer (Relay, React-Transmit, etc.) intelligently batches the requests needed for a component. One approach that might make sense is to batch each child of the router component's data separately, as a route often consists of <Header/><RouteComponent/><Footer/>
  • Components are rendered and streamed to the client as soon as their data is ready. This way the header, which may have static or very lightweight data dependencies, can be sent to the client rapidly without waiting for the heavier-weight data in the RouteComponent to return.

Anyway, as a summary of this thread so far, the findings are:

  • Old: You need to return a composable, flattened description of the data you need and how to grab it. Promises do not work for this (hence GraphQL). GraphQL is great for this because you say what data you need rather than how to get it, and you leave the how to the engine. The downside is that the translation can be costly (in terms of bundle size; it may be harder to reason about exactly what is going on; etc.).
  • New: Ideally there could be a system for asynchronously rendering a component to a stream. This is actually highly related to the first issue, because in order to do it, you need to be able to hook into how your data is batched and resolved as a stream, rather than as a single Promise. Something more like an Observable, for example.

This ended up being a really really long post. To paraphrase Blaise Pascal I would have written a shorter post but I didn't have the time.

@josephsavona

you may want to be able to batch data requests more intelligently than all-at-once, and render and stream everything to the client as soon as it is ready. I'm not aware of anyone who is providing anything like this as of right now for React components

@tejacques At Facebook we're using Relay to do just this: sending down just the required data and rendering early, then streaming down supplementary data and updating the view. This isn't available quite yet in OSS (it relies on some internal hooks), but we're collaborating with the community to bring this to open source. We described the approach here.

@tejacques

@josephsavona: Awesome stuff! I'll follow the progress there more closely.
