This repository has been archived by the owner on Apr 23, 2022. It is now read-only.

How to do the Trace query in graphql #18

Open
fosterlynn opened this issue Jan 22, 2019 · 13 comments
Labels
question Further information is requested

Comments

@fosterlynn
Contributor

I'm thinking about the Trace. (Basically the Trace will bring back the previous objects to the object you are querying. If you keep asking recursively, you will get the whole tree of inputs to an economic resource.)

@bhaugen specified the base logic here: https://www.valueflo.ws/appendix/track.html

So, resources, events, processes, transfers are involved. Seems like they should each have a trace method, which can return different kinds of objects.

One question I have is which "layer" should do what? Should @sqykly 's backend bring back the whole tree of objects? Or just be responsible for each object's trace method? If the latter, who is responsible for the recursion logic?

The other question I have is how to bring back different types of objects in a trace graphql query? It seems to me (with my somewhat limited graphql experience) that it is pretty strongly typed. I can see where you can work with subclassing to bring back different types of objects with one query (we did that with Person and Organization in OCP), but since you need different properties back from each of them, even that becomes more difficult.

@pospi (and @sqykly ) - any thoughts on this one? Am I missing something?

@fosterlynn fosterlynn added the question Further information is requested label Jan 22, 2019
@fosterlynn
Contributor Author

#12 (comment) @sqykly 's thoughts: keep as much as possible of the trace logic on the server. Makes sense to me.

On the second question, I have a thought. We could use VFObject, or create another type if needed, and bring back just the id and __typename for everything in the trace. Then the client would get back everything in the history of the resource, and would need to look up each object to get the rest of the display info. Not too horrible; the traces generally won't be terribly long.
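A minimal sketch of the client side of that idea, assuming the trace returns only `id` and `__typename` for each object. The ids and the helper name `group_ids_by_type` are made up for illustration; the point is only that the client can batch its follow-up lookups per type.

```python
from collections import defaultdict

# Hypothetical trace result: only `id` and `__typename` per object.
# The ids below are invented for illustration.
trace_result = [
    {"id": "res-1", "__typename": "EconomicResource"},
    {"id": "evt-7", "__typename": "EconomicEvent"},
    {"id": "proc-3", "__typename": "Process"},
    {"id": "evt-5", "__typename": "EconomicEvent"},
]


def group_ids_by_type(refs):
    """Group ids by type so the client can issue one lookup per type."""
    grouped = defaultdict(list)
    for ref in refs:
        grouped[ref["__typename"]].append(ref["id"])
    return dict(grouped)


ids_by_type = group_ids_by_type(trace_result)
```

With the ids grouped this way, the client needs one detail query per type rather than one per object.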

Also there might be non-database info needed to make the relationships clear - @bhaugen did something for that when he wrote one before. So maybe that would drive us to create a new implementation object that just references the fundamental ones.

Just thinking, feedback welcome.

@bhaugen

bhaugen commented Jan 22, 2019

Here's some notes I took when I refactored resource flow traversals:
valueflows/django-vocabulator#7

You can find the (python) traversal methods here:
https://github.com/valueflows/django-vocabulator/blob/master/vocab/models.py

Find

  • def next
    • to track
  • def pred or def prev
    • to trace

Some variations on both methods to find the next process instead of the next event.

In (I think) all cases, next or pred/previous will return a list, not a single object. Might be a list with one element, or an empty list.

You will get instances of different classes depending on the class of the object you ask for next or preds.
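To make the "always a list, class depends on what you asked" point concrete, here is a small sketch. It is not the django-vocabulator code: the classes, the field names (`affects`, `output_of`, `produced_by`) and the `previous` method are simplified assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Process:
    name: str
    inputs: List["Event"] = field(default_factory=list)

    def previous(self) -> List["Event"]:
        # A process is preceded by its input events.
        return list(self.inputs)


@dataclass
class Resource:
    name: str
    produced_by: List["Event"] = field(default_factory=list)

    def previous(self) -> List["Event"]:
        # A resource is preceded by the events that produced it.
        return list(self.produced_by)


@dataclass
class Event:
    action: str
    affects: Optional[Resource] = None    # set on input events
    output_of: Optional[Process] = None   # set on output events

    def previous(self) -> list:
        # Output events lead back to a process, input events to a resource.
        if self.output_of is not None:
            return [self.output_of]
        if self.affects is not None:
            return [self.affects]
        return []


flour = Resource("flour")
consume = Event("consume", affects=flour)
baking = Process("baking", inputs=[consume])
produce = Event("produce", output_of=baking)
turnovers = Resource("apple turnovers", produced_by=[produce])
```

Every `previous()` call returns a list, possibly empty, and the element class varies with the class of the object asked.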

@bhaugen

bhaugen commented Jan 22, 2019

@fosterlynn

> there might be non-database info needed to make the relationships clear - @bhaugen did something for that when he wrote one before. So maybe that would drive us to create a new implementation object that just references the fundamental ones.

In retrospect, neither of us were sure what that meant, but we guessed it had to do with processing and displaying trace or track information.

I've done it with trees:

And graph diagrams:

The tree is presented one step at a time, although the view sends all of the traced data to the template. If you do one step at a time, be aware that any track or trace function will return a collection containing zero, one, or many elements.

@sqykly
Contributor

sqykly commented Jan 23, 2019

At present, the server goes one step at a time, but does so for a whole set of objects. One horizontal slice of the tree. My reasoning was that if clients need an algorithm we don't implement on the server, they can certainly compose their own out of the trace and track primitives. In addition, it could be used to manage the screen space or bandwidth of the output that's displayed to a user.
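A rough sketch of what "one step for a whole set of objects" could look like, with the client composing the full traversal out of that primitive. The predecessor map and the name `trace_step` are hypothetical stand-ins for the backend; the sample data is acyclic, so the client loop here does no cycle checking.

```python
from typing import Dict, List

# Hypothetical predecessor map standing in for the backend's stored links.
PREDECESSORS: Dict[str, List[str]] = {
    "turnovers": ["produce-turnovers"],
    "produce-turnovers": ["baking"],
    "baking": ["consume-apples", "consume-dough"],
    "consume-apples": ["apples"],
    "consume-dough": ["dough"],
}


def trace_step(frontier: List[str]) -> List[str]:
    """Return the predecessors of every object in `frontier`, de-duplicated."""
    next_slice: List[str] = []
    for obj in frontier:
        for pred in PREDECESSORS.get(obj, []):
            if pred not in next_slice:
                next_slice.append(pred)
    return next_slice


# Client-side composition: keep asking for the next slice until it is empty.
frontier = ["turnovers"]
slices = []
while frontier:
    frontier = trace_step(frontier)
    if frontier:
        slices.append(frontier)
```

Each element of `slices` is one horizontal slice of the tree, which a UI could reveal one at a time.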

As for the varying types, almost all responses from the server will be wrapped in a CrudResponse object (@pospi is using DHTResponse for strictly non-error responses) which includes the type of the object and its hash (id).

Regrettably, I think I goofed up designing the API, because it returns strictly hashes for trace and track. I will definitely have to change that so you can tell whether you have a Transfer or a Process, now that we have Process. I'll fix that tomorrow.

@pospi
Member

pospi commented Jan 24, 2019

It's a bit of a weird one, this. I actually think the logic is better on the client end, because if it's inside the DHT (or any other backend), you run the risks both of overloading the system and of getting back an insanely large dataset that clogs up your UI. It's kind of an unbound loop... do we have circular flows in VF, ever? Because those would break it.

We also talked some time ago about "lazy loading" being a nice pattern for Holochain apps, and I think the client-side recursive logic for this is a part of it. Like, you'd keep digging through the UI and more inputs would pop into view as needed. Maybe what's actually needed is a depth parameter to the track/trace, which is limited to some upper number by the system to prevent people overloading it.
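A sketch of the depth-parameter idea, assuming a hypothetical system-wide cap `MAX_DEPTH` and a plain predecessor map in place of the real backend. The requested depth is clamped, so even a circular flow cannot make a single call run unbounded.

```python
from typing import Dict, List

MAX_DEPTH = 5  # hypothetical upper limit enforced by the system


def trace(start: str, predecessors: Dict[str, List[str]], depth: int = 1) -> List[List[str]]:
    """Walk back at most `depth` levels, never more than MAX_DEPTH."""
    depth = max(1, min(depth, MAX_DEPTH))
    levels: List[List[str]] = []
    frontier = [start]
    for _ in range(depth):
        frontier = [p for obj in frontier for p in predecessors.get(obj, [])]
        if not frontier:
            break
        levels.append(frontier)
    return levels


# A long chain n0 <- n1 <- ... <- n10, and a two-element circular flow.
chain = {f"n{i}": [f"n{i + 1}"] for i in range(10)}
circular = {"a": ["b"], "b": ["a"]}
```

Asking for `depth=100` on either structure returns at most `MAX_DEPTH` levels; the client can call again from the last level if it wants to dig further.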

Bringing back different types in a single query is no problem; GQL supports union types. The workarounds you're talking about aren't necessary: you just use the "... on TypeName" inline fragment syntax to declare the fields to pull out for each returned type.
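For illustration, this is roughly what such a query and its handling could look like from a Python client. The `trace` field, its `id` argument, and the per-type fields (`trackingIdentifier`, `action`, `name`) are assumptions, not the project's actual schema; only the `... on` inline fragments and `__typename` are standard GraphQL.

```python
# Hypothetical query: `trace` is assumed to return a list of a union type.
TRACE_QUERY = """
query Trace($id: ID!) {
  trace(id: $id) {
    __typename
    ... on EconomicResource { id trackingIdentifier }
    ... on EconomicEvent { id action }
    ... on Process { id name }
    ... on Transfer { id name }
  }
}
"""


def describe(item: dict) -> str:
    """Pick the fields that exist for each concrete type in the union."""
    kind = item["__typename"]
    if kind == "EconomicResource":
        return f"resource {item['trackingIdentifier']}"
    if kind == "EconomicEvent":
        return f"event {item['action']}"
    if kind in ("Process", "Transfer"):
        return f"{kind.lower()} {item['name']}"
    return f"unknown type {kind}"


# Invented response in the shape a GraphQL server would return for the query.
sample_response = {
    "data": {
        "trace": [
            {"__typename": "EconomicEvent", "id": "evt-7", "action": "produce"},
            {"__typename": "Process", "id": "proc-3", "name": "Bake turnovers"},
        ]
    }
}

lines = [describe(item) for item in sample_response["data"]["trace"]]
```

Each item carries only the fields requested for its own type, so the client branches on `__typename` before reading them.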

Always returning a list also sounds like the appropriate API to me, agreed on that.

@sqykly
Contributor

sqykly commented Jan 24, 2019

Never mind, I must have fixed this earlier but forgot to update the documentation to reflect it. I'll clean up web-api.md and commit an update today.

@sqykly
Contributor

sqykly commented Jan 24, 2019

> It's a bit of a weird one, this. I actually think the logic is better on the client end, because if it's inside the DHT (or any other backend), you run the risks both of overloading the system and of getting back an insanely large dataset that clogs up your UI. It's kind of an unbound loop... do we have circular flows in VF, ever? Because those would break it.

I mean the logic of each step is on the server, as it currently is written. However, you bring up an interesting point here regarding unbound loops. If the logic behind detecting and halting such loops is not on the server, it becomes the client's headache. Every client has to re-implement that logic, possibly incorrectly, which will end up burdening the server with unending calls to trace/track. Would it be better, then, to write that logic once on the server, make sure it's right, and relieve the client of the task?

The middle ground, which is doing N steps of the algorithm before returning the result instead of all or one, is the most troublesome. Just 1 places the loop-checking responsibility squarely on the client, so the server does none of it. All would allow the client to take it easy and let the server work it out; the server can trivially do this, faster and better than the client can. If N steps are performed, the task of loop detection becomes more challenging, because every result might contain up to N-1 redundant objects. Since the server doesn't trivially keep the state of its algorithm between calls, it can't possibly detect loops larger than N elements. To do so, a new DHT entry would need to be drawn up just for that state, the user would need to keep track of its hash, etc. in order to get the right result. It's certainly doable; it just creates more work for both parties than one or all.
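A toy illustration of that limitation, under simplifying assumptions: a single chain of predecessors, and a hypothetical `trace_n` whose "seen" state lives only for the duration of one call. With a 3-element loop, calls with N=2 never notice the cycle, while N=3 does.

```python
from typing import Dict, List, Tuple

# A circular flow with three elements: a <- b <- c <- a.
CYCLE: Dict[str, List[str]] = {"a": ["b"], "b": ["c"], "c": ["a"]}


def trace_n(start: str, predecessors: Dict[str, List[str]], n: int) -> Tuple[List[str], bool]:
    """Walk up to n steps back from `start`.

    Returns (path, loop_detected). `seen` is local to this call, which models
    a server that keeps no state between calls.
    """
    seen = {start}
    path: List[str] = []
    current = start
    for _ in range(n):
        preds = predecessors.get(current, [])
        if not preds:
            break
        current = preds[0]  # single chain, for simplicity
        if current in seen:
            return path, True
        seen.add(current)
        path.append(current)
    return path, False


first_call = trace_n("a", CYCLE, 2)    # stops at "c", no loop seen
second_call = trace_n("c", CYCLE, 2)   # starts fresh, walks "a", "b" again
whole_loop = trace_n("a", CYCLE, 3)    # enough steps to come back to "a"
```

The second call happily returns objects the client has already seen, so the client has to carry the visited state itself, or the server has to persist it somewhere between calls.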

> We also talked some time ago about "lazy loading" being a nice pattern for Holochain apps, and I think the client-side recursive logic for this is a part of it. Like, you'd keep digging through the UI and more inputs would pop into view as needed. Maybe what's actually needed is a depth parameter to the track/trace, which is limited to some upper number by the system to prevent people overloading it.

Exactly what I was thinking when I wrote it that way.

@bhaugen

bhaugen commented Jan 24, 2019

Re detecting cycles/loops in traversals: the usual way to do that in a recursive algorithm is to use a set, which I have called "visited" (which is the same as everybody else I've seen), add every object you have visited to that set, and before you do anything else with a newly visited object, see if it is already in "visited", and if so, skip it.

See https://github.com/valueflows/django-vocabulator/blob/master/vocab/models.py#L327
and then find all mentions of visited
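A minimal sketch of the "visited" pattern described above, using a plain predecessor map rather than the model classes in the linked file. The function name and the sample data are assumptions for illustration.

```python
from typing import Dict, List, Optional, Set


def trace(obj: str, predecessors: Dict[str, List[str]],
          visited: Optional[Set[str]] = None) -> List[str]:
    """Depth-first trace that skips anything already in `visited`."""
    if visited is None:
        visited = set()
    if obj in visited:
        return []          # already handled: skip it, which also breaks cycles
    visited.add(obj)
    result = [obj]
    for pred in predecessors.get(obj, []):
        result.extend(trace(pred, predecessors, visited))
    return result


# A circular flow (a <- b <- c <- a) and a diamond where d is reached twice.
circular = {"a": ["b"], "b": ["c"], "c": ["a"]}
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
```

The same set is shared across the whole recursion, so each object appears once in the result even when it is reachable by several paths.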

@pospi
Member

pospi commented Jan 30, 2019

It seems like we're all pretty much on the same page with this.

I think the "lazy load" pattern is going to be fine and not too much burden on the client-side logic, but time will tell. I'm feeling like the DHT is going to have a lot of other work to do, so I don't like the idea of burdening that part of the system with additional work for graph traversals.

Logic of each step on the server: very sensible.
Logic for recursing on the server: unconvinced it's a good idea. Don't think the client needs to care much either, or be any smarter than "load the next thing in this direction". If someone wants to endlessly dig through a circular resource flow and render 100 copies of the same 5 items, I say let them. I think a depth parameter is a "nice to have", for all the reasons of complexity outlined above. Let's start with the simplest MVP and add additional traversal logic as we hit limitations / awkwardness.

@bhaugen

bhaugen commented Jan 30, 2019

I'm good with step-at-a-time.

@pospi
Member

pospi commented Feb 6, 2019

So, just to confirm what is needed to implement trace functionality in the UI for GFD:

  • Do traceEvents and traceTransfers both need to be wired up somehow in order to display this?
  • How do I get the input list of events for traceEvents? Presumably some relationship for the owned economic resources needs to be filled in.
  • Similarly, how do I get the input list of transfers for traceTransfers?

With this data at hand, we can extend the agent rendering functionality to show the tree of events leading in to each resource. Is that what we want? So the missing step is resource -> events, so far as I can tell...

(apologies if already answered @sqykly - I wasn't quite sure of exact instructions but they may be up there!)

@fosterlynn
Contributor Author

Here is a reference to what you might expect: https://www.valueflo.ws/appendix/track.html.

So the flow can contain Processes, Transfers, EconomicEvents, and EconomicResources, but a given input type will not return all of those types; mostly it will be one or two.

The input from the UI should be one item, which you should be able to feed to the backend. The item could be any of the above. Any of those should be able to respond to a trace method, and return whatever items are appropriate in the next step, which could be of more than one type. (I'm not sure if you will need to combine more than one type yourself, or if that is now built into the backend.)

Every tree will start with an EconomicResource. (I believe apple turnovers is the best one for the GFD scenario.)

@pospi
Member

pospi commented Feb 10, 2019

That's kinda why I'm asking... I have these two different methods specifically for events & transfers, but I was expecting a generic "trace" that accepts any of the above & returns any of the above. It looks like what's present in the DHT is just traceEvents(Event) -> Process and traceTransfers(Transfer) -> Event.
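For discussion, here is one way a generic "trace" could be layered over type-specific primitives. Everything in it is an assumption: `trace_events` and `trace_transfers` are local stubs standing in for the backend calls (returning lists, per the earlier agreement), the dict shapes are invented, and types without a primitive simply return an empty list as a placeholder.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]


def trace_events(event: Item) -> List[Item]:
    # Stub for the backend's event trace: an event leads back to a process.
    return [{"type": "Process", "id": "proc-for-" + event["id"]}]


def trace_transfers(transfer: Item) -> List[Item]:
    # Stub for the backend's transfer trace: a transfer leads back to an event.
    return [{"type": "Event", "id": "evt-for-" + transfer["id"]}]


# One handler per type; types with no primitive yet are simply absent.
TRACERS: Dict[str, Callable[[Item], List[Item]]] = {
    "Event": trace_events,
    "Transfer": trace_transfers,
}


def trace(item: Item) -> List[Item]:
    """Accept any item and dispatch to the matching type-specific trace."""
    handler = TRACERS.get(item["type"])
    if handler is None:
        return []  # placeholder: no primitive exists for this type
    return handler(item)
```

The missing entries in `TRACERS` (for example resource to events) are exactly the gaps the UI would hit when starting a tree from an EconomicResource.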

It doesn't feel to me that this feature is ready for UI integration and wiring up, maybe @sqykly's intention was to use them in a more deliberate manner when stepping through the demo?

Given that I will be out of contact from the 14th-18th and only intermittently online between the 21st-10th (and moving house thereafter) I am left to wonder if there is really bandwidth / coordination time to get GFD into a stable state before I leave, and I wouldn't want to have things left up in the air while I'm away...

Thoughts?
