This repository has been archived by the owner on Aug 25, 2023. It is now read-only.

Documentation of module context #36

Merged: michielbdejong merged 3 commits into wac-parts-rebased from module-context-docs on May 1, 2019

Conversation

michielbdejong
Contributor

No description provided.

@michielbdejong michielbdejong changed the title from "Initial documentation of module context" to "Documentation of module context" on Apr 26, 2019
@michielbdejong
Contributor Author

See https://github.com/inrupt/wac-ldp/blob/module-context-docs/README.md#context for human-readable display of the markdown source. Please add comments to this PR about how we can improve this documentation.

@michielbdejong michielbdejong mentioned this pull request Apr 26, 2019
18 tasks

@RubenVerborgh RubenVerborgh left a comment

Main issues for me:

  • the definition of the arrow as source code dependency
  • need for a description of each block (especially the new ones) to determine responsibilities
  • the listing of "5 main functional components". I think they are rather different kinds of specific HTTP handlers the server can delegate to, and I agree on LDP and IDP as being such blocks, but unsure about the others.
  • we forgot server management as a block (managing users and their pods by server admin)

Detailed issues:

  • We might want the arrow to mean "uses" instead of "depends on (source code)". For instance, I would expect the server to use LDP and Databrowser, but only depend on them in the abstract sense: to the server, they should just be generic HTTP handlers.

  • Seems like we are missing a router component.

  • Are there definitions of the blocks? What is the exact functionality of "Notifications" and "Search"?

  • There is no fundamental difference between "LDP" and "Search", in the sense that both are HTTP interfaces to the data shielded by auth. This is now clear from their positioning in the diagram, but we perhaps want a different word for "Search". I imagine a diagram where there are a bunch of different "RDF-over-HTTP" blocks, where LDP, Search-A, Search-B, Query-C, etc. are just instances. (so something like [LDP] [A] … [Z]). I don't think "Search" is a thing like "LDP" is (but "Search-A" etc. are).

  • Just like "Search" is not a component for me, I don't think "Data browser" is. There is a notion of "statically served files", so just a "Static" component would do for me. (Cfr. https://github.com/solid/solid-architecture/blob/master/server/request-flow.md where there are LDP and non-LDP requests.)

  • IDP: let's remove "persistent storage"? Blocks might or might not have storage; not a concern at this level. Same for Persistence; could just be in-memory. The DB icon clutters unnecessarily.

  • The dependency between Persistence and Auth depends on specific implementations of Auth (but it is indeed reasonable to have it). Another way to view things is that Auth depends on LDP. Such a view might even be necessary depending on who has the knowledge to do the URL-to-filename translation.

  • The fact that Auth is, under current arrow semantics, a code dependency of LDP/Notifications/Search prevents a layered approach where these components do not have to explicitly perform auth calls. A very different option for the architecture is to have Auth before LDP (without source code dependency), and to have Notifications/Search depend on LDP rather than on Storage. Not saying that we should, just want to bring up this possibility.
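
The two auth options in the last bullet can be contrasted in a minimal TypeScript sketch. Every name here (`Req`, `Res`, `Handler`, `AccessCheck`, `withAuth`, the toy policy) is hypothetical and chosen for illustration; none of it is taken from the wac-ldp code.

```typescript
// Hypothetical request/response shapes, not the wac-ldp types.
interface Req { method: string; path: string; webId?: string }
interface Res { status: number; body: string }
type Handler = (req: Req) => Res
type AccessCheck = (req: Req) => boolean

// Option A: the data interface calls auth itself,
// so auth is a source code dependency of the LDP handler.
function ldpWithExplicitAuth(check: AccessCheck): Handler {
  return (req) => check(req)
    ? { status: 200, body: `LDP ${req.method} ${req.path}` }
    : { status: 403, body: 'Forbidden' }
}

// Option B: auth is a layer in front of the handler;
// the LDP handler itself contains no auth calls.
const plainLdp: Handler = (req) => ({ status: 200, body: `LDP ${req.method} ${req.path}` })

function withAuth(check: AccessCheck, inner: Handler): Handler {
  return (req) => check(req) ? inner(req) : { status: 403, body: 'Forbidden' }
}

// Toy policy (an assumption for this example): only identified agents pass.
const onlyIdentified: AccessCheck = (req) => req.webId !== undefined

const optionA = ldpWithExplicitAuth(onlyIdentified)
const optionB = withAuth(onlyIdentified, plainLdp)
```

Both options behave the same from the outside. The difference is only in which component has to know about auth, which is the point the arrow semantics are meant to capture.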

Note: these comments are just on the architectural diagram; code structure discussions are only meaningful after we have the architecture nailed down.

@michielbdejong
Contributor Author

Re server vs router, I was thinking of the router as (the main) part of the server. So the router does not have a code dependency on the route handlers it routes to, but the server is the resulting executable that contains the router plus all the route handlers, and so it has all of them as dependencies.
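
A rough sketch of that distinction, with hypothetical names throughout (`HttpReq`, `HttpRes`, `RouteHandler`, `createRouter`, and the stand-in handlers are assumptions, not the actual module layout): the router only knows the generic handler type, while the server is the composition point that references every concrete handler.

```typescript
// Hypothetical shapes for illustration only.
interface HttpReq { method: string; path: string }
interface HttpRes { status: number; body: string }
type RouteHandler = (req: HttpReq) => HttpRes
interface Route { prefix: string; handler: RouteHandler }

// Router: depends only on the RouteHandler type, not on any concrete handler.
function createRouter(routes: Route[]): RouteHandler {
  return (req) => {
    // First matching prefix wins, so more specific prefixes go first.
    for (const route of routes) {
      if (req.path.indexOf(route.prefix) === 0) {
        return route.handler(req)
      }
    }
    return { status: 404, body: 'Not found' }
  }
}

// Stand-ins for real components such as LDP and IDP.
const ldpHandler: RouteHandler = (req) => ({ status: 200, body: `ldp:${req.path}` })
const idpHandler: RouteHandler = (req) => ({ status: 200, body: `idp:${req.path}` })

// "Server": the resulting executable wires the router to all handlers,
// so it is the only place that depends on every one of them.
const server = createRouter([
  { prefix: '/.well-known/', handler: idpHandler },
  { prefix: '/', handler: ldpHandler }
])
```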

> I don't think "Search" is a thing like "LDP" is (but "Search-A" etc. are).

Good point! I'll change that.

Re data browser, I included it mainly because it's a sizeable software project, it's quite essential for the services a pod provider runs, and it's also very visible to end-users. Also, static vs dynamic is not a very interesting distinction to me - /.well-known/openid-configuration might be static, but it's very much part of the IDP module. But I agree with your point that there may also be other statics like help-pages, the login/register/what-is-this page you see when you're not logged in, etc. Then again, the login page is again part of the IDP and unrelated to the databrowser. Maybe we can add a component or a ... for other miscellaneous route handlers like help pages etc.

Not so sure about removing the db and network interface icons, to me they are quite essential to see what's going on. Otherwise it seems that 'Persistence' is the database, and not the source code module surrounding it (and same for 'Server' looking like it means http entry/exit point instead of source code surrounding it), and then the meaning of the arrows becomes confusing. But I'll think about it.

Re auth depending on ldp, or auth being a step before ldp, I'll think about that some more.

> these comments are just on the architectural diagram; code structure discussions are only meaningful after we have the architecture nailed down.

Totally agree now, that's also what @justinwb explained to me yesterday.

@RubenVerborgh

> Re data browser, I included it mainly because it's a sizeable software project

Not disagreeing, but data browser is just a UI for a Solid pod.
Actually, Warp used to be the default in NSS 3 (but then it broke with NSS 4).

So "Databrowser" is actually just a collection of static files that make up the UI that you see. It is a client-side Solid app.

> Also, static vs dynamic is not a very interesting distinction to me

It is, because static files you can host on any Solid pod. E.g., I can just host Databrowser, Warp, or any other Solid app or UI myself.

I cannot host LDP though.

So see static/dynamic as "does the server require anything beyond standard HTTP?".

Makes a big difference, because the set of static files can be expanded without even touching the code.

So I insist that Databrowser is just serving static files, which is something we will need to do anyway (CSS etc.).
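
To make the "expanded without even touching the code" point concrete, here is a small sketch of a generic static component. The file table, paths, and function names are all hypothetical; a real server would read from disk or another store rather than from an in-memory object.

```typescript
interface StaticFile { contentType: string; body: string }
interface StaticResponse { status: number; contentType: string; body: string }
type FileTable = { [path: string]: StaticFile }

// Generic static component: it needs nothing beyond standard HTTP semantics
// and has no knowledge of which app the files belong to.
function createStaticHandler(files: FileTable): (path: string) => StaticResponse {
  return (path) => {
    const file = Object.prototype.hasOwnProperty.call(files, path) ? files[path] : undefined
    if (file === undefined) {
      return { status: 404, contentType: 'text/plain', body: 'Not found' }
    }
    return { status: 200, contentType: file.contentType, body: file.body }
  }
}

// Hypothetical deployment: a UI and its assets, served as plain files.
const staticFiles: FileTable = {
  '/databrowser/index.html': { contentType: 'text/html', body: '<!-- data browser UI -->' },
  '/style.css': { contentType: 'text/css', body: 'body { margin: 0 }' }
}
const serveStaticFile = createStaticHandler(staticFiles)

// Adding another client-side app is a data change, not a code change.
staticFiles['/warp/index.html'] = { contentType: 'text/html', body: '<!-- Warp UI -->' }
```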

> /.well-known/openid-configuration might be static

Not "static" in the sense of "constant" / "equal across all servers".

> But I agree with your point that there may also be other statics like help-pages, the login/register/what-is-this page you see when you're not logged in, etc. Then again, the login page is again part of the IDP and unrelated to the databrowser. Maybe we can add a component or a ... for other miscellaneous route handlers like help pages etc.

Exactly, and they are just all statically wired up files.
The static UI should be separated from all the rest.

Crucially, login and register pages should just be static files in this server. In NSS, that UI is tightly integrated with the internals. What we need is an RDF API for logging in, creating accounts, etc., and a front-end UI (a completely static client-side Solid app) that uses that API. That way, different servers can use different login and management apps.

So let's have a generic component for serving static files, AKA client-side Solid apps and their assets.

> Not so sure about removing the db and network interface icons, to me they are quite essential to see what's going on.

It's visual noise that distracts from what we need to see in the figure: components and relations. Opening the figure now, I see two big databases and a purple thing, but that's not the essence. Whether or not a component has storage or network interface is not an important question at this stage. If storage and network are important, let's have them as components instead. Or just a list below the figure "these components use storage".

@michielbdejong
Contributor Author

michielbdejong commented Apr 26, 2019

New version:
[Image: Functional components of the V-Next server (5)]

@RubenVerborgh

Thanks, this makes sense to me (pending a resolution for auth).

What do you think? And what do others think?

There are several options besides what I suggested, so I'm sure there will be other takes.

@michielbdejong
Contributor Author

Re @kjetilk's questions:

> I'd argue that one functional component should be input validation and filter

that would be a sub-component inside the "Data interface 1 (e.g. LDP)" component.

> there was this highly painful case with quota... We had nowhere to store that...

I previously thought we should check quota and conflicting writes before attempting to write to storage, but later @pmcb55 convinced me that it's better not to do those checks and locks beforehand, but rather just let the operations error when and where a problem occurs, so the quota check would be inside the Persistence component.

> I wonder if we should think about [non-user-owned persistent storage] as a separate functional component

I'm going to try to do without that if we can. If the only reason is to set different quota limits for different users, then I think we can just leave that feature out (but if you want, we can put it on the request list in #31).

@kjetilk

kjetilk commented Apr 29, 2019

> Re @kjetilk's questions:
>
> > I'd argue that one functional component should be input validation and filter
>
> that would be a sub-component inside the "Data interface 1 (e.g. LDP)" component.

For one thing, I think we need input validation and filter also on the query component.

The other thing is that I think it can get really large. Like, orders of magnitude larger than the rest of the server. The stuff we'd want it for, like shapes validation and QA purposes, is probably not too large, but then there's the infamous EU Copyright Article 17 (formerly 13), which may require us to spend much more time on input filtering than we do on any other activity. That legislation has created a minefield for Europe, and although I don't see how anyone would argue that the sqrt(#humans) assumed servers are large deployments, you never know.

At least, I'm thinking it would consist of several microservices that have a consistent interface and some internal structure, but that is not sufficient reason to consider it a functional component by itself, I guess.

> > there was this highly painful case with quota... We had nowhere to store that...
>
> I previously thought we should check quota and conflicting writes before attempting to write to storage, but later @pmcb55 convinced me that it's better not to do those checks and locks beforehand, but rather just let the operations error when and where a problem occurs, so the quota check would be inside the Persistence component.

Right, that would simplify things. I just want to have it on the radar so that we don't end up in the same situation again, where such simple things get really hard because we simply haven't got a place to put that data.

> > I wonder if we should think about [non-user-owned persistent storage] as a separate functional component
>
> I'm going to try to do without that if we can. If the only reason would be setting different quota limits

The main reason was to store the current usage (which is a bit too expensive to compute on the fly). Different quotas for different users are already in NSS5.

However, I can also imagine us wanting to cache certain things internally, so, it might be a nice-to-have shared component.

@michielbdejong michielbdejong mentioned this pull request Apr 29, 2019
@michielbdejong
Contributor Author

@kjetilk yeah, the BlobTree implementation can decide for itself how it caches current usage, so that's all inside that component. It can then throw OverQuota errors, which will be converted to some HTTP error status (403 maybe?) in the HttpResponder. I created a separate issue for your point about input validation.
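
A minimal sketch of that flow, under stated assumptions: `OverQuotaError`, `InMemoryStore`, and `respondToPut` are hypothetical names rather than the actual BlobTree or HttpResponder APIs, usage is counted in string length as a stand-in for bytes, and both the 403 and the 204 status codes are placeholders, since the status to use was still open in this discussion.

```typescript
// Hypothetical error type thrown by the storage component.
class OverQuotaError extends Error {
  constructor(needed: number, limit: number) {
    super(`write needs ${needed} units, quota is ${limit}`)
    this.name = 'OverQuotaError'
  }
}

// Toy storage that tracks its own usage; no quota check happens outside it.
class InMemoryStore {
  private blobs: { [path: string]: string } = {}
  private used = 0

  constructor(private quota: number) {}

  write(path: string, data: string): void {
    const previous = Object.prototype.hasOwnProperty.call(this.blobs, path)
      ? this.blobs[path]
      : undefined
    const previousSize = previous === undefined ? 0 : previous.length
    const next = this.used - previousSize + data.length
    if (next > this.quota) {
      // The operation simply errors when and where the problem occurs.
      throw new OverQuotaError(next, this.quota)
    }
    this.blobs[path] = data
    this.used = next
  }
}

// Responder side: translates the storage error into an HTTP status.
function respondToPut(target: InMemoryStore, path: string, body: string): { status: number } {
  try {
    target.write(path, body)
    return { status: 204 }
  } catch (err) {
    // Checked by name so the test also works when compiled to an ES5 target.
    if (err instanceof Error && err.name === 'OverQuotaError') {
      return { status: 403 } // placeholder status, see the "403 maybe?" above
    }
    throw err
  }
}
```

With this shape, the cached usage figure is private to the storage component, and callers never perform a separate quota check before writing.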

@RubenVerborgh re '(pending a resolution for auth)', can you be more specific about what you want me to resolve and how? I think if you look at the implementations from #28 and inrupt/websockets-pubsub#1, then the current diagram correctly describes how auth is a software dependency of LDP and of WebSockets notifications, and how auth depends on the BlobTree storage, right? The goal of this PR is to document the current architecture; we can list refactor requests in #31.

@pmcb55 @justinwb what do you think?

@RubenVerborgh

> The goal of this PR is to document the current architecture, we can list refactor requests in #31.

Okay, got it. But can we also have a future architecture diagram?

@michielbdejong
Contributor Author

> can we also have a future architecture diagram?

Yes! What would you change?

@michielbdejong michielbdejong merged commit 82dc94c into wac-parts-rebased May 1, 2019
@michielbdejong
Contributor Author

I'll merge this and then we can do a new PR for the 'future' diagram.

@michielbdejong michielbdejong deleted the module-context-docs branch September 24, 2019 12:41
3 participants