Merge c72e1ff into 3b353af

CommunitySolidServer · Mar 4, 2022 · de2b6c9
2 parents 3b353af + c72e1ff
commit de2b6c9
Showing 13 changed files with 531 additions and 130 deletions.
diff --git a/README.md b/README.md
@@ -16,7 +16,7 @@ This Pod acts as your own personal storage space
 so you can share data with people and Solid applications.**
 
 As an open and modular implementation of the
-[Solid specifications](https://solid.github.io/specification/),
+[Solid specifications](https://solidproject.org/TR/),
 the Community Solid Server is a great companion:
 
 - 🧑🏽 **for people** who want to try out having their own Pod
@@ -101,16 +101,16 @@ These parameters give you direct access
 to some commonly used settings:
 
 | parameter name         | default value              | description                                                                                                                          |
-| --------------         | -------------              | -----------                                                                                                                          |
-| `--port, -p`           | `3000`                     | The TCP port on which the server runs.                                                                                               |
-| `--baseUrl, -b`        | `http://localhost:$PORT/`  | The public URL of your server.                                                                                                       |
-| `--loggingLevel, -l`   | `info`                     | The detail level of logging; useful for debugging problems.                                                                          |
+|------------------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------|
+| `--port, -p`           | `3000`                     | The TCP port on which the server should listen.                                                                                      |
+| `--baseUrl, -b`        | `http://localhost:$PORT/`  | The base URL used internally to generate URLs. Change this if your server does not run on `http://localhost:$PORT/`.                 |
+| `--loggingLevel, -l`   | `info`                     | The detail level of logging; useful for debugging problems. Use `debug` for full information.                                        |
 | `--config, -c`         | `@css:config/default.json` | The configuration for the server. The default only stores data in memory; to persist to your filesystem, use `@css:config/file.json` |
-| `--rootFilePath, -f`   | `./`                       | Root folder of the server, when using a file-based configuration.                                                                    |
+| `--rootFilePath, -f`   | `./`                       | Root folder where the server stores data, when using a file-based configuration.                                                     |
 | `--sparqlEndpoint, -s` |                            | URL of the SPARQL endpoint, when using a quadstore-based configuration.                                                              |
-| `--showStackTrace, -t` | false                      | Enables detailed logging on error pages.                                                                                             |
-| `--podConfigJson`      | `./pod-config.json`        | Path to the file that keeps track of dynamic Pod configurations.                                                                     |
-| `--mainModulePath, -m` |                            | Path from where Components.js will start its lookup when initializing configurations.
+| `--showStackTrace, -t` | false                      | Enables detailed logging on error output.                                                                                            |
+| `--podConfigJson`      | `./pod-config.json`        | Path to the file that keeps track of dynamic Pod configurations. Only relevant when using `@css:config/dynamic.json`.                |
+| `--mainModulePath, -m` |                            | Path from where Components.js will start its lookup when initializing configurations.                                                |
 
 ### 🧶 Custom configurations
 More substantial changes to server behavior can be achieved
@@ -133,7 +133,7 @@ the [📐 architectural diagram](https://rubenverborgh.github.io/solid-server-a
 can help you find your way.
 
 If you want to help out with server development,
-have a look at the [📓 developer notes](https://github.com/solid/community-server/blob/main/guides/developer-notes.md) and
+have a look at the [📓 guides](https://github.com/solid/community-server/blob/main/guides/) and
 [🛠️ good first issues](https://github.com/solid/community-server/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22).
 
 

diff --git a/guides/README.md b/guides/README.md
@@ -0,0 +1,28 @@
+# Documentation
+
+The documentation here is still incomplete both in content and structure, so feel free to open
+a [discussion](https://github.com/solid/community-server/discussions) about things you want to see added.
+While we try to update this documentation together with updates in the code,
+it is always possible we miss something,
+so please report it if you find incorrect information or links that no longer work.
+
+An introductory tutorial that gives a quick overview of the Solid and CSS basics can be found
+[here](https://github.com/KNowledgeOnWebScale/solid-linked-data-workshops-hands-on-exercises/blob/main/css-tutorial.md).
+This is a good way to get started with the server and its setup.
+
+If you want to know what is new in the latest version,
+you can check out the [release notes](https://github.com/solid/community-server/blob/main/RELEASE_NOTES.md)
+for a high-level overview and information on how to migrate your configuration to the next version.
+A list that includes all minor changes can be found in
+the [changelog](https://github.com/solid/community-server/blob/main/CHANGELOG.md).
+
+## Sections
+
+* [How to make changes to the repository](making-changes.md)
+* [Basic example HTTP requests](example-requests.md)
+* [How the server uses dependency injection](dependency-injection.md)
+* [What the architecture looks like](architecture.md)
+* [How to use the Identity Provider](identity-provider.md)
+
+For core developers with push access only: 
+[How to release a new version](release.md)
diff --git a/guides/architecture.md b/guides/architecture.md
@@ -0,0 +1,73 @@
+# Architecture overview
+
+The initial architecture document the project was started from can be found [here](https://rubenverborgh.github.io/solid-server-architecture/solid-architecture-v1-3-0.pdf).
+Many things have been added since the original inception of the project,
+but the core ideas within that document are still valid.
+
+As can be seen from the architecture, an important idea is the modularity of all components.
+No actual implementations are defined there, only their interfaces.
+Making all the components independent of each other in such a way provides us with enormous flexibility:
+they can all be replaced by a different implementation, without impacting anything else.
+This is how we can provide many different configurations for the server,
+and why it is impossible to provide ready solutions for all possible combinations.
+
+## Handlers
+A very important building block that gets reused in many places is the `AsyncHandler`.
+The idea is that a handler has 2 important functions.
+`canHandle` determines if this class is capable of correctly handling the request,
+and throws an error if it cannot.
+For example, a class that converts JSON-LD to Turtle can handle all requests containing JSON-LD data,
+but does not know what to do with a request that contains a JPEG.
+The second function is `handle`, where the class executes on the input data and returns the result.
+If an error gets thrown here, it means there is an issue with the input,
+for example when the input data claims to be JSON-LD but is not.
+
+The power of using this interface really shines when using certain utility classes.
+The one we use the most is the `WaterfallHandler`,
+which takes as input a list of handlers of the same type.
+The input and output types of a `WaterfallHandler` are the same as those of the handlers it wraps,
+meaning it can be used in the same places.
+When doing a `canHandle` call, it will iterate over all its input handlers
+to find the first one where the `canHandle` call succeeds,
+and when calling `handle` it will return the result of that specific handler.
+This allows us to chain together many handlers that each have their specific niche,
+such as handlers that each support a specific HTTP method (GET/PUT/POST/etc.),
+or handlers that only take requests targeting a specific subset of URLs.
+To the parent class it will look like it has a handler that supports all methods,
+while in practice it will be a `WaterfallHandler` containing all these separate handlers.
+
+Some other utility classes are the `ParallelHandler` that runs all handlers simultaneously,
+and the `SequenceHandler` that runs all of them one after the other.
+Since multiple handlers are executed here, these only work for handlers that have no output.
+
+## Streams
+Almost all data is handled in a streaming fashion.
+This allows us to work with very large resources without having to fully load them in memory:
+a client could be reading data that is being returned by the server while the server is still reading the file.
+Internally this means we are mostly handling data as `Readable` objects.
+We actually use `Guarded<Readable>` which is an internal format we created to help us with error handling.
+Such streams can be created using utility functions such as `guardStream` and `guardedStreamFrom`.
+Similarly, we have a `pipeSafely` to pipe streams in such a way that also helps with errors.
+
+## Example request
+In this section we will give a high-level overview of all the components
+a request passes through when it enters the server.
+This is specifically an LDP request, e.g. a POST request to create a new resource.
+
+1. The correct `HttpHandler` gets found, responsible for LDP requests.
+2. The HTTP request gets parsed into a manageable format, both body and metadata such as headers.
+3. The identification credentials of the request, if any, are extracted and parsed to authenticate the calling agent.
+4. The request gets authorized or rejected, based on the credentials from step 3
+   and the authorization rules of the target resource.
+5. Based on the HTTP method, the corresponding method from the `ResourceStore` gets called,
+   which in the case of a POST request will return the location of the newly created resource.
+6. The returned data and metadata get converted to an HTTP response and sent back by the `ResponseWriter`.
+
+If any of the steps above fails, an error will be thrown.
+The `ErrorHandler` will convert the error to an HTTP response to be returned.
+
+Below are sections that go deeper into the specific steps.
+Not all steps are covered yet; the missing ones will be added in the future.
+
+* [How authentication and authorization work](authorization.md)
+* [What the `ResourceStore` looks like](resource-store.md)
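
Editor's illustration for the Handlers section of `guides/architecture.md` above: a minimal sketch of the `canHandle`/`handle` contract and of how a waterfall-style handler picks the first handler that accepts the input. The names `SimpleRequest`, `SimpleHandler`, `MethodHandler` and `SimpleWaterfallHandler` are hypothetical stand-ins introduced for this sketch; they are not the server's actual `AsyncHandler` or `WaterfallHandler` classes, whose details may differ.

```typescript
// Simplified stand-ins for illustration; these are NOT the server's actual classes.
interface SimpleRequest {
  method: string;
}

abstract class SimpleHandler<TIn, TOut> {
  // Resolves if this handler supports the input, rejects with an error otherwise.
  public abstract canHandle(input: TIn): Promise<void>;
  // Performs the actual work; meant to be called only after `canHandle` succeeded.
  public abstract handle(input: TIn): Promise<TOut>;
}

// Hypothetical handler that only supports a single HTTP method.
class MethodHandler extends SimpleHandler<SimpleRequest, string> {
  private readonly method: string;
  private readonly result: string;

  public constructor(method: string, result: string) {
    super();
    this.method = method;
    this.result = result;
  }

  public async canHandle(input: SimpleRequest): Promise<void> {
    if (input.method !== this.method) {
      throw new Error(`Only ${this.method} requests are supported.`);
    }
  }

  public async handle(): Promise<string> {
    return this.result;
  }
}

// Wraps handlers of the same type and delegates to the first one that accepts the input.
class SimpleWaterfallHandler<TIn, TOut> extends SimpleHandler<TIn, TOut> {
  private readonly handlers: SimpleHandler<TIn, TOut>[];

  public constructor(handlers: SimpleHandler<TIn, TOut>[]) {
    super();
    this.handlers = handlers;
  }

  // Tries every wrapped handler in order and returns the first whose `canHandle` resolves.
  private async findHandler(input: TIn): Promise<SimpleHandler<TIn, TOut>> {
    for (const handler of this.handlers) {
      try {
        await handler.canHandle(input);
        return handler;
      } catch {
        // This handler does not support the input; try the next one.
      }
    }
    throw new Error('No handler supports the given input.');
  }

  public async canHandle(input: TIn): Promise<void> {
    await this.findHandler(input);
  }

  public async handle(input: TIn): Promise<TOut> {
    const handler = await this.findHandler(input);
    return handler.handle(input);
  }
}

// To the caller this looks like one handler that supports both GET and POST.
const ldpHandler = new SimpleWaterfallHandler<SimpleRequest, string>([
  new MethodHandler('GET', 'read resource'),
  new MethodHandler('POST', 'created resource'),
]);
```

Because the waterfall handler has the same input and output types as the handlers it wraps, a parent component can hold `ldpHandler` without knowing how many specialised handlers sit behind it.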
diff --git a/guides/authorization.md b/guides/authorization.md
@@ -0,0 +1,56 @@
+# Authorization
+
+Authorization is usually handled by the `AuthorizingHttpHandler`,
+and proceeds in the following steps:
+
+ 1. Identify the credentials of the agent making the call.
+ 2. Extract which access modes are needed for the request.
+ 3. Read the permissions the agent has.
+ 4. Compare the above results to see if the request is allowed.
+
+## Authentication
+There are multiple `CredentialsExtractor`s that each determine identity in a different way.
+Potentially multiple extractors can apply,
+making a requesting agent have multiple credentials. 
+The `DPoPWebIdExtractor` is most relevant for the [Solid-OIDC specification](https://solid.github.io/solid-oidc/),
+as it parses the access token generated by a Solid Identity Provider.
+Besides that there are always the public credentials, which everyone has.
+There are also some debug extractors that can be used to simulate credentials,
+which can be enabled as different options through the `config/ldp/authentication` imports.
+
+If successful, a `CredentialsExtractor` will return a key/value map
+linking the type of credentials to their specific values.
+
+## Modes extraction
+Access modes are a predefined list of `read`, `write`, `append`, `create` and `delete`.
+The `ModesExtractor`s determine which modes will be necessary,
+based on the request contents.
+The `MethodModesExtractor` determines modes based on the HTTP method.
+A GET request will always need the `read` mode for example.
+Specifically for PATCH requests there are extractors for each supported PATCH type,
+such as the `N3PatchModesExtractor`,
+which parses the N3 Patch body to know if it will add new data or only delete data.
+
+## Permission reading
+`PermissionReader`s take the input of the above to determine which permissions are available for which credentials.
+The modes from the previous step are not yet needed,
+but can be used as an optimization, as we only need to know if we have permission on those modes.
+Each reader may return only a partial answer if it only checks specific cases.
+Those results then get combined in the `UnionPermissionReader`.
+In the default configuration there are currently 4 relevant permission readers that get combined:
+
+1. `PathBasedReader` rejects all permissions for certain paths, to prevent access to internal data.
+2. `OwnerPermissionReader` grants control permissions to agents that are trying to access data in a pod that they own.
+3. `AuxiliaryReader` handles all permissions for auxiliary resources by requesting those of the subject resource if necessary.
+4. `WebAclReader` reads the relevant `.acl` resource to determine the defined permissions.
+
+All of the above applies when WebACL is enabled.
+It is also possible to always grant all permissions for debugging reasons
+by changing the authorization import to `config/ldp/authorization/allow-all.json`.
+
+## Authorization
+All the results of the previous steps then get combined to either allow or reject a request.
+If no permissions are found for a requested mode,
+or they are explicitly forbidden,
+a 401 or 403 will be returned:
+401 if the agent was not logged in, 403 if it was.
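
Editor's illustration for `guides/authorization.md` above: a minimal sketch of how the four steps could fit together. Everything here is a simplification under stated assumptions. The `Credentials` shape, the function names, the method-to-mode mapping beyond GET, and the rule that an explicit `false` overrides a `true` when combining reader results are all hypothetical and not taken from the server's code.

```typescript
// The access modes listed in the guide.
type AccessMode = 'read' | 'write' | 'append' | 'create' | 'delete';
const ALL_MODES: AccessMode[] = ['read', 'write', 'append', 'create', 'delete'];

// A reader may leave modes undefined when it has no opinion on them.
type PermissionSet = Partial<Record<AccessMode, boolean>>;

// Hypothetical simplification: credentials reduced to an optional WebID.
interface Credentials {
  webId?: string;
}

// Step 2: derive the required modes from the HTTP method.
// Only the GET mapping comes from the guide; the others are illustrative assumptions.
function modesForMethod(method: string): AccessMode[] {
  switch (method) {
    case 'GET':
    case 'HEAD':
      return ['read'];
    case 'POST':
      return ['append'];
    case 'PUT':
      return ['write'];
    case 'DELETE':
      return ['delete'];
    default:
      throw new Error(`Unsupported method ${method}`);
  }
}

// Step 3: combine the partial answers of several readers.
// Assumption: an explicit `false` from any reader overrides a `true` from another.
function combinePermissions(results: PermissionSet[]): PermissionSet {
  const combined: PermissionSet = {};
  for (const result of results) {
    for (const mode of ALL_MODES) {
      const allowed = result[mode];
      if (allowed === undefined) {
        continue;
      }
      combined[mode] = combined[mode] === false ? false : allowed;
    }
  }
  return combined;
}

// Step 4: allow the request only if every required mode is explicitly granted.
function authorize(
  credentials: Credentials,
  required: AccessMode[],
  permissions: PermissionSet,
): 'allowed' | 401 | 403 {
  if (required.every((mode): boolean => permissions[mode] === true)) {
    return 'allowed';
  }
  // 401 when no agent is logged in, 403 when a logged-in agent lacks permission.
  return credentials.webId ? 403 : 401;
}

const permissions = combinePermissions([{ read: true }, { write: true }, { write: false }]);
const anonymousGet = authorize({}, modesForMethod('GET'), permissions);
const anonymousPut = authorize({}, modesForMethod('PUT'), permissions);
const loggedInPut = authorize({ webId: 'https://example.com/profile#me' }, modesForMethod('PUT'), permissions);
```

In this sketch a mode with no entry is treated the same as a forbidden one, matching the guide's statement that a request is rejected when no permissions are found for a requested mode.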
diff --git a/guides/custom-configurations.md b/guides/custom-configurations.md
diff --git a/guides/dependency-injection.md b/guides/dependency-injection.md
@@ -0,0 +1,54 @@
+# Dependency injection
+
+The community server uses the _dependency injection_ framework
+[Components.js](https://github.com/LinkedSoftwareDependencies/Components.js/)
+to link all class instances together,
+and uses [Components-Generator.js](https://github.com/LinkedSoftwareDependencies/Components-Generator.js)
+to automatically generate the necessary description configurations of all classes.
+This framework allows us to configure our components in a JSON file.
+The advantage of this is that changing the configuration of components does not require any changes to the code, 
+as one can just change the default configuration file, or provide a custom configuration file.
+
+More information can be found in the Components.js [documentation](https://componentsjs.readthedocs.io/),
+but a summarized overview can be found below.
+
+## Component files
+Components.js requires a component file for every class you might want to instantiate.
+Fortunately those get generated automatically by Components-Generator.js.
+Calling `npm run build` will call the generator and generate those JSON-LD files in the `dist` folder.
+The generator uses the `index.ts`, so new classes always have to be added there
+or they will not get a component file.
+
+## Configuration files
+Configuration files are how we tell Components.js which classes to instantiate and link together.
+All the community server configurations can be found in
+the [`config` folder](https://github.com/solid/community-server/tree/master/config/).
+That folder also contains information about how different pre-defined configurations can be used.
+
+A single component in such a configuration file might look as follows: 
+```json
+{
+  "comment": "Storage used for account management.",
+  "@id": "urn:solid-server:default:AccountStorage",
+  "@type": "JsonResourceStorage",
+  "source": { "@id": "urn:solid-server:default:ResourceStore" },
+  "baseUrl": { "@id": "urn:solid-server:default:variable:baseUrl" },
+  "container": "/.internal/accounts/"
+}
+```
+
+With the corresponding constructor of the `JsonResourceStorage` class:
+```ts
+public constructor(source: ResourceStore, baseUrl: string, container: string)
+```
+
+The important elements here are the following:
+* `"comment"`: _(optional)_ A description of this component.
+* `"@id"`: _(optional)_ A unique identifier of this component, which allows it to be used as a parameter value in different places.
+* `"@type"`: The class name of the component. This must be a TypeScript class name that is exported via `index.ts`.
+
+As you can see from the constructor, the other fields are direct mappings from the constructor parameters.
+`source` references another object, which we refer to using its identifier `urn:solid-server:default:ResourceStore`.
+`baseUrl` is just a string, but here we use a variable that was set before calling Components.js,
+which is why it also references an `@id`.
+These variables are set when starting up the server, based on the command line parameters.
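
Editor's illustration for `guides/dependency-injection.md` above: conceptually, the JSON component shown in the guide corresponds to hand-written wiring like the sketch below. This is only meant to show how the configuration fields line up with constructor parameters. The `ResourceStore` interface and `JsonResourceStorage` class here are simplified stand-ins, the `baseUrl` value is an assumed example, and none of this is the Components.js API.

```typescript
// Hypothetical stand-ins; the real classes live in the server's code base.
interface ResourceStore {
  readonly name: string;
}

class JsonResourceStorage {
  public readonly source: ResourceStore;
  public readonly baseUrl: string;
  public readonly container: string;

  // Same parameter order as the constructor signature quoted in the guide.
  public constructor(source: ResourceStore, baseUrl: string, container: string) {
    this.source = source;
    this.baseUrl = baseUrl;
    this.container = container;
  }
}

// "urn:solid-server:default:ResourceStore": one shared instance, referenced by its identifier.
const resourceStore: ResourceStore = { name: 'ResourceStore' };

// "urn:solid-server:default:variable:baseUrl": a variable set at startup (example value assumed here).
const baseUrl = 'http://localhost:3000/';

// "urn:solid-server:default:AccountStorage": each JSON field maps to one constructor parameter.
const accountStorage = new JsonResourceStorage(resourceStore, baseUrl, '/.internal/accounts/');
```

The benefit of the JSON form is that swapping `source` for a different store, or changing `container`, needs no code change: only the identifier or value in the configuration file differs.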