Commit

Merge c72e1ff into 3b353af
joachimvh committed Mar 4, 2022
2 parents 3b353af + c72e1ff commit de2b6c9
Showing 13 changed files with 531 additions and 130 deletions.
20 changes: 10 additions & 10 deletions README.md
@@ -16,7 +16,7 @@ This Pod acts as your own personal storage space
so you can share data with people and Solid applications.**

As an open and modular implementation of the
[Solid specifications](https://solid.github.io/specification/),
[Solid specifications](https://solidproject.org/TR/),
the Community Solid Server is a great companion:

- 🧑🏽 **for people** who want to try out having their own Pod
@@ -101,16 +101,16 @@ These parameters give you direct access
to some commonly used settings:

| parameter name | default value | description |
| -------------- | ------------- | ----------- |
| `--port, -p` | `3000` | The TCP port on which the server runs. |
| `--baseUrl, -b` | `http://localhost:$PORT/` | The public URL of your server. |
| `--loggingLevel, -l` | `info` | The detail level of logging; useful for debugging problems. |
|------------------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------|
| `--port, -p` | `3000` | The TCP port on which the server should listen. |
| `--baseUrl, -b` | `http://localhost:$PORT/` | The base URL used internally to generate URLs. Change this if your server does not run on `http://localhost:$PORT/`. |
| `--loggingLevel, -l` | `info` | The detail level of logging; useful for debugging problems. Use `debug` for full information. |
| `--config, -c` | `@css:config/default.json` | The configuration for the server. The default only stores data in memory; to persist to your filesystem, use `@css:config/file.json` |
| `--rootFilePath, -f` | `./` | Root folder of the server, when using a file-based configuration. |
| `--rootFilePath, -f` | `./` | Root folder where the server stores data, when using a file-based configuration. |
| `--sparqlEndpoint, -s` | | URL of the SPARQL endpoint, when using a quadstore-based configuration. |
| `--showStackTrace, -t` | false | Enables detailed logging on error pages. |
| `--podConfigJson` | `./pod-config.json` | Path to the file that keeps track of dynamic Pod configurations. |
| `--mainModulePath, -m` | | Path from where Components.js will start its lookup when initializing configurations.
| `--showStackTrace, -t` | false | Enables detailed logging on error output. |
| `--podConfigJson` | `./pod-config.json` | Path to the file that keeps track of dynamic Pod configurations. Only relevant when using `@css:config/dynamic.json`. |
| `--mainModulePath, -m` | | Path from where Components.js will start its lookup when initializing configurations. |

### 🧶 Custom configurations
More substantial changes to server behavior can be achieved
@@ -133,7 +133,7 @@ the [📐 architectural diagram](https://rubenverborgh.github.io/solid-server-a
can help you find your way.

If you want to help out with server development,
have a look at the [📓 developer notes](https://github.com/solid/community-server/blob/main/guides/developer-notes.md) and
have a look at the [📓 guides](https://github.com/solid/community-server/blob/main/guides/) and
[🛠️ good first issues](https://github.com/solid/community-server/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22).


28 changes: 28 additions & 0 deletions guides/README.md
@@ -0,0 +1,28 @@
# Documentation

The documentation here is still incomplete in both content and structure, so feel free to open
a [discussion](https://github.com/solid/community-server/discussions) about things you want to see added.
While we try to update this documentation together with updates in the code,
it is always possible we miss something,
so please report it if you find incorrect information or links that no longer work.

An introductory tutorial that gives a quick overview of the Solid and CSS basics can be found
[here](https://github.com/KNowledgeOnWebScale/solid-linked-data-workshops-hands-on-exercises/blob/main/css-tutorial.md).
This is a good way to get started with the server and its setup.

If you want to know what is new in the latest version,
you can check out the [release notes](https://github.com/solid/community-server/blob/main/RELEASE_NOTES.md)
for a high-level overview and information on how to migrate your configuration to the next version.
A list that includes all minor changes can be found in
the [changelog](https://github.com/solid/community-server/blob/main/CHANGELOG.md).

## Sections

* [How to make changes to the repository](making-changes.md)
* [Basic example HTTP requests](example-requests.md)
* [How the server uses dependency injection](dependency-injection.md)
* [What the architecture looks like](architecture.md)
* [How to use the Identity Provider](identity-provider.md)

For core developers with push access only:
[How to release a new version](release.md)
73 changes: 73 additions & 0 deletions guides/architecture.md
@@ -0,0 +1,73 @@
# Architecture overview

The initial architecture document the project was started from can be found [here](https://rubenverborgh.github.io/solid-server-architecture/solid-architecture-v1-3-0.pdf).
Many things have been added since the original inception of the project,
but the core ideas within that document are still valid.

As can be seen from the architecture, an important idea is the modularity of all components.
No actual implementations are defined there, only their interfaces.
Making all the components independent of each other in this way gives us enormous flexibility:
they can all be replaced by a different implementation, without impacting anything else.
This is how we can provide many different configurations for the server,
and why it is impossible to provide ready solutions for all possible combinations.

## Handlers
A very important building block that gets reused in many places is the `AsyncHandler`.
The idea is that a handler has two important functions.
`canHandle` determines whether this class is capable of correctly handling the request,
and throws an error if it cannot.
For example, a class that converts JSON-LD to Turtle can handle all requests containing JSON-LD data,
but does not know what to do with a request that contains a JPEG.
The second function is `handle`, in which the class processes the input data and returns the result.
If an error gets thrown here, it means there is an issue with the input,
for example when the input data claims to be JSON-LD but actually is not.
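
The two phases can be illustrated with a minimal sketch. The `AsyncHandler` class below is a simplified stand-in written for this example and may differ from the server's actual class; the `Representation` type and the `JsonLdParser` class are hypothetical, and the actual conversion to Turtle is left out.

```ts
// Simplified stand-in for the server's AsyncHandler; the real class may differ.
abstract class AsyncHandler<TIn, TOut> {
  // Resolves if the input is supported, rejects with an error otherwise.
  public abstract canHandle(input: TIn): Promise<void>;
  // Performs the actual work; should only be called after canHandle succeeded.
  public abstract handle(input: TIn): Promise<TOut>;
}

// Hypothetical input type for this example.
interface Representation {
  contentType: string;
  data: string;
}

// Toy handler: accepts only JSON-LD input and returns the parsed document.
class JsonLdParser extends AsyncHandler<Representation, unknown> {
  public async canHandle(input: Representation): Promise<void> {
    if (input.contentType !== 'application/ld+json') {
      throw new Error(`Unsupported content type: ${input.contentType}`);
    }
  }

  public async handle(input: Representation): Promise<unknown> {
    // Throws a SyntaxError if the body claims to be JSON-LD but is not valid JSON.
    return JSON.parse(input.data);
  }
}
```

A caller would first await `canHandle` and only then call `handle`, so that "this handler is not meant for this input" and "this input is broken" remain two distinct failures.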

The power of using this interface really shines when using certain utility classes.
The one we use the most is the `WaterfallHandler`,
which takes as input a list of handlers of the same type.
The input and output types of a `WaterfallHandler` are the same as those of the handlers it contains,
meaning it can be used in the same places.
When doing a `canHandle` call, it will iterate over all its input handlers
to find the first one where the `canHandle` call succeeds,
and when calling `handle` it will return the result of that specific handler.
This allows us to chain together many handlers that each have their specific niche,
such as handlers that each support a specific HTTP method (GET/PUT/POST/etc.),
or handlers that only take requests targeting a specific subset of URLs.
To the parent class it will look like it has a handler that supports all methods,
while in practice it will be a `WaterfallHandler` containing all these separate handlers.
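
As a rough sketch of this behaviour, the class below delegates to the first handler that accepts the input. `WaterfallSketch`, the `Handler` interface, and the `forMethod` helper are hypothetical names for this example only; the server's `WaterfallHandler` may be implemented differently.

```ts
// Minimal handler shape assumed for this sketch.
interface Handler<TIn, TOut> {
  canHandle(input: TIn): Promise<void>;
  handle(input: TIn): Promise<TOut>;
}

// Simplified waterfall: delegates to the first handler that accepts the input.
class WaterfallSketch<TIn, TOut> implements Handler<TIn, TOut> {
  public constructor(private readonly handlers: Handler<TIn, TOut>[]) {}

  public async canHandle(input: TIn): Promise<void> {
    await this.findHandler(input);
  }

  public async handle(input: TIn): Promise<TOut> {
    const handler = await this.findHandler(input);
    return handler.handle(input);
  }

  // Returns the first handler whose canHandle call succeeds.
  private async findHandler(input: TIn): Promise<Handler<TIn, TOut>> {
    for (const handler of this.handlers) {
      try {
        await handler.canHandle(input);
        return handler;
      } catch {
        // This handler does not support the input; try the next one.
      }
    }
    throw new Error('No handler supports the given input.');
  }
}

// Hypothetical handlers that each support a single HTTP method.
const forMethod = (method: string): Handler<{ method: string }, string> => ({
  async canHandle(input): Promise<void> {
    if (input.method !== method) {
      throw new Error(`Only ${method} is supported.`);
    }
  },
  async handle(): Promise<string> {
    return `handled ${method}`;
  },
});

// To its caller this looks like one handler that supports GET, PUT and POST.
const methodHandler = new WaterfallSketch([ forMethod('GET'), forMethod('PUT'), forMethod('POST') ]);
```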

Some other utility classes are the `ParallelHandler` that runs all handlers simultaneously,
and the `SequenceHandler` that runs all of them one after the other.
Since multiple handlers are executed here, these only work for handlers that have no output.
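
The difference between the two can be sketched with plain functions. The `VoidHandler` type and the `parallel` and `sequence` helpers are hypothetical simplifications for this example, not the server's classes.

```ts
// Minimal shape of a handler without output, assumed for this sketch.
type VoidHandler<TIn> = (input: TIn) => Promise<void>;

// Starts all handlers at the same time and waits until every one has finished.
function parallel<TIn>(handlers: VoidHandler<TIn>[]): VoidHandler<TIn> {
  return async (input): Promise<void> => {
    await Promise.all(handlers.map((handler) => handler(input)));
  };
}

// Runs the handlers one after the other, in the given order.
function sequence<TIn>(handlers: VoidHandler<TIn>[]): VoidHandler<TIn> {
  return async (input): Promise<void> => {
    for (const handler of handlers) {
      await handler(input);
    }
  };
}
```

With `sequence`, a later handler can rely on the side effects of an earlier one; with `parallel`, no ordering between the handlers should be assumed.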

## Streams
Almost all data is handled in a streaming fashion.
This allows us to work with very large resources without having to fully load them in memory:
a client could be reading data that is being returned by the server while the server is still reading the file.
Internally this means we are mostly handling data as `Readable` objects.
We actually use `Guarded<Readable>` which is an internal format we created to help us with error handling.
Such streams can be created using utility functions such as `guardStream` and `guardedStreamFrom`.
Similarly, we have `pipeSafely`, which pipes streams in a way that also helps with error handling.
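
To illustrate the kind of problem such helpers address, the sketch below uses only Node.js streams. Plain `pipe` does not forward errors from the source to the destination, so a consumer of the destination would never learn that the source failed. The `pipeWithErrors` and `readAll` functions are hypothetical helpers written for this example; they are not the server's `pipeSafely` or `guardStream`, whose implementations are not described here.

```ts
import { PassThrough, Readable } from 'node:stream';

// Hypothetical helper: like `pipe`, but also forwards errors of the source.
function pipeWithErrors(source: Readable, destination: PassThrough): PassThrough {
  source.pipe(destination);
  source.on('error', (error: Error): void => {
    destination.destroy(error);
  });
  return destination;
}

// Reads a stream chunk by chunk; only the concatenated result is kept here,
// but each chunk could just as well be processed and discarded immediately.
async function readAll(stream: Readable): Promise<string> {
  let result = '';
  for await (const chunk of stream) {
    result += chunk.toString();
  }
  return result;
}
```

A caller that awaits `readAll(pipeWithErrors(source, new PassThrough()))` receives either the full data or a rejection carrying the source's error.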

## Example request
In this section we will give a high-level overview of all the components
a request passes through when it enters the server.
This is specifically an LDP request, e.g. a POST request to create a new resource.

1. The correct `HttpHandler` gets found, responsible for LDP requests.
2. The HTTP request gets parsed into a manageable format, both body and metadata such as headers.
3. The identification credentials of the request, if any, are extracted and parsed to authenticate the calling agent.
4. The request gets authorized or rejected, based on the credentials from step 3
and the authorization rules of the target resource.
5. Based on the HTTP method, the corresponding method from the `ResourceStore` gets called,
which in the case of a POST request will return the location of the newly created resource.
6. The returned data and metadata get converted to an HTTP response and sent back by the `ResponseWriter`.

If any of the steps above fails, an error will be thrown.
The `ErrorHandler` will convert the error to an HTTP response to be returned.
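
The flow of steps 2 to 6 can be sketched as a chain of function calls with a single place where errors are turned into responses. Every type and function name below is a hypothetical simplification for this example, and the authorization rule is a toy rule; none of it is the server's actual API.

```ts
// All types and function names below are hypothetical simplifications.
interface ParsedRequest {
  method: string;
  target: string;
  authorization?: string;
}

interface HttpResponse {
  status: number;
  headers: Record<string, string>;
}

class HttpError extends Error {
  public constructor(public readonly status: number, message: string) {
    super(message);
  }
}

// Step 3: derive credentials from the request, if any.
function extractCredentials(request: ParsedRequest): { webId?: string } {
  return request.authorization ? { webId: request.authorization } : {};
}

// Step 4: toy rule for this sketch, only identified agents may modify data.
function authorize(credentials: { webId?: string }, request: ParsedRequest): void {
  if (request.method !== 'GET' && !credentials.webId) {
    throw new HttpError(401, 'Authentication required.');
  }
}

// Step 5: toy store that only supports POST and returns the new location.
function callStore(request: ParsedRequest): string {
  if (request.method !== 'POST') {
    throw new HttpError(405, `${request.method} is not supported in this sketch.`);
  }
  return `${request.target}new-resource`;
}

// Steps 3 to 6 chained; the catch block plays the role of the error handler.
function handleLdpRequest(request: ParsedRequest): HttpResponse {
  try {
    const credentials = extractCredentials(request);
    authorize(credentials, request);
    const location = callStore(request);
    return { status: 201, headers: { location } };
  } catch (error: unknown) {
    const status = error instanceof HttpError ? error.status : 500;
    return { status, headers: {} };
  }
}
```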

Below are sections that go deeper into the specific steps.
Not all steps are covered yet; the missing ones will be added in the future.

* [How authentication and authorization work](authorization.md)
* [What the `ResourceStore` looks like](resource-store.md)
56 changes: 56 additions & 0 deletions guides/authorization.md
@@ -0,0 +1,56 @@
# Authorization

Authorization is usually handled by the `AuthorizingHttpHandler`,
and proceeds through the following steps:

1. Identify the credentials of the agent making the call.
2. Extract which access modes are needed for the request.
3. Read the permissions the agent has.
4. Compare the above results to see if the request is allowed.

## Authentication
There are multiple `CredentialsExtractor`s that each determine identity in a different way.
Potentially multiple extractors can apply,
making a requesting agent have multiple credentials.
The `DPoPWebIdExtractor` is most relevant for the [Solid-OIDC specification](https://solid.github.io/solid-oidc/),
as it parses the access token generated by a Solid Identity Provider.
Besides that there are always the public credentials, which everyone has.
There are also some debug extractors that can be used to simulate credentials,
which can be enabled as different options through the `config/ldp/authentication` imports.

If successful, a `CredentialsExtractor` will return a key/value map
linking the type of credentials to their specific values.

## Modes extraction
Access modes are a predefined list of `read`, `write`, `append`, `create` and `delete`.
The `ModesExtractor`s determine which modes will be necessary,
based on the request contents.
The `MethodModesExtractor` determines modes based on the HTTP method.
A GET request will always need the `read` mode for example.
Specifically for PATCH requests there are extractors for each supported PATCH type,
such as the `N3PatchModesExtractor`,
which parses the N3 Patch body to know if it will add new data or only delete data.
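
A method-based extractor could look roughly like the sketch below. Only the GET rule comes from the text above; the other entries in the table are assumptions made for illustration and may not match what the server actually requires. The names `MODES_BY_METHOD` and `extractModes` are hypothetical.

```ts
// The access modes listed above.
type AccessMode = 'read' | 'write' | 'append' | 'create' | 'delete';

// Assumed mapping for illustration. Only GET -> read is stated in the text;
// the remaining entries are guesses and may differ from the server's behaviour.
const MODES_BY_METHOD: Record<string, AccessMode[]> = {
  GET: [ 'read' ],
  HEAD: [ 'read' ],
  POST: [ 'append' ],
  PUT: [ 'write' ],
  DELETE: [ 'delete' ],
};

// Determines the required modes purely from the HTTP method.
function extractModes(method: string): Set<AccessMode> {
  const modes = MODES_BY_METHOD[method.toUpperCase()];
  if (!modes) {
    // For example PATCH, where the body has to be inspected instead.
    throw new Error(`Cannot determine modes for method ${method}.`);
  }
  return new Set(modes);
}
```

Methods such as PATCH are deliberately missing from the table, since their required modes depend on the request body rather than on the method alone.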

## Permission reading
`PermissionReaders` take the input of the above to determine which permissions are available for which credentials.
The modes from the previous step are not yet needed,
but can be used as optimization as we only need to know if we have permission on those modes.
Each reader can return a partial answer if it only checks specific cases.
Those results then get combined in the `UnionPermissionReader`.
In the default configuration there are currently 4 relevant permission readers that get combined:

1. `PathBasedReader` rejects all permissions for certain paths, to prevent access to internal data.
2. `OwnerPermissionReader` grants control permissions to agents that are trying to access data in a pod that they own.
3. `AuxiliaryReader` handles all permissions for auxiliary resources by requesting those of the subject resource if necessary.
4. `WebAclReader` reads the relevant `.acl` resource to determine the defined permissions.

All of the above applies when WebACL is enabled.
It is also possible to always grant all permissions for debugging reasons
by changing the authorization import to `config/ldp/authorization/allow-all.json`.
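
How partial answers from several readers could be merged is sketched below. The combination rule used here (an explicit denial wins over a grant, and a grant wins over no answer) is an assumption for this example, as are the `PermissionSet` and `PermissionReader` types and the two toy readers; credentials are left out to keep the sketch short.

```ts
type AccessMode = 'read' | 'write' | 'append' | 'create' | 'delete';

// true = granted, false = explicitly denied, missing = this reader has no answer.
type PermissionSet = Partial<Record<AccessMode, boolean>>;
type PermissionReader = (target: string) => PermissionSet;

// Assumed rule for this sketch: a denial wins over a grant, a grant over no answer.
function unionReader(readers: PermissionReader[]): PermissionReader {
  return (target): PermissionSet => {
    const result: PermissionSet = {};
    for (const reader of readers) {
      const partial = reader(target);
      for (const mode of Object.keys(partial) as AccessMode[]) {
        const value = partial[mode];
        if (value === undefined) {
          continue;
        }
        // Keep an earlier denial; otherwise take over this reader's answer.
        if (result[mode] !== false) {
          result[mode] = value;
        }
      }
    }
    return result;
  };
}

// Hypothetical readers, loosely modelled on the first and last reader listed above.
const pathBasedReader: PermissionReader = (target) =>
  target.startsWith('/.internal/') ?
    { read: false, write: false, append: false, create: false, delete: false } :
    {};
const aclReader: PermissionReader = () => ({ read: true, append: true });

const combinedReader = unionReader([ pathBasedReader, aclReader ]);
```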

## Authorization
All the results of the previous steps then get combined to either allow or reject a request.
If no permissions are found for a requested mode,
or they are explicitly forbidden,
a 401/403 will be returned,
depending on whether the agent was logged in or not.
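
The final decision can be sketched as a small function. The `Credentials` shape and the `authorize` function are hypothetical; the choice of 401 for agents that are not logged in and 403 for agents that are follows the usual HTTP meaning of those status codes.

```ts
type AccessMode = 'read' | 'write' | 'append' | 'create' | 'delete';
type PermissionSet = Partial<Record<AccessMode, boolean>>;

// Hypothetical credentials shape: a WebID is only present for logged-in agents.
interface Credentials {
  webId?: string;
}

// Returns 'allowed' if every requested mode is granted; otherwise 401 for
// agents that are not logged in and 403 for agents that are.
function authorize(
  credentials: Credentials,
  requested: AccessMode[],
  permissions: PermissionSet,
): 'allowed' | 401 | 403 {
  // A missing entry and an explicit `false` both count as "not allowed".
  const allowed = requested.every((mode) => permissions[mode] === true);
  if (allowed) {
    return 'allowed';
  }
  return credentials.webId ? 403 : 401;
}
```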
28 changes: 0 additions & 28 deletions guides/custom-configurations.md

This file was deleted.

54 changes: 54 additions & 0 deletions guides/dependency-injection.md
@@ -0,0 +1,54 @@
# Dependency injection

The community server uses the _dependency injection_ framework
[Components.js](https://github.com/LinkedSoftwareDependencies/Components.js/)
to link all class instances together,
and uses [Components-Generator.js](https://github.com/LinkedSoftwareDependencies/Components-Generator.js)
to automatically generate the necessary description configurations of all classes.
This framework allows us to configure our components in a JSON file.
The advantage of this is that changing the configuration of components does not require any changes to the code,
as one can just change the default configuration file, or provide a custom configuration file.

More information can be found in the Components.js [documentation](https://componentsjs.readthedocs.io/),
but a summarized overview can be found below.

## Component files
Components.js requires a component file for every class you might want to instantiate.
Fortunately those get generated automatically by Components-Generator.js.
Calling `npm run build` will call the generator and generate those JSON-LD files in the `dist` folder.
The generator uses the `index.ts`, so new classes always have to be added there
or they will not get a component file.

## Configuration files
Configuration files are how we tell Components.js which classes to instantiate and link together.
All the community server configurations can be found in
the [`config` folder](https://github.com/solid/community-server/tree/master/config/).
That folder also contains information about how different pre-defined configurations can be used.

A single component in such a configuration file might look as follows:
```json
{
  "comment": "Storage used for account management.",
  "@id": "urn:solid-server:default:AccountStorage",
  "@type": "JsonResourceStorage",
  "source": { "@id": "urn:solid-server:default:ResourceStore" },
  "baseUrl": { "@id": "urn:solid-server:default:variable:baseUrl" },
  "container": "/.internal/accounts/"
}
```

With the corresponding constructor of the `JsonResourceStorage` class:
```ts
public constructor(source: ResourceStore, baseUrl: string, container: string)
```

The important elements here are the following:
* `"comment"`: _(optional)_ A description of this component.
* `"@id"`: _(optional)_ A unique identifier of this component, which allows it to be used as parameter values in different places.
* `"@type"`: The class name of the component. This must be a TypeScript class name that is exported via `index.ts`.

As you can see from the constructor, the other fields are direct mappings from the constructor parameters.
`source` references another object, which we refer to using its identifier `urn:solid-server:default:ResourceStore`.
`baseUrl` is just a string, but here we use a variable that was set before calling Components.js
which is why it also references an `@id`.
These variables are set when starting up the server, based on the command line parameters.
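
Conceptually, the framework ends up performing a constructor call like the one in the sketch below. This is a toy illustration of the mapping only and not how Components.js works internally: the `variables` and `instances` lookup tables, the `resolve` function, the empty `ResourceStore` class, and the `http://localhost:3000/` value are all assumptions for this example.

```ts
// Toy stand-ins for the classes referenced in the configuration above.
class ResourceStore {}

class JsonResourceStorage {
  public constructor(
    public readonly source: ResourceStore,
    public readonly baseUrl: string,
    public readonly container: string,
  ) {}
}

// Variables are filled in at start-up, for example from the command line parameters.
const variables: Record<string, string> = {
  'urn:solid-server:default:variable:baseUrl': 'http://localhost:3000/',
};

// Components that were already instantiated, indexed by their "@id".
const instances: Record<string, unknown> = {
  'urn:solid-server:default:ResourceStore': new ResourceStore(),
};

// Resolves an { "@id": ... } reference to a variable value or a component instance.
function resolve(reference: { '@id': string }): unknown {
  const id = reference['@id'];
  if (id in variables) {
    return variables[id];
  }
  if (id in instances) {
    return instances[id];
  }
  throw new Error(`Unknown identifier: ${id}`);
}

const config = {
  '@id': 'urn:solid-server:default:AccountStorage',
  '@type': 'JsonResourceStorage',
  source: { '@id': 'urn:solid-server:default:ResourceStore' },
  baseUrl: { '@id': 'urn:solid-server:default:variable:baseUrl' },
  container: '/.internal/accounts/',
};

// Each remaining field of the configuration becomes one constructor argument.
const accountStorage = new JsonResourceStorage(
  resolve(config.source) as ResourceStore,
  resolve(config.baseUrl) as string,
  config.container,
);

// Registering the result under its "@id" lets other components reference it.
instances[config['@id']] = accountStorage;
```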