doc/spec: namespaces and discovery #303

Closed
198 changes: 198 additions & 0 deletions docs/spec/namespace.md
@@ -0,0 +1,198 @@
# Namespace
The global namespace represents the full set of named repositories which can be
referenced by Docker. Any scope within the namespace may refer to a subset of
the named repos. A repository contains a set of content-addressable blobs and
tags referencing those blobs. The namespace can be used to discover the
location and certificate information of a repository based on DNS, HTTP
requests and other means.

A repository should always be referenced by its fully qualified name. If a
client presents a shortened name to a user, that name should be fully expanded
based on rules defined by the client before contacting a remote server. When
mirroring repositories, the original name for a repository must be used.
Changing a repository name is equivalent to copying or moving the repository,
and the renamed repository will not be referenced by the original.

## Terminology

- *Global Namespace* - The full set of referenceable names.
- *Name* - A fully qualified string containing both the domain and resource name
Contributor

I think examples of these last four and how they relate would be super useful; I still find the terms a bit overloaded or interchanged in some parts of the doc.

- *Repository* - A collection of objects under the same name within the
namespace.
- *Namespace* - also *Namespace Scope* - A collection of repositories with a
common name prefix and set of services including registry API, index, and trust
context.
- *Short Name* - A name which does not contain a domain and requires
expansion to a fully qualified name before resolving to a repository.

## Format
A name consists of two parts, a DNS host name plus a repository path. The host
name follows the DNS rules without change. The maximum total length of a name is
255 characters; however, there is no specific limitation on DNS or path
components.

Contributor

maximum total length

### Name Grammar
```
<name> ::= <hostname>"/"<path>
<hostname> ::= <host-part>*["."<host-part>][":"<port>]
<path> ::= <path-part>*["/"<path-part>]
<host-part> ::= <regexp "[a-z]([-]?[a-z0-9])*">

Since the hostname is supposed to allow all valid domain names, it should not exclude valid domain names like 509films.com. See wikipedia for a simple explanation. To keep this spec and the eventual implementation simple, we can be more lax on the limit of 127 tree subdivisions and only 253 characters, letting the dns libraries handle the error.

Collaborator

While this is partly a hostname, first and foremost, it is a namespace for images. This has to be both a hostname and a valid name component for a docker image. Opening it up to allow whatever dns allows would be problematic for backwards compatibility.

If you disagree, please propose a change to the BNF.

Ok, I would allow starting with a number, but requiring a letter somewhere after that. We should also allow multiple hyphens especially for punycode.

- <host-part> ::= <regexp "[a-z]([-]?[a-z0-9])*">
+ <host-part> ::= <regexp "([0-9][-0-9]*)?[a-z](-*[a-z0-9])*">
# Number, followed by optional non-letters, followed by required letter, followed by anything, then a non-hyphen character to end

valid:
9-a
xn--n3h
lib

invalid:
9
99-99
a0- (still invalid)

@stevvooe (or any others) what are the thoughts about my proposed BNF to bring the host parts in line with DNS names? (ie the comment above this one)

I have not seen any discussion on this and wanted to make sure we are not limiting where a user can host their own registry by only allowing a subset of valid domain names.

It may require tweaking if you want to follow the exact DNS specifications, but I would rather we at least err on being a superset of DNS rather than a subset.

Collaborator

@yosifkit A few constraints on what is accepted are acceptable in the face of consistency and security. I am not sure that I agree that this would limit where a user can host a registry. Domains are cheap or free. Do you have specific examples of situations where people want to host images in a domain that can't be represented and they absolutely cannot change the domain?

Check out #609 for the work on the reference package. We are putting together a full BNF for a "Reference" which will capture all of this.

Either way, it's easier to make this more accepting later than it is to lock this down further in the future. We can make this suggested change at any time, but going back to the more restrictive format would be impossible.

Collaborator

@paultag That's a bit oversimplified. There are a number of factors.

@yosifkit @tianon Right now, you are literally bike-shedding over "I want a snowman". Reducing my argument to 'arbitrarily restricting people's options... because "Domains are cheap or free"' is not really fair.

This specification captures the current state of "namespaces" and extends it with discovery. This isn't the right venue for changes to this format. If you would like to make a change here, the following is a list of changes we've made to allowed image names based on clear, well-founded technical arguments:

moby/moby#10392
#687
#241

I look forward to your constructive, well-founded PR to add punycode support to image names! We'll update this proposal accordingly.

Collaborator Author

Been trying to stay out of this one. I think @stevvooe is right that this is not the place to make such changes and decisions. This format is tracking a single DNS RFC. If we want to include other standard formats we should make sure the support is elsewhere and document what RFCs we are using to build this BNF. Then it can easily be audited for accuracy. For the <path-part> we do have more domain over other changes since there are no RFCs to base this off of.

https://www.youtube.com/watch?v=C1ZHetiuRb0

Contributor

@tianon tianon Jul 29, 2015 via email


Using docker 1.7.1 and registry:2.0.1 I was able to tag, push, and pull an image as 9n--13h.docker:5000/hello-world*. So it is not an active problem in docker or the registry, just, as I understand it, a proposed definition here. The reason I initially chimed in was to ensure that this definition did not limit what is an acceptable hostname. So, yes, I want the BNF to reflect the DNS RFCs to allow all valid DNS hostnames as you state in this doc:

The host name follows the DNS rules without change. link

Also, it looks like my proposed change above is wrong in some places. A better one would be ^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{1,63}(?<!-)$ (stackoverflow).

* yes, that is not a punycode address, but highlights the needed double hyphen for punycode as well as domains that start with a number that also contain a non-number

Collaborator

@yosifkit @tianon I've added a unit test for a few of these examples that we may support in the future, to ensure any changes get caught in the implementation of this specification.

#791

Feel free to submit a PR that modifies the name component regexp appropriately and we can flip this test case to valid.

<port> ::= <number 1 to 65535>
<path-part> ::= <regexp "[a-z0-9]([._-]?[a-z0-9])*">
```
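
The grammar above can be checked mechanically. The following is a minimal Go sketch, not the reference implementation: `ValidateName` and `maxNameLength` are hypothetical names, the two regular expressions are copied from the grammar as proposed in this diff, and the 255-character limit comes from the Format section. Under this grammar the host names raised in the review thread (`509films.com`, `xn--n3h`) are rejected.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Component patterns copied from the grammar, anchored to match a whole part.
var (
	hostPart = regexp.MustCompile(`^[a-z]([-]?[a-z0-9])*$`)
	pathPart = regexp.MustCompile(`^[a-z0-9]([._-]?[a-z0-9])*$`)
)

// maxNameLength is the total length limit stated in the Format section.
const maxNameLength = 255

// ValidateName is a hypothetical helper that checks name against <name>.
func ValidateName(name string) error {
	if len(name) > maxNameLength {
		return fmt.Errorf("name exceeds %d characters", maxNameLength)
	}
	// <name> ::= <hostname>"/"<path>
	host, path, ok := strings.Cut(name, "/")
	if !ok {
		return fmt.Errorf("name %q has no path", name)
	}
	// <hostname> may end in ":"<port>, with the port in 1..65535.
	if h, port, hasPort := strings.Cut(host, ":"); hasPort {
		n, err := strconv.ParseUint(port, 10, 16)
		if err != nil || n == 0 {
			return fmt.Errorf("invalid port %q", port)
		}
		host = h
	}
	for _, part := range strings.Split(host, ".") {
		if !hostPart.MatchString(part) {
			return fmt.Errorf("invalid host part %q", part)
		}
	}
	for _, part := range strings.Split(path, "/") {
		if !pathPart.MatchString(part) {
			return fmt.Errorf("invalid path part %q", part)
		}
	}
	return nil
}

func main() {
	for _, name := range []string{
		"example.com/foo/bar",
		"localhost:5000/hello-world",
		"509films.com/app",
		"xn--n3h.example/app",
	} {
		fmt.Printf("%-28s %v\n", name, ValidateName(name))
	}
}
```

Loosening `hostPart` along the lines suggested in the review would only require changing the one pattern.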

## Metadata
The metadata for a namespace is a list of entries consisting of a scope, action,
and space separated arguments. Each action may interpret the arguments
differently. The scope of an individual entry defines the namespaces to which
the value may apply. It is up to the resolution process to return the set of
metadata which should be applied for a given name; however, any returned values
which are out of scope should be considered invalid.

Contributor

"space separated"

### Actions
#### pull

Used to represent a registry endpoint which supports pull operations. This
may include full registries as well as read-only mirrors.

The arguments for pull consist of a registry endpoint as well as optional
arguments for priority and key=value flags.

`<registry endpoint> [<priority>] [<flag>[=<value>], ...]`

##### Priority

Integer value providing relative sort order between other endpoints with the
same action. Higher priority endpoints should be tried before lower priority
endpoints.
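
As an illustration of the ordering rule, the sketch below sorts endpoints so that higher priorities come first. The `Endpoint` type and `SortByPriority` function are hypothetical, and keeping the listed order for endpoints with equal priority is an assumption; the specification does not say how ties are broken.

```go
package main

import (
	"fmt"
	"sort"
)

// Endpoint is a hypothetical pairing of a registry endpoint and its priority.
type Endpoint struct {
	URL      string
	Priority int
}

// SortByPriority orders endpoints from highest to lowest priority.
// The sort is stable, so equal priorities keep their listed order
// (an assumption, not something the specification requires).
func SortByPriority(eps []Endpoint) {
	sort.SliceStable(eps, func(i, j int) bool {
		return eps[i].Priority > eps[j].Priority
	})
}

func main() {
	eps := []Endpoint{
		{URL: "https://registry.mirror.com/v2/", Priority: 1},
		{URL: "https://registry.example.com/v2/", Priority: 10},
	}
	SortByPriority(eps)
	for _, e := range eps {
		fmt.Println(e.Priority, e.URL)
	}
}
```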

##### Flags

| Key | Value | Default | Description |
|---|---|---|---|
| trim | boolean | false | Whether this registry endpoint expects the hostname to be trimmed from the API requests. This is used for compatibility with existing registries |
| version | string | "2.0" | Which API version the registry implements |
| notag | boolean | false | Whether tag operations are not supported by this registry |
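
The argument format and flag defaults above could be parsed roughly as follows. This is a sketch under several assumptions that the specification leaves open: arguments are separated by whitespace (as in the HTML example later in this document), a bare boolean flag such as `trim` means `true`, the priority defaults to 0 when omitted, and unknown flags are rejected. `PullArgs`, `ParsePullArgs`, and `boolFlag` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PullArgs is a hypothetical holder for the parsed pull (or push) arguments.
type PullArgs struct {
	Endpoint string
	Priority int
	Trim     bool
	Version  string
	NoTag    bool
}

// boolFlag treats a bare flag as true and otherwise parses its value.
func boolFlag(value string, hasValue bool) (bool, error) {
	if !hasValue {
		return true, nil
	}
	return strconv.ParseBool(value)
}

// ParsePullArgs parses "<registry endpoint> [<priority>] [<flag>[=<value>] ...]".
func ParsePullArgs(content string) (PullArgs, error) {
	fields := strings.Fields(content)
	if len(fields) == 0 {
		return PullArgs{}, fmt.Errorf("missing registry endpoint")
	}
	// Start from the defaults listed in the flags table.
	args := PullArgs{Endpoint: fields[0], Version: "2.0"}
	rest := fields[1:]
	// The priority is optional; it is present only if the next field is an integer.
	if len(rest) > 0 {
		if p, err := strconv.Atoi(rest[0]); err == nil {
			args.Priority = p
			rest = rest[1:]
		}
	}
	for _, f := range rest {
		key, value, hasValue := strings.Cut(f, "=")
		var err error
		switch key {
		case "trim":
			args.Trim, err = boolFlag(value, hasValue)
		case "notag":
			args.NoTag, err = boolFlag(value, hasValue)
		case "version":
			if hasValue {
				args.Version = value
			} else {
				err = fmt.Errorf("version requires a value")
			}
		default:
			err = fmt.Errorf("unknown flag %q", key)
		}
		if err != nil {
			return PullArgs{}, err
		}
	}
	return args, nil
}

func main() {
	args, err := ParsePullArgs("https://registry.example.com/v2/ version=2.0 trim")
	fmt.Printf("%+v %v\n", args, err)
}
```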

#### push

Used to represent a registry endpoint which supports push operations. Should
never be defined for read-only mirrors.

Push uses the same arguments as pull.

#### index

Used to represent a search index endpoint.

`<registry endpoint> [version=<value>]`

#### namespace

Used to extend the interface to a parent or stop further namespace processing
by not providing any arguments. When a parent is provided as an argument,
namespace processing should continue by including the resolved values of the
parent. A namespace action without a scope can be used to turn off a
namespace by providing no values except a namespace action. This ends
processing because a value is found, but provides no metadata which can be
used to configure the endpoint.

`[<parent scope>, ...]`

## Discovery
The discovery process involves resolving a name into namespace scoped metadata.
The namespace metadata contains the full set of information needed to fetch and
verify content associated with the namespace repository. The metadata includes
a list of registry API endpoints, the trust model, and the search index. The
discovery process should not be considered secure, so certificates retrieved
as part of the discovery process should be verified before they are trusted.

Discovery can be defined as...
`<fully qualified name> -> scope([<registry API endpoint>, ...], <publisher certificate>, <search index>, ...)`

or in Go as...
```go
// Resolver resolves a fully qualified name into namespace scoped metadata.
type Resolver interface {
	Resolve(name string) Metadata
}
```

### Default Method

The first element of the fully qualified name is extracted and used as the
domain name for resolution. The domain name should be used unmodified.

`<fully qualified name> -> <domain>/<path>`

Contributor

s/namespace/fully qualified name/ ?

Collaborator Author

yes, should be fully qualified name

#### HTTPS
The discovery-related metadata will be fetched via HTTPS from the DNS-resolved
location using the remaining path elements of the name as the HTTP path in a GET
request. A discovery request URL would be in the format
`https://<domain>/<path>?docker-discovery=1`

For example, "example.com/foo/bar" would create a URL
"https://example.com/foo/bar?docker-discovery=1"

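The URL construction described above can be sketched in Go as follows. `DiscoveryURL` is a hypothetical helper; it assumes the name has already been expanded to its fully qualified form and only splits it at the first slash.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// DiscoveryURL builds the HTTPS discovery URL for a fully qualified name.
// The first element is used unmodified as the domain and the remaining
// elements become the HTTP path.
func DiscoveryURL(name string) (string, error) {
	domain, path, ok := strings.Cut(name, "/")
	if !ok || domain == "" || path == "" {
		return "", fmt.Errorf("%q is not a fully qualified name", name)
	}
	u := url.URL{
		Scheme:   "https",
		Host:     domain,
		Path:     "/" + path,
		RawQuery: "docker-discovery=1",
	}
	return u.String(), nil
}

func main() {
	u, err := DiscoveryURL("example.com/foo/bar")
	fmt.Println(u, err)
}
```
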
##### HTML Body
```html
<meta name="docker-scope" content="example.com"><!-- Applies to all metadata -->
<meta name="docker-registry-push" content="https://registry.example.com/v2/ version=2.0 trim">
<meta name="docker-registry" content="https://registry.example.com/v1/ version=1.0">
<meta name="docker-registry-pull" content="https://registry.mirror.com/v2/ version=2.0">
<meta name="docker-registry-pull" content="http://registry.mirror.com/v2/ version=2.0">
<meta name="docker-index" content="https://search.mirror.com/v1/ version=1.0">
```

| Name | Action | Content |
|---|---|---|
| docker-scope | | fully qualified name |
| docker-registry | push+pull | pull arguments |
| docker-registry-pull | pull | pull arguments |
| docker-registry-push | push | push arguments |
| docker-index | index | index arguments |
| docker-namespace | namespace | fully qualified name |
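
A client could extract these entries from the response body along the following lines. This is a deliberately simplified sketch: it uses a regular expression that assumes double-quoted attributes with `name` before `content`, as in the example above, whereas a real client would more likely use an HTML parser. `MetaEntry`, `ParseDiscoveryMeta`, and `actions` are hypothetical names; the `actions` map restates the table.

```go
package main

import (
	"fmt"
	"regexp"
)

// MetaEntry is a hypothetical holder for one discovery <meta> tag.
type MetaEntry struct {
	Name    string
	Content string
}

// metaTag matches <meta name="docker-..." content="..."> with the attributes
// in exactly this order (a simplification, see the lead-in).
var metaTag = regexp.MustCompile(`<meta\s+name="(docker-[a-z-]+)"\s+content="([^"]*)"\s*/?>`)

// actions maps a meta name to the actions it declares, per the table above.
// docker-scope carries no action and is therefore absent.
var actions = map[string][]string{
	"docker-registry":      {"push", "pull"},
	"docker-registry-pull": {"pull"},
	"docker-registry-push": {"push"},
	"docker-index":         {"index"},
	"docker-namespace":     {"namespace"},
}

// ParseDiscoveryMeta returns the docker-* meta entries in document order.
func ParseDiscoveryMeta(body string) []MetaEntry {
	var entries []MetaEntry
	for _, m := range metaTag.FindAllStringSubmatch(body, -1) {
		entries = append(entries, MetaEntry{Name: m[1], Content: m[2]})
	}
	return entries
}

func main() {
	body := `<meta name="docker-scope" content="example.com"><!-- Applies to all metadata -->
<meta name="docker-registry-push" content="https://registry.example.com/v2/ version=2.0 trim">
<meta name="docker-index" content="https://search.mirror.com/v1/ version=1.0">`
	for _, e := range ParseDiscoveryMeta(body) {
		fmt.Println(e.Name, actions[e.Name], e.Content)
	}
}
```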

#### Fallback (Compatibility)
If HTTPS is not implemented for a namespace, a fallback protocol may be used.
The fallback process involves attempting to ping possible registry API
endpoints to determine the set of endpoints and using no trust model. This
should preferably be used only when a namespace is explicitly marked as
insecure.

### Extensibility
A custom method may be used to provide discovery by implementing the
`Resolver` interface.
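
As a sketch of what such a custom method might look like, the example below resolves names from a static, in-memory table. The `Entry` and `Metadata` shapes are assumptions, since the specification names the `Metadata` type without defining its fields, and `StaticResolver` is a hypothetical type. It returns the metadata registered for the longest matching scope, walking up one path element at a time.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry and Metadata are hypothetical shapes for the metadata entries
// (scope, action, arguments) described in the Metadata section.
type Entry struct {
	Scope  string
	Action string
	Args   []string
}

type Metadata []Entry

// Resolver is the interface from the Discovery section.
type Resolver interface {
	Resolve(name string) Metadata
}

// StaticResolver maps a scope to preconfigured metadata.
type StaticResolver map[string]Metadata

// Resolve looks up name, then each parent prefix, and returns the first hit.
// It returns nil when no scope matches.
func (r StaticResolver) Resolve(name string) Metadata {
	scope := name
	for {
		if md, ok := r[scope]; ok {
			return md
		}
		i := strings.LastIndex(scope, "/")
		if i < 0 {
			return nil
		}
		scope = scope[:i]
	}
}

// Compile-time check that StaticResolver satisfies Resolver.
var _ Resolver = StaticResolver{}

func main() {
	r := StaticResolver{
		"example.com": {
			{Scope: "example.com", Action: "pull",
				Args: []string{"https://registry.example.com/v2/", "version=2.0"}},
		},
	}
	fmt.Println(r.Resolve("example.com/foo/bar"))
	fmt.Println(r.Resolve("other.example/foo"))
}
```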

### Scope
The namespace information produced from the discovery process may contain a
scope field. The scope field means the information may apply to any namespace
with a prefix of the given scope. The scope prefix will always be applied with
a path separator. If the scope field is omitted, the information may not be
applied to any other namespace.
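
The scope rule can be expressed as a small predicate. In this sketch `AppliesTo` is a hypothetical function, and treating a name equal to the scope itself as in scope is an assumption; the text above only speaks of prefixes.

```go
package main

import (
	"fmt"
	"strings"
)

// AppliesTo reports whether metadata discovered for resolvedName, carrying
// the given scope, may be applied to name.
func AppliesTo(scope, resolvedName, name string) bool {
	if scope == "" {
		// No scope: the metadata applies only to the name that was resolved.
		return name == resolvedName
	}
	// The prefix is always applied with a path separator, so the scope
	// "example.com/foo" does not cover "example.com/foobar".
	return name == scope || strings.HasPrefix(name, scope+"/")
}

func main() {
	fmt.Println(AppliesTo("example.com/foo", "example.com/foo/bar", "example.com/foo/baz"))
	fmt.Println(AppliesTo("example.com/foo", "example.com/foo/bar", "example.com/foobar"))
	fmt.Println(AppliesTo("", "example.com/foo/bar", "example.com/foo/baz"))
}
```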

### Endpoints
The registry API endpoints each contain a version (may be v1 or v2 registries)
and may either be pull-only mirrors or full registries. The trust model applies
to each registry API endpoint and may not be overridden by an individual
endpoint. The trust model defines the method for verifying the content
retrieved from an endpoint, not the method for authentication or authorization.
Each registry API endpoint is responsible for specifying its authentication or
authorization method.

It may also be possible in the future to extend these endpoints to support
direct downloads of tarballs, such as the result from a `docker save`.

## Name Expansion
Before a name can be resolved, it must be expanded to its fully qualified form.
This may mean adding to the resource path as well as inserting a default
domain. The rules for expansion must be determined by the client.

Expansion can be defined as...
`<name> -> <fully qualified name>`

### Compatibility
Current Docker clients have a default expansion which must remain backwards
compatible from a user perspective. Docker clients expand short names containing
no slashes as "docker.io/library/{name}" and all other short names as
"docker.io/{name}". Current tooling built around Docker expects to use the
Docker hub registry for all short names.
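
The default expansion described above can be sketched as follows. `Expand` is a hypothetical function. The specification does not say how a client decides whether the first element is a domain; the check used here (the element contains a `.` or `:`, or is `localhost`) is an assumption and should be read as one possible client rule.

```go
package main

import (
	"fmt"
	"strings"
)

// Expand turns a possibly short name into a fully qualified name using the
// default Docker expansion rules described above.
func Expand(name string) string {
	first, _, hasSlash := strings.Cut(name, "/")
	if !hasSlash {
		// Short name with no slashes, such as "ubuntu".
		return "docker.io/library/" + name
	}
	// Assumption: the first element is a domain if it looks like a host name.
	if strings.ContainsAny(first, ".:") || first == "localhost" {
		return name
	}
	// Any other short name, such as "user/app".
	return "docker.io/" + name
}

func main() {
	for _, n := range []string{"ubuntu", "user/app", "example.com/foo/bar"} {
		fmt.Println(n, "->", Expand(n))
	}
}
```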