Merge 'origin/master' into 3636.doc-toc-reorg
sajith committed Apr 13, 2021
2 parents 3d9ccde + ff0f00f commit bcdb2e2
Showing 56 changed files with 607 additions and 634 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -20,8 +20,10 @@ jobs:
os:
- macos-latest
- windows-latest
- ubuntu-latest
python-version:
- 2.7
- 3.6

steps:

1 change: 1 addition & 0 deletions docs/index.rst
@@ -65,6 +65,7 @@ preserving your privacy and security.
contributing
CODE_OF_CONDUCT
developer-guide
ticket-triage
release-checklist
desert-island

55 changes: 55 additions & 0 deletions docs/proposed/http-storage-node-protocol.rst
@@ -13,6 +13,61 @@ Specifically, it should be possible to implement a Tahoe-LAFS storage server wit
The Tahoe-LAFS client will also need to change, but it is not expected that it will be noticeably simplified by this change
(though this may be the first step towards simplifying it).

Motivation
----------

Foolscap
~~~~~~~~

Foolscap is a remote method invocation protocol with several distinctive features.
At its core it allows separate processes to refer to each other's objects and methods using a capability-based model.
This allows for extremely fine-grained access control in a system that remains highly securable without becoming overwhelmingly complicated.
Supporting this is a flexible and extensible serialization system which allows data to be exchanged between processes in carefully controlled ways.

Tahoe-LAFS avails itself of only a small portion of these features.
A Tahoe-LAFS storage server typically only exposes one object with a fixed set of methods to clients.
A Tahoe-LAFS introducer node does roughly the same.
Tahoe-LAFS exchanges simple data structures that have many common, standard serialized representations.

In exchange for this slight use of Foolscap's sophisticated mechanisms,
Tahoe-LAFS pays a substantial price:

* Foolscap is implemented only for Python.
Tahoe-LAFS is thus limited to being implemented only in Python.
* There is only one Python implementation of Foolscap.
The implementation is therefore the de facto standard and understanding of the protocol often relies on understanding that implementation.
* The Foolscap developer community is very small.
The implementation therefore advances very little and some non-trivial part of the maintenance cost falls on the Tahoe-LAFS project.
* The extensible serialization system imposes substantial complexity compared to the simple data structures Tahoe-LAFS actually exchanges.

HTTP
~~~~

HTTP is a request/response protocol that has become the lingua franca of the internet.
Combined with the principles of Representational State Transfer (REST) it is widely employed to create, update, and delete data in collections on the internet.
HTTP itself provides only modest functionality in comparison to Foolscap.
However its simplicity and widespread use have led to a diverse and almost overwhelming ecosystem of libraries, frameworks, toolkits, and so on.

By adopting HTTP in place of Foolscap, Tahoe-LAFS can realize the following concrete benefits:

* Practically every language or runtime has an HTTP protocol implementation (or a dozen of them) available.
This change paves the way for new Tahoe-LAFS implementations using tools better suited for certain situations
(mobile client implementations, high-performance server implementations, easily distributed desktop clients, etc.).
* The simplicity of HTTP and the vast quantity of resources about it make it a very easy protocol to learn and use.
This change reduces the barrier to entry for developers to contribute improvements to Tahoe-LAFS's network interactions.
* For any given language there is very likely an HTTP implementation with a large and active developer community.
Tahoe-LAFS can therefore benefit from the large effort being put into making better libraries for using HTTP.
* One of the core features of HTTP is the mundane transfer of bulk data, and implementations are often capable of doing this with extreme efficiency.
Since bulk data transfer is also a core activity of Tahoe-LAFS, this alignment eliminates a substantial barrier to improved Tahoe-LAFS runtime performance.

TLS
~~~

The Foolscap-based protocol provides *some* of Tahoe-LAFS's confidentiality, integrity, and authentication properties by leveraging TLS.
An HTTP-based protocol can make use of TLS in largely the same way to provide the same properties.
Provision of these properties *is* dependent on implementers following Great Black Swamp's rules for X.509 certificate validation
(rather than the standard "web" rules for validation).
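As an illustration only (a minimal sketch, and an assumption rather than the normative specification), validation of this kind typically reduces to comparing a digest of the presented certificate's SubjectPublicKeyInfo against an expected value the client already holds, instead of walking a web-PKI chain::

    # Hypothetical helper -- the hash-pinning scheme and the names here are
    # illustrative assumptions, not the Great Black Swamp specification.
    from hashlib import sha256

    from cryptography import x509
    from cryptography.hazmat.primitives import serialization

    def certificate_matches_expected_spki_hash(cert_pem, expected_spki_hash):
        """
        Return True if the certificate's public key (SubjectPublicKeyInfo)
        hashes to the expected value; chain and hostname checks are skipped.
        """
        cert = x509.load_pem_x509_certificate(cert_pem)
        spki = cert.public_key().public_bytes(
            serialization.Encoding.DER,
            serialization.PublicFormat.SubjectPublicKeyInfo,
        )
        return sha256(spki).digest() == expected_spki_hash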

Requirements
------------

27 changes: 27 additions & 0 deletions docs/ticket-triage.rst
@@ -0,0 +1,27 @@
=============
Ticket Triage
=============

Ticket triage is a weekly, informal ritual meant to solve the problem of tickets
being opened and then forgotten about. It is simple, keeps project momentum
going, and prevents ticket cruft.

It fosters conversation around project tasks and philosophies as they relate to
milestones.

Process
-------
- The role of Ticket Triager rotates regularly-ish, and is assigned ad hoc
- The Triager needs a ``Trac`` account
- The Triager looks at all the tickets that have been created in the last week (or month, etc.)
  - They can use a custom query or do this as the week progresses
  - BONUS ROUND: Dig up a stale ticket from the past
- Assign each ticket to a milestone on the Roadmap
- The following situations merit discussion:
  - A ticket doesn't have an appropriate milestone and we should create one
  - A ticket, in vanishingly rare circumstances, should be deleted
    - The ticket is spam
    - The ticket contains sensitive information and harm will come to one or more people if it continues to be distributed
  - A ticket could be assigned to multiple milestones
  - There is another question about a ticket
- These tickets will be brought as necessary to one of our meetings (currently Tuesdays) for discussion
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Empty file added newsfragments/3616.minor
Empty file.
Empty file added newsfragments/3645.minor
Empty file.
1 change: 1 addition & 0 deletions newsfragments/3651.minor
@@ -0,0 +1 @@
We added documentation detailing the project's ticket triage process.
Empty file added newsfragments/3657.minor
Empty file.
Empty file.
1 change: 1 addition & 0 deletions newsfragments/3666.documentation
@@ -0,0 +1 @@
`tox -e docs` will treat warnings about docs as errors.
Empty file added newsfragments/3667.minor
Empty file.
Empty file added newsfragments/3669.minor
Empty file.
Empty file added newsfragments/3670.minor
Empty file.
Empty file added newsfragments/3671.minor
Empty file.
Empty file added newsfragments/3674.minor
Empty file.
3 changes: 2 additions & 1 deletion setup.py
@@ -11,6 +11,7 @@
# See the docs/about.rst file for licensing information.

import os, subprocess, re
from io import open

basedir = os.path.dirname(os.path.abspath(__file__))

@@ -357,7 +358,7 @@ def run(self):

setup(name="tahoe-lafs", # also set in __init__.py
description='secure, decentralized, fault-tolerant file store',
long_description=open('README.rst', 'rU').read(),
long_description=open('README.rst', 'r', encoding='utf-8').read(),
author='the Tahoe-LAFS project',
author_email='tahoe-dev@tahoe-lafs.org',
url='https://tahoe-lafs.org/',
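An aside on the ``long_description`` change above (illustrative, not part of the changeset): the deprecated ``'rU'`` universal-newlines mode is dropped in favour of an explicit encoding, and importing ``open`` from ``io`` makes the ``encoding`` parameter available on Python 2 as well (on Python 3 it is the builtin ``open``):

    # Sketch: read a UTF-8 text file identically on Python 2 and 3.
    from io import open  # on Python 3, io.open is the builtin open

    with open("README.rst", "r", encoding="utf-8") as f:  # assumes README.rst exists
        long_description = f.read()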
8 changes: 5 additions & 3 deletions src/allmydata/introducer/common.py
@@ -11,9 +11,11 @@
from future.builtins import filter, map, zip, ascii, chr, hex, input, next, oct, open, pow, round, super, bytes, dict, list, object, range, str, max, min # noqa: F401

import re

from foolscap.furl import decode_furl
from allmydata.crypto.util import remove_prefix
from allmydata.crypto import ed25519
from allmydata.util import base32, rrefutil, jsonbytes as json
from allmydata.util import base32, jsonbytes as json


def get_tubid_string_from_ann(ann):
@@ -123,10 +125,10 @@ def __init__(self, when, index, canary, ann_d):
self.service_name = ann_d["service-name"]
self.version = ann_d.get("my-version", "")
self.nickname = ann_d.get("nickname", u"")
(service_name, key_s) = index
(_, key_s) = index
self.serverid = key_s
furl = ann_d.get("anonymous-storage-FURL")
if furl:
self.connection_hints = rrefutil.connection_hints_for_furl(furl)
_, self.connection_hints, _ = decode_furl(furl)
else:
self.connection_hints = []
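A brief sketch (not from the changeset) of the three-way unpacking used above: ``decode_furl`` returns the tub ID, the list of connection hints, and the object name, so the middle element replaces what ``rrefutil.connection_hints_for_furl`` used to compute. The fURL below is a made-up example:

    # Illustration only; the fURL is fabricated but well-formed.
    from foolscap.furl import decode_furl

    tubid, connection_hints, name = decode_furl(
        "pb://hckqqn4vq5ggzuukfztpuu4wykwefa6d@tcp:storage.example:3457/introducer"
    )
    # decode_furl returns (tub ID, connection hint strings, object name);
    # connection_hints is the piece the announcement descriptor above keeps.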
14 changes: 12 additions & 2 deletions src/allmydata/introducer/server.py
@@ -24,11 +24,12 @@
from zope.interface import implementer
from twisted.application import service
from twisted.internet import defer
from twisted.internet.address import IPv4Address
from twisted.python.failure import Failure
from foolscap.api import Referenceable
import allmydata
from allmydata import node
from allmydata.util import log, rrefutil, dictutil
from allmydata.util import log, dictutil
from allmydata.util.i2p_provider import create as create_i2p_provider
from allmydata.util.tor_provider import create as create_tor_provider
from allmydata.introducer.interfaces import \
@@ -148,6 +149,15 @@ def init_web(self, webport):
ws = IntroducerWebishServer(self, webport, nodeurl_path, staticdir)
ws.setServiceParent(self)


def stringify_remote_address(rref):
remote = rref.getPeer()
if isinstance(remote, IPv4Address):
return "%s:%d" % (remote.host, remote.port)
# loopback is a non-IPv4Address
return str(remote)


@implementer(RIIntroducerPublisherAndSubscriberService_v2)
class IntroducerService(service.MultiService, Referenceable):
name = "introducer"
@@ -216,7 +226,7 @@ def get_subscribers(self):
# tubid will be None. Also, subscribers do not tell us which
# pubkey they use; only publishers do that.
tubid = rref.getRemoteTubID() or "?"
remote_address = rrefutil.stringify_remote_address(rref)
remote_address = stringify_remote_address(rref)
# these three assume subscriber_info["version"]==0, but
# should tolerate other versions
nickname = subscriber_info.get("nickname", u"?")
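A usage sketch (illustrative, not from the changeset) for the module-level ``stringify_remote_address`` helper added above: an ``IPv4Address`` peer is rendered as ``host:port``, and any other address type falls back to ``str``:

    # Illustration only: exercise the helper with a fake remote reference.
    from twisted.internet.address import IPv4Address

    def stringify_remote_address(rref):
        # Same logic as the helper added in this diff.
        remote = rref.getPeer()
        if isinstance(remote, IPv4Address):
            return "%s:%d" % (remote.host, remote.port)
        # loopback (and other address types) is a non-IPv4Address
        return str(remote)

    class FakeRemoteReference(object):
        """Minimal stand-in for a foolscap remote reference."""
        def __init__(self, peer):
            self._peer = peer
        def getPeer(self):
            return self._peer

    print(stringify_remote_address(
        FakeRemoteReference(IPv4Address("TCP", "198.51.100.7", 3457))))
    # -> 198.51.100.7:3457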
2 changes: 1 addition & 1 deletion src/allmydata/scripts/cli.py
@@ -351,7 +351,7 @@ def opt_exclude_from(self, filepath):
line. The file is assumed to be in the argv encoding."""
abs_filepath = argv_to_abspath(filepath)
try:
exclude_file = file(abs_filepath)
exclude_file = open(abs_filepath)
except:
raise BackupConfigurationError('Error opening exclude file %s.' % quote_local_unicode_path(abs_filepath))
try:
30 changes: 20 additions & 10 deletions src/allmydata/scripts/common.py
@@ -1,7 +1,6 @@
# coding: utf-8

from __future__ import print_function
from six import ensure_str

import os, sys, textwrap
import codecs
@@ -22,11 +21,13 @@
from future.utils import PY2
if PY2:
from future.builtins import str # noqa: F401
else:
from typing import Union

from twisted.python import usage

from allmydata.util.assertutil import precondition
from allmydata.util.encodingutil import unicode_to_url, quote_output, \
from allmydata.util.encodingutil import quote_output, \
quote_local_unicode_path, argv_to_abspath
from allmydata.scripts.default_nodedir import _default_nodedir

@@ -274,18 +275,27 @@ def get_alias(aliases, path_unicode, default):
return uri.from_string_dirnode(aliases[alias]).to_string(), path[colon+1:]

def escape_path(path):
# type: (str) -> str
# type: (Union[str,bytes]) -> str
u"""
Return path quoted to US-ASCII, valid URL characters.
>>> path = u'/føö/bar/☃'
>>> escaped = escape_path(path)
>>> str(escaped)
'/f%C3%B8%C3%B6/bar/%E2%98%83'
>>> escaped.encode('ascii').decode('ascii') == escaped
True
>>> escaped
u'/f%C3%B8%C3%B6/bar/%E2%98%83'
"""
segments = path.split("/")
result = "/".join([urllib.parse.quote(unicode_to_url(s)) for s in segments])
result = ensure_str(result, "ascii")
if isinstance(path, str):
path = path.encode("utf-8")
segments = path.split(b"/")
result = str(
b"/".join([
urllib.parse.quote(s).encode("ascii") for s in segments
]),
"ascii"
)
# Eventually (i.e. as part of Python 3 port) we want this to always return
# Unicode strings. However, to reduce diff sizes in the short term it'll
# return native string (i.e. bytes) on Python 2.
if PY2:
result = result.encode("ascii").__native__()
return result
3 changes: 2 additions & 1 deletion src/allmydata/scripts/create_node.py
@@ -449,12 +449,13 @@ def create_node(config):
v = remote_config.get(k, None)
if v is not None:
# we're faking usually argv-supplied options :/
v_orig = v
if isinstance(v, str):
v = v.encode(get_io_encoding())
config[k] = v
if k not in sensitive_keys:
if k not in ['shares-happy', 'shares-total', 'shares-needed']:
print(" {}: {}".format(k, v), file=out)
print(" {}: {}".format(k, v_orig), file=out)
else:
print(" {}: [sensitive data; see tahoe.cfg]".format(k), file=out)

24 changes: 13 additions & 11 deletions src/allmydata/scripts/tahoe_backup.py
@@ -1,14 +1,16 @@
from __future__ import print_function

from past.builtins import unicode

import os.path
import time
import urllib
import json
from urllib.parse import quote as url_quote
import datetime

from allmydata.scripts.common import get_alias, escape_path, DEFAULT_ALIAS, \
UnknownAliasError
from allmydata.scripts.common_http import do_http, HTTPError, format_http_error
from allmydata.util import time_format
from allmydata.util import time_format, jsonbytes as json
from allmydata.scripts import backupdb
from allmydata.util.encodingutil import listdir_unicode, quote_output, \
quote_local_unicode_path, to_bytes, FilenameEncodingError, unicode_to_url
@@ -52,7 +54,7 @@ def mkdir(contents, options):

def put_child(dirurl, childname, childcap):
assert dirurl[-1] != "/"
url = dirurl + "/" + urllib.quote(unicode_to_url(childname)) + "?t=uri"
url = dirurl + "/" + url_quote(unicode_to_url(childname)) + "?t=uri"
resp = do_http("PUT", url, childcap)
if resp.status not in (200, 201):
raise HTTPError("Error during put_child", resp)
@@ -97,7 +99,7 @@ def run(self):
except UnknownAliasError as e:
e.display(stderr)
return 1
to_url = nodeurl + "uri/%s/" % urllib.quote(rootcap)
to_url = nodeurl + "uri/%s/" % url_quote(rootcap)
if path:
to_url += escape_path(path)
if not to_url.endswith("/"):
@@ -165,7 +167,7 @@ def upload_directory(self, path, compare_contents, create_contents):
if must_create:
self.verboseprint(" creating directory for %s" % quote_local_unicode_path(path))
newdircap = mkdir(create_contents, self.options)
assert isinstance(newdircap, str)
assert isinstance(newdircap, bytes)
if r:
r.did_create(newdircap)
return True, newdircap
@@ -192,7 +194,7 @@ def check_backupdb_file(self, childpath):
filecap = r.was_uploaded()
self.verboseprint("checking %s" % quote_output(filecap))
nodeurl = self.options['node-url']
checkurl = nodeurl + "uri/%s?t=check&output=JSON" % urllib.quote(filecap)
checkurl = nodeurl + "uri/%s?t=check&output=JSON" % url_quote(filecap)
self._files_checked += 1
resp = do_http("POST", checkurl)
if resp.status != 200:
@@ -225,7 +227,7 @@ def check_backupdb_directory(self, compare_contents):
dircap = r.was_created()
self.verboseprint("checking %s" % quote_output(dircap))
nodeurl = self.options['node-url']
checkurl = nodeurl + "uri/%s?t=check&output=JSON" % urllib.quote(dircap)
checkurl = nodeurl + "uri/%s?t=check&output=JSON" % url_quote(dircap)
self._directories_checked += 1
resp = do_http("POST", checkurl)
if resp.status != 200:
@@ -345,7 +347,7 @@ def backup(self, progress, upload_file, upload_directory):
target = PermissionDeniedTarget(self._path, isdir=False)
return target.backup(progress, upload_file, upload_directory)
else:
assert isinstance(childcap, str)
assert isinstance(childcap, bytes)
if created:
return progress.created_file(self._path, childcap, metadata)
return progress.reused_file(self._path, childcap, metadata)
@@ -525,12 +527,12 @@ def consume_directory(self, dirpath):
return self, {
os.path.basename(create_path): create_value
for (create_path, create_value)
in self._create_contents.iteritems()
in self._create_contents.items()
if os.path.dirname(create_path) == dirpath
}, {
os.path.basename(compare_path): compare_value
for (compare_path, compare_value)
in self._compare_contents.iteritems()
in self._compare_contents.items()
if os.path.dirname(compare_path) == dirpath
}

8 changes: 6 additions & 2 deletions src/allmydata/scripts/tahoe_get.py
@@ -1,6 +1,6 @@
from __future__ import print_function

import urllib
from urllib.parse import quote as url_quote
from allmydata.scripts.common import get_alias, DEFAULT_ALIAS, escape_path, \
UnknownAliasError
from allmydata.scripts.common_http import do_http, format_http_error
@@ -20,7 +20,7 @@ def get(options):
except UnknownAliasError as e:
e.display(stderr)
return 1
url = nodeurl + "uri/%s" % urllib.quote(rootcap)
url = nodeurl + "uri/%s" % url_quote(rootcap)
if path:
url += "/" + escape_path(path)

@@ -30,6 +30,10 @@ def get(options):
outf = open(to_file, "wb")
else:
outf = stdout
# Make sure we can write bytes; on Python 3 stdout is Unicode by
# default.
if getattr(outf, "encoding", None) is not None:
outf = outf.buffer
while True:
data = resp.read(4096)
if not data:
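A short aside (illustrative, not from the changeset) on the ``outf.buffer`` dance above: on Python 3, ``sys.stdout`` is a text stream and rejects ``bytes``, so the raw response body has to be written to its underlying binary buffer; a file opened with ``"wb"`` has no ``encoding`` attribute and is used as-is:

    # Sketch: pick a binary-capable stream before writing raw bytes.
    import sys

    outf = sys.stdout
    if getattr(outf, "encoding", None) is not None:
        # Text streams expose .encoding; their .buffer accepts bytes.
        outf = outf.buffer
    outf.write(b"raw bytes from the HTTP response\n")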