Leberwurscht edited this page Dec 7, 2010 · 26 revisions

Planning

Preliminary project name

sduds for "synchronising diaspora user directory server"

Requirements for webfinger profiles

  • a name field must be set
  • a hometown field can be set
  • a "sduds" field must be set, indicating that the profile should be added to the directory. The value is a public key or an encrypted password, which can be used to remove the entry even if the user no longer controls his webfinger address.

API features

  • webfinger address submission
  • return matching webfinger addresses for given name/hometown
  • delete entry

actions triggered on API requests

  • webfinger address submission
    • Retrieve webfinger information. If the "sduds" field is not set, remove the database entry with this webfinger address if one exists.
    • If the "sduds" field is set, and there's no database entry for the corresponding webfinger address:
      • save (full name, hometown, webfinger address, sduds, current date) to database
    • If the "sduds" field is set, and there's already a database entry present:
      • update database entry to (full name, hometown, webfinger address, sduds, current date)
  • return matching webfinger addresses for given name/hometown
    • search database, return webfinger address
  • delete entry
    • NOTE: this is only necessary if the user no longer controls his webfinger address. If he still does, he can clear the "sduds" field in his webfinger profile and resubmit his webfinger address.
    • check if user is allowed to delete entry using the "sduds" field. If he is, delete the entry.
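
The submission rules above can be sketched in a few lines. This is only an illustration under assumptions: the database is a plain in-memory dict, and `handle_submission` and the `fetch_profile` callback are hypothetical names standing in for real storage and a real webfinger lookup.

```python
from datetime import date
from typing import Callable, Optional

# Hypothetical in-memory stand-in for the directory database,
# keyed by webfinger address.
Database = dict[str, dict]


def handle_submission(
    db: Database,
    address: str,
    fetch_profile: Callable[[str], Optional[dict]],
    today: date,
) -> str:
    """Apply the submission rules; return 'added', 'updated', 'removed' or 'ignored'."""
    profile = fetch_profile(address) or {}

    # No "sduds" field: the user does not want a directory entry,
    # so drop an existing entry if there is one.
    if not profile.get("sduds"):
        return "removed" if db.pop(address, None) is not None else "ignored"

    # "sduds" is set: insert a new row or overwrite the existing one.
    action = "updated" if address in db else "added"
    db[address] = {
        "full_name": profile.get("name"),
        "hometown": profile.get("hometown"),
        "sduds": profile["sduds"],
        "date": today,
    }
    return action
```

Because the entry is always rebuilt from the freshly retrieved profile, adding and updating share one code path; only the returned label differs.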

security from an API point of view

  • Problem: users can add/resubmit arbitrarily many entries (DoS)
    • Solution: Perhaps limit number of resubmissions per day for one webfinger address and use a CAPTCHA for adding new entries.
  • Problem: users can add entries with false names/hometowns
    • Solution: None, we need to accept this
  • Problem: users should only be allowed to add/alter entries for webfinger addresses they control
    • Solution: already solved
  • Problem: users should only be allowed to delete entries for webfinger addresses they control or when they are verified using the "sduds" field
    • Solution: already solved
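
A per-address daily limit on resubmissions, as suggested for the DoS problem above, might look roughly like this. The class name `ResubmissionLimiter` and the limit of 3 per day are assumptions made for illustration; the page does not fix a value.

```python
from collections import defaultdict
from datetime import date

# Assumed value; the planning notes do not specify a number.
MAX_RESUBMISSIONS_PER_DAY = 3


class ResubmissionLimiter:
    """Counts submissions per (webfinger address, day) and rejects the excess."""

    def __init__(self, limit: int = MAX_RESUBMISSIONS_PER_DAY) -> None:
        self.limit = limit
        # Note: counters for past days are never purged in this sketch.
        self._counts: dict[tuple[str, date], int] = defaultdict(int)

    def allow(self, address: str, today: date) -> bool:
        key = (address, today)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

This only covers resubmissions of known addresses; new entries would still need the CAPTCHA mentioned above.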

Remarks

  • if a user loses both his webfinger address and his private key, he loses control over his directory entry.
  • as long as a user controls his webfinger address, he is able to alter and delete the corresponding entry for this address at any time
  • if a user's webfinger address changes, he can simply delete and resubmit his entry.

security from a server-to-server point of view

There is synchronisation between servers, so if one server does not do its work faithfully, it will compromise the whole network. There are two approaches:

  • For every change made to the database, each server must prove that it was legal. Have to think about this. Especially: How could a server prove it has not invented the information but retrieved it from some legal webfinger address? Seems impossible.
  • Make servers take control samples of each other and maintain a web of trust between servers. Badly behaving servers can be kicked out.

This needs some more thinking.

  • How can one server verify that other servers ask for a CAPTCHA properly? One can perhaps require a signature of a CAPTCHA provider for every submission, and check this signature with a public key of this CAPTCHA provider. Problems: Is there any CAPTCHA provider which offers signatures? We also need to prevent simple reuse of signatures. And: We want a decentralised system, so it's not a good solution to depend on one CAPTCHA provider. Can we build our own distributed CAPTCHA provider?
  • How can one server check that the webfinger profile was read out properly by the other servers, without modifications? Making other servers simply re-read the webfinger profile should be the solution. One thing to consider is that a bad user could modify his webfinger profile, request the server to update the entry, and change the profile again subsequently. If he does this a few times, he can get the server kicked out of the directory network.

Ideas:

  • To prevent reuse of signatures, one can send the webfinger address to the CAPTCHA provider and make it sign it only if the CAPTCHA is solved, so we have a guarantee that one CAPTCHA was solved per webfinger address.
  • Prevent users from getting servers kicked: Only count non-concurrence of webfinger profiles once per webfinger address. Now, if someone wants to get the server kicked, he has to solve a lot of CAPTCHAs first.
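
Counting a mismatch only once per webfinger address can be sketched as follows. The names `ViolationList`, `max_violations` and `window_seconds` are hypothetical, and the threshold and time window are assumptions that the notes leave open.

```python
class ViolationList:
    """Tracks control-sample mismatches for one partner server."""

    def __init__(self, max_violations: int, window_seconds: float) -> None:
        self.max_violations = max_violations
        self.window_seconds = window_seconds
        # webfinger address -> time of the first recorded mismatch
        self._first_seen: dict[str, float] = {}

    def record(self, address: str, now: float) -> None:
        # setdefault keeps the first timestamp, so repeated mismatches
        # for the same address are not counted again.
        self._first_seen.setdefault(address, now)

    def should_stop_sync(self, now: float) -> bool:
        recent = sum(
            1 for seen in self._first_seen.values()
            if now - seen <= self.window_seconds
        )
        return recent > self.max_violations
```

With this structure, a single user flipping his profile back and forth contributes at most one violation, so getting a server kicked requires many distinct addresses.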

Problems remaining now:

  • How to get an appropriate CAPTCHA provider? Can this be done in some decentralised manner?
  • How much impact can many collaborating bad servers have?

Reformulation

  • Being pragmatic, we must depend on one CAPTCHA provider to prevent spam. This is not decentralised, but it's not so bad. Some trusted party (like Wikimedia Foundation for Wikipedia) can offer this service. In case of need we can make the directory servers change to another CAPTCHA provider.
  • Database columns:
    • webfinger address
    • full name
    • hometown
    • timestamp for expiration
    • captcha provider's signature for the webfinger address
    • hash built out of columns webfinger address, full name, hometown, timestamp
  • Each database operation must either be verifiable by control samples or have a proof of legitimacy attached. The two operations are:
    • Add entry
      • require a captcha provider's signature as proof that some human really wants to have his webfinger address submitted
      • If timestamp is very old, full name and hometown fields must be retrieved from the webfinger profile. If timestamp is recent, we only need to take control samples. If the results differ, we can act on the assumption that the server that transmitted the entry is bad.
    • Delete entry
      • This is a verifiable type of operation: A deletion is valid if the webfinger profile information differs from the one stored in the database, or the webfinger profile no longer indicates that a directory entry is wanted, or the timestamp is very old (expiration). Treat this operation as verified if there is also an add entry operation for this webfinger address with a more recent timestamp.
      • ( This operation can also have a proof of legitimacy (see the description above concerning the sduds field). But after some thinking about it, this does not seem necessary, and the additional sduds column would take disk space, so it's better without it. )
  • Every server maintains a violations list for every server it synchronises with, containing the webfinger addresses for which the control samples differed and a timestamp. If some server has too many violations per unit of time, synchronisation with it is to be stopped. (We can't exclude the server immediately after one violation, as violations can also occur when a user changes his webfinger profile too fast.)
  • Updating a database entry means deleting the old one and adding a new one, preserving the captcha provider's signature.
  • If a server knew a hash once, it may not simply delete it. It must discard the database entry, but add the hash to an internal removed hashes list.
  • The synchronisation process will work like this:
    • Server A connects to server B and asks him to synchronise.
    • Server B checks whether A is in his partners list; if not, refuse connection.
    • Apply a reconciliation algorithm to the two sets, the hashes of A and the hashes of B; this will yield the hashes A knows but B doesn't (A\B) and the other way round (B\A).
    • A checks if B\A contains removed hashes and tells B these hashes. So does B for A\B.
    • A sends database entries belonging to hashes in A\B to B, but excludes the removed hashes B told A in the step before. So does B.
    • Both servers now have a list of entries to be deleted and to be added. They verify the retrieved information by checking the proofs of legitimacy and by taking control samples. If there are entries that are too outdated or proofs of legitimacy that are invalid, or if the violations list grows too big, notify the other server, discard the data and remove that server from the partners list. Otherwise, correct the entries for which verification failed and apply the changes to the database.
  • Do the following on webfinger address submission:
    • If this address was updated too recently, ignore request to prevent DoS
    • Retrieve webfinger information
    • Check whether the webfinger profile is valid and the sduds field is set. If not, remove the database entry for this webfinger address if there is one, and stop processing the request.
    • If a database entry for this webfinger address doesn't exist:
      • make sure the request contains a valid captcha signature
      • add database row with information from webfinger profile
    • If a database entry for this webfinger address exists:
      • delete existing database entry and add a new one with the new information and the timestamp set to current date; keep the captcha provider's signature.
  • Run regularly:
    • get all database entries older than a certain limit (this limit must be common for all directory servers) and delete them.
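
The hash column and the synchronisation steps can be illustrated with a small in-memory model. This is a sketch under several assumptions: SHA-256 is chosen arbitrarily as the hash, plain set difference stands in for a real reconciliation algorithm, and the verification step (proofs of legitimacy, control samples) is left out entirely. `Server`, `entry_hash` and `synchronise` are hypothetical names.

```python
import hashlib


def entry_hash(address: str, full_name: str, hometown: str, timestamp: int) -> str:
    # A NUL separator keeps adjacent fields from running into each other.
    data = "\x00".join([address, full_name, hometown, str(timestamp)])
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


class Server:
    def __init__(self) -> None:
        self.entries: dict[str, tuple] = {}  # hash -> database entry
        self.removed: set[str] = set()       # the "removed hashes" list

    def add(self, address: str, full_name: str, hometown: str, timestamp: int) -> str:
        h = entry_hash(address, full_name, hometown, timestamp)
        self.entries[h] = (address, full_name, hometown, timestamp)
        return h

    def delete(self, h: str) -> None:
        # Discard the entry but remember that its hash was once known.
        self.entries.pop(h, None)
        self.removed.add(h)


def synchronise(a: Server, b: Server) -> None:
    a_only = set(a.entries) - set(b.entries)  # A\B
    b_only = set(b.entries) - set(a.entries)  # B\A

    # Each side reports which of the partner's hashes it has already removed.
    removed_by_a = b_only & a.removed
    removed_by_b = a_only & b.removed

    # Entries to transmit, excluding hashes the receiver has removed.
    to_b = {h: a.entries[h] for h in a_only - removed_by_b}
    to_a = {h: b.entries[h] for h in b_only - removed_by_a}

    # Verification would happen here before anything is applied.
    for h in removed_by_b:
        a.delete(h)
    for h in removed_by_a:
        b.delete(h)
    a.entries.update(to_a)
    b.entries.update(to_b)
```

Keeping the removed hashes is what stops a deleted entry from being reintroduced by a partner that still holds it.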

Problem checklist

  • The user should have control over his entry under all circumstances
    • As long as he has control over his webfinger address, he can resubmit his address; this will make the directory server update/delete his entry as wanted.
  • The system needs to be secure against manipulation/deletion of entries
    • The information is only retrieved from the webfinger address, so one would have to manipulate the webfinger profile itself to manipulate an entry from the outside. Same holds true for deletion.
    • Bad directory servers can claim that they have found a webfinger profile to be deleted/changed and send a manipulated entry to other servers when synchronising. If a directory server does this too often, the control samples taken by other servers will finally prove it guilty and the connection will be cut. This will only help against massive manipulation, not against manipulating single entries. So it is still important that directory server operators choose the servers to synchronise with carefully.
  • The system needs to be secure against spamming
    • Spamming by manipulating entries is covered above
    • Spamming by adding new entries will be reduced by the CAPTCHAs.
  • The system needs to be secure against DoS (e.g. tricking servers into kicking other servers out of their partners list)
    • Bad users can submit webfinger addresses and change them immediately afterwards, so the control samples will differ from the retrieved information. However, they need to solve a CAPTCHA for every webfinger address, which makes this more difficult.
    • A bad user can make a server update his entry a lot of times and change his profile immediately afterwards every time. This will not get the server kicked because violations are only counted once per webfinger address.
    • Bad directory servers can try to get another directory server kicked by its partners by submitting invalid entries. It is however more probable that the bad server gets kicked by the server he wants to manipulate.
  • Old entries must be deleted automatically
    • This is possible using the timestamp column in the database.
  • Servers must not break down trying to take control samples when many entries are coming in from other servers.
  • If there is a long chain of servers synchronising with each other, and the server at one end gets a key, the timestamp may be too old when it arrives at the other end of the chain.
    • We can't simply make all servers accept arbitrarily old timestamps, because taking control samples only makes sense for up-to-date data. When there are old timestamps, either the receiver or the sender must verify that the information has not changed since it was acquired. If it has not changed, however, the timestamp in the database should not be updated (otherwise, the hash will change, and the entry will be sent back).
    • Perhaps it makes sense to let servers send entries "with reservation", meaning that the entry is so old that the server cannot be sure whether it is still valid. In fact, it would make little sense not to offer this feature, as it is nearly equivalent to one server using the webfinger address submission feature of the other server. One important difference, however, is that no new hash should be created, as this could result in dangling hashes.
    • If one server receives an item "with reservation", it must look up the webfinger profile and check if it is still valid. If it is, it must somehow tell its partners that this is the case. We can't simply update the timestamp, because the hash will change and the information will run the chain back (It's necessary that the hash depends on the timestamp, otherwise a user will resubmit his address to keep it from expiring, but this information will not be transmitted to other servers).
      • We could make two timestamp fields; one "user submission" timestamp on which the hash depends and from which the expiration date is computed, and one "checked by a server" timestamp on which the hash doesn't depend. This will make the database bigger (a timestamp takes 4 bytes).
      • Another solution would be to make the hash independent of the timestamp and require the user to change some other webfinger information from time to time if he does not want it to expire (but which field? If we take an extra field for this in the database, it will also require space). Will this approach have bad consequences? It will make the users update their profile more often than otherwise, and this information will be propagated in the directory server network every time. But this is also the case if we leave everything as it was: If a user resubmits his webfinger address, a new hash will be created and propagated. It will also change the expiration mechanism: If a long chain exists, and the profile is re-read, the entry will expire later on this server and on all other servers that got this information.
      • So we either need two timestamps or the whole expiration mechanism must be rethought.
      • Will use the version with two timestamps.
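
The two-timestamp variant can be sketched as a small record type. The field names `submitted_at` and `checked_at`, the use of SHA-256 and the `lifetime` parameter are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    address: str
    full_name: str
    hometown: str
    submitted_at: int  # "user submission" timestamp (seconds); part of the hash
    checked_at: int    # "checked by a server" timestamp; not part of the hash

    def digest(self) -> str:
        # checked_at is deliberately left out, so a re-check by a server
        # does not create a new hash that would propagate through the network.
        fields = [self.address, self.full_name, self.hometown, str(self.submitted_at)]
        return hashlib.sha256("\x00".join(fields).encode("utf-8")).hexdigest()

    def is_expired(self, now: int, lifetime: int) -> bool:
        # Expiration is computed from the user submission timestamp only.
        return now - self.submitted_at > lifetime


entry = Entry("alice@example.org", "Alice", "Berlin", submitted_at=1000, checked_at=1000)
rechecked = replace(entry, checked_at=5000)                       # server re-read the profile
resubmitted = replace(entry, submitted_at=5000, checked_at=5000)  # user resubmitted
```

Here `rechecked` keeps the digest of `entry`, while `resubmitted` gets a new one and therefore spreads to the other servers, which is the behaviour the notes ask for.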