Algorithm for OpenStreetMap ODbL transition.

branch: master

Merge remote-tracking branch 'gnonthgol/master'

Conflicts:
	redact_changeset.rb
latest commit 9f2a9a7f7a
Matt Amos authored August 27, 2012
.gitignore Ignore logs directory. June 28, 2012
Gemfile Added Typhoeus to get parallel access to the API. August 27, 2012
Gemfile.lock Added Typhoeus to get parallel access to the API. August 27, 2012
INSTALL.Linux Fixing up install instructions June 17, 2012
README.md Update readme with info from zere August 25, 2012
TESTING.md Add instructions on how to run the run_bot.rb script June 29, 2012
abbreviations.py move final check behind pop / add docu June 14, 2012
abbreviations.rb use old hash instead creating an new one June 19, 2012
actions.rb Initial commit. February 19, 2012
additional_users.xml Add more users and changesets from Poland, based on lists from balrog… July 24, 2012
bounds.xml Add Belarus earlier in the list. July 10, 2012
change_bot.rb Reset the diff state on delete July 16, 2012
changeset.rb Initial commit. February 19, 2012
changesets_blacklist.txt Add more users and changesets from Poland, based on lists from balrog… July 24, 2012
changesets_whitelist.txt Update the whitelist from the wiki, avoiding a couple of hidden pitfa… June 28, 2012
check_history.rb Output timing of each step when asked for help in optimizing June 14, 2012
db.rb Add individual edit white & blacklisting. July 11, 2012
dbimport.rb Add a tool for importing user statuses to the database June 14, 2012
diff.rb Made way-diff operations detect moves so that they can be OT'd and re… June 12, 2012
edits_blacklist.txt Allow the bot to read the edits lists, and put placeholder text files… July 11, 2012
edits_whitelist.txt Add the edits whitelist to git. July 17, 2012
example.auth.yaml Add a script to populate the tracker database with an ordered list of… June 25, 2012
extract_loader.rb sequence_ids start at different values for way_nodes and relation_mem… July 11, 2012
geom.rb Made way-diff operations detect moves so that they can be OT'd and re… June 12, 2012
get_auth.rb Fix up get_auth so that it doesn't clobber the whole file. June 28, 2012
osm.rb Sorting on classes returns nil, instead sort on the class name. July 14, 2012
osm_parse.rb Do not use floating points for node coordinates to prevent errors June 16, 2012
osm_print.rb Use XML Nodes to write out user-modifiable strings. July 13, 2012
pg_db.rb Deal with old relations in the db having a non-zero-based sequence_id July 11, 2012
redact_changeset.rb Merge remote-tracking branch 'gnonthgol/master' August 27, 2012
run_bot.rb Add pid to log file name July 17, 2012
run_candidates.rb Remove stray debugging code. July 06, 2012
run_mega_relation.rb Add a simple script to redact the mega-relation of doom. July 24, 2012
run_regions.rb Refactor region detection to avoid duplicate regions in first loop. July 09, 2012
split_logs.sh use find instead of ls, fixes "/bin/ls: Argument list too long" July 24, 2012
tags.rb Do not redact type tags on relations June 08, 2012
test.rb Re-added test_diff to list of all tests. Commented out debug statemen… May 14, 2012
test_abbrev.rb Fix bad merge June 17, 2012
test_auto.rb Remove created_by tags in tests May 04, 2012
test_auto_fail.rb Fixed relation 19000 test, missing versions June 11, 2012
test_diff.rb Merge branch 'master' of git://github.com/zerebubuth/openstreetmap-li… June 11, 2012
test_exceptions.rb A test for the individual edit white/black lists. July 11, 2012
test_geom.rb Merge branch 'master' of git://github.com/zerebubuth/openstreetmap-li… June 11, 2012
test_needs_clarity.rb Do not redact type tags on relations June 08, 2012
test_node.rb Amend the results of the fp_bug2 test, to expect no actions. July 13, 2012
test_odbl_tag.rb Add a more complecated test case for odbl=clean wierdness and fix April 25, 2012
test_references.rb Add failing test for deleting nested relations in the correct order. July 04, 2012
test_relation.rb Add test case from relation 166487 July 16, 2012
test_tags.rb Add my Supercalifragilisticexpialidocious Stret trivial change example June 14, 2012
test_tags_lowlevel.rb Set permissions to runnable for runnable files June 07, 2012
test_util.rb Make individual test files runnable April 15, 2012
test_way.rb Add more test cases for ways June 12, 2012
user.rb Initial commit. February 19, 2012
users_whitelist.txt Add Michael Neville by request of Simon Poole. July 18, 2012
util.rb Swapped out home-grown LCS implementation for one from Rosetta Code, … April 09, 2012
README.md

Introduction

This is example (but working) code for the algorithm that handles the transition of OpenStreetMap data from CC-BY-SA to ODbL. At this stage all the methods are mocked out, but future development will add the ability to run this against an apidb-format database, or possibly a live API.

Requirements

To run this, you'll need Ruby (probably >=1.9.3) and some gems, which you can install with bundle install. Then you'll be able to run:

ruby test.rb

which will run the full range of unit tests. The test files can also be run individually to concentrate on particular aspects of the suite.

Tools

A simple command-line tool, check_history.rb, is provided to query an API and return information about the actions that the bot would execute, if it were running for real. If you find any results from this which are not as expected, then they would make good unit tests. Run ruby check_history.rb --help for more information on running the tool, and on the available options.

Changeset redaction

The redact_changeset.rb script can be used to redact changesets. To use it, create a YAML config file using get_auth.rb, then run the script with ruby redact_changeset.rb.

Test-Driven Development

This code is intended to be read as a piece of test-driven development. Most code that implements a complex algorithm is very hard to read, especially for anyone not fluent in the language of choice (i.e. Ruby). To make the code easier to understand, this project is test-driven, with well-commented tests that define the functionality. Hopefully these tests are quite easy to read without being a Ruby expert.

Tests can be found in the various files prefixed 'test_'. For example, test_node.rb contains a set of tests involving only nodes, and this is a good place to start. You'll find tests which describe nodes being created, moved, and having tags changed by various users (license change agreers and disagreers in various combinations). Each test then gives the expected resulting actions which a bot should decide upon to put the node in a clean state and to redact versions from the editing history.
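The general shape of such a test (a version history goes in, a list of expected actions comes out) can be sketched as follows. This is only an illustration: the Version struct, the toy_redactions function and its rule are all made up for this sketch and are not taken from the repository. The rule is deliberately simplistic and is not the bot's real algorithm, which also has to deal with content inherited by later versions.

```ruby
# Hypothetical stand-ins: none of these names come from the repository.
Version = Struct.new(:version, :user_agreed, :tags)

# Toy rule, NOT the real algorithm: flag every version written by a user
# who did not agree to the new license for redaction.
def toy_redactions(history)
  history.reject(&:user_agreed).map { |v| [:redact, :node, v.version] }
end

# A node created by an agreer, then tagged by a disagreer.
history = [
  Version.new(1, true,  {}),
  Version.new(2, false, { "name" => "Foo" })
]

# The test states the expected actions and compares them to the result.
expected = [[:redact, :node, 2]]
actual = toy_redactions(history)
raise "unexpected actions: #{actual.inspect}" unless actual == expected
puts "ok"
```

The real tests in test_node.rb follow the same pattern of stating a history and the expected actions, but use the project's own classes.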

Actions

The main algorithm in the code, in change_bot.rb, takes the history of an element and turns this into a set of "actions", where each action is one of:

  1. Edit[new object]. This will be turned into an edit which is pushed to the API. Note that the version number should be equal to that of the object that this will be applied on top of. Also, the changeset ID should be -1.
  2. Delete[class, id]. This will be turned into a delete request and pushed to the API. Note that this and Edit are pretty much mutually exclusive.
  3. Redact[class, id, version, visibility]. This will be turned into a call to the special API method which hides a version in the history, meaning that it won't be distributed any more.

The redaction visibility has values :hidden and :visible and these may have different effects eventually. The intended meanings are:

  • Hidden: This version does not contribute to the final version of the object and must be completely hidden.
  • Visible: This version contains information that cannot be distributed, but may also contain information which contributes to the final version. In future implementations of the API, some of this information (authorship, metadata, etc.) may be made visible.
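
As a rough illustration of the three action types and the two visibility values, here is a minimal sketch. The Structs, the REDACT_VISIBILITIES constant and the describe helper are assumptions made for this example only; the actual definitions live in actions.rb and may look quite different.

```ruby
# Illustrative only: these Structs are assumptions for this sketch, not the
# classes defined in actions.rb.
Edit   = Struct.new(:obj)
Delete = Struct.new(:klass, :id)
Redact = Struct.new(:klass, :id, :version, :visibility)

# The two visibility values named in the README.
REDACT_VISIBILITIES = [:hidden, :visible].freeze

# Hypothetical helper: return a one-line description of an action.
def describe(action)
  case action
  when Edit
    "edit: upload new version of #{action.obj.inspect}"
  when Delete
    "delete: #{action.klass} #{action.id}"
  when Redact
    unless REDACT_VISIBILITIES.include?(action.visibility)
      raise ArgumentError, "unknown visibility #{action.visibility.inspect}"
    end
    "redact: #{action.klass} #{action.id} v#{action.version} (#{action.visibility})"
  else
    raise ArgumentError, "unknown action #{action.inspect}"
  end
end

# An example action list for a single node: two redactions, then a delete.
actions = [
  Redact.new(:node, 1, 2, :hidden),
  Redact.new(:node, 1, 3, :visible),
  Delete.new(:node, 1)
]
actions.each { |a| puts describe(a) }
```

Keeping each action as a plain value object makes the expected output of a test easy to write down and compare, which fits the test-driven approach described above.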