Algorithm for OpenStreetMap ODbL transition.
zerebubuth Merge pull request #7 from naoliv/patch-3
Increase the max number of objects per changeset
Latest commit 6e57e1b Jan 26, 2018
.gitignore Ignore logs directory. Jun 28, 2012
Gemfile Add minitest gem dependency. Jan 24, 2018
Gemfile.lock Update bundler versions. Jan 24, 2018
INSTALL.Linux Fixing up install instructions Jun 17, 2012
README.md Update readme with info from zere Aug 26, 2012
TESTING.md Add instructions on how to run the run_bot.rb script Jun 29, 2012
abbreviations.py move final check behind pop / add docu Jun 14, 2012
abbreviations.rb use old hash instead creating an new one Jun 18, 2012
actions.rb Initial commit. Feb 19, 2012
additional_users.xml Add more users and changesets from Poland, based on lists from balrog… Jul 24, 2012
bounds.xml Add Belarus earlier in the list. Jul 10, 2012
change_bot.rb Let blacklist override odbl=clean Aug 29, 2012
changeset.rb Initial commit. Feb 19, 2012
changesets_blacklist.txt Add more users and changesets from Poland, based on lists from balrog… Jul 24, 2012
changesets_whitelist.txt Update the whitelist from the wiki, avoiding a couple of hidden pitfa… Jun 28, 2012
check_history.rb Update the server to dev API Jan 24, 2018
db.rb Add individual edit white & blacklisting. Jul 10, 2012
dbimport.rb Add a tool for importing user statuses to the database Jun 14, 2012
diff.rb Made way-diff operations detect moves so that they can be OT'd and re… Jun 12, 2012
edits_blacklist.txt Allow the bot to read the edits lists, and put placeholder text files… Jul 11, 2012
edits_whitelist.txt Remove objects from the whitelist that are on the amended blacklist Aug 29, 2012
example.auth.yaml Add a script to populate the tracker database with an ordered list of… Jun 25, 2012
extract_loader.rb add some uncommitted changes Jul 10, 2015
geom.rb Made way-diff operations detect moves so that they can be OT'd and re… Jun 12, 2012
get_auth.rb Fix up get_auth so that it doesn't clobber the whole file. Jun 28, 2012
osm.rb Sorting on classes returns nil, instead sort on the class name. Jul 14, 2012
osm_parse.rb Do not use floating points for node coordinates to prevent errors Jun 16, 2012
osm_print.rb Use XML Nodes to write out user-modifiable strings. Jul 13, 2012
pg_db.rb Deal with old relations in the db having a non-zero-based sequence_id Jul 11, 2012
redact_changeset.rb Increase the max number of objects per changeset Jan 24, 2018
run_bot.rb Add pid to log file name Jul 17, 2012
run_candidates.rb Remove stray debugging code. Jul 6, 2012
run_mega_relation.rb add some uncommitted changes Jul 10, 2015
run_regions.rb Refactor region detection to avoid duplicate regions in first loop. Jul 9, 2012
split_logs.sh use find instead of ls, fixes "/bin/ls: Argument list too long" Jul 24, 2012
tags.rb Do not redact type tags on relations Jun 8, 2012
test.rb Update tests to modern minitest. Jan 24, 2018
test_abbrev.rb Update tests to modern minitest. Jan 24, 2018
test_auto.rb Update tests to modern minitest. Jan 24, 2018
test_auto_fail.rb Update tests to modern minitest. Jan 24, 2018
test_diff.rb Update tests to modern minitest. Jan 24, 2018
test_exceptions.rb Update tests to modern minitest. Jan 24, 2018
test_geom.rb Update tests to modern minitest. Jan 24, 2018
test_needs_clarity.rb Update tests to modern minitest. Jan 24, 2018
test_node.rb Update tests to modern minitest. Jan 24, 2018
test_odbl_tag.rb Update tests to modern minitest. Jan 24, 2018
test_references.rb Update tests to modern minitest. Jan 24, 2018
test_relation.rb Update tests to modern minitest. Jan 24, 2018
test_tags.rb Update tests to modern minitest. Jan 24, 2018
test_tags_lowlevel.rb Update tests to modern minitest. Jan 24, 2018
test_util.rb Update tests to modern minitest. Jan 24, 2018
test_way.rb More rubocop suggestions. Jan 24, 2018
user.rb Initial commit. Feb 19, 2012
users_whitelist.txt Add Michael Neville by request of Simon Poole. Jul 18, 2012
util.rb Swapped out home-grown LCS implementation for one from Rosetta Code, … Apr 9, 2012

README.md

Introduction

This is example (but working) code for the algorithm that handles the transition of data from CC-BY-SA to ODbL. At this stage all the methods are mocked out, but future development will add the ability to run this against an apidb format database, or possibly a live API.

Requirements

To run this, you'll need Ruby (probably >=1.9.3) and some gems, which you can install with bundle install. Then you'll be able to run:

ruby test.rb

which will run the full range of unit tests. The test files can also be individually run to concentrate on some aspects of the suite.

Tools

A simple command-line tool, check_history.rb, is provided to query an API and return information about the actions that the bot would execute, if it were running for real. If you find any results from this which are not as expected, then they would make good unit tests. Run ruby check_history.rb --help for more information on running the tool, and on the available options.

Changeset redaction

The redact_changeset.rb script can be used to redact changesets. To use it, create a YAML config file using get_auth.rb, then run the script with ruby redact_changeset.rb.

Test-Driven Development

This code is intended to be read test-first. Code that implements a complex algorithm is hard to read, especially for readers who are not fluent in the implementation language (i.e., Ruby). To make the code easier to understand, the project is test-driven, with well-commented tests that define the functionality. Hopefully these tests are easy to read without being a Ruby expert.

Tests can be found in the files prefixed 'test_'. For example, test_node.rb contains tests involving only nodes, and is a good place to start. You'll find tests which describe nodes being created, moved, and having tags changed by various users (license change agreers and disagreers in various combinations). Each test then gives the expected actions which the bot should decide upon to put the node in a clean state and to redact versions from the editing history.
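
To give a feel for that shape (a history goes in, a list of expected actions comes out), here is a minimal standalone sketch. Everything in it is hypothetical: the Version struct, the toy_expected_actions helper and the rule it applies are simplified stand-ins and are not the classes or the algorithm used by this project. It uses plain Ruby rather than minitest so that it runs on its own.

```ruby
# Hypothetical, simplified stand-in for one version of a node.
# The project's own classes are different.
Version = Struct.new(:number, :user_agreed, :tags)

# Toy rule, NOT the algorithm in change_bot.rb: every version written by a
# non-agreeing user is redacted, and if the newest version is one of those,
# the tags of the last agreed version are restored with an edit.
def toy_expected_actions(history)
  tainted = history.reject(&:user_agreed)
  last_clean = history.select(&:user_agreed).last

  actions = []
  if !history.last.user_agreed && last_clean
    actions << [:edit, last_clean.tags]
  end
  actions + tainted.map { |v| [:redact, v.number] }
end

# Scenario: an agreeing user creates the node, then a non-agreeing user
# adds a name tag to it.
history = [
  Version.new(1, true,  { "amenity" => "pub" }),
  Version.new(2, false, { "amenity" => "pub", "name" => "The Crown" })
]

expected = [[:edit, { "amenity" => "pub" }], [:redact, 2]]
actual = toy_expected_actions(history)
raise "unexpected actions: #{actual.inspect}" unless actual == expected
```

The real tests cover many more cases (moves, partial tag changes, whitelists and so on), so treat this only as a reading aid for the test files.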

Actions

The main algorithm, in change_bot.rb, takes the history of an element and turns it into a set of "actions", where each action is one of:

  1. Edit[new object]. This will be turned into an edit which is pushed to the API. Note that the version number should be equal to that of the object this edit will be applied on top of. Also, the changeset ID should be -1.
  2. Delete[class, id]. This will be turned into a delete request and pushed to the API. Note that this and Edit are pretty much mutually exclusive.
  3. Redact[class, id, version, visibility]. This will be a call to the special API call which hides a version in the history, and means that it won't be distributed any more.

The redaction visibility has values :hidden and :visible and these may have different effects eventually. The intended meanings are:

  • Hidden: This version does not contribute to the final version of the object and must be completely hidden.
  • Visible: This version contains information that cannot be distributed, but may also contain information which contributes to the final version. In future implementations of the API, some of this information (authorship, metadata, etc.) may be made visible.
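
As a rough illustration of the three actions and the two visibility values, the sketch below models them as plain Ruby structs. The struct names and fields follow the descriptions above, but they are assumptions made for this example and may not match the definitions in actions.rb. The Node struct and the edit_on_top_of and redact helpers are likewise hypothetical.

```ruby
# Hypothetical value types mirroring the three actions described above.
Edit   = Struct.new(:object)
Delete = Struct.new(:klass, :id)
Redact = Struct.new(:klass, :id, :version, :visibility)

# Simplified stand-in for an OSM node; not the project's own class.
Node = Struct.new(:id, :version, :changeset, :tags)

REDACT_VISIBILITIES = [:hidden, :visible].freeze

# Build an Edit whose object keeps the version of the object it will be
# applied on top of, and whose changeset ID is -1, as the README requires.
def edit_on_top_of(current, new_tags)
  Edit.new(Node.new(current.id, current.version, -1, new_tags))
end

# Build a Redact action, rejecting any visibility other than the two
# values named above.
def redact(klass, id, version, visibility)
  unless REDACT_VISIBILITIES.include?(visibility)
    raise ArgumentError, "visibility must be :hidden or :visible"
  end
  Redact.new(klass, id, version, visibility)
end

current = Node.new(42, 3, 1001, { "name" => "The Crown" })
actions = [
  edit_on_top_of(current, {}),
  redact(:node, 42, 2, :hidden),
  redact(:node, 42, 3, :visible)
]
actions.each { |action| puts action.inspect }
```

Keeping the actions as plain values like this makes them easy to compare against an expected list in a test, which is how the test files describe the bot's behaviour.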