The tinyarchive repository is a loose collection of scripts to help with backing up URL shorteners. Most scripts are written in Python.
The database is the very core of the whole thing. It consists of multiple Berkeley DB B-Tree databases that contain mappings from short URL codes to long URLs. There is one database per shortener. For example, the database bitly.db might contain the following mappings:
The tracker is a completely separate application that hands out tasks to tinyback instances.
When tr.im shut down, part of its database was preserved. In 2013 tr.im was relaunched by Matthew Kelly, but all the old shortlinks were lost. With a little magic, it was possible to refill the new tr.im database with links from the old tr.im database. One such magic trick is trim-old.tinyarchive.org: since the new tr.im had trouble with some URLs (for whatever reason), the affected shortlinks do not link directly to the real URL. Instead, each one was created to redirect to trim-old.tinyarchive.org/$UUID, which in turn redirects to the real URL.
Creates a new release from the database. By specifying the location of a previous release, the create_release.py script can check which files have not changed and avoid recompressing them, which would waste time and possibly change their hashsum. The code_to_file.json file is used to map from a shortener name and code to a specific output file.
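The "skip unchanged files" idea can be sketched as follows. This is only an illustration and not the logic of create_release.py itself: the function names, the use of SHA-256, and the shape of the two dictionaries are all assumptions made for the example.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a hex SHA-256 digest of the uncompressed file content."""
    return hashlib.sha256(data).hexdigest()


def files_to_recompress(new_contents: dict, old_digests: dict) -> list:
    """Return the names of output files that differ from the previous release.

    new_contents maps file name -> uncompressed bytes (hypothetical layout).
    old_digests maps file name -> digest recorded for the previous release.
    Files that are missing from the previous release count as changed.
    """
    changed = []
    for name in sorted(new_contents):
        if old_digests.get(name) != content_digest(new_contents[name]):
            changed.append(name)
    return changed
```

Files that are not returned could be copied over from the previous release unchanged, which keeps their compressed bytes, and therefore their checksum, identical.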
Creates the sqlite3 database used by the trim-old website.
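As a rough illustration of what such a database could look like, the sketch below builds a sqlite3 table that maps a UUID to the real URL and looks an entry up again. The table name, column names, and function names are hypothetical; the actual schema used by the trim-old website may differ.

```python
import sqlite3


def create_database(path, links):
    """Create and fill a UUID -> URL table (hypothetical schema).

    links maps UUID strings to long URLs.
    """
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS links (uuid TEXT PRIMARY KEY, url TEXT NOT NULL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO links (uuid, url) VALUES (?, ?)", links.items()
    )
    connection.commit()
    return connection


def lookup(connection, uuid):
    """Return the long URL stored for a UUID, or None if it is unknown."""
    row = connection.execute(
        "SELECT url FROM links WHERE uuid = ?", (uuid,)
    ).fetchone()
    return row[0] if row else None
```

A website in front of such a table would only need to call something like lookup() with the $UUID from the request path and issue a redirect to the result.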
Imports finished tasks from the tracker into the database.
One-off script to import CSV dumps from the URL shortener at tny.im.
Opposite of create_release.py: Takes a release and imports it into the database, using the code_to_file.json file to map from input file to URL shortener name.
Outputs a JSON structure containing a mapping from URL shortener name to the number of short URLs in the database.
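The shape of that output can be sketched like this. Plain dictionaries stand in for the per-shortener Berkeley DB databases, and the function names are made up for the example.

```python
import json


def shortener_counts(databases):
    """Map each shortener name to the number of codes stored for it.

    databases maps shortener name -> mapping of short code to long URL.
    Plain dicts are used here as a stand-in for the real databases.
    """
    return {name: len(mapping) for name, mapping in databases.items()}


def counts_as_json(databases):
    """Serialize the counts as a JSON object with sorted keys."""
    return json.dumps(shortener_counts(databases), sort_keys=True)
```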
Calls the tracker's cleanup admin function, which removes finished tasks and resets assignments for tasks assigned over 30 minutes ago.
Fetches a list of finished tasks from the tracker, then for each task first downloads the payload and then tells the tracker to mark the task as deleted. For each task, a JSON file with the task metadata and a corresponding .txt.gz file with the payload are stored in the output directory.
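The storage step could look roughly like the sketch below. It leaves out all communication with the tracker, and the file naming by task ID, the function name, and the metadata fields are assumptions made for illustration.

```python
import gzip
import json
import os


def store_task(output_dir, task_id, metadata, payload):
    """Write <task_id>.json and <task_id>.txt.gz into output_dir.

    metadata is a JSON-serializable dict and payload is the raw task result
    as bytes. Returns the paths of the two files that were written.
    """
    os.makedirs(output_dir, exist_ok=True)
    meta_path = os.path.join(output_dir, task_id + ".json")
    payload_path = os.path.join(output_dir, task_id + ".txt.gz")
    with open(meta_path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle)
    with gzip.open(payload_path, "wb") as handle:
        handle.write(payload)
    return meta_path, payload_path
```

Writing both files before telling the tracker to delete the task means a failed download never results in a task being lost on both sides.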
Takes a JSON file containing task metadata and registers a new task with the same parameters at the tracker.
File with some helper functions to create new tasks at the tracker.
Untested and unfinished tool to import the unrolled URLs from the Twitter spritzer provided by swebb.