A data management tool for humans

Glean was a fun experiment, but is no longer maintained.

Glean - A data management tool for humans

Glean is experimental; expect breaking changes until v1.0.0.

About

Glean targets human-curated datasets, with a goal of easy collaboration.

Data is stored in TOML, a human-readable data format. You can think of it as Markdown for data. Each dataset is stored in a git repository, which makes it easy to track revisions, propose changes, and collaborate on datasets.

Each file represents one piece of data (a hash of hashes). Filenames and directory structure are not significant to the data, but are useful for organization and human collaboration via Pull Requests.

Goals

  • Easily pull commonly used datasets into projects
  • Curate data using Pull Requests
  • Preserve attribution for contributors

Sources

Glean datasets are available from three distinct sources:

  1. Core
     • Available via search
     • Hosted on the Glean GitHub organization
  2. Contrib
     • Available via search using --contrib
     • Hosted on GitHub and cataloged by Glean Contrib
  3. User
     • TODO
     • Directly available via URL

Installation

$ gem install glean

Requirements:

  • Git

Command line

$ glean help
NAME
    glean - A data management tool for humans

SYNOPSIS
    glean [global options] command [command options] [arguments...]

VERSION
    0.0.13

GLOBAL OPTIONS
    --help    - Show this message
    --version - 

COMMANDS
    export - Export a dataset
    get    - Download a dataset by name
    help   - Shows a list of commands or help for one command
    info   - Show dataset information
    search - Search for datasets

Examples

Core:

$ glean export countries --format=json
{"name":"Andorra","code":"ad"}
{"name":"United Arab Emirates","code":"ae"}
{"name":"Afghanistan","code":"af"}
...
$ glean export us-states --format=yaml
--- !ruby/hash:Hashie::Mash
name: Alaska
abbreviation: ak
--- !ruby/hash:Hashie::Mash
name: Alabama
abbreviation: al
--- !ruby/hash:Hashie::Mash
name: Arkansas
abbreviation: ar
...

Contrib:

$ glean export lagalaxy/trophies --format=json
{"competition":"MLS Supporters' Shield","year":1998}
{"competition":"CONCACAF Champions' League","year":2000}
{"competition":"Lamar Hunt U.S. Open Cup","year":2001}
...
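
The JSON export above prints one object per line, so it can be read back with Ruby's standard `json` library. The following is a minimal sketch, assuming the output keeps that one-object-per-line shape. The sample text is copied from the countries example, and `parse_export` is a hypothetical helper, not part of Glean.

```ruby
require 'json'

# Sample text copied from the `glean export countries --format=json` example.
export_output = <<~OUTPUT
  {"name":"Andorra","code":"ad"}
  {"name":"United Arab Emirates","code":"ae"}
  {"name":"Afghanistan","code":"af"}
OUTPUT

# Hypothetical helper (not part of Glean): parse one JSON object per line,
# skipping blank lines, and return an array of hashes.
def parse_export(text)
  text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

countries = parse_export(export_output)

# Index the records by their "code" field for quick lookup.
by_code = countries.each_with_object({}) { |country, index| index[country['code']] = country }

puts by_code['ae']['name']
```

Indexing by `code` is only one choice; any field that is unique within the dataset would work the same way.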

Rails

Gemfile:

gem 'glean'

db/seeds.rb:

# Seed the countries table only when it is empty, so re-running is safe.
if Country.count == 0
  countries = Glean::Dataset.new('glean/countries')
  countries.each do |country|
    Country.create :name => country.name, :code => country.code
  end
end

Then run:

$ rake db:seed

Other Frameworks

I'm not sure how you'd do it, but I want to make it easy. Open an issue, or better yet drop some code in a Pull Request.
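
One framework-agnostic possibility is to shell out to the command line tool and parse its JSON export. This is only a sketch, assuming the `glean` executable is on the PATH and prints one JSON object per line as in the Examples section. `read_json_lines` is a hypothetical helper, not part of Glean.

```ruby
require 'json'
require 'open3'

# Hypothetical helper (not part of Glean): run a command that prints one JSON
# object per line and return the parsed records as an array of hashes.
def read_json_lines(*command)
  output, status = Open3.capture2(*command)
  raise "command failed: #{command.join(' ')}" unless status.success?

  output.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

# With Glean installed, this would reuse the export command from the Examples:
#   records = read_json_lines('glean', 'export', 'countries', '--format=json')
#   records.each { |record| puts record['name'] }
```

Because the helper only depends on the command's output, the same approach should carry over to any language that can run a subprocess and parse JSON.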