Quickly download, clean up, and install ecological datasets into a database management system
Python HTML Inno Setup
branch: master

Merge pull request #317 from ethanwhite/dup-downloads

Only load a script the first time it is discovered
latest commit 9e13843645
@ethanwhite ethanwhite authored
apidocs add default script directory to dev docs
app Remove trailing whitespace throughout the codebase
docker Add Dockerfile
engines Check platform once and then use throughout code
lib fix string formatting
scripts Merge pull request #300 from dmcglinn/EA_palmer2007
test Add test for md5sum calculation
.gitignore Add .DS_Store to .gitignore and remove existing .DS_Store
.travis.yml Simplifying .travis.yml now that I understand how to test locally
CHANGES.md Add change log
CITATION Adding a CITATION file
CONTRIBUTING.md Add note to CONTRIBUTING about Travis postgreSQL failures
LICENSE Moving license into separate LICENSE file
MANIFEST.in Make sure that CITATION is added to distributions
README.md Add Moore Foundation funding to acknowledgements
__init__.py Merge pull request #317 from ethanwhite/dup-downloads
__main__.py Check platform once and then use throughout code
build.sh Cleanup and simplify the Linux build script
build_mac Updating build_mac to use better directory structure.
build_win Update Windows build script to build installer using Inno Setup
compile.py Remove trailing whitespace throughout the codebase
icon.ico Replacing globe icon with dog icon in Windows executable.
lscolumns.py Remove trailing whitespace throughout the codebase
make_docs.sh Separating documentation build from deb package build.
modpath.iss Add modpath.iss
osx_icon.icns Add icon to OS X app
requirements.txt Removing pyodbc from requirements.txt
retriever_installer.iss Add retriever to system path not user path
setup.py Make platform checking in setup.py consistent with rest of code
stdeb.cfg Allow either wxPython 2.8 or 3.0 for Debian package
term_size.py Adding lscolumns to the retriever explicitly, since there are still w…
try_install_all.py Remove trailing whitespace throughout the codebase
version.py For master, don't use hashes to check whether scripts are out-of-date.
version.txt Remove Download Only scripts from version.txt in master

README.md


Large quantities of ecological and environmental data are increasingly available thanks to initiatives sponsoring the collection of large-scale data and efforts to increase the publication of already collected datasets. As a result, progress in ecology is increasingly limited by the speed at which we can organize and analyze data. To help improve ecologists' ability to quickly access and analyze data, we have been developing software that designs database structures for ecological datasets and then downloads the data, pre-processes it, and installs it into major database management systems (at the moment we support MySQL, PostgreSQL, SQLite, and Microsoft Access).

Once the EcoData Retriever has loaded the data into the database, it is easy to connect to the database using standard tools (e.g., MS Access or Filemaker). The EcoData Retriever can download and install small datasets in seconds and large datasets in minutes. The program also cleans up known issues with the datasets and automatically restructures them into a format appropriate for standard database management systems. Automating this process reduces the time it takes a user to get most large datasets up and running by hours, and in some cases days.

Installing (binaries)

Precompiled binaries for the most recent release are available for Windows, OS X, and Ubuntu/Debian at the project website.

Installing From Source

To install the EcoData Retriever from source, you'll need Python 2.6+ with the following packages installed:

  • wxPython
  • xlrd

The following packages are optional:

  • PyMySQL (for MySQL)
  • sqlite3 (for SQLite)
  • psycopg2 (for PostgreSQL)
  • pyodbc (for MS Access - this option is only available on Windows)

To install from source:

  1. Clone the repository
  2. From the directory containing setup.py, run the following command: python setup.py install
  3. After installing, type retriever from a command prompt to launch the EcoData Retriever
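
Before running setup.py, it can help to check which of the packages listed above are already importable. The following is a minimal sketch and not part of the EcoData Retriever: the helper name check_packages is hypothetical, and the import names (wx for wxPython, pymysql for PyMySQL) are assumed to be the module names those packages usually install under.

```python
# Hypothetical helper, not part of the EcoData Retriever.
# Maps the package names from this README to their assumed import names.
REQUIRED = {"wxPython": "wx", "xlrd": "xlrd"}
OPTIONAL = {
    "PyMySQL": "pymysql",
    "sqlite3": "sqlite3",
    "psycopg2": "psycopg2",
    "pyodbc": "pyodbc",
}


def check_packages(packages):
    """Return {package name: True/False} depending on whether it imports."""
    status = {}
    for package, module in packages.items():
        try:
            __import__(module)
            status[package] = True
        except ImportError:
            status[package] = False
    return status
```

Calling check_packages(REQUIRED) or check_packages(OPTIONAL) returns a dictionary that can be printed to see what still needs to be installed. A missing optional package only matters if you plan to use the corresponding database engine.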

Using the Command Line

After installing, run retriever update to download all of the available dataset scripts. To see the full list of command line options and datasets run retriever --help. The output will look like this:

usage: retriever [-h] [-v] [-q] {install,update,gui,new,ls,citation,help} ...

positional arguments:
  {install,update,gui,new,ls,citation,help}
                        sub-command help
    install             download and install dataset
    update              download updated versions of scripts
    gui                 launch retriever in graphical mode
    new                 create a new sample retriever script
    ls                  display a list all available dataset scripts
    citation            view citation
    help

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -q, --quiet           suppress command-line output

To install datasets, use retriever install:

usage: retriever install [-h] [--compile] [--debug]
                         {mysql,postgres,sqlite,msaccess,csv} ...

positional arguments:
  {mysql,postgres,sqlite,msaccess,csv}
                        engine-specific help
    mysql               MySQL
    postgres            PostgreSQL
    sqlite              SQLite
    msaccess            Microsoft Access
    csv                 CSV

optional arguments:
  -h, --help            show this help message and exit
  --compile             force re-compile of script before downloading
  --debug               run in debug mode
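
If you want to drive retriever install from a Python script rather than typing it at a prompt, one option is to call the command-line program through the standard subprocess module. This is a sketch under stated assumptions, not an API provided by the Retriever: the function names build_install_command and install are hypothetical, the engine names are copied from the usage text above, and any engine-specific options are passed through unchanged without being checked.

```python
import subprocess

# Engine names copied from the `retriever install` usage text.
ENGINES = ("mysql", "postgres", "sqlite", "msaccess", "csv")


def build_install_command(engine, dataset, extra_args=()):
    """Build the argument list for `retriever install <engine> <dataset>`."""
    if engine not in ENGINES:
        raise ValueError("unknown engine: %r" % (engine,))
    # extra_args holds engine-specific options; they are not validated here.
    return ["retriever", "install", engine, dataset] + list(extra_args)


def install(engine, dataset, extra_args=()):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(build_install_command(engine, dataset, extra_args))
```

Keeping the command construction separate from the call that runs it makes it possible to inspect or log the exact arguments before anything is downloaded. The install function assumes that retriever is on the system path.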

For example, to install the Breeding Bird Survey data into a SQLite database named mydatabase.db, you would use:

retriever install sqlite BBS -f mydatabase.db
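
Once the install finishes, the resulting file can be opened with any SQLite client. As a minimal sketch using Python's built-in sqlite3 module, the hypothetical helper below lists the tables in a database file. It makes no assumption about what the Retriever names its tables; it only reads SQLite's own sqlite_master catalog.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist.
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Calling list_tables("mydatabase.db") from the directory containing the file shows which tables are available to query.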

Acknowledgments

Development of this software was funded by the Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative through Grant GBMF4563 to Ethan White and the National Science Foundation as part of a CAREER award to Ethan White.
