README

Replication Data for Images of the arXiv: reconfiguring large scientific image datasets

This data repository contains the replication data for the paper Images of the arXiv: reconfiguring large scientific image datasets.

Computer Setup Instructions

Computer Specs

OS

  • Linux: Ubuntu 18.04

Hardware

  • Intel i7 CPU
  • 500GB NVMe solid state drive
  • 4TB 7200 rpm hard disk
  • 32GB DDR3 RAM
  • NVIDIA RTX 2080 graphics card (8GB VRAM)

Installing software

Metha

Metha is a command-line OAI-PMH harvester. Install it from https://github.com/miku/metha

SQLite (command line)

Ubuntu ships with SQLite. To open a database from the command line, run:

sqlite3 /path/to/database.sqlite3

Python SQLite

The sqlite3 module is included in Python's standard library:

import sqlite3

DB Browser for SQLite (optional)

DB Browser for SQLite provides a graphical interface for examining the SQLite database and can also be used to run SQL commands. Downloads are available at https://sqlitebrowser.org/dl/. On Ubuntu, it can be installed from the following PPA:

sudo add-apt-repository -y ppa:linuxgndu/sqlitebrowser
sudo apt-get update
sudo apt-get install sqlitebrowser

Other software

  • Anaconda (recommended for installing and managing Python packages)
  • Python (2 and 3)
  • ImageMagick (for convert and identify)
  • Jupyter Notebook
  • SQLite interfaces for Python and Bash
  • tensorflow-gpu
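
As an illustration of how ImageMagick's identify might be called from Python, the following minimal sketch reads an image's pixel dimensions. The helper names parse_dimensions and image_dimensions are hypothetical and are not taken from this repository's scripts. The sketch assumes that identify is available on the PATH.

```python
import shutil
import subprocess


def parse_dimensions(identify_output):
    """Parse 'WIDTH HEIGHT' text (as printed by identify) into two ints."""
    width, height = identify_output.split()[:2]
    return int(width), int(height)


def image_dimensions(path):
    """Return (width, height) of an image using ImageMagick's identify.

    Hypothetical helper: assumes the `identify` executable is installed.
    """
    if shutil.which("identify") is None:
        raise RuntimeError("ImageMagick 'identify' was not found on PATH")
    # "[0]" selects the first frame of multi-frame files (e.g. animated GIFs).
    result = subprocess.run(
        ["identify", "-format", "%w %h", path + "[0]"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_dimensions(result.stdout)
```

Calling image_dimensions("/path/to/image.png") would return a (width, height) tuple for that file.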

Environments

We used two conda environments to run the required scripts. The first, py37, contains basic Python 3 packages, matplotlib, and other utilities. The second, tf_gpu, is configured to run TensorFlow 1.14 with GPU acceleration. Because the TensorFlow environment takes longer to install, it is provided separately. See the YAML files in the conda folder.
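
After activating one of the environments, a quick check that the expected packages can be imported may save time later. The sketch below is one possible way to do this. The function name missing_modules and the two module lists are assumptions based on the description above, not on the contents of the YAML files.

```python
import importlib.util


def missing_modules(required):
    """Return the names in `required` that cannot be found in this environment."""
    return [name for name in required if importlib.util.find_spec(name) is None]


# Assumed module lists; adjust them to match the YAML files in the conda folder.
PY37_MODULES = ["sqlite3", "matplotlib"]
TF_GPU_MODULES = ["sqlite3", "tensorflow"]

if __name__ == "__main__":
    print("Missing from py37 list:", missing_modules(PY37_MODULES))
    print("Missing from tf_gpu list:", missing_modules(TF_GPU_MODULES))
```

An empty list means every module in that list was found. In the tf_gpu environment, TensorFlow 1.x also provides tf.test.is_gpu_available() to confirm that the GPU is visible.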

Instructions

Database

The database is provided in SQLite format. It contains metadata on articles, images, and figure captions up to the end of 2018.
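
Since the table names are not listed here, a reasonable first step is to ask the database file itself which tables it contains. The following is a minimal sketch using Python's built-in sqlite3 module. The function name list_tables is hypothetical, and the path is a placeholder for wherever the database file is stored.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Placeholder path; replace it with the location of the database file.
    for table in list_tables("/path/to/database.sqlite3"):
        print(table)
```

Note that sqlite3.connect creates an empty file if the path does not exist, so an empty result may simply indicate a mistyped path.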

Downloading data

See dataset_method.md.

Creating database

See sqlite_method.md.

Image credits for paper

See image_credits.md.

Plots

The scripts for generating the plots are in the sqlite-scripts folder.