
import NAIP data into a postgres database #23

Closed
andrewljohnson opened this issue May 10, 2016 · 2 comments

@andrewljohnson
Contributor

andrewljohnson commented May 10, 2016

Putting the data in Postgres seems like a good mid-game/end-game move. Do this after we put up deeposm.org, once we want to scale and/or want to provide a place for researchers to run arbitrary experiments.

Benefits include:

  • makes rotating tiles easier (issue rotate training images #24); the current pipeline could be modified to do this too, but it would be hackier
  • bounding box queries are probably easier than with data cached from NAIPs to disk in a non-relational way (see the sketch after this list)
  • enables an API that would allow more arbitrary training data, using less disk space
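
A minimal sketch of what such a bounding box query could look like with PostGIS. The naip_tiles table, its columns, and the database name are all hypothetical, since none of this schema exists yet:

```python
# Hypothetical sketch: assumes a naip_tiles table with a PostGIS
# geometry column `geom` and a tile-data column `tile`, neither of
# which exists yet in this project.
import psycopg2

def random_tiles_in_bbox(min_lon, min_lat, max_lon, max_lat, limit=40000):
    """Return up to `limit` random tiles intersecting the lon/lat bbox."""
    conn = psycopg2.connect(dbname="deeposm")
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, tile
            FROM naip_tiles
            WHERE geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326)
            ORDER BY random()
            LIMIT %s
            """,
            (min_lon, min_lat, max_lon, max_lat, limit),
        )
        return cur.fetchall()
```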
@andrewljohnson andrewljohnson changed the title provide training data in arbitrary batches provide training data in arbitrary batches (fix memory bulge in data pipeline) May 13, 2016
@silberman
Contributor

silberman commented May 13, 2016

...we provide a function where you could say "give me 40,000 random tiles from within these long,lat bounding boxes, and label them using these labeller functions I want to try", and that would be fast.

(A "labeller" function being something that takes a long,lat bounding box and returns some numpy array, with simple ones that return 1:1 arrays of one of the RGBI bands, or more complex ones like the has_center_road and its various permutations, or ones that map 64x64 to 4x4 binary has-road, or has-tennis, etc)

We could still cache it if we want (as is being discussed in #30), though I think we can ditch all the NAIP-specific details and just save to NetCDF the arrays that are going straight into tensorflow, plus some metadata about the experiment if we want.
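
A minimal sketch of that NetCDF caching step, assuming the netCDF4 Python library; the dimension, variable, and attribute names here are invented:

```python
from netCDF4 import Dataset

def cache_batch(path, images, labels, experiment=""):
    """Save a batch of uint8 training images (N x H x W x 4 RGBI) and
    per-tile binary labels, plus experiment metadata, to NetCDF."""
    with Dataset(path, "w") as nc:
        n, h, w, bands = images.shape
        for dim, size in (("tile", n), ("y", h), ("x", w), ("band", bands)):
            nc.createDimension(dim, size)
        img_var = nc.createVariable("images", "u1", ("tile", "y", "x", "band"))
        lbl_var = nc.createVariable("labels", "u1", ("tile",))
        img_var[:] = images
        lbl_var[:] = labels
        nc.experiment = experiment  # global attribute for experiment metadata
```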

@andrewljohnson andrewljohnson changed the title provide training data in arbitrary batches (fix memory bulge in data pipeline) import NAIP data into a postgres database May 29, 2016
@andrewljohnson
Contributor Author

merging with other infrastructure issues
