-
Notifications
You must be signed in to change notification settings - Fork 3.7k
img2img_allen_brain problem defs #872
img2img_allen_brain problem defs #872
Conversation
only outstanding issue blocking this should be that the test in allen_brain_test.py that uses tf.eager breaks many/all other t2t tests not compatible with eager mode.
squashed these:
commit 76c8cd3f105c07919f84d70b02b1fbd38b5376fe
Author: cwbeitel <me@cwbeitel-github.com>
Date: Fri Jun 15 18:35:13 2018 +0000
fix mock_raw_data
commit f0dc3a0
Merge: 3e17473 4b7360d
Author: cwbeitel <me@cwbeitel-github.com>
Date: Thu Jun 14 22:54:33 2018 +0000
Merge branch 'allen-deps' of https://github.com/cwbeitel/tensor2tensor into allen-deps
commit 3e17473
Author: cwbeitel <me@cwbeitel-github.com>
Date: Fri Jun 8 17:04:39 2018 +0000
cleanup
commit 3e8ef9d
Author: cwbeitel <me@cwbeitel-github.com>
Date: Wed Jun 6 23:07:28 2018 +0000
removed image summary writing from metrics
training was 10x slower than CIFAR img2img problem, profiled and found 91% of time was being sp
ent writing ImageSummary's, removed that from metrics and saw the speed increase as expected.
commit 4814556
Author: cwbeitel <me@cwbeitel-github.com>
Date: Tue May 29 16:00:17 2018 +0000
basic img2img prob. defs that maybe get data
commit 320f06c
Author: cwbeitel <me@cwbeitel-github.com>
Date: Thu May 24 18:53:02 2018 +0000
test without accessing network, follow rcfile
* made test that accesses remote API optional, off by default
* included test that runs sub-image extractor on mocked raw input image within mocked directory
structure mirroring what would be produced by running actual data downloader
* refactoring according to rcfile
commit 503123b
Author: cwbeitel <me@cwbeitel-github.com>
Date: Wed May 23 22:09:31 2018 +0000
add [allen] deps to travis setup
commit 239188b
Author: cwbeitel <me@cwbeitel-github.com>
Date: Wed May 23 21:39:13 2018 +0000
data download and sub-image gen utility
* if additional [allen] deps not present suggest running `pip install tensor2tensor[allen]`
* obtains a list of section dataset ID's from Allen Institute API
* downloads specified number of raw images from specified number of section datasets
* produces specified number of sub-images of specified size from raw images.
* simple e2e test that runs the data downloader and sub-image generator with default parameters
commit af34cc6
Author: cwbeitel <me@cwbeitel-github.com>
Date: Wed May 23 17:47:35 2018 +0000
deps. for allen brain img probs. via extras
|
@rsepassi Noticed many other tests were failing outside of allen_brain* tests with a bunch of errors about Eager mode, guessing the problem for those is my test that uses Eager. |
|
Where are you seeing the errors about Eager mode? I don't see them in the Travis test logs. When you run the tests locally with We'll want to ensure these all pass on Travis. One idea is that maybe it's problematic running the allen tests alongside all the others and we'll want to run them separately. You can see in the |
|
@rsepassi The verbose output that mentions Eager doesn't show on Travis (only lots of failed tests, see https://travis-ci.org/tensorflow/tensor2tensor/jobs/392846781). It will probably fix it to just ignore allen tests in main pytest call and call these separately as you were suggesting. Ah there we go maybe before the reason it wasn't triggering the CI on the last PR was because there was a merge conflict, fixed that here and it proceeded. |
|
Looks like the allen tests are failing. |
|
@rsepassi Yeah hmm taking a look. Those pass on my local walking through the CI commands. Removed -q from pip installs and put in commands to check if pandas is getting dropped at some point or never installed but it looks like the CI is using the old version of .travis.yml. |
|
@rsepassi Those changes appear to have fixed everything except for a linter error in gym_utils, https://travis-ci.org/tensorflow/tensor2tensor/jobs/394214469#L2613, Verified the allen_brain* files pass the linter locally. |
|
@rsepassi My bad, there was a little linter fix I made and forgot about. I can make those changes to gym_utils if that simplifies things. |
|
Hmm looks like those may not have actually been part of the problem on travis. |
|
@rsepassi Let me know if you're waiting on me for anything, otherwise no pressure! |
|
Was just going through to merge it but noticed that AllenSDK has a non-permissive license which actually makes it hard to import and use the code within Google. It may still be possible but it would be a lot of work and I just don't have the cycles to go through something like that right now unfortunately. Now I don't want your work to go to waste so here's what I'm thinking: you can maintain a fork of Very sorry about this, wish I noticed it earlier. What do you think? |
|
No worries yeah that sounds fine for now, thanks for the cycles thus far. I'll make that change. As well as look into getting permission to host the data in a requester-pays GCS bucket. The code for scraping their API seems like it will be fragile over the long-term and I think a better approach than having to maintain this is to have it work once and provide that result as a pre-processed dataset in the style of things like imagenet, celeba, etc. |
* Simplify allen_brain_utils* to download smaller static list of images, avoiding allensdk dependency. * Simplify _generator given input is already in [0, 255] * _generator ignores likely non-tissue/background regions based on max intensity * Fractional in-painting step added to pre-processor, depends on inpaint_fraction property, randomly positioned square region * Random crop to target dimensions instead of downsampling leaving full resolution of target intact * Re-using same generated data for different problems that share examples (differing in terms of how these are pre-processed)
|
@rsepassi Hey! I removed the allensdk dependency but kept the data download utils which is far simpler if maintaining a list of image IDs (instead of building these through various queries to the API). There are 100 IDs included here randomly sampled across the full set of Mouse data. Given that we're considering 64 x 64px regions of images that are like 8.5k x 15k that's enough to generate around 3M examples. If that's not enough I referenced the API docs and a gist that includes the old download code. |
|
Thank you @cwbeitel and sorry for the delay. Will merge shortly. |
|
@rsepassi No worries. Thank you. |
PiperOrigin-RevId: 207290554
only outstanding issue blocking this should be that the test in allen_brain_test.py that uses tf.eager breaks many/all other t2t tests not compatible with eager mode.
squashed these:
commit 76c8cd3f105c07919f84d70b02b1fbd38b5376fe
Author: cwbeitel me@cwbeitel-github.com
Date: Fri Jun 15 18:35:13 2018 +0000
commit f0dc3a0
Merge: 3e17473 4b7360d
Author: cwbeitel me@cwbeitel-github.com
Date: Thu Jun 14 22:54:33 2018 +0000
commit 3e17473
Author: cwbeitel me@cwbeitel-github.com
Date: Fri Jun 8 17:04:39 2018 +0000
commit 3e8ef9d
Author: cwbeitel me@cwbeitel-github.com
Date: Wed Jun 6 23:07:28 2018 +0000
ent writing ImageSummary's, removed that from metrics and saw the speed increase as expected.
commit 4814556
Author: cwbeitel me@cwbeitel-github.com
Date: Tue May 29 16:00:17 2018 +0000
commit 320f06c
Author: cwbeitel me@cwbeitel-github.com
Date: Thu May 24 18:53:02 2018 +0000
structure mirroring what would be produced by running actual data downloader
* refactoring according to rcfile
commit 503123b
Author: cwbeitel me@cwbeitel-github.com
Date: Wed May 23 22:09:31 2018 +0000
commit 239188b
Author: cwbeitel me@cwbeitel-github.com
Date: Wed May 23 21:39:13 2018 +0000
commit af34cc6
Author: cwbeitel me@cwbeitel-github.com
Date: Wed May 23 17:47:35 2018 +0000