
assert positive_fraction <= pi #6

Closed · micahrapp opened this issue Mar 14, 2019 · 5 comments
@micahrapp
So I sometimes get the following error when running `topaz train`:

```
Traceback (most recent call last):
  File "/usr/local/anaconda3/envs/topazenv/bin/topaz", line 11, in <module>
    load_entry_point('topaz==0.1.0', 'console_scripts', 'topaz')()
  File "/usr/local/anaconda3/envs/topazenv/lib/python3.6/site-packages/topaz/main.py", line 144, in main
    args.func(args)
  File "/usr/local/anaconda3/envs/topazenv/lib/python3.6/site-packages/topaz/commands/train.py", line 643, in main
    , autoencoder=args.autoencoder
  File "/usr/local/anaconda3/envs/topazenv/lib/python3.6/site-packages/topaz/commands/train.py", line 422, in make_training_step_method
    assert positive_fraction <= pi
AssertionError
```

In older versions (where you specified pi directly), I would decrease the value of radius, and that eliminated the error. In current versions, increasing num-particles did the job, though I'm sure tweaking radius would also work.

Most recently, I saw the error with a radius of 5 and num-particles of 100. Increasing num-particles to 200 or 300 got rid of the error, though I definitely don't expect that many particles per micrograph. So I have a way around the issue, but I'm curious what ratio of radius to num-particles triggers this error.

@tbepler (Owner) commented Mar 14, 2019

This error happens when you set pi to be smaller than the observed fraction of positives. That is, you have more labeled particles per micrograph (on average) than you are telling topaz you expect to have.

For example, if you have 100 labeled particles per micrograph and set n=50, it wouldn't make any sense, because you already know there must be at least 100 particles per micrograph!
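The check behind the assertion can be sketched roughly like this (hypothetical function and argument names, not topaz's actual code):

```python
def check_positive_fraction(pi, num_positive_regions, total_regions):
    """Rough sketch of the sanity check behind the assertion in
    topaz/commands/train.py (make_training_step_method).
    Names here are illustrative, not topaz's real API."""
    # fraction of regions already labeled positive in the training data
    positive_fraction = num_positive_regions / total_regions
    # pi is the user's expected overall fraction of positives; it cannot
    # be smaller than the fraction we have already observed as labeled
    assert positive_fraction <= pi, (
        f"observed positive fraction {positive_fraction:.4f} exceeds "
        f"pi={pi:.4f}; increase --num-particles or decrease --radius"
    )
    return positive_fraction
```

For example, `check_positive_fraction(0.05, 100, 10000)` passes (0.01 observed vs. 0.05 expected), while `check_positive_fraction(0.005, 100, 10000)` raises, mirroring the error above.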

@micahrapp (Author)

Ah, got it! Does it do this on a per-micrograph basis, or does it average the pi value across all of them?

@tbepler (Owner) commented Mar 14, 2019

It's averaged over all micrographs.
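In other words, the observed fraction is pooled over all micrographs rather than checked one micrograph at a time. A minimal sketch (hypothetical helper, not topaz code):

```python
def pooled_positive_fraction(counts):
    """counts: list of (num_positive_regions, total_regions) tuples,
    one per micrograph. Returns a single fraction pooled over all
    micrographs, so one unusually crowded micrograph cannot trip
    the check on its own."""
    total_positive = sum(pos for pos, _ in counts)
    total_regions = sum(tot for _, tot in counts)
    return total_positive / total_regions
```

For example, `pooled_positive_fraction([(10, 100), (30, 100)])` gives 0.2 overall, even though the second micrograph alone sits at 0.3.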

@tbepler tbepler closed this as completed Mar 19, 2019
@micahrapp (Author) commented Apr 4, 2019

Sorry to bring this back up, but a labmate just ran into the same issue, and I noticed something odd:

```
source  split  p_observed  num_positive_regions  total_regions
0       train  0.0244      5738886               235065600
0       test   0.0249      1462512               58766400
Specified expected number of particle per micrograph = 50.0
With radius = 8
Setting pi = 0.04441738816738817
```

So p_observed for both train and test is about 0.02, and pi was set to about 0.04. The pi that was set is higher than the observed fraction, which should be fine. But they still got the error. Am I missing something?

@tbepler tbepler reopened this Apr 4, 2019
@tbepler (Owner) commented Apr 4, 2019

Yep, there was a bug in the code for this. Because the user-defined pi applies to all of the data, the training script rescales it to apply only to the unlabeled data by setting pi = pi - p_observed. The check was being run after this adjustment as well as before it, which triggers the error. You can install from the latest master branch, where I just fixed this. As a workaround, you can also set -n/--pi even larger to compensate.
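The ordering problem can be illustrated with the numbers from the log above (hypothetical functions sketching the bug and the fix, not the actual topaz code):

```python
P_OBSERVED = 0.0244  # observed positive fraction from the log above
PI = 0.0444          # user's pi, defined over all of the data

def adjust_pi_buggy(pi, p_observed):
    # buggy ordering: rescale pi to cover only the unlabeled data first,
    # then run the sanity check against the already-shrunk value
    pi = pi - p_observed
    assert p_observed <= pi  # fails whenever p_observed > pi / 2
    return pi

def adjust_pi_fixed(pi, p_observed):
    # fixed ordering: validate the user's pi first, then rescale it
    assert p_observed <= pi
    return pi - p_observed
```

With these numbers the fixed version returns roughly 0.02, while the buggy one raises, since 0.0444 - 0.0244 = 0.0200 < 0.0244, reproducing the report even though the user's pi exceeds p_observed.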

Fixed in 79e38bf

@tbepler tbepler closed this as completed Apr 4, 2019