
[WIP] HD 143006 part 2 #80

Merged — 56 commits merged into main from HD143006_Part_2_branch on Jul 31, 2021
Conversation

@iancze (Collaborator) commented Jun 18, 2021

This is a draft pull request to monitor the changes between this branch and the main branch, as well as provide a vehicle for commentary and review.

To update the file contents, continue to push committed changes to the branch, i.e. `git push origin HD143006_Part_2_branch`

RCF42 and others added 30 commits June 16, 2021 17:16
Changed log directory for first set of loss values
Figured out and added directions to examine functions, displayed tensorboard
Fixed error in calling trainable-- getting different error. I suspect it has to do with serialization of model
tune.run switches to a different directory that is created outside of the original working directory. Adjusted where file is created. Switched directory.

# ## Initializing Model with the Dirty Image
#
# We now have our model and data, but before we set out trying to optimize the image we should create a better starting point for our future optimization loops. A good idea for the starting point is the dirty image, since it is already a maximum likelihood fit to the data. The problem with this is that the dirty image contains negative flux pixels, while we impose the requirement that our sources must have all positive flux values. Our solution then is to optimize the RML model to become as close to the dirty image as possible (while retaining image positivity).
Contributor:

A "better" starting point relative to what? A blank/uniform image? I think we could make this clearer by saying we need to pick a starting point, which could be basically anything, and the dirty image is a logical choice. Rather than the current language which sort of implies we are changing the existing starting point to the dirty image instead.

Contributor:

I also think we should be careful with the language used at the end of this line. The current text says "Our solution then is to optimize the RML model to become as close to the dirty image as possible (while retaining image positivity)," which is true in this case because we're only enforcing image positivity, which is a very simple prior. In most cases, though, we're not trying to make an image as close to the dirty image as possible; we're trying to make a better image. I think it would be useful to rephrase, either emphasizing that image positivity is the only prior being considered here, or that we are trying to make the best possible image given the dirty image and the need for positive pixel values.

Contributor:

@briannazawadzki Are there cases when we don't initialize the model with the dirty image before beginning optimization for the best image? My understanding was that this section of the tutorial is initializing the model to the dirty image while retaining the image positivity prior, so we would want the dirty image to closely resemble the model at this point. Then later we take this as the starting point and begin to perform the actual optimization such that the model fits the visibilities but also takes into account any other priors so the model resembles what we'd expect the true image to be.

Contributor:

@hgrzy Oops, I'm sorry, I must not have my GitHub notifications configured correctly. Yes, your understanding is correct. The dirty image is being used as the starting point for the optimization loop. But sure, there could be cases where you don't want to do that. For example, you might want to see if you can still converge on a reasonable solution using the same priors but starting with a uniform blank image. It would probably take more iterations to converge than using the dirty image, since presumably the dirty image is already in the ballpark of what the final result will be, while a blank image needs a lot more adjusting.

My comment was mainly just being nitpicky about language for clarity.

Contributor:

@briannazawadzki No worries! Ah that makes sense. Thanks for bringing up that good point.

#


# To optimize the RML model toward the dirty image, we will create our training loop using a [loss function](../api.html#module-mpol.losses) and an [optimizer](https://pytorch.org/docs/stable/optim.html#module-torch.optim). MPoL and PyTorch both contain many optimizers and loss functions, each suited to different applications. Here we use PyTorch's [mean squared error function](https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html) between the RML model image pixel fluxes and the dirty image pixel fluxes.
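As a minimal sketch of this initialization step, the loop below fits a stand-in image model to a stand-in dirty image with MSE loss. The `ToyImageModel` class is illustrative only (it is not MPoL's actual model class); it maps learnable pixels through softplus so the rendered image stays strictly positive, mimicking the positivity constraint discussed above.

```python
import torch
from torch import nn

# Stand-in for an RML image model (illustrative; not MPoL's real class):
# learnable "base" pixels are passed through softplus so the rendered
# image is strictly positive, enforcing the positivity constraint.
class ToyImageModel(nn.Module):
    def __init__(self, npix=32):
        super().__init__()
        self.base = nn.Parameter(torch.zeros(npix, npix))

    def forward(self):
        return nn.functional.softplus(self.base)

torch.manual_seed(0)
# Stand-in "dirty image" with some negative pixels, as a real one has
dirty_image = torch.randn(32, 32)

model = ToyImageModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.5)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(model(), dirty_image).item()
for _ in range(300):
    optimizer.zero_grad()
    loss = loss_fn(model(), dirty_image)
    loss.backward()
    optimizer.step()

# The model image is now as close to the dirty image as positivity allows:
# negative dirty-image pixels are approximated by values near zero.
final = model()
```

Note that the loss cannot reach zero: wherever the dirty image is negative, the best the positive model can do is approach zero flux.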
Contributor:

I would cite the optimization tutorial for more info about the training loop.

return test_score


# Now, with our functions defined, we need to do the critical step of dividing our dataset into training and test datasets. There are many ways to do this; here we split the data radially and azimuthally and withhold chunks. This is visualized in the [Cross Validation tutorial](crossvalidation.html).
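The radial/azimuthal chunking can be sketched with plain NumPy as below. This is illustrative only (MPoL provides a built-in Dartboard utility for this; the bin counts, coordinate ranges, and fold count here are arbitrary assumptions): each visibility is assigned to a polar "dartboard" cell, and whole cells are dealt into K folds so each fold's cells can be withheld as a test set.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical gridded baseline coordinates (units arbitrary)
u = rng.uniform(-500, 500, 2000)
v = rng.uniform(-500, 500, 2000)

# Polar "dartboard" cells: radial rings x azimuthal wedges
q = np.hypot(u, v)        # baseline length
phi = np.arctan2(v, u)    # azimuth in [-pi, pi]
q_edges = np.linspace(0, 500 * np.sqrt(2), 9)  # 8 radial rings
phi_edges = np.linspace(-np.pi, np.pi, 9)      # 8 azimuthal wedges

# Index of the dartboard cell each visibility falls in
q_idx = np.clip(np.digitize(q, q_edges) - 1, 0, 7)
phi_idx = np.clip(np.digitize(phi, phi_edges) - 1, 0, 7)
cell = q_idx * 8 + phi_idx

# Shuffle the cells and deal them into K folds; each fold's cells
# are withheld as the test set in one cross-validation round
K = 5
unique_cells = rng.permutation(np.unique(cell))
folds = np.array_split(unique_cells, K)
test_masks = [np.isin(cell, f) for f in folds]
```

Withholding whole polar cells (rather than random individual points) removes spatially coherent chunks of the (u,v) plane, which is what makes the test a meaningful check of the model's ability to predict unsampled regions.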
Contributor:

You already cite the cross validation tutorial so maybe this isn't necessary, but maybe mention that MPoL's Dartboard is an easy built-in way to get the polar coord grid, and maybe very briefly mention why we are choosing to do it this way.

# With these set up, we can now make our training function (instead of a loop, we use a function here since the training loop will be run multiple times with different configurations). The hyperparameters, such as `epochs` and `lambda_TV`, are contained under `config`. Most of them are used in the loss functions and can be read about [here](../api.html#module-mpol.losses).
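A simplified stand-in sketch of such a training function is below. It is not the tutorial's actual implementation: `target` replaces the visibility dataset, the data term is plain MSE, and two hand-written stand-ins replace the regularizers in `mpol.losses`. What it does show is how the `lambda_*` entries of `config` weight each prior in the total loss.

```python
import torch
from torch import nn

def sparsity_loss(img):
    return img.abs().sum()  # L1 sparsity stand-in

def tv_loss(img):
    # total variation stand-in: absolute nearest-neighbor differences
    return ((img[1:, :] - img[:-1, :]).abs().sum()
            + (img[:, 1:] - img[:, :-1]).abs().sum())

class ToyImage(nn.Module):
    def __init__(self, npix=16):
        super().__init__()
        self.base = nn.Parameter(torch.zeros(npix, npix))

    def forward(self):
        return nn.functional.softplus(self.base)  # positive image

def train(model, target, optimizer, config):
    for _ in range(config["epochs"]):
        optimizer.zero_grad()
        img = model()
        loss = (nn.functional.mse_loss(img, target)             # data term
                + config["lambda_sparsity"] * sparsity_loss(img)
                + config["lambda_TV"] * tv_loss(img))
        loss.backward()
        optimizer.step()
    return loss.item()

torch.manual_seed(0)
target = torch.rand(16, 16)
model = ToyImage()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
config = {"epochs": 200, "lambda_sparsity": 1e-4, "lambda_TV": 1e-4}
final_loss = train(model, target, optimizer, config)
```

Packing the hyperparameters into a single `config` dict is what lets the same function be rerun under many configurations, e.g. by a hyperparameter tuner.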


def train(model, dataset, optimizer, config, writer=None, logevery=50):
Contributor:

I'm a bit confused. Now you are introducing more priors than just the image positivity? I think you need to be more clear about what you're including, because earlier it seemed like you were saying that you were only using an image positivity prior.

Contributor:

I agree with this, I just feel like I don't have the knowledge to explain each of the priors here. Adding the lambda_sparsity prior affects the sparsity of the image, the lr affects how aggressively the model learns, and epochs determines the number of iterations the model trains for. I can assume that prior_intensity affects the strength of the entropy term, but I'm not quite sure what the prior entropy is. If someone doesn't mind explaining this and confirming what I said about the other priors, that would be very helpful.

Contributor:

Sure, I'll give my thoughts. I actually think it's okay if you don't fully explain all of the priors here, but I think it would be useful to explicitly state that now you are expanding beyond the initial optimization shown above, which only required image positivity.

The lambda terms affect the strength of the regularizing terms, but don't affect the form of the regularizer. They are basically just coefficients to weight each term. So sparsity, total variation (TV), and entropy are all different priors/regularizers that we add to the total loss term. I'm working on explaining each of them in my paper, but I haven't written everything yet. I think for the purposes of the tutorial it would be fine to just mention that you're now using multiple priors, and direct the reader to the API for more information on these losses.
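Concretely, the weighted-sum structure described here has the shape below (a sketch; the subscripts mirror the config hyperparameter names, and $L_{\rm nll}$ is the data likelihood term):

$$
L_{\rm total} = L_{\rm nll} + \lambda_{\rm sparsity}\,L_{\rm sparsity} + \lambda_{\rm TV}\,L_{\rm TV} + \lambda_{\rm entropy}\,L_{\rm entropy}
$$

Each $\lambda$ is a scalar prefactor controlling the strength of its regularizer; the functional form of each regularizer is fixed.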

@hgrzy hgrzy marked this pull request as draft July 1, 2021 20:28
@hgrzy hgrzy marked this pull request as ready for review July 2, 2021 20:32
@hgrzy (Contributor) commented Jul 2, 2021

@iancze Ready for another round of review!


# Here is where we define the hyperparameters under the `config` dictionary. Most of these hyperparameters, such as `lambda_TV` and `entropy`, correspond to the scalar prefactors $\lambda$ as described in the [intro to RML](rml_intro.html).

config = (
Collaborator (author):

Where do these config values come from? The text doesn't provide any context to why they might be set to the values that they are.
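For orientation, a dictionary of the shape the tutorial describes might look like the sketch below. This is purely illustrative: the keys follow the hyperparameter names used in the text, but the numerical values are placeholders, not the tuned values from the tutorial.

```python
# Illustrative only: keys follow the hyperparameters named above;
# the numerical values are placeholders, not tuned results.
config = {
    "lr": 0.3,                # optimizer learning rate
    "epochs": 500,            # number of training iterations
    "lambda_sparsity": 1e-4,  # weight of the sparsity prior
    "lambda_TV": 1e-4,        # weight of the total variation prior
    "entropy": 1e-4,          # weight of the entropy prior
    "prior_intensity": 1e-5,  # reference intensity for the entropy term
}
```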

@iancze (Collaborator, author) commented Jul 21, 2021

@trq5014 @hgrzy @RCF42 would it be possible to update the final CV example with some of the better-looking hyperparameter settings that you all found recently? It makes sense to show off our best image :)

After that, I think this looks ready to merge! Thanks for all of the hard work.

@trq5014 (Contributor) commented Jul 21, 2021

Yeah that can be done @iancze ! Do you want us to add the TSV prior in as well? Or should we stick with what we have?

@iancze (Collaborator, author) commented Jul 21, 2021

Might as well add in the TSV prior if it's needed to get the "best" image so far.

@iancze iancze merged commit ee52f09 into main Jul 31, 2021
@iancze iancze deleted the HD143006_Part_2_branch branch July 31, 2021 10:47
@iancze (Collaborator, author) commented Jul 31, 2021

Thanks for all the excellent work, everyone!
