Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sprint20 Add Random Variables (Using Abstract Syntax Tree) #243

Open
wants to merge 84 commits into
base: main
Choose a base branch
from

Conversation

nabriis
Copy link
Collaborator

@nabriis nabriis commented May 29, 2023

closes #202, #85, #29

Related to #250

Tracking of final changes to PR

17/11/2023

  • Consider making name of random variable different than parameter name.
  • Update logic in to_cuqi_format (removed again)
  • Explain condition method and gradient method
  • Explain geometry and dim etc.

18/11/2023

  • Consider input name to JointDistribution and BayesianProblem (components)
  • Consider if force_ndarray should detect distribution and give useful error message (potential common mistake), as well as Model.forward(Distribution)
  • Consdier if it should be par_name for Distribution. Maybe another name?
  • Disable condition related properties of RVs for now.
  • Remove __call__ method of Distribution. Rename to e.g. fix or given
  • Update demo files? (Moved to legacy folder with readme)
  • Check if text descriptions near updates from Dist to RV should be added (e.g. if we write stuff like. Define distribution, when we actually now define RV).

24/11/2023

  • Whiteboard meeting on RV API

@nabriis
Copy link
Collaborator Author

nabriis commented Jun 3, 2023

Hi @amal-ghamdi and @jakobsj. Following the seminar I have now updated the tutorial. Feel free to review

@nabriis
Copy link
Collaborator Author

nabriis commented Jun 14, 2023

Minor issue if you do

A = cuqi.model.LinearModel(lambda x: x, lambda y: y, 1, 1))
x = cuqi.distribution.Gaussian(0,1)
y = x+1
A@y # returns x+1

This is because A evaluates the lambda expression (which is identity) and thus simply returns y. This behavior is unexpected and LinearModel should have special behavior implemented for RandomVariable class.

EDIT: now fixed.

Copy link
Contributor

@jakobsj jakobsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick approval. Few tiny comments inline. Looks really good, exciting new functionality. Merge pending more detailed review and approval from @amal-ghamdi

cuqi/distribution/_cmrf.py Outdated Show resolved Hide resolved
cuqi/distribution/_gmrf.py Outdated Show resolved Hide resolved
cuqi/distribution/_gmrf.py Outdated Show resolved Hide resolved
cuqi/utilities/_utilities.py Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
Copy link
Contributor

@amal-ghamdi amal-ghamdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @nabriis for this rich feature. Really impressive work you have done here that I enjoyed reviewing and learning from!.

I have some comments here and there for your consideration.

cuqi/distribution/_cmrf.py Outdated Show resolved Hide resolved
cuqi/distribution/_distribution.py Outdated Show resolved Hide resolved
cuqi/distribution/_distribution.py Outdated Show resolved Hide resolved
cuqi/distribution/_gamma.py Outdated Show resolved Hide resolved
cuqi/distribution/_gmrf.py Outdated Show resolved Hide resolved
cuqi/randomvariable/_randomvariable.py Show resolved Hide resolved
cuqi/randomvariable/_randomvariable.py Outdated Show resolved Hide resolved
cuqi/randomvariable/_randomvariable.py Show resolved Hide resolved
cuqi/randomvariable/_randomvariable.py Outdated Show resolved Hide resolved
cuqi/randomvariable/_randomvariable.py Show resolved Hide resolved
@nabriis nabriis force-pushed the sprint20_randomvariables_ast branch from 6ab12d9 to 3d73813 Compare August 15, 2023 23:11
@nabriis
Copy link
Collaborator Author

nabriis commented Aug 15, 2023

Need to decide if .rv should even be exposed at this point in time.

@nabriis nabriis requested a review from jakobsj November 18, 2023 21:29
@nabriis
Copy link
Collaborator Author

nabriis commented Nov 18, 2023

Hello @jakobsj, @amal-ghamdi, @chaozg. I updated the code according to Jakobs comments. I also reverted the to_cuqi_format change, which greatly simplified the additional code to be reviewed (e.g. changes to specific distributions etc.).

Copy link
Contributor

@jakobsj jakobsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @nabriis, I've run through the tutorial again and really like the updates you have made. Most points are now addressed. Running through did result in a number of new questions/points, not sure if bugs, please see specific comments. I also resolved many of the conversations. I will take a look at the remaining ones. I haven't got more time today so I think it makes sense if you can address this round of comments, and any from others, then after that I can take a look again, including at other parts of the code.

demos/tutorials/RandomVariables.py Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Outdated Show resolved Hide resolved
demos/tutorials/RandomVariables.py Show resolved Hide resolved
@jakobsj
Copy link
Contributor

jakobsj commented Nov 20, 2023

I have gone through another time and resolved conversations that had been addressed. There are four old ones, I think all having to do with if_dist_force_rv. I think this functionality has changed back and forth to to_cuqi_array or similar, and I'd like to understand these changes and behavior better before resolving these.

@nabriis
Copy link
Collaborator Author

nabriis commented Nov 20, 2023

I have gone through another time and resolved conversations that had been addressed. There are four old ones, I think all having to do with if_dist_force_rv. I think this functionality has changed back and forth to to_cuqi_array or similar, and I'd like to understand these changes and behavior better before resolving these.

Thanks @jakobsj. I updated once again. In relation to the force_ndarray and to_cuqi_format these methods are used to have inputs to distributions conform to the expected format. After removing to_cuqi_format again, we are currently only forcing ndarray in scalar-type inputs. For example if the user creates a Gaussian like this:

X = Gaussian(0, 1)

both the mean and covariance will become ndarrays.

The reason to generalize to to_cuqi_format would be to add e.g. error handling for common pitfalls. For example:

# A Gaussian distribution with conditioning variable x (from cov) is created in line two:
x = Gaussian(0, 1).rv
y = Gaussian(0, x).rv

The above is as expected. However, in the following case we get odd behavior that we would like to catch

# A Gaussian distribution with a ??Gaussian distribution?? as cov is created in line two:
x = Gaussian(0, 1) # Forgot to put .rv
y = Gaussian(0, x).rv

This behavior was also present in current CUQIpy. We could catch this mistake by parsing inputs and if they were a Distribution object, throw error with helpful message asking if the user forgot .rv using a more general method named e.g. to_cuqi_format or parse_input or cuqify_input

Originally I had a if_dist_force_rv method that called .rv on distributions given as input to other distributions, but I now rather think we should throw an error with a helpful message instead.

@nabriis nabriis requested a review from jakobsj November 20, 2023 21:20
Copy link
Contributor

@amal-ghamdi amal-ghamdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @nabriis again for this carefully and creatively designed PR, powerful and impressive work!

I left some comments throughout, feel free to address or immediately resolve those as it makes sense to you.

I have few general thoughts for your consideration:

  • I thought we can make par_name more specific and clear if we call it rv_name, if that is applicable in all cases.
  • I would assume we need issues to update other demo/plugin repos to address this change.

Well done!

demos/tutorials/Gibbs.py Outdated Show resolved Hide resolved
Comment on lines 434 to 435
# The following methods define algebraic operations on distributions
# The return type is a RandomVariable instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these two comments outdated? if not, it seems a bit unclear

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Removed.

Comment on lines 46 to 53
@property
def _par_name(self):
""" Return name of likelihood """
return self.distribution._par_name

@_par_name.setter
def _par_name(self, value):
self.distribution._par_name = value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious if the property par_name is sufficient here? May be I am missing why we need property _par_name

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This is now removed and it helped simplify some checks in RV class, which checked for the private property _par_name where it should not have done so.

self._name = _get_python_variable_name(self)
return self._name
if self._par_name is None:
self._par_name = _get_python_variable_name(self) # TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have an issue #184 about refactoring UserDefinedLikelihood. I am not sure what the TODO here refer to but thought may be it can be added to that issue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its related. Currently the UserDefinedLikelihood can infer python variable names where normally after this change it was supposed to only be RV class that does that. However, we have some use cases, where users define their UserDefinedLikelihood using e.g.
y = UserDefinedLikelihood(...)

Assuming it gets the name y. I think we can add this info into the issue you mentioned.

cuqi/density/_density.py Outdated Show resolved Hide resolved
def get_conditioning_variables(self):
""" Get conditioning variables. """
if self.is_transformed:
raise NotImplementedError("Extracting conditioning variables is not implemented for transformed random variables")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you see this as high priority issue or not necessarily? if so, we can create an issue to address it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should generate a tracking issue to track all of the features we have disabled and aim to create in the future for RV class. There are more than thus, but if we create the issue we can add bullets to it.

Comment on lines +271 to +272
new_variable._distributions = copy(self.distributions)
new_variable._tree = copy(self._tree)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these two lines needed? Don’t they happen automatically after first line? not sure though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not, it is a shallow copy. So copying here ensures we get a new pointer to the distributions, meaning if we change internal parameters it only affects this new copied RV.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think of it as a 2-level deep copy.

# Providing data to the Bayesian problem creates a posterior as shown.
#

BP.set_data(y=y_obs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment here, the resulting distribution BP is JointDistribution. I think this is expected because our posterior is very specific and not general to handle this case. Do we need/have an issue for that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't recall. But yes its not a "Posterior" class, but a JointDistribution representing a posterior over joint set of variables

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented a bit with the rvs, and one scenario in inferring the distribution name came to mind (below). Depending on the flag, z2 name will be different. Is that something we want to capture in an issue? Also, is this is something we should prevent by not allowing creating more than one rv from a distribution (kind of 1 to 1 association in terms of class relationship).

Z_s = cuqi.distribution.Gaussian(0, lambda s: s)
Z = Z_s(s=3)
z = Z.rv
flag = False
if flag:
    print(z.name)
z2 = Z.rv
print(z2.name)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point here. I actually updated now so that .rv creates a copy of the distribution and does NOT introduce side effects to the original variables. Thus Z_s and Z in the above example does not get their parameter name updated. Instead its updated in z.dist.par_name.


# %%
######################################################################
# Direct evaluation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be we can clarify to the user here or in some other place in the tutorial that using () operator means different things when we talk about X and x, X(1) is logpdf and X.rv(1) is the evaluation as described here.

@jakobsj
Copy link
Contributor

jakobsj commented Nov 22, 2023

I believe I have now understood the points about forcing rv, etc, and I agree that in the spirit of being clear about when something is an RV and a distb respectively, we should not convert automatically between the two behind the scenes. Instead, as you suggest suitable error messages should be given when a user tries to mix.

@jakobsj
Copy link
Contributor

jakobsj commented Nov 22, 2023

I have spent the morning thinking more about distributions, random variables and the "call" syntax. I am not sure if this is the best place to capture these, but for now I will put them here. Please note, that I honestly feel the random variable is an amazing addition, and I am only giving it so much scrutiny, because it introduces such a fundamental change in the usage of CUQIpy, that we want to make sure everything remains (or becomes!) consistent.

This will be the first part, outlining some thoughts first on "call" syntax with only distributions, followed by using only random variables. I expect to follow up later (but out of time for now), with additional thoughts on potential cases of combining the two. Sorry if it reads slightly rambling at times, but I thought I'd sent it now, rather than let it sit until I have time to streamline it more.

Only distributions

Before introduction of RVs, we have distributions

X = Gaussian(0,1)

One can do

X(4)
EvaluatedDensity([-8.91893853])

in which the “call” syntax amounts to returning the scalar value of the pdf at the input value.

We can have a “simple” conditional distribution but eg omitting an input

Y = Gaussian(0)
Y(2) # or Y(cov=2)
CUQI Gaussian.

In which the “call” syntax amounts to conditioning on cov=2, and returning the resulting (unconditional) distribution.

Similarly using lambda function can set up more involved conditional distributions such as
Z = Gaussian(lambda t: 1/t)
Z(4)
CUQI Gaussian.

Two issues here:

  1. The “call” syntax means two different things, depending on whether used with an unconditional or a conditional distribution. In the extreme, one can in fact employ both in a single call
    Y(4)(3)
    which first conditions on cov=4, then evaluates the pdf at 3. If one knows what is going on, this can be understood, but is probably very confusing to users.

  2. For the conditional distribution with lambda function, it is unclear what
    Z(4)
    refers to, cov or t. Turns out one can in fact call with either! Followed by evaluation, we can see we get different results:
    Z(cov=4)(0)
    EvaluatedDensity([-1.61208571])

Z(t=4)(0)
EvaluatedDensity([-0.22579135])

and doing this without keyword we see that Z(4) corresponds to Z(t=4):
Z(4)(0)
EvaluatedDensity([-0.22579135])
as we get the same value.

It surprises me that using “cov” is permitted, when we have explicitly made Y conditional on t, instead of cov. Is that what we want?

Ideas to consider here: Replace both call syntax by more explicit syntax. And consider disallowing the usage of “cov” in conditioning when lambda function is used. And/or disallow positional inputs, allowing only keyword based for clarity.

Only random variables:

Introducing random variables (RVs):

X = Gaussian(0, 1)
x = X.rv

we can now use “call” syntax (of the RV) to do
x(4) # or x(x=4)
producing
4
because this is “direct evaluation”, meaning we fix x at 4, which means the value of x is indeed 4, ignoring that it also happens to be a Gaussian random variable.

For a transformed RV, eg.
y = (x + 10)**2
using the “call” syntax is consistent with this (and a bit clearer), since it also simply fixes the value of x, e.g.,
y(x=3)
produces 169, again ignoring that x is a Gaussian random variable.

Conceptually, this appears very similar to conditioning for distributions, in the sense that we fix one and given that evaluate the other. Except the transformed variable here is doing algebraic functions, not feeding through a distribution. But we do want to use RVs to specify relations between Rvs/distributions, not just algebraic manipulations. So we want to do eg
a = Gaussian(0,1).rv
b = Gaussian(0, 1/a).rv
which makes both RVs:
a ~ CUQI Gaussian.
b ~ CUQI Gaussian. Conditioning variables ['a'].
What happens if we use call syntax?
a(5) # or a(a=5)
5 which is consistent with the x above.
Same for b, which completely disregards that it is conditional on a:
b(5) # or b(b=5):
5
We get an error if we try
b(a=5)
ValueError: Expected arguments ['b'], got arguments {'a': 5}
So we cannot use “call” syntax to “directly evaluate” b for a fixed value of a, as in the case of y given a value of x above. This is probably because y only used algebraic operations, while b is a conditional distribution w.r.t. a. Currently, one then needs to do
b.condition(a=5)
yielding
b ~ CUQI Gaussian.
So now it has been conditioned on a. Maybe this usage is indeed not “direct evaluation”, since we end up with a fully specified distribution, not a value as in direct evaluation. But in general, it seems a subtle difference, and I could imagine more involved combinations of algebraic expressions and conditional distribution RVs for defining an RV. As well as wanting to call b.sample() and expect it to draw a sample from a, then use that in b. That gives an error at present, but mathematically nothing prevents us from expecting this to work.

An additional aspect is that before conditioning we have
print(b)
b ~ CUQI Gaussian. Conditioning variables ['a'].
but above, and also if we assign to, say c, we still get “b” printed:
c = b.condition(a=5)
print(c)
b ~ CUQI Gaussian.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants