shape mismatch when setting data, prohibiting predictive sampling #6230

Closed
falkmielke opened this issue Oct 19, 2022 · 2 comments

falkmielke commented Oct 19, 2022

Description of your problem

When pm.set_data() is called with data that has a different number of rows than the original data, pm.sample_posterior_predictive() fails with a shape-mismatch ValueError.

Minimal example:

#!/usr/bin/env python3

import os as os
import numpy as np

import pymc as pm
import aesara as ae
import aesara.tensor as at

import matplotlib.pyplot as plt

print (pm.__version__, ae.__version__)

# data generation
n_observations = 99
slope = 3.

predictor = np.random.uniform(0., 1., n_observations)
residual = np.random.normal(0., 0.05, n_observations)
observable = predictor * slope + residual

# plt.scatter(predictor, observable, s=3, marker='o', edgecolor='k', facecolor='w', alpha=0.6)
# plt.show()


# model design
with pm.Model() as model:
    data = pm.Data('predictor', predictor, mutable = True)
    ones = pm.Data('epsilon', np.ones((n_observations, )), mutable = True)

    slope = pm.Normal( f'slope', mu = np.pi, sigma = 1.)

    estimator = at.dot(data, slope)
    # residual = pm.HalfCauchy('residual', 1.)
    residual = at.dot(ones, pm.HalfCauchy('residual', 1.))

    posterior = pm.Normal('posterior', mu = estimator, sigma = residual, observed = observable)


# inference
with model:
    trace = pm.sample(2**10)


# out-of-sample prediction
with model:
    pm.set_data({'predictor': 1.1*np.ones(7,)})
    prediction = pm.sample_posterior_predictive(trace)

Full traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/aesara/compile/function/types.py", line 971, in __call__
    self.vm()
  File "/usr/lib/python3.10/site-packages/aesara/graph/op.py", line 543, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 368, in perform
    smpl_val = self.rng_fn(rng, *(args + [size]))
  File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 166, in rng_fn
    return getattr(rng, self.name)(*args, **kwargs)
  File "_generator.pyx", line 1136, in numpy.random._generator.Generator.normal
  File "_common.pyx", line 594, in numpy.random._common.cont
  File "_common.pyx", line 511, in numpy.random._common.cont_broadcast_2
  File "__init__.pxd", line 741, in numpy.PyArray_MultiIterNew3
ValueError: shape mismatch: objects cannot be broadcast to a single shape.  Mismatch is between arg 0 with shape (99,) and arg 1 with shape (7,).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "PredictionShapeError.py", line 48, in <module>
    prediction = pm.sample_posterior_predictive(trace)
  File "/usr/lib/python3.10/site-packages/pymc/sampling.py", line 2022, in sample_posterior_predictive
    values = sampler_fn(**param)
  File "/usr/lib/python3.10/site-packages/pymc/util.py", line 366, in wrapped
    return core_function(**input_point)
  File "/usr/lib/python3.10/site-packages/aesara/compile/function/types.py", line 984, in __call__
    raise_with_op(
  File "/usr/lib/python3.10/site-packages/aesara/link/utils.py", line 534, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "/usr/lib/python3.10/site-packages/aesara/compile/function/types.py", line 971, in __call__
    self.vm()
  File "/usr/lib/python3.10/site-packages/aesara/graph/op.py", line 543, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 368, in perform
    smpl_val = self.rng_fn(rng, *(args + [size]))
  File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 166, in rng_fn
    return getattr(rng, self.name)(*args, **kwargs)
  File "_generator.pyx", line 1136, in numpy.random._generator.Generator.normal
  File "_common.pyx", line 594, in numpy.random._common.cont
  File "_common.pyx", line 511, in numpy.random._common.cont_broadcast_2
  File "__init__.pxd", line 741, in numpy.PyArray_MultiIterNew3
ValueError: shape mismatch: objects cannot be broadcast to a single shape.  Mismatch is between arg 0 with shape (99,) and arg 1 with shape (7,).
Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F2ABD7743C0>), TensorConstant{(1,) of 99}, TensorConstant{11}, Elemwise{mul,no_inplace}.0, Elemwise{mul,no_inplace}.0)
Toposort index: 4
Inputs types: [RandomGeneratorType, TensorType(int64, (1,)), TensorType(int64, ()), TensorType(float64, (None,)), TensorType(float64, (None,))]
Inputs shapes: ['No shapes', (1,), (), (7,), (99,)]
Inputs strides: ['No strides', (8,), (), (8,), (8,)]
Inputs values: [Generator(PCG64) at 0x7F2ABD7743C0, array([99]), array(11), 'not shown', 'not shown']
Outputs clients: [['output'], ['output']]

HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
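
The final ValueError is raised by NumPy's Generator.normal, so the same message can be reproduced without PyMC or Aesara. The sketch below is an illustration only: it assumes the failing node effectively calls normal(mu, sigma, size) with the shapes from the "Inputs shapes" line above (size [99], mu of length 7, sigma of length 99). The helper name `try_normal` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def try_normal(loc, scale, size=None):
    """Return the shape of the draw, or the error text if NumPy cannot broadcast."""
    try:
        return rng.normal(loc, scale, size=size).shape
    except ValueError as err:
        return str(err)


# Shapes copied from the traceback: the size is still 99 (the original
# number of observations), mu follows the new predictor (7 entries) and
# sigma follows the old epsilon (99 entries).
message = try_normal(np.ones(7), np.ones(99), size=(99,))
print(message)
```

If this assumption holds, "arg 0 with shape (99,)" in the message appears to correspond to the requested output size rather than to mu or sigma, which would explain why the error persists even when the two parameter arrays agree with each other.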

Additional information:
I am on my yearly return to this problem, trying to do out-of-sample prediction on a complex linear multivariate model with LKJ priors. I had reported issues before here, and everything temporarily worked in a 4.0 beta due to this and then this PR.
However, returning to my project with the current versions, I encountered the seemingly familiar shape conflicts.
What's worse, I could condense this down: it even happens in the minimal example provided above. You will find that I have actually tried to control the shape of all model components using tensor multiplication (at.dot); however, the Aesara traceback still contains a wrong input shape (the last one, (99,)).
This issue might be related, but that is not clear.

Versions and main components

  • PyMC/PyMC3 Version: 4.2.1
  • Aesara/Theano Version: 2.8.7
  • Python Version: 3.10.8
  • Operating system: linux 6.0.2-arch1-1
  • How did you install PyMC/PyMC3: pip

ricardoV94 (Member) commented Oct 20, 2022

There are two issues going on:

  1. You also need to update the shape of epsilon, otherwise it is incompatible with the new predictor.
  2. You need to give PyMC more information about the shape of the observed variable, otherwise it assumes the shape is the same as that of the observed data.
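
The two points can be mimicked in plain NumPy. This is a rough analogy rather than what PyMC does internally: `draw_observed` is a hypothetical stand-in for the likelihood, the slope and residual values are placeholders, and the `size` argument plays the role of the shape that PyMC infers for the observed variable.

```python
import numpy as np

rng = np.random.default_rng(1)


def draw_observed(predictor, epsilon, size, slope=3.0, residual=0.05):
    """NumPy stand-in for Normal(mu=predictor * slope, sigma=epsilon * residual)."""
    mu = predictor * slope
    sigma = epsilon * residual
    try:
        return rng.normal(mu, sigma, size=size).shape
    except ValueError:
        return "shape mismatch"


new_predictor = 1.1 * np.ones(7)

# Neither fix: epsilon still has 99 entries and the size is still 99.
only_predictor = draw_observed(new_predictor, np.ones(99), size=(99,))

# Point 1 only: epsilon is resized, but the size stays pinned to 99.
with_epsilon = draw_observed(new_predictor, np.ones(7), size=(99,))

# Points 1 and 2: the size follows the predictor, analogous to passing
# shape=estimator.shape to the likelihood.
with_both = draw_observed(new_predictor, np.ones(7), size=new_predictor.shape)

print(only_predictor, with_epsilon, with_both)
```

Only the last call succeeds, which is why both changes are needed together.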

The following works on my end:

import numpy as np

import pymc as pm
import aesara as ae
import aesara.tensor as at


n_observations = 99
slope = 3.

predictor = np.random.uniform(0., 1., n_observations)
residual = np.random.normal(0., 0.05, n_observations)
observable = predictor * slope + residual

# model design
with pm.Model() as model:
    data = pm.Data('predictor', predictor, mutable = True)
    ones = pm.Data('epsilon', np.ones((n_observations, )), mutable = True)

    slope = pm.Normal( f'slope', mu = np.pi, sigma = 1.)

    estimator = at.dot(data, slope)
    residual = at.dot(ones, pm.HalfCauchy('residual_', 1.)) 

    posterior = pm.Normal('posterior', mu = estimator, sigma = residual, shape=estimator.shape, observed=observable)


# inference
with model:
    trace = pm.sample(2**10)

# out-of-sample prediction
with model:
    pm.set_data({'predictor': 1.1*np.ones((7,)), 'epsilon': np.ones((7,))})
    prediction = pm.sample_posterior_predictive(trace)
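
As a sanity check on the expected result shape, the same out-of-sample prediction can be sketched by hand in NumPy. This is not a replacement for pm.sample_posterior_predictive: the arrays `slope_draws` and `residual_draws` are made-up stand-ins for posterior samples (in practice they would be taken from the trace), and all names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up stand-ins for posterior draws of 'slope' and 'residual_'.
n_draws = 1000
slope_draws = rng.normal(3.0, 0.02, size=n_draws)
residual_draws = np.abs(rng.normal(0.05, 0.005, size=n_draws))

# New predictor values, matching the set_data call above.
new_predictor = 1.1 * np.ones(7)

# One predictive draw per posterior sample and per new predictor value.
mu = slope_draws[:, None] * new_predictor[None, :]  # shape (n_draws, 7)
sigma = residual_draws[:, None]                     # shape (n_draws, 1), broadcasts over columns
manual_prediction = rng.normal(mu, sigma)

print(manual_prediction.shape)  # (1000, 7)
```

The trailing dimension of 7 is the part to compare against: with the fix above, the predictive samples should have one entry per new predictor value rather than per original observation.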

falkmielke (Author) commented

Thank you, @ricardoV94! I think I had the "epsilon" update at some point, but the "shape" argument on the observed variable is exactly the trick I was missing.
