Problems with COWs #9

Open
Bilokin opened this issue Jul 14, 2023 · 2 comments

@Bilokin

Bilokin commented Jul 14, 2023

Hello,

I am considering using COWs for a case where the discriminating variable correlates with the control variable. However, I have encountered a problem when the discriminating variable takes negative values.
Even if only the lower bound of the mrange argument is negative, all values of the W-matrix are nan and the computation fails with the following exception:

/cvmfs/belle.cern.ch/el7/externals/v01-12-01/Linux_x86_64/common/lib/python3.8/site-packages/sweights/cow.py:112: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
  If increasing the limit yields no improvement it is advised to analyze 
  the integrand in order to determine the difficulties.  If the position of a 
  local difficulty can be determined (singularity, discontinuity) one will 
  probably gain from splitting up the interval and calling the integrator 
  on the subranges.  Perhaps a special-purpose integrator should be used.
  N = quad(f, *self.mrange)[0]
/cvmfs/belle.cern.ch/el7/externals/v01-12-01/Linux_x86_64/common/lib/python3.8/site-packages/sweights/cow.py:124: RuntimeWarning: divide by zero encountered in divide
  return self.gk[k](m) * self.gk[j](m) / self.Im(m)
Initialising COW:
/cvmfs/belle.cern.ch/el7/externals/v01-12-01/Linux_x86_64/common/lib/python3.8/site-packages/sweights/cow.py:127: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
  If increasing the limit yields no improvement it is advised to analyze 
  the integrand in order to determine the difficulties.  If the position of a 
  local difficulty can be determined (singularity, discontinuity) one will 
  probably gain from splitting up the interval and calling the integrator 
  on the subranges.  Perhaps a special-purpose integrator should be used.
  return quad(integral, self.mrange[0], self.mrange[1])[0]
    W-matrix:
	[[nan nan]
	  [nan nan]]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [291], line 10
      8 mrange = (binning[0], binning[-1])
      9 Im = 1
---> 10 cw = Cow(mrange, sig_pdf_obj, bkg_pdf_obj, Im, renorm=True, verbose=True)
     11 sweighter = SWeight(test_mc_df[c_var].values,
     12                     [sig_pdf_obj,bkg_pdf_obj],
     13                     [len(test_mc_df)*1.0-len(app_bkg_df),len(app_bkg_df)*1.],
   (...)
     16                     compnames=('sig','bkg'),
     17                     verbose=True, checks=False )

File /cvmfs/belle.cern.ch/el7/externals/v01-12-01/Linux_x86_64/common/lib/python3.8/site-packages/sweights/cow.py:105, in Cow.__init__(self, mrange, gs, gb, Im, obs, renorm, verbose)
    102     print("\t" + str(self.Wkl).replace("\n", "\n\t "))
    104 # invert for Akl matrix
--> 105 self.Akl = linalg.solve(self.Wkl, np.identity(len(self.Wkl)), assume_a="pos")
    106 if verbose:
    107     print("    A-matrix:")

File ~/.local/lib/python3.8/site-packages/scipy/linalg/_basic.py:140, in solve(a, b, sym_pos, lower, overwrite_a, overwrite_b, check_finite, assume_a, transposed)
    137 # Flags for 1-D or N-D right-hand side
    138 b_is_1D = False
--> 140 a1 = atleast_2d(_asarray_validated(a, check_finite=check_finite))
    141 b1 = atleast_1d(_asarray_validated(b, check_finite=check_finite))
    142 n = a1.shape[0]

File ~/.local/lib/python3.8/site-packages/scipy/_lib/_util.py:287, in _asarray_validated(a, check_finite, sparse_ok, objects_ok, mask_ok, as_inexact)
    285         raise ValueError('masked arrays are not supported')
    286 toarray = np.asarray_chkfinite if check_finite else np.asarray
--> 287 a = toarray(a)
    288 if not objects_ok:
    289     if a.dtype is np.dtype('O'):

File ~/.local/lib/python3.8/site-packages/numpy/lib/function_base.py:627, in asarray_chkfinite(a, dtype, order)
    625 a = asarray(a, dtype=dtype, order=order)
    626 if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
--> 627     raise ValueError(
    628         "array must not contain infs or NaNs")
    629 return a

ValueError: array must not contain infs or NaNs

This result can be reproduced with the tutorial notebook by setting mrange to (-1,1).

@Bilokin
Author

Bilokin commented Jul 17, 2023

Tagging @matthewkenzie

@matthewkenzie
Collaborator

Thank you for raising this. It is an unusual and interesting issue, and I will investigate it.

For now, perhaps I can suggest (a horrible hack, I know) shifting the entire distribution so that all x values are positive; this will not affect the weights.
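
The claim that a shift leaves the weights unchanged can be checked with a small standalone sketch. This is not the sweights code: it assumes the usual COW definitions with a constant variance function (Im = 1, as in the traceback above), namely W_kl as the integral of g_k g_l over the range and w_k(m) = sum_l A_kl g_l(m) with A the inverse of W. The toy signal and background shapes, the midpoint-rule integration, and all function names (`sig`, `bkg`, `shifted`, `w_matrix`, `invert_2x2`, `weights`) are made up for illustration.

```python
import math

LO, HI = -1.0, 1.0   # original range with a negative lower bound
SHIFT = 2.0          # moves the range to (1, 3)
SIGMA = 0.3          # width of the toy signal peak (hypothetical value)


def sig(m):
    """Toy signal: Gaussian centred at 0, normalised on [LO, HI]."""
    norm = SIGMA * math.sqrt(2 * math.pi) * math.erf(HI / (SIGMA * math.sqrt(2)))
    return math.exp(-0.5 * (m / SIGMA) ** 2) / norm


def bkg(m):
    """Toy background: linear density, normalised on [LO, HI]."""
    return (1.0 + 0.4 * m) / (HI - LO)


def shifted(pdf, c):
    """The same density expressed in the shifted variable m' = m + c."""
    return lambda m: pdf(m - c)


def w_matrix(pdfs, mrange, n=4000):
    """W_kl = integral of g_k * g_l over mrange (midpoint rule, Im = 1)."""
    lo, hi = mrange
    h = (hi - lo) / n
    mids = [lo + (i + 0.5) * h for i in range(n)]
    return [[h * sum(gk(m) * gl(m) for m in mids) for gl in pdfs] for gk in pdfs]


def invert_2x2(w):
    """Explicit inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = w
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def weights(m, pdfs, a):
    """w_k(m) = sum_l A_kl * g_l(m), again with Im = 1."""
    g = [pdf(m) for pdf in pdfs]
    return [sum(a_kl * g_l for a_kl, g_l in zip(row, g)) for row in a]


pdfs = [sig, bkg]
pdfs_shifted = [shifted(sig, SHIFT), shifted(bkg, SHIFT)]

a_orig = invert_2x2(w_matrix(pdfs, (LO, HI)))
a_shift = invert_2x2(w_matrix(pdfs_shifted, (LO + SHIFT, HI + SHIFT)))

# Compare signal/background weights at matching points of the two ranges.
for m in (-0.8, -0.1, 0.0, 0.5):
    print(m, weights(m, pdfs, a_orig), weights(m + SHIFT, pdfs_shifted, a_shift))
```

In this sketch the two weight lists agree up to floating-point rounding, because the shift is only a change of variable inside each integral. With the real package, the equivalent step would be to add the same constant to the data, to mrange, and to the location parameters of both pdfs before constructing the Cow object.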
