copy of MulensData instance #40

rpoleski · 2022-04-19T11:34:03Z

I'm writing code for cleaning the data. It involves masking and un-masking points as bad. I've found it quite complicated - one cannot do

dataset.bad[i] = True

because it doesn't change dataset.good. One has to do something like:

bad = dataset.bad.copy()
bad[i] = True
dataset.bad = bad

We can either change implementation of .bad and .good (I don't know how - @ketozhang do you know how?) or we can add a function that makes a copy of a MulensData object. @jenniferyee What do you think?

The text was updated successfully, but these errors were encountered:

jenniferyee · 2022-04-19T12:53:34Z

I agree that this is a problem.

Currently "bad" and "good" are lists or np.array or something similar. There must be a way to make a class that behaves the way it needs to. For example, a "BadData" class (or maybe the class should be "DataQuality"?) whose properties "bad" and "good" are accessed by MulensData.bad and MulensData.good . Then it would have some kind of indexing method that inherits from list?

rpoleski · 2022-04-19T20:02:45Z

.bad and .good are two np.arrays of bool values and we want one to be logical not of the other all the time. I'm guessing it would be possible to define a new class that inherits from np.array and whenever a value of one of these arrays is changed, then the other is recalculated. However, to work fully it will require defining all operators that make sense for np.array, which would be lots of work. Maybe we should add notes in docstrings (so that users don't do data.bad[i] = True) and add the function that makes a copy of the MulensData class

ketozhang · 2022-04-19T20:23:11Z

I don't see a reason to keep a separate arrays for MulensData::_good and MulensData::_bad if they're opposite of each other. Can you do something like this?

class MulensData:

  @property
  def good(self):
    return self._good
  
  @property
  def bad(self):
    return ~self._good
  
  @good.setter
  def good(self, value):
    self._good = value
  
  @bad.setter
  def bad(self, value):
    self._good = not value

rpoleski · 2022-04-20T13:11:59Z

We had something like that before. But it turned out that np.logical_not() is extremely slow (at least on some computers). It was dominating the run time in case of the fitting tests we performed. Maybe it would be best to return to the previous version and use self._good as @ketozhang suggested. As far as I remember, we used self._bad previously. I'm guessing self._bad is very rarely used - maybe only in some plotting functions.

ketozhang · 2022-04-20T18:13:31Z

Will have to take a look how it was used: shape of array and how often MulensData::bad was called.

Instead of @property and building your own cache, you could use functools caching to handle this. See this SO question

rpoleski · 2022-04-22T11:45:33Z

I've checked that the only place where MulensData.bad is used is plotting the dataset. There are a few calls in that function, but of course we could change that to a single call. Now I agree with Keto's comment 3 posts back.

ketozhang · 2022-04-25T16:33:45Z

The solution I propose won't solve the problem where we want to allow the users to mutate MulensModel::bad[i] = value

To me, I think it's trivial to get the inverse (i.e., ~MulensModel::good) and to remove the property MulensModel::bad altogether (or the reverse since I more often see bad flags stored than good ones) whenever the user needs it.

rpoleski · 2022-10-09T02:13:06Z

MulensData.copy() added in commit 2aff631. Issue not yet fully resolved.

jenniferyee · 2023-10-05T20:18:33Z

Added a use case in dd54d4e

rpoleski · 2023-10-06T21:00:10Z

I've checked that the solution from Keto's first reply in this thread is not solving calls like:

data.bad[12] = True

or

data.bad[np.where(data.mag < 13.)] = True

Though the same ones for data.good do work. The above ones don't work, but there is no error and no warning!

ketozhang · 2023-10-06T21:13:27Z

That's right, the first solution I proposed will not let you mutate bad.

I'm still inclined to suggest not providing your user good. If they want only goods, many familiar with numpy know how to filter out things from an array:

good = data[~data.bad]

ketozhang · 2023-10-06T23:16:38Z

FYI, giving the user access to only the bad is the same behavior implemented in numpy's masked array.
https://numpy.org/doc/stable/reference/maskedarray.generic.html#accessing-only-the-valid-entries

rpoleski added the enhancement label Apr 19, 2022

rpoleski self-assigned this Oct 5, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

copy of MulensData instance #40

copy of MulensData instance #40

rpoleski commented Apr 19, 2022

jenniferyee commented Apr 19, 2022

rpoleski commented Apr 19, 2022

ketozhang commented Apr 19, 2022 •

edited

Loading

rpoleski commented Apr 20, 2022

ketozhang commented Apr 20, 2022

rpoleski commented Apr 22, 2022

ketozhang commented Apr 25, 2022

rpoleski commented Oct 9, 2022

jenniferyee commented Oct 5, 2023 •

edited

Loading

rpoleski commented Oct 6, 2023

ketozhang commented Oct 6, 2023 •

edited

Loading

ketozhang commented Oct 6, 2023

copy of MulensData instance #40

copy of MulensData instance #40

Comments

rpoleski commented Apr 19, 2022

jenniferyee commented Apr 19, 2022

rpoleski commented Apr 19, 2022

ketozhang commented Apr 19, 2022 • edited Loading

rpoleski commented Apr 20, 2022

ketozhang commented Apr 20, 2022

rpoleski commented Apr 22, 2022

ketozhang commented Apr 25, 2022

rpoleski commented Oct 9, 2022

jenniferyee commented Oct 5, 2023 • edited Loading

rpoleski commented Oct 6, 2023

ketozhang commented Oct 6, 2023 • edited Loading

ketozhang commented Oct 6, 2023

ketozhang commented Apr 19, 2022 •

edited

Loading

jenniferyee commented Oct 5, 2023 •

edited

Loading

ketozhang commented Oct 6, 2023 •

edited

Loading