Problem with fitting DPsynthpop #1

Open
haaksb opened this issue Oct 14, 2022 · 0 comments
haaksb commented Oct 14, 2022

Hi

I have been testing dpart, and it looks very interesting. However, I have run into a problem when calling .fit on DPsynthpop. Here is the code:

from dpart.engines import DPsynthpop

# df is the pandas DataFrame holding the data to be synthesized
pb_model = DPsynthpop(epsilon=0.4)
pb_model.fit(df)

Error:

File c:\Users\hakonbo\Anaconda3\lib\site-packages\diffprivlib\models\logistic_regression.py:224, in LogisticRegression.fit(self, X, y, sample_weight)
    220     warnings.warn("Data norm has not been specified and will be calculated on the data provided.  This will "
    221                   "result in additional privacy leakage. To ensure differential privacy and no additional "
    222                   "privacy leakage, specify `data_norm` at initialisation.", PrivacyLeakWarning)
    223     self.data_norm = np.linalg.norm(X, axis=1).max()
--> 224 X = self._clip_to_norm(X, self.data_norm)
    226 self.multi_class = _check_multi_class(self.multi_class, solver, len(self.classes_))
    228 n_classes = len(self.classes_)
...
--> 159     raise ValueError(f"Clip value must be strictly positive, got {clip}.")
    161 norms = np.linalg.norm(array, axis=1) / clip
    162 norms[norms < 1] = 1

ValueError: Clip value must be strictly positive, got 0.0.

I have traced the error to the self.root handling in dpart.py, which creates a column containing only zeros; that column then triggers the error. I don't quite understand its function, or how the first column in the data set is supposed to be synthesized, so that is as far as I got. I would appreciate any help.
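
To illustrate why an all-zero column leads to this message, here is a minimal sketch that uses only NumPy. It does not call dpart or diffprivlib; `clip_to_norm_check` is a hypothetical stand-in that copies the check and the two lines visible in the traceback above, and the final scaling step is my assumption about what follows them.

```python
import numpy as np


def clip_to_norm_check(X, clip):
    """Hypothetical stand-in for the clipping step shown in the traceback."""
    # Same condition that produces the error message in the traceback.
    if clip <= 0:
        raise ValueError(f"Clip value must be strictly positive, got {clip}.")
    # These two lines mirror lines 161-162 of the traceback.
    norms = np.linalg.norm(X, axis=1) / clip
    norms[norms < 1] = 1
    # Assumption: rows are scaled down so that no row norm exceeds `clip`.
    return X / norms[:, np.newaxis]


# A feature matrix made of a single all-zero column, as described above.
X = np.zeros((5, 1))

# Same expression as line 223 of the traceback: the largest row norm.
data_norm = np.linalg.norm(X, axis=1).max()

try:
    clip_to_norm_check(X, data_norm)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(data_norm)
print(error_message)
```

Because every row norm of an all-zero matrix is zero, the computed data norm is zero as well, and the positivity check fails before any model is fitted.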

On a separate point, it seems that there is supposed to be a "method" option that determines how each column is fitted, but I don't understand how to use it and can't find any documentation on it. I would appreciate some pointers on that as well.

Hope you can take the time to help me.
