feat: add FHE training deployment #665

jfrery · 2024-05-01T17:37:27Z

closes https://github.com/zama-ai/concrete-ml-internal/issues/4373
closes https://github.com/zama-ai/concrete-ml-internal/issues/4454

Proposition:

from concrete.ml.deployment import FHEModelDev

# Save the training FHE circuit for production 
fhe_dev = FHEModelDev("fhe_training_sgd", sgd_clf_binary_fhe)
fhe_dev.save(mode="training|inference")

# No change on the client and server side

Check the notebook to see how to use it. Basically, it's the same as before for the user. We only add a single parameter in the fhe_dev.save

src/concrete/ml/deployment/fhe_client_server.py

src/concrete/ml/pytest/utils.py

tests/deployment/test_client_server.py

andrei-stoian-zama

The notebook seems a bit loaded..

I think we could:

move the plot_decision boundary functions to some file in utils.
why do we need two plot_decision_boundary functions?
only show the decision boundary graph at iteration 0 and the last iteration

Next, we should showcase Deployment first:

"Training on encrypted data in production": export the trainer with the FheModelDev/etc, load client/server, etc..
"Develop a Logistic regression trainer for encrypted data" -> here we show the simulation part

tests/deployment/test_client_server.py

src/concrete/ml/deployment/fhe_client_server.py

RomanBredehoft

thanks ! I have several comments and observations

jfrery · 2024-05-23T20:24:16Z

The notebook seems a bit loaded..

I agree.

I think we could:

* move the plot_decision boundary functions to some file in `utils`.

* why do we need two `plot_decision_boundary` functions?

One uses the weigths/bias and the other uses the concrete-ml .predict(). I will make a single one.

* only show the decision boundary graph at iteration 0 and the last iteration

I will do this if I can't find anything better. I agree 10 plots is too much. But I want to show that the model is learning.

Next, we should showcase Deployment first:

1. "Training on encrypted data in production": export the trainer with the FheModelDev/etc, load client/server, etc..

2. "Develop a Logistic regression trainer for encrypted data" -> here we show the simulation part

Feels weird to me to show first the production and then how you develop the model which will eventually go to production.

I agree with most of your remark. Maybe we should create a new notebook for production only and refer to the development one.

RomanBredehoft

thanks a lot for this !

script/make_utils/check_pytest_determinism.sh

src/concrete/ml/deployment/fhe_client_server.py

src/concrete/ml/pytest/utils.py

src/concrete/ml/quantization/quantized_module.py

RomanBredehoft · 2024-05-29T12:15:17Z

src/concrete/ml/sklearn/linear_model.py

@@ -687,13 +680,7 @@ def fit(  # type: ignore[override]

        # If the model should be trained using FHE training
        if self.fit_encrypted:
-            if fhe is None:
-                fhe = "disable"
-                warnings.warn(


no this warning was made on purpose, I think we should keep it

the idea if that users should set fhe to something if they activate fit_encrypted to be sure they know what they are doing

the reason why fhe is not defaulted to disable is to avoid the ambiguous situation of having fit_encrypted to false and fhe to disable, which wouldn't make much sense since training would be done in floating points with sklearn

cc @fd0r

Maybe I miss something but this didn't really make sense to me. If we have a warning here then why not for all our models in the predict?

avoid the ambiguous situation of having fit_encrypted to false and fhe to disable, which wouldn't make much sense since training would be done in floating points with sklearn

If fit_encrypted = False then we have this:

concrete-ml/src/concrete/ml/sklearn/linear_model.py

Lines 705 to 710 in 4f75979

if fhe is not None:

raise ValueError(

"Parameter 'fhe' should not be set when FHE training is disabled. Either set it to "

"None for floating point training in the clear or set 'fit_encrypted' to True when "

f"initializing the model. Got {fhe}."

)

Or what do I miss?

ah yes you right about the second case, good thing then !

but the reason we don't have this in the predict methods is simply because there is not ambiguity there, we only have one parameter related to fhe execution : fhe (and it can't be None). Whereas with encrypted training, we have this additional fit_encrypted in the init.

so more or less I believe the idea here was to better determine the fhe parameter's role, by having something like :

None : float clear

disable | simulate | fhe : the usual

so when a user sets fit_encrypted, it's better to make sure he's aware of what mode he's using

There is no notion of fhe= if it's a training in the clear and we have a assert on this. To me, fhe="disable" should be the default if users use fit_encrypted = True.

I don't think that the warning "Parameter 'fhe' isn't set while FHE training is enabled.\n Defaulting to '{fhe=}'" is of any help to the user.

If the API is really confusing, having warning to try to explain isn't the right direction. Instead we should rethink the API.

well iirc there was some discussion on this in particular when the API was made and this is how we decided to go for, maybe @fd0r can tell us more about it !

Yes @RomanBredehoft got it right.

We don't want disable by default since it wouldn't make sense if fit_encrypted=False.
But not providing a value for fhe if fit_encrypted=True is also "dangerous". The idea here was to make sure that the user has to provide some parameter for fhe in this case.

Now that I'm re-reading that I would have put an Exception instead of a warning here.

I don't think we should remove the warning.

We don't want disable by default since it wouldn't make sense if fit_encrypted=False.

But if fit_encrypted=False. and fhe != None, we raise an error so this is not possible.

But not providing a value for fhe if fit_encrypted=True is also "dangerous"

I really don't understand why this is "dangerous". For the inference, disable is the default. Why would it be dangerous to do it for training?

Also, what should the user learn from "Parameter 'fhe' isn't set while FHE training is enabled.\n Defaulting to '{fhe=}'"?

I really don't understand why this is "dangerous". For the inference, disable is the default. Why would it be dangerous to do it for training?

Because we would be changing the default from the signature mostly

Also, what should the user learn from "Parameter 'fhe' isn't set while FHE training is enabled.\n Defaulting to '{fhe=}'"?

Mainly that the default value is changing is shouldn't really happen. And that the prefered way is to specify this argument.

Alright 🏳️ well I will add back the warnings and tests for it.

RomanBredehoft · 2024-05-29T12:15:39Z

src/concrete/ml/sklearn/linear_model.py

@@ -763,13 +750,7 @@ def partial_fit(
        # A partial fit is similar to a fit with a single iteration. The slight differences between
        # both are handled in the encrypted method when setting `is_partial_fit` to True.
        if self.fit_encrypted:
-            if fhe is None:


same as above

src/concrete/ml/pytest/utils.py

tests/deployment/test_client_server.py

RomanBredehoft

thanks for the feature !! lots of cleaning, which is great. I have several questions and observations

fd0r · 2024-05-30T12:48:23Z

docs/advanced_examples/LogisticRegressionTraining.ipynb

Can the deployment be done without any call to the fit method?

We should be able to deploy with only the batch size, the number of features and the batch size as inputs.

Also we should probably take composition into account such that n-iter is included in the query.

Can the deployment be done without any call to the fit method?

We should be able to deploy with only the batch size, the number of features and the batch size as inputs.

I just use the existing way of creating a training FHE circuit. Do we have something else than the _fit_encrypted method to create such circuit? Or are you proposing to wrap this into another method within the FHEModel API?

That being said, I definitely agree that having to instantiate fhe circuit for deployment with a call to fit is super weird...

Also we should probably take composition into account such that n-iter is included in the query.

I am not sure where the n-iter would belong but I believe it's a parameter that needs to be sent to the server so I would say it would need to be sent at the same time as the serialized encrypted values. I don't think it has anything to do with the FHEModel API unless we start implementing the communication protocol.

I created this for the first point https://github.com/zama-ai/concrete-ml-internal/issues/4466

Thanks a lot for these changes, let's address the rest in a follow-up PR.

fd0r

Sorry for reviewing this so late.

I have a few comments on the API design.

IMHO we should consider composition for this, or merge as is and improve upon that, but it would be easier if it was already taken into account.

Also we shouldn't have to call the fit function before deploying imo. It's true that we would have to take the inputs ranges into account, the batch-size and number of features and targets.

fd0r

Let's iterate on that! Thanks a lot for taking care of this!

fd0r · 2024-06-03T07:32:59Z

src/concrete/ml/sklearn/linear_model.py

-                    f"Defaulting to '{fhe=}'",
-                    stacklevel=2,
-                )
+            fhe = "disable" if fhe is None else fhe


This wasn't changed back

Should be good

fd0r

The warning is still removed.

github-actions · 2024-06-03T09:05:23Z

Coverage passed ✅

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name    Stmts   Miss  Cover   Missing
-------------------------------------
TOTAL    7622      0   100%

59 files skipped due to complete coverage.

RomanBredehoft

looks good, thanks a lot ! one quick question: how is it going to work once we integrate composition with training (#660) ?

cla-bot bot added the cla-signed label May 1, 2024

jfrery force-pushed the feat/training_deployment branch 3 times, most recently from dd6fc4c to 3fa5d75 Compare May 23, 2024 14:12

jfrery marked this pull request as ready for review May 23, 2024 14:12

jfrery requested a review from a team as a code owner May 23, 2024 14:12