Serialise, deserialise; for pickle, deepcopy #72
Comments
We also believe this is a vital feature to have. We aim to provide serialization ASAP, but it requires planning a serialization strategy and a discussion around different possibilities, like the ones you mentioned above, which is something we can't start in the coming days, but it is definitely an upcoming feature.
Cheers, thanks for the lightning-quick response.
Our serialization would extend SEAL's serialization, but we should also set up a strategy for our TenSEALContext, as it can contain different objects at a given time; other, simpler objects like CKKSVector are pretty straightforward. I haven't dug into the details, but I guess there is an issue as well regarding dealing with file-like objects in both the Python and C++ worlds. We can discuss those challenges in more detail if you are up for looking into them, but you can expect from our side to have things done during July at the latest.
Thanks @youben11 for letting me know. I will leave this issue open for now for anything further relating to this. Cheers.
Thanks @DreamingRaven for your inputs
I will work on this one
While hacky, I have created SEAL→Python serialisation and deserialisation, primarily for CKKS, via dictionaries, and I use them in a meta object here: https://github.com/DreamingRaven/python-reseal/blob/39f6b250d18d62cbff1b185a10d52f12eac316e9/fhe/reseal.py#L173-L232 This allows me to do everything I need, including saving the serialised intermediates to databases, deep copying, etc. However, since it uses file intermediaries from SEAL, it's a complete hack, but it may give some semblance of help, I hope. NOTE: my pybind11 bindings are likely different to yours, as mine are from Huelse's seal-python repository, but if they are consistently named I don't think that should add too much confusion. Also, this repository is still at a very early stage, so not a lot of documentation has been added, sorry.
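For anyone wanting the shape of that file-intermediary trick without the real bindings, here is a minimal sketch. `FakeSealObject` is a hypothetical stand-in with the `save(path)`/`load(path)` methods the pybind11 bindings typically expose; the method names are assumptions, not the actual SEAL API.

```python
import os
import tempfile

class FakeSealObject:
    """Hypothetical stand-in for a SEAL object exposing save(path)/load(path),
    as pybind11 SEAL bindings commonly do (names are assumptions)."""
    def __init__(self, data=b""):
        self.data = data

    def save(self, path):
        with open(path, "wb") as f:
            f.write(self.data)

    def load(self, path):
        with open(path, "rb") as f:
            self.data = f.read()

def to_bytes(obj):
    """Serialise via a temporary file intermediary -- the hack described above."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        obj.save(path)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

def from_bytes(cls, blob):
    """Deserialise by writing bytes to a temp file and loading from it."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(blob)
        obj = cls()
        obj.load(path)
        return obj
    finally:
        os.remove(path)

blob = to_bytes(FakeSealObject(b"ciphertext-bytes"))
restored = from_bytes(FakeSealObject, blob)
print(restored.data)  # b'ciphertext-bytes'
```

The resulting `bytes` can then be stored in a dictionary, a database field, or pickled, which is what makes the hack useful despite the disk round-trip.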
The key thing I have found from doing this is that we don't need to make everything serialisable, only the primitives: parameters, ciphertext, and keys. The rest (workers like the key generator, encoder, etc.) can be regenerated on the fly, as you can see in the above-mentioned meta object ReSeal. E.g. the context is always rebuildable from parameters consistently, so even if the context object is rebuilt every time between generating keys and ciphertext, the output is still properly decryptable. I combine this with a caching scheme to speed things up, so we don't have to rebuild it every time, and just discard the cache when it's time to serialise. That way we get the best of both: objects kept around for speed, plus the ability to store only the relative subset of objects minimally required to rebuild the rest. EDIT: I realise now that this might be at a higher level than you guys will end up tackling it, but hey, it's something that may help.
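The caching scheme above can be sketched in plain Python. Everything here is a hypothetical stand-in (`ReSealLike`, `build_context`, the parameter dict), not TenSEAL or SEAL API; the point is only the pattern: keep minimal serialisable state, rebuild the heavy worker lazily, and drop the cache before pickling.

```python
import pickle

def build_context(params):
    """Stand-in for deterministically rebuilding a context from parameters."""
    return {"params": params, "heavy": "rebuilt-state"}

class ReSealLike:
    """Sketch of the caching scheme: serialise only the minimal state,
    regenerate workers on demand, discard the cache when pickling."""
    def __init__(self, params):
        self.params = params   # minimal, serialisable state
        self._cache = None     # rebuildable, never serialised

    @property
    def context(self):
        if self._cache is None:            # rebuild lazily on first use
            self._cache = build_context(self.params)
        return self._cache

    def __getstate__(self):
        state = self.__dict__.copy()
        state["_cache"] = None             # drop the cache before pickling
        return state

obj = ReSealLike({"poly_modulus_degree": 8192})
_ = obj.context                            # cache is now populated
blob = pickle.dumps(obj)                   # only the parameters travel
clone = pickle.loads(blob)
print(clone.context["params"])             # context rebuilt from params
```

Because `build_context` is deterministic in the parameters, the clone's rebuilt context is interchangeable with the original, which is exactly why only the primitives need to survive serialisation.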
@DreamingRaven Bogdan just made the TenSEAL context serializable using pickle. Whenever we add the feature for serializing CKKSVector, we will be able to implement client/server apps! Please let us know how you feel about it :)
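For readers wondering how pickle support is typically wired onto a wrapper over a native object, here is a hedged sketch: `__getstate__`/`__setstate__` round-trip through the library's own byte serialisation. `WrappedContext` and its `serialize`/`deserialize` methods are illustrative assumptions, not TenSEAL's actual internals.

```python
import pickle

class WrappedContext:
    """Hypothetical wrapper over a native handle. Pickle support is added
    by delegating to the library's own byte serialisation."""
    def __init__(self, blob=b""):
        self._blob = blob   # stands in for the native object's state

    def serialize(self):
        return self._blob

    @classmethod
    def deserialize(cls, blob):
        return cls(blob)

    def __getstate__(self):
        # Pickle only the serialised bytes, never the native handle.
        return self.serialize()

    def __setstate__(self, blob):
        # Rebuild from bytes on unpickling.
        self.__dict__.update(type(self).deserialize(blob).__dict__)

ctx = WrappedContext(b"context-bytes")
clone = pickle.loads(pickle.dumps(ctx))
print(clone.serialize())  # b'context-bytes'
```

Once an object pickles this way, `copy.deepcopy` works for free, since deepcopy falls back on the same `__getstate__`/`__setstate__` protocol.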
@youben11 thanks for letting me know. I will give context serialization a spin when I get back to my desk and let you know if there are any issues. Hopefully this can replace my existing workaround in my research, which would be amazing. I don't think it's going to be very long before we start seeing FHE more and more in the field, tackling very sensitive, mostly untouched data. It's just a shame; I wish I could find a way to add bootstrapping to MS-SEAL for arbitrary-depth computation, albeit with added noise, as I believe that is the big barrier for many libraries to be used in encrypted deep-learning-as-a-service applications.
Indeed, bootstrapping can let us compute circuits of arbitrary depth; however, its computational cost is still impractical as far as I know. Much research has focused on using leveled HE with optimized circuits and batching to implement practical use cases. Looking forward to your feedback on serialization!
Hey, apologies for the delay. I tried out the same test file from my initial post to see if the context was now serialisable and deserialisable in Python. However, I have been having some issues with this: I still get:
I'm not sure if I have missed something here; I will need to take a closer look to find out why. I am running from inside the latest openmined/tenseal Docker container, which I presume is kept up to date with master.
latest seems to be attached to the last release, which doesn't have the serialization feature; we will look further into why that's the case. In the meantime, you can use the image with the latest-py38 tag, which is up to date with master, or build it locally as you suggested.
Hey @youben11, the serialisation test (for context) has passed after building the Docker container from master. For my own purposes I will later confirm that the serialized objects can be easily bound to HTTP requests and inserted into MongoDB (the latter I don't foresee being an issue). So all good there! Minor Dockerfile comment: on a related note, I suspect the documentation and Dockerfiles are from different times, as the documentation says:
However, that last command does not use the -f flag to specify the Dockerfile name, and the files are not named exactly "Dockerfile" but rather Dockerfile-py3x, such as:
Also, I noted the Dockerfiles attempt to copy the current directory into the containers; however, since they live in a subdirectory, all that gets copied if one changes into that directory to run them is the Dockerfiles themselves, so the build fails. I presume they were moved into a subdirectory after the documentation was written. To my understanding, the Docker build context only reaches into subdirectories, not parent directories, meaning either the Dockerfiles need to move to the parent directory or they need to be invoked with the context set to the top-level directory of the project (of which I am not too certain of the specific command). So, as a quick fix so I could test, I just copied one to the parent directory and ran the following from the same directory:
If I can find the correct invocation to put the Docker context in the top-level directory, I will put in a minor documentation edit so the command works as expected without any tweaks.
Actually, I may draft a pull request to use a more conventional top-level Dockerfile, .dockerignore, and docker-compose, to see if you prefer that to the current setup. Although it's not a huge issue either way, it just might throw some people who don't use Docker a lot.
Thank you @DreamingRaven for your notes, as @youben11 said, the
Hey @philomath213, thanks for getting back to me. I created a minor PR a few minutes ago (#111) just to patch the documentation to use the new command, similar to the one you listed. I also took the liberty of adding a .dockerignore for good practice. EDIT: I think I prefer your order of options; I might swap it around in the PR.
Let me know when there is any progress on serialization of keys and ciphertext; I will create a unit-test PR for verifying this functionality in the meantime, since I don't feel confident I'd be able to implement it at the C++ level. Thanks for putting up with my ramblings, and for progressing this, guys.
Thank you @DreamingRaven! The PR #109 is under review and should be merged in the following days.
Well, I have the same problem as you when I want to share the TenSEAL CKKS context across processes in federated learning. I guess deep_copy() has a similar problem.
@lunan0320 Hey, I'm currently struggling with a similar problem. Do you also use the FL framework Flower?
Feature Description
I feel it would be vital to be able to deep-copy objects like context, private key, and ciphertext.
Similarly, it will also be vital for real use to be able to pickle and unpickle context, private key, and ciphertext, or at least to save them to a file-like object, not necessarily a file itself.
Is your feature request related to a problem?
I would like to save the objects necessary to encrypt, evaluate, and decrypt ciphertext directly to a non-local system such as a database, without having to write to the local filesystem; thus I need to create a file-like object to store elsewhere. To this end, this will involve serialisation and deserialisation, probably via pickle, which currently does not work with the pybind11 bindings here. This also prevents things like Python deep-copying any of the TenSEAL objects, which is necessary in certain use cases, such as several workers copying from the same object to evaluate/compute some function.
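The file-like-object path described above can be sketched with the standard library alone: `io.BytesIO` gives a file-like buffer that never touches disk, and its bytes can go straight into a database field. The key-material dict is a hypothetical picklable stand-in for the real objects, and the MongoDB call in the comment is illustrative only.

```python
import io
import pickle

# Hypothetical stand-in for the objects we want to persist;
# any picklable object works for this sketch.
secret = {"private_key": b"\x01\x02", "params": {"scale": 2 ** 40}}

buf = io.BytesIO()          # file-like object, entirely in memory
pickle.dump(secret, buf)    # same API as dumping to a real file

record = buf.getvalue()     # raw bytes, ready for a database field
# e.g. collection.insert_one({"ctx": record})  # pymongo-style call (assumed)

restored = pickle.loads(record)
print(restored["params"]["scale"] == 2 ** 40)  # True
```

This avoids the write-to-disk-then-upload round-trip entirely, which is the bottleneck the feature request is about.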
What alternatives have you considered?
Saving to a file and loading that file into a database: time-consuming, IO-intensive, bottlenecked, and not easily scalable.
Additional Context
Here is a unit test showcasing the current inability to pickle and deep-copy, unless I am misunderstanding how it is to be done here:
test.py
when run
python3 ./test.py
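The elided test file aside, the failure mode itself can be reproduced with any C-level object that lacks pickle support; here `threading.Lock` stands in for the pybind11-bound SEAL objects, which fail the same way for both pickle and deepcopy.

```python
import copy
import pickle
import threading

# threading.Lock is a C-level object with no pickle support -- a stand-in
# for bound native objects that have not implemented the pickle protocol.
lock = threading.Lock()

for op in (pickle.dumps, copy.deepcopy):
    try:
        op(lock)
    except TypeError as exc:
        print(f"{op.__name__} failed: {exc}")
```

Both operations raise `TypeError`, because `copy.deepcopy` falls back on the same `__reduce_ex__` machinery that pickle uses; giving the bound objects pickle support would therefore fix deep-copying as well.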