The Correctness of CrypTen #304

Closed
ZJG0 opened this issue Sep 10, 2021 · 12 comments

@ZJG0

ZJG0 commented Sep 10, 2021

I have trained my own model in plaintext and now want to run inference on it with CrypTen. After a lot of preliminary work, I find that the result differs from the plaintext result, so I suspect that some of CrypTen's building blocks have problems.
How can I debug this? I would appreciate any suggestions.

@lvdmaaten
Member

There is some functionality in CrypTen to validate the correctness of function outputs, but it is pretty barebones. In particular, it is not yet integrated into the autograd. Having said that, you can probably use it to check the output of functions you think are suspect.

It is important to keep in mind that CrypTen uses fixed-point number encodings (with 16 bits of precision by default) and that several functions are evaluated by Newton-Raphson / Taylor / etc. approximations, so some deviations from plaintext are expected. In particular, deviations can accumulate in very deep networks (for instance, we know deep Transformers are tricky because our erf approximations are not very good right now).
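To make the two error sources mentioned above concrete, here is a minimal pure-Python sketch. It is not CrypTen's implementation: the helper names (`encode`, `decode`, `fx_mul`, `reciprocal_newton`) and the initial guess for the Newton-Raphson iteration are assumptions chosen for illustration. It shows that a 16-bit fixed-point encoding rounds every value, that truncation after each multiplication lets small errors accumulate, and that an iterative approximation is only as good as its iteration count.

```python
# Toy model of fixed-point arithmetic and an iterative approximation.
# Hypothetical helpers for illustration only; NOT CrypTen's code.

PRECISION_BITS = 16            # the default precision mentioned above
SCALE = 1 << PRECISION_BITS    # 65536


def encode(x: float) -> int:
    """Map a float to an integer with 16 fractional bits."""
    return round(x * SCALE)


def decode(v: int) -> float:
    """Map a fixed-point integer back to a float."""
    return v / SCALE


def fx_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values and drop the extra 16 fractional bits."""
    return (a * b) >> PRECISION_BITS


def reciprocal_newton(x: float, iterations: int, y0: float) -> float:
    """Approximate 1/x with Newton-Raphson: y <- y * (2 - x * y)."""
    y = y0
    for _ in range(iterations):
        y = y * (2.0 - x * y)
    return y


# 1) Encoding alone already rounds the value.
roundtrip_error = abs(decode(encode(0.1)) - 0.1)

# 2) Errors accumulate over repeated multiplications (a stand-in for depth).
acc = encode(1.0)
for _ in range(20):
    acc = fx_mul(acc, encode(0.9))
drift = abs(decode(acc) - 0.9 ** 20)

# 3) Too few Newton-Raphson iterations leave a visible approximation error.
coarse = abs(reciprocal_newton(3.0, iterations=3, y0=0.1) - 1 / 3)
fine = abs(reciprocal_newton(3.0, iterations=10, y0=0.1) - 1 / 3)

print(f"round-trip error: {roundtrip_error:.2e}")
print(f"drift after 20 multiplications: {drift:.2e}")
print(f"reciprocal error, 3 vs 10 iterations: {coarse:.2e} vs {fine:.2e}")
```

Deviations of this size are the kind that is expected; they are many orders of magnitude smaller than a result that is simply wrong.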

@ZJG0
Author

ZJG0 commented Sep 13, 2021

To rule out the influence of complex networks (Transformers, etc.), I trained a simple NN to verify the correctness of CrypTen.

Model structure:

import torch
import torch.nn.functional as F


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # One hidden layer: 1 input feature -> 10 hidden units -> 1 output
        self.hidden = torch.nn.Linear(1, 10)
        self.predict = torch.nn.Linear(10, 1)

    def forward(self, x):
        x = F.relu(self.hidden(x))
        x = self.predict(x)
        return x

However, the loss I get with CrypTen differs from the loss in plaintext. Here is an example that reproduces it:
test_crypten.zip
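For a network this small, one way to estimate how much deviation fixed-point arithmetic alone could explain is to simulate it. The sketch below is an assumption-laden toy, not the attached script and not CrypTen: it uses random stand-in weights rather than the trained model, assumes 16-bit precision, and runs the same 1 -> 10 -> 1 ReLU network once in floating point and once in simulated fixed point.

```python
import random

FRACTION_BITS = 16             # assumed fixed-point precision
SCALE = 1 << FRACTION_BITS


def to_fixed(x: float) -> int:
    """Encode a float as an integer with 16 fractional bits."""
    return round(x * SCALE)


def forward_float(x, w1, b1, w2, b2):
    """Reference forward pass: Linear(1, 10) -> ReLU -> Linear(10, 1)."""
    hidden = [max(0.0, w * x + b) for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def forward_fixed(x, w1, b1, w2, b2):
    """Same network, but every product is truncated back to 16 fractional bits."""
    xq = to_fixed(x)
    hidden = [
        max(0, ((to_fixed(w) * xq) >> FRACTION_BITS) + to_fixed(b))
        for w, b in zip(w1, b1)
    ]
    out = sum((to_fixed(w) * h) >> FRACTION_BITS for w, h in zip(w2, hidden))
    return (out + to_fixed(b2)) / SCALE


# Random stand-in weights (hypothetical; the real ones live in model.pkl).
rng = random.Random(0)
w1 = [rng.uniform(-1, 1) for _ in range(10)]
b1 = [rng.uniform(-1, 1) for _ in range(10)]
w2 = [rng.uniform(-1, 1) for _ in range(10)]
b2 = rng.uniform(-1, 1)

inputs = [-1.0, -0.5, 0.0, 0.5, 1.0]
max_diff = max(
    abs(forward_float(x, w1, b1, w2, b2) - forward_fixed(x, w1, b1, w2, b2))
    for x in inputs
)
print(f"largest output deviation: {max_diff:.2e}")
```

If the deviation observed in practice is far larger than what a simulation like this produces, precision loss is an unlikely explanation and a configuration or arithmetic bug becomes the more plausible suspect.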

@knottb
Contributor

knottb commented Sep 15, 2021

The zip file provided is missing the file model.pkl, which is required to run private_inference.py.

@ZJG0
Author

ZJG0 commented Sep 16, 2021

You first need to run train_model.py to generate model.pkl.

@knottb
Contributor

knottb commented Sep 20, 2021

Just ran this with the outputs from train_model.py and I'm getting the same output and same loss for plaintext and crypten implementations. The loss I am getting for each is tensor(0.0032).

Is this what you expect?

@ZJG0
Author

ZJG0 commented Sep 21, 2021

When I run this code in plaintext, I also get a loss of tensor(0.0032). However, with CrypTen my loss is tensor(2.2370e+19).
[image: screenshot attached to the original issue]

@knottb
Contributor

knottb commented Sep 24, 2021

Hmm, could you try updating your CrypTen version? I am running our newest version (0.4.0) and I'm seeing correct results using CrypTen.

@ZJG0
Author

ZJG0 commented Sep 24, 2021

To rule out a version problem, I tried the latest version: I downloaded the latest CrypTen with git clone and installed it with python setup.py build followed by python setup.py install. However, the result did not change. Is there a problem with my settings?

@knottb
Contributor

knottb commented Sep 24, 2021

For reference, I changed some of the logs to show plaintext and crypten loss. This is my output (note this requires python3):

% python3 private_inference.py
plaintext: 0.0031806188635528088
/Users/brianknott/Library/Python/3.7/lib/python/site-packages/crypten/nn/onnx_converter.py:161: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  ../torch/csrc/utils/tensor_numpy.cpp:180.)
  param = torch.from_numpy(numpy_helper.to_array(node))
/Users/brianknott/Library/Python/3.7/lib/python/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  ../aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
crypten: 0.0031804912723600864

This was run in a new conda environment with the following dependencies (most are not required, but this is my environment after installing from requirements.txt):

absl-py==0.13.0
antlr4-python3-runtime==4.8
bleach==4.1.0
cachetools==4.2.2
certifi==2021.5.30
charset-normalizer==2.0.4
colorama==0.4.4
crypten==0.4.0
cycler==0.10.0
docutils==0.17.1
future==0.18.2
google-auth==1.35.0
google-auth-oauthlib==0.4.6
grpcio==1.40.0
idna==3.2
importlib-metadata==4.8.1
joblib==1.0.1
keyring==23.1.0
kiwisolver==1.3.2
Markdown==3.3.4
matplotlib==3.4.3
numpy==1.21.2
oauthlib==3.1.1
omegaconf==2.1.1
onnx==1.10.1
packaging==21.0
pandas==1.3.2
Pillow==8.3.2
pkginfo==1.7.1
protobuf==3.17.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
Pygments==2.10.0
pyobjc==4.2.2
pyobjc-core==4.2.2
pyobjc-framework-Accounts==4.2.2
pyobjc-framework-AddressBook==4.2.2
pyobjc-framework-AppleScriptKit==4.2.2
pyobjc-framework-AppleScriptObjC==4.2.2
pyobjc-framework-ApplicationServices==4.2.2
pyobjc-framework-Automator==4.2.2
pyobjc-framework-AVFoundation==4.2.2
pyobjc-framework-AVKit==4.2.2
pyobjc-framework-BusinessChat==4.2.2
pyobjc-framework-CalendarStore==4.2.2
pyobjc-framework-CFNetwork==4.2.2
pyobjc-framework-CloudKit==4.2.2
pyobjc-framework-Cocoa==4.2.2
pyobjc-framework-Collaboration==4.2.2
pyobjc-framework-ColorSync==4.2.2
pyobjc-framework-Contacts==4.2.2
pyobjc-framework-ContactsUI==4.2.2
pyobjc-framework-CoreBluetooth==4.2.2
pyobjc-framework-CoreData==4.2.2
pyobjc-framework-CoreLocation==4.2.2
pyobjc-framework-CoreML==4.2.2
pyobjc-framework-CoreServices==4.2.2
pyobjc-framework-CoreSpotlight==4.2.2
pyobjc-framework-CoreText==4.2.2
pyobjc-framework-CoreWLAN==4.2.2
pyobjc-framework-CryptoTokenKit==4.2.2
pyobjc-framework-DictionaryServices==4.2.2
pyobjc-framework-DiskArbitration==4.2.2
pyobjc-framework-EventKit==4.2.2
pyobjc-framework-ExceptionHandling==4.2.2
pyobjc-framework-ExternalAccessory==4.2.2
pyobjc-framework-FinderSync==4.2.2
pyobjc-framework-FSEvents==4.2.2
pyobjc-framework-GameCenter==4.2.2
pyobjc-framework-GameController==4.2.2
pyobjc-framework-GameKit==4.2.2
pyobjc-framework-GameplayKit==4.2.2
pyobjc-framework-ImageCaptureCore==4.2.2
pyobjc-framework-IMServicePlugIn==4.2.2
pyobjc-framework-InputMethodKit==4.2.2
pyobjc-framework-InstallerPlugins==4.2.2
pyobjc-framework-InstantMessage==4.2.2
pyobjc-framework-Intents==4.2.2
pyobjc-framework-IOSurface==4.2.2
pyobjc-framework-iTunesLibrary==4.2.2
pyobjc-framework-LatentSemanticMapping==4.2.2
pyobjc-framework-LaunchServices==4.2.2
pyobjc-framework-libdispatch==4.2.2
pyobjc-framework-LocalAuthentication==4.2.2
pyobjc-framework-MapKit==4.2.2
pyobjc-framework-MediaAccessibility==4.2.2
pyobjc-framework-MediaLibrary==4.2.2
pyobjc-framework-MediaPlayer==4.2.2
pyobjc-framework-ModelIO==4.2.2
pyobjc-framework-MultipeerConnectivity==4.2.2
pyobjc-framework-NetFS==4.2.2
pyobjc-framework-NetworkExtension==4.2.2
pyobjc-framework-NotificationCenter==4.2.2
pyobjc-framework-OpenDirectory==4.2.2
pyobjc-framework-Photos==4.2.2
pyobjc-framework-PhotosUI==4.2.2
pyobjc-framework-PreferencePanes==4.2.2
pyobjc-framework-PubSub==4.2.2
pyobjc-framework-QTKit==4.2.2
pyobjc-framework-Quartz==4.2.2
pyobjc-framework-SafariServices==4.2.2
pyobjc-framework-SceneKit==4.2.2
pyobjc-framework-ScreenSaver==4.2.2
pyobjc-framework-ScriptingBridge==4.2.2
pyobjc-framework-SearchKit==4.2.2
pyobjc-framework-Security==4.2.2
pyobjc-framework-SecurityFoundation==4.2.2
pyobjc-framework-SecurityInterface==4.2.2
pyobjc-framework-ServiceManagement==4.2.2
pyobjc-framework-Social==4.2.2
pyobjc-framework-SpriteKit==4.2.2
pyobjc-framework-StoreKit==4.2.2
pyobjc-framework-SyncServices==4.2.2
pyobjc-framework-SystemConfiguration==4.2.2
pyobjc-framework-Vision==4.2.2
pyobjc-framework-WebKit==4.2.2
pyparsing==2.4.7
python-dateutil==2.8.2
pytz==2021.1
PyYAML==5.4.1
readme-renderer==29.0
requests==2.26.0
requests-oauthlib==1.3.0
requests-toolbelt==0.9.1
rfc3986==1.5.0
rsa==4.7.2
scikit-learn==0.24.2
scipy==1.7.1
six==1.16.0
sklearn==0.0
tensorboard==2.6.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
threadpoolctl==2.2.0
torch==1.9.0
torchvision==0.10.0
tqdm==4.62.2
twine==3.4.2
typing-extensions==3.10.0.2
urllib3==1.26.6
webencodings==0.5.1
Werkzeug==2.0.1
zipp==3.5.0
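
To put the two losses in the log above (plaintext: 0.0031806188635528088, crypten: 0.0031804912723600864) in perspective, the short calculation below compares their difference with one fixed-point step. The 16-bit precision is an assumption taken from the default mentioned earlier in the thread.

```python
plaintext_loss = 0.0031806188635528088  # from the log above
crypten_loss = 0.0031804912723600864    # from the log above

abs_diff = abs(plaintext_loss - crypten_loss)
rel_diff = abs_diff / plaintext_loss
fixed_point_step = 2 ** -16  # assumes the default 16-bit precision

print(f"absolute difference:  {abs_diff:.3e}")
print(f"relative difference:  {rel_diff:.3e}")
print(f"one fixed-point step: {fixed_point_step:.3e}")
```

The absolute difference is roughly 1.3e-7, well below a single 16-bit fixed-point step (about 1.5e-5), which is consistent with ordinary encoding and approximation error rather than a bug.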

@ZJG0
Author

ZJG0 commented Sep 25, 2021

I figured out why we get different results. As noted in the readme, the script should be run by opening three terminals and running one of the following commands in each:

RENDEZVOUS=file:///tmp/vfl WORLD_SIZE=3 RANK=0 python private_inference.py 
RENDEZVOUS=file:///tmp/vfl WORLD_SIZE=3 RANK=1 python private_inference.py 
RENDEZVOUS=file:///tmp/vfl WORLD_SIZE=3 RANK=2 python private_inference.py 

I run it this way because I want to set up three-party MPC.
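
For convenience, the three terminals can be replaced by a small launcher script. The following is a sketch using only the Python standard library; `party_env` and `launch_parties` are hypothetical helper names, and the only things taken from the commands above are the three environment variables and the script name.

```python
import os
import subprocess
import sys


def party_env(rank: int, world_size: int, rendezvous: str) -> dict:
    """Build the environment for one party, mirroring the commands above."""
    env = dict(os.environ)
    env.update(
        RENDEZVOUS=rendezvous,
        WORLD_SIZE=str(world_size),
        RANK=str(rank),
    )
    return env


def launch_parties(script: str, world_size: int = 3,
                   rendezvous: str = "file:///tmp/vfl") -> list:
    """Start one process per rank and return the exit codes in rank order."""
    procs = [
        subprocess.Popen(
            [sys.executable, script],
            env=party_env(rank, world_size, rendezvous),
        )
        for rank in range(world_size)
    ]
    return [proc.wait() for proc in procs]


# Usage (not executed here):
#   exit_codes = launch_parties("private_inference.py")
```

All ranks are started before any of them is waited on, since each party blocks until the others have joined.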

@knottb
Contributor

knottb commented Sep 27, 2021

Indeed. However, this should also work with @crypten.mpc.run_multiprocess(3), which I believe is how I ran this.

Note that for 3PC, some of the arithmetic had a bug, described in #308. This bug should be fixed by #313.
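
As background on what three-party arithmetic involves, here is a toy sketch of additive secret sharing among three parties. It is a simplified model under stated assumptions (a 64-bit ring, 16-bit fixed-point encoding, hypothetical helper names) and does not reproduce CrypTen's code or the specific bug tracked in #308.

```python
import secrets

RING_BITS = 64                 # assumed ring size
MOD = 1 << RING_BITS
SCALE = 1 << 16                # assumed 16-bit fixed-point precision


def encode(x: float) -> int:
    """Fixed-point encode a float as a (possibly negative) integer."""
    return round(x * SCALE)


def share(value: int, parties: int = 3) -> list:
    """Split an integer into additive shares that sum to value mod 2**64."""
    shares = [secrets.randbelow(MOD) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares


def add_shared(a: list, b: list) -> list:
    """Addition is local: each party adds the two shares it holds."""
    return [(x + y) % MOD for x, y in zip(a, b)]


def reconstruct(shares: list) -> float:
    """Sum all shares, reinterpret as a signed 64-bit value, and decode."""
    total = sum(shares) % MOD
    if total >= MOD // 2:
        total -= MOD
    return total / SCALE


a = share(encode(1.5))
b = share(encode(-0.25))
result = reconstruct(add_shared(a, b))
print(result)  # 1.25
```

Each individual share is a uniformly random ring element, so a result only makes sense once every party's share has been combined correctly.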

@knottb
Contributor

knottb commented Sep 30, 2021

This should be fixed with the landing of #313. Closing this issue. Please re-open if there are still issues with this.

knottb closed this as completed Sep 30, 2021