How can I use CHEETAH in SecretFlow? Can you please give a demo? #11

Closed
JunZ2364139375 opened this issue Jul 11, 2022 · 10 comments

Comments

@JunZ2364139375

No description provided.

@wuxibin89

See secretflow/secretflow#30. You can specify the protocol as spu.spu_pb2.CHEETAH when initializing the SPU:

{
    'nodes': [
        {
            'party': 'alice',
            'id': 'local:0',
            'address': '127.0.0.1:9001',
            'listen_address': ''  # Optional: 'address' is used when 'listen_address' is empty.
        },
        {
            'party': 'bob',
            'id': 'local:1',
            'address': '127.0.0.1:9002',
            'listen_address': ''
        },
    ],
    'runtime_config': {
        'protocol': spu.spu_pb2.CHEETAH,
        'field': spu.spu_pb2.FM64,
        'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
    }
}

@mingo0117

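I want to use Cheetah and configured runtime_config as shown above, but I get the error below.
Is it because I have three parties taking part in the computation? (From what I have read, Cheetah seems to support only two parties.)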

Traceback (most recent call last):
  File "cypher_logistic_gegression.py", line 174, in <module>
    print("耗时:{0}秒".format(timeit.timeit('test()', setup='from __main__ import test', number=1)))
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/timeit.py", line 233, in timeit
    return Timer(stmt, setup, timer, globals).timeit(number)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/timeit.py", line 177, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
  File "cypher_logistic_gegression.py", line 163, in test
    losses = sf.reveal(losses)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/driver.py", line 158, in reveal
    value_obj = ray.get(value_ref)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/ray/worker.py", line 1845, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::SPURuntime.__init__() (pid=16958, ip=10.100.82.173, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fd75fafd2e0>)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 126, in __init__
    self.runtime = Runtime(self.link, self.conf)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 34, in __init__
    self._vm = _lib.RuntimeWrapper(link, config.SerializeToString())
RuntimeError: what: 
        [external/yasl/yasl/link/transport/channel.cc:86] Get data timeout, key=root:P2P-2:1->0
stacktrace: 
#0 yasl::link::Context::RecvInternal()+0x7fd7b2b100b2
#1 yasl::link::Context::Recv()+0x7fd7b2b101f0
#2 spu::CheetahIo::fill_recv()+0x7fd7b2293a1f
(SPURuntime pid=10887, ip=10.100.82.74) 2022-08-02 03:22:24,843 ERROR worker.py:451 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::SPURuntime.__init__() (pid=10887, ip=10.100.82.74, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f77bfec0250>)
(SPURuntime pid=10887, ip=10.100.82.74)   File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 126, in __init__
(SPURuntime pid=10887, ip=10.100.82.74)     self.runtime = Runtime(self.link, self.conf)
(SPURuntime pid=10887, ip=10.100.82.74)   File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 34, in __init__
(SPURuntime pid=10887, ip=10.100.82.74)     self._vm = _lib.RuntimeWrapper(link, config.SerializeToString())
(SPURuntime pid=10887, ip=10.100.82.74) RuntimeError: what: 
(SPURuntime pid=10887, ip=10.100.82.74)         [external/yasl/yasl/link/transport/channel.cc:86] Get data timeout, key=root:P2P-2:2->1
(SPURuntime pid=10887, ip=10.100.82.74) stacktrace: 
(SPURuntime pid=10887, ip=10.100.82.74) #0 yasl::link::Context::RecvInternal()+0x7f7816b100b2
(SPURuntime pid=10887, ip=10.100.82.74) #1 yasl::link::Context::Recv()+0x7f7816b101f0
(SPURuntime pid=10887, ip=10.100.82.74) #2 spu::CheetahIo::fill_recv()+0x7f7816293a1f
(SPURuntime pid=16958) 2022-08-02 03:22:24,861  ERROR worker.py:451 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::SPURuntime.__init__() (pid=16958, ip=10.100.82.173, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fd75fafd2e0>)
(SPURuntime pid=16958)   File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 126, in __init__
(SPURuntime pid=16958)     self.runtime = Runtime(self.link, self.conf)
(SPURuntime pid=16958)   File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 34, in __init__
(SPURuntime pid=16958)     self._vm = _lib.RuntimeWrapper(link, config.SerializeToString())
(SPURuntime pid=16958) RuntimeError: what: 
(SPURuntime pid=16958)  [external/yasl/yasl/link/transport/channel.cc:86] Get data timeout, key=root:P2P-2:1->0
(SPURuntime pid=16958) stacktrace: 
(SPURuntime pid=16958) #0 yasl::link::Context::RecvInternal()+0x7fd7b2b100b2
(SPURuntime pid=16958) #1 yasl::link::Context::Recv()+0x7fd7b2b101f0
(SPURuntime pid=16958) #2 spu::CheetahIo::fill_recv()+0x7fd7b2293a1f
(SPURuntime pid=1420, ip=10.100.82.134) 2022-08-02 03:22:24,859 ERROR worker.py:451 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::SPURuntime.__init__() (pid=1420, ip=10.100.82.134, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f0d71cba250>)
(SPURuntime pid=1420, ip=10.100.82.134)   File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 126, in __init__
(SPURuntime pid=1420, ip=10.100.82.134)     self.runtime = Runtime(self.link, self.conf)
(SPURuntime pid=1420, ip=10.100.82.134)   File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 34, in __init__
(SPURuntime pid=1420, ip=10.100.82.134)     self._vm = _lib.RuntimeWrapper(link, config.SerializeToString())
(SPURuntime pid=1420, ip=10.100.82.134) RuntimeError: what: 
(SPURuntime pid=1420, ip=10.100.82.134)         [external/yasl/yasl/link/transport/channel.cc:86] Get data timeout, key=root:P2P-2:0->2
(SPURuntime pid=1420, ip=10.100.82.134) stacktrace: 
(SPURuntime pid=1420, ip=10.100.82.134) #0 yasl::link::Context::RecvInternal()+0x7f0dc6b100b2
(SPURuntime pid=1420, ip=10.100.82.134) #1 yasl::link::Context::Recv()+0x7f0dc6b101f0
(SPURuntime pid=1420, ip=10.100.82.134) #2 spu::CheetahIo::fill_recv()+0x7f0dc6293a1f
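
The timeout keys in the log give a hint about how many parties the link layer was waiting on. The sketch below collects the ranks that appear in those keys. It assumes the key ends in `<sender rank>-><receiver rank>`, which is an interpretation of the log text rather than documented behaviour, and `ranks_in_timeouts` is a hypothetical helper name.

```python
import re
from typing import Set

# Assumed key layout (inferred from the log, not from yasl documentation):
#   key=<context id>:P2P-<counter>:<sender rank>-><receiver rank>
_KEY_RE = re.compile(r"key=\S*?:(\d+)->(\d+)")


def ranks_in_timeouts(log_text: str) -> Set[int]:
    """Return every rank mentioned in 'Get data timeout' keys."""
    ranks: Set[int] = set()
    for sender, receiver in _KEY_RE.findall(log_text):
        ranks.update((int(sender), int(receiver)))
    return ranks


# The three distinct keys that appear in the traceback above.
LOG_EXCERPT = """\
Get data timeout, key=root:P2P-2:1->0
Get data timeout, key=root:P2P-2:2->1
Get data timeout, key=root:P2P-2:0->2
"""

ranks = ranks_in_timeouts(LOG_EXCERPT)
print(sorted(ranks))
```

Under that assumption, three distinct ranks show up in the keys, which is consistent with a three-node SPU being initialized while `spu::CheetahIo` is on the stack.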

@6fj
Member

6fj commented Aug 2, 2022

Exactly.
Please provide only two nodes when initializing the SPU device.
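
For illustration, a two-node configuration in the shape shown earlier in this thread could be built with a small guard that rejects any other node count. This is only a sketch: `make_cheetah_cluster_def` is a hypothetical helper, and the protocol and field are passed in as parameters so the snippet runs without SPU installed.

```python
from typing import Any, Dict, List, Tuple


def make_cheetah_cluster_def(
    nodes: List[Tuple[str, str]], protocol: Any, field: Any
) -> Dict[str, Any]:
    """Build a cluster config dict for exactly two (party, address) pairs."""
    if len(nodes) != 2:
        raise ValueError(
            f"CHEETAH is a two-party protocol; got {len(nodes)} nodes"
        )
    return {
        'nodes': [
            {
                'party': party,
                'id': f'local:{rank}',
                'address': address,
                'listen_address': '',  # empty: fall back to 'address'
            }
            for rank, (party, address) in enumerate(nodes)
        ],
        'runtime_config': {'protocol': protocol, 'field': field},
    }


# With SPU installed you would pass spu.spu_pb2.CHEETAH and spu.spu_pb2.FM64;
# placeholder strings are used here so the sketch has no dependencies.
cluster_def = make_cheetah_cluster_def(
    [('alice', '127.0.0.1:9001'), ('bob', '127.0.0.1:9002')],
    protocol='CHEETAH',
    field='FM64',
)
print([node['party'] for node in cluster_def['nodes']])
```

Passing a third (party, address) pair raises `ValueError` up front instead of failing later with a link timeout.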

@mingo0117

OK, thanks.

@mingo0117
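
One more question: with two nodes, I benchmarked logistic regression. With the same code and only the protocol changed, semi2k takes 8 s and cheetah takes 52 s. Why is that? In which scenarios does cheetah have a performance advantage?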

@fionser
Contributor

fionser commented Aug 3, 2022

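@mingo0117 semi2k requires an additional trusted third party (such as a TEE) to generate random numbers, whereas cheetah is purely two-party. If your deployment can support a TEE, semi2k performs better in most cases. Cheetah's matrix multiplication should (probably) receive further performance optimizations later.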

@JunZ2364139375
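Author

I have a question. After a model is trained in plaintext, its parameters are represented as floating-point numbers. Does secretflow convert them to integers by simply multiplying the parameters by a scale factor?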

@anakinxc
Contributor

Hi @JunZ2364139375

In SPU, floating-point numbers are represented as fixed-point numbers.

@JunZ2364139375
Author

How are they converted to big integers in the cheetah protocol?

@anakinxc
Contributor

You can read this part of the code:

ArrayRef encodeToRing(const ArrayRef& src, FieldType field, size_t fxp_bits,
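
As a rough illustration of the idea, the sketch below shows generic fixed-point encoding into a ring of integers modulo 2^k. It is not SPU's `encodeToRing` implementation: the function names are hypothetical, the 64-bit ring mirrors the FM64 field used earlier in the thread, and the 18 fractional bits are an illustrative assumption. The arithmetic is done on plaintext encodings, not on secret shares.

```python
def encode_to_ring(x: float, ring_bits: int = 64, fxp_bits: int = 18) -> int:
    """Scale by 2**fxp_bits, round, and wrap into [0, 2**ring_bits)."""
    scaled = round(x * (1 << fxp_bits))
    return scaled % (1 << ring_bits)  # negatives wrap to the upper half


def _to_signed(v: int, ring_bits: int) -> int:
    """Interpret a ring element as a two's-complement signed integer."""
    modulus = 1 << ring_bits
    return v - modulus if v >= (modulus >> 1) else v


def decode_from_ring(v: int, ring_bits: int = 64, fxp_bits: int = 18) -> float:
    """Undo encode_to_ring: recover the sign, then divide by the scale."""
    return _to_signed(v, ring_bits) / (1 << fxp_bits)


def fxp_mul(a: int, b: int, ring_bits: int = 64, fxp_bits: int = 18) -> int:
    """Multiply two encodings; the product carries 2*fxp_bits fractional
    bits, so it is shifted right by fxp_bits to restore the scale."""
    product = _to_signed(a, ring_bits) * _to_signed(b, ring_bits)
    return (product >> fxp_bits) % (1 << ring_bits)


w = encode_to_ring(-2.25)
x = encode_to_ring(1.5)
print(decode_from_ring(w), decode_from_ring(fxp_mul(w, x)))
```

So the conversion is more than a single multiplication by a scale factor: negative values wrap around the ring, and every multiplication needs a truncation step to keep the scale fixed.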
