Simulation error with jax median operation #31

Closed
BeStrongok opened this issue Nov 2, 2022 · 9 comments

@BeStrongok

BeStrongok commented Nov 2, 2022

Hi,
I'm simulating the JAX median function with the following code and want to see the PPHLO dialect output, but I get an error.
The code is:

import spu
import jax.numpy as jnp
import numpy as np
import spu.binding.util.simulation as pps

protocol = spu.ProtocolKind.ABY3
field = spu.FieldType.FM64
simulator = pps.Simulator.simple(3, protocol, field)

x = np.array([1,2])
y = np.array([3,4])

def add(x, y):
    tmp = jnp.concatenate((x, y), axis=0)
    return jnp.median(tmp)

spu_add = pps.sim_jax(simulator, add)
print(spu_add(x, y))
print(spu_add.pphlo)

The error is:

WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
Traceback (most recent call last):
  File "test_hlo.py", line 18, in <module>
    print(spu_sigmoid(x, y))
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/simulation.py", line 145, in wrapper
    out_flat = sim(executable, *args_flat)
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/simulation.py", line 100, in __call__
    parties = [job.join() for job in jobs]
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/simulation.py", line 100, in <listcomp>
    parties = [job.join() for job in jobs]
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/simulation.py", line 43, in join
    raise self.exc
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/simulation.py", line 36, in run
    self.ret = self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/simulation.py", line 89, in wrapper
    rt.run(executable)
  File "/root/miniconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 45, in run
    return self._vm.Run(executable.SerializeToString())
RuntimeError: what: 
        [Enforce fail at spu/hal/constants.cc:72] v.storage_type().isa<mpc::Pub2kTy>(). got aby3.BShr<PT_U8,1>
stacktrace: 
#0 spu::hal::test::dump_public_as<>()+0x7f8428fe92ad
#1 spu::device::getConditionValue()+0x7f8428feab98
#2 spu::device::pphlo::kernel::Sort<>()::{lambda()#1}::operator()()::{lambda()#2}::operator()()+0x7f8428fb9ec5
#3 std::__insertion_sort<>()+0x7f8428fba216
#4 std::__stable_sort_adaptive<>()+0x7f8428fbafca
#5 spu::device::pphlo::RegionExecutor::execute()+0x7f8428fbc423
#6 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fad100
#7 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fad526
#8 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fad7d6
#9 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fada86
#10 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fadd36
#11 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fadfe6
#12 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fae296
#13 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fae546
#14 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428fae796
#15 spu::device::pphlo::RegionExecutor::dispatchOp<>()+0x7f8428faea46
@anakinxc anakinxc self-assigned this Nov 2, 2022
@anakinxc
Contributor

anakinxc commented Nov 2, 2022

Hi @BeStrongok,

jnp.median in this example requires secret sort, which is not supported by SPU right now.

But you can expect this feature in the upcoming releases.

Best
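To illustrate why a secret comparison result breaks a conventional sort, here is a toy model in plain Python. `SecretBool` and `secret_less` are hypothetical stand-ins invented for this sketch, not SPU types or APIs. The point is only that a comparison-driven sort has to branch on every comparison, which is what the `getConditionValue` frame in the stack trace above appears to be attempting on a secret share.

```python
from functools import cmp_to_key


class SecretBool:
    """Hypothetical stand-in for a secret-shared boolean (not an SPU type)."""

    def __init__(self, value):
        self._value = value  # in a real protocol no single party could read this

    def __bool__(self):
        # A runtime cannot turn a secret share into a public branch condition.
        raise RuntimeError("cannot branch on a secret value")


def secret_less(a, b):
    """Hypothetical secret comparison: the result stays hidden."""
    return SecretBool(a < b)


def compare(a, b):
    # A comparison sort needs a public answer here, so this line fails.
    return -1 if secret_less(a, b) else 1


def try_sort(values):
    """Return the sorted list, or the error message if a branch was needed."""
    try:
        return sorted(values, key=cmp_to_key(compare))
    except RuntimeError as exc:
        return str(exc)


print(try_sort([3, 1, 2]))
```

A single-element list sorts fine because no comparison is ever made; anything longer fails at the first comparison.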

@BeStrongok
Author

> Hi @BeStrongok,
>
> jnp.median in this example requires secret sort, which is not supported by SPU right now.
>
> But you can expect this feature in the upcoming releases.
>
> Best

Thanks, looking forward to this feature!

@BeStrongok BeStrongok reopened this Nov 2, 2022
@BeStrongok
Author

BeStrongok commented Nov 2, 2022

Because secret sort is non-trivial. I'm also confused about it now: the sort algorithm can't be parallelized. :(

@anakinxc
Contributor

anakinxc commented Nov 2, 2022

> Because secret sort is non-trivial. I'm also confused about it now: the sort algorithm can't be parallelized. :(

Yes, it is non-trivial and hard to optimize, but we are working on it. :D
Stay tuned.

@fionser
Contributor

fionser commented Nov 9, 2022

> Because secret sort is non-trivial. I'm also confused about it now: the sort algorithm can't be parallelized. :(

Try bitonic sort, which is highly parallelizable.
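As a rough illustration of this suggestion, the following is a minimal plaintext sketch of a bitonic sorting network in plain Python. It is not SPU code and not the fix that was eventually merged; `compare_exchange` and `bitonic_sort` are names made up for this example, and the input length is assumed to be a power of two. The sequence of index pairs is fixed by the length alone, and the swap is done with arithmetic selection rather than a branch, which is the property that makes this kind of network a candidate for secret data.

```python
def compare_exchange(values, i, j, ascending):
    """Order values[i] and values[j] without branching on the comparison."""
    a, b = values[i], values[j]
    # In an MPC setting this bit would remain secret; here it is a plain int.
    swap = int((a > b) == ascending)
    # Arithmetic select: the same operations run whether or not we swap.
    values[i] = a + swap * (b - a)
    values[j] = b + swap * (a - b)


def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network."""
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    k = 2
    while k <= n:  # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:  # distance between compared elements
            # Every pair in this inner loop touches distinct elements, so
            # all compare-exchanges of one stage could run in parallel.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    compare_exchange(out, i, partner, ascending)
            j //= 2
        k *= 2
    return out


print(bitonic_sort([3, 1, 4, 1, 5, 9, 2, 6]))
```

The network performs on the order of n log² n comparisons, more than a comparison sort, but only log² n sequential stages, which is the trade-off behind calling it highly parallelizable.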

@rivertalk

> > Because secret sort is non-trivial. I'm also confused about it now: the sort algorithm can't be parallelized. :(
>
> Try bitonic sort, which is highly parallelizable.

Nice suggestion, trying...

@fionser
Contributor

fionser commented Nov 9, 2022

Actually, the current SPU should be able to implement a data structure like a top-k heap.
The median can then be computed with a top-(N/2) heap.
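A minimal plaintext sketch of the top-k idea, using Python's standard `heapq` module. The function name `median_via_topk` is made up for this example, and it keeps the N//2 + 1 largest values (a small variation on the top-(N/2) heap mentioned above) so that both middle elements are available when N is even. This version branches on comparisons, so it only shows the arithmetic of the approach; how it would map onto SPU is not covered in this thread.

```python
import heapq


def median_via_topk(values):
    """Median from a bounded min-heap that holds the k largest values."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    k = n // 2 + 1
    heap = []  # min-heap; heap[0] is the smallest of the k largest seen so far
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            # v belongs in the top k, so it evicts the current smallest.
            heapq.heapreplace(heap, v)
    if n % 2 == 1:
        # Odd length: the smallest of the top k is the middle element.
        return heap[0]
    # Even length: average the two smallest of the top k.
    lower = heapq.heappop(heap)
    upper = heap[0]
    return (lower + upper) / 2


# Same data as the issue: concatenate([1, 2], [3, 4]).
print(median_via_topk([1, 2, 3, 4]))  # 2.5
```

The heap never grows beyond k entries, so the full input is never sorted.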

@anakinxc
Contributor

anakinxc commented Nov 9, 2022

> Actually, the current SPU should be able to implement a data structure like a top-k heap. The median can then be computed with a top-(N/2) heap.

This really depends on how JAX implements median.
Right now they use quantile(..., 0.5) to compute the median, and quantile uses sort with a crazy comparator when the input is floating point, in order to handle non-finite values and negative zero, which we don't really care about.

During the compilation pipeline, once we have the semi-optimized IR, it is definitely possible to discover and rewrite this pattern with some effort.

This kind of specific pattern rewrite is obviously not a unicorn, but it is still hacky... We are trying to see if this can be solved in a more elegant way.
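To make the "median as quantile 0.5 over a sort" structure concrete, here is a simplified plain-Python sketch. It is not JAX's implementation: `total_order_key`, `quantile_sorted`, and `median` are hypothetical helpers written for this example, the key function shows just one common way to get a total order over floats that treats -0.0 like 0.0 and places NaN last, and the linear interpolation is an assumption borrowed from NumPy's default quantile method.

```python
import math
import struct


def total_order_key(x):
    """Integer key whose ordering gives a total order over floats."""
    x = float(x)
    if x == 0.0:
        x = 0.0  # canonicalize -0.0 to +0.0
    if math.isnan(x):
        x = math.nan  # canonicalize every NaN to one positive NaN
    # Reinterpret the IEEE 754 double as a signed 64-bit integer.
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # For negative floats, flip the magnitude bits so that a larger
    # magnitude maps to a smaller key.
    return bits if bits >= 0 else bits ^ 0x7FFFFFFFFFFFFFFF


def quantile_sorted(values, q):
    """Quantile by sorting, with linear interpolation between neighbours."""
    if not values:
        raise ValueError("quantile of an empty sequence is undefined")
    ordered = sorted(values, key=total_order_key)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def median(values):
    return quantile_sorted(values, 0.5)


print(median([1, 2, 3, 4]))  # 2.5
```

The extra comparator work lives entirely in `total_order_key`; for inputs known to be finite and free of negative zero, an ordinary less-than comparison would give the same result, which is the sense in which that handling is not needed here.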

@anakinxc anakinxc added the enhancement New feature or request label Jan 30, 2023
@anakinxc
Contributor

anakinxc commented Mar 3, 2023

Fixed with #107
