New PIR implementations #57

Merged: ryscheng merged 87 commits into master on May 14, 2017

Conversation

@ryscheng (Member) commented May 5, 2017

Added:

  • New Shard interface in libpir
  • ShardCPU and ShardCL implementations of this Shard interface
  • 3 different OpenCL kernel implementations for ShardCL
  • Correctness tests for all of them

I have yet to benchmark all 3 OpenCL kernels, but it's a pretty big PR, so it'd be nice to get a review earlier rather than later.

NOTE: the ShardCUDA implementation is tracked in the ryscheng-cuda branch.
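For context, a minimal sketch of the shape the new interface takes; the method names and signatures below are illustrative assumptions for discussion, not the actual libpir API:

```go
// Package pir is used here only to make the sketch self-contained.
package pir

// Shard is a rough sketch of the abstraction under review: one static
// partition of the database that can answer PIR read requests.
// GetShardSize and Read are assumed names, not the real signatures.
type Shard interface {
	// GetShardSize reports how many bytes this shard holds.
	GetShardSize() int
	// Read answers a batch of PIR requests. reqs packs one bit-vector
	// per request; reqLength is the byte length of each vector.
	Read(reqs []byte, reqLength int) ([]byte, error)
}
```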

@ryscheng (Member Author) commented May 8, 2017

Added the CUDA implementation

@willscott (Member) commented:

pir/context_cl.go:44 - should this be a separate utility function rather than comment?
pir/context_cl.go:66 - can we flag which device to use in cases where there are multiple?
pir/context_cl.go:178 - why private method banner with no methods? get rid of it. more generally this banner should be implicit based on method names with capital / lowercase initial letters.
pir/cuda_modules/Makefile:4 - what is compute_35? why that? will it ever change?
pir/cuda_modules/pir.ptx:1 - this is generated code. shouldn't it be in gitignore?
pir/kernel_cl.go:21 - no. you can have a template file that you process, but having c code as string within go is not a reasonable way to structure this. put the kernels in c files.
pir/shard.go:12 - GetName is not really part of the interface. Implementations can override 'String()' to get a pretty-printing format, but this isn't a functional part of what we're asking a shard to be.
pir/shard_cl.go:186 - this seems like it's going to fully serialize kernel execution, which will leave the pipeline completely empty at the point Finish returns, meaning there are periods where the GPU has no work to do. ideally the next compute call has been enqueued before the previous one finishes (see the sketch after this list).
pir/shard_cl.go:198 - remove comment
pir/shard_test.go:111 - cpu tests should be in a separate file from the common helper functions
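A hedged sketch of the overlap asked for in the pir/shard_cl.go:186 comment. The event, enqueueKernel, and waitFor names are hypothetical stand-ins for the real OpenCL event/enqueue/wait calls, not the actual ShardCL code:

```go
package pircl

// event, enqueueKernel, and waitFor are hypothetical stand-ins for
// the OpenCL event, kernel-enqueue, and wait-for-events primitives.
type event chan struct{}

func enqueueKernel(batch []byte) event {
	done := make(event)
	go func() { _ = batch; close(done) }() // pretend the GPU runs here
	return done
}

func waitFor(e event) { <-e }

// readAll enqueues the next kernel before blocking on the previous
// one, so the device queue is never fully drained between batches.
func readAll(batches [][]byte) {
	var prev event
	for _, batch := range batches {
		cur := enqueueKernel(batch) // keep work queued behind prev
		if prev != nil {
			waitFor(prev) // block only on the batch before this one
		}
		prev = cur
	}
	if prev != nil {
		waitFor(prev) // drain the final in-flight kernel
	}
}
```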

High level comments:

  • The primary difference between the PIR interface you're picking for Shard and what's in the previous libPIR is that the previous libPIR would allocate the database memory for the writer to write into (getDB, setDB). Are we okay relaxing that restriction?

  • the different systems are a bit intertwined. can we factor these into pir/cpu, pir/cuda, and pir/cl, with the common interface left in pir/?

@ryscheng (Member Author) commented May 9, 2017

Okay, I've started working through your comments. Just for posterity's sake, I ran the benchmarks with OpenCL targeting the CPU instead, just to see how it performs.
High level: the best ShardCPU variant (v1, ~8.2 s/op) outperforms the best ShardCL kernel (v0, ~22.1 s/op) when everything runs on the CPU:
BenchmarkShardCLReadv0-8 1 22108349995 ns/op
BenchmarkShardCLReadv1-8 1 49419166788 ns/op
BenchmarkShardCLReadv2-8 1 54085086233 ns/op
BenchmarkShardCLReadv3-8 1 27393025933 ns/op
BenchmarkShardCPUReadv0-8 1 61838823013 ns/op
BenchmarkShardCPUReadv1-8 1 8249138781 ns/op

@ryscheng (Member Author) commented May 10, 2017

Re: pir/cuda_modules
Specifying the architecture determines which subset of instructions you have access to. compute_35 corresponds to compute capability 3.5, which is when they added atomic instructions. We don't need any of the newer instructions yet, so I think we are okay here as long as your GPU has compute capability 3.5 or newer. See here for a more comprehensive list:
https://developer.nvidia.com/cuda-gpus

Re: committing the ptx file, I figured it was useful because:

  1. it's just intermediate representation, so it's architecture-independent and diffs properly
  2. it's pretty short
  3. it means you don't need nvcc installed to run the repo after cloning it

Re: Shard interface
Personally, I think having Shard represent an immutable static partition of the database is a better abstraction, and easier to test/debug, than a server that flips between 2 working copies of memory. If the concern is that you want to recycle existing memory allocations, we can still do that performance optimization (if it's necessary), but the Shard interface can stay agnostic to it.
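To make the comparison concrete, a rough sketch of the two styles; the names are illustrative, not the actual libPIR or talek APIs:

```go
package pir

// MutablePIR sketches the previous style: the server allocates the
// database buffer and hands it to the writer to mutate in place.
type MutablePIR interface {
	GetDB() []byte
	SetDB(db []byte)
}

// staticShard sketches the proposed style: constructed once over an
// immutable partition. Recycling allocations can still happen behind
// NewShard without the interface knowing about it.
type staticShard struct {
	data []byte
}

// NewShard wraps a static partition of the database.
func NewShard(data []byte) *staticShard {
	return &staticShard{data: data}
}
```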

Just 3 remaining changes to go:

  • pir/context_cl.go:44 - should this be a separate utility function rather than comment?
  • pir/context_cl.go:66 - can we flag which device to use in cases where there are multiple?
  • pir/kernel_cl.go:21 - no. you can have a template file that you process, but having c code as string within go is not a reasonable way to structure this. put the kernels in c files.

@ryscheng (Member Author) commented:

The plan was to split kernel_cl.go into separate C files and read them in dynamically, but the problem is that the cgo interface we are using for OpenCL requires a pointer to the source code, and I spent all morning trying to generate a data structure that contains no Go pointers. See here for why:
golang/go#12416
Otherwise you get the following error:
"panic: runtime error: cgo argument has Go pointer to Go pointer [recovered]"
as seen in this commit: e853b1b

I'm officially giving up. I know kernel_cl.go is ugly, but from what I can tell, every consumer of this 'cl' package does the same thing.
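For anyone hitting this later, here is the constraint (golang/go#12416) in miniature. kernelSource and sourceForCL are made-up names for illustration, not the actual kernel_cl.go code:

```go
package pircl

/*
#include <stdlib.h>
*/
import "C"
import "unsafe"

// kernelSource stands in for the OpenCL C text kept inside the Go file.
var kernelSource = "__kernel void pir() {}"

// sourceForCL builds a C char** that contains no Go pointers, which is
// what program-from-source style CL calls need. Passing a pointer into
// a Go slice of Go strings instead trips the cgo checker with
// "cgo argument has Go pointer to Go pointer", because the slice's
// backing array contains Go string headers.
func sourceForCL() (**C.char, func()) {
	// C.CString copies the Go string into C-allocated memory.
	cstr := C.CString(kernelSource)
	// The array holding that pointer must also live in C memory.
	arr := (**C.char)(C.malloc(C.size_t(unsafe.Sizeof(uintptr(0)))))
	*arr = cstr
	// The caller frees both allocations once the program is built.
	return arr, func() {
		C.free(unsafe.Pointer(cstr))
		C.free(unsafe.Pointer(arr))
	}
}
```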

Also, with respect to specifying devices: in the future, my plan is to enumerate all GPUs and create one context across all of them. See this issue:
#58

In summary, could you take a look at the changes since the last review?

@@ -0,0 +1 @@
package pircl
@willscott (Member) commented on the diff:

stub?

@ryscheng (Member Author) replied:
With our build flags, the OpenCL and CUDA implementations are excluded from build/test on Travis. However, the test step and goveralls still look for a test report from these folders (which otherwise appear empty). See
https://travis-ci.org/privacylab/talek/jobs/230595872

I added doc.go and doc_test.go to keep the tests happy.
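For reference, doc.go is just the one-line package clause shown in the diff above, and doc_test.go is a similar no-op. A sketch of what such a stub test might look like (the test name is an assumption):

```go
package pircl

import "testing"

// TestDoc is a no-op that gives the coverage tooling a test report for
// this package even when the real OpenCL tests are excluded by build
// flags.
func TestDoc(t *testing.T) {}
```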

@willscott (Member) commented:

I think the doc files are there to make godocs show up? Are the tests needed as well?

I guess the only other comment is that it would be great to put in the pretty simple shim between this shard interface and the PIR interface currently used by server code. Then the end-to-end consistency test can verify correctness against these backends rather than doing that at the same time as changing server logic.
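Something like the following adapter is presumably what's meant. PIRServer and its Read signature are guesses at the server-side interface, not the actual talek API, and Shard repeats the earlier sketch so the block stands alone:

```go
package pir

// PIRServer is a guess at the interface the server code consumes.
type PIRServer interface {
	Read(req []byte) ([]byte, error)
}

// Shard matches the interface sketch near the top of this thread;
// repeated here only so this block compiles on its own.
type Shard interface {
	GetShardSize() int
	Read(reqs []byte, reqLength int) ([]byte, error)
}

// shardAdapter exposes a Shard through the server-facing interface, so
// the end-to-end consistency test can exercise any backend.
type shardAdapter struct {
	shard Shard
}

func (a *shardAdapter) Read(req []byte) ([]byte, error) {
	// A single-request batch: reqLength is the request's own length.
	return a.shard.Read(req, len(req))
}
```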

@ryscheng (Member Author) commented:

Resolved the conflict with master

@willscott (Member) commented:

When I try to use pircpu, its dependencies don't seem to have ever been installed:
https://travis-ci.org/privacylab/talek/jobs/231947473

(If I run go get in that directory, I can compile locally.)

That seems like it would be a problem more generally.

@ryscheng merged commit 07c8e32 into master on May 14, 2017