add postgreSQL docker image with udf (#98)
add postgre docker image with udf
fbalicchia committed Mar 2, 2022
1 parent 59e27c2 commit 61bc5e6
Showing 5 changed files with 171 additions and 2 deletions.
8 changes: 6 additions & 2 deletions cc/postgres/README.md
@@ -3,9 +3,9 @@
This subdirectory contains a PostgreSQL extension providing several epsilon-DP
aggregate functions. We will refer to them as the anonymous functions.

-## Setup
+## Setup

-* Install Postgres 11 using the source code.
+* Install Postgres 12 using the source code.
* Source: https://www.postgresql.org/ftp/source/
* Instructions: https://www.postgresql.org/docs/9.3/install-short.html

@@ -26,6 +26,10 @@ aggregate functions. We will refer to them as the anonymous functions.
CREATE EXTENSION anon_func;
```

## Setup Docker deployment

Please refer to the [README](./docker/README.md) in the docker folder.

### Common Issues

There are several known setup problems; we list suggested solutions for them
71 changes: 71 additions & 0 deletions cc/postgres/docker/Dockerfile
@@ -0,0 +1,71 @@
FROM postgres:12 AS build

ENV PROTOC_VERSION=1.3


# Install the packages which will be required to get everything to compile
RUN apt-get update \
    && apt-get install -f -y --no-install-recommends \
        software-properties-common \
        build-essential \
        pkg-config \
        git \
        curl \
        libreadline-dev \
        bison \
        flex \
        postgresql-server-dev-$PG_MAJOR \
    && add-apt-repository "deb http://ftp.debian.org/debian testing main contrib" \
    && apt-get update && apt-get install -f -y --no-install-recommends \
        libprotobuf-c-dev=$PROTOC_VERSION.* \
    && rm -rf /var/lib/apt/lists/*

# Install Bazel to build the code
RUN echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | tee /etc/apt/sources.list.d/bazel.list \
    && curl https://bazel.build/bazel-release.pub.gpg | apt-key add -

RUN apt-get update \
    && apt-get install -y bazel=4.1.0 \
    && rm -rf /var/lib/apt/lists/*


# Download the differential privacy module and build the extension
RUN git clone https://github.com/google/differential-privacy.git --single-branch /tmp/differential-privacy \
    && cd /tmp/differential-privacy \
    && git checkout fc4f2abda5052f654539fc128 \
    && cd /tmp/differential-privacy/cc \
    && bazel build postgres/anon_func.so




FROM postgres:12

RUN apt-get update \
    && apt-get install -f -y --no-install-recommends \
        software-properties-common \
    && add-apt-repository "deb http://ftp.debian.org/debian testing main contrib" \
    && apt-get update && apt-get install -f -y --no-install-recommends \
        libprotobuf-c1 \
    && rm -rf /var/lib/apt/lists/*

COPY --from=build /tmp/differential-privacy/cc/bazel-bin/postgres/anon_func.so /usr/lib/postgresql/$PG_MAJOR/lib/
COPY --from=build /tmp/differential-privacy/cc/postgres/anon_func.control /usr/share/postgresql/$PG_MAJOR/extension/
COPY --from=build /tmp/differential-privacy/cc/postgres/anon_func--1.0.0.sql /usr/share/postgresql/$PG_MAJOR/extension/

# Copy the sample datasets
COPY --from=build /tmp/differential-privacy/cc/postgres/fruiteaten.csv /
COPY --from=build /tmp/differential-privacy/cc/postgres/shirts.csv /

# Copy the custom configuration which will be passed down to the server
# (providing it as a .sample file is the way the base Docker image prefers)
COPY postgresql.conf.sample /usr/share/postgresql/postgresql.conf.sample
RUN chmod 755 /usr/share/postgresql/postgresql.conf.sample

# Copy the script which will initialize the replication permissions
COPY /docker-entrypoint-initdb.d /docker-entrypoint-initdb.d
RUN chmod 755 /docker-entrypoint-initdb.d

81 changes: 81 additions & 0 deletions cc/postgres/docker/README.md
@@ -0,0 +1,81 @@
This image is based on [`postgres:12`](https://hub.docker.com/_/postgres/) and adds a PostgreSQL extension providing several epsilon-DP aggregate functions.

## How to build the image

```
docker build --no-cache -t google/differential-privacy-postgres .
```
Then run it:
```
docker run -e POSTGRES_PASSWORD=password -p 5432:5432 google/differential-privacy-postgres
```
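
The server needs a moment to initialise after the container starts. The following is a minimal sketch for waiting until it accepts connections, assuming `pg_isready` (part of the PostgreSQL client tools) is installed on the host and that the port mapping above is used. The function name `wait_for_postgres` and the `PG_ISREADY` / `WAIT_INTERVAL` override variables are hypothetical helpers, not part of this image.

```shell
# wait_for_postgres HOST PORT [RETRIES]
# Polls pg_isready until the server accepts connections or the retries run out.
# PG_ISREADY and WAIT_INTERVAL are hypothetical overrides for this sketch only.
wait_for_postgres() {
  local host="$1" port="$2" retries="${3:-30}"
  local probe="${PG_ISREADY:-pg_isready}"
  local i=0
  while [ "$i" -lt "$retries" ]; do
    # -q suppresses output; the exit status is 0 once the server accepts connections
    if "$probe" -h "$host" -p "$port" -q; then
      return 0
    fi
    sleep "${WAIT_INTERVAL:-1}"
    i=$((i + 1))
  done
  return 1
}
```

Since the `docker run` command above stays in the foreground, this would be called from a second terminal, for example `wait_for_postgres localhost 5432 && echo "server is ready"`.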

## Load extension

```
psql -U postgres -h localhost -p 5432
```
Connect using the password `password`, then load the extension by running:
```
CREATE EXTENSION anon_func;
```
## Run anon_func in Postgres

Import the sample data already present in the image with the following commands:

```
CREATE TABLE FruitEaten (
  uid integer,
  fruit character varying(20)
);
COPY fruiteaten(uid, fruit) FROM '/fruiteaten.csv' DELIMITER ',' CSV HEADER;
```
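
The interactive steps above can also be scripted. The sketch below prints the same statements, plus a row count as a quick check, so they can be piped into `psql`. The function name `emit_setup_sql` and the final `SELECT` are additions for illustration only.

```shell
# emit_setup_sql: prints the SQL from the steps above.
# The trailing SELECT is an extra sanity check, not part of the original steps.
emit_setup_sql() {
  cat <<'SQL'
CREATE EXTENSION anon_func;
CREATE TABLE FruitEaten (
  uid integer,
  fruit character varying(20)
);
COPY fruiteaten(uid, fruit) FROM '/fruiteaten.csv' DELIMITER ',' CSV HEADER;
SELECT COUNT(*) AS rows_loaded FROM FruitEaten;
SQL
}
```

One possible invocation, assuming the password and port mapping used earlier, is `emit_setup_sql | PGPASSWORD=password psql -U postgres -h localhost -p 5432 -v ON_ERROR_STOP=1`. It is meant to run once against a fresh container: running it again fails on the first statement because the extension already exists.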

## Simple Count (Disclaimer: this section is part of the C++ project)


In this table, each row represents one fruit eaten. So if person `1` eats two
`apple`s, then there will be two rows in the table with column values
`(1, apple)`. Consider a simple query counting how many of each fruit have been
eaten.

```
SELECT fruit, COUNT(fruit)
FROM FruitEaten
GROUP BY fruit;
```


Suppose that instead of getting the regular count, we want the differentially
private count with the privacy parameter ε=ln(3). The final product of the query
rewrite would be

```
SELECT result.fruit, result.number_eaten
FROM (
  SELECT per_person.fruit,
    ANON_SUM(per_person.fruit_count, LN(3)/2) as number_eaten,
    ANON_COUNT(uid, LN(3)/2) as number_eaters
  FROM (
    SELECT * , ROW_NUMBER() OVER (
      PARTITION BY uid
      ORDER BY random()
    ) as row_num
    FROM (
      SELECT fruit, uid, COUNT(fruit) as fruit_count
      FROM FruitEaten
      GROUP BY fruit, uid
    ) as per_person_raw
  ) as per_person
  WHERE per_person.row_num <= 5
  GROUP BY per_person.fruit
) as result
WHERE result.number_eaters > 50;
```
```
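
Reading the query from the inside out: the innermost subquery counts fruit per `(fruit, uid)` pair, the `ROW_NUMBER()` step keeps at most 5 randomly chosen fruits per `uid`, and the two aggregates each receive `LN(3)/2`, which adds up to the stated ε=ln(3). The sketch below mirrors only the contribution-bounding and grouping steps on a CSV, as a way to see what the aggregates receive as input. It is an illustration under loud assumptions: it adds no noise, it keeps the first `K` fruits per `uid` in sorted order instead of a random sample, it assumes fruit names contain no spaces or commas, and `bounded_totals` is a hypothetical name.

```shell
# bounded_totals K
# Reads "uid,fruit" CSV with a header line on stdin.
# Prints "fruit,number_eaten,number_eaters" with NO noise added.
# Assumption: keeps the first K fruits per uid in sorted order,
# whereas the SQL query samples them with ORDER BY random().
bounded_totals() {
  local k="$1"
  tail -n +2 \
    | LC_ALL=C sort -t, -k1,1 -k2,2 \
    | uniq -c \
    | awk -v k="$k" '
        {
          count = $1                 # rows for this (uid, fruit) pair
          split($2, f, ",")
          uid = f[1]; fruit = f[2]
          if (++kept[uid] <= k) {    # cap each uid at k fruits
            eaten[fruit] += count
            eaters[fruit]++
          }
        }
        END {
          for (x in eaten) printf "%s,%d,%d\n", x, eaten[x], eaters[x]
        }
      ' \
    | LC_ALL=C sort
}
```

For example, `bounded_totals 5 < fruiteaten.csv` would list the exact bounded totals for a local copy of the sample file. The real query then replaces these exact values with noisy ones and drops fruits whose noisy eater count is 50 or less.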

For more information or a more detailed description, please refer to the [README](https://github.com/google/differential-privacy/blob/main/cc/README.md).

## Troubleshooting

If you run into permission problems on `docker-entrypoint-initdb.d` or `postgresql.conf.sample` during startup, run `chmod 755` on both.
@@ -0,0 +1,4 @@
#!/bin/bash
set -e

{ echo "host replication $POSTGRES_USER 0.0.0.0/0 trust"; } >> "$PGDATA/pg_hba.conf"
9 changes: 9 additions & 0 deletions cc/postgres/docker/postgresql.conf.sample
@@ -0,0 +1,9 @@
# LOGGING
# log_min_error_statement = fatal
# log_min_messages = DEBUG1

# CONNECTION
listen_addresses = '*'

# MODULES
shared_preload_libraries = 'anon_func'
