diff --git a/.asf.yaml b/.asf.yaml index 6043b67..84deac3 100644 --- a/.asf.yaml +++ b/.asf.yaml @@ -30,9 +30,9 @@ github: issues: true notifications: - commits: commits@arrow.apache.org + commits: commits@arrow.apache.org issues_status: issues@arrow.apache.org - issues: github@arrow.apache.org + issues: github@arrow.apache.org pullrequests: github@arrow.apache.org publish: diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml index 57eede9..c25fb32 100644 --- a/.github/workflows/lint.yaml +++ b/.github/workflows/lint.yaml @@ -35,7 +35,7 @@ jobs: - uses: actions/checkout@v3 - uses: actions/setup-python@v4 with: - python-version: '3.x' + python-version: "3.x" - name: Run Release audit tool (Rat) run: dev/release/run_rat.sh . @@ -46,7 +46,7 @@ jobs: - uses: actions/checkout@v3 - uses: actions/setup-python@v4 with: - python-version: '3.x' + python-version: "3.x" - name: Run pre-commit run: | python -m pip install pre-commit diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 3b5e536..c30f15a 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -20,3 +20,10 @@ repos: rev: "v15.0.7" hooks: - id: clang-format + types_or: + - c++ + - c + - repo: https://github.com/pre-commit/mirrors-prettier + rev: "v3.0.2" + hooks: + - id: prettier diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..29c8f6b --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,325 @@ + + +# How to contribute + +Thanks for contributing to this project! + +## Report your problems or requests + +Please file issues on the GitHub issue tracker: +https://github.com/apache/arrow-flight-sql-postgresql/issues + +You can also use the following mailing lists: + +- `dev@arrow.apache.org`: for discussions about contributing to the + project development. +- `user@arrow.apache.org`: for questions on usage. + +See https://arrow.apache.org/community/#mailing-lists for how to subscribe +to these mailing lists. 
+ +## Build for development + +### Install dependencies + +You need to install the following dependencies before you build Apache +Arrow Flight SQL adapter for PostgreSQL: + +- PostgreSQL +- Apache Arrow Flight SQL C++ +- Meson +- CMake +- Ninja +- C++ compiler such as `g++` and `clang++` + +#### PostgreSQL + +You can find packages and source archives of PostgreSQL at +https://www.postgresql.org/download/ . If you use packages, you need +to install packages for development. For example, +`postgresql-server-dev-XXX` is for deb packages and +`postgresqlXXX-devel` is for RPM packages. + +The latest release is recommended. + +#### Apache Arrow Flight SQL C++ + +You can find packages and source archives of Apache Arrow C++ at +https://arrow.apache.org/install/ . If you use packages, you need to +install packages for development. For example, +`libarrow-flight-sql-dev` is for deb packages and +`arrow-flight-sql-devel` is for RPM packages. + +The latest release is recommended. + +#### Meson + +Meson is a build system that is also used by PostgreSQL. + +You can install Meson by system package managers such as `apt` and +`brew`. + +For example, you can install Meson by `apt` on Debian GNU/Linux and +Ubuntu: + +```bash +sudo apt install meson +``` + +For example, you can install Meson by `brew` on macOS: + +```bash +brew install meson +``` + +Or you can use `pip3` to install Meson: + +```bash +pip3 install meson +``` + +See also: https://mesonbuild.com/Getting-meson.html + +#### CMake + +CMake is also a build system. Meson uses CMake to find CMake +packages. Apache Arrow Flight SQL adapter for PostgreSQL uses Apache +Arrow Flight SQL C++ CMake package. So both Meson and CMake are +needed. + +If installing CMake bothers contributors, we can improve our build +system to use CMake or pkg-config to find Apache Arrow Flight SQL +C++. 
If you want the improvement, please report it to our issue +tracker: https://github.com/apache/arrow-flight-sql-postgresql/issues + +You can install CMake by package managers such as `apt` and `brew`. + +For example, you can install CMake by `apt` on Debian GNU/Linux and +Ubuntu: + +```bash +sudo apt install cmake +``` + +For example, you can install CMake by `brew` on macOS: + +```bash +brew install cmake +``` + +See also: https://cmake.org/install/ + +#### Ninja + +Ninja is also a build system but it differs from Meson and +CMake. Meson and CMake only generate configuration files for Ninja and +Ninja runs C++ compilers and so on based on the generated +configuration files. + +You can install Ninja by package managers such as `apt` and `brew`. + +For example, you can install Ninja by `apt` on Debian GNU/Linux and +Ubuntu: + +```bash +sudo apt install ninja-build +``` + +For example, you can install Ninja by `brew` on macOS: + +```bash +brew install ninja +``` + +See also: https://ninja-build.org/ + +### Build + +If you install PostgreSQL and Apache Arrow Flight SQL C++ to system +directory such as `/usr`, you can use the following simple command +lines: + +```bash +meson setup build +meson compile -C build +meson install -C build +``` + +If you install PostgreSQL to `/tmp/local`, you can use +`-Dpostgresql_dir=/tmp/local` option: + +```bash +meson setup -Dpostgresql_dir=/tmp/local build +meson compile -C build +meson install -C build +``` + +If you specify `postgresql_dir`, it's recommended that you also +specify `--prefix` with the same location. 
Apache Arrow Flight SQL +adapter for PostgreSQL installs README and so on to +`--prefix`: + +```bash +meson setup -Dpostgresql_dir=/tmp/local --prefix=/tmp/local build +meson compile -C build +meson install -C build +``` + +If you install Apache Arrow Flight SQL C++ to `/tmp/local`, you can +use `--cmake-prefix-path`: + +```bash +meson setup --cmake-prefix-path=/tmp/local build +meson compile -C build +meson install -C build +``` + +If you want to build benchmark programs, you can use +`-Dbenchmark=true`: + +```bash +meson setup -Dbenchmark=true build +meson compile -C build +meson install -C build +``` + +If you want to build example programs, you can use `-Dexample=true`: + +```bash +meson setup -Dexample=true build +meson compile -C build +meson install -C build +``` + +### Test + +You need Ruby and Red Arrow Flight SQL (red-arrow-flight-sql gem, +Apache Arrow Flight SQL C++ bindings for Ruby) to run tests. + +You can install Ruby by package managers such as `apt` and `brew`. + +For example, you can install Ruby by `apt` on Debian GNU/Linux and +Ubuntu: + +```bash +sudo apt install ruby +``` + +```bash +brew install ruby +``` + +You can install Red Arrow Flight SQL by Bundler that is bundled in +Ruby: + +```bash +bundle install +``` + +You can run tests in the build directory. We can change the current +directory before we run a Ruby script by `ruby`'s `-C` option: + +```bash +bundle exec ruby -C build ../test/run.rb +``` + +### Run + +You can use `dev/run-postgresql.sh` to run PostgreSQL with Apache +Arrow Flight SQL adapter for PostgreSQL. You need to specify a +PostgreSQL data directory to use `dev/run-postgresql.sh`: + +```bash +dev/run-postgresql.sh /tmp/afs +``` + +You can connect to `grpc://127.0.0.1:15432`. + +If you build example programs, you can test the endpoint by the +following command line: + +```bash +PGDATABASE=postgres example/flight-sql/authenticate-password +``` + +If you specify a CA name, a server name and a client name, it also prepares +TLS. 
+ +For example, you can prepare `root.home`, `server.home` and +`client.home` by adding the following entry to `/etc/hosts`: + +```text +127.0.0.1 localhost root.home server.home client.home +``` + +In this case, you can prepare TLS and run PostgreSQL by the following +command line: + +```bash +dev/run-postgresql.sh /tmp/afs root.home server.home client.home +``` + +You can connect to `grpc+tls://server.home:15432`. You need to use +`/tmp/afs/root.crt` for TLS root certificates. + +If you build example programs, you can test the endpoint by the +following command line: + +```bash +PGDATABASE=postgres \ + PGHOST=server.home \ + PGSSLMODE=require \ + PGSSLROOTCERT=/tmp/afs/root.crt \ + example/flight-sql/authenticate-password +``` + +## Pull request + +Please open a new issue before you open a pull request. It's for the +[Openness](http://theapacheway.com/open/) of this project. + +We don't have rules for pull request titles and commit messages +yet. We may create rules later. Please see other merged pull requests +for now. + +You can format code automatically with +[`pre-commit`](https://pre-commit.com/). + +You can install `pre-commit` by package managers such as `apt` and +`brew`. + +For example, you can install `pre-commit` by `apt` on Debian GNU/Linux +and Ubuntu: + +```bash +sudo apt install pre-commit +``` + +For example, you can install `pre-commit` by `brew` on macOS: + +```bash +brew install pre-commit +``` + +You can run `pre-commit` before you commit: + +```shell +pre-commit run +``` diff --git a/README.md b/README.md new file mode 100644 index 0000000..b19f803 --- /dev/null +++ b/README.md @@ -0,0 +1,35 @@ + + +# Apache Arrow Flight SQL adapter for PostgreSQL + +This is a PostgreSQL extension that adds an [Apache Arrow Flight +SQL][apache-arrow-flight-sql] endpoint to PostgreSQL. + +See https://arrow.apache.org/flight-sql-postgresql/ for details. + +## How to contribute + +See [CONTRIBUTING.md](CONTRIBUTING.md). + +## License + +The Apache License 2.0. 
See [LICENSE.txt](LICENSE.txt) for details. + +[apache-arrow-flight-sql]: https://arrow.apache.org/docs/format/FlightSql.html diff --git a/benchmark/integer/README.md b/benchmark/integer/README.md index 6ec23ae..86fe02d 100644 --- a/benchmark/integer/README.md +++ b/benchmark/integer/README.md @@ -40,9 +40,9 @@ It also inserts 10M records with random integers to the table. Run the following programs: -* `select.rb`: It uses Apache Arrow Flight SQL -* `select`: It uses PostgreSQL's C API -* `select.sql`: You need to use `psql` to run this +- `select.rb`: It uses Apache Arrow Flight SQL +- `select`: It uses PostgreSQL's C API +- `select.sql`: You need to use `psql` to run this All of them just run `SELECT * FROM data`. @@ -50,14 +50,14 @@ All of them just run `SELECT * FROM data`. Here is a benchmark result on the following environment: -* OS: Debian GNU/Linux sid -* CPU: AMD Ryzen 9 3900X 12-Core Processor -* Memory: 64GiB -* PostgreSQL: 16 (not released yet) +- OS: Debian GNU/Linux sid +- CPU: AMD Ryzen 9 3900X 12-Core Processor +- Memory: 64GiB +- PostgreSQL: 16 (not released yet) 019f8624664dbf1e25e2bd721c7e99822812d109 -* Apache Arrow: 12.0.0-SNAPSHOT +- Apache Arrow: 12.0.0-SNAPSHOT 237705bf17486cfc35ab7d1ddfe59dd60f042ab8 -* Apache Arrow Flight SQL PostgreSQL adapter: +- Apache Arrow Flight SQL PostgreSQL adapter: 0.1.0 (not released yet) 120e7bbd3fd580c892c988499d488c7e8b34efe2 @@ -80,4 +80,3 @@ Here is a benchmark result on the following environment: | Apache Arrow Flight SQL | C | psql | | ----------------------- | ----- | ----- | | 0.653 | 1.154 | 1.128 | - diff --git a/dev/run-postgresql.sh b/dev/run-postgresql.sh index 209bfb9..834b618 100755 --- a/dev/run-postgresql.sh +++ b/dev/run-postgresql.sh @@ -19,16 +19,29 @@ set -eu -if [ $# -ne 4 ]; then +if [ $# -ne 1 -a $# -ne 4 ]; then + echo "Usage: $0 DATA_DIRECTORY" + echo " e.g.: $0 /tmp/afs" + echo "Or:" echo "Usage: $0 DATA_DIRECTORY CA_NAME SERVER_NAME CLIENT_NAME" echo " e.g.: $0 /tmp/afs 
root.example.com server.example.com client.example.com" exit 1 fi data_directory=$1 -root_name=$2 -server_name=$3 -client_name=$4 +if [ $# -eq 1 ]; then + scheme=grpc + ssl=off + ssl_ca_file= + server_name=127.0.0.1 +else + scheme=grpc+tls + ssl=on + ssl_ca_file=root.crt + root_name=$2 + server_name=$3 + client_name=$4 +fi base_directory="$(cd "$(dirname "$0")" && pwd)" @@ -36,15 +49,17 @@ rm -rf "${data_directory}" initdb \ --locale=C \ - --set=arrow_flight_sql.uri=grpc+tls://${server_name}:15432 \ + --set=arrow_flight_sql.uri=${scheme}://${server_name}:15432 \ --set=shared_preload_libraries=arrow_flight_sql \ - --set=ssl=on \ - --set=ssl_ca_file=root.crt \ + --set=ssl=${ssl} \ + --set=ssl_ca_file=${ssl_ca_file} \ "${data_directory}" -pushd "${data_directory}" -"${base_directory}/prepare-tls.sh" \ - "${root_name}" \ - "${server_name}" \ - "${client_name}" -popd +if [ "${ssl}" = "on" ]; then + pushd "${data_directory}" + "${base_directory}/prepare-tls.sh" \ + "${root_name}" \ + "${server_name}" \ + "${client_name}" + popd +fi LANG=C postgres -D "${data_directory}" diff --git a/doc/index.html b/doc/index.html index b1fa92d..382d778 100644 --- a/doc/index.html +++ b/doc/index.html @@ -1,4 +1,4 @@ - +
- + - - + diff --git a/doc/source/_static/switcher.json b/doc/source/_static/switcher.json index 5d13223..f0f778b 100644 --- a/doc/source/_static/switcher.json +++ b/doc/source/_static/switcher.json @@ -1,17 +1,17 @@ [ - { - "name": "devel", - "url": "https://arrow.apache.org/flight-sql-postgresql/devel/", - "version": "devel" - }, - { - "name": "current", - "url": "https://arrow.apache.org/flight-sql-postgresql/current/", - "version": "current" - }, - { - "name": "0.1.0", - "url": "https://arrow.apache.org/flight-sql-postgresql/0.1.0/", - "version": "0.1.0" - } + { + "name": "devel", + "url": "https://arrow.apache.org/flight-sql-postgresql/devel/", + "version": "devel" + }, + { + "name": "current", + "url": "https://arrow.apache.org/flight-sql-postgresql/current/", + "version": "current" + }, + { + "name": "0.1.0", + "url": "https://arrow.apache.org/flight-sql-postgresql/0.1.0/", + "version": "0.1.0" + } ] diff --git a/doc/source/configuration.md b/doc/source/configuration.md index ae220b9..49b586e 100644 --- a/doc/source/configuration.md +++ b/doc/source/configuration.md @@ -50,7 +50,7 @@ Note that you also need to setup client side. For example, see the following documentations for the C++ implementation of Apache Arrow Flight SQL client: -* [Enable TLS][arrow-flight-tls] +- [Enable TLS][arrow-flight-tls] ```{note} mTLS (mutual-TLS) isn't implemented yet. If you're interested in mTLS, @@ -72,7 +72,7 @@ automatically. The maximum number of rows per record batch. -The default is 1 * 1024 * 1024 rows. +The default is 1 \* 1024 \* 1024 rows. If this value is small, total data exchange time will be slower. 
diff --git a/doc/source/index.md b/doc/source/index.md index fedc6db..ca70714 100644 --- a/doc/source/index.md +++ b/doc/source/index.md @@ -32,9 +32,8 @@ configuration.md client.md ``` - ## Indices and tables -* {ref}`genindex` -* {ref}`modindex` -* {ref}`search` +- {ref}`genindex` +- {ref}`modindex` +- {ref}`search` diff --git a/doc/source/install.md b/doc/source/install.md index 4daa30e..b716a7b 100644 --- a/doc/source/install.md +++ b/doc/source/install.md @@ -23,8 +23,8 @@ Supported versions: -* Debian GNU/Linux bookworm -* Ubuntu 22.04 LTS +- Debian GNU/Linux bookworm +- Ubuntu 22.04 LTS Enable the PostgreSQL APT repository: @@ -63,11 +63,11 @@ See {doc}`configuration` how to configure Apache Arrow Flight SQL adapter for Po You need to install the followings before you build Apache Arrow Flight SQL adapter for PostgreSQL: -* PostgreSQL: https://www.postgresql.org/download/ -* Apache Arrow C++ with Flight SQL support: https://arrow.apache.org/install/ -* Meson: https://mesonbuild.com/ -* Ninja: https://ninja-build.org/ -* C++ compiler such as `g++` and `clang++ +- PostgreSQL: https://www.postgresql.org/download/ +- Apache Arrow C++ with Flight SQL support: https://arrow.apache.org/install/ +- Meson: https://mesonbuild.com/ +- Ninja: https://ninja-build.org/ +- C++ compiler such as `g++` and `clang++` Here are command lines to build Apache Arrow Flight SQL adapter for PostgreSQL: diff --git a/doc/source/release-notes.md b/doc/source/release-notes.md index f3159e3..3f9dd1a 100644 --- a/doc/source/release-notes.md +++ b/doc/source/release-notes.md @@ -23,4 +23,4 @@ ### Improvements - * The initial release! +- The initial release!