Initial C++ cookbook (#22)
* Initial C++ cookbook

* Addressing PR feedback.  Converted contributing doc from RST to MD (since the extension was MD).

* Update cpp/CONTRIBUTING.md

Co-authored-by: David Li <li.davidm96@gmail.com>

* Creating standalone section for code of conduct

* Update cpp/CONTRIBUTING.md

Co-authored-by: David Li <li.davidm96@gmail.com>

* Addressing PR comments

Co-authored-by: David Li <li.davidm96@gmail.com>
westonpace and lidavidm committed Aug 23, 2021
1 parent b15a357 commit d15e75c1d15b22b27d634a4254ef9ee8c41a9737
Showing 22 changed files with 1,155 additions and 3 deletions.
@@ -29,10 +29,47 @@ jobs:
name: build_book
path: build/

make_cpp:
name: build c++
runs-on: ubuntu-latest
defaults:
run:
shell: bash -l {0}
steps:
- uses: actions/checkout@v1
- name: Cache conda
uses: actions/cache@v2
env:
# Increase this value to reset cache if cpp/environment.yml has not changed
CACHE_NUMBER: 0
with:
path: ~/conda_pkgs_dir
key:
${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-${{ hashFiles('cpp/environment.yml') }}
- name: Setup environment
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.9
activate-environment: cookbook-cpp
environment-file: cpp/environment.yml
auto-activate-base: false
- name: Test
run:
echo ${CONDA_PREFIX}
- name: Build cookbook
run:
make cpp
- name: Upload cpp book
uses: actions/upload-artifact@v1
with:
name: cpp_book
path: build/cpp

deploy_cookbooks:
name: deploy
runs-on: ubuntu-latest
needs: make_cookbooks
needs: [make_cookbooks, make_cpp]
steps:
- name: Checkout repo
uses: actions/checkout@v2
@@ -47,6 +84,11 @@ jobs:
with:
name: build_book
path: .
- name: Download cpp book
uses: actions/download-artifact@v1.0.0
with:
name: cpp_book
path: ./cpp
- name: Push changes to gh-pages/asf-site branch
run: |
git config --global user.name 'GitHub Actions'
@@ -1,12 +1,18 @@
r/content/_book/**
r/content/_main.Rmd
r/*.Rproj
*.Rproj
.Rproj.user

*.idea/
*.vscode/

*build/

*.parquet
*.arrow
*.arrows
*.csv
*.feather

*.pyc
@@ -2,10 +2,10 @@ all: html


html: py r
@echo "\n\n>>> Cookbooks Available in ./build <<<"
@echo "\n\n>>> Cookbooks (except C++) Available in ./build <<<"


test: pytest rtest
test: pytest rtest


help:
@@ -48,8 +48,23 @@ r: rdeps
mkdir -p build/r
cp -r r/content/_book/* build/r


rtest: rdeps
@echo ">>> Testing R Cookbook <<<\n"
cd ./r && Rscript ./scripts/test.R


cpptest:
@echo ">>> Running C++ Tests/Snippets <<<\n"
rm -rf cpp/recipe-test-build
mkdir cpp/recipe-test-build
cd cpp/recipe-test-build && cmake ../code -DCMAKE_BUILD_TYPE=Debug && cmake --build . && ctest -j 1
mkdir -p cpp/build
cp cpp/recipe-test-build/recipes_out.arrow cpp/build


cpp: cpptest
@echo ">>> Building C++ Cookbook <<<\n"
cd cpp && make html
mkdir -p build/cpp
cp -r cpp/build/html/* build/cpp
@@ -38,6 +38,7 @@ <h1>Apache Arrow Cookbook</h1>
but some are specific to the language and environment in use.
</p>
<ul>
<li><a href="cpp/index.html">C++ Cookbook</a></li>
<li><a href="py/index.html">Python Cookbook</a></li>
<li><a href="r/index.html">R Cookbook</a></li>
</ul>
@@ -0,0 +1,170 @@
# Building the C++ Cookbook

The C++ cookbook combines output from a set of C++ test programs with
a reStructuredText (RST) document tree rendered with Sphinx.

Running `make cpp` from the cookbook root directory (the one where
the `README.rst` exists) will install all necessary dependencies,
run the tests to generate the output, and compile the cookbook
to HTML.

You will see the compiled result inside the `build/cpp` directory.

The above process requires conda to be installed and is primarily
intended for build systems. See below for more information on setting
up a development environment for developing recipes.

# Developing C++ Recipes

Every recipe is a combination of prose written in RST
format using the [Sphinx](https://www.sphinx-doc.org/) documentation
system and a snippet of a googletest test.

New recipes can be added to one of the existing `.rst` files if
they suit that section or you can create new sections by adding
additional `.rst` files in the `source` directory. You just
need to remember to add them to the `index.rst` file in the
`toctree` for them to become visible.

## Referencing a C++ Snippet

Most recipes will reference a snippet of C++ code. For simplicity,
a custom `recipe` directive is provided that can be used like so:

```
.. recipe:: ../code/creating_arrow_objects.cc CreatingArrays
:caption: Creating an array from C++ primitives
:dedent: 4
```

Each `recipe` directive has two required arguments. The first is
the path to the source file containing the snippet, and the second is
the name of the snippet, which must correspond to a pair of
`StartRecipe`/`EndRecipe` calls in that source file.

The directive will generate two code blocks in the cookbook. The first
code block will contain the source code itself and will be annotated
with any (optional) caption specified on the recipe directive. The
second block will contain the test output.

The optional `dedent` argument should be used to remove leading white
space from your source code.
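
To make the effect of `dedent` concrete, here is a minimal sketch of the transformation it applies to a snippet. The function name `Dedent` is hypothetical and is not part of the cookbook or of Sphinx; it only mimics the visible result of `:dedent: 4` on code that sits four spaces deep inside a test body.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Illustration only (hypothetical helper, not the Sphinx implementation):
// strip up to `width` leading spaces from every line of `text`.
std::string Dedent(const std::string& text, std::size_t width) {
  std::istringstream in(text);
  std::string line;
  std::string result;
  while (std::getline(in, line)) {
    // Count how many leading spaces can be removed from this line.
    std::size_t strip = 0;
    while (strip < width && strip < line.size() && line[strip] == ' ') {
      ++strip;
    }
    result += line.substr(strip);
    result += '\n';
  }
  return result;
}
```

With a width of 4, a line indented by four spaces ends up flush left, while a line indented by six spaces keeps two, so relative indentation inside the snippet is preserved.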

## Writing a C++ Snippet

Each snippet source file contains a set of
[googletest](https://github.com/google/googletest) tests. Feel free to
use any googletest features needed to help set up and verify your test.
To reference a snippet you need to surround it with `StartRecipe` and
`EndRecipe` calls. For example:

```
StartRecipe("CreatingArrays");
arrow::Int32Builder builder;
ASSERT_OK(builder.Append(1));
ASSERT_OK(builder.Append(2));
ASSERT_OK(builder.Append(3));
ASSERT_OK_AND_ASSIGN(shared_ptr<arrow::Array> arr, builder.Finish());
rout << arr->ToString() << endl;
EndRecipe("CreatingArrays");
```

The variable `rout` is set to a `std::ostream` instance that is used to
capture test output. Anything output to `rout` will show up in the recipe
output block when the recipe is rendered into the cookbook.

## Referencing Arrow C++ Documentation

The Arrow project has its own documentation for the C++ implementation that
is hosted at https://arrow.apache.org/docs/cpp/index.html. Fortunately,
this documentation is also built with Sphinx and so we can use the extension
`intersphinx` to reference sections of this documentation. To do so simply
write a standard Sphinx reference like so:

```
Typed subclasses of :cpp:class:`arrow::ArrayBuilder` make it easy
to efficiently create Arrow arrays from existing C++ data
```

A helpful command is
`python -msphinx.ext.intersphinx https://arrow.apache.org/docs/objects.inv`
which will list all of the possible targets to link to.

# Development Workflow

Running `make` at the top level can be rather slow as it will rebuild the
entire environment each time. It is primarily intended for use in CI and
requires you to have conda installed.

For recipe development you are encouraged to create your own out-of-source
cmake build. For example:

```
mkdir cpp/code/build
cd cpp/code/build
cmake .. -DCMAKE_BUILD_TYPE=Debug
cmake --build .
ctest
```

Then you can rerun all of the tests with `ctest` and you can rebuild and
rerun individual tests much more quickly with something like
`cmake --build . --target creating_arrow_objects && ctest creating_arrow_objects`.
Every time the cmake build is run it will update the recipe output file
referenced by the Sphinx build, so after rerunning a test you can view the
output by running `make html` in the `cpp` directory.

## Using Conda

If you are using conda then there is a file `cpp/environment.yml` which can be
used to create an environment for recipe development with the command:

```
conda env create -n cookbook-cpp --file cpp/environment.yml
```

# Development Philosophy

## Everything is the Cookbook

The entire document should serve as an example of how to use Arrow C++, not just the
referenced snippets. This means that the style rules and guidelines below also apply to
source code that is not referenced by the cookbook itself.

## Style

This cookbook follows the same style rules as Arrow C++, which is the Google style
guide with a few exceptions described
[here](https://arrow.apache.org/docs/developers/cpp/development.html#code-style-linting-and-ci).

## Simple

The examples should be as simple as possible. If complex code (e.g. templates) can be
used to do something more efficiently then there should be a simple, inefficient version
alongside the more complex version.

Do not use `auto` in any of the recipes unless you must (e.g. lambdas). Cookbook
viewers will be using a browser, not an IDE, and it is not always simple to determine
the inferred type.

# The Custom Recipe Directive

C++ is not, at the moment, a "notebook friendly" language and it does not lend itself well
to being embedded inside an RST file. As such, we use a custom directive to link the
googletest source files and the RST prose. The directive works with the helper methods
`StartRecipe` and `EndRecipe` defined in `common.h`.

The helper method `StartRecipe` will begin capturing output to `rout`. The helper method
`EndRecipe` will append the captured output and recipe name to string arrays. There is code
in `main.cc` which runs after the tests run to dump these arrays to a .arrow file (i.e. the
arrays will be serialized as a table using the Arrow IPC format).
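
The capture mechanism described above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the code in `common.h`: the names `current_recipe`, `recipe_names` and `recipe_outputs` are hypothetical, `std::vector` stands in for the Arrow string arrays, and the step that writes the `.arrow` file is left out.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the helpers declared in common.h. The real
// helpers collect results in Arrow string arrays; std::vector is used here
// so the sketch has no dependencies beyond the standard library.
std::ostringstream rout;                  // recipes write their output here
std::string current_recipe;               // name given to StartRecipe
std::vector<std::string> recipe_names;    // one entry per finished recipe
std::vector<std::string> recipe_outputs;  // captured output, same order

void StartRecipe(const std::string& name) {
  // Discard anything written to rout before the recipe began.
  rout.str("");
  rout.clear();
  current_recipe = name;
}

void EndRecipe(const std::string& name) {
  // StartRecipe and EndRecipe are expected to be paired by name.
  assert(name == current_recipe);
  recipe_names.push_back(name);
  recipe_outputs.push_back(rout.str());
  current_recipe.clear();
}
```

In this sketch, anything streamed to `rout` between the two calls ends up in `recipe_outputs` at the same index as the recipe name, which is the pairing the `recipe` directive needs in order to look up the output for a given snippet.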

When the Sphinx build runs, the directive `recipe` (defined in `cpp/ext`) will be loaded.
During this load the dataset of test outputs will be read. These test outputs will be used
whenever a recipe is referenced.

# Code of Conduct

All participation in the Apache Arrow project is governed by the Apache
Software Foundation’s
[code of conduct](https://www.apache.org/foundation/policies/conduct.html).
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?= -jauto
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@@ -0,0 +1,20 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
---
BasedOnStyle: Google
DerivePointerAlignment: false
ColumnLimit: 90
@@ -0,0 +1 @@
*-build*/
@@ -0,0 +1,47 @@
cmake_minimum_required(VERSION 3.19)
project(arrow-cookbook)

set(CMAKE_CXX_STANDARD 17)

# Add googletest
include(FetchContent)
FetchContent_Declare(
googletest
GIT_REPOSITORY https://github.com/google/googletest.git
GIT_TAG e2239ee6043f73722e7aa812a459f54a28552929 # release-1.11.0
)
# For Windows: Prevent overriding the parent project's compiler/linker settings
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
FetchContent_MakeAvailable(googletest)

# Add Arrow
find_package(Arrow REQUIRED)

# Create test targets
enable_testing()
include(GoogleTest)
add_executable(
creating_arrow_objects
creating_arrow_objects.cc
common.cc
main.cc
)
target_link_libraries(
creating_arrow_objects
arrow_shared
gtest
)
gtest_discover_tests(creating_arrow_objects)

add_executable(
basic_arrow
basic_arrow.cc
common.cc
main.cc
)
target_link_libraries(
basic_arrow
arrow_shared
gtest
)
gtest_discover_tests(basic_arrow)
