[CodeFormat] Add clang-format script (#4934)
Run build-support/check-format.sh to check C++ code style.
Run build-support/clang-format.sh to fix C++ style issues.
sduzh committed Nov 28, 2020
1 parent 6fedf58 commit d7225d6
Showing 6 changed files with 335 additions and 40 deletions.
34 changes: 34 additions & 0 deletions build-support/check-format.sh
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

##############################################################
# This script runs clang-format to check C++ files without
# modifying them.
##############################################################

set -eo pipefail

ROOT=`dirname "$0"`
ROOT=`cd "$ROOT"; pwd`

export DORIS_HOME=`cd "${ROOT}/.."; pwd`

CLANG_FORMAT=${CLANG_FORMAT_BINARY:=$(which clang-format)}

python3 "${DORIS_HOME}/build-support/run_clang_format.py" --clang_format_binary="${CLANG_FORMAT}" --source_dirs="${DORIS_HOME}/be/src,${DORIS_HOME}/be/test" --quiet
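
The `CLANG_FORMAT=${CLANG_FORMAT_BINARY:=$(which clang-format)}` line above takes the binary from the `CLANG_FORMAT_BINARY` environment variable and falls back to the `clang-format` found on `PATH` when that variable is unset or empty. A minimal Python sketch of the same lookup, for illustration only: the helper name `resolve_clang_format` is hypothetical and not part of this commit.

```python
import os
import shutil


def resolve_clang_format(env=None):
    """Return the clang-format path the wrapper scripts would likely pick.

    Hypothetical helper, assumed to mirror
    ${CLANG_FORMAT_BINARY:=$(which clang-format)} from the shell script.
    """
    env = os.environ if env is None else env
    # ':=' treats an empty value like an unset one; a truthiness test does too.
    override = env.get("CLANG_FORMAT_BINARY")
    if override:
        return override
    # shutil.which returns None when no executable is found on the given PATH.
    return shutil.which("clang-format", path=env.get("PATH"))


if __name__ == "__main__":
    print(resolve_clang_format())
```

Setting `CLANG_FORMAT_BINARY` is therefore one way to pin a specific clang-format version without changing `PATH`.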

35 changes: 35 additions & 0 deletions build-support/clang-format.sh
@@ -0,0 +1,35 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

##############################################################
# This script runs clang-format to check and fix
# C++ source files.
##############################################################

set -eo pipefail

ROOT=`dirname "$0"`
ROOT=`cd "$ROOT"; pwd`

export DORIS_HOME=`cd "${ROOT}/.."; pwd`

CLANG_FORMAT=${CLANG_FORMAT_BINARY:=$(which clang-format)}

python3 "${DORIS_HOME}/build-support/run_clang_format.py" --clang_format_binary="${CLANG_FORMAT}" --fix --source_dirs="${DORIS_HOME}/be/src,${DORIS_HOME}/be/test"
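
The two wrapper scripts differ only in the mode flags they pass to `run_clang_format.py`: `check-format.sh` adds `--quiet`, while `clang-format.sh` adds `--fix`. The sketch below assembles the equivalent argument list in Python. The function name `build_command` and the example paths are assumptions made for illustration; the flags themselves are the ones used in the scripts above.

```python
def build_command(doris_home, clang_format, fix=False, quiet=False):
    """Assemble a run_clang_format.py command line (illustrative only)."""
    # Both wrappers pass the same two source roots, comma-separated.
    source_dirs = ",".join([doris_home + "/be/src", doris_home + "/be/test"])
    cmd = [
        "python3",
        doris_home + "/build-support/run_clang_format.py",
        "--clang_format_binary=" + clang_format,
        "--source_dirs=" + source_dirs,
    ]
    if fix:
        cmd.append("--fix")    # as in clang-format.sh: rewrite files in place
    if quiet:
        cmd.append("--quiet")  # as in check-format.sh: only print errors
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, shown only to make the output concrete.
    print(" ".join(build_command("/path/to/doris", "clang-format", quiet=True)))
```

Passing the command as a list (for example to `subprocess.run`) avoids shell quoting problems when `doris_home` contains spaces.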


111 changes: 111 additions & 0 deletions build-support/lintutils.py
@@ -0,0 +1,111 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Modified from Apache Arrow project.

import multiprocessing as mp
import os
from fnmatch import fnmatch
from subprocess import Popen


def chunk(seq, n):
    """
    divide a sequence into equal sized chunks
    (the last chunk may be smaller, but won't be empty)
    """
    chunks = []
    some = []
    for element in seq:
        if len(some) == n:
            chunks.append(some)
            some = []
        some.append(element)
    if len(some) > 0:
        chunks.append(some)
    return chunks


def dechunk(chunks):
    "flatten chunks into a single list"
    seq = []
    for chunk in chunks:
        seq.extend(chunk)
    return seq


def run_parallel(cmds, **kwargs):
    """
    Run each of cmds (with shared **kwargs) using subprocess.Popen
    then wait for all of them to complete.
    Runs batches of multiprocessing.cpu_count() * 2 from cmds
    returns a list of tuples containing each process'
    returncode, stdout, stderr
    """
    complete = []
    for cmds_batch in chunk(cmds, mp.cpu_count() * 2):
        procs_batch = [Popen(cmd, **kwargs) for cmd in cmds_batch]
        for proc in procs_batch:
            stdout, stderr = proc.communicate()
            complete.append((proc.returncode, stdout, stderr))
    return complete


_source_extensions = '''
.h
.cc
.cpp
'''.split()


def get_sources(source_dir, exclude_globs=[]):
    sources = []
    for directory, subdirs, basenames in os.walk(source_dir):
        for path in [os.path.join(directory, basename)
                     for basename in basenames]:
            # filter out non-source files
            if os.path.splitext(path)[1] not in _source_extensions:
                continue

            path = os.path.abspath(path)

            # filter out files that match the globs in the globs file
            if any([fnmatch(path, glob) for glob in exclude_globs]):
                continue

            sources.append(path)
    return sources


def stdout_pathcolonline(completed_process, filenames):
    """
    given a completed process which may have reported some files as problematic
    by printing the path name followed by ':' then a line number, examine
    stdout and return the set of actually reported file names
    """
    returncode, stdout, stderr = completed_process
    bfilenames = set()
    for filename in filenames:
        bfilenames.add(filename.encode('utf-8') + b':')
    problem_files = set()
    for line in stdout.splitlines():
        for filename in bfilenames:
            if line.startswith(filename):
                problem_files.add(filename.decode('utf-8'))
                bfilenames.remove(filename)
                break
    return problem_files, stdout
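
One detail of `get_sources` is easy to miss: each path is converted with `os.path.abspath` before it is compared against the exclude globs, and `fnmatch` must match the whole string. A glob written relative to the repository root therefore never matches; it needs a leading `*`. The sketch below illustrates this with made-up paths and globs. `is_excluded` is a hypothetical helper that only restates the `any(fnmatch(...))` test from `get_sources`.

```python
from fnmatch import fnmatch


def is_excluded(path, exclude_globs):
    """Same test as in get_sources: True if any glob matches the full path."""
    return any(fnmatch(path, glob) for glob in exclude_globs)


# Hypothetical absolute path, as os.path.abspath would produce it.
path = "/home/dev/doris/be/src/gutil/hash.h"

# A root-relative glob does not match an absolute path ...
relative_globs = ["be/src/gutil/*"]
# ... while a glob with a leading '*' does ('*' also matches '/').
absolute_globs = ["*/be/src/gutil/*"]

if __name__ == "__main__":
    print(is_excluded(path, relative_globs))
    print(is_excluded(path, absolute_globs))
```

When writing a file for `--exclude_globs`, patterns of the second form are the ones that take effect.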
144 changes: 144 additions & 0 deletions build-support/run_clang_format.py
@@ -0,0 +1,144 @@
#!/usr/bin/env python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Modified from Apache Arrow project.

from __future__ import print_function
import lintutils
from subprocess import PIPE
import argparse
import difflib
import multiprocessing as mp
import sys
from functools import partial


# examine the output of clang-format and if changes are
# present assemble a (unified)patch of the difference
def _check_one_file(filename, formatted):
    with open(filename, "rb") as reader:
        original = reader.read()

    if formatted != original:
        # Run the equivalent of diff -u
        diff = list(difflib.unified_diff(
            original.decode('utf8').splitlines(True),
            formatted.decode('utf8').splitlines(True),
            fromfile=filename,
            tofile="{} (after clang format)".format(
                filename)))
    else:
        diff = None

    return filename, diff

def _check_dir(arguments, source_dir, exclude_globs):
    formatted_filenames = []
    for path in lintutils.get_sources(source_dir, exclude_globs):
        formatted_filenames.append(str(path))

    if arguments.fix:
        if not arguments.quiet:
            print("\n".join(map(lambda x: "Formatting {}".format(x),
                                formatted_filenames)))

        # Break clang-format invocations into chunks: each invocation formats
        # 16 files. Wait for all processes to complete
        results = lintutils.run_parallel([
            [arguments.clang_format_binary, "-style=file", "-i"] + some
            for some in lintutils.chunk(formatted_filenames, 16)
        ])
        for returncode, stdout, stderr in results:
            # if any clang-format reported a parse error, bubble it
            if returncode != 0:
                sys.exit(returncode)

    else:
        # run an instance of clang-format for each source file in parallel,
        # then wait for all processes to complete
        results = lintutils.run_parallel([
            [arguments.clang_format_binary, "-style=file", filename]
            for filename in formatted_filenames
        ], stdout=PIPE, stderr=PIPE)

        checker_args = []
        for filename, res in zip(formatted_filenames, results):
            # if any clang-format reported a parse error, bubble it
            returncode, stdout, stderr = res
            if returncode != 0:
                print(stderr)
                sys.exit(returncode)
            checker_args.append((filename, stdout))

        error = False
        pool = mp.Pool()
        try:
            # check the output from each invocation of clang-format in parallel
            for filename, diff in pool.starmap(_check_one_file, checker_args):
                if not arguments.quiet:
                    print("Checking {}".format(filename))
                if diff:
                    print("{} had clang-format style issues".format(filename))
                    # Print out the diff to stderr
                    error = True
                    # pad with a newline
                    print(file=sys.stderr)
                    sys.stderr.writelines(diff)
        except Exception:
            error = True
            raise
        finally:
            pool.terminate()
            pool.join()
        # Exit only on failure: an unconditional sys.exit() here would stop
        # the script before the remaining --source_dirs entries are checked.
        if error:
            sys.exit(1)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Runs clang-format on all of the source "
        "files. If --fix is specified enforce format by "
        "modifying in place, otherwise compare the output "
        "with the existing file and output any necessary "
        "changes as a patch in unified diff format")
    parser.add_argument("--clang_format_binary",
                        required=True,
                        help="Path to the clang-format binary")
    parser.add_argument("--exclude_globs",
                        help="Filename containing globs for files "
                        "that should be excluded from the checks")
    parser.add_argument("--source_dirs",
                        required=True,
                        help="Comma-separated root directories of the source code")
    parser.add_argument("--fix", default=False,
                        action="store_true",
                        help="If specified, will re-format the source "
                        "code instead of comparing the re-formatted "
                        "output, defaults to %(default)s")
    parser.add_argument("--quiet", default=False,
                        action="store_true",
                        help="If specified, only print errors")
    arguments = parser.parse_args()

    exclude_globs = []
    if arguments.exclude_globs:
        with open(arguments.exclude_globs) as f:
            exclude_globs.extend(line.strip() for line in f)

    for source_dir in arguments.source_dirs.split(','):
        if len(source_dir) > 0:
            _check_dir(arguments, source_dir, exclude_globs)
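
In check mode the script compares clang-format's output with the bytes on disk and, when they differ, reports a unified diff. The sketch below reproduces that comparison on in-memory data so it can be run without clang-format. The helper name `format_diff`, the file name `example.cpp`, and the "formatted" bytes are made up for illustration and are not real clang-format output.

```python
import difflib


def format_diff(filename, original, formatted):
    """Return unified-diff lines, or None when the bytes are identical.

    Follows the comparison in _check_one_file, but takes the original
    content as an argument instead of reading it from disk.
    """
    if formatted == original:
        return None
    return list(difflib.unified_diff(
        original.decode("utf8").splitlines(True),   # keep line endings
        formatted.decode("utf8").splitlines(True),
        fromfile=filename,
        tofile="{} (after clang format)".format(filename)))


# Hypothetical input and a hand-written stand-in for the formatter's output.
original = b"int  main( ){return 0;}\n"
formatted = b"int main() {\n    return 0;\n}\n"
diff = format_diff("example.cpp", original, formatted)

if __name__ == "__main__":
    print("".join(diff), end="")
```

A file that is already formatted yields `None`, which is why `_check_dir` only reports files whose `diff` is truthy.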
27 changes: 6 additions & 21 deletions docs/en/developer-guide/format-code.md
@@ -25,7 +25,7 @@ under the License.
-->

# Format Code
-To automatically format the code, clang-format is a good choice.
+Doris uses `clang-format` to automatically check the format of your source code.

## Code Style
Doris Code Style is based on Google's, with a few changes. The customized .clang-format
@@ -58,31 +58,16 @@ the version is lower than clang-format-9.0.
## Usage

### CMD
-`clang-format --style=file -i $File$`
+Change directory to the root directory of Doris sources and run the following command:
+`build-support/clang-format.sh`

-`-style=file` Clang-format will try to find the .clang-format file located in the closest parent directory of the input file. When the standard input is used, the search is started from the current directory.
-
-`--lines = m:n` Format a range of lines. Multiple ranges can be formatted by specifying several -lines arguments.
-
-`-i`input file
-
-Note: filter out the files which should not be formatted, when batch clang-formatting files.
-
-A example of how to filter \*.h/\*.cpp and exclude some dirs:
-
-Centos
-
-`find . -type f -not \( -wholename ./env/* \) -regextype posix-egrep -regex
-".*\.(cpp|h)" | xargs clang-format -i -style=file`
-
-Mac
-
-`find -E . -type f -not \( -wholename ./env/* \) -regex ".*\.(cpp|h)" | xargs clang-format -i --style=file`
+NOTE: Python3 is required to run the `clang-format.sh` script.

### Using clang-format in IDEs or Editors
#### CLion
If you use the 'ClangFormat' plugin in CLion, choose `Reformat Code` or press the keyboard
shortcut.

#### VS Code
VS Code requires the 'Clang-Format' extension; install it and specify the executable path of
clang-format in settings.
@@ -93,4 +78,4 @@ Open the vs code configuration page and search `clang_format`, fill the box as f
"clang_format_path": "$clang-format path$",
"clang_format_style": "file"
```
-Then, right click the file and choose `Format Document`.
+Then, right click the file and choose `Format Document`.
