Donate object_store code from object_store_rs to arrow-rs (#2081)
* Import influxdata/object_store_rs@3c51870

* Add object_store to workspace, update notes and readme

* Remove old github items

* Remove old gitignore

* Remove kodiak config

* Remove redundant license files

* Remove influx specific security policy

* Remove redundant rust-toolchain and rustfmt

* Add Apache License (RAT)

* ignore bubble_up_io_errors test

* Fix list_store with explicit lifetime, only run `test_list_root` on linux

* Only run object_store throttle tests on a mac
alamb committed Jul 22, 2022
1 parent add2649 commit 866f1a1
Showing 18 changed files with 6,262 additions and 0 deletions.
1 change: 1 addition & 0 deletions Cargo.toml
@@ -23,6 +23,7 @@ members = [
"parquet_derive_test",
"arrow-flight",
"integration-testing",
"object_store",
]
# Enable the version 2 feature resolver, which avoids unifying features for targets that are not being built
#
262 changes: 262 additions & 0 deletions object_store/.circleci/config.yml
@@ -0,0 +1,262 @@
---
# CI Overview
# -----------
#
# Each night:
#
# A build image is created (ci_image) from `docker/Dockerfile.ci` and is
# pushed to `quay.io/influxdb/rust:ci`. This build image is then used to run
# the CI tasks for the day.
#
# Every commit:
#
# The CI for every PR and merge to main runs tests, fmt, lints and compiles debug binaries
#
# On main if all these checks pass it will then additionally compile in "release" mode and
# publish a docker image to quay.io/influxdb/iox:$COMMIT_SHA
#
# Manual CI Image:
#
# It is possible to manually trigger a rebuild of the image used in CI. To do this, navigate to
# https://app.circleci.com/pipelines/github/influxdata/influxdb_iox?branch=main (overriding the
# branch name if desired). Then:
# - Click "Run Pipeline" in the top-right
# - Expand "Add Parameters"
# - Add a "boolean" parameter called "ci_image" with the value true
# - Click "Run Pipeline"
#
# If you refresh the page you should see a newly running ci_image workflow
#

version: 2.1

orbs:
  win: circleci/windows@4.1

commands:
  rust_components:
    description: Verify installed components
    steps:
      - run:
          name: Verify installed components
          command: |
            rustup --version
            rustup show
            cargo fmt --version
            cargo clippy --version
  cache_restore:
    description: Restore Cargo Cache
    steps:
      - restore_cache:
          name: Restoring Cargo Cache
          keys:
            - cargo-cache-{{ arch }}-{{ .Branch }}-{{ checksum "Cargo.lock" }}
            - cargo-cache-{{ arch }}-{{ .Branch }}
            - cargo-cache
  cache_save:
    description: Save Cargo Cache
    steps:
      - save_cache:
          name: Save Cargo Cache
          paths:
            - /usr/local/cargo/registry
          key: cargo-cache-{{ arch }}-{{ .Branch }}-{{ checksum "Cargo.lock" }}

jobs:
  fmt:
    docker:
      - image: quay.io/influxdb/rust:ci
    environment:
      # Disable incremental compilation to avoid overhead. We are not preserving these files anyway.
      CARGO_INCREMENTAL: "0"
      # Disable full debug symbol generation to speed up CI build
      # "1" means line tables only, which is useful for panic tracebacks.
      RUSTFLAGS: "-C debuginfo=1"
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
    steps:
      - checkout
      - rust_components
      - cache_restore
      - run:
          name: Rust fmt
          command: cargo fmt --all -- --check
      - cache_save
  lint:
    docker:
      - image: quay.io/influxdb/rust:ci
    environment:
      # Disable incremental compilation to avoid overhead. We are not preserving these files anyway.
      CARGO_INCREMENTAL: "0"
      # Disable full debug symbol generation to speed up CI build
      # "1" means line tables only, which is useful for panic tracebacks.
      RUSTFLAGS: "-C debuginfo=1"
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
    steps:
      - checkout
      - rust_components
      - cache_restore
      - run:
          name: Clippy
          command: cargo clippy --all-targets --all-features --workspace -- -D warnings
      - cache_save
  cargo_audit:
    docker:
      - image: quay.io/influxdb/rust:ci
    environment:
      # Disable incremental compilation to avoid overhead. We are not preserving these files anyway.
      CARGO_INCREMENTAL: "0"
      # Disable full debug symbol generation to speed up CI build
      # "1" means line tables only, which is useful for panic tracebacks.
      RUSTFLAGS: "-C debuginfo=1"
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
    steps:
      - checkout
      - rust_components
      - cache_restore
      - run:
          name: Install cargo-deny
          command: cargo install --force cargo-deny
      - run:
          name: cargo-deny Checks
          command: cargo deny check -s
      - cache_save
  check:
    docker:
      - image: quay.io/influxdb/rust:ci
    environment:
      # Disable incremental compilation to avoid overhead. We are not preserving these files anyway.
      CARGO_INCREMENTAL: "0"
      # Disable full debug symbol generation to speed up CI build
      # "1" means line tables only, which is useful for panic tracebacks.
      RUSTFLAGS: "-C debuginfo=1"
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
    steps:
      - checkout
      - rust_components
      - cache_restore
      - run:
          name: Install cargo-hack
          command: cargo install cargo-hack
      - run:
          name: Check all features
          command: cargo hack check --feature-powerset --no-dev-deps --workspace
      - cache_save
  doc:
    docker:
      - image: quay.io/influxdb/rust:ci
    environment:
      # Disable incremental compilation to avoid overhead. We are not preserving these files anyway.
      CARGO_INCREMENTAL: "0"
      # Disable full debug symbol generation to speed up CI build
      # "1" means line tables only, which is useful for panic tracebacks.
      RUSTFLAGS: "-C debuginfo=1"
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
    steps:
      - checkout
      - rust_components
      - cache_restore
      - run:
          name: Cargo doc
          # excluding datafusion because it's effectively a dependency masquerading as a workspace crate.
          command: cargo doc --document-private-items --no-deps --workspace --exclude datafusion
      - cache_save
      - run:
          name: Compress Docs
          command: tar -cvzf rustdoc.tar.gz target/doc/
      - store_artifacts:
          path: rustdoc.tar.gz
  test:
    # set up multiple docker images (see https://circleci.com/docs/2.0/configuration-reference/#docker)
    docker:
      - image: quay.io/influxdb/rust:ci
      - image: localstack/localstack:0.14.4
      - image: mcr.microsoft.com/azure-storage/azurite
      - image: fsouza/fake-gcs-server
        command:
          - "-scheme"
          - "http"
    resource_class: 2xlarge # a smaller executor tends to crash during linking
    environment:
      # Disable incremental compilation to avoid overhead. We are not preserving these files anyway.
      CARGO_INCREMENTAL: "0"
      # Disable full debug symbol generation to speed up CI build
      # "1" means line tables only, which is useful for panic tracebacks.
      RUSTFLAGS: "-C debuginfo=1"
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
      RUST_BACKTRACE: "1"
      # Run integration tests
      TEST_INTEGRATION: 1
      AWS_DEFAULT_REGION: "us-east-1"
      AWS_ACCESS_KEY_ID: test
      AWS_SECRET_ACCESS_KEY: test
      AWS_ENDPOINT: http://127.0.0.1:4566
      AZURE_USE_EMULATOR: "1"
      GOOGLE_SERVICE_ACCOUNT: "/tmp/gcs.json"
      OBJECT_STORE_BUCKET: test-bucket
    steps:
      - run:
          name: Setup localstack (AWS emulation)
          command: |
            cd /tmp
            curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
            unzip awscliv2.zip
            sudo ./aws/install
            aws --endpoint-url=http://localhost:4566 s3 mb s3://test-bucket
      - run:
          name: Setup Azurite (Azure emulation)
          # the magical connection string is from https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azurite?tabs=visual-studio#http-connection-strings
          command: |
            curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
            az storage container create -n test-bucket --connection-string 'DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;'
      - run:
          name: Setup fake GCS server
          command: |
            curl -X POST --data-binary '{"name":"test-bucket"}' -H "Content-Type: application/json" "http://localhost:4443/storage/v1/b"
            echo '{"gcs_base_url": "http://localhost:4443", "disable_oauth": true, "client_email": "", "private_key": ""}' > "$GOOGLE_SERVICE_ACCOUNT"
      - checkout
      - rust_components
      - cache_restore
      - run:
          name: Cargo test
          command: cargo test --workspace --features=aws,azure,azure_test,gcp
      - cache_save

  test_windows:
    executor:
      name: win/default
      size: medium
    environment:
      # https://github.com/rust-lang/cargo/issues/10280
      CARGO_NET_GIT_FETCH_WITH_CLI: "true"
    steps:
      - checkout
      - run:
          name: Download rustup
          command: wget https://win.rustup.rs/x86_64 -O rustup-init.exe
      - run:
          name: Install rustup
          command: .\rustup-init.exe -y --default-host=x86_64-pc-windows-msvc
      - run:
          name: Cargo test
          command: cargo test --workspace

workflows:
  version: 2

  # CI for all pull requests.
  ci:
    jobs:
      - check
      - fmt
      - lint
      - cargo_audit
      - test
      - test_windows
      - doc
94 changes: 94 additions & 0 deletions object_store/CONTRIBUTING.md
@@ -0,0 +1,94 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Development instructions

## Running Tests

Tests can be run using `cargo`:

```shell
cargo test
```

## Running Integration Tests

By default, integration tests are not run. To run them, set `TEST_INTEGRATION=1` and provide the
necessary configuration for the object store under test.

### AWS

To test the S3 integration against [localstack](https://localstack.cloud/), first start a container running localstack:

```
$ podman run --rm -it -p 4566:4566 -p 4510-4559:4510-4559 localstack/localstack
```

Set up the environment:

```
export TEST_INTEGRATION=1
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCESS_KEY_ID=test
export AWS_SECRET_ACCESS_KEY=test
export AWS_ENDPOINT=http://127.0.0.1:4566
export OBJECT_STORE_BUCKET=test-bucket
```
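
A missing variable is easy to overlook, so a small preflight check can help before running the tests. The following is a minimal sketch; the function name `check_integration_env` is hypothetical and not part of `object_store`. It only verifies that each variable listed above is non-empty.

```shell
# Hypothetical helper (not part of object_store): report any unset variable
# from the list above and return non-zero if at least one is missing.
check_integration_env() {
  missing=0
  for name in TEST_INTEGRATION AWS_DEFAULT_REGION AWS_ACCESS_KEY_ID \
              AWS_SECRET_ACCESS_KEY AWS_ENDPOINT OBJECT_STORE_BUCKET; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be chained in front of the test command, for example `check_integration_env && cargo test --features aws`.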

Create a bucket using the AWS CLI

```
podman run --net=host --env-host amazon/aws-cli --endpoint-url=http://localhost:4566 s3 mb s3://test-bucket
```

Run tests

```
$ cargo test --features aws
```

### Azure

To test the Azure integration
against [azurite](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azurite?tabs=visual-studio),
first start azurite:

```
$ podman run -p 10000:10000 -p 10001:10001 -p 10002:10002 mcr.microsoft.com/azure-storage/azurite
```

Create a bucket

```
$ podman run --net=host mcr.microsoft.com/azure-cli az storage container create -n test-bucket --connection-string 'DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;'
```

Run tests

```
$ cargo test --features azure
```
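
The Azure steps above do not list any environment variables. The `test` job in `.circleci/config.yml` from this commit sets the ones below when running against azurite, so a local run presumably needs them exported before `cargo test`. This is a sketch that mirrors the CI values, not a documented requirement.

```shell
# Assumption: values copied from the `test` job in .circleci/config.yml.
export TEST_INTEGRATION=1            # enable integration tests
export AZURE_USE_EMULATOR=1          # point the Azure store at the local emulator
export OBJECT_STORE_BUCKET=test-bucket
```

The CI job also enables the `azure_test` feature (`--features=aws,azure,azure_test,gcp`), so `cargo test --features azure,azure_test` may be needed for the emulator-backed tests to run.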

### GCP

We don't have a good story yet for testing the GCP integration locally. You will need to create a GCS bucket, a
service account that has access to it, and use this to run the tests.
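
As a possible alternative, the `test` job in `.circleci/config.yml` from this commit runs the GCP tests against the `fsouza/fake-gcs-server` emulator. The sketch below adapts those CI steps for local use. The helper names `start_fake_gcs` and `create_bucket` and the `-p 4443:4443` port mapping are assumptions, and this flow is not confirmed to work outside CI.

```shell
# Assumption: environment mirrors the CI `test` job; not a documented local workflow.
export TEST_INTEGRATION=1
export OBJECT_STORE_BUCKET=test-bucket
export GOOGLE_SERVICE_ACCOUNT="${TMPDIR:-/tmp}/gcs.json"

# Write the placeholder service account file that CI uses with the emulator.
echo '{"gcs_base_url": "http://localhost:4443", "disable_oauth": true, "client_email": "", "private_key": ""}' > "$GOOGLE_SERVICE_ACCOUNT"

# Hypothetical helper: start the emulator over plain HTTP, as CI does.
start_fake_gcs() {
  podman run --rm -d -p 4443:4443 fsouza/fake-gcs-server -scheme http
}

# Hypothetical helper: create the test bucket through the emulator's JSON API.
create_bucket() {
  curl -X POST --data-binary '{"name":"test-bucket"}' \
    -H "Content-Type: application/json" "http://localhost:4443/storage/v1/b"
}
```

With the emulator image available, the tests might then be run as `start_fake_gcs && create_bucket && cargo test --features gcp`.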
