Commit

Merge branch 'gh-2630-example-real-federatedstore' into gh-2632-federatedstore-delete-accumulo-table

# Conflicts:
#	example/federated-demo/src/main/resources/operationDeclarations.json
#	store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/FederatedGraphStorage.java
#	store-implementation/federated-store/src/main/java/uk/gov/gchq/gaffer/federatedstore/util/FederatedStoreUtil.java
GCHQDev404 committed Oct 23, 2022
2 parents dde43d7 + 062bcd2 commit f45d28b
Showing 913 changed files with 16,565 additions and 45,718 deletions.
131 changes: 77 additions & 54 deletions .github/workflows/continuous-integration.yaml
@@ -14,77 +14,100 @@ jobs:
check-all-modules-are-tested:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Check all modules are tested
run: ./cd/check_modules.sh
- uses: actions/checkout@v3
- name: Check all modules are tested
run: ./cd/check_modules.sh

build-javadoc:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3

- name: Setup JDK
uses: actions/setup-java@v2
with:
distribution: 'zulu'
java-version: '8'
- name: Setup JDK
uses: actions/setup-java@v2
with:
distribution: 'zulu'
java-version: '8'

- name: Build Javadoc
run: |
mvn -q clean install -Pquick -Dskip.jar-with-dependencies=true -Dshaded.jar.phase=true
mvn -q javadoc:javadoc -Pquick
- name: Build Javadoc
run: |
mvn -ntp clean install -Pquick -Dskip.jar-with-dependencies=true -Dshaded.jar.phase=true
mvn -ntp javadoc:javadoc -Pquick
build:
runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true
matrix:
# The below line does the following, check if the branch name contains the words 'windows' or 'release'.
# If these words are present it runs the tests on Windows AND Ubuntu. If the words aren't present it runs the tests on Ubuntu ONLY
# The 'release' keyword is currently disabled. Change 'release-disabled-for-now' to 'release' once the tests have been fixed to reenable this.
os: ${{ fromJSON( (contains( github.head_ref, 'windows') || contains( github.head_ref, 'release-disabled-for-now')) && '["ubuntu-latest", "windows-latest"]' || '["ubuntu-latest"]' ) }}
modules:
- name: Core
values: :gaffer2,:core,:access,:cache,:common-util,:data,:exception,:graph,:operation,:serialisation,:store,:type
- name: Accumulo
values: :accumulo-store,:accumulo-rest
- name: Hbase
values: :hbase-store,:hbase-rest
- name: Parquet
values: :parquet-store,:parquet-rest
- name: Federated-And-Map
values: :integration-test,:federated-store,:map-store,:map-rest
- name: REST
values: :rest-api,:common-rest,:spring-rest,:core-rest,:store-implementation,:proxy-store
- name: Examples
values: :example,:basic,:basic-model,:basic-rest,:road-traffic,:road-traffic-model,:road-traffic-generators,:road-traffic-rest,:road-traffic-demo,:federated-demo
- name: Big-Data-Libraries
values: :flink-library,:hdfs-library,:spark,:spark-library,:spark-accumulo-library
- name: Time-Library
values: :library,:time-library
- name: Caches
values: :cache-library,:sketches-library,:bitmap-library,:hazelcast-cache-service,:jcs-cache-service
- name: Core
values: :gaffer2,:core,:access,:cache,:common-util,:data,:exception,:graph,:operation,:serialisation,:store,:type
- name: Accumulo
values: :accumulo-store,:accumulo-rest
- name: Federated-And-Map
values: :integration-test,:federated-store,:map-store,:map-rest
- name: REST
values: :rest-api,:common-rest,:spring-rest,:core-rest,:store-implementation,:proxy-store
- name: Examples
values: :example,:basic,:basic-model,:basic-rest,:road-traffic,:road-traffic-model,:road-traffic-generators,:road-traffic-rest,:road-traffic-demo,:federated-demo
- name: Big-Data-Libraries
values: :flink-library,:hdfs-library,:spark,:spark-library,:spark-accumulo-library
- name: Time-Library
values: :library,:time-library
- name: Caches
values: :cache-library,:sketches-library,:bitmap-library,:hazelcast-cache-service,:jcs-cache-service
exclude:
- os: windows-latest
modules:
name: Accumulo
- os: windows-latest
modules:
name: Federated-And-Map
- os: windows-latest
modules:
name: Examples
- os: windows-latest
modules:
name: Big-Data-Libraries
- os: windows-latest
modules:
name: Rest

env:
MAVEN_OPTS: -Dmaven.wagon.http.retryHandler.count=3 -Dmaven.wagon.httpconnectionManager.ttlSeconds=25
MAVEN_OPTS: -Dmaven.wagon.http.retryHandler.count=3 -Dmaven.wagon.httpconnectionManager.ttlSeconds=25
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
fetch-depth: 0

- name: Setup JDK
uses: actions/setup-java@v2
with:
distribution: 'zulu'
java-version: '8'

- name: Setup JDK
uses: actions/setup-java@v2
with:
distribution: 'zulu'
java-version: '8'
- name: Cache dependencies
uses: actions/cache@v2
with:
path: |
~/.m2/repository
!~/.m2/repository/uk
key: ${{matrix.modules.name}}-gaffer-dependencies

- name: Cache dependencies
uses: actions/cache@v2
with:
path: |
~/.m2/repository
!~/.m2/repository/uk
key: ${{matrix.modules.name}}-gaffer-dependencies
- name: Install
run: mvn -B -ntp clean install -P quick -pl ${{matrix.modules.values}} -am

- name: Install
run: mvn -B -q clean install -P quick -pl ${{matrix.modules.values}} -am
- name: Test
run: mvn -B -ntp verify -P coverage -pl ${{matrix.modules.values}}

- name: Test
run: mvn -B -q verify -P coverage -pl ${{matrix.modules.values}}
- name: Check Copyright Headers
if: github.event_name == 'pull_request' && matrix.os == 'ubuntu-latest'
run: mvn -B -ntp spotless:check -pl ${{matrix.modules.values}}

- name: Upload Coverage
uses: codecov/codecov-action@v2
- name: Upload Coverage
if: matrix.os == 'ubuntu-latest'
uses: codecov/codecov-action@v2
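
Note on the new `os:` matrix expression: it relies on the `cond && A || B` idiom to choose between two JSON array literals, which `fromJSON` then parses. The Python sketch below mirrors that selection logic so the behaviour can be checked locally. The function name `select_os` is hypothetical and not part of the workflow; the sketch assumes `contains()` is case-insensitive, as in GitHub Actions expressions, and that `github.head_ref` is an empty string for events other than pull requests.

```python
import json
from typing import List

# JSON literals copied from the workflow's `os:` expression.
BOTH = '["ubuntu-latest", "windows-latest"]'
UBUNTU_ONLY = '["ubuntu-latest"]'


def select_os(head_ref: str) -> List[str]:
    """Return the runner list the `os:` expression would produce (sketch)."""
    # Lower-casing mimics the case-insensitive contains() of Actions expressions.
    ref = head_ref.lower()
    wants_windows = "windows" in ref or "release-disabled-for-now" in ref
    # `cond and A or B` is the Python analogue of `cond && A || B`;
    # it is safe here because A is a non-empty (truthy) string.
    return json.loads(wants_windows and BOTH or UBUNTU_ONLY)


if __name__ == "__main__":
    for ref in ["gh-1234-windows-fix", "release-2.0.0", ""]:
        print(repr(ref), "->", select_os(ref))
```

Under these assumptions a branch named `release-2.0.0` still builds on Ubuntu only, because the keyword in the expression is `release-disabled-for-now` rather than `release`, as the workflow comment describes.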
37 changes: 21 additions & 16 deletions NOTICES
@@ -21,12 +21,12 @@ and their licenses, below. For information on the dependencies of these dependen
projects below.


Koryphe (uk.gov.gchq.koryphe:koryphe:1.14.0):
Koryphe (uk.gov.gchq.koryphe:koryphe:2.4.0):

- Apache License, Version 2.0


Apache Hadoop (org.apache.hadoop:hadoop-common:2.6.3):
Apache Hadoop (org.apache.hadoop:hadoop-common:2.6.5, org.apache.hadoop:hadoop-mapreduce-client-core:2.6.5):

- Apache License, Version 2.0

@@ -36,12 +36,12 @@ Apache Accumulo (org.apache.accumulo:accumulo-core:1.8.1):
- Apache License, Version 2.0


Apache Avro (org.apache.avro:avro:1.7.7, org.apache.avro:avro-mapred:1.7.7):
Apache Avro (org.apache.avro:avro:1.8.2, org.apache.avro:avro-mapred:1.8.2):

- Apache License, Version 2.0


Apache Commons Lang (org.apache.commons:commons-lang3:3.3.2):
Apache Commons Lang (org.apache.commons:commons-lang3:3.12.0):

- Apache License, Version 2.0

@@ -51,7 +51,7 @@ Apache Commons JCS (org.apache.commons:commons-jcs-core:2.1):
- Apache Licence, Version 2.0


Apache Commons CSV (org.apache.commons:commons-csv:1.4):
Apache Commons CSV (org.apache.commons:commons-csv:1.9.0):

- Apache Licence, Version 2.0

@@ -61,7 +61,7 @@ Apache Commons Codec (commons-codec:commons-codec:1.6):
- Apache Licence, Version 2.0


Apache Spark (org.apache.spark:spark-core:2.3.2, org.apache.spark:spark-sql:2.3.2, org.apache.spark:spark-catalyst:2.3.2, org.apache.spark:spark-graphx:2.3.2)
Apache Spark (org.apache.spark:spark-core:2.4.5, org.apache.spark:spark-sql:2.4.5, org.apache.spark:spark-catalyst:2.4.5, org.apache.spark:spark-graphx:2.4.5)

- Apache License, Version 2.0

@@ -76,7 +76,7 @@ Datasketches (com.yahoo.sketches:sketches-core:0.12.0):
- Apache License, Version 2.0


Commons IO (commons-io:commons-io:2.4)
Commons IO (commons-io:commons-io:2.11.0)

- Apache License, Version 2.0

@@ -88,17 +88,22 @@ org.json4s:json4s-ast_2.10:3.2.11):
- Apache License, Version 2.0


Swagger (io.swagger:-jersey2-jaxrs:1.5.15):
Swagger (io.swagger:swagger-annotations:1.6.4,
io.swagger:swagger-jaxrs:1.6.4):

- Apache License, Version 2.0

Springfox (io.springfox:springfox-swagger2:3.0.0):

- Apache License, Version 2.0

FasterXML Jackson (com.fasterxml.jackson.core:jackson-annotations:2.6.5,
com.fasterxml.jackson.core:jackson-core:2.6.5, com.fasterxml.jackson.core:jackson-databind:2.6.5,
com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:2.6.5,
com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:2.6.5,
com.fasterxml.jackson.datatype:jackson-datatype-jsr310:2.6.5,
com.fasterxml.jackson.datatype:jackson-datatype-json-org:2.6.5):
com.fasterxml.jackson.datatype:jackson-datatype-json-org:2.6.5,
com.fasterxml.jackson.module:jackson-module-scala_2.11:2.6.5):

- Apache License, Version 2.0

@@ -119,7 +124,7 @@ Javax Web Api (javax:javaee-web-api:7.0)
- Common Development and Distribution License 1.0


Jersey Server (org.glassfish.jersey.core:jersey-server:2.25)
Jersey Server (org.glassfish.jersey.core:jersey-server:2.25.1)

- Common Development and Distribution License 1.0
- GNU General Public License 2.0
@@ -140,7 +145,7 @@ Mockito (org.mockito:mockito-all:1.9.5):
- MIT License


Netty (io.netty:netty:jar:3.6.2.Final)
Netty (io.netty:netty:3.6.2.Final)

- Apache License, Version 2.0

@@ -150,7 +155,7 @@ MockServer (org.mock-server:mockserver-netty:3.9.16)
-Apache License, Version 2.0


Reflections (org.reflections:reflections:RC1:0.9.9)
Reflections (org.reflections:reflections:0.9.12)

- WTFPL

@@ -160,7 +165,7 @@ Scala (org.scala-lang:scala-library:2.11.12):
- BSD 3 Clause Licence ("New" or "Revised")


SLF4J (org.slf4j:slf4j-api:1.7.25):
SLF4J (org.slf4j:slf4j-api:1.7.36):

- MIT License

@@ -180,7 +185,7 @@ FindBugs Annotations (com.google.code.findbugs:annotations:3.0.2):
- GNU Lesser Public License


Hazelcast (com.hazelcast:hazelcast:3.8)
Hazelcast (com.hazelcast:hazelcast:5.1)

- Apache Licence, Version 2.0

@@ -205,12 +210,12 @@ Apache Kafka (org.apache.kafka:kafka_2.11:0.10.0.0,org.apache.kafka:kafka-client
- Apache License, Version 2.0


Apache Flink (org.apache.flink:flink-java:1.4.1,org.apache.flink:flink-clients_2.11:1.4.1,org.apache.flink:flink-connector-kafka_2.11:1.4.1)
Apache Flink (org.apache.flink:flink-java:1.7.2,org.apache.flink:flink-clients_2.11:1.7.2,org.apache.flink:flink-connector-kafka-0.10_2.11:1.7.2, org.apache.flink:flink-streaming-java_2.11:1.7.2)

- Apache License, Version 2.0


Graphframes (graphframes:graphframes:0.4.0-spark2.1-s_2.11)
Graphframes (graphframes:graphframes:0.8.1-spark2.4-s_2.11)

- Apache License, Version 2.0

47 changes: 24 additions & 23 deletions README.md
@@ -1,4 +1,4 @@
Copyright 2016-2020 Crown Copyright
Copyright 2016-2022 Crown Copyright

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -17,27 +17,31 @@ limitations under the License.
Gaffer
======

Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo, Hbase and Parquet.
Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo and an in-memory Java Map Store.

It is designed to be as flexible, scalable and extensible as possible, allowing for rapid prototyping and transition to production systems.

Gaffer offers:

- Rapid query across very large numbers of nodes and edges;
- Continual ingest of data at very high data rates, and batch bulk ingest of data via MapReduce or Spark;
- Storage of arbitrary Java objects on the nodes and edges;
- Automatic, user-configurable in-database aggregation of rich statistical properties (e.g. counts, histograms, sketches) on the nodes and edges;
- Versatile query-time summarisation, filtering and transformation of data;
- Fine grained data access controls;
- Hooks to apply policy and compliance rules to queries;
- Automated, rule-based removal of data (typically used to age-off old data);
- Retrieval of graph data into Apache Spark for fast and flexible analysis;
- A fully-featured REST API.
- Rapid query across very large numbers of nodes and edges
- Continual ingest of data at very high data rates, and batch bulk ingest of data via MapReduce or Spark
- Storage of arbitrary Java objects on the nodes and edges
- Automatic, user-configurable in-database aggregation of rich statistical properties (e.g. counts, histograms, sketches) on the nodes and edges
- Versatile query-time summarisation, filtering and transformation of data
- Fine grained data access controls
- Hooks to apply policy and compliance rules to queries
- Automated, rule-based removal of data (typically used to age-off old data)
- Retrieval of graph data into Apache Spark for fast and flexible analysis
- A fully-featured REST API

To get going with Gaffer, visit our [getting started pages](https://gchq.github.io/gaffer-doc/v1docs/summaries/getting-started.html).

Gaffer is under active development. Version 1.0 of Gaffer was released in October 2017.

Gaffer 2.0 Alpha
---------------
Gaffer 2.0 is currently in the alpha phase, see the [v2docs](https://gchq.github.io/gaffer-doc/latest/) for a list of changes under development.

License
-------

@@ -88,20 +92,17 @@ Related repositories

The [gaffer-tools](https://github.com/gchq/gaffer-tools) repository contains useful tools to help work with Gaffer. These include:

- `jar-shader` - Used to shade the version of Jackson to avoid incompatibility problems on CDH clusters;
- `mini-accumulo-cluster` - Allows a mini Accumulo cluster to be spun up for testing purposes;
- `performance-testing` - Methods of testing the performance of ingest and query operations against a graph;
- `python-shell` - Allows operations against a graph to be executed from a Python shell;
- `random-element-generation` - Code to generate large volumes of random graph data;
- `schema-builder` - A (beta) visual tool for writing schemas for a graph;
- `slider` - Code to deploy a Gaffer cluster to a YARN cluster using [Apache Slider](https://slider.incubator.apache.org/), including the ability to easily run Slider on an [AWS EMR cluster](https://aws.amazon.com/emr/);
- `ui` - A basic graph visualisation tool.
- `mini-accumulo-cluster` - Allows a mini Accumulo cluster to be spun up for testing purposes
- `performance-testing` - Methods of testing the performance of ingest and query operations against a graph
- `python-shell` - Allows operations against a graph to be executed from a Python shell
- `random-element-generation` - Code to generate large volumes of random graph data
- `ui` - A basic graph visualisation tool

Contributing
------------

We welcome contributions to the project. Detailed information on our ways of working can be found [here](https://gchq.github.io/gaffer-doc/v1docs/other/ways-of-working.html). In brief:

- Sign the [GCHQ Contributor Licence Agreement](https://cla-assistant.io/gchq/Gaffer);
- Push your changes to a fork;
- Submit a pull request.
- Sign the [GCHQ Contributor Licence Agreement](https://cla-assistant.io/gchq/Gaffer)
- Push your changes to a fork
- Submit a pull request