An example of a hyppo integration that generates its own data internally, which makes it particularly useful for testing.
The following classes retrieve, process, and persist the data. They can serve as a starting point for a "real" integration of an ingestion source.
- DemoTaskCreator -- splits a job up into multiple tasks.
- DemoDataFetcher -- should collect data from an API or some other source. The demo integration simply creates a list of numbers in the value field.
- DemoDataProcessor -- should turn data from the fetcher into processed Avro records.
- DemoDataPersister -- should upload the data (but the demo integration just logs the values).
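The four classes above form a simple fetch-process-persist pipeline. As a rough sketch of the data flow only (the real classes implement hyppo's executor interfaces, which are not shown here; DemoRecord and the function names are illustrative):

```scala
// Illustrative sketch of the demo pipeline's data flow. The real classes
// implement hyppo's task/fetcher/processor/persister interfaces; here each
// stage is modeled as a plain function. `DemoRecord` is a hypothetical stand-in
// for the Avro record type.
final case class DemoRecord(value: Int)

// DemoDataFetcher: "collects" data by generating a list of numbers.
def fetch(first: Int, last: Int): Seq[Int] = (first to last).toList

// DemoDataProcessor: turns raw values into records.
def process(values: Seq[Int]): Seq[DemoRecord] = values.map(DemoRecord(_))

// DemoDataPersister: the demo just logs the values instead of uploading.
def persist(records: Seq[DemoRecord]): Unit =
  records.foreach(r => println(s"persisting value=${r.value}"))

persist(process(fetch(1, 5)))
```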
Also needed:
- Avro schema
1. Create the artifact (in target/scala-2.<version>/<integration name>-<version>.jar):

       sbt clean compile assembly

2. Upload the jar file to S3.
3. Add an ingestion source along with this integration in the integration manager.

   3.1. For the source, use a config like this one:
        {
          "firstValue" : 1,
          "lastValue" : 20,
          "chunkSize" : 10,
          "jdbcUrl" : "DSN"
        }
   3.2. For the integration, enter com.harrys.hyppo.demo.DemoIntegration as the class name and add the S3 bucket and object key information from the upload step.
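With the config above, the demo splits the value range 1..20 into tasks of chunkSize values each. A minimal sketch of that chunking, assuming the splitting is a straightforward grouping of the range (SourceConfig and chunkRange are hypothetical names, not the actual hyppo API):

```scala
// Hypothetical sketch of how chunkSize could split the configured value
// range into tasks; names here are illustrative only.
final case class SourceConfig(firstValue: Int, lastValue: Int, chunkSize: Int)

def chunkRange(config: SourceConfig): Seq[Seq[Int]] =
  (config.firstValue to config.lastValue)
    .grouped(config.chunkSize)   // split the range into chunkSize-sized pieces
    .map(_.toList)
    .toSeq

// With the config above: two tasks, covering values 1-10 and 11-20.
val tasks = chunkRange(SourceConfig(firstValue = 1, lastValue = 20, chunkSize = 10))
```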
4. Now submit a job to test it.
5. Test overriding the value sequence by specifying a single value as the job parameter:

       { "value": 1 }
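The override in step 5 can be thought of as a fallback: if the job parameters supply a "value", only that value is used; otherwise the configured range applies. A hedged sketch of that logic (the real parameter handling lives in the integration's task creation, and these names are illustrative):

```scala
// Hypothetical sketch of a job-parameter override: a single "value" from the
// job parameters replaces the configured firstValue..lastValue range.
final case class Config(firstValue: Int, lastValue: Int)

def valuesFor(config: Config, jobValueOverride: Option[Int]): Seq[Int] =
  jobValueOverride match {
    case Some(v) => Seq(v)   // job parameter { "value": 1 } wins
    case None    => (config.firstValue to config.lastValue).toList
  }

valuesFor(Config(1, 20), jobValueOverride = Some(1))   // Seq(1)
```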
Hint: Killing the executor process will dump the logging to STDOUT.