Infinispan Hadoop

Integrations with Apache Hadoop and related frameworks.

Compatibility

Version	Infinispan	Hadoop	Java
0.1	8.0.x	2.x	8
0.2	8.2.x	2.x	8
0.3	9.4.x	2.x, 3.x	8
0.4	9.4.x	2.x, 3.x	8

InfinispanInputFormat and InfinispanOutputFormat

Implementations of the Hadoop InputFormat and OutputFormat interfaces that read data from and write data to Infinispan Server with the best available data locality. Partitions are generated based on segment ownership, which allows the data in a cache to be processed in parallel using multiple splits.
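To illustrate the idea of generating partitions from segment ownership, here is a minimal sketch in plain Java. It is not the library's actual implementation: the `SegmentSplits` class, the `Split` type, and the `splitsByOwner` method are hypothetical names, and the sketch assumes each segment has a single known owner, so that grouping segments by owner yields one split per server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these types are not part of infinispan-hadoop.
public class SegmentSplits {

    /** A split: one server plus the segments it is assumed to own. */
    public static final class Split {
        public final String server;
        public final List<Integer> segments;

        Split(String server, List<Integer> segments) {
            this.server = server;
            this.segments = segments;
        }
    }

    /**
     * Groups segments by owning server, producing one split per server.
     * segmentOwners.get(i) is the assumed owner of segment i.
     */
    public static List<Split> splitsByOwner(List<String> segmentOwners) {
        // LinkedHashMap keeps servers in the order they are first seen.
        Map<String, List<Integer>> byOwner = new LinkedHashMap<>();
        for (int segment = 0; segment < segmentOwners.size(); segment++) {
            byOwner.computeIfAbsent(segmentOwners.get(segment), k -> new ArrayList<>())
                   .add(segment);
        }
        List<Split> splits = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : byOwner.entrySet()) {
            splits.add(new Split(entry.getKey(), entry.getValue()));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four segments alternating between two servers.
        List<String> owners = Arrays.asList(
                "172.17.0.2:11222", "172.17.0.3:11222",
                "172.17.0.2:11222", "172.17.0.3:11222");
        for (Split split : splitsByOwner(owners)) {
            System.out.println(split.server + " -> " + split.segments);
        }
    }
}
```

Each resulting split can then be processed by a task scheduled close to the server that owns its segments, which is where the data locality comes from.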

Maven Coordinates

 <dependency>  
    <groupId>org.infinispan.hadoop</groupId>  
    <artifactId>infinispan-hadoop-core</artifactId>  
    <version>0.4</version>
 </dependency>

Sample usage in a Hadoop YARN MapReduce application:

import org.infinispan.hadoop.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration configuration = new Configuration();
String hosts = "172.17.0.2:11222;172.17.0.3:11222";

// Configure the server lists and cache names for the input and output caches
configuration.set(InfinispanConfiguration.INPUT_REMOTE_CACHE_SERVER_LIST, hosts);
configuration.set(InfinispanConfiguration.OUTPUT_REMOTE_CACHE_SERVER_LIST, hosts);

configuration.set(InfinispanConfiguration.INPUT_REMOTE_CACHE_NAME, "map-reduce-in");
configuration.set(InfinispanConfiguration.OUTPUT_REMOTE_CACHE_NAME, "map-reduce-out");

Job job = Job.getInstance(configuration, "Infinispan job");

// MapClass and ReduceClass are the application's own Mapper and Reducer implementations
job.setMapperClass(MapClass.class);
job.setReducerClass(ReduceClass.class);

job.setInputFormatClass(InfinispanInputFormat.class);
job.setOutputFormatClass(InfinispanOutputFormat.class);

Supported Configurations:

Name	Description	Default
hadoop.ispn.input.filter.factory	The name of the filter factory deployed on the server to pre-filter data before reading	null (no filtering)
hadoop.ispn.input.cache.name	The name of the cache from which data is read	"default"
hadoop.ispn.input.read.batch	Batch size when reading from the cache	5000
hadoop.ispn.output.write.batch	Batch size when writing to the cache	500
hadoop.ispn.input.remote.cache.servers	List of servers of the input cache, in the format `host1:port1;host2:port2`	localhost:11222
hadoop.ispn.output.cache.name	The name of the cache to which job results are written	"default"
hadoop.ispn.output.remote.cache.servers	List of servers of the output cache, in the format `host1:port1;host2:port2`
hadoop.ispn.input.converter	Class name of an implementation of `org.infinispan.hadoop.KeyValueConverter`, applied after reading from the cache	null (no conversion)
hadoop.ispn.output.converter	Class name of an implementation of `org.infinispan.hadoop.KeyValueConverter`, applied before writing to the cache	null (no conversion)
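The `host1:port1;host2:port2` server list format used by the two `remote.cache.servers` properties can be illustrated with a small parser. This is a sketch in plain Java, not code from infinispan-hadoop: `ServerListParser` and its `parse` method are hypothetical, and rejecting entries without a port is an assumption of this example rather than documented library behavior.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of infinispan-hadoop.
public class ServerListParser {

    /** Parses "host1:port1;host2:port2" into unresolved socket addresses. */
    public static List<InetSocketAddress> parse(String serverList) {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (String entry : serverList.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled separator
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                // Assumption of this sketch: every entry must be host:port.
                throw new IllegalArgumentException("Expected host:port but got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing.
            servers.add(InetSocketAddress.createUnresolved(host, port));
        }
        return servers;
    }

    public static void main(String[] args) {
        for (InetSocketAddress server : parse("172.17.0.2:11222;172.17.0.3:11222")) {
            System.out.println(server.getHostString() + " on port " + server.getPort());
        }
    }
}
```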

Demos

Refer to https://github.com/infinispan/infinispan-hadoop/tree/master/samples/

Releasing

The $MAVEN_HOME/conf/settings.xml file must contain credentials for the snapshot and release repositories. Add the following entries to the <servers> section:

<server>
   <id>jboss-snapshots-repository</id>
   <username>RELEASE_USER</username>
   <password>RELEASE_PASS</password>
</server>
<server>
   <id>jboss-releases-repository</id>
   <username>RELEASE_USER</username>
   <password>RELEASE_PASS</password>
</server>

To release:

mvn release:prepare release:perform -B
