BenjaminSchiller/GiraphWrapper

GiraphWrapper

Resources

directory structure of results/

The measured runtimes for analyzing a given setup with Giraph are written to the following directory structure:

results/
	<dataset>/
		<batches>/
			<partitioning>/
				<metric>/
					<workers>/
						<run>.dat
						aggr

The results from multiple runs (= repetitions) of the same setup are aggregated in aggr.
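As a rough illustration of the layout above, the following Python sketch builds the path of a single run file and of the aggr file for one setup. The function names and the example values are hypothetical and not part of GiraphWrapper; only the order of the directory levels is taken from the structure shown above.

```python
from pathlib import PurePosixPath


def setup_dir(dataset, batches, partitioning, metric, workers, root="results"):
    """Directory that holds all runs of one setup (hypothetical helper)."""
    # Order of the levels follows the structure documented above:
    # results/<dataset>/<batches>/<partitioning>/<metric>/<workers>/
    return PurePosixPath(
        root, str(dataset), str(batches), str(partitioning), str(metric), str(workers)
    )


def run_file(setup, run):
    """File with the measured runtimes of a single run (repetition)."""
    return setup / f"{run}.dat"


def aggr_file(setup):
    """File in which the runs of the same setup are aggregated."""
    return setup / "aggr"


# Example values are placeholders, not names used by the project.
setup = setup_dir("my_dataset", "my_batches", "my_partitioning", "MY_METRIC", 4)
print(run_file(setup, 0))
print(aggr_file(setup))
```

The sketch only derives paths; how the contents of aggr are computed from the individual runs is not specified here.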

metrics from Giraph

The metrics taken from the Giraph examples are listed here. Each metric is specified as:

  • KEY: name/link

sources

we use:

tested but not working

others:

  • random walk
  • page rank

metrics from Okapi

The metrics taken from Okapi are listed here. W marks algorithms that work on weighted graphs, D marks directed graphs, and U marks undirected graphs. U & D means that the algorithm can be applied to both directed and undirected graphs; if only one of the two is given, the other is assumed to be unsupported. W alone means that weighted graphs are supported, but nothing is stated about directed or undirected graphs.

Each algorithm is specified as:

  • KEY: name/link (graph type) (optional description)

sources

we use:

tested but not working

  • OKAPI_MSSP_FRACTION_${fraction of vertices to use as source} (always fails)

others:

Arguments for the use of GiraphRunner

complete help output (GiraphRunner -h)

usage: org.apache.giraph.utils.ConfigurationUtils [-aw <arg>] [-c <arg>]
       [-ca <arg>] [-cf <arg>] [-eif <arg>] [-eip <arg>] [-eof <arg>]
       [-esd <arg>] [-h] [-jyc <arg>] [-la] [-mc <arg>] [-op <arg>] [-pc
       <arg>] [-q] [-th <arg>] [-ve <arg>] [-vif <arg>] [-vip <arg>] [-vof
       <arg>] [-vsd <arg>] [-vvf <arg>] [-w <arg>] [-wc <arg>] [-yh <arg>]
       [-yj <arg>]
 -aw,--aggregatorWriter <arg>           AggregatorWriter class
 -c,--combiner <arg>                    MessageCombiner class
 -ca,--customArguments <arg>            provide custom arguments for the
                                        job configuration in the form: -ca
                                        <param1>=<value1>,<param2>=<value2
                                        > -ca <param3>=<value3> etc. It
                                        can appear multiple times, and the
                                        last one has effect for the same
                                        param.
 -cf,--cacheFile <arg>                  Files for distributed cache
 -eif,--edgeInputFormat <arg>           Edge input format
 -eip,--edgeInputPath <arg>             Edge input path
 -eof,--edgeOutputFormat <arg>          Edge output format
 -esd,--edgeSubDir <arg>                subdirectory to be used for the
                                        edge output
 -h,--help                              Help
 -jyc,--jythonClass <arg>               Jython class name, used if
                                        computation passed in is a python
                                        script
 -la,--listAlgorithms                   List supported algorithms
 -mc,--masterCompute <arg>              MasterCompute class
 -op,--outputPath <arg>                 Output path
 -pc,--partitionClass <arg>             Partition class
 -q,--quiet                             Quiet output
 -th,--typesHolder <arg>                Class that holds types. Needed
                                        only if Computation is not set
 -ve,--outEdges <arg>                   Vertex edges class
 -vif,--vertexInputFormat <arg>         Vertex input format
 -vip,--vertexInputPath <arg>           Vertex input path
 -vof,--vertexOutputFormat <arg>        Vertex output format
 -vsd,--vertexSubDir <arg>              subdirectory to be used for the
                                        vertex output
 -vvf,--vertexValueFactoryClass <arg>   Vertex value factory class
 -w,--workers <arg>                     Number of workers
 -wc,--workerContext <arg>              WorkerContext class
 -yh,--yarnheap <arg>                   Heap size, in MB, for each Giraph
                                        task (YARN only.) Defaults to
                                        giraph.yarn.task.heap.mb => 1024
                                        (integer)
                                        MB.
 -yj,--yarnjars <arg>                   comma-separated list of JAR
                                        filenames to distribute to Giraph
                                        tasks and ApplicationMaster. YARN
                                        only. Search order: CLASSPATH,
                                        HADOOP_HOME, user current dir.
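
The help text for -ca states that the option can appear multiple times and that the last occurrence wins for the same parameter. The following Python sketch models that behavior as described; it is an illustration of the documented semantics, not Giraph's actual parsing code, and the function name and parameter names are hypothetical.

```python
def merge_custom_arguments(ca_values):
    """Merge the values of repeated -ca options into one dict.

    Each value has the form "<param1>=<value1>,<param2>=<value2>".
    Later occurrences overwrite earlier ones for the same parameter,
    as described in the help output.
    """
    merged = {}
    for value in ca_values:
        for pair in value.split(","):
            if not pair:
                continue  # ignore empty fragments such as a trailing comma
            param, _, val = pair.partition("=")
            merged[param.strip()] = val.strip()
    return merged


# Two -ca options; "some.param" is set twice, so the second value is kept.
print(merge_custom_arguments(["some.param=1,other.param=2", "some.param=3"]))
```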

components

edges

 -eif,--edgeInputFormat <arg>           Edge input format
 -eip,--edgeInputPath <arg>             Edge input path
 -eof,--edgeOutputFormat <arg>          Edge output format
 -esd,--edgeSubDir <arg>                subdirectory to be used for the
                                        edge output

vertices

 -ve,--outEdges <arg>                   Vertex edges class
 -vif,--vertexInputFormat <arg>         Vertex input format
 -vip,--vertexInputPath <arg>           Vertex input path
 -vof,--vertexOutputFormat <arg>        Vertex output format
 -vsd,--vertexSubDir <arg>              subdirectory to be used for the
                                        vertex output
 -vvf,--vertexValueFactoryClass <arg>   Vertex value factory class

log

 -h,--help                              Help
 -q,--quiet                             Quiet output

results

 -op,--outputPath <arg>                 Output path

computation

 -mc,--masterCompute <arg>              MasterCompute class
 -pc,--partitionClass <arg>             Partition class

workers

 -w,--workers <arg>                     Number of workers
 -wc,--workerContext <arg>              WorkerContext class

misc

 -jyc,--jythonClass <arg>               Jython class name, used if
                                        computation passed in is a python
                                        script
 -yj,--yarnjars <arg>                   comma-separated list of JAR
                                        filenames to distribute to Giraph
                                        tasks and ApplicationMaster. YARN
                                        only. Search order: CLASSPATH,
                                        HADOOP_HOME, user current dir.
 -yh,--yarnheap <arg>                   Heap size, in MB, for each Giraph
                                        task (YARN only.) Defaults to
                                        giraph.yarn.task.heap.mb => 1024
                                        (integer)
                                        MB.
 -ca,--customArguments <arg>            provide custom arguments for the
                                        job configuration in the form: -ca
                                        <param1>=<value1>,<param2>=<value2
                                        > -ca <param3>=<value3> etc. It
                                        can appear multiple times, and the
                                        last one has effect for the same
                                        param.
 -cf,--cacheFile <arg>                  Files for distributed cache
 -la,--listAlgorithms                   List supported algorithms
 -th,--typesHolder <arg>                Class that holds types. Needed
                                        only if Computation is not set
 -aw,--aggregatorWriter <arg>           AggregatorWriter class
 -c,--combiner <arg>                    MessageCombiner class
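
To show how the grouped options fit together, here is a minimal Python sketch that assembles a GiraphRunner command line from the vertex, results, and workers options. It only builds the argument list and does not run anything. The jar file name, the paths, and the helper name are hypothetical, the class names are taken from the Giraph examples and input/output formats and serve only as placeholders, and this is not necessarily how GiraphWrapper itself invokes Giraph.

```python
import shlex


def giraph_command(jar, computation, vertex_input_format, vertex_input_path,
                   vertex_output_format, output_path, workers,
                   custom_arguments=None):
    """Build the argument list for a GiraphRunner call (hypothetical helper)."""
    cmd = [
        "hadoop", "jar", jar,
        "org.apache.giraph.GiraphRunner",
        computation,
        "-vif", vertex_input_format,   # vertices: input format
        "-vip", vertex_input_path,     # vertices: input path
        "-vof", vertex_output_format,  # vertices: output format
        "-op", output_path,            # results: output path
        "-w", str(workers),            # workers: number of workers
    ]
    # Each custom argument is passed as its own -ca <param>=<value> option.
    for param, value in (custom_arguments or {}).items():
        cmd += ["-ca", f"{param}={value}"]
    return cmd


cmd = giraph_command(
    jar="giraph-examples.jar",  # placeholder jar name
    computation="org.apache.giraph.examples.SimpleShortestPathsComputation",
    vertex_input_format="org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat",
    vertex_input_path="/input/graph.txt",  # placeholder path
    vertex_output_format="org.apache.giraph.io.formats.IdWithValueTextOutputFormat",
    output_path="/output/shortestpaths",   # placeholder path
    workers=1,
)
print(shlex.join(cmd))
```

Returning a list instead of a single string keeps the arguments safe to pass to subprocess.run without shell quoting issues; shlex.join is used only for display.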
