Link recommendation experiments

The RELISON framework provides a program for recommending people in social networks. Once the library binary has been built, the program can be run with the following terminal command:

java -jar RELISON.jar recommendation train-network test-network multigraph directed weighted selfloops readtypes config output rec-length (-users test/all -print true/false -reciprocal true/false -distance max directed -feat-data file index -comms commfile)

where

  • train-network: a file containing the training social network, which is taken as input to recommenders.
  • test-network : a file containing the test social network, which is used to evaluate the effectiveness of the recommendation algorithms.
  • multigraph: true if the network allows multiple edges between each pair of users, false otherwise.
  • directed: true if the network is directed, false otherwise.
  • weighted: true if we want to use the weights of the links, false otherwise (weights will be binary).
  • selfloops: true if we allow links between a node and itself, false otherwise.
  • readtypes: true if we want to read the types of the edges, false otherwise.
  • config: a YAML configuration file specifying the people recommendation algorithms and the evaluation metrics we want to apply (see Configuration file below).
  • output: a directory in which to store the program output (the evaluation file and the recommendations).
  • rec-length: the maximum number of links to recommend to each user.
  • Optional arguments:
    • -users test/all: indicates whether to generate recommendations for all the users (all) or just for those who have links in the test set (test). By default: all.
    • -print true/false: indicates whether we want to print the recommendations or not. By default: true.
    • -reciprocal true/false: in directed networks, this parameter indicates whether we want to recommend reciprocal edges or not. By default: false.
    • -distance max directed: max is the maximum distance allowed between the target and the candidate users. directed indicates whether link orientation is considered when computing that distance. By default, the distance is not limited.
    • -feat-data file index: needed only to compute feature-based metrics. file is the location of either a feature file (tab-separated user-feature-value triplets) or an index containing the features. If index is true, file is treated as an index.
    • -comms commfile: path to a file containing a community partition of the network.
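
The positional arguments above must be given in a fixed order, so it can help to assemble the command programmatically. The following is a minimal Python sketch, not part of RELISON: the helper name build_command, the file names (train.txt, test.txt, config.yml, output/) and the default values are assumptions chosen for illustration. It only builds and prints the command; it does not launch Java.

```python
import shlex

def build_command(train, test, config, output, rec_length,
                  multigraph=False, directed=True, weighted=False,
                  selfloops=False, readtypes=False, users=None,
                  jar="RELISON.jar"):
    """Assemble the argument list in the order documented above.

    Hypothetical helper: names and defaults are illustrative only.
    """
    def flag(value):
        # The program expects the literals "true" / "false".
        return "true" if value else "false"

    cmd = ["java", "-jar", jar, "recommendation", train, test,
           flag(multigraph), flag(directed), flag(weighted),
           flag(selfloops), flag(readtypes),
           config, output, str(rec_length)]
    if users is not None:
        # Optional argument: "test" or "all".
        cmd += ["-users", users]
    return cmd

# Hypothetical file names, used only for this example.
cmd = build_command("train.txt", "test.txt", "config.yml", "output/", 10,
                    directed=True, weighted=False, users="test")
print(shlex.join(cmd))
```

To actually run the program, the resulting list could be passed to subprocess.run(cmd, check=True) from the Python standard library.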

Configuration file

To select the algorithms and metrics to run, the program receives as input a configuration file specifying the people recommendation methods we want to apply and the metrics used to evaluate them. This is a YAML file with the following format:

algorithms:
  algorithm_name1:
    param_name1:
      type: int/double/boolean/string/long/orientation/object
      values: [value1,value2,...,valueN] / value
      range:
      - start: startingValue
        end: endingValue
        step: stepValue
      - start: <...>
      objects:
        name_of_the_object:
          param_name1:
            type: int/double/boolean/string/long/orientation/object
            <...>
          param_name2:
            type: int/double/boolean/string/long/orientation/object
            <...>
  algorithm_name2:
    ...
metrics:
  metric_name1:
    param_name1:
      type: int/double/boolean/string/long/orientation/object
      <...>
    param_name2:
      <...>
  metric_name2:
  <...>

where the algorithms tag starts the section of the configuration file dedicated to the parameter grid of the people recommendation algorithms, and the metrics tag starts the evaluation metric section of the YAML file.
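
To illustrate what a parameter grid means, the Python sketch below expands the values and range entries of each parameter into concrete values and then takes the Cartesian product over the parameters, yielding one variant per combination. This is one plausible reading of the format, not RELISON's actual parser. In particular, treating end as inclusive is an assumption, the parameter names k and alpha are hypothetical, and nested objects parameters are ignored.

```python
from itertools import product

def expand_param(spec):
    """Expand one parameter spec (values and/or range) into a list of values."""
    values = []
    if "values" in spec:
        v = spec["values"]
        values.extend(v if isinstance(v, list) else [v])
    for r in spec.get("range", []):
        start, end, step = r["start"], r["end"], r["step"]
        # Assumption: `end` is inclusive. The epsilon guards against
        # floating-point error in the division.
        n_steps = int((end - start) / step + 1e-9)
        for i in range(n_steps + 1):
            value = start + i * step
            if isinstance(value, float):
                value = round(value, 10)  # tidy up float noise
            values.append(value)
    return values

def expand_grid(params):
    """Return one {param: value} dict per combination of parameter values."""
    names = list(params)
    combos = product(*(expand_param(params[name]) for name in names))
    return [dict(zip(names, combo)) for combo in combos]

# Hypothetical parameters, used only for this example.
grid = expand_grid({
    "k": {"type": "int", "values": [1, 2]},
    "alpha": {"type": "double",
              "range": [{"start": 0.0, "end": 1.0, "step": 0.5}]},
})
print(len(grid))
```

Under these assumptions, two values of k combined with three values of alpha give six variants of the algorithm.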

Output files

This program produces two kinds of output: the evaluation file and the recommendation files.

Evaluation file

This file contains the evaluation metrics for each algorithm. The first line is the header, and each of the remaining lines shows the metric values for a single algorithm variant. Each line has the following (tab-separated) format:

Variant Fraction metric1 metric2 <...> metricN

where Fraction is the position of the algorithm in the comparison divided by the total number of algorithms (for instance, the first of two algorithms has a fraction of 0.5).

For example:

Variant Fraction P@10  R@10  nDCG@10
Random  0.5 0.001 0.001 0.0013
Popularity  1.0 0.4 0.23  0.3482
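
As a sketch of how such a file could be consumed, the Python snippet below parses the tab-separated format described above and picks the variant with the highest nDCG@10. The function name parse_evaluation is an assumption for illustration, and the sample string reproduces the example above with explicit tab characters.

```python
import csv
import io

def parse_evaluation(text):
    """Map each variant name to its fraction and a {metric: value} dict."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)        # Variant, Fraction, metric1, ..., metricN
    metric_names = header[2:]
    results = {}
    for row in reader:
        if not row:              # skip blank lines
            continue
        variant, fraction = row[0], float(row[1])
        metrics = {m: float(v) for m, v in zip(metric_names, row[2:])}
        results[variant] = {"fraction": fraction, "metrics": metrics}
    return results

# The example above, written with explicit tabs.
sample = (
    "Variant\tFraction\tP@10\tR@10\tnDCG@10\n"
    "Random\t0.5\t0.001\t0.001\t0.0013\n"
    "Popularity\t1.0\t0.4\t0.23\t0.3482\n"
)
results = parse_evaluation(sample)

# Pick the variant with the highest nDCG@10.
best = max(results, key=lambda v: results[v]["metrics"]["nDCG@10"])
print(best)
```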

Recommendation file

This file contains the recommendations produced for the different users. It does not have a header, and each line has the following (tab-separated) format:

TargetUserId  CandidateUserId value

where value is the recommendation score, and the target-candidate user pairs are sorted first by target user and then by score, in descending order. Apart from that, the order between users may be arbitrary.

Example:

883345842 10671602  0.7839427836033016
883345842 242101122 0.7510278151340579
883345842 230377004 0.6487410202793975
883345842 19604744  0.6219403238554378
883345842 398306220 0.6129622813222247
883345842 181561712 0.525116653773563
883345842 176566242 0.525116653773563
883345842 105119490 0.525116653773563
883345842 11254812  0.5196496019742988
883345842 11348282  0.5094869396470944
883609597 430916286 3.08258431711799
883609597 756033804 2.7629745300415265
883609597 11254812  2.629591896712651
<...>
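
A recommendation file in this format can be loaded into a per-user ranking with a few lines of Python. The sketch below is illustrative only: the function name parse_recommendations is an assumption, user identifiers are kept as strings because the format does not guarantee that they are numeric, and the sample reuses three lines from the example above.

```python
from collections import defaultdict

def parse_recommendations(lines):
    """Group (candidate, score) pairs by target user, preserving file order."""
    recs = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:             # skip blank lines
            continue
        target, candidate, score = line.split("\t")
        recs[target].append((candidate, float(score)))
    return dict(recs)

# Three lines taken from the example above, with explicit tabs.
sample = [
    "883345842\t10671602\t0.7839427836033016",
    "883345842\t242101122\t0.7510278151340579",
    "883609597\t430916286\t3.08258431711799",
]
recs = parse_recommendations(sample)

# Lines for each target user are already sorted by descending score,
# so the first entry is the top recommendation.
top = {user: candidates[0][0] for user, candidates in recs.items()}
print(top)
```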