Lars-H/hdt_sampler

HDT Sampler

Sample subgraphs from RDF Graphs stored as HDT Documents.

Requirements

  • Python
  • pyHDT
  • RDFLib

Installation

  • Install Python
  • Follow the installation instructions of pyHDT and RDFLib to install both libraries
  • Installing into a virtualenv is advised
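
After installing, a quick way to confirm that both libraries are visible to the active interpreter (for example inside the virtualenv) is to look up their modules without importing them. This is a minimal sketch and not part of the repository; the helper name check_requirements is hypothetical, and it assumes pyHDT is importable as hdt and RDFLib as rdflib.

```python
from importlib.util import find_spec


def check_requirements(modules=("hdt", "rdflib")):
    """Return {module_name: True/False} depending on whether it can be found.

    Assumption: pyHDT installs a module named "hdt" and RDFLib a module
    named "rdflib". The lookup does not import the modules themselves.
    """
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, available in check_requirements().items():
        status = "found" if available else "MISSING"
        print(f"{name}: {status}")
```

If a module is reported as missing, the interpreter in use is probably not the one the libraries were installed into.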

Usage

Example:

python hdt_sampler.py -f myHDTFile.hdt -s 0.1 -m unweighted

CLI Arguments:

  -h, --help            show this help message and exit
  -f FILE, --file FILE  HDT File to be sampled from (required)
  -s SIZE, --size SIZE  Percentage of subjects to be sampled, range: [0,1]
                        (required)
  -n NUMBER, --number NUMBER
                        Number of samples to be created (default=1)
  -m {unweighted,weighted,hybrid}, --method {unweighted,weighted,hybrid}
                        Sampling method to be used (required: unweighted,
                        weighted, hybrid)
  -r RATIO, --ratio RATIO
                        Ratio for hybrid sampling, range: [0,1] (default=0.5)
  -l {INFO,DEBUG,ERROR}, --logging {INFO,DEBUG,ERROR}
                        Set logging level (optional)
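
The help text names the three sampling methods but does not describe them. The sketch below illustrates one plausible reading on a plain in-memory list of triples; it is not the code of hdt_sampler.py and does not use pyHDT. Assumptions: "unweighted" draws subjects uniformly at random, "weighted" draws subjects with probability proportional to their number of triples, "hybrid" draws a share RATIO of the subjects unweighted and the rest weighted, and the sample consists of all triples of the chosen subjects. All function names are hypothetical.

```python
import random
from collections import defaultdict


def group_by_subject(triples):
    """Map each subject to the list of its (s, p, o) triples."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((s, p, o))
    return by_subject


def weighted_pick(subjects, weights, k, rng):
    """Draw k distinct subjects, favouring larger weights.

    Uses the Efraimidis-Spirakis scheme: each item gets the key u**(1/w)
    with u uniform in [0, 1), and the k largest keys are kept.
    """
    keyed = [(rng.random() ** (1.0 / weights[s]), s) for s in subjects]
    keyed.sort(reverse=True)
    return [s for _, s in keyed[:k]]


def sample_subgraph(triples, size, method="unweighted", ratio=0.5, seed=None):
    """Return the triples of a sampled fraction `size` of all subjects."""
    rng = random.Random(seed)
    by_subject = group_by_subject(triples)
    subjects = sorted(by_subject)
    weights = {s: len(by_subject[s]) for s in subjects}  # assumed weight: out-degree
    k = round(size * len(subjects))

    if method == "unweighted":
        chosen = set(rng.sample(subjects, k))
    elif method == "weighted":
        chosen = set(weighted_pick(subjects, weights, k, rng))
    elif method == "hybrid":
        # Assumption: `ratio` is the share of subjects drawn unweighted.
        k_unweighted = round(ratio * k)
        chosen = set(rng.sample(subjects, k_unweighted))
        remaining = [s for s in subjects if s not in chosen]
        chosen.update(weighted_pick(remaining, weights, k - k_unweighted, rng))
    else:
        raise ValueError(f"unknown sampling method: {method}")

    return [t for s in sorted(chosen) for t in by_subject[s]]
```

For example, sample_subgraph(triples, 0.1, method="hybrid", ratio=0.5, seed=42) would keep the triples of roughly 10% of the subjects under these assumptions.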

Scripts

In the scripts directory we provide additional scripts:

  • compute_CSPF_proto.py: Prototype implementation that computes the CSPF for an N-Triples file. The script takes the path of an N-Triples file as its single argument, shuffles the triples, sorts them, computes the CSPF, and prints statistics about the computation.
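
To give an idea of the underlying notion, the following sketch computes characteristic sets from N-Triples lines, where the characteristic set of a subject is the set of predicates it occurs with. It is an illustration only and not the code of compute_CSPF_proto.py: it skips the shuffling and sorting steps, does not claim to produce the exact CSPF statistics of the script, and splits lines naively instead of using a full N-Triples parser. The function name characteristic_sets is hypothetical.

```python
from collections import Counter, defaultdict


def characteristic_sets(lines):
    """Count how many subjects share each characteristic set.

    `lines` is an iterable of N-Triples lines. Subject and predicate
    contain no whitespace in N-Triples, so the first two
    whitespace-separated tokens are taken as subject and predicate.
    Returns a Counter mapping frozenset(predicates) -> number of subjects.
    """
    predicates_by_subject = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        subject, predicate, _rest = line.split(None, 2)
        predicates_by_subject[subject].add(predicate)
    return Counter(frozenset(preds) for preds in predicates_by_subject.values())


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python characteristic_sets.py data.nt
    with open(sys.argv[1], encoding="utf-8") as handle:
        counts = characteristic_sets(handle)
    for predicates, n_subjects in counts.most_common():
        print(n_subjects, sorted(predicates))
```

Applied to a sampled subgraph, such counts can be compared with those of the full dataset.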

Related Publication

Heling, Lars, and Maribel Acosta. "Estimating Characteristic Sets for RDF Dataset Profiles based on Sampling." European Semantic Web Conference, 2020.
