
TaoranJ/oag_in_elasticsearch


OAG in Elasticsearch

Want to explore the Microsoft Academic Graph (MAG) and Aminer in a search engine? Use these scripts to load the MAG and Aminer datasets into Elasticsearch!

Requirements

pip install elasticsearch tqdm 

Datasets

OAG can be downloaded from here. Open Academic Graph (OAG) unifies two billion-scale academic graphs: Microsoft Academic Graph (MAG) and AMiner.

MAG V1

The MAG V1 dataset consists of 167 files named mag_papers_[0-166].txt. Run the script below to upload the dataset to Elasticsearch. The index name is set with the --index option and defaults to mag_v1. The script was tested on Elasticsearch >= 7.4 using only the English-language publications in MAG V1.

python index_mag_v1.py --inputs [path/mag_papers*.txt]
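For readers curious about what such an upload involves, here is a minimal sketch of streaming the files into Elasticsearch with the client's bulk helper. It is not the code of index_mag_v1.py. It assumes each line of a mag_papers file is one JSON object with "id" and "lang" fields, and that a node is listening on http://localhost:9200. The function names iter_actions and upload are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_actions(lines: Iterable[str], index: str = "mag_v1") -> Iterator[Dict]:
    """Yield one bulk-index action per English paper (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        paper = json.loads(line)
        if paper.get("lang") != "en":  # "lang" is an assumed field name
            continue
        action = {"_index": index, "_source": paper}
        if "id" in paper:  # "id" is an assumed field name; reuse it as the document id
            action["_id"] = paper["id"]
        yield action


def upload(paths: Iterable[str], index: str = "mag_v1") -> None:
    """Stream every input file into Elasticsearch through the bulk helper."""
    # Imported lazily so iter_actions can be tried without the client or a cluster.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # assumed local node
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            helpers.bulk(es, iter_actions(handle, index))
```

Because the actions come from a generator, each file is read line by line and never loaded into memory in full, which matters for a dataset of this size.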

Aminer V1

The Aminer V1 dataset consists of 155 files named aminer_papers_[0-154].txt. Run the script below to upload the dataset to Elasticsearch. The index name is set with the --index option and defaults to aminer_v1. The script was tested on Elasticsearch >= 7.4 using only publications that have both a title and an abstract.

python index_aminer_v1.py --inputs [path/aminer_papers*.txt]
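The restriction to publications with both a title and an abstract can be pictured as a small filter applied before indexing. The sketch below is illustrative only and is not taken from index_aminer_v1.py. It assumes one JSON object per line with "title" and "abstract" fields, and the helper names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def has_title_and_abstract(paper: Dict) -> bool:
    """Return True when both fields are present and are non-empty strings."""
    return all(
        isinstance(paper.get(key), str) and paper[key].strip() != ""
        for key in ("title", "abstract")  # assumed field names
    )


def iter_complete_papers(lines: Iterable[str]) -> Iterator[Dict]:
    """Parse JSON lines and keep only the papers that pass the filter."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        paper = json.loads(line)
        if has_title_and_abstract(paper):
            yield paper
```

Records that fail the check are dropped, so the resulting index only contains documents that can be searched by both title and abstract text.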
