
OSL Internship ‐ 2024 ‐ 2nd Cycle

Ivan Ogasawara edited this page Apr 17, 2024 · 1 revision

ES-Journals

An Elasticsearch instance for serving scientific journal metadata. It currently supports bioRxiv and medRxiv.


Project Idea 1: Add support for more Journals

Abstract

ES-Journals relies on scripts that download article metadata from the original sources and load it into an Elasticsearch instance.
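The load step can be illustrated with a minimal sketch, assuming records have already been downloaded as Python dictionaries. The function name `to_bulk_payload`, the `doi` field used as the document ID, and the index name are assumptions made for illustration, not the project's actual code. The sketch builds a newline-delimited JSON body in the format the Elasticsearch `_bulk` endpoint expects: one action line followed by one document line per record.

```python
import json
from typing import Iterable


def to_bulk_payload(records: Iterable[dict], index: str) -> str:
    """Build an NDJSON body for the Elasticsearch _bulk endpoint.

    Hypothetical helper: assumes every record carries a "doi" key.
    """
    lines = []
    for record in records:
        # Action line: index the document under its DOI, so that
        # reloading the same article overwrites the previous copy.
        action = {"index": {"_index": index, "_id": record["doi"]}}
        lines.append(json.dumps(action))
        # Source line: the metadata record itself.
        lines.append(json.dumps(record))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up record, for illustration only.
    sample = [{"doi": "10.1101/example", "title": "An illustrative title"}]
    print(to_bulk_payload(sample, index="biorxiv"), end="")
```

The resulting string could then be sent with an HTTP POST to the instance's `_bulk` endpoint using the `application/x-ndjson` content type, or the same records could be passed to a client library's bulk helper instead.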

In some sense it is very similar to https://github.com/CenterForOpenScience/SHARE, but ES-Journals aims to stay as simple as possible and to serve only the Elasticsearch instance.

Current State

Currently, it supports bioRxiv and medRxiv.

Tasks

Expected Outcomes

  • Scripts to download metadata from arXiv and load it into ES
  • Scripts to download metadata from PubMed and load it into ES
  • Scripts to download metadata from PubMedCentral and load it into ES
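As a rough idea of what the arXiv script might involve: the arXiv API returns query results as an Atom feed, so one piece of the work is turning feed entries into flat documents that can be indexed. The sketch below is only an assumption about how that could look. The function name `parse_entries`, the chosen output fields, and the embedded sample feed (with placeholder values) are all hypothetical, and a real script would fetch the feed over HTTP rather than use an inline string.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 feeds.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up feed with placeholder values, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>An  Illustrative
      Title</title>
    <summary>Placeholder abstract.</summary>
    <published>2024-01-01T00:00:00Z</published>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
  </entry>
</feed>
"""


def _text(entry, tag):
    """Return the text of a child element with whitespace collapsed."""
    raw = entry.findtext(f"atom:{tag}", default="", namespaces=ATOM)
    return " ".join(raw.split())


def parse_entries(feed_xml):
    """Convert Atom entries into flat dictionaries ready for indexing."""
    root = ET.fromstring(feed_xml)
    docs = []
    for entry in root.findall("atom:entry", ATOM):
        docs.append({
            "id": _text(entry, "id"),
            "title": _text(entry, "title"),
            "abstract": _text(entry, "summary"),
            "published": _text(entry, "published"),
            "authors": [
                " ".join(a.findtext("atom:name", default="", namespaces=ATOM).split())
                for a in entry.findall("atom:author", ATOM)
            ],
        })
    return docs


if __name__ == "__main__":
    for doc in parse_entries(SAMPLE_FEED):
        print(doc)
```

PubMed and PubMedCentral expose different formats, so their scripts would need their own parsers, but the output could follow the same flat-dictionary shape so that a single loader handles every source.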

Details

  • Prerequisites:
    • Python
    • Django
  • Expected Time: 350 hours
  • Potential Mentor(s): Ivan Ogasawara

References