Add abstracts entities into the KG #39
Comments
Hi @ceteri, I think I now understand how everything works in the template code, especially how it iterates through partitions in the BUCKET_STAGE. But I'm a little confused about the input and output. Where is the metadata that I should pull the abstract from? Is it BUCKET_STAGE? If so, where should I write the result (is it BUCKET_FINAL)? In other words, how do I add anything to the KG? Could you explain a little more about it?
Great. For overall structure, probably copy code similar to this from
... then with the added definition for Using the
Another function could serve as an accessor, something with code like:
That moves the abstract into the top level metadata for that publication, along with
A larger question is whether Semantic Scholar is the only source of abstracts. It may be that we need to expand the RCApi list of APIs and what they return -- for example PubMed, OpenAIRE, CORE, etc., and @lobodemonte can help there.
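As a rough illustration of the accessor idea above, here is a minimal sketch. The record layout is an assumption, not taken from the repo: it supposes each publication is a dict in which an API's response sits under a key named after the source, with the fields in a nested "meta" dict. The function names `get_abstract` and `promote_abstract` are hypothetical too.

```python
from typing import Any, Dict, Optional


def get_abstract(pub: Dict[str, Any], source: str = "Semantic Scholar") -> Optional[str]:
    """Return the abstract found in one API's response, or None if absent.

    Assumed (hypothetical) layout: pub[source]["meta"]["abstract"].
    """
    response = pub.get(source)
    if not isinstance(response, dict):
        return None

    meta = response.get("meta")
    if not isinstance(meta, dict):
        return None

    abstract = meta.get("abstract")
    if isinstance(abstract, str) and abstract.strip():
        return abstract.strip()

    return None


def promote_abstract(pub: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the abstract to the top level of the record, where one exists."""
    abstract = get_abstract(pub)
    if abstract is not None:
        pub["abstract"] = abstract
    return pub
```

Records without an abstract pass through unchanged, so the same call can be applied to every publication in a partition.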
Hi @ceteri, thanks so much! Your instructions are detailed and very helpful. I have created a PR in #58 with the code. I hope I understood what you meant and that it is what you want. Now I have a much clearer idea of how to add something to the KG, and I'm also getting more familiar with
Add the publication abstracts into the KG, based on stage3 results from discovery APIs.
Best to use run_stage3.py as a template, although the main part to reuse is how it iterates through partitions in the BUCKET_STAGE.
@ernestogimeno has also worked with this code and can help guide/advise.
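For anyone new to the partition pattern, the loop might look roughly like the sketch below. It is not the code from run_stage3.py. It assumes each partition is a JSON file holding a list of publication records, and that the staging and output buckets are plain directories. `iter_partitions`, `run_stage`, and the `transform` callback are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Tuple


def iter_partitions(bucket: Path) -> Iterator[Tuple[str, List[Dict]]]:
    """Yield (file name, publication list) for each JSON partition in a bucket."""
    for path in sorted(bucket.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            yield path.name, json.load(f)


def run_stage(src: Path, dst: Path, transform: Callable[[Dict], Dict]) -> int:
    """Apply transform to every publication in src, writing results to dst.

    Each partition keeps its file name, so partitions stay aligned
    between the input and output buckets. Returns the record count.
    """
    dst.mkdir(parents=True, exist_ok=True)
    count = 0

    for name, pub_list in iter_partitions(src):
        out = [transform(pub) for pub in pub_list]

        with (dst / name).open("w", encoding="utf-8") as f:
            json.dump(out, f, indent=2, sort_keys=True)

        count += len(out)

    return count
```

A call such as `run_stage(Path("stage"), Path("final"), add_abstract)` would then read from one bucket and write to the other, where the directory names and `add_abstract` are placeholders for whatever the workflow actually uses.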
Will need to pull the abstract field from metadata, where it exists. The responses from Semantic Scholar tend to have these -- and we may be able to extend other API calls to get abstracts. @lobodemonte can assist on those extensions.
The end goal will be to include abstracts as metadata in the graph -- where available -- and then also run these through a later stage that runs the TextRank algorithm to extract key phrases.
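To give a feel for what the later TextRank stage does with an abstract, here is a deliberately simplified sketch. It ranks single words only: it builds a co-occurrence graph over the non-stopword tokens and scores them with a PageRank-style iteration. Full TextRank typically also filters by part of speech and merges adjacent top-ranked words into phrases, which this omits. The stopword list, the parameter defaults, and the function name `textrank_keywords` are all assumptions made for illustration, not the project's implementation.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
    "is", "of", "on", "or", "that", "the", "to", "we", "with",
}


def textrank_keywords(
    text: str,
    top_n: int = 5,
    window: int = 3,
    damping: float = 0.85,
    iterations: int = 50,
) -> List[str]:
    """Rank words in text by a simplified, unweighted TextRank score."""
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]

    # Link words that co-occur within `window` positions of each other
    # (positions counted after stopwords are removed).
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    if not neighbors:
        return []

    # PageRank-style update, run for a fixed number of iterations.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]
```

Calling `textrank_keywords(pub["abstract"])` on each record that has a top-level abstract would give a short list of candidate terms to attach to that publication, assuming the abstract was stored under that key.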