OpenAlex new author and work data accuracy #146

@yhan818

Hi, All,

The new OpenAlex author data was released on July 25 and officially announced on Aug 11, 2023. I ran some tests after July 25 and again after Aug 11, and noticed some intermediate changes in the author data between those dates.

It seems to me that the disambiguation uses cosine_similarity to measure similarity and XGBoost to match an author with his/her name. See the code: https://github.com/ourresearch/openalex-name-disambiguation/tree/main/V3/002_Data_Processing_Modeling_Clustering
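For anyone not familiar with the measure, here is a minimal sketch of cosine similarity in plain Python. It is not taken from the linked repository: the function below is my own illustration, the vectors (`record_a`, `record_b`, `record_c`) are made-up topic-count profiles, and the XGBoost matching step is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: an all-zero profile matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors (e.g. counts of works per topic) for three records.
record_a = [3, 0, 1, 2]
record_b = [6, 0, 2, 4]  # same direction as record_a, just more works
record_c = [0, 5, 0, 0]  # no topics in common with record_a

print(cosine_similarity(record_a, record_b))  # close to 1: likely the same profile
print(cosine_similarity(record_a, record_c))  # 0: nothing in common
```

A score near 1 means two records have very similar profiles regardless of how many works each has, which is presumably why it is useful as an input feature for a matching model.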

In general, the new author data is much more accurate than the previous version. However, there are still issues for both authors and works. I have tested some cases; see yhan818/openalexR-test#7.

So what are your views on the latest updates?

Labels: OpenAlex, question (Further information is requested)
