# Analysis of Author Metrics

This notebook shows how our graph enables the analysis of author metrics.

In [23]:

from graph.oa_graph import OpenAlexGraph
from graph.analyse_graph import (
    res_to_dataframe,
)

g = OpenAlexGraph()
g.parse("out/graph.ttl")  # for performance, keep graph loaded

<Graph identifier=N0fd463374c34407b8dda3e9e3fb64d2a (<class 'graph.oa_graph.OpenAlexGraph'>)>

# Number of Works

We can simply query the number of works for an author.

In [16]:
authors = ["Stefan Wermter", "Frank Steinicke"]

author_list = '" "'.join(authors)
q = f"""
    SELECT
        (?name as ?AUTHOR)
        (COUNT(?work) as ?PUBLICATIONS)
    WHERE {{
        ?id a schema:Person ;
            schema:name ?name ;
            schema:author ?work .

        VALUES ?name {{ "{author_list}" }}
    }}
    GROUP BY ?name
    ORDER BY DESC(?PUBLICATIONS)
"""
df = res_to_dataframe(g.query(q))
df.head()

,AUTHOR,PUBLICATIONS
0,Stefan Wermter,295
1,Frank Steinicke,131
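The query above builds its `VALUES` list by joining the names with `" "` and wrapping the result in quotes inside the template. A minimal sketch of a slightly more defensive variant follows, assuming a name could contain a double quote or a backslash. The helper name `sparql_values` is hypothetical and not part of the `graph` package; unlike `author_list`, it returns the literals with their quotes already included.

```python
def sparql_values(names):
    """Render names as space-separated, quoted SPARQL string literals."""
    def escape(text):
        # Escape backslashes first, then double quotes.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    return " ".join(f'"{escape(name)}"' for name in names)


authors = ["Stefan Wermter", "Frank Steinicke"]

# The helper already adds the quotes, so the template must not add them again.
values_clause = f"VALUES ?name {{ {sparql_values(authors)} }}"
print(values_clause)
```

The resulting clause can be placed in the query wherever the `VALUES ?name` line appears.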


We can also query the number of citations for a specific author. Because the graph includes only works from Hamburg's universities, only those works contribute to the citation count.

In [15]:
q = f"""
    SELECT
        (?name as ?AUTHOR)
        (SUM(?citations) as ?CITATIONS)
    WHERE {{
        ?id a schema:Person ;
            schema:name ?name ;
            schema:author [
                dbp:citation [
                    dbp:amount ?citations
                ]
            ] .

        VALUES ?name {{ "{author_list}" }}
    }}
    GROUP BY ?name
    ORDER BY DESC(?CITATIONS)
"""
df = res_to_dataframe(g.query(q))
df.head()

,AUTHOR,CITATIONS
0,Stefan Wermter,6067
1,Frank Steinicke,1162
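The publication and citation counts can be combined into a rough citations-per-work figure. The sketch below is an illustration only: the numbers are copied by hand from the two outputs in this notebook, the function name `citations_per_work` is hypothetical, and the ratio inherits the restriction to works from Hamburg's universities.

```python
# Values copied from the two query outputs in this notebook.
publications = {"Stefan Wermter": 295, "Frank Steinicke": 131}
citations = {"Stefan Wermter": 6067, "Frank Steinicke": 1162}


def citations_per_work(citations, publications):
    """Return the mean number of citations per work for each author."""
    return {
        author: citations[author] / publications[author]
        for author in publications
        # Skip authors missing from either result, or with no works.
        if author in citations and publications[author] > 0
    }


ratios = citations_per_work(citations, publications)
for author, ratio in ratios.items():
    print(f"{author}: {ratio:.2f}")
```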


We can also consider the number of citations per year.

In [22]:
q = f"""
    SELECT
        (?name as ?AUTHOR)
        (?year as ?YEAR)
        (SUM(?citations) as ?CITATIONS)
    WHERE {{
        ?id a schema:Person ;
            schema:name ?name ;
            schema:author [
                dbp:citation [
                    dbp:amount ?citations ;
                    dbp:year ?year
                ]
            ] .

        VALUES ?name {{ "{author_list}" }}
    }}
    GROUP BY ?name ?year
    ORDER BY DESC(?year)
"""
df = res_to_dataframe(g.query(q))
df = df.set_index(["AUTHOR", "YEAR"])
df = df.sort_index()
df

AUTHOR,YEAR,CITATIONS
Frank Steinicke,2012,1
Frank Steinicke,2014,1
Frank Steinicke,2015,10
Frank Steinicke,2016,22
Frank Steinicke,2017,59
Frank Steinicke,2018,99
Frank Steinicke,2019,242
Frank Steinicke,2020,267
Frank Steinicke,2021,402
Frank Steinicke,2022,59
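As a quick sanity check, the yearly counts should add up to the author's total from the earlier citation query (1162 for Frank Steinicke). The sketch below copies the yearly values shown above by hand and also derives a running total; the variable names are illustrative.

```python
from itertools import accumulate

# Yearly citation counts for Frank Steinicke, copied from the output above.
yearly = {
    2012: 1, 2014: 1, 2015: 10, 2016: 22, 2017: 59,
    2018: 99, 2019: 242, 2020: 267, 2021: 402, 2022: 59,
}

years = sorted(yearly)
# Running total of citations up to and including each year.
cumulative = dict(zip(years, accumulate(yearly[year] for year in years)))
total = sum(yearly.values())

print(total)
for year in years:
    print(year, cumulative[year])
```

If the total differed from the per-author sum, that would point to citation entries without a `dbp:year` value, since the yearly query only matches citations that have one.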
