# BASIC ALTMETRIC ANALYSIS

This notebook uses the [Altmetric source file]() to explore the top EarthCube-funded papers through the Altmetric lens.

It produces three files in `../outputs/altmetric/`:

* [altmetric_doi_project_detail.csv](../outputs/altmetric/altmetric_doi_project_detail.csv)
* [altmetric_doi_project_detail_table.md](../outputs/altmetric/altmetric_doi_project_detail_table.md)
* [altmetric_doi_project_top10_table.md](../outputs/altmetric/altmetric_doi_project_top10_table.md)
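
Writing to `../outputs/altmetric/` fails if that directory does not exist yet. A minimal sketch, assuming the notebook is run from its own folder and that creating the directory up front is acceptable (`OUTPUT_DIR` and `ensure_output_dir` are illustrative names, not part of the notebook):

```python
from pathlib import Path

# Hypothetical constant: the output folder named in the list above.
OUTPUT_DIR = Path("../outputs/altmetric")


def ensure_output_dir(path: Path = OUTPUT_DIR) -> Path:
    """Create the output directory (and any missing parents) if needed."""
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `ensure_output_dir()` once before the first write is enough; the call is safe to repeat because `exist_ok=True` makes it a no-op when the folder is already there.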

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("../outputs/altmetric_scores.tsv", sep='\t')
df = df.set_index('doi') 

In [3]:
df.describe()

| |altmetric_score|
|:----|--------------:|
|count|114.0|
|mean|42.173193|
|std|139.314984|
|min|0.25|
|25%|1.625|
|50%|7.1|
|75%|17.8035|
|max|980.774|


## Altmetric Summary

* 114 papers with scores (as of 9/30/2022)
* mean Altmetric score: 42.17 ($\sigma$: 139.31)
* median Altmetric score: 7.10
* highest Altmetric score: 980.77
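
The `std` row reported by `describe()` is the sample standard deviation (pandas uses `ddof=1`), that is $\sigma$ rather than the variance $\sigma^2$. A small sketch with the standard library's `statistics` module shows the difference, and also how a single very high score pulls the mean far above the median. The scores below are made up for illustration and are not the notebook's data:

```python
import statistics

# Illustrative scores only; one outlier dominates, as in a skewed distribution.
scores = [0.25, 1.5, 7.0, 18.0, 980.0]

mean = statistics.mean(scores)
median = statistics.median(scores)
std = statistics.stdev(scores)          # sample standard deviation (n - 1)
variance = statistics.variance(scores)  # sample variance, equal to std ** 2

print(f"mean={mean:.2f} median={median:.2f} std={std:.2f} variance={variance:.2f}")
```

For a distribution this skewed, the median is usually the more representative "typical" score.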

## Top 10 Analysis

In [4]:
df.sort_values(by='altmetric_score', ascending=False)[:10]

|doi|altmetric_score|
|:----|--------------:|
|10.1038/sdata.2017.88|980.774|
|10.1002/2017gl074954|918.634|
|10.1029/2021ef002277|580.648|
|10.1038/s41550-020-1147-7|274.68|
|10.1038/nbt.4306|170.44|
|10.1371/journal.pone.0113523|166.646|
|10.1016/j.epsl.2016.12.012|154.722|
|10.1038/s41561-018-0272-8|145.226|
|10.1126/science.aad7048|142.358|
|10.5194/tc-16-1431-2022|107.298|


In [5]:
df_citations = pd.read_csv("../outputs/nsf/nsf_doi_project_summary.tsv", sep='\t', header=None)

df_citations = df_citations[[0,1,2]]
df_citations = df_citations.drop_duplicates()
df_citations.columns = ['nsfid', 'doi', 'title']

In [6]:
df_citations

| |nsfid|doi|title|
|--:|:----|:----|:----|
|0|nsfid|doi|project_title|
|1|1324760|10.1016/j.geomorph.2015.03.039|Data management, sharing, and reuse in experim...|
|3|1324760|10.2110/sedred.2013.4.9|Building a Sediment Experimentalist Network (S...|
|4|1324760|10.2110/sedred.2013.4| |
|5|1340233|10.1126/science.342.6162.1041-b|Open Data: Crediting a Culture of Cooperation|
|...|...|...|...|
|405|2126474|10.1109/icdm51629.2021.00037|Physics-Guided Machine Learning from Simulatio...|
|406|2126474|10.1145/3474842|Significant DBSCAN+: Statistically Robust Dens...|
|407|2126474|10.1109/icdm51629.2021.00088|A Statistically-Guided Deep Network Transforma...|
|408|2126474|10.1145/3474717.3483970|Spatial-Net|


In [7]:
df_projects = pd.read_csv("../outputs/nsf/nsfid_project_title_normed.csv")
df_projects.columns = ['nsfid', 'project_title']

Merge the scores with the citation and project tables to build the final dataframe, `df_altmetric`, sorted by score:

In [8]:
# df_citations was read with header=None, so its header row is kept as data and
# nsfid is an object (string) column. Convert it to int64, dropping the
# non-numeric header row, so that it matches df_projects.nsfid in the merge.
citations = df_citations.assign(nsfid=pd.to_numeric(df_citations['nsfid'], errors='coerce'))
citations = citations.dropna(subset=['nsfid']).astype({'nsfid': 'int64'})

# Attach project titles to each cited DOI, then join the scores on the doi index.
df_altmetric = (
    df.merge(
        citations.merge(df_projects, on='nsfid').set_index('doi'),
        left_index=True, right_index=True
    )
    .reset_index()
    .drop_duplicates('doi', keep='first')
    .set_index('doi')
    .sort_values(by='altmetric_score', ascending=False)
)
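
A join on `nsfid` needs the key to have the same type in both tables; pandas raises a `ValueError` when one side is object (text) and the other is int64. The pandas-free sketch below reproduces how that mismatch can arise: reading a file without treating its first row as a header leaves the column name in the data, and every key stays a string until it is converted. The two-row TSV, its DOIs, and the helper names are all made up for illustration:

```python
import csv
import io

# Made-up sample mimicking a TSV whose first row is a header.
SAMPLE_TSV = "nsfid\tdoi\n1324760\t10.1000/example.1\n1340233\t10.1000/example.2\n"


def read_rows(text, has_header):
    """Return data rows as lists of strings, optionally skipping the header."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return rows[1:] if has_header else rows


def to_int_keys(rows):
    """Convert the first field to int, dropping rows where that fails."""
    converted = []
    for key, *rest in rows:
        try:
            converted.append([int(key), *rest])
        except ValueError:
            continue  # e.g. the literal string "nsfid" from a header row
    return converted


raw = read_rows(SAMPLE_TSV, has_header=False)
print(raw[0])            # the header row, kept as data
print(to_int_keys(raw))  # header row dropped, keys are integers
```

Either approach, skipping the header at read time or converting the key afterwards, leaves integer keys that can be matched against an integer `nsfid` column.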

In [None]:
df_altmetric.altmetric_score = df_altmetric.altmetric_score.round(2)

Produce the CSV file:

In [11]:
df_altmetric.to_csv("../outputs/altmetric/altmetric_doi_project_detail.csv")

Produce the Markdown files:

In [12]:
with open("../outputs/altmetric/altmetric_doi_project_detail_table.md", "w", encoding='utf-8') as fo, \
     open("../outputs/altmetric/altmetric_doi_project_top10_table.md", "w", encoding='utf-8') as fo_top10:
    header = "|Altmetric Score|Publication Title|Project Title|\n|--------------:|-----------------|------------------------|\n"
    fo.write(header)
    fo_top10.write(header)

    for i, row in enumerate(df_altmetric.itertuples()):
        # Build the links for every row; defining them only for the first ten
        # rows would make all later rows reuse the tenth row's URLs.
        nsf_url = f"https://nsf.gov/awardsearch/showAward?AWD_ID={row.nsfid}&HistoricalAwards=false"
        doi_url = f"https://doi.org/{row.Index}"
        line = f"| {row.altmetric_score:.2f} | {row.title} (doi: [{row.Index}]({doi_url})) | {row.project_title} ([NSF #{row.nsfid}]({nsf_url}))|\n"

        # Every row goes to the detail table; the first ten also go to the top-10 table.
        fo.write(line)
        if i < 10:
            fo_top10.write(line)
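
Titles are written into the table cells verbatim, so a title containing `|` would split a cell, and a missing title (such as the one for `10.2110/sedred.2013.4` in the citations table) would be rendered as `nan` if it reaches this loop. A hedged sketch of a cell-cleaning helper that could be applied to `row.title` and `row.project_title` before formatting; `md_cell` is an illustrative name, not part of the notebook:

```python
import math


def md_cell(value, placeholder=""):
    """Return text that is safe to place inside a Markdown table cell."""
    # Missing values from pandas arrive as float NaN (or None).
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return placeholder
    text = str(value)
    # Collapse line breaks and runs of whitespace, then escape the cell delimiter.
    return " ".join(text.split()).replace("|", "\\|")


print(md_cell("Open Data: A | B"))
print(md_cell(float("nan"), placeholder="(untitled)"))
```

The backslash escape follows the GitHub Flavored Markdown table rules; other renderers may need a different treatment.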