### Implementation of the [FAIRsoft evaluation tool](https://openebench.bsc.es/observatory/Evaluation/) on the CSDMS model repository

This notebook shows an example adaptation of the FAIRsoft evaluation workflow. It uses the Observatory API and a custom GitHub metadata extraction routine to compute FAIRsoft scores for research software in the CSDMS model repository.

In [1]:
# import Python libraries
import pandas as pd
# import library to handle making API requests
from src import observatory_api_client
# import library for custom metadata extraction and CSDMS repository querying
from src import custom_scripts

The CSDMS research software objects and their associated metadata can be imported via the CSDMS Wiki API:

In [2]:
# retrieve dataframe of CSDMS model repository metadata
df_csdms = custom_scripts.query_csdms_repository()
print(df_csdms.head())
# cache the metadata in the tmp folder (the folder must already exist)
df_csdms.to_csv("tmp/csdms_models.csv",index=False)

                                     ModelName  ModelDomain  \
0                            1D Hillslope MCMC  Terrestrial   
1  1D Particle-Based Hillslope Evolution Model      Coastal   
2                  1DBreachingTurbidityCurrent      Coastal   
3                                    2DFLOWVEL      Coastal   
4                                       ACADIA       Marine   

                                   SourceWebAddress    SourceCodeAvailability  \
0                                                    Through CSDMS repository   
1                                                    Through CSDMS repository   
2                                                    Through CSDMS repository   
3                                                    Through CSDMS repository   
4  http://www-nml.dartmouth.edu/Software/acadia5.0/    Through web repository   

  CodeReviewed ProgrammingLanguage                DOIModel  \
0            1                 C++                           
1         
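
Since the metadata is written to `tmp/csdms_models.csv`, later runs could reuse that file instead of querying the Wiki API again. The following is a minimal sketch of that idea, not part of the original workflow: `load_or_fetch` is a hypothetical helper, and `fetch` stands for any callable that returns a dataframe.

```python
from pathlib import Path

import pandas as pd


def load_or_fetch(path, fetch):
    """Return the cached CSV at `path` if it exists; otherwise call `fetch()` and cache the result.

    Hypothetical helper for illustration; `fetch` is any zero-argument
    callable that returns a pandas DataFrame.
    """
    path = Path(path)
    if path.exists():
        # keep_default_na=False keeps empty cells as "" instead of NaN,
        # so later string filters such as .str.contains() still work
        return pd.read_csv(path, keep_default_na=False)
    df = fetch()
    # to_csv does not create missing folders, so create the parent first
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)
    return df
```

In this notebook it could be called as `load_or_fetch("tmp/csdms_models.csv", custom_scripts.query_csdms_repository)`. Note that with `keep_default_na=False`, columns containing empty cells are read back as strings.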

We can now keep only the CSDMS research software objects hosted on GitHub by matching `github` in the source URL (`str.contains` treats the pattern as a regular expression by default):

In [3]:
df_csdms_github = df_csdms[df_csdms.SourceWebAddress.str.contains('github')]
print(df_csdms_github.head())
# cache the filtered table in the tmp folder (the folder must already exist)
df_csdms_github.to_csv("tmp/csdms_models_github.csv",index=False)

                                   ModelName  ModelDomain  \
7                                   ALFRESCO   Cryosphere   
8                              AR2-sinuosity    Hydrology   
9   ATS (The Advanced Terrestrial Simulator)   Cryosphere   
13                                    AeoLiS  Terrestrial   
21                                AlluvStrat      Coastal   

                              SourceWebAddress    SourceCodeAvailability  \
7          https://github.com/ua-snap/alfresco    Through web repository   
8     https://github.com/alimaye/AR2-sinuosity    Through web repository   
9                https://github.com/amanzi/ats    Through web repository   
13  https://github.com/openearth/aeolis-python    Through web repository   
21      https://github.com/awickert/alluvstrat  Through CSDMS repository   

   CodeReviewed ProgrammingLanguage             DOIModel  \
7             1                 C++                        
8             1              Matlab                     
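
The filter above matches any address that contains the substring `github`. As a sketch of a stricter variant, the pattern below requires `github.com` as the host, and `na=False` treats missing addresses as non-matches. The rows are made up for illustration and are not taken from the CSDMS repository.

```python
import pandas as pd

# Illustrative rows only; not taken from the CSDMS repository
df = pd.DataFrame({
    "ModelName": ["A", "B", "C", "D"],
    "SourceWebAddress": [
        "https://github.com/owner/repo",
        "",
        None,
        "http://example.org/mirror-of-github-project",
    ],
})

# Substring match as in the cell above; na=False turns missing values into False
loose = df[df.SourceWebAddress.str.contains("github", na=False)]

# Stricter pattern: the URL must start with a github.com host
pattern = r"^https?://(?:www\.)?github\.com/"
strict = df[df.SourceWebAddress.str.contains(pattern, case=False, na=False)]

print(loose.ModelName.tolist())
print(strict.ModelName.tolist())
```

The loose filter keeps rows A and D, while the strict one keeps only row A. Whether the stricter pattern changes the result on the real CSDMS table has not been checked here.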

Each GitHub repository URL is then passed through the FAIRsoft evaluation workflow, and the resulting scores are stored in a dataframe:

In [4]:
# FAIRsoft indicators to record for each repository
indicators = ['F','A','I','R']
# create copy of df_csdms_github to store results
df_csdms_github_results = df_csdms_github.copy()

for index, row in df_csdms_github.iterrows():
    try:
        # The GitHub repository URL can be input into the routine for metadata extraction and mapping to the observatory metadata schema
        metadata = custom_scripts.get_repository_metadata(row.SourceWebAddress)
        # Run evaluator and retrieve scores
        fairsoft_scores,_ = observatory_api_client.get_fairsoft_scores_and_evaluation(metadata)
        print('Evaluated %s' % row.SourceWebAddress)
        # Store scores in dataframe
        for indicator in indicators:
            df_csdms_github_results.at[index,indicator] = fairsoft_scores[indicator]
    except Exception:
        # failed repositories are reported and left without scores
        print('Failed to evaluate %s' % row.SourceWebAddress)

# save evaluation results to file
df_csdms_github_results.to_csv('out/fairsoft_results_csdms_models_github.csv',index=False)

HTTP error occurred: 400 Client Error: Bad Request for url: https://observatory.openebench.bsc.es/api/fair/evaluate
Failed to evaluate https://github.com/ua-snap/alfresco
Evaluated https://github.com/alimaye/AR2-sinuosity
Evaluated https://github.com/amanzi/ats
Evaluated https://github.com/openearth/aeolis-python
HTTP error occurred: 400 Client Error: Bad Request for url: https://observatory.openebench.bsc.es/api/fair/evaluate
Failed to evaluate https://github.com/awickert/alluvstrat
Evaluated https://github.com/mperignon/anugaSed
Evaluated https://github.com/APSIMInitiative/ApsimX
Evaluated https://github.com/csdms-contrib/Auto_marsh
Evaluated https://github.com/mcflugen/sedflux
Evaluated https://github.com/cmshobe/brake-model
Failed to evaluate http://github.com/badlands-model
Evaluated https://github.com/csdms-contrib/Barrier_Inlet_Environment_BRIE_Model
Evaluated https://github.com/UNC-CECL/Barrier3D
Evaluated https://github.com/UNC-CECL/BarrierBMFT
Evaluated https://github.com/mcf
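
In the log above, `http://github.com/badlands-model` fails without a preceding HTTP error line, and the address names an organization rather than a single repository. That this is the cause of the failure is an assumption. A small pre-check, sketched below with a hypothetical helper `split_github_url`, could flag such addresses before the evaluation loop runs. It does not explain the `400 Bad Request` responses for the other failed repositories.

```python
from urllib.parse import urlparse


def split_github_url(url):
    """Return (owner, repo) for a GitHub repository URL, or None otherwise.

    Hypothetical helper for illustration; it only inspects the URL text
    and makes no network request.
    """
    parsed = urlparse(url.strip())
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    # drop empty segments caused by leading or trailing slashes
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        # only an organization or user name, no repository
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo


print(split_github_url("https://github.com/amanzi/ats"))
print(split_github_url("http://github.com/badlands-model"))
```

The first call returns `('amanzi', 'ats')` and the second returns `None`, so rows of the second kind could be skipped or corrected by hand before evaluation.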

The FAIRsoft indicators are shown below to help interpret the scores (see the [FAIRsoft indicators documentation](https://inab.github.io/FAIRsoft_indicators/)):

![Image of FAIRsoft indicators](images/fairsoft_indicators.png)
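
To read the stored results alongside the indicators, the scores can be condensed into a short summary. The sketch below assumes the `F`, `A`, `I` and `R` columns hold numeric scores and that repositories which failed evaluation have missing values in those columns. `summarize_scores` is a hypothetical helper, and the example table is invented for illustration.

```python
import pandas as pd


def summarize_scores(df, indicators=("F", "A", "I", "R")):
    """Count evaluated rows and average each indicator over them.

    Hypothetical helper; assumes the indicator columns exist and are numeric.
    """
    cols = list(indicators)
    # rows where every indicator is missing were not evaluated
    evaluated = df.dropna(subset=cols, how="all")
    return {
        "n_total": len(df),
        "n_evaluated": len(evaluated),
        "mean_scores": evaluated[cols].astype(float).mean().to_dict(),
    }


# Invented values for illustration only; not real FAIRsoft results
example = pd.DataFrame({
    "ModelName": ["a", "b", "c"],
    "F": [0.5, None, 1.0],
    "A": [0.25, None, 0.75],
    "I": [0.0, None, 1.0],
    "R": [1.0, None, 0.0],
})
summary = summarize_scores(example)
print(summary)
```

On the real output, the same function could be applied to `df_csdms_github_results`, or to the table read back from `out/fairsoft_results_csdms_models_github.csv`.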