<h1 style="text-align:center; text-decoration:underline">Sebastiano's Notebook</h1>
<br>
<br>
<p style="font-size: 14.5px">This is a Jupyter Notebook to keep track of my thoughts and progress on the <b>Open Science</b> project (DHDK - a.y. 2022-2023). The research question to be tackled is the following:</p>
<div style="padding-left: 25px; padding-right: 43px; padding-top: 8px; display:block"><p style="text-align:justify; font-size:14px"><em>"What is the coverage of publications in Social Science and Humanities (SSH) journals (according to ERIH-PLUS) included in OpenCitations Meta? What are the disciplines that have more publications? What are countries providing the largest number of publications and journals? How many of the SSH journals are available in Open Access according to the data in DOAJ?"</em></p></div>
<br>
<div style="display:block; background-color: #e9fce9; border-radius: 3px; border: 1pt solid #a6f1a6; padding: 8px 10px; width: 30%; margin-left:2%; margin-top: 12px">
    <div style="display:inline-block; width: 50%">
        <p style="font-size:15px; padding-left: 15px; padding-top: 10px"><b>Go to date:</b></p>
        <ul style="font-size:14px">
            <li><a href="#27/03/2023" style="color:#333; font-style: italic;">27/03/2023</a></li>
            <li><a href="#28/03/2023" style="color:#333; font-style: italic;">28/03/2023</a></li>
            <li><a href="#31/03/2023" style="color:#333; font-style: italic;">31/03/2023</a></li>
            <li><a href="#04/04/2023" style="color:#333; font-style: italic;">04/04/2023</a></li>
            <li><a href="#11/04/2023" style="color:#333; font-style: italic;">11/04/2023</a></li>
            <li><a href="#15/04/2023" style="color:#333; font-style: italic;">15/04/2023</a></li>
        </ul>
    </div>
    <img src="https://www.transparentpng.com/thumb/calendar/green-calendar-vector-icon-png-20.png" alt="green calendar vector icon png @transparentpng.com" style="display:inline-block; width:45%; margin-top: -80px">
</div>
<hr style="border: 1pt solid #89CFF0">

<h3 id="27/03/2023" style="text-decoration:underline">27/03/2023</h3>
<p style="font-size: 14.5px">First of all, let's try to subdivide our research questions into smaller units:</p>
<table style="display:block; float:left; font-size: 14px; margin-bottom: 15px">
    <tr>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Question</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Resources to be used</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Possible solution</th>
      </tr>
      <tr>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Coverage of publications in SSH journals included in OpenCitations Meta</td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><a href="https://kanalregister.hkdir.no/publiseringskanaler/erihplus/">ERIH-PLUS</a>, <a href="http://opencitations.net/meta">OpenCitations Meta</a></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Extract publications in SSH journals; use the DOIs to find the intersection of the two datasets.</td>
      </tr>
      <tr>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Disciplines and countries providing more publications/journals</td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><a href="https://kanalregister.hkdir.no/publiseringskanaler/erihplus/">ERIH-PLUS</a></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Pandas DataFrame to SQL; SQL query</td>
      </tr>
      <tr>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">How many SSH journals are available in Open Access</td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><a href="https://kanalregister.hkdir.no/publiseringskanaler/erihplus/">ERIH-PLUS</a>, <a href="https://doaj.org/">DOAJ</a></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Intersection of journals (maybe using ISSN)</td>
      </tr>
</table>
<p style="font-size:14.5px; text-align: justify">The analysis of the research question clarifies the <b>purpose</b> of the project. Our goal is to investigate the <b>openness</b> of <b>SSH publications</b> and to analyse how different disciplines and countries approach Open Science. At the same time, a primary role is played by <b>citations</b>, as they represent a measure of the diffusion and sharing of a publication.</p>
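<p style="font-size:14.5px; text-align: justify">As a rough illustration of the "Pandas DataFrame to SQL; SQL query" idea in the second row of the table, here is a minimal sketch. It assumes an in-memory SQLite database, a table name (<code>journals</code>) and column names (<code>title</code>, <code>country</code>) chosen only for the example, and a few toy rows; it is not the project's actual pipeline.</p>

```python
import sqlite3

import pandas as pd

# Toy data: titles and countries are placeholders, not real ERIH-PLUS rows.
journals = pd.DataFrame(
    {
        "title": ["Journal A", "Journal B", "Journal C", "Journal D"],
        "country": ["Spain", "Spain", "Romania", "Chile"],
    }
)

# Load the DataFrame into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
journals.to_sql("journals", conn, index=False)

# Count journals per country, largest first (ties broken alphabetically).
query = """
    SELECT country, COUNT(*) AS n_journals
    FROM journals
    GROUP BY country
    ORDER BY n_journals DESC, country ASC
"""
per_country = pd.read_sql_query(query, conn)
conn.close()
print(per_country)
```

<p style="font-size:14.5px; text-align: justify">The same pattern would apply to disciplines, provided that each journal is first linked to its individual disciplines.</p>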
<div style="display:block; background-color: #D4F1F4; border-radius: 3px; border: 1pt solid #89CFF0; padding: 5px 10px; width: 90%; margin-left:5%; margin-top: 10px"><p><b>ON THIS DATE</b>: analysis of the research question; possible draft of our project's abstract published on <a href="https://github.com/open-sci/2022-2023/commit/bcfc173e2d3e615ec3c32985fb68346a1eb201fd">GitHub</a></p></div>
<hr style="border: 1pt solid #89CFF0">

<h3 id="28/03/2023" style="text-decoration:underline">28/03/2023</h3>
<p style="text-align:justify; font-size: 14.5px">Let's start working on our datasets. The first one I would like to analyze is ERIH-PLUS. This is an academic journal index: the data is available as a <code>.csv</code> file at the following <a href="https://kanalregister.hkdir.no/publiseringskanaler/erihplus/periodical/listApproved">link</a>. Let's try to load this file using Pandas.</p>

In [1]:
import pandas as pd

erih_plus = pd.read_csv("2023-03-27 ERIH PLUS.csv", sep=";")
erih_plus.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11072 entries, 0 to 11071
Data columns (total 9 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   Journal ID              11072 non-null  int64 
 1   Print ISSN              8747 non-null   object
 2   Online ISSN             9515 non-null   object
 3   Original Title          11072 non-null  object
 4   International Title     11072 non-null  object
 5   Country of Publication  10952 non-null  object
 6   ERIH PLUS Disciplines   11072 non-null  object
 7   OECD Classifications    11072 non-null  object
 8   [Last Updated]          11072 non-null  object
dtypes: int64(1), object(8)
memory usage: 778.6+ KB


In [2]:
erih_plus.head()

Unnamed: 0,Journal ID,Print ISSN,Online ISSN,Original Title,International Title,Country of Publication,ERIH PLUS Disciplines,OECD Classifications,[Last Updated]
0,486254,1989-3477,,@tic.revista d'innovació educativa,@tic.revista d'innovació educativa,Spain,Interdisciplinary research in the Social Scien...,Educational Sciences; Other Social Sciences,2015-06-25 13:48:26
1,488561,,2341-0515,[i2] Investigación e Innovación en Arquitectur...,[i2] Investigación e Innovación en Arquitectur...,Spain,"Art and Art History, Cultural Studies, Human G...","Arts (Arts, History of Arts, Performing Arts, ...",2016-04-18 17:34:55
2,504135,,2068-3472,[Inter]sections,[Inter]sections,Romania,"Gender Studies, Cultural Studies, Literature, ...",Languages and Literature; Other Humanities; So...,2022-10-18 08:40:28
3,495209,2250-4591,2346-9986,+E: Revista de Extensión Universitaria,+E: Revista de Extensión Universitaria,Argentina,"Interdisciplinary research in the Humanities, ...",Other Humanities; Other Social Sciences,2023-02-02 17:14:12
4,488332,,0719-5737,100-Cs,100-Cs,Chile,Interdisciplinary research in the Social Sciences,Other Social Sciences,2018-04-26 16:33:26


<p style="font-size: 14.5px">Let's check the content of a cell in the <code>ERIH PLUS Disciplines</code> column:</p>

In [3]:
erih_plus.at[1,"ERIH PLUS Disciplines"]

'Art and Art History, Cultural Studies, Human Geography and Urban Studies, Interdisciplinary research in the Humanities, Interdisciplinary research in the Social Sciences, Pedagogical & Educational Research, Science and Technology Studies'
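<p style="font-size: 14.5px; text-align: justify">Because the cell packs several disciplines into one string, counting journals per discipline would first require splitting it. A minimal sketch, assuming that <code>", "</code> is a safe separator (i.e. no discipline name contains a comma itself) and using two toy rows instead of the full file:</p>

```python
import pandas as pd

# Toy rows: the discipline names come from the outputs shown above,
# while the Journal IDs (1 and 2) are placeholders.
sample = pd.DataFrame(
    {
        "Journal ID": [1, 2],
        "ERIH PLUS Disciplines": [
            "Art and Art History, Cultural Studies",
            "Interdisciplinary research in the Social Sciences",
        ],
    }
)

# Split each cell into a list, then emit one row per (journal, discipline).
exploded = sample.assign(
    discipline=sample["ERIH PLUS Disciplines"].str.split(", ")
).explode("discipline")

# Number of journals per discipline.
per_discipline = exploded["discipline"].value_counts()
print(per_discipline)
```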

<p style="font-size: 14.5px; text-align: justify">As we can see, <b>multiple disciplines</b> are assigned to the same journal: thus, we need to look at single publications to get more detailed information. To get this information we could try to exploit this <a href="https://erih.dimensions.ai/discover/publication">link</a>. Here we can find all the publications included in the journals of the ERIH-PLUS database. However, the system seems to allow users to export a maximum of 500 publications at a time, meaning that it's almost impossible to retrieve the information about all the 10,769,086 publications currently included in the system (that would take more than 21,500 separate exports). Nevertheless, let's try to analyze one of these exported files:</p>

In [8]:
erih_plus_500_publications = pd.read_csv("ERIH-PLUS 500.csv")
erih_plus_500_publications.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 33 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   Rank                              500 non-null    int64  
 1   Publication ID                    500 non-null    object 
 2   DOI                               500 non-null    object 
 3   PMID                              462 non-null    float64
 4   PMCID                             343 non-null    object 
 5   Title                             500 non-null    object 
 6   Abstract                          500 non-null    object 
 7   Acknowledgements                  309 non-null    object 
 8   Funding                           267 non-null    object 
 9   Source title                      500 non-null    object 
 10  Anthology title                   0 non-null      float64
 11  MeSH terms                        233 non-null    object 
 12  Publicat

In [9]:
erih_plus_500_publications.head()

Unnamed: 0,Rank,Publication ID,DOI,PMID,PMCID,Title,Abstract,Acknowledgements,Funding,Source title,...,Corresponding Authors,Authors Affiliations,Times cited,Recent citations,RCR,FCR,Source Linkout,ERIH PLUS by Dimensions URL,Fields of Research (ANZSRC 2020),Sustainable Development Goals
0,500,pub.1155098456,10.1162/jocn_a_01972,36735619.0,PMC10024573,Time Courses of Attended and Ignored Object Re...,Selective attention prioritizes information th...,,"Sean Noah, National Eye Institute (https://dx....",Journal of Cognitive Neuroscience,...,,"Noah, Sean (University of California, Davis; U...",0,0,,,https://psyarxiv.com/2aj3n/download,https://erih.dimensions.ai/details/publication...,52 Psychology; 5202 Biological Psychology; 520...,
1,500,pub.1154380038,10.1162/jocn_a_01959,36626349.0,,Cochlear Theta Activity Oscillates in Phase Op...,It is widely established that sensory percepti...,,DOC Fellowship Programme of the Austrian Acade...,Journal of Cognitive Neuroscience,...,,"Köhler, Moritz Herbert Albrecht (University of...",0,0,,,https://doi.org/10.1101/2022.02.21.481289,https://erih.dimensions.ai/details/publication...,32 Biomedical and Clinical Sciences; 3202 Clin...,
2,500,pub.1147283528,10.1037/pspp0000420,35446080.0,,Trait-Specificity Versus Global Positivity: A ...,"For decades, a recurring question in person pe...",,,Journal of Personality and Social Psychology,...,"Thielmann, Isabel (; University of Münster; Un...","Thielmann, Isabel (University of Münster; Univ...",2,2,,,https://psyarxiv.com/z78na/download,https://erih.dimensions.ai/details/publication...,52 Psychology; 5201 Applied and Developmental ...,
3,500,pub.1153677667,10.1037/bne0000544,36521141.0,,Pair housing does not alter incubation of crav...,Evidence suggests that single housing in rats ...,,,Behavioral Neuroscience,...,,"Nett, Kelle E (); LaLumiere, Ryan T ()",0,0,,,https://doi.org/10.1101/2022.07.28.501777,https://erih.dimensions.ai/details/publication...,32 Biomedical and Clinical Sciences; 3214 Phar...,3 Good Health and Well Being
4,500,pub.1156401252,10.1371/journal.pone.0283259,36947531.0,PMC10032514,Human scent signature on cartridge case surviv...,This paper focuses on a chemical analysis of h...,,,PLOS ONE,...,,"Ladislavová, Nikola (University of Chemistry a...",0,0,,,https://journals.plos.org/plosone/article/file...,https://erih.dimensions.ai/details/publication...,34 Chemical Sciences; 3401 Analytical Chemistry,


<p style="font-size:14.5px">As we can see, the third column is named <code>DOI</code>, so this DataFrame contains some of the specific information we might be interested in.</p>

In [11]:
erih_plus_500_publications.at[0, "Fields of Research (ANZSRC 2020)"]

'52 Psychology; 5202 Biological Psychology; 5204 Cognitive and Computational Psychology'
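<p style="font-size:14.5px; text-align: justify">The <code>Fields of Research (ANZSRC 2020)</code> value follows a regular pattern: entries separated by semicolons, each made of a numeric code followed by a label. A minimal sketch of how such a string could be parsed, assuming this pattern holds for every row (the helper name <code>parse_fields</code> is made up for the example):</p>

```python
def parse_fields(value):
    """Split an ANZSRC string into (code, label) pairs."""
    pairs = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # The code is the first whitespace-separated token, the label is the rest.
        code, _, label = entry.partition(" ")
        pairs.append((code, label))
    return pairs


fields = parse_fields(
    "52 Psychology; 5202 Biological Psychology; "
    "5204 Cognitive and Computational Psychology"
)
# Two-digit codes identify the top-level ANZSRC divisions.
top_level = [label for code, label in fields if len(code) == 2]
print(fields)
print(top_level)
```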

<div style="display:block; background-color: #D4F1F4; border-radius: 3px; border: 1pt solid #89CFF0; padding: 5px 10px; width: 90%; margin-left:5%; margin-top: 10px"><p><b>ON THIS DATE</b>: analysis of the ERIH-PLUS dump file and search by DOI; abstract updated in <a href="https://github.com/open-sci/2022-2023/commit/f08398b6ad90fef436f10be3c6940bfc55d86fa8">Github</a></p></div>
<hr style="border: 1pt solid #89CFF0">

<h3 id="31/03/2023" style="text-decoration:underline">31/03/2023</h3>
<p style="text-align:justify; font-size: 14.5px">First task of the day: create a personal ORCID (available at this <a href="https://orcid.org/0009-0007-7813-0939">link</a>). The next step will be about creating the team's <b>Data Management Plan</b> for the project. Let's try to collect some ideas before the next meeting: the first source is the following article, <em>Ten Simple Rules for Creating a Good Data Management Plan</em> by William K. Michener (<a href="https://doi.org/10.1371/journal.pcbi.1004525">doi.org/10.1371/journal.pcbi.1004525</a>).</p>
<div style="padding-left: 25px; padding-right: 43px; padding-top: 8px; display:block"><p style="text-align:justify; font-size:14px"><em>"A data management plan (DMP) is a document that describes how you will treat your data during a project and what happens with the data after the project ends."</em></p></div>
<p style="text-align:justify; font-size: 14.5px">The author identifies 10 main rules for producing a good DMP: although they are all crucial, which of them could be the most useful ones for our project?</p>
<ul>
    <li><b>Rule 2: Identify the Data to Be Collected</b> - There are some essential characteristics one should understand when working with data: <em>types</em>, <em>sources</em> (in our case, the sources can be defined by analyzing the research question), <em>volume</em> (i.e., volume of data and number of files), <em>data and file formats</em> (<code>.csv</code> files are expected to be the main format of our data).</li>
    <li><b>Rules 3, 4: How the Data will be Organized and Documented</b> - At this stage, one should design a possible way to organize data. At the same time data should be documented (through metadata) to avoid meaninglessness.</li>
    <li><b>Rule 6: Present a Sound Data Storage and Preservation Strategy</b> - This rule deals with the preservation of our data. In our case, <a href="https://zenodo.org/">Zenodo</a> will probably be our main ally. However, there are three main questions we should answer:
        <em>
            <ol>
                <li>How long will the data be accessible?</li>
                <li>How will data be stored and protected over the duration of the project?</li>
                <li> How will data be preserved and made available for future use?</li>
            </ol>
        </em>
    </li>
    <li><b>Rule 9: Assign Roles and Responsibilities</b> - It is important to define who will tackle specific parts of the project. However, contributions by non-project staff should also be recorded in the DMP.</li>
</ul>
<p style="text-align:justify; font-size: 14.5px">Note that the DMP will be updated several times:</p>
<div style="padding-left: 25px; padding-right: 43px; padding-top: 8px; display:block"><p style="text-align:justify; font-size:14px"><em>"Treat your DMP as a living document and revisit it frequently."</em></p></div>
<div style="display:block; background-color: #D4F1F4; border-radius: 3px; border: 1pt solid #89CFF0; padding: 5px 10px; width: 90%; margin-left:5%; margin-top: 10px"><p><b>ON THIS DATE</b>: creation of a personal ORCID; introductory <a href="https://doi.org/10.1371/journal.pcbi.1004525">reading</a> on how to prepare a DMP</p></div>
<hr style="border: 1pt solid #89CFF0">

<h3 id="04/04/2023" style="text-decoration:underline">04/04/2023</h3>
<p style="text-align:justify; font-size: 14.5px">During the last meeting (02 April), my team and I started creating a Data Management Plan on <a href="https://argos.openaire.eu/splash/">Argos OpenAIRE</a>. Together we wrote the metadata of the DMP and planned the creation of its datasets:</p>
<ol>
    <li>A dataset for the data collected</li>
    <li>A dataset for the software generated to create/analyse the data</li>
</ol>
<p style="text-align:justify; font-size: 14.5px">We decided to adopt the <a href="https://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/data-management_en.htm">Horizon 2020 FAIR DMP template</a> and to split the work on the two datasets: each of us has to work on a set of questions. Thus, today I answered the questions about:</p>
<ul>
    <li>Allocation of resources (1st dataset)</li>
    <li>Data security (1st dataset)</li>
    <li>Ethical aspects (1st dataset)</li>    
    <li>Data summary (2nd dataset)</li>
</ul>
<p style="text-align:justify; font-size: 14.5px">After completing the first draft of these answers, I still have some doubts about the features of the second dataset. In particular, it seems rather complex to properly identify the "<em>types of the described generated/collected data</em>". I will probably revise the answer to this question (<b>1.1.2</b>) after the next meeting with my team, as I am not entirely sure whether this dataset fits the classification provided by the <b>University of Virginia Library</b>: <a href="https://data.library.virginia.edu/data-management/plan/format-types/">here</a>.</p>
<div style="display:block; background-color: #D4F1F4; border-radius: 3px; border: 1pt solid #89CFF0; padding: 5px 10px; width: 90%; margin-left:5%; margin-top: 10px"><p><b>ON THIS DATE</b>: compilation of the DMP datasets' templates (<a href="https://argos.openaire.eu/splash/">Argos OpenAIRE</a>)</p></div>
<hr style="border: 1pt solid #89CFF0">

<h3 id="11/04/2023" style="text-decoration:underline">11/04/2023</h3>
<p style="text-align:justify; font-size: 14.5px">During the last meeting, we decided how to design our project's <b>workflow</b>. The first step is about retrieving the set of publications from ERIH-PLUS journals that are also included in the OpenCitations Meta dataset. We expect our result to be a DataFrame (Pandas) made up of four columns:</p>
<ul>
    <li>OpenCitations Publication Venue Internal ID</li>
    <li>ISSN (OpenCitations format, e.g.: <code>issn:1865-3804</code>*)</li>
    <li>ERIH-PLUS Publication Venue Internal ID</li>
    <li>ISSN (ERIH-PLUS format, e.g.: <code>1865-3804</code>*)</li>
</ul>
<p style="text-align:justify; font-size: 12px">* Actually, columns 2 and 4 could be merged into a single one.</p>
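<p style="text-align:justify; font-size: 14.5px">Merging columns 2 and 4 amounts to normalising both ISSN spellings to a single form. A minimal sketch, assuming that the only differences are an optional <code>issn:</code> prefix, surrounding whitespace, and letter case (the helper <code>normalize_issn</code> is hypothetical):</p>

```python
import re

# Four digits, a hyphen, three digits, then a final digit or X.
ISSN_PATTERN = re.compile(r"\d{4}-\d{3}[\dX]")


def normalize_issn(raw):
    """Return the bare ISSN (e.g. '1865-3804'), or None if it looks malformed."""
    value = raw.strip().upper()
    if value.startswith("ISSN:"):
        value = value[len("ISSN:"):].strip()
    return value if ISSN_PATTERN.fullmatch(value) else None


print(normalize_issn("issn:1865-3804"))
print(normalize_issn("1865-3804"))
```

<p style="text-align:justify; font-size: 14.5px">Returning <code>None</code> for malformed values is a design choice for this sketch: it makes unexpected identifiers easy to spot before any join.</p>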
<p style="text-align:justify; font-size: 14.5px">How can we get these results given the first analysis of data organization (see <a href="#28/03/2023" style="font-style: italic;">28/03/2023</a> for further details)? We found out that the OpenCitations Meta dataset can be queried through a <b>SPARQL</b> endpoint, available at this <a href="http://opencitations.net/meta/sparql">link</a>, so I ran the following query on it:</p>
<br>
<div><code>PREFIX datacite: &lt;http://purl.org/spar/datacite/&gt;</code><br>
    <code>PREFIX dcterms: &lt;http://purl.org/dc/terms/&gt;</code><br>
<code>PREFIX literal: &lt;http://www.essepuntato.it/2010/06/literalreification/&gt;</code><br>
<code>PREFIX prism: &lt;http://prismstandard.org/namespaces/basic/2.0/&gt;</code><br>
<code>SELECT * {</code><br>
 <code>?venue datacite:hasIdentifier ?internal_identifier .</code><br>
 <code>?internal_identifier datacite:usesIdentifierScheme datacite:issn .</code><br>
 <code>?internal_identifier literal:hasLiteralValue ?issn .</code><br>
<code>}</code><br>
</div><br>
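<p style="text-align:justify; font-size: 14.5px">The same query could also be sent programmatically. A minimal sketch using only the standard library, assuming that the endpoint follows the SPARQL 1.1 Protocol (query passed as a <code>query</code> URL parameter) and can return CSV when asked through the <code>Accept</code> header; the helper names are made up for the example, and the network call is defined but not executed here:</p>

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://opencitations.net/meta/sparql"

# Same query as above, keeping only the prefixes it actually uses.
QUERY = """
PREFIX datacite: <http://purl.org/spar/datacite/>
PREFIX literal: <http://www.essepuntato.it/2010/06/literalreification/>
SELECT * {
  ?venue datacite:hasIdentifier ?internal_identifier .
  ?internal_identifier datacite:usesIdentifierScheme datacite:issn .
  ?internal_identifier literal:hasLiteralValue ?issn .
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks the endpoint for CSV results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "text/csv"})


def fetch_csv(query, path):
    """Run the query and save the response body to `path` (needs network access)."""
    with urllib.request.urlopen(build_request(query)) as response:
        with open(path, "wb") as out:
            out.write(response.read())


request = build_request(QUERY)
print(request.full_url[:60])
```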
<p style="text-align:justify; font-size: 14.5px">The results have been downloaded and stored in a file named <code>queryResults.csv</code>. At the same time, I downloaded the list of ISSNs included in the ERIH-PLUS index (<a href="https://kanalregister.hkdir.no/publiseringskanaler/erihplus/periodical/listApprovedISSN">here</a>) and stored them in a plain text file.</p>

In [1]:
import pandas as pd
df = pd.read_csv("C:/Users/sebas/Downloads/queryResults.csv")
df

Unnamed: 0,"""venue""","""internal_identifier""","""issn""",Unnamed: 3
0,"""https://w3id.org/oc/meta/br/062501038""","""https://w3id.org/oc/meta/id/062501""","""0009-2797""",
1,"""https://w3id.org/oc/meta/br/062501129""","""https://w3id.org/oc/meta/id/062503""","""1521-3781""",
2,"""https://w3id.org/oc/meta/br/062501129""","""https://w3id.org/oc/meta/id/062504""","""0009-2851""",
3,"""https://w3id.org/oc/meta/br/062501155""","""https://w3id.org/oc/meta/id/062506""","""1522-2640""",
4,"""https://w3id.org/oc/meta/br/062501155""","""https://w3id.org/oc/meta/id/062507""","""0009-286X""",
...,...,...,...,...
136023,"""https://w3id.org/oc/meta/br/061803464677""","""https://w3id.org/oc/meta/id/061803195739""","""2582-6271""",
136024,"""https://w3id.org/oc/meta/br/061803464695""","""https://w3id.org/oc/meta/id/061803195740""","""2582-7421""",
136025,"""https://w3id.org/oc/meta/br/061803464699""","""https://w3id.org/oc/meta/id/061803195741""","""2582-7804""",
136026,"""https://w3id.org/oc/meta/br/061803464707""","""https://w3id.org/oc/meta/id/061803195742""","""2582-7960""",


In [2]:
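# The ERIH-PLUS ISSNs are stored on the first line of the file, separated by commas.
with open('C:/Users/sebas/Downloads/issnLists.txt') as f:
    issn_str = f.readline().strip()  # strip() drops a trailing newline, if present
issn_list = issn_str.split(",")
issn_list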

['0001-2343',
 '0001-415X',
 '0001-4273',
 '0001-4788',
 '0001-4966',
 '0001-5210',
 '0001-5229',
 '0001-5830',
 '0001-6241',
 '0001-6438',
 '0001-6446',
 '0001-6829',
 '0001-690X',
 '0001-6918',
 '0001-6993',
 '0001-8244',
 '0001-8392',
 '0001-8791',
 '0001-9720',
 '0001-9909',
 '0001-9933',
 '0002-0184',
 '0002-0206',
 '0002-0591',
 '0002-1482',
 '0002-1490',
 '0002-4805',
 '0002-6980',
 '0002-7189',
 '0002-7294',
 '0002-7316',
 '0002-8312',
 '0002-8762',
 '0002-9114',
 '0002-9300',
 '0002-9319',
 '0002-9432',
 '0002-9475',
 '0002-9483',
 '0002-953X',
 '0002-9556',
 '0002-9831',
 '0003-0139',
 '0003-0279',
 '0003-0481',
 '0003-0554',
 '0003-0651',
 '0003-066X',
 '0003-0678',
 '0003-1186',
 '0003-1224',
 '0003-1283',
 '0003-1615',
 '0003-2468',
 '0003-2565',
 '0003-2573',
 '0003-2638',
 '0003-3472',
 '0003-3790',
 '0003-4436',
 '0003-4487',
 '0003-4800',
 '0003-5033',
 '0003-5459',
 '0003-5491',
 '0003-5521',
 '0003-5548',
 '0003-5815',
 '0003-598X',
 '0003-6390',
 '0003-6870',
 '0003

In [3]:
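# Keep only the venues whose ISSN also appears in the ERIH-PLUS list.
# DataFrame.append was removed in pandas 2.0, so filter with a boolean mask instead.
issn_set = set(issn_list)  # set membership is much faster than scanning a list
# The third column holds quoted values such as '"0009-2797"': drop spaces and quotes.
issn_values = df.iloc[:, 2].str.strip().str.strip('"')
new_df = df[issn_values.isin(issn_set)].reset_index(drop=True)
new_df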

Unnamed: 0,"""venue""","""internal_identifier""","""issn""",Unnamed: 3
0,"""https://w3id.org/oc/meta/br/06501147""","""https://w3id.org/oc/meta/id/06506""","""1557-7805""",
1,"""https://w3id.org/oc/meta/br/06501147""","""https://w3id.org/oc/meta/id/06507""","""0091-3367""",
2,"""https://w3id.org/oc/meta/br/06501162""","""https://w3id.org/oc/meta/id/06508""","""1541-3535""",
3,"""https://w3id.org/oc/meta/br/06501162""","""https://w3id.org/oc/meta/id/06509""","""0091-4150""",
4,"""https://w3id.org/oc/meta/br/061201047""","""https://w3id.org/oc/meta/id/061202""","""1471-6402""",
...,...,...,...,...
13877,"""https://w3id.org/oc/meta/br/06103892867""","""https://w3id.org/oc/meta/id/06103492509""","""2079-3634""",
13878,"""https://w3id.org/oc/meta/br/06803871825""","""https://w3id.org/oc/meta/id/06803494574""","""2544-2031""",
13879,"""https://w3id.org/oc/meta/br/06903488133""","""https://w3id.org/oc/meta/id/06903218004""","""2545-8329""",
13880,"""https://w3id.org/oc/meta/br/061103475568""","""https://w3id.org/oc/meta/id/061103216283""","""2551-895X""",
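<p style="text-align:justify; font-size: 14.5px">To get closer to the four-column table planned at the start of this entry, the filtered venues could be joined with the ERIH-PLUS records on the ISSN. A minimal sketch, assuming cleaned (unquoted) ISSN values and using toy rows: the OpenCitations venue IRI is a placeholder, while the ERIH-PLUS row reuses the values of journal 495209 from the <code>head()</code> output of 28/03/2023.</p>

```python
import pandas as pd

# Toy OpenCitations Meta side: one (placeholder) venue carrying both of its ISSNs.
oc = pd.DataFrame(
    {
        "venue": ["https://example.org/br/1", "https://example.org/br/1"],
        "issn": ["2250-4591", "2346-9986"],
    }
)

# Toy ERIH-PLUS side, with the two ISSN columns of the real file.
erih = pd.DataFrame(
    {
        "Journal ID": [495209],
        "Print ISSN": ["2250-4591"],
        "Online ISSN": ["2346-9986"],
    }
)

# Reshape to one row per (journal, ISSN) so that both ISSN types can match.
erih_long = erih.melt(
    id_vars="Journal ID",
    value_vars=["Print ISSN", "Online ISSN"],
    value_name="issn",
).dropna(subset=["issn"])

# An inner join keeps only the venues present in both datasets.
matched = oc.merge(erih_long[["Journal ID", "issn"]], on="issn", how="inner")
print(matched)
```

<p style="text-align:justify; font-size: 14.5px">Reshaping the two ISSN columns into one avoids running two separate joins, and the <code>dropna</code> step skips journals that lack one of the two ISSNs.</p>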


<div style="display:block; background-color: #D4F1F4; border-radius: 3px; border: 1pt solid #89CFF0; padding: 5px 10px; width: 90%; margin-left:5%; margin-top: 10px"><p><b>ON THIS DATE</b>: alternative ways to process META dataset, SPARQL query on META SPARQL endpoint</p></div>
<hr style="border: 1pt solid #89CFF0">

<h3 id="15/04/2023" style="text-decoration:underline">15/04/2023</h3>
<p style="text-align:justify; font-size: 14.5px">The next task for the Open Science course is to produce a <b>peer review</b> of the <em>protocols.io</em> workflow created by the other team. The first version of this document is currently available at the following link: DOI <a href="https://dx.doi.org/10.17504/protocols.io.n92ldpeenl5b/v1">dx.doi.org/10.17504/protocols.io.n92ldpeenl5b/v1</a>. To carry out this task, I've read some literature on peer review. The main source I've followed while writing my review is this article:</p>
<ul>
    <li>Stiller-Reeve, M. (2018). <em>How to write a thorough peer review</em>. Nature. <a href="https://doi.org/10.1038/d41586-018-06991-0">https://doi.org/10.1038/d41586-018-06991-0</a></li>
</ul>
<p style="text-align:justify; font-size: 14.5px">One of the main benefits of this article is that it provides a <a href="https://www.scisnack.com/wp-content/uploads/2018/10/A-Peer-Review-Process-Guide.pdf">worksheet</a> that presents the main steps to follow when writing a review. The author suggests reading the article/document three times, focusing on different aspects with each reading. Here are some tables containing my notes, following the schema provided by this article.</p>
<table style="display:block; float:left; font-size: 14px; margin-bottom: 15px">
    <tr style="background-color: #D6EEEE;">
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">#</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Question</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Notes</th>
      </tr>
      <tr style="background-color: #f7fbfd">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>1</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Is the article in line with the journal’s scope?</em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Yes: it is perfectly in line with the course scope.</td>
      </tr>
      <tr style="background-color: #f7fbfd">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>2</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Does your expertise cover all aspects of the article? If not, describe which sections you can respond to and why?</em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">I know Pandas and how to deal with .csv datasets. I'm familiar with META and ERIH-PLUS, while I don't know COCI.</td>
      </tr>
      <tr style="background-color: #f7fbfd">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>3</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>“Mirror” the article. Make a first draft describing the main aim of the article and why it’s innovative.</em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">The article aims to identify and answer 3 questions included in the main research question. It seeks to facilitate access to and reuse of complex datasets by providing an additional source of information for new resources.</td>
      </tr>
      <tr style="background-color: #f7fbfd">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>4</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Is the article publishable in principle?</em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Yes: though some steps should be further explained.</td>
      </tr>
</table>

<table style="display:block; float:left; font-size: 14px; margin-bottom: 15px">
    <tr style="background-color: #eef5ee;">
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">#</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Question</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Notes</th>
      </tr>
      <tr style="background-color: #fafcfa">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>5</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Do the Introduction and Abstract clearly identify the need and relevance for this research? </em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>Major Issue</b>: the abstract doesn't make very clear why we should need to work on these specific kinds of data. It might be unclear to readers not involved in this domain.</td>
      </tr>
      <tr style="background-color: #fafcfa">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>6</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Does the Methodology target the main question(s) appropriately?</em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>Minor issue</b>: though it can be derived implicitly, readers may benefit from a thorough presentation of the data required to address specific sub-questions.</td>
      </tr>
      <tr style="background-color: #fafcfa">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>7</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Are the Results clearly and logically presented, and are they justified by the data presented? Are the figures clear and fully described? </em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>Minor issues</b>: What's the final format of the results? Will the results be exported somewhere?</td>
      </tr>
      <tr style="background-color: #fafcfa">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>8</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Do the Conclusions justifiably respond to main questions the author(s) posed? Do the Conclusions go too far or not far enough based on the results?</em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">No issues</td>
      </tr>
</table>

<table style="display:block; float:left; font-size: 14px; margin-bottom: 15px">
    <tr style="background-color: #ffe1e7;">
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">#</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Question</th>
        <th style="border: 1px solid #dddddd; text-align: left; padding: 8px;">Notes</th>
      </tr>
      <tr style="background-color: #fffafb">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>9</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>Is the manuscript’s story cohesive and tightly reasoned throughout? If not, where does it deviate from the central argument? </em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>Major Issue</b>: the last section is still empty. It should be either removed or populated with some new content.</td>
      </tr>
      <tr style="background-color: #fffafb">
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><b>10</b></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;"><em>How are the grammar and spelling in the manuscript? </em></td>
        <td style="border: 1px solid #dddddd; text-align: left; padding: 8px;">No issue</td>
      </tr>
</table>