# Step 1 - Fetch the reference list with the Crossref API
This step shows how to fetch the reference list of a review paper online, with the reference number, DOI, title, and year of each entry, using the Crossref API.

Enter the DOI of the paper on the line marked `# Example DOI`, and set `output_csv_path` to the location where the .csv file should be saved.

This step only creates the reference list with its details. In the next step you will have to enter the data from the tables inside the papers, either manually or with a table-extraction tool such as Tabula.


In [20]:
import requests
import pandas as pd

# Function to fetch references using the CrossRef API
def get_references_from_doi(doi):
    url = f"https://api.crossref.org/works/{doi}"
    # Use a timeout so the call fails instead of hanging if the API does not respond
    response = requests.get(url, timeout=30)
    
    if response.status_code == 200:
        data = response.json()
        references = data.get('message', {}).get('reference', [])
        return references
    else:
        print(f"Error fetching data for DOI {doi}: {response.status_code}")
        return None

# Helper function to extract necessary information for the table
def extract_reference_info(ref, order):
    # Extract the title
    title = ref.get('article-title', 'No Title')
    
    # Reference order number
    reference_number = order + 1  # Start from 1
    
    # Extract DOI if available
    ref_doi = ref.get('DOI', 'No DOI')
    
    # Extract the year of publication
    year = ref.get('year', 'No Year')
    
    # Return a dictionary with only the desired columns
    return {
        "Title": title,
        "Reference Number": reference_number,
        "DOI": ref_doi,
        "Year": year
    }

# Example DOI
doi = "doi:10.3390/ma13092078"  # Replace with your DOI
references = get_references_from_doi(doi)

# Create a DataFrame from the extracted references information
if references:
    reference_data = [extract_reference_info(ref, i) for i, ref in enumerate(references)]
    df = pd.DataFrame(reference_data, columns=["Title", "Reference Number", "DOI", "Year"])
    
    # Display the table
    print(df)
    
    # Optionally, save to CSV
    output_csv_path = 'C:\\Users\\pedro\\Desktop\\Materials World\\Reference list Yaqoob 2020.csv'
    df.to_csv(output_csv_path, index=False)
    print(f"References table saved to {output_csv_path}")
else:
    print("No references found.")

                                                 Title  Reference Number  \
0    Performance improvement of microbial fuel cell...                 1   
1                                             No Title                 2   
2    Conversion of wastes into bioelectricity and c...                 3   
3    High performance platinum group metal-free cat...                 4   
4    Denitrification of water in a microbial fuel c...                 5   
..                                                 ...               ...   
181                                           No Title               182   
182  Electrochemistry and microbiology of microbial...               183   
183  Microbial fuel cell is emerging as a versatile...               184   
184  An overview of electrode materials in microbia...               185   
185  Challenges in the application of microbial fue...               186   

                                 DOI     Year  
0         10.1016/j.rser.2017.05.098   
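
The DOI in the cell above is entered with a `doi:` prefix, and DOIs copied from papers often arrive in other forms too (a full `https://doi.org/` link, stray spaces). The following is a minimal sketch, not part of the original notebook, of how the input could be reduced to a bare DOI before the request URL is built. The helper names `normalize_doi` and `build_works_url` are hypothetical; only the `https://api.crossref.org/works/` address is taken from the code above.

```python
from urllib.parse import quote

# Prefixes that are commonly attached to a DOI (assumed list, extend as needed)
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

def normalize_doi(raw_doi):
    """Return the bare DOI (hypothetical helper), e.g. '10.3390/ma13092078'."""
    doi = raw_doi.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            # Cut the prefix off the original string so the DOI keeps its case
            doi = doi[len(prefix):]
            break
    return doi.strip()

def build_works_url(raw_doi):
    """Build the request URL used in Step 1 from any of the accepted DOI forms."""
    # quote() leaves '/' unescaped by default, so the DOI keeps its usual shape
    return "https://api.crossref.org/works/" + quote(normalize_doi(raw_doi))

print(build_works_url("doi:10.3390/ma13092078"))
```

With a helper like this, `get_references_from_doi` could call `build_works_url(doi)` instead of formatting the URL directly, so the same DOI works however it was pasted.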

# Step 2 - Merge two data frames based on common values in Pandas
Now you will merge the two data frames, *a) the reference list* and *b) the table with info*, on the **reference number** column that is present in both.

This step must be repeated for every new paper studied.

Below is an example of merging the reference list and the table with info from the review paper by Yaqoob et al. 2020 (doi:10.3390/ma13092078).

In [23]:
reference_list_yaqoob2020 = pd.read_csv("Reference list Yaqoob 2020.csv")
reference_list_yaqoob2020.head()


Unnamed: 0,Title,Reference Number,DOI,Year
0,Performance improvement of microbial fuel cell...,1,10.1016/j.rser.2017.05.098,2017
1,No Title,2,10.4018/978-1-5225-5766-1.ch014,No Year
2,Conversion of wastes into bioelectricity and c...,3,10.1126/science.1217412,2012
3,High performance platinum group metal-free cat...,4,10.1149/2.0061703jes,2017
4,Denitrification of water in a microbial fuel c...,5,10.1016/j.jclepro.2017.12.221,2018


In [25]:
table_info_yaqoob2020 = pd.read_csv("Table info_Yaqoob 2020.csv")
table_info_yaqoob2020.head()

Unnamed: 0,Type of Material,Anode,Size of Anode,Surface Area of Anode cm2,Inoculum Source/,Power Density mw/m2,Reference Number
0,Carbon-based,Carbon cloth,2 cm × 2 cm,4.0,S. putrefaciens CN32,679.7,104
1,Composites,rGO/SnO2/Carbon cloth composite,3 cm × 2 cm,6.0,E. coli,1624.0,105
2,Carbon-based,Graphene,-,4.0,E. coli,2850.0,106
3,Composites,r GO/PPy,1 cm × 1.5 cm,,E. coli,1068.0,73
4,Carbon-based,Graphene coating on Carbon cloth,1 cm × 2 cm,4.0,P. aeruginosa,52.5,107


In [29]:
yaqoob2020_merged = pd.merge(table_info_yaqoob2020, reference_list_yaqoob2020, on="Reference Number")
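# index=False avoids writing the row index, which would come back as an "Unnamed: 0" column on re-read
yaqoob2020_merged.to_csv('yaqoob 2020 merged.csv', index=False)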
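
`pd.merge` performs an inner join by default, so any table row whose reference number is missing from the reference list is dropped without warning. The sketch below is one possible way to check for that, using a left join with `indicator=True`. The two small frames and their values are made-up stand-ins for illustration, not data from the paper.

```python
import pandas as pd

# Made-up stand-ins for the table with info and the reference list
table_info = pd.DataFrame({
    "Anode": ["Carbon cloth", "Graphene", "Carbon felt"],
    "Reference Number": [104, 106, 999],  # 999 is deliberately absent below
})
reference_list = pd.DataFrame({
    "Reference Number": [104, 106],
    "DOI": ["10.1000/example-a", "10.1000/example-b"],  # placeholder DOIs
})

# how="left" keeps every table row; validate raises an error if a reference
# number appears more than once in the reference list
checked = pd.merge(
    table_info,
    reference_list,
    on="Reference Number",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Rows tagged "left_only" found no matching entry in the reference list
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched[["Anode", "Reference Number"]])
```

If `unmatched` is empty, the inner merge used above loses nothing. Otherwise the listed reference numbers are worth checking against the paper before saving the merged file.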

# Step 3: Unite all the merged data frames created from the multiple papers
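
The notebook does not show code for this step. As a rough sketch, assuming each paper's merged data frame from Step 2 has been loaded into memory, the frames could be stacked with `pd.concat`. The function name `unite_merged_frames`, the `Source Review` column, and the sample values are hypothetical and only serve the illustration.

```python
import pandas as pd

def unite_merged_frames(frames_by_paper):
    """Stack merged frames into one (hypothetical helper).

    frames_by_paper maps a label for each review paper to its merged DataFrame.
    """
    # Tag every row with the review it came from so its origin stays traceable
    labelled = [
        frame.assign(**{"Source Review": label})
        for label, frame in frames_by_paper.items()
    ]
    # ignore_index renumbers the rows; columns missing in a frame become NaN
    return pd.concat(labelled, ignore_index=True, sort=False)

# Made-up example frames standing in for two merged files
paper_a = pd.DataFrame({"Anode": ["Graphene", "Carbon cloth"],
                        "DOI": ["10.1000/example-a", "10.1000/example-b"]})
paper_b = pd.DataFrame({"Anode": ["Carbon felt"],
                        "DOI": ["10.1000/example-a"]})

all_papers = unite_merged_frames({"Review A": paper_a, "Review B": paper_b})
print(all_papers)
```

In practice each frame would come from `pd.read_csv` on a merged file such as the one saved in Step 2.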

# Step 4: Removing duplicates (DOI)
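
No code is given for this step either. One possible approach, sketched below under stated assumptions, is `DataFrame.drop_duplicates` on the `DOI` column. Rows carrying the `No DOI` placeholder written in Step 1 are kept as they are, because they are not known to be the same paper. The helper name `drop_duplicate_dois` and the sample values are hypothetical.

```python
import pandas as pd

def drop_duplicate_dois(df, key_columns=("DOI",), placeholder="No DOI"):
    """Drop rows that repeat the same key columns (hypothetical helper).

    Rows with a missing DOI or the placeholder value are never dropped.
    """
    has_doi = df["DOI"].notna() & (df["DOI"] != placeholder)
    # Keep the first occurrence of each key among rows with a real DOI
    with_doi = df[has_doi].drop_duplicates(subset=list(key_columns), keep="first")
    without_doi = df[~has_doi]
    # sort_index restores the original row order before renumbering
    return pd.concat([with_doi, without_doi]).sort_index().reset_index(drop=True)

# Made-up rows: the first two repeat the same DOI
combined = pd.DataFrame({
    "Anode": ["Graphene", "Graphene", "Carbon cloth", "Carbon felt", "Graphene"],
    "DOI": ["10.1000/example-a", "10.1000/example-a", "No DOI", "No DOI",
            "10.1000/example-b"],
})

deduplicated = drop_duplicate_dois(combined)
print(deduplicated)
```

A single paper can legitimately contribute several table rows (for example, different anodes), so deduplicating on DOI alone may remove real data. Passing more columns, such as `key_columns=("DOI", "Anode")`, is one way to keep those rows.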