In [1]:
import os
import pandas as pd
from systematic_review import *

### Pond Screening

In [33]:
fname = "../extraction/data/pond/pond_screening3.csv"
df = pd.read_csv(fname, index_col=0)

In [34]:
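# Rows (text chunks) flagged as containing a definition
defs = df.loc[df.definition_bool == True]
# Number of flagged chunks per doi label
print(defs.doi.value_counts())
print()
for i in defs.index:
    # Single quotes inside the f-string: reusing double quotes is a SyntaxError before Python 3.12
    print(f"{defs.loc[i, 'doi']}: {defs.loc[i, 'definition']}")
    print()
    print(f"{defs.loc[i, 'doi']} text: {defs.loc[i, 'text']}")
    # Four blank lines between entries
    print("\n" * 3)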

doi
definitions2    3
ponds2          2
definitions3    1
definitions1    1
Name: count, dtype: int64

definitions3: According to the provided text, ponds are generally defined as **small (1 m<sup>2</sup> to ∼50,000 m<sup>2</sup>), shallow, standing water bodies that can either permanently or temporarily collect freshwater.** 

This definition distinguishes ponds by their size (small, 1-50,000 m<sup>2</sup>) and depth (shallow) compared to other water bodies like lakes and rivers. It also specifies that they contain standing freshwater, and can be either permanent or temporary.

definitions3 text: ## INTRODUCTION
Growing urbanization and climate variability have placed immense pressure on the finite supply of groundwater available for agricultural irrigation. As a result, the exploration of alternative irrigation water sources, including recycled water and pond water, has become a global priority (Scheierling et al., 2006;Parsons et al., 2010;Ortega-Reig et al., 2014). While there is n

In [35]:
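# Rows (text chunks) flagged as containing a table
tabs = df.loc[df.table_bool == True]
# Number of flagged chunks per doi label
print(tabs.doi.value_counts())
print()
for i in tabs.index:
    # Single quotes inside the f-string: reusing double quotes is a SyntaxError before Python 3.12
    print(f"{tabs.loc[i, 'doi']} text: {tabs.loc[i, 'text']}")
    # Four blank lines between entries
    print("\n" * 3)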

doi
lakes3          18
ponds3          13
lakes1          12
definitions1    10
lakes2           9
ponds2           9
ponds1           8
definitions3     1
Name: count, dtype: int64

ponds1 text: ## INTRODUCTION
The functioning of small water bodies may differ in a variety of aspects from that of lakes (Biggs et al., 2005;Likens, 2009). Even though small aquatic ecosystems create favourable conditions for a number of organisms of various ecological requirements, they have never received much attention from researchers compared with larger reservoirs (Oertli et al., 2002). This especially applies to a specific and very unique kind of pond -meteorite craters, which were the object of this research. The origin of ponds is very diverse and they can be created by a wide range of natural processes such as glaciation, volcanic activity, land subsidence, wind or river action, and tree falls, as well as by human activities such as mineral extraction or water storage (Oertli et al., 2005). Meteo

### Coastal Screening

In [3]:
fname = "../extraction/data/coastal/screening5.csv"
df = pd.read_csv(fname, index_col=0)

In [4]:
# Each row of df is one text chunk; papers are counted as unique DOIs.

# Chunks where at least one of the relevant flags is True
relevant = df.loc[df['definition_bool'] | df['table_bool'] | df['measurement_bool']]
print(f"Number of relevant papers: {relevant.doi.nunique()}")

# Chunks with a definition
definitions = df.loc[df['definition_bool']]
print(f"Number of papers with a definition: {definitions.doi.nunique()}")

# Chunks with a table
tables = df.loc[df['table_bool']]
print(f"Number of papers with a table: {tables.doi.nunique()}")

# Chunks with a measurement
measurements = df.loc[df['measurement_bool']]
print(f"Number of papers with a measurement: {measurements.doi.nunique()}")

# Chunks with a measurement or a definition
definitions_or_measurements = df.loc[df['definition_bool'] | df['measurement_bool']]
print(f"Number of papers with a definition or a measurement: {definitions_or_measurements.doi.nunique()}")

Number of relevant papers: 32
Number of papers with a definition: 9
Number of papers with a table: 31
Number of papers with a measurement: 10
Number of papers with a definition or a measurement: 15
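
The five counts above repeat one pattern: filter chunks by one or more flags, then count unique DOIs. A minimal sketch of how that pattern could be wrapped in a helper, assuming the chunk-level layout used in this notebook (a `doi` column plus boolean `*_bool` columns). The helper name `count_papers` and the toy data are hypothetical and not part of `systematic_review`:

```python
import pandas as pd

def count_papers(df, *flags):
    """Count unique DOIs having at least one chunk where any of `flags` is True."""
    mask = df[list(flags)].any(axis=1)
    return df.loc[mask, "doi"].nunique()

# Toy chunk-level data (hypothetical): three papers, five chunks
toy = pd.DataFrame({
    "doi": ["a", "a", "b", "c", "c"],
    "definition_bool": [True, False, False, False, False],
    "table_bool": [False, False, True, False, False],
    "measurement_bool": [False, True, False, False, False],
})

n_definition = count_papers(toy, "definition_bool")
n_def_or_meas = count_papers(toy, "definition_bool", "measurement_bool")
n_relevant = count_papers(toy, "definition_bool", "table_bool", "measurement_bool")
print(n_definition, n_def_or_meas, n_relevant)
```

Passing several flags reproduces the `|` combinations used in the cell, so the "relevant" count is simply the call with all three flags.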


In [5]:
definitions_or_measurements

Unnamed: 0,doi,chunk,abstract_bool,definition_bool,table_bool,measurement_bool,definition
124,10.1002/lno.12322,0,True,False,False,True,
125,10.1002/lno.12322,1,True,True,False,True,The coastal ecosystems being studied are **tid...
126,10.1002/lno.12322,2,True,True,False,False,The coastal ecosystems being studied are:\n\n*...
127,10.1002/lno.12322,3,True,True,False,False,The coastal ecosystems being studied are **sal...
131,10.1002/lno.12322,7,True,False,False,True,
...,...,...,...,...,...,...,...
1184,10.1002/lno.12241,32,True,False,False,True,
1189,10.1002/lno.12471,1,True,True,False,False,The context focuses on **estuaries** and **del...
1190,10.1002/lno.12471,2,True,True,False,False,The coastal ecosystems being studied are **est...
1204,10.1002/lno.12471,16,True,True,True,False,"Based on the provided excerpt, the following c..."
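
The frame above has one row per chunk, so a single paper's flags are spread over several rows. A minimal sketch of collapsing the flags to one row per paper with `groupby(...).any()`, and of how "both flags somewhere in the paper" can differ from "both flags in the same chunk". The toy data and variable names are hypothetical; only the column names follow this notebook:

```python
import pandas as pd

# Toy chunk-level data (hypothetical)
chunks = pd.DataFrame({
    "doi": ["x", "x", "y", "y", "z"],
    "chunk": [0, 1, 0, 1, 0],
    "definition_bool": [False, True, False, False, True],
    "measurement_bool": [True, False, False, False, True],
})

flag_cols = ["definition_bool", "measurement_bool"]

# One row per paper: a flag is True if any chunk of that paper has it
per_paper = chunks.groupby("doi")[flag_cols].any()

# Papers with a definition AND a measurement anywhere in the paper
both_anywhere = per_paper[per_paper.all(axis=1)].index.tolist()

# Papers with a definition AND a measurement in the same chunk
both_same_chunk = chunks.loc[chunks[flag_cols].all(axis=1), "doi"].unique().tolist()

print(both_anywhere)
print(both_same_chunk)
```

In the toy data, paper `x` has its definition and its measurement in different chunks, so it appears in the first list only; a row-wise `&` filter on the chunk-level frame would miss it.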


In [22]:
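# Fixed seed so the same 10 chunks are drawn on every run (the seed value is arbitrary)
sample = definitions_or_measurements.sample(10, random_state=0)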

In [6]:
# Every chunk flagged with both a definition and a measurement
definitions_and_measurements = df.loc[df['definition_bool'] & df['measurement_bool']]

# Iterate over rows directly; `row` avoids overwriting the `sample` variable
for _, row in definitions_and_measurements.iterrows():
    print(f"DOI: {row.doi}")
    print(f"Definition: {row.definition}")
    print()

DOI: 10.1002/lno.12322
Definition: The coastal ecosystems being studied are **tidal marshes, mangroves, and seagrass meadows**. 

The specific quantitative attributes or descriptive characteristics used to define them are:

*   **Small areal coverage:** They comprise only 1% of the global ocean surface.
*   **High carbon sink effectiveness:** They are the most effective long-term carbon sinks of the biosphere.
*   **Significant carbon budget contribution:** They account for 50% of the total marine soil (sediment) organic carbon budget.
*   **Carbon sequestration drivers:** High rates of plant primary production, vertical soil development with rising sea levels, and slow rates of decomposition in reducing soils.

DOI: 10.1002/lno.12717
Definition: The context is studying **coastal wetlands**. 

The specific quantitative or descriptive characteristics used to define them are:

*   They are influenced by both **terrestrial and marine hydrology**.
*   They serve as **vital biogeochemical i