<img src="docs/logo.jpg" width="400" height="400" align="center"/> 

# <center> `seesus` for Text Analysis on Sustainability </center>

`seesus` is an open-source Python package that evaluates whether a textual expression aligns with the concept of sustainability as defined by the United Nations Sustainable Development Goals (SDGs). It currently has four main functions:
1. [Evaluate whether a statement aligns with sustainability](#1)
2. [Identify SDGs and associated targets in a statement](#2)
3. [Classify a statement into social, environmental, and economic sustainability](#3)
4. [Examine and customize match syntax](#4)

`seesus` relies on regular expressions rather than language models. It attains an accuracy of 75.5%, measured as agreement with manual coding.
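To give a feel for what regular-expression matching looks like in general, here is a minimal sketch. The labels, patterns, and the `match_labels` helper are hypothetical illustrations and are not the actual `seesus` match syntax or API.

```python
import re

# Hypothetical patterns for illustration only; NOT the actual seesus syntax.
EXAMPLE_PATTERNS = {
    "climate": re.compile(r"\bclimate change\b|\bcarbon emissions?\b", re.IGNORECASE),
    "poverty": re.compile(r"\bpoverty\b|\blow[- ]income\b", re.IGNORECASE),
}


def match_labels(text):
    """Return the labels whose pattern occurs anywhere in the text."""
    return [label for label, pattern in EXAMPLE_PATTERNS.items() if pattern.search(text)]


matched = match_labels("We aim to mitigate climate change by reducing carbon emissions.")
print(matched)
```

Because matching of this kind depends only on the wording of the text, results are deterministic and the patterns themselves can be inspected and edited.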

For analysis in R, please check [`SDGdetector`](https://github.com/Yingjie4Science/SDGdetector).

## Example usage

For best results, analyze one sentence at a time rather than a lengthy paragraph. Paragraphs can be split into sentences with tools such as `nltk.tokenize` or `re.split`.
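As a rough sketch of the splitting step, the snippet below uses `re.split` from the standard library. The `split_sentences` helper and the splitting rule (break after `.`, `!`, or `?` followed by whitespace) are assumptions made for illustration; they are not part of `seesus`.

```python
import re


def split_sentences(paragraph):
    """Split a paragraph after ., ! or ? when followed by whitespace (naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    # Drop empty strings, e.g. when the input is empty.
    return [part for part in parts if part]


paragraph = (
    "We aim to reduce carbon emissions in the city. "
    "Our ambition is to achieve the Sustainable Development Goal 1."
)
sentences = split_sentences(paragraph)
for sentence in sentences:
    print(sentence)
```

Each resulting sentence can then be analyzed on its own. This naive rule also splits after abbreviations such as "e.g.", so a dedicated tokenizer such as `nltk.tokenize.sent_tokenize` is likely the better choice for real text.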

A statement can match the concept of sustainability directly (e.g., "progress toward the Sustainable Development Goal 1") or indirectly (e.g., "mitigate climate change"). We provide two examples here.

In [1]:
# install the package
# !pip install seesus

In [2]:
from seesus import SeeSus

In [3]:
# an example with indirect match
text = "We aim to contribute to the mitigation of climate change by reducing carbon emissions in the city."

In [7]:
# an example with direct match
# text = "Our ambition is to achieve the Sustainable Development Goal 1"

In [11]:
result = SeeSus(text)

In [12]:
print(result)

String matched the following SDGs



<a name="1"></a>
### To evaluate whether a statement aligns with sustainability

In [None]:
print(result.sus)

<a name="2"></a>
### To identify SDGs and associated targets in a statement

In [None]:
print(result.sdg)
print(result.sdg_desc)

In [None]:
print(result.target)
print(result.target_desc)

In [None]:
# check match type
print(result.match)

<a name="3"></a>
### To classify a statement into social, environmental, and economic sustainability

In [None]:
print(result.see)

<a name="4"></a>
### To examine and customize match syntax

In [None]:
SeeSus.show_syntax("SDG1_general")

In [None]:
SeeSus.edit_syntax(sdg_id="SDG1_general", new_syntax="my match terms", match_type='indirect')

Note that if the given match type (i.e., "direct" or "indirect") does not yet exist for the specified SDG id in the original database (i.e., `SDG_keys`), the new syntax is *added* to the database. If the match type already exists, the new syntax *replaces* the original syntax.
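The add-versus-replace behavior can be pictured with a toy model. The nested dictionary and the `edit_toy_syntax` helper below are hypothetical stand-ins; the real structure of `SDG_keys` and the internals of `edit_syntax` may differ.

```python
# Hypothetical stand-in for the syntax database; the real SDG_keys may differ.
toy_keys = {"SDG1_general": {"direct": "original direct terms"}}


def edit_toy_syntax(keys, sdg_id, new_syntax, match_type):
    """Add the entry if the match type is missing, otherwise overwrite it."""
    action = "replaced" if match_type in keys.get(sdg_id, {}) else "added"
    keys.setdefault(sdg_id, {})[match_type] = new_syntax
    return action


# The first call adds a new "indirect" entry; the second overwrites it.
first = edit_toy_syntax(toy_keys, "SDG1_general", "my match terms", "indirect")
second = edit_toy_syntax(toy_keys, "SDG1_general", "other terms", "indirect")
print(first, second)
```

In this toy model the existing "direct" entry is left untouched by both calls, since only the named match type is added or overwritten.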

In [None]:
SeeSus.show_syntax("SDG1_general")

In [None]:
new_result = SeeSus("my match terms are in text")

In [None]:
print(new_result.sus)
print(new_result.target)
print(new_result.match)