<img src="docs/logo.jpg" width="400" height="400" align="center"/> 

# <center> `seesus` for Text Analysis on Sustainability </center>

`seesus` is an open-source Python package that evaluates whether a textual expression aligns with the concept of sustainability as defined by the United Nations Sustainable Development Goals (SDGs). It currently has four main functions:
1. [Evaluate whether a statement aligns with sustainability](#1)
2. [Identify SDGs and associated targets in a statement](#2)
3. [Classify a statement into social, environmental, and economic sustainability](#3)
4. [Examine and customize match syntax](#4)

`seesus` relies on regular expressions rather than language models. Measured against manual coding, it attains an accuracy rate of 75.5%.

For analysis in R, please check [`SDGdetector`](https://github.com/Yingjie4Science/SDGdetector).

## Example usage

For best results, analyze one sentence at a time rather than a lengthy paragraph. Paragraphs can be split into sentences with tools such as `nltk.tokenize` or `re.split`.
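
As a rough illustration of the sentence-splitting step, the sketch below uses only the standard-library `re.split`. The helper name `split_sentences` and the splitting rule (break after `.`, `!`, or `?` followed by whitespace) are assumptions made for this example and are not part of `seesus`.

```python
import re

def split_sentences(paragraph):
    """Naively split a paragraph after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]

paragraph = (
    "We aim to contribute to the mitigation of climate change. "
    "Our ambition is to achieve the Sustainable Development Goal 1."
)

sentences = split_sentences(paragraph)
for sentence in sentences:
    # Each sentence could then be analyzed on its own, e.g. SeeSus(sentence).
    print(sentence)
```

A naive pattern like this will mis-split on abbreviations such as "e.g."; a dedicated sentence tokenizer is usually more robust for real documents.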

A statement can match the concept of sustainability directly (e.g., "progress toward Sustainable Development Goal 1") or indirectly (e.g., "mitigate climate change"). We provide two examples here.

In [1]:
from seesus import SeeSus

In [2]:
# an example with indirect match
text = "We aim to contribute to the mitigation of climate change by reducing carbon emissions in the city."

In [3]:
# an example with direct match
# text = "Our ambition is to achieve the Sustainable Development Goal 1"

In [4]:
result = SeeSus(text)

<a name="1"></a>
### To evaluate whether a statement aligns with sustainability

In [5]:
print(result.sus)

True


<a name="2"></a>
### To identify SDGs and associated targets in a statement

In [6]:
print(result.sdg)
print(result.sdg_desc)

['SDG13']
['Climate Action']


In [7]:
print(result.target)
print(result.target_desc)

['SDG13_general', 'SDG13_2']
['Take urgent action to combat climate change and its impacts', 'Integrate climate change measures into national policies, strategies and planning']


In [8]:
# check match type
print(result.match)

['indirect']


<a name="3"></a>
### To classify a statement into social, environmental, and economic sustainability

In [9]:
print(result.see)

{'social_sustainability': False, 'environmental_sustainability': True, 'economic_sustainability': False}


<a name="4"></a>
### To examine and customize match syntax

In [10]:
SeeSus.show_syntax("SDG1_general")

[{'SDG_id': 'SDG1_general', 'SDG_keywords': '(sdg|goal)[^0-9]{0,2}(?=1\\b)|No Poverty', 'match_type': 'direct'}]
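
To see how a pattern like the one printed above behaves, it can be tried directly with the standard-library `re` module. This is only a sketch: the helper `matches_sdg1` is hypothetical, and case-insensitive matching is an assumption here, since the output above does not show which flags `seesus` applies internally.

```python
import re

# Pattern copied from the show_syntax output above; in the printed repr,
# '\\b' stands for the regex word boundary \b.
SDG1_DIRECT = r"(sdg|goal)[^0-9]{0,2}(?=1\b)|No Poverty"

def matches_sdg1(text):
    # Assumption: case-insensitive matching; seesus may configure this differently.
    return re.search(SDG1_DIRECT, text, flags=re.IGNORECASE) is not None

# "Goal 1" matches: the lookahead requires a standalone "1".
print(matches_sdg1("Our ambition is to achieve the Sustainable Development Goal 1"))
# "Goal 11" does not match: the "1" is followed by another digit.
print(matches_sdg1("Progress on Goal 11 has been slow"))
# The goal title also matches.
print(matches_sdg1("No poverty is the first goal"))
```

The lookahead `(?=1\b)` is what keeps "Goal 1" from also matching "Goal 10" through "Goal 17".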


In [11]:
SeeSus.edit_syntax(sdg_id="SDG1_general", new_syntax="my match terms", match_type='indirect')

The indirect match syntax of SDG1_general has been updated.


Note that if the specified SDG id has no syntax of the given match type (i.e., "direct" or "indirect") in the original database (i.e., `SDG_keys`), the new syntax is *added* to the database. If that match type already exists, the new syntax *replaces* the original syntax.
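
The add-or-replace rule can be pictured with a small toy model. This is not the `seesus` implementation: the function `upsert_syntax` and the in-memory list `records` are hypothetical, and only the dictionary keys are borrowed from the `show_syntax` output.

```python
def upsert_syntax(records, sdg_id, new_syntax, match_type):
    """Replace the entry with the same id and match type, or add a new one."""
    for record in records:
        if record["SDG_id"] == sdg_id and record["match_type"] == match_type:
            record["SDG_keywords"] = new_syntax  # match type exists: replace
            return "replaced"
    # match type not found for this id: add a new entry
    records.append(
        {"SDG_id": sdg_id, "SDG_keywords": new_syntax, "match_type": match_type}
    )
    return "added"

# Toy database with a single direct-match entry (keywords shortened for brevity).
records = [
    {"SDG_id": "SDG1_general", "SDG_keywords": "No Poverty", "match_type": "direct"}
]

first = upsert_syntax(records, "SDG1_general", "my match terms", "indirect")
second = upsert_syntax(records, "SDG1_general", "other terms", "indirect")
print(first, second, len(records))
```

In this toy model the first call adds an indirect entry and the second call overwrites it, so the list ends with two entries for `SDG1_general`.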

In [12]:
SeeSus.show_syntax("SDG1_general")

[{'SDG_id': 'SDG1_general', 'SDG_keywords': '(sdg|goal)[^0-9]{0,2}(?=1\\b)|No Poverty', 'match_type': 'direct'}, {'SDG_id': 'SDG1_general', 'SDG_keywords': 'my match terms', 'match_type': 'indirect'}]


In [13]:
new_result = SeeSus("my match terms are in text")

In [14]:
print(new_result.sus)
print(new_result.target)
print(new_result.match)

True
['SDG1_general']
['indirect']
