# Topics

The topics were defined by the authors and revised several times during the investigation. The scope of individual topics is difficult to compare directly, but their relative sizes help identify neglected, vibrant and emerging areas.

In [6]:
import numpy as np
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px
from opensustain_template import *

In [7]:
df_active = pd.read_csv("../csv/project_analysis.csv")

In [8]:

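# Count projects per topic. Naming both columns explicitly keeps the
# result identical across pandas versions (value_counts() changed the
# name of its result in pandas 2.0).
topic_his = (
    df_active["topic"]
    .value_counts()
    .rename_axis("topic_names")
    .reset_index(name="projects")
)

# Horizontal bar chart: one bar per topic, bar length = number of projects
fig = px.bar(
    topic_his,
    x="projects",
    y="topic_names",
    orientation="h",
)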

fig.update_layout(
    height=1000,
    yaxis_title=None,
    xaxis_title="Projects",
    title="Projects within Topics",
    coloraxis_colorbar=dict(title="DDS"),
    hoverlabel=dict(bgcolor="white"),
    margin=dict(l=300, r=0, b=0, t=40),  # wide left margin for long topic names
    showlegend=False,
)
# marker_color is not defined in this notebook; it presumably comes from
# the star import of opensustain_template
fig.update_traces(marker_color=marker_color)
fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper", yref="paper",
        x=1, y=1,
        sizex=0.06, sizey=0.06,
        xanchor="right", yanchor="top"
    )
)
fig.show()

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: projects-within-topics

Number of individual projects within topics
```

**Projects on climate, biosphere, hydrosphere and water supply as well as energy system modelling, mobility & transportation and buildings make up ~45% of all identified projects.** This is likely due to the research maturity of these fields, the multitude of scientific organisations behind them, and the relatively good availability of open data from natural and engineered systems in these categories. The open source ecosystem is comparatively strong in energy modelling and in renewable energy such as photovoltaics and wind energy. However, despite the central importance of batteries for energy storage, only a small number of OSS projects in that field are under development.

Areas where software plays a central role but only a small number of projects can be identified are of particular interest.

**For example, in [Sustainable Investment](https://opensustain.tech/#sustainable-investment), which represents only 1.15% of all projects (11 in total), open source is still a marginal factor.** Despite ongoing discussions about the quality and transparency of ESG (Environmental, Social and Governance) ratings, the field is dominated by proprietary closed-source frameworks and datasets. The lack of open source and open science in sustainable investment reflects the lack of impact measurement and evaluation, both of which are key to financing a sustainable transformation. **The field of energy and resource consumption in industrial production also shows a very low level of OSS development, at only 0.28%.**

**In Emissions Observation and Modeling, there are only 21 developments, representing 2.1% of all projects.** Despite the significant impact of man-made emissions on the climate, the existing open source tools, platforms, and communities do not reflect the magnitude of the challenge. A significant business opportunity would exist if an open source community brought together the various emissions monitoring and modelling datasets from around the world on a single platform. Such a platform would be critical for transparency in pressing issues such as carbon trading, carbon taxes, and company sustainability assessments. Electricity Maps has demonstrated with great success how this approach works for the share of renewables in local energy grids: hundreds of scientists and developers have joined forces and integrated existing data into one platform. Promising new developments are also emerging, such as [The Global Registry of Fossil Fuels](https://fossilfuelregistry.org/).
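The percentage shares quoted in this section are simple ratios of per-topic project counts to the total number of projects. The following is a minimal sketch of that calculation, assuming one row per project and a `topic` column as in `project_analysis.csv`. The helper `topic_shares` and the small `df_example` frame are hypothetical and use made-up values, so the printed numbers are illustrative and not the results of this study.

```python
import pandas as pd

# Hypothetical stand-in for project_analysis.csv: one row per project.
# Topic names and counts are illustrative only, not the real data.
df_example = pd.DataFrame(
    {
        "project_name": [f"project_{i}" for i in range(8)],
        "topic": [
            "Climate", "Climate", "Climate", "Climate",
            "Energy", "Energy", "Energy",
            "Investment",
        ],
    }
)


def topic_shares(df: pd.DataFrame, column: str = "topic") -> pd.DataFrame:
    """Return project counts and percentage shares per topic, largest first."""
    counts = df[column].value_counts()
    # Share of each topic relative to all projects, in percent
    percent = (counts / counts.sum() * 100).round(2)
    return pd.DataFrame({"projects": counts, "share_percent": percent})


shares = topic_shares(df_example)
print(shares)

# Combined share of a group of topics, analogous to summing several
# topics into one headline figure
combined = shares.loc[["Climate", "Energy"], "share_percent"].sum()
print(f"Combined share: {combined:.1f}%")
```

Applied to the real data, the same helper could be called as `topic_shares(df_active)`, which makes it straightforward to re-check the quoted shares whenever the project list is updated.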

**Topics with low OSS representation also include bioenergy, hydrogen, and carbon capture.** This is likely due to the more nascent nature of these areas and the smaller academic communities working in them. These technologies carry a higher degree of uncertainty, and data are held by for-profit companies within the sector. The small number of open source projects makes it difficult to quantify sustainable developments in this area transparently and independently.

**Lastly, domains such as carbon offsets and climate neutrality calculations could not be investigated due to a general lack of OSS projects.** Despite intensive research, no OSS project or organisation (with the exception of [CarbonPlan](https://carbonplan.org/)) could be found that provides comprehensive and scientifically sound calculations and methodologies for the climate neutrality and carbon offsets of individual companies. Statements about the environmental impact of companies are primarily based on black-box algorithms and analyses performed by companies and consultancies, which makes sustainability statements about carbon offsets rather opaque.


In [233]:
fig = px.sunburst(
    df_active,
    path=['sector', 'topic', 'project_name'],
    color='sector',
    maxdepth=2,
    color_discrete_sequence=color_discrete_sequence
)

fig.update_layout(
    title="Projects within categories and topics",
    height=800,
    title_font_size=22,
    font_size=11,
)
# Show only the label on each segment, oriented radially
fig.update_traces(insidetextorientation='radial', textinfo='label')

fig.show()

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: projects-within-sectors

Number of individual projects within topics and categories
```