# Programming Languages

**The number and kinds of programming languages used provide insight into the skills required of code contributors, as well as the nature of the projects themselves.** This metric can help both newcomers to an open source project and open source project managers understand the project's profile in the context of their own experience and organisations.

**Python dominates the OSS movement for sustainability and is used in 37.9% of all projects, followed by R (15%), Jupyter notebooks (9.76%) and other languages like Fortran, C++ and Java.** Statistics from [GitHut 2.0](https://madnight.github.io/githut/#/pull_requests/2022/1) or GitHub's official numbers give insight into programming language usage across open source projects on GitHub in general. Comparing these with the data here shows that, within the repositories analysed, Python is represented far more strongly than JavaScript. This indicates a strong focus on the analysis of large datasets, where Python and Jupyter notebooks are becoming a dominant choice, and less focus on the application side. Python, in particular, is the language of choice for projects in energy modelling, biosphere, hydrosphere, wind energy, buildings and heating. Python is considered to be an energy-inefficient programming language. In practice this matters less for computationally intensive operations, because the Python community relies on libraries such as NumPy, which make energy-efficient algorithms implemented in C available within Python.
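The point about NumPy can be illustrated with a minimal sketch. It computes the same dot product twice: once in an interpreted Python loop and once through a single `np.dot` call that hands the work to NumPy's compiled code. The helper `dot_pure_python` and the random input data are assumptions made for illustration only; nothing here measures energy use.

```python
import numpy as np


def dot_pure_python(a, b):
    """Dot product computed element by element in the Python interpreter."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


# Illustrative input data (random values with a fixed seed, not project data).
rng = np.random.default_rng(seed=0)
a = rng.random(100_000)
b = rng.random(100_000)

slow = dot_pure_python(a, b)  # every multiplication runs as interpreted bytecode
fast = float(np.dot(a, b))    # one call into NumPy's compiled routine

# Both approaches agree up to floating-point rounding.
print(np.isclose(slow, fast))
```

Timing the two calls, for example with the standard library's `timeit` module, would typically show the vectorised call finishing much faster, which is the practical reason the interpreter's overhead matters less for numerical workloads.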

**The use of R deviates significantly from other statistics, with a much higher prevalence here than in the software world in general.** A concentration of R developments can be found, in particular, under the topics of biosphere, hydrosphere, water supply, soil, land use, climate, and agriculture. This can be attributed to the high number of data science projects in this field and the low number of web development projects for sustainable development.
Despite its advanced age of over 65 years, Fortran is still widely used in Earth system models of the hydrosphere, climate and atmosphere. This can be explained by the long development time of these projects and the numerical efficiency such models need for high performance computing.

**Julia, a relatively new language, also has a wide range of applications.** For some special use cases, such as building simulation, programming languages like Modelica are frequently used.

In [1]:
import numpy as np
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px
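# Project-local styling module; assumed to provide color_discrete_sequence,
# marker_color and logo_img, which are used below without being defined here.
from opensustain_template import *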

In [2]:
df_active = pd.read_csv("../csv/project_analysis.csv")

In [3]:
# Count projects per dominating language. reset_index(name=...) names the count
# column explicitly, so the columns are the same on pandas 1.x and 2.x.
license_dominating_language = (
    df_active["dominating_language"]
    .value_counts()
    .rename_axis("dominating_language_names")
    .reset_index(name="dominating_language")
)
# Keep only languages that dominate more than four projects.
license_dominating_language = license_dominating_language[
    license_dominating_language["dominating_language"] > 4
]
fig = px.pie(
    license_dominating_language,
    values="dominating_language",
    names="dominating_language_names",
    color_discrete_sequence=color_discrete_sequence,
    hole=0.2,
)

fig.update_layout(title="Distribution of Programming Languages", showlegend=False, font_size=16)
fig.update_traces(textposition="inside", textinfo="percent+label", marker=dict(line=dict(color="#000000", width=1)))
fig.show()

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: languages-distribution

Distribution of programming languages
```
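Because the pie chart keeps only languages that dominate more than four projects, its percentages are relative to the retained languages rather than to all projects. One possible alternative, sketched below, folds the rare languages into a single "Other" slice so that the total stays unchanged. The function `group_rare_languages`, its `min_projects` parameter and the sample labels are hypothetical and not part of the original analysis.

```python
import pandas as pd


def group_rare_languages(languages: pd.Series, min_projects: int = 5) -> pd.Series:
    """Count projects per language, folding rare languages into "Other".

    `min_projects=5` mirrors the `> 4` filter used for the pie chart.
    """
    counts = languages.value_counts()
    frequent = counts[counts >= min_projects]
    rare_total = int(counts[counts < min_projects].sum())
    if rare_total > 0:
        frequent = pd.concat([frequent, pd.Series({"Other": rare_total})])
    return frequent


# Made-up sample labels, for illustration only.
sample = pd.Series(["Python"] * 6 + ["R"] * 5 + ["Julia"] * 2 + ["Fortran"])
grouped = group_rare_languages(sample)

# Shares in percent, relative to all projects in the sample.
shares = (grouped / grouped.sum() * 100).round(1)
print(shares)
```

Whether an "Other" slice is preferable to dropping rare languages depends on the reader: it keeps the shares comparable with the full dataset, at the cost of one slice that carries no language information.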

In [5]:
# Count projects per (topic, language) pair. Naming the count column explicitly
# avoids depending on the default column name, which differs between pandas versions.
df_language_distribution = (
    df_active.value_counts(["topic", "dominating_language"]).reset_index(name="counts")
)

fig = px.scatter(
    df_language_distribution, x="dominating_language", y="topic", size="counts",
)


fig.update_layout(
    height=1000,
    width=1200,
    xaxis_title=None,
    yaxis_title=None,
    title="Distribution of Programming Languages within Topics"
)
fig.update_traces(marker_color=marker_color)

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper", yref="paper",
        x=1, y=1,
        sizex=0.05, sizey=0.05,
        xanchor="right", yanchor="top"
    )
)

fig.show()

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: languages-within-topics

Distribution of programming languages within topics
```
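The same topic and language counts can also be read as a table, which makes it easier to check statements such as which language leads in a given topic. The sketch below uses `pd.crosstab` on the `topic` and `dominating_language` columns. The rows in `projects` are invented for illustration and do not reproduce the real dataset.

```python
import pandas as pd

# Invented example rows; only the column names follow the analysis above.
projects = pd.DataFrame(
    {
        "topic": ["Climate", "Climate", "Climate", "Soil", "Soil", "Wind Energy"],
        "dominating_language": ["Fortran", "Python", "Fortran", "R", "R", "Python"],
    }
)

# Rows are topics, columns are languages, cells are project counts.
table = pd.crosstab(projects["topic"], projects["dominating_language"])

# Most frequent dominating language per topic.
top_language = table.idxmax(axis=1)

print(table)
print(top_language)
```

Note that `idxmax` returns the first column with the maximum count when several languages are tied, so ties within a topic would need separate handling.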