# Ranking

Quantifying the state and health of a repository is challenging, but it becomes feasible when several indicators are combined. The state is divided into three indicator classes: **Size, Community and Activity.** 
Every indicator is a project's rank relative to all other projects. The activity score illustrates this: each project is ranked on the variables "Total Commits Last Year", "Issues Closed Last Year", "Days Until Last Issue Closed", and "Last Release Date", and the resulting percentile ranks, each at most 1, are added up. The total score is the mean of the three class scores, each first divided by its maximum so that it is normalised to 1. The following code cell shows in detail how the ranking is calculated:

```python
# Calculate the scores on activity, community and size

# Every project is ranked on several activity indicators.
# rank(pct=True) returns a percentile rank: 1 is the highest rank,
# values close to 0 are the lowest.
# The individual ranks are added up.
df_active["activity"] = (
    df_active["total_commits_last_year"].rank(pct=True)
    + df_active["issues_closed_last_year"].rank(pct=True)
    + df_active["days_until_last_issue_closed"].rank(pct=True)
    # na_option="top" gives projects without a release date the lowest rank
    + df_active["last_released_date"].rank(pct=True, na_option="top")
)

df_active["community"] = (
    df_active["contributors"].rank(pct=True)
    + df_active["development_distribution_score"].rank(pct=True)
    + df_active["reviews_per_pr"].rank(pct=True)
)

df_active["size"] = (
    df_active["total_number_of_commits"].rank(pct=True)
    + df_active["contributors"].rank(pct=True)
    + df_active["closed_issues"].rank(pct=True)
    + df_active["closed_pullrequests"].rank(pct=True)
)

# All scores are weighted equally and normalised to one
df_active["total_score"] = (
    df_active["activity"] / df_active["activity"].max()
    + df_active["community"] / df_active["community"].max()
    + df_active["size"] / df_active["size"].max()
) / 3
```
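
To make the percentile ranking more tangible, here is a minimal sketch on a toy table. The four projects and their values are invented for illustration and are not taken from the dataset; only the column names and the `rank(pct=True)` calls mirror the cell above.

```python
import pandas as pd

# Hypothetical toy data: project names and values are made up.
toy = pd.DataFrame(
    {
        "project_name": ["a", "b", "c", "d"],
        "total_commits_last_year": [10, 200, 50, 50],
        "last_released_date": pd.to_datetime(
            ["2023-01-01", None, "2022-06-01", "2023-03-01"]
        ),
    }
)

# rank(pct=True) divides each rank by the number of ranked rows;
# tied values share the average of their ranks.
commit_rank = toy["total_commits_last_year"].rank(pct=True)
print(commit_rank.tolist())  # [0.25, 1.0, 0.625, 0.625]

# na_option="top" treats a missing release date as the smallest value,
# so project "b" receives the lowest release rank.
release_rank = toy["last_released_date"].rank(pct=True, na_option="top")

# Add up the individual ranks, then divide by the best score so that
# the top project ends up at exactly 1.
toy["activity"] = commit_rank + release_rank
toy["activity_norm"] = toy["activity"] / toy["activity"].max()
print(toy[["project_name", "activity", "activity_norm"]])
```

Because every indicator is a rank, the score only reflects the order of the projects: project "b" has twenty times the commits of project "a", yet its commit rank is only 0.75 higher.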

Ranking all projects by the total score gives a much deeper understanding of the ecosystem. Larger developments like [EnergyPlus](https://github.com/NREL/EnergyPlus) now move up towards the top. Unlike a ranking by stars, this ranking reveals strong but comparatively unpopular developments. However, more monolithic software developments have a higher probability of achieving a high score. Because the projects are ranked against each other, there is a danger that small projects that rely more on modular development will be significantly underrepresented. 
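
The bias towards monolithic projects can be sketched with a toy comparison. The repositories and numbers below are hypothetical, and only two of the four size indicators are used to keep the example short: one monolithic repository is compared with the same amount of work split across four modular repositories.

```python
import pandas as pd

# Hypothetical data: "monolith" holds 4000 commits in one repository, while
# the "mod-*" repositories split the same effort into four equal parts.
repos = pd.DataFrame(
    {
        "project_name": [
            "monolith", "mod-1", "mod-2", "mod-3", "mod-4", "other-1", "other-2",
        ],
        "total_number_of_commits": [4000, 1000, 1000, 1000, 1000, 1500, 2500],
        "contributors": [40, 10, 10, 10, 10, 15, 25],
    }
)

# Same ranking approach as for the size score, reduced to two indicators.
repos["size"] = (
    repos["total_number_of_commits"].rank(pct=True)
    + repos["contributors"].rank(pct=True)
)

print(repos.sort_values("size", ascending=False)[["project_name", "size"]])
```

In this sketch the monolith reaches the maximum score of 2.0, while each modular repository lands at the bottom of the table, although the four modules together represent the same number of commits and contributors.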

The real value of such health analytics emerges when the data is compared with usage data. Unfortunately, usage data is currently available only to a limited extent, via Python dependents.


In [13]:
import numpy as np
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px
# Assumed to provide color_continuous_scale and logo_img, which are used below
from opensustain_template import *

In [14]:
df_active = pd.read_csv("../csv/project_analysis.csv")

In [15]:
df_total_score = df_active.nlargest(40, "total_score")

fig = px.bar(
    df_total_score,
    x=df_total_score["total_score"],
    y=df_total_score["project_name"],
    orientation="h",
    range_x=(0.85, 1),
    custom_data=["oneliner","topic","git_url"],
    color = df_total_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale
)

fig.update_layout(
    height=1000,  # figure height in pixels
    xaxis_title="Total Score",
    yaxis_title=None,
    title="Top 40 Total Score",
    coloraxis_colorbar=dict(title="DDS"),
    hoverlabel=dict(bgcolor="white"),
)
fig.update(layout_showlegend=False)
fig['layout'].update(margin=dict(l=200,r=0,b=0,t=40))

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper", yref="paper",
        x=1, y=0,
        sizex=0.05, sizey=0.05,
        xanchor="right", yanchor="bottom"
    )
)

fig.update_traces(
    hovertemplate="<br>".join([
        "Project Info: <b>%{customdata[0]}</b>",
        "Topic: <b>%{customdata[1]}</b>",
        "Git URL: <b>%{customdata[2]}</b>"
    ])
)
fig['layout']['yaxis']['autorange'] = "reversed"
fig.show()


```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: total-score

The 40 projects with the highest total score
```

In [28]:
df_activity_score = df_active.nlargest(40, "activity")

fig = px.bar(
    df_activity_score,
    x=df_activity_score["activity"],
    y=df_activity_score["project_name"],
    orientation="h",
    custom_data=["oneliner","topic","git_url"],
    color = df_activity_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale,
    range_x=(2.8, 3.7)
)

fig.update_layout(
    height=1000,  # figure height in pixels
    width=1000,
    xaxis_title="Activity Score",
    yaxis_title=None,
    title="Top 40 Activity Score",
    coloraxis_colorbar=dict(title="DDS"),
    hoverlabel=dict(bgcolor="white"),
)
fig.update(layout_showlegend=False)
fig['layout'].update(margin=dict(l=200,r=0,b=0,t=40))

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper", yref="paper",
        x=1, y=0,
        sizex=0.05, sizey=0.05,
        xanchor="right", yanchor="bottom"
    )
)

fig.update_traces(
    hovertemplate="<br>".join([
        "Project Info: <b>%{customdata[0]}</b>",
        "Topic: <b>%{customdata[1]}</b>",
        "Git URL: <b>%{customdata[2]}</b>"
    ])
)
fig['layout']['yaxis']['autorange'] = "reversed"
fig.show()

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: activity-score

The 40 projects with the highest activity score
```

In [24]:
df_community_score = df_active.nlargest(40, "community")

fig = px.bar(
    df_community_score,
    x=df_community_score["community"],
    y=df_community_score["project_name"],
    orientation="h",
    range_x=(2.5, 3),
    custom_data=["oneliner","topic","git_url"],
    color = df_community_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale
)

fig.update_layout(
    height=1000,  # figure height in pixels
    width=1000,
    xaxis_title="Community Score",
    yaxis_title=None,
    title="Top 40 Community Score",
    coloraxis_colorbar=dict(title="DDS"),
    hoverlabel=dict(bgcolor="white"),
)
fig.update(layout_showlegend=False)
fig['layout'].update(margin=dict(l=200,r=0,b=0,t=40))

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper", yref="paper",
        x=1, y=0,
        sizex=0.05, sizey=0.05,
        xanchor="right", yanchor="bottom"
    )
)

fig.update_traces(
    hovertemplate="<br>".join([
        "Project Info: <b>%{customdata[0]}</b>",
        "Topic: <b>%{customdata[1]}</b>",
        "Git URL: <b>%{customdata[2]}</b>"
    ])
)
fig['layout']['yaxis']['autorange'] = "reversed"
fig.show()

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: community-score

The 40 projects with the highest community score
```