<h1 id="basics" style="font-family:verdana;"> 
    <center> <b>Banana Index</b>
    </center>
</h1>

#### The Economist’s Banana Index was first unveiled in the article "A different way to measure the climate impact of food". It is updated yearly.

#### Methodology

#### The banana index expresses the greenhouse-gas emissions of foods - by weight, calorie, or protein - as their equivalent in bananas. Greenhouse-gas emissions are in CO2-equivalents, with non-CO2 gases converted according to the amount of warming they cause over a 100-year timescale.
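
As a rough illustration of the CO2-equivalent conversion, the sketch below weights each gas by a 100-year global warming potential (GWP100). The function name `co2_equivalent` and the sample quantities are hypothetical. The factors are the IPCC AR5 values (methane 28, nitrous oxide 265); the source does not state which factors it uses, so treat them as an assumption.

```python
# Illustrative GWP100 factors (IPCC AR5, without climate-carbon feedbacks).
# Assumption: the data behind the Banana Index may use different values.
GWP100 = {"CO2": 1.0, "CH4": 28.0, "N2O": 265.0}


def co2_equivalent(emissions_kg):
    """Sum emissions (kg of each gas) weighted by their GWP100 factor."""
    return sum(GWP100[gas] * kg for gas, kg in emissions_kg.items())


# Hypothetical footprint: 1 kg CO2, 10 g methane, 1 g nitrous oxide.
sample = {"CO2": 1.0, "CH4": 0.010, "N2O": 0.001}
total = co2_equivalent(sample)
print(f"{total:.3f} kg CO2e")
```

Even small masses of methane or nitrous oxide can add noticeably to the total, which is why the choice of timescale and factors matters.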

#### Mathematically, for a given metric, a food's banana score is the ratio of its emissions to those of bananas on the same metric. For instance, strawberries account for 5.18 kilograms of CO2-equivalents per 1000 kilocalories, while bananas account for 0.88 kilograms of CO2-equivalents per 1000 kilocalories. The banana score of strawberries, by calorie, is therefore 5.18/0.88 ≈ 5.9, which rounds to 6.
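
The ratio can be written as a small helper. This is a minimal sketch: `banana_score` is a name introduced here for illustration, and the only inputs are the two figures quoted in the text.

```python
def banana_score(food_emissions, banana_emissions):
    """Return a food's emissions as a multiple of banana emissions.

    Both arguments must use the same metric (per kg, per 1000 kcal,
    or per 100 g of protein).
    """
    if banana_emissions <= 0:
        raise ValueError("banana emissions must be positive")
    return food_emissions / banana_emissions


# Figures quoted in the text, in kg CO2e per 1000 kcal.
strawberries = banana_score(5.18, 0.88)
bananas = banana_score(0.88, 0.88)  # the benchmark itself always scores 1

print(round(strawberries, 2), round(strawberries), bananas)
```

By construction bananas score exactly 1, so any food scoring below 1 emits less than bananas on that metric and any food scoring above 1 emits more.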

![image.png](attachment:58081303-af92-4e54-942d-13a48a50e17d.png)

## Overview
If you're a runner, you might track your fitness by comparing your 5K running time to benchmarks. This helps you understand how well you're doing and where you can improve. In customer service, businesses use the Net Promoter Score (NPS) to measure customer satisfaction and compare it with industry standards and competitors. In manufacturing, quality benchmarks like ISO certifications set clear standards for product quality and safety.
Similarly, when it comes to measuring greenhouse gas emissions from food, we have the Banana Index. This benchmark compares the emissions of various foods to those of bananas. By using bananas as a standard, the Banana Index makes it easier to understand and compare the environmental impact of different foods. Just like fitness benchmarks help you improve your performance, the Banana Index helps you make more sustainable food choices.

## Load the libraries

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.subplots as sp
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')

## Load the data

In [50]:
df = pd.read_csv('bananaindex.csv')

In [8]:
df.head(11).T

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| entity | Ale | Almond butter | Almond milk | Almonds | Apple juice | Apple pie | Apples | Asparagus | Avocados | Bacon | Bagels |
| emissions_kg | 0.48869 | 0.387011 | 0.655888 | 0.602368 | 0.458378 | 1.244974 | 0.507354 | 0.925692 | 0.921227 | 19.314209 | 0.802813 |
| emissions_1000kcal | 0.317338 | 0.067265 | 2.22223 | 0.105029 | 0.955184 | 0.418182 | 0.898188 | 2.791229 | 1.853708 | 6.818607 | 0.329389 |
| emissions_100g_protein | 0.878525 | 0.207599 | 13.595512 | 0.328335 | 29.152212 | 4.704171 | 13.442716 | 3.853152 | 4.984962 | 9.157925 | 0.959624 |
| emissions_100g_fat | 2.424209 | 0.079103 | 4.05747 | 0.119361 | 19.75498 | 2.585492 | 25.59441 | 16.707521 | 11.067428 | 11.273881 | 3.741236 |
| land_use_kg | 0.811485 | 7.683045 | 1.370106 | 8.230927 | 0.660629 | 1.765165 | 0.668999 | 1.401769 | 1.248931 | 42.332229 | 2.262947 |
| land_use_1000kcal | 0.601152 | 1.29687 | 2.675063 | 1.423376 | 1.382839 | 0.597374 | 1.205081 | 3.909719 | 2.513363 | 14.792841 | 0.924206 |
| Land use per 100 grams of protein | 1.577687 | 3.608433 | 12.687839 | 4.26104 | 43.232158 | 6.471804 | 18.66799 | 5.424797 | 6.758265 | 18.106127 | 2.663544 |
| Land use per 100 grams of fat | 3.065766 | 1.495297 | 4.60053 | 1.610136 | 26.246743 | 3.8198 | 36.386277 | 20.964724 | 15.006285 | 23.911078 | 10.173081 |
| Bananas index (kg) | 0.559558 | 0.443134 | 0.751002 | 0.689721 | 0.524851 | 1.425516 | 0.580929 | 1.059933 | 1.054821 | 22.115095 | 0.919234 |


In [48]:
df.describe().T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| emissions_kg | 158.0 | 6.96875 | 15.275774 | 0.207276 | 0.952077 | 1.972573 | 5.300145 | 129.747715 |
| emissions_1000kcal | 158.0 | 4.23465 | 8.283058 | 0.067265 | 0.559785 | 1.21891 | 4.688395 | 68.095953 |
| emissions_100g_protein | 158.0 | 12.344095 | 26.924422 | 0.207599 | 2.411551 | 5.285952 | 10.815254 | 257.293857 |
| emissions_100g_fat | 158.0 | 23.437459 | 47.489672 | 0.079103 | 3.263673 | 7.661961 | 20.426275 | 348.489652 |
| land_use_kg | 158.0 | 13.81414 | 45.541512 | 0.395357 | 1.77489 | 3.437468 | 9.089234 | 427.331126 |
| land_use_1000kcal | 158.0 | 6.984234 | 22.869427 | 0.427187 | 1.217093 | 2.120968 | 4.607993 | 223.690135 |
| Land use per 100 grams of protein | 158.0 | 15.143724 | 28.17182 | 0.686572 | 4.348589 | 7.825508 | 13.350834 | 211.381205 |
| Land use per 100 grams of fat | 158.0 | 28.812621 | 58.270064 | 1.032875 | 4.635697 | 9.970706 | 27.436151 | 478.803717 |
| Bananas index (kg) | 158.0 | 7.979336 | 17.491019 | 0.237335 | 1.090144 | 2.258629 | 6.068756 | 148.563324 |
| Bananas index (1000 kcalories) | 158.0 | 4.835165 | 9.457677 | 0.076804 | 0.639168 | 1.391763 | 5.353256 | 77.752629 |


In [51]:
df.isnull().sum()

entity                                 0
year                                   0
emissions_kg                           0
emissions_1000kcal                     0
emissions_100g_protein                 2
emissions_100g_fat                     0
land_use_kg                            0
land_use_1000kcal                      0
Land use per 100 grams of protein      2
Land use per 100 grams of fat          0
Bananas index (kg)                     0
Bananas index (1000 kcalories)         0
Bananas index (100g protein)           0
Chart?                                 0
type                                   0
Banana values                        157
Unnamed: 16                          157
dtype: int64

We will drop the 'Banana values' and 'Unnamed: 16' columns, as they are almost entirely null. The 'year' and 'Chart?' columns can also be dropped, as they don't add any value to the analysis. Finally, we drop the two rows that are missing protein-based values.

In [None]:
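df = df.drop(columns=['year', 'Banana values', 'Unnamed: 16', 'Chart?'])
# Drop rows with missing values, then renumber the index so that
# positional and label-based lookups agree in later cells.
df = df.dropna().reset_index(drop=True)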
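
Before dropping rows, it can help to see which ones would be lost. The sketch below uses a small made-up frame (the entity names and numbers in `toy` are hypothetical, not taken from bananaindex.csv) and restricts `dropna` to the columns the analysis relies on via its `subset` parameter.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data; all values are made up.
toy = pd.DataFrame({
    "entity": ["Food A", "Food B", "Food C"],
    "emissions_kg": [1.0, 2.5, 0.7],
    "emissions_100g_protein": [3.2, np.nan, 1.1],
})

# Rows with at least one missing value: inspect these before dropping.
incomplete = toy[toy.isnull().any(axis=1)]
print(incomplete["entity"].tolist())

# Drop only rows missing a value in the columns we rely on, then
# renumber the index so positions and labels line up again.
needed = ["emissions_100g_protein"]
clean = toy.dropna(subset=needed).reset_index(drop=True)
print(len(toy), "->", len(clean), "rows")
```

Limiting the drop to a subset keeps rows that are only missing values in columns the plots never use.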

## Emission analysis

In [43]:
def plot_emissions(entities, measure):

    # Create scatter plot
    fig = go.Figure()

    # Add traces with text labels
    fig.add_trace(go.Scatter(
        x=measure,
        y=entities,
        mode='markers+text',  # Include text labels
        marker=dict(
            color='goldenrod',
            size=10
        ),
        text=entities,  # Data labels
        textposition='top right',  # Position of the text
        name='Other Items'
    ))

    # Highlight the banana reference point, where the index equals 1.
    # A boolean mask with a tolerance avoids exact float comparison and
    # label-based lookups, which break once rows have been dropped.
    is_banana = np.isclose(np.asarray(measure, dtype=float), 1.0)
    banana_x = np.asarray(measure)[is_banana]
    banana_y = np.asarray(entities)[is_banana]

    fig.add_trace(go.Scatter(
        x=banana_x,
        y=banana_y,
        mode='markers+text',
        marker=dict(
            color='red',
            size=12
        ),
        text=banana_y,  # Data labels
        textposition='top right',
        name='Bananas'
    ))

    # Update layout with dark theme and no y-axis
    fig.update_layout(
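        # Use the column name so each chart is labelled with its own metric
        title=getattr(measure, 'name', 'Bananas Index'),
        xaxis_title=getattr(measure, 'name', 'Bananas Index'),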
        # yaxis_title='Entity',
        xaxis=dict(
            type='log',  # Log scale for x-axis
            color='white',  # X-axis label color
            showgrid=False,  # Hide grid lines on x-axis
            showline=True,  # Show x-axis line
            showticklabels=True,  # Show x-axis tick labels
            zeroline=False  # Hide x-axis zero line
        ),
        yaxis=dict(
            showgrid=False,  # Hide grid lines on y-axis
            showline=False,  # Hide y-axis line
            showticklabels=False,  # Hide y-axis tick labels
            zeroline=False  # Hide y-axis zero line
        ),
        title_font=dict(size=16, color='white'),
        xaxis_title_font=dict(size=14, color='white'),
        yaxis_title_font=dict(size=14, color='white'),
        width=1500,  # Increase width of the plot
        height=1800,   # Increase height of the plot
        paper_bgcolor='black',  # Background color of the entire figure
        plot_bgcolor='black',   # Background color of the plotting area
        font=dict(color='silver'),  # Font color
        shapes=[
            # Vertical red line at x = 1
            dict(
                type='line',
                x0=1,
                x1=1,
                y0=0,
                y1=1,
                xref='x',
                yref='paper',
                line=dict(
                    color='red',
                    width=2
                )
            )
        ]
    )

    fig.show()

## Emissions based on weight

In [44]:
# Create scatter plot
entities = df['entity']
measure = df['Bananas index (kg)']
plot_emissions(entities, measure)

## Emissions based on calories

In [45]:
entities = df['entity']
measure = df['Bananas index (1000 kcalories)']
plot_emissions(entities, measure)

## Emissions based on protein

In [46]:
entities = df['entity']
measure = df['Bananas index (100g protein)']
plot_emissions(entities, measure)

## Observations
* Whichever measure is used (weight, calories, or protein), beef-based foods consistently score poorly. In contrast, almonds and almond butter have very low emission scores, and potatoes score even lower than almonds per kilogram.
* The red line marks a banana score of 1, that is, bananas themselves. Foods to the left of the line emit less than bananas on the chosen measure, and foods to the right emit more. The x-axis is logarithmic, so the items farthest to the left have exceptionally low emissions, while those farthest to the right have by far the highest.
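
One way to back up observations like these with numbers is to rank foods by their banana score. The sketch below uses five rows copied from the `df.head(11).T` output shown earlier; the `sample` frame and the `side` column are names introduced here for illustration, and on the full data the same calls would be applied to `df`.

```python
import numpy as np
import pandas as pd

col = "Bananas index (kg)"

# Five rows taken from the head of the dataset shown earlier.
sample = pd.DataFrame({
    "entity": ["Ale", "Almond butter", "Apple pie", "Apples", "Bacon"],
    col: [0.559558, 0.443134, 1.425516, 0.580929, 22.115095],
})

# Scores below 1 sit left of the red line (lower emissions than bananas);
# scores above 1 sit to the right.
sample["side"] = np.where(sample[col] < 1, "below bananas", "above bananas")

lowest = sample.nsmallest(2, col)["entity"].tolist()
highest = sample.nlargest(2, col)["entity"].tolist()
print("Lowest:", lowest)
print("Highest:", highest)
print(sample.sort_values(col).to_string(index=False))
```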

## Conclusion
The Banana Index is a fruit-based metric for the climate impact of everyday foods. By using bananas as the benchmark for greenhouse gas emissions, it makes the environmental footprint of our diets easier to grasp and compare. Future discussions could explore other fruits or everyday items as benchmarks, as well as the implications of using such metrics in sustainable food practices. By continuing to seek innovative ways to measure and understand food's impact, we can contribute to a greener planet.