# Food Environmental Impact
## ILV Datenvisualisierung und Visual Analytics
## Christina Köck
## February 2023
### Link to the Gitlab-Repo: https://gitlab.web.fh-kufstein.ac.at/christina.koeck/datenvisualisierung_und_visualanalytics

The visualizations are developed in this notebook, and the design decisions are documented here. After development, the code is transferred to a Streamlit application for dashboarding.

The Streamlit application is intended for interested consumers with a basic scientific background. The information should be understandable to consumers, although some knowledge of the sustainability parameters is assumed. The correlation view in particular requires familiarity with the Pearson correlation. The Streamlit application could also be used in teaching to introduce learners to different diets. The application is designed to guide users through the data with a series of questions, pointing out different aspects and relationships along the way. Users can choose for themselves which foods are displayed.
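Because the correlation view assumes familiarity with the Pearson coefficient, a minimal sketch on made-up numbers may help. The column names and values below are hypothetical and are not taken from the DaPro data.

```python
import pandas as pd

# Hypothetical toy values, for illustration only (not from the DaPro data)
toy = pd.DataFrame(
    {
        "protein": [1.0, 2.0, 3.0, 4.0],
        "land_use": [2.0, 4.0, 6.0, 8.0],  # rises together with protein
        "water": [8.0, 6.0, 4.0, 2.0],     # falls as protein rises
    }
)

# Pearson correlation matrix: +1 is a perfect positive linear relation,
# -1 a perfect negative linear relation, 0 means no linear relation
corr = toy.corr(method="pearson")
r_pos = corr.loc["protein", "land_use"]
r_neg = corr.loc["protein", "water"]
print(corr.round(2))
```

The coefficient only captures linear relationships, so a value near zero does not rule out other kinds of dependence.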

### Libraries and data

In [1]:
# ! pip install cmcrameri

In [2]:
from cmcrameri import cm
import math
import pandas as pd
import numpy as np
import seaborn as sns

import matplotlib.pyplot as plt
import plotly.express as px

import sparql_dataframe

In [3]:
# # Data are from:
# # Hannah Ritchie and Max Roser (2022)
# # "Environmental Impacts of Food Production".
# # Published online at OurWorldInData.org. 'https://ourworldindata.org/environmental-impacts-of-food' [Online Resource]
# # Data was retrieved in a shorter form from https://www.kaggle.com/datasets/selfvivek/environment-impact-of-food-production
# df_food = pd.read_excel("Mockdata012.xlsx")
# df_food.drop('Unnamed: 0', axis = 1, inplace=True)


In [4]:
# df_food.head(12)

## Read in DaPro data

In [59]:
df_dapro = pd.read_excel('df_dapro_en.xlsx')
df_dapro.rename(columns= {'Unnamed: 0': ' '}, inplace=True)
df_dapro.set_index(' ', drop = True, inplace=True)
df_dapro['Fat_g/100g'] = [
    0.4,
    23,
    60,
    50,
    52,
    42,
    32,
    61.6,
    2,
    49.7, 
    30,
    49,
    0.8,
    18.1,
    3.5,
    45,
    14
]

Unnamed: 0,Alanin_mg/100g,Alkohol (Ethanol)_mg/100g,Arginin_mg/100g,Asparagic acid_mg/100g,Butteric acid/butyric acid_mg/100g,Calcium_mg/100g,Cholesterin_mg/100g,Cystein_mg/100g,Decanic acid/capric acid_mg/100g,Decosanic acid/beetic acid_mg/100g,...,Water -insoluble fiber_mg/100g,Xylit_mg/100g,Sugar (total) _mg/100g,http://purl.obolibrary.org/obo/CHEBI_16646_mg/100g,http://purl.obolibrary.org/obo/FOODON_03316427_mg/100g,http://purl.obolibrary.org/obo/FOODON_03420190_mg/100g,"ecological scarcity 2013, total, UBP/100g",Eutrophying emissions per 100g (gPO₄eq per 100g},Freshwater withdrawals per 100g (liters per 100g),Land use per 100g (m² per 100g)
,,,,,,,,,,,,,,,,,,,,,
Apple,0.0008,1.82757,0.052,0.0,3.25,0.303,0.0,0.005,3.25,0.859,...,0.0,2.01,0.49,0.0,0.0,0.0,134.47,0.145,18.01,0.063
Feta,0.078,17.15552,0.0,0.0,0.0,0.137,0.251,0.017,0.0,0.0,...,0.0004,0.0,0.795,0.0,0.0,0.0,,,,
Pine nut,0.002,10.16515,1.03,0.0,0.146,0.204,0.0,0.235,0.146,0.144,...,0.0,7.2,12.47,0.0,0.0,0.0,,,,
Sesame,0.01,10.16515,1.46,0.0,0.204,0.202,0.0,0.347,0.204,0.559,...,0.0,11.18,0.25,0.0,0.0,0.0,592.59,,,
Pistachio,0.005,14.99285,1.18,0.0,2.31,0.206,0.0,0.158,2.31,0.212,...,0.0,10.61,5.2,0.0,0.0,0.0,,,,
Cashew nut,0.005,14.99285,2.06,0.0,22.18,0.213,0.005,0.258,22.18,0.0,...,0.0,3.1,0.323,0.0,0.0,0.0,3322.2,,,
Hard cheese,0.052,15.3088,0.05,0.0,0.0,0.179,0.206,0.03,0.0,0.0,...,0.0014,0.0,0.55,0.0,0.0,0.0,3554.7,9.837,560.52,8.779
Hazel nut,0.001,14.99285,1.6,0.0,5.97,0.219,0.006,0.163,5.97,0.0,...,0.0,7.7,22.2,0.0,0.0,0.0,,,,
Chicken,0.0094,9.46711,0.065,0.0,0.0,0.264,0.0,0.011,0.0,0.0,...,0.0019,0.0,1.96,0.0,0.0,0.0,668.1,2.176,57.77,0.627
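The fat values above are assigned as a plain list, which only works if the list order matches the row order exactly. A sketch of a label-based alternative follows; it assumes that the first nine list entries belong to the nine rows shown in the output, in that order, which the notebook does not state explicitly.

```python
import pandas as pd

# Toy frame with the nine food names visible in the output above
foods = ["Apple", "Feta", "Pine nut", "Sesame", "Pistachio",
         "Cashew nut", "Hard cheese", "Hazel nut", "Chicken"]
toy = pd.DataFrame(index=foods)

# ASSUMPTION: the first nine list entries map to these rows in this order
fat_by_food = {
    "Apple": 0.4, "Feta": 23, "Pine nut": 60, "Sesame": 50, "Pistachio": 52,
    "Cashew nut": 42, "Hard cheese": 32, "Hazel nut": 61.6, "Chicken": 2,
}

# A Series is aligned on the index labels, so the row order no longer matters
toy["Fat_g/100g"] = pd.Series(fat_by_food)
print(toy)
```

With this approach a food that is missing from the mapping ends up as NaN instead of shifting all following values by one row.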


## Visualizations

In [60]:
def annontations_hor(plots):
    # Place a label for each bar
    for bar in plots.patches:
        # Get X and Y placement of label from rect
        x_value = bar.get_width()
        y_value = bar.get_y() + bar.get_height() / 2

        # Number of points between bar and label; change to your liking
        space = -30
        # Horizontal alignment for positive values
        ha = 'left'

        # If value of bar is negative: place label to the left of the bar
        if x_value < 0:
            # Invert space to place label to the left
            space *= -1
            # Horizontally align label to the right
            ha = 'right'

        # Use X value as label and format number
        label = '{:,.0f}'.format(x_value)

        # Create annotation
        plt.annotate(
            label,                      # Use `label` as label
            (x_value, y_value),         # Place label at bar end
            xytext=(space, 0),          # Horizontally shift label by `space`
            textcoords='offset points', # Interpret `xytext` as offset in points
            va='center',                # Vertically center label
            ha=ha,                      # Horizontally align label differently for positive and negative values
            color = 'white',            # Change label color to white
            fontsize = 25)
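For comparison, a similar labelling can be sketched with matplotlib's built-in `Axes.bar_label` (available from matplotlib 3.4). This is an alternative to the helper above, not the method used in this notebook, and the bar names and values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only
names = ["A", "B", "C"]
values = [1200.0, 3400.0, 560.0]

fig, ax = plt.subplots()
bars = ax.barh(names, values)

# bar_label creates one text per bar; negative padding shifts it into the bar
texts = ax.bar_label(
    bars,
    labels=[f"{v:,.0f}" for v in values],
    padding=-30,
    color="white",
)
label_strings = [t.get_text() for t in texts]
print(label_strings)
```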


## scientific colormaps (see http://www.fabiocrameri.ch/visualisation.php)

In [61]:
from colors_cameri import bilbao, tofino, davos, lisbon, oslo

In [62]:
tofino_rgb = [el[1] for el in tofino]

In [63]:
davos_rgb = [el[1] for el in davos]
oslo_rgb = [el[1] for el in oslo]

In [64]:
# colors =  dict(zip(df_food["Category"].unique(), tofino_rgb))

In [65]:
zhaw_color = (0.00000 , 0.39216 , 0.65098)

In [74]:
oslo_rgb

['rgb(0, 1, 0)',
 'rgb(11, 25, 39)',
 'rgb(17, 48, 77)',
 'rgb(27, 73, 117)',
 'rgb(46, 98, 160)',
 'rgb(78, 125, 199)',
 'rgb(111, 146, 202)',
 'rgb(144, 166, 201)',
 'rgb(176, 185, 200)',
 'rgb(215, 215, 216)',
 'rgb(255, 255, 255)']
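The local module `colors_cameri` is not shown in this notebook. As an assumption about how strings in the `'rgb(R, G, B)'` form could be produced from float triples, here is a small helper applied to `zhaw_color`; `to_rgb_string` is a hypothetical name.

```python
def to_rgb_string(color):
    """Convert an (r, g, b) tuple of floats in [0, 1] to 'rgb(R, G, B)'."""
    r, g, b = (round(channel * 255) for channel in color)
    return f"rgb({r}, {g}, {b})"


zhaw_color = (0.00000, 0.39216, 0.65098)  # value defined earlier in this notebook
zhaw_rgb = to_rgb_string(zhaw_color)
print(zhaw_rgb)
```

This form is convenient because plotly accepts such strings directly, while matplotlib and seaborn accept the float tuples.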

### Distributions of selected columns
 - any column can be selected
 - relationships between the columns are shown
 - the user has to judge whether a parameter is to be rated positively or negatively

## TODO:
 [ x ] Selection of all columns

In [66]:
df = df_dapro

In [67]:
df



In [68]:
# # choose the dimensions to display

# # x = 'Vitamin B12-Cobalamin_μg/100g'
# x = 'Einfach ungesättigte Fettsäuren_mg/100g'
# #  'Wasserunlösliche Ballaststoffe_mg/100g'
# size = 'Vitamin B12-Cobalamin_μg/100g'
# y = 'Eiweiß (Protein)_mg/100g'
# # y = 'Eiweiß (Protein)_mg/100g'


# fig = px.scatter(df, 
#                 y=y,
#                    size= size,
#                  x = x,
#                  color = df.index, 
#            hover_name=df.index, 
#                  size_max=60,
#          color_discrete_sequence = oslo_rgb[:9],
#                  height = 750, 
#                  title = 'Distribution of the food products in the database in regard to the chosen parameters.<br>Two parameters are shown on the x- and y- axis respectively, the size of the bubbles show the parameter<br>"{}".<br>By hovering over the bubbles the numbers are shown.'.format(size)
#                 )
# fig.update_layout(
#                           margin={'t': 200})
# fig.show()

In [73]:
df.columns[40:]

Index(['Hexadecan acid/palmitic acid_mg/100g', 'Hexadecatetraensäure_mg/100g',
       'Hexadecensic acid/palmitoleic acid_mg/100g',
       'Hexanic acid/capronic acid_mg/100g', 'Histidin_mg/100g',
       'Iodid_μg/100g', 'Isoleucin_mg/100g', 'Kalium_mg/100g',
       'KUPFER_MG/100G', 'Short -chain fatty acids_mg/100g',
       'Lactose (milk sugar) _mg/100g', 'Long -chain fatty acids_mg/100g',
       'Leucin_mg/100g', 'Lignin_mg/100g', 'Lysin_mg/100g',
       'Maltose (Malzzucker)_mg/100g', 'Mangan_μg/100g', 'Mannit_mg/100g',
       'Polyunsaturated fatty acids MG/100G', 'Methionin_mg/100g',
       'Minerals (raw ash) _mg/100g', 'Medium -chain fatty acids_mg/100g',
       'Monosaccharide (1M)_mg/100g', 'Sodium_mg/100g',
       'Non -essential amino acids_mg/100g', 'Nonadecatrio acid_mg/100g',
       'Octadiic acid/linoleic acid_mg/100g',
       'Octadecanic acid/stearic acid_mg/100g',
       'Octadecriosic acid/linolenic acid_mg/100g',
       'Octadecenic acid/oleic acid_mg/100g',
     

In [69]:
# choose the dimensions to display

size = 'Protein_g/g'
x = 'ecological scarcity 2013, total, UBP/100g'
y = 'Lysin_mg/100g'

fig = px.scatter(df, 
                y=y,
                   size= size,
                 x = x,
                 color = df.index, 
           hover_name=df.index, 
                 size_max=60,
         color_discrete_sequence = oslo_rgb,
                 title = 'Distribution of the food products in the database in regard to the chosen parameters.'
                )
fig.show()
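Several environmental columns contain missing values, and a row without a known x, y or size value cannot be drawn as a bubble. A sketch of filtering such rows before the scatter plot follows; the short column names and the numbers are hypothetical stand-ins for the chosen dimensions.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only
toy = pd.DataFrame(
    {
        "x": [134.47, np.nan, 592.59, 668.1],
        "y": [0.5, 0.8, 0.25, 1.96],
        "size": [0.3, 17.0, np.nan, 20.0],
    },
    index=["Apple", "Feta", "Sesame", "Chicken"],
)

# Keep only rows where all three plotted dimensions are known
plottable = toy.dropna(subset=["x", "y", "size"])
dropped = toy.index.difference(plottable.index)
print(f"{len(plottable)} of {len(toy)} products can be plotted; dropped: {list(dropped)}")
```

Reporting the dropped products next to the plot keeps the filtering transparent for the user.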

In [70]:
# plt.figure(figsize=(8, 4))
# sns.scatterplot(df_food, x = x, y = y, size = size, hue = 'Category', palette=cm.oslo.colors, legend='brief')
# plt.legend(loc=(1.04, 0))

# sns.despine(left=True, bottom=True)

### Count per category
 - gives an overview of whether something is overrepresented
 - aggregation only: no negative or positive statement, no reference values needed
  - annotation/color yes/no

## TODO:
### Can this be put to better use as an overview?
### Make it clickable?
### Treemap

In [71]:
# NOTE: this cell needs a 'Category' column, which df_dapro does not provide
fig = px.sunburst(df, path=[df.index, df.Category], color= 'Eiweiß (Protein)_mg/100g'
                  , color_discrete_sequence = (oslo_rgb + oslo_rgb*2), 
                  title = 'Categories in the database (inner circle) and corresponding food products' 
                  '(outer circle). <br>Click on one category to zoom in. To go back, click on the category again.'
                 )
                
fig.show()

AttributeError: 'DataFrame' object has no attribute 'Category'
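The AttributeError above indicates that `df_dapro` has no `Category` column. One possible way forward, sketched here under assumptions, is to attach categories from a lookup and to check for required columns before plotting. The category names in `category_by_food` are hypothetical and are not taken from the DaPro data.

```python
import pandas as pd

# Toy frame using food names from the data above
toy = pd.DataFrame(index=["Apple", "Feta", "Cashew nut", "Chicken"])

# HYPOTHETICAL mapping; the real categories are not part of this notebook
category_by_food = {
    "Apple": "Fruit",
    "Feta": "Dairy",
    "Cashew nut": "Nuts and seeds",
    "Chicken": "Meat",
}

# Look up each index label in the mapping and store it as a new column
toy["Category"] = toy.index.map(category_by_food)

# Guard before plotting so that a missing column gives a clear message
required = ["Category"]
missing = [col for col in required if col not in toy.columns]
if missing:
    raise KeyError(f"Missing columns for the sunburst plot: {missing}")
print(toy)
```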

In [None]:
# # chose a parameter to display

# list = ['Category', 'Allergens', 'FurProc', 'NutritionalForm']

# choice = 'Allergens'

# plt.figure(figsize=(8, 4))
# sns.countplot(df_food, x= choice , color= zhaw_color, order=df[choice].value_counts().index)
# sns.despine(left=True, bottom=True)
# plt.title('Count of food products in the database in regard to the chosen parameter.')


### Count of missing values
 - count per parameter only, per record, or both
 - no reference values
 - missing values are marked as negative (dark red/black)
     - keep the colors for consistency, or use red/black?

## TODO:
[ X ] select a few products
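Both counting options from the list above can be sketched with `isna().sum()`, once per parameter (column) and once per record (row). The toy frame reuses a few values from the DaPro output shown earlier; the shortened column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Small toy frame; column names are shortened, hypothetical stand-ins
toy = pd.DataFrame(
    {
        "Land use": [0.063, np.nan, np.nan, 0.627],
        "Freshwater": [18.01, np.nan, np.nan, 57.77],
        "Calcium": [0.303, 0.137, 0.202, 0.264],
    },
    index=["Apple", "Feta", "Sesame", "Chicken"],
)

missing_per_parameter = toy.isna().sum(axis=0)  # one count per column
missing_per_product = toy.isna().sum(axis=1)    # one count per row
print(missing_per_parameter)
print(missing_per_product)
```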

In [None]:
# df_food.set_index('Food product', inplace=True)

In [None]:
# choice = ['Rice', 'Potatoes', 'Milk']
choice = df.columns[:]

df_plot = df[choice].isna()

plt.figure(figsize = (33, 10))

sns.heatmap(df_plot, cbar = False, cmap = sns.blend_palette(cm.oslo.colors, n_colors=6))
plt.tick_params(axis='both', which='major', 
                labelsize=10, labelbottom = False, bottom=False, top = False, labeltop=True)
plt.xticks(rotation = 90)
plt.title('Count of unknown values in the database. Dark color signifies known values, bright color signifies unknown value.')
plt.show()

In [None]:
df_plot

In [None]:
import plotly

choice = df.columns[:]

df_plot = df[choice].isna()

title_text = 'Count of unknown values in the database. Dark color signifies <br> known values, bright color signifies unknown value.'

# px.imshow creates its own figure, so no matplotlib figure is needed here
fig = px.imshow(df_plot, text_auto=False, aspect="auto", width=2000,height=800, 
    color_continuous_scale=oslo_rgb
               )
fig.update_xaxes(side = "top")
fig.update_layout(title_text=title_text,title_y = 0.95)
fig.show()

In [None]:
import plotly

choice = df.columns[50:65]

size = len(choice)

df_plot = df[choice]>0

title_text = 'Count of zero values in the database. Dark color signifies zero values,<br>bright color signifies values bigger than zero.'

# px.imshow creates its own figure, so no matplotlib figure is needed here
fig = px.imshow(df_plot, text_auto=False, aspect="auto", width=size*50,height=500, 
    color_continuous_scale=oslo_rgb
               )
fig.update_xaxes(side = "top")
fig.update_layout(title_text=title_text,title_y = 0.95,
         margin={'t': 200})

fig.update_coloraxes(showscale=False)
fig.show()

In [None]:
df[choice]>0
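One caveat of `df[choice] > 0`: a comparison with NaN evaluates to False, so unknown values are shown the same way as zeros. A sketch of a three-state encoding that keeps them apart follows; the toy data and the codes -1, 0 and 1 are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only
toy = pd.DataFrame(
    {"a": [0.0, 1.5, np.nan], "b": [np.nan, 0.0, 2.0]},
    index=["Apple", "Feta", "Sesame"],
)

# NaN > 0 evaluates to False, so missing values look like zeros here
positive_only = toy > 0

# Encode three states instead: -1 = unknown, 0 = zero, 1 = greater than zero
state = (toy > 0).astype(int).where(toy.notna(), -1)
print(state)
```

The resulting integer frame can be passed to the same heatmap call with a three-color scale.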

### Distribution of a certain parameter
 - distribution of the data: what are typical high/low values?
 - scale can matter when parameters cover small and large value ranges (calories vs. ash)
 - still, no reference values are used; the actual value range is to be shown
 - no negative/positive rating of the distribution

## TODO:
### Not understandable yet; show values above 0, perhaps as bars
### Meaningful description
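To see how much the value ranges differ before several parameters share one axis, the range per column can be compared. The sketch below uses hypothetical calorie and ash values, not DaPro data.

```python
import pandas as pd

# Hypothetical values, for illustration only
toy = pd.DataFrame(
    {
        "calories_kcal": [52.0, 264.0, 573.0, 239.0],
        "ash_g": [0.2, 5.2, 4.5, 1.0],
    }
)

# One row per parameter with its minimum, maximum and range
summary = toy.agg(["min", "max"]).T
summary["range"] = summary["max"] - summary["min"]

# How many times wider the calorie range is than the ash range
ratio = summary.loc["calories_kcal", "range"] / summary.loc["ash_g", "range"]
print(summary)
```

A ratio of this size suggests separate panels or axes per parameter, since a shared histogram axis would compress the smaller parameter into a single bin.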

In [None]:
df_dapro[df_dapro.columns[10:14]]

In [None]:
# # column = ['Eiweiß (Protein)']
# column = df_dapro.columns[10:14]
# # column = df_food.columns[23:27]

# plt.figure(figsize=(17, 10))
# sns.histplot(df_dapro[column], palette = sns.blend_palette(cm.oslo.colors, n_colors=6), multiple='dodge' )
# # plt.yticks(fontsize=20)
# # plt.xticks(fontsize=20)
# sns.despine(left=True, bottom=True)

In [None]:
# column = ['calories [kcal]', 'EuEmkg']

# plt.figure(figsize=(17, 10))
# sns.kdeplot(df_food[column], palette = cm.davos.colors)
# # plt.yticks(fontsize=20)
# # plt.xticks(fontsize=20)
# sns.despine(left=True, bottom=True)

### Bubbles with data
Only the number of data sources, records and parameters


## TODO:
### combine with the overview of data sources

In [None]:
import numpy as np
import matplotlib.pyplot as plt

db_characterise = {
    'dimensions': ['food products: {}'.format(len(df_dapro.index)), 
                   'parameter : {}'.format(len(df_dapro.columns)), 'data sources: {}'.format(5)],
    'count': [len(df_dapro.index), len(df_dapro.columns), 2],
    'color': (sns.blend_palette(cm.oslo.colors, n_colors=5)[-4:])
}


class BubbleChart:
    def __init__(self, area, bubble_spacing=0):
        """
        Setup for bubble collapse.

        Parameters
        ----------
        area : array-like
            Area of the bubbles.
        bubble_spacing : float, default: 0
            Minimal spacing between bubbles after collapsing.

        Notes
        -----
        If "area" is sorted, the results might look weird.
        """
        area = np.asarray(area)
        r = np.sqrt(area / np.pi)

        self.bubble_spacing = bubble_spacing
        self.bubbles = np.ones((len(area), 4))
        self.bubbles[:, 2] = r
        self.bubbles[:, 3] = area
        self.maxstep = 2 * self.bubbles[:, 2].max() + self.bubble_spacing
        self.step_dist = self.maxstep / 2

        # calculate initial grid layout for bubbles
        length = np.ceil(np.sqrt(len(self.bubbles)))
        grid = np.arange(length) * self.maxstep
        gx, gy = np.meshgrid(grid, grid)
        self.bubbles[:, 0] = gx.flatten()[:len(self.bubbles)]
        self.bubbles[:, 1] = gy.flatten()[:len(self.bubbles)]

        self.com = self.center_of_mass()

    def center_of_mass(self):
        return np.average(
            self.bubbles[:, :2], axis=0, weights=self.bubbles[:, 3]
        )

    def center_distance(self, bubble, bubbles):
        return np.hypot(bubble[0] - bubbles[:, 0],
                        bubble[1] - bubbles[:, 1])

    def outline_distance(self, bubble, bubbles):
        center_distance = self.center_distance(bubble, bubbles)
        return center_distance - bubble[2] - \
            bubbles[:, 2] - self.bubble_spacing

    def check_collisions(self, bubble, bubbles):
        distance = self.outline_distance(bubble, bubbles)
        return len(distance[distance < 0])

    def collides_with(self, bubble, bubbles):
        distance = self.outline_distance(bubble, bubbles)
        idx_min = np.argmin(distance)
        return idx_min if type(idx_min) == np.ndarray else [idx_min]

    def collapse(self, n_iterations=50):
        """
        Move bubbles to the center of mass.

        Parameters
        ----------
        n_iterations : int, default: 50
            Number of moves to perform.
        """
        for _i in range(n_iterations):
            moves = 0
            for i in range(len(self.bubbles)):
                rest_bub = np.delete(self.bubbles, i, 0)
                # try to move directly towards the center of mass
                # direction vector from bubble to the center of mass
                dir_vec = self.com - self.bubbles[i, :2]

                # shorten direction vector to have length of 1
                dir_vec = dir_vec / np.sqrt(dir_vec.dot(dir_vec))

                # calculate new bubble position
                new_point = self.bubbles[i, :2] + dir_vec * self.step_dist
                new_bubble = np.append(new_point, self.bubbles[i, 2:4])

                # check whether new bubble collides with other bubbles
                if not self.check_collisions(new_bubble, rest_bub):
                    self.bubbles[i, :] = new_bubble
                    self.com = self.center_of_mass()
                    moves += 1
                else:
                    # try to move around a bubble that you collide with
                    # find colliding bubble
                    for colliding in self.collides_with(new_bubble, rest_bub):
                        # calculate direction vector
                        dir_vec = rest_bub[colliding, :2] - self.bubbles[i, :2]
                        dir_vec = dir_vec / np.sqrt(dir_vec.dot(dir_vec))
                        # calculate orthogonal vector
                        orth = np.array([dir_vec[1], -dir_vec[0]])
                        # test which direction to go
                        new_point1 = (self.bubbles[i, :2] + orth *
                                      self.step_dist)
                        new_point2 = (self.bubbles[i, :2] - orth *
                                      self.step_dist)
                        dist1 = self.center_distance(
                            self.com, np.array([new_point1]))
                        dist2 = self.center_distance(
                            self.com, np.array([new_point2]))
                        new_point = new_point1 if dist1 < dist2 else new_point2
                        new_bubble = np.append(new_point, self.bubbles[i, 2:4])
                        if not self.check_collisions(new_bubble, rest_bub):
                            self.bubbles[i, :] = new_bubble
                            self.com = self.center_of_mass()

            if moves / len(self.bubbles) < 0.1:
                self.step_dist = self.step_dist / 2

    def plot(self, ax, labels, colors):
        """
        Draw the bubble plot.

        Parameters
        ----------
        ax : matplotlib.axes.Axes
        labels : list
            Labels of the bubbles.
        colors : list
            Colors of the bubbles.
        """
        for i in range(len(self.bubbles)):
            circ = plt.Circle(
                self.bubbles[i, :2], self.bubbles[i, 2], color=colors[i])
            ax.add_patch(circ)
            ax.text(*self.bubbles[i, :2], labels[i],
                    horizontalalignment='center', verticalalignment='center', 
                   color = 'black', fontsize = 18)


bubble_chart = BubbleChart(area=db_characterise['count'],
                           bubble_spacing=0.1)

bubble_chart.collapse()

fig, ax = plt.subplots(subplot_kw=dict(aspect="equal"))
bubble_chart.plot(
    ax, db_characterise['dimensions'], db_characterise['color'])
ax.axis("off")
ax.relim()
ax.autoscale_view()
ax.set_title('Characteristics of the DaPro database: Count of food products, parameters and data sources')

plt.show()
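The `BubbleChart` class derives each radius from the area via `r = sqrt(area / pi)`, so the count is encoded by bubble area and not by diameter. A short check of that relationship, using hypothetical counts:

```python
import numpy as np

# Hypothetical counts, for illustration only
counts = np.array([100.0, 400.0, 25.0])

# Same formula as in BubbleChart.__init__
radii = np.sqrt(counts / np.pi)

# Recomputing the circle areas gives back the counts
areas = np.pi * radii**2

# A count that is four times larger only doubles the radius
radius_ratio = radii[1] / radii[0]
print(radii.round(3), radius_ratio)
```

Area encoding avoids the visual exaggeration that would result from scaling the radius linearly with the count.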

In [None]:
# !pip install pyvis

from pyvis.network import Network

net = Network(notebook=True, directed = True,
              heading = 'Characteristics of the DaPro database: Count of food products, parameters and data sources')

count_sources = 8

# build the labels first so that the color slice does not depend on an earlier cell
dimensions = ['food products: {}'.format(len(df_dapro.index)),
              'parameter : {}'.format(len(df_dapro.columns)),
              'data sources: {}'.format(count_sources)]

db_characterise = {
    'dimensions': dimensions,
    'count': [len(df_dapro.index), len(df_dapro.columns), count_sources],
    'color': oslo_rgb[1: 1 + len(dimensions)]
}


net.add_nodes(range(len(db_characterise['dimensions'])), 
              label= db_characterise['dimensions'],
              
              size=[ len(df_dapro.index), len(df_dapro.columns), count_sources],
              
              color=db_characterise['color'])



net.toggle_physics(True)
net.show('bubbles.html')

#### Add database structure and sources

In [None]:
# !pip install pyvis

from pyvis.network import Network

net = Network(notebook=True, directed = True,
                heading = 'Structure of the database sources: international food databases are collected in the Swiss Food Data Mediator and directed to DaPro.')



net.add_nodes(range(count_sources+2), 
              label=['DaPro', 'Swiss Food Data Mediator', 'USDA FoodData Central', 'BLSDB', 'Schweizer Nährwertdatenbank', 'FOODON', 
                     'ecoinvent', 'IUNR-DB', 'Recipes', 'Scientific Studies'],
      
              
              title=[ 'https://dapro.ulozezoz.myhostpoint.ch/webvowl/#', 'Link Swiss Food Data Mediator', 
                    'https://fdc.nal.usda.gov/','https://www.blsdb.de/', 
                     'https://naehrwertdaten.ch/de/', 'https://foodon.org/', 
                     'https://ecoinvent.org/', 'https://www.zhaw.ch/en/lsfm/institutes-centres/iunr/', 
                    '?', '?'],
              
              color=[oslo_rgb[5], 
                      oslo_rgb[3], 
                     oslo_rgb[1], oslo_rgb[1], oslo_rgb[1], oslo_rgb[1], oslo_rgb[1], oslo_rgb[1], 
                     oslo_rgb[9], oslo_rgb[9]],
             )

net.add_edges([(1,0,4), (2, 1,1),  (3, 1,1), (4, 1,1), (5, 1,1), (6, 1,1), (7, 1,1), (8, 1,1), (9, 1,1)])

net.toggle_physics(True)
net.show('mygraph.html')
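The edge list above is written out by hand. As a sketch, the same triples can be generated from `count_sources`, assuming node 0 is DaPro, node 1 is the Swiss Food Data Mediator and the remaining nodes are the sources, as in the label list above. The constant names `DAPRO` and `MEDIATOR` are hypothetical.

```python
count_sources = 8  # value used earlier in this notebook

# Node ids as implied by the order of the labels above
DAPRO, MEDIATOR = 0, 1
source_ids = range(2, 2 + count_sources)

# (from, to, weight): the mediator feeds DaPro, every source feeds the mediator
edges = [(MEDIATOR, DAPRO, 4)] + [(src, MEDIATOR, 1) for src in source_ids]
print(edges)
```

The resulting list can be passed to `net.add_edges(edges)` and stays consistent if the number of sources changes.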