# Network Graph

Uses the dataset ../../Data/02ParaLimpiar/02desastres_paralimpiar.csv

## Objective

### Map relationships between countries
Group countries that share a Seq value to identify patterns and frequencies in disaster types.
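
As a minimal sketch of this grouping, assuming a toy frame that holds only the `Seq` and `Country` columns (the rows below are invented for illustration and are not taken from the dataset), countries that share a `Seq` can be collected per group and then counted as pairs:

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Toy data, invented for illustration only (not rows from the real dataset)
toy = pd.DataFrame({
    'Seq': [1, 1, 1, 2, 2, 3, 4, 4],
    'Country': ['Chile', 'Peru', 'Chile', 'Peru', 'Japan', 'Japan', 'Peru', 'Chile'],
})

# Unique, alphabetically sorted countries for each Seq value
countries_by_seq = {
    seq: sorted(set(group))
    for seq, group in toy.groupby('Seq')['Country']
}

# Count how many Seq values each pair of countries has in common
pair_counts = Counter()
for countries in countries_by_seq.values():
    pair_counts.update(combinations(countries, 2))

print(countries_by_seq)
print(pair_counts.most_common())
```

Sorting the countries before pairing keeps each pair in a single canonical order, so `('Chile', 'Peru')` and `('Peru', 'Chile')` are not counted separately.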

## Import libraries

In [1]:
import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 20)
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
import plotly.graph_objects as go  # required by go.Scatter and go.Figure in the network graph cell

## Load data

In [2]:
df = pd.read_csv('../../Data/02ParaLimpiar/02desastres_paralimpiar.csv', encoding='utf-8', delimiter=';', engine='python')

# Verify dataset load

## Basic summary: shape, info

In [3]:
df.shape
# Result: 16636 rows and 20 columns

(16636, 20)

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16636 entries, 0 to 16635
Data columns (total 20 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Dis No             16636 non-null  object 
 1   Seq                16636 non-null  int64  
 2   Disaster Subgroup  16636 non-null  object 
 3   Disaster Type      16636 non-null  object 
 4   Disaster Subtype   13313 non-null  object 
 5   Country            16636 non-null  object 
 6   ISO                16636 non-null  object 
 7   Region             16636 non-null  object 
 8   Continent          16636 non-null  object 
 9   Location           14825 non-null  object 
 10  Origin             4085 non-null   object 
 11  Dis Mag Value      5064 non-null   float64
 12  Dis Mag Scale      15416 non-null  object 
 13  Latitude           2775 non-null   object 
 14  Longitude          2775 non-null   object 
 15  Start Year         16636 non-null  int64  
 16  Start Month        162

## Columns

In [5]:
df.columns

Index(['Dis No', 'Seq', 'Disaster Subgroup', 'Disaster Type',
       'Disaster Subtype', 'Country', 'ISO', 'Region', 'Continent', 'Location',
       'Origin', 'Dis Mag Value', 'Dis Mag Scale', 'Latitude', 'Longitude',
       'Start Year', 'Start Month', 'End Year', 'End Month', 'CPI'],
      dtype='object')

# Pivot table: Seq, Country, Disaster Type, Disaster Subtype, Dis Mag Scale, and Dis Mag Value

Relate each sequence number to its country or countries and to the disaster type, subtype, and magnitude.

In [6]:
# # Filter the original dataset
# filtered_data = df[['Seq', 'Country', 'Disaster Type', 'Disaster Subtype', 'Dis Mag Scale', 'Dis Mag Value']]

# # Create the main pivot table; 'Disaster Type' is the aggregated value, matching the output shown below
# pivot_seq_country_distype_mag = pd.pivot_table(filtered_data, values='Disaster Type', index=['Seq', 'Country', 'Disaster Subtype', 'Dis Mag Scale', 'Dis Mag Value'], aggfunc=lambda x: ', '.join(x))

# # Show the main pivot table
# print(pivot_seq_country_distype_mag)

                                                                                           Disaster Type
Seq  Country           Disaster Subtype                 Dis Mag Scale Dis Mag Value                     
1    Australia         Land fire (Brush, Bush, Pasture) Km2            1000.0                   Wildfire
                                                                       8000.0                   Wildfire
     Azores Islands    Ground movement                  Richter        7.0                    Earthquake
     Bangladesh        Cold wave                        ?C            -4.0           Extreme temperature
     Chile             Ground movement                  Richter        8.0                    Earthquake
...                                                                                                  ...
9557 China             Drought                          Km2            60000.0                   Drought
9564 China             Drought                         

### Network graph

In [15]:
# Create a new graph
G = nx.Graph()

# Add an edge between each country and its (disaster type, disaster subtype) pair.
# Missing subtypes are replaced with a placeholder because NaN values can
# produce duplicate nodes ('Unknown' is an assumed label, not a dataset value).
for index, row in df.iterrows():
    countries = row['Country'].split(", ")
    disaster_type = row['Disaster Type']
    disaster_subtype = row['Disaster Subtype'] if pd.notna(row['Disaster Subtype']) else 'Unknown'
    for country in countries:
        G.add_edge(country, (disaster_type, disaster_subtype))

# Define node positions with the spring layout algorithm
pos = nx.spring_layout(G)

# Create the node trace
node_trace = go.Scatter(
    x=[pos[node][0] for node in G.nodes()],
    y=[pos[node][1] for node in G.nodes()],
    mode='markers',
    marker=dict(
        size=10,
        color='skyblue'
    ),
    hoverinfo='text',
    text=[str(node) for node in G.nodes()]  # tuple nodes are shown as strings on hover
)

# Create the edge trace: each edge contributes both endpoints followed by None,
# so that Plotly draws one separate segment per edge
edge_x, edge_y = [], []
for u, v in G.edges():
    edge_x += [pos[u][0], pos[v][0], None]
    edge_y += [pos[u][1], pos[v][1], None]

edge_trace = go.Scatter(
    x=edge_x,
    y=edge_y,
    line=dict(width=0.5, color='gray'),
    hoverinfo='none',
    mode='lines'
)

# Create the figure
fig = go.Figure(data=[edge_trace, node_trace],
                layout=go.Layout(
                    title=dict(text='Network Graph of Relationships Between Countries',
                               font=dict(size=16)),
                    showlegend=False,
                    hovermode='closest',
                    margin=dict(b=20, l=5, r=5, t=40),
                    xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                    yaxis=dict(showgrid=False, zeroline=False, showticklabels=False)
                ))

# Show the figure
fig.show()
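
The graph in this cell links countries to (disaster type, disaster subtype) pairs rather than to each other, so a country-to-country view needs one more step. One possible sketch, assuming a small invented edge list and using NetworkX's bipartite projection (a technique swapped in here for illustration, not something this notebook runs), weights each pair of countries by the number of disaster categories they share:

```python
import networkx as nx
from networkx.algorithms import bipartite

# Toy bipartite graph, invented for illustration only:
# countries on one side, (disaster type, disaster subtype) pairs on the other
countries = ['Chile', 'Peru', 'Japan']
edges = [
    ('Chile', ('Earthquake', 'Ground movement')),
    ('Peru', ('Earthquake', 'Ground movement')),
    ('Japan', ('Earthquake', 'Ground movement')),
    ('Chile', ('Flood', 'Riverine flood')),
    ('Peru', ('Flood', 'Riverine flood')),
]

B = nx.Graph()
B.add_nodes_from(countries, bipartite=0)
B.add_edges_from(edges)

# Project onto the country side; the edge weight is the number of shared categories
P = bipartite.weighted_projected_graph(B, countries)

for u, v, weight in P.edges(data='weight'):
    print(u, v, weight)
```

With the real data, the list of country nodes would have to be built from the `Country` column before projecting; the projected graph can then be drawn with the same node and edge traces used in the cell above.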


# Conclusions

# Recommendations

# Save dataset to CSV

In [10]:
# df.to_csv('../../Data/02ParaLimpiar/02desastres_networkgraph.csv', index=False, sep=';', encoding='utf-8')