# Social Media Data Explorer

This notebook sets up a few interactive apps for exploring social media data.

<div class="alert alert-warning" role="alert">
  This notebook uses a slightly newer version of `social_media` with improved `num_persons` extraction.
</div>

## Prerequisites

### Imports

In [1]:
from apps.table_apps import table_app
from apps.geovis_apps import geo_vis_cluster_app, geo_vis_shapes_app
from IPython.display import display, HTML
import os
import pandas as pd
import numpy as np
import geopandas as gpd

### Styling

In [2]:
display(HTML("<style>.container { width:80% !important; }</style>"))
# None disables column-width truncation (-1 is deprecated since pandas 1.0)
pd.set_option('display.max_colwidth', None)

## Data

In [3]:
data_base_path = 'data/'

### Tweets

In [4]:
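import ast
tweet_df = pd.read_csv(os.path.join(data_base_path, 'HUMAN_public_social_media.csv'))
# '[]' marks an empty list in the CSV; treat it as missing
tweet_df = tweet_df.set_index('uniqueid', drop=True).replace('[]', np.nan)
# Parse stringified lists safely (literal_eval only accepts Python literals, unlike eval)
tweet_df.loc[~tweet_df.media.isna(), 'media'] = tweet_df.media.dropna().apply(ast.literal_eval)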
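
The `media` column arrives as stringified Python lists, which is why it needs a parsing step after loading. The following is a minimal sketch of that step on a toy CSV, assuming the same `uniqueid` and `media` column names; the rows are made up for illustration, and `parse_media` is a hypothetical helper, not part of the notebook's `apps` package. It uses `ast.literal_eval`, which only accepts Python literals.

```python
import ast
import io

import numpy as np
import pandas as pd

# Toy CSV standing in for the real file; the rows are made up for illustration.
csv_text = (
    "uniqueid,text,media\n"
    "a1,first tweet,\"['https://example.org/1.jpg']\"\n"
    "a2,second tweet,[]\n"
)

df = pd.read_csv(io.StringIO(csv_text)).set_index("uniqueid")
# Empty lists are stored as the string '[]'; mark them as missing.
df = df.replace("[]", np.nan)


def parse_media(value):
    """Hypothetical helper: turn a stringified list into a list, pass NaN through."""
    return ast.literal_eval(value) if isinstance(value, str) else value


df["media"] = df["media"].apply(parse_media)
print(df["media"])
```

Rebuilding the whole column, rather than assigning into a subset of rows, sidesteps dtype issues when list objects are written into a column that was read as strings.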

### UNHCR

In [5]:
unhcr_df = pd.read_csv('http://popstats.unhcr.org/en/asylum_seekers_monthly.csv', skiprows=3, low_memory=False)
# '*' is mapped to 0 here; rows without a positive count are then dropped
unhcr_df['num_persons'] = unhcr_df.Value.replace('*', 0).astype(int)
unhcr_df = unhcr_df[unhcr_df.num_persons > 0].copy()  # copy avoids SettingWithCopyWarning below

unhcr_df['_timestamp'] = pd.to_datetime(unhcr_df['Year'].astype(str) + '-' + unhcr_df['Month'])

unhcr_df = unhcr_df.rename(columns={'Country / territory of asylum/residence': 'destination', 'Origin': 'origin'})
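
To make the cleaning steps easier to follow without downloading the full file, here is a small sketch that applies the same transformations to a hand-written frame. The rows are invented and are not real UNHCR figures. It assumes that `Month` holds full English month names (hence the explicit `%Y-%B` format), and it replaces `*` with the string `'0'` rather than the integer `0` so the column stays homogeneous before the cast.

```python
import pandas as pd

# Made-up rows that mimic the column layout used above; not real UNHCR data.
sample = pd.DataFrame({
    "Country / territory of asylum/residence": ["Germany", "France", "Italy"],
    "Origin": ["Syrian Arab Rep.", "Iraq", "Nigeria"],
    "Year": [2015, 2015, 2016],
    "Month": ["January", "February", "March"],
    "Value": ["120", "*", "45"],
})

# Map '*' to zero, cast to int, and keep only rows with a positive count.
sample["num_persons"] = sample["Value"].replace("*", "0").astype(int)
sample = sample[sample["num_persons"] > 0].copy()

# Build a timestamp for the first day of each month.
# Assumption: Month contains full English month names such as "January".
sample["_timestamp"] = pd.to_datetime(
    sample["Year"].astype(str) + "-" + sample["Month"], format="%Y-%B"
)

# Shorten the verbose column names to the ones the apps are given above.
sample = sample.rename(columns={
    "Country / territory of asylum/residence": "destination",
    "Origin": "origin",
})
print(sample[["origin", "destination", "num_persons", "_timestamp"]])
```

The row with `*` is dropped by the filter, so only two rows remain after cleaning.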

In [6]:
country_shapes = gpd.read_file(os.path.join(data_base_path, 'country_shapes.geojson'))

## Table App

An app for exploring the raw data.

In [7]:
table_app(tweet_df)

## GeoVis App Shapes

Shows origin and destination as two separate map layers.

### Tweets

In [8]:
geo_vis_shapes_app(tweet_df, shp_folder=data_base_path)

### UNHCR

In [9]:
geo_vis_shapes_app(unhcr_df, nuts_shapes=country_shapes)

## GeoVis App Clusters

Individual tweets are clustered on the map; tooltips show information about each tweet (customizable below).

In [None]:
geo_vis_cluster_app(tweet_df)