# osiris

![img](https://dm2301files.storage.live.com/y4mmRC1xelS6Y6MEqUnZ-k2vjpADHpo6UMZAaZWROunr9-Ml5FYDlZ6WMxCGedy7NDhwDpusZdF5E1oLR5Qn6momydHe7tYUOMwNeFeGW7pUWkBjGPSnZp2sacYWs9IKkose6xjhSySL_v2tbfItRI7T_Pw_Tayhaa2F_vrwW6ucyr6WPa6s9DWH_if9Y5Y3yAU?width=375&height=250&cropmode=none)


osiris is a Python data processing and analysis environment for computational conflict forecasting. It combines very large datasets with graph-based methods, models, and visualization, and is powered by scalable graph databases.

You can use osiris to analyze causal chains and networks of conflict and violence around the world using [automatically-encoded political event data](https://parusanalytics.com/eventdata/papers.dir/Schrodt_Yonamine_NewDirectionsInText.pdf) that is updated in real time by projects like GDELT. This notebook gives an overview of the osiris project and the [GDELT project](https://www.gdeltproject.org/) data that osiris uses. It shows how to import political event data with osiris, either from the GDELT file server or from Google BigQuery, how to visualize and analyze it using Python, and how to load it into a TigerGraph graph server instance so that graph-centric queries can efficiently retrieve vertex-edge event data for further analysis.

## Notebook Environment Setup

In [1]:
import os, sys
# Check if running inside Colab or Kaggle
IN_COLAB = 'COLAB_GPU' in os.environ
IN_KAGGLE = 'KAGGLE_KERNEL_RUN_TYPE' in os.environ
IN_HOSTED_NB = IN_COLAB or IN_KAGGLE
os.environ['IN_HOSTED_NB'] = str(IN_HOSTED_NB)

OS_NAME = sys.platform.upper()
if OS_NAME in ['LINUX', 'DARWIN'] and IN_HOSTED_NB:
    import subprocess
    print('Installing osiris from GitHub...')
    print(subprocess.run('if [ -d "osiris" ]; then rm -Rf osiris; fi', text=True, shell=True, check=True, capture_output=True).stdout)
    print(subprocess.run('git clone https://github.com/allisterb/osiris --recurse-submodule', text=True, shell=True, check=True, capture_output=True).stdout)
    print(subprocess.run('cd osiris && ./install', text=True, shell=True, check=True, capture_output=True).stdout)
    if IN_COLAB:
        print('Installing colab-env which can pull env variable values from a file called vars.env on your GDrive.')
        print(subprocess.run('pip install colab-env --upgrade', text=True, shell=True, check=True, capture_output=True).stdout)

# If we're not in a hosted nb env assume we're running Jupyter from the osiris project directory root
OSIRIS_PATH = '..' if not IN_HOSTED_NB else 'osiris'

# Import the osiris code and set the runtime env. 
sys.path.append(os.path.join(OSIRIS_PATH, 'osiris'))
sys.path.append(os.path.join(OSIRIS_PATH, 'ext'))
from osiris_global import set_runtime_env
set_runtime_env(interactive_nb=True)

## GDELT Event Data

*From the [GDELT project](https://www.gdeltproject.org/) website*:
>The GDELT Project is a realtime network diagram and database of global human society for open research.
![gf](https://www.gdeltproject.org/images/spinningglobe.gif)

>The GDELT Project is an initiative to construct a catalog of human societal-scale behavior and beliefs across all countries of the world, connecting every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what's happening around the world, what its context is and who's involved, and how the world is feeling about it, every single day.

The GDELT [event data](http://data.gdeltproject.org/documentation/GDELT-Event_Codebook-V2.0.pdf) contains hundreds of millions of automatically coded events extracted from news stories daily using NLU methods and models. Each event data row contains the following fields:
1. *Actors*: People, organizations, or states that initiate or are the target of event actions. Actors may have geographic attributes but no temporal ones. An event references exactly two actors: Actor1 and Actor2.
2. *Actions*: Codes and other information that describe each event. Actions have both temporal and spatial attributes: an event time plus geographic information like latitude and longitude. Actors and actions naturally form graphs with directed edges connecting Actor1 --> Action --> Actor2. An Actor-Action edge may carry attributes like the event time, and may have a complementary reverse edge to make querying easier, e.g. Actor1 --event1_date--> Action1 --event1_date--> Actor2 --event2_date--> Action2 --event2_date--> Actor3
3. *SourceURL*: a URL that locates the *story* from which the event data was extracted.
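
The Actor1 --> Action --> Actor2 structure described in the list above can be sketched in plain Python. This is only an illustration: the rows are hand-written stand-ins for a few GDELT fields, and `event_edges` and `follow` are hypothetical helpers, not part of osiris.

```python
from collections import defaultdict

# Hypothetical, hand-written rows that mimic a few GDELT event fields.
events = [
    {"GLOBALEVENTID": 1, "Actor1Code": "RUS", "Actor2Code": "UKR",
     "EventCode": "190", "SQLDATE": 20220414},
    {"GLOBALEVENTID": 2, "Actor1Code": "UKR", "Actor2Code": "USA",
     "EventCode": "040", "SQLDATE": 20220415},
]

def event_edges(rows):
    """Yield (source, destination, date) edges: Actor1 -> Action -> Actor2."""
    for row in rows:
        action = f"event:{row['GLOBALEVENTID']}"
        yield (row["Actor1Code"], action, row["SQLDATE"])
        yield (action, row["Actor2Code"], row["SQLDATE"])

edges = list(event_edges(events))

# Adjacency list keyed by source node, for simple traversal.
adjacency = defaultdict(list)
for src, dest, date in edges:
    adjacency[src].append((dest, date))

def follow(start, hops):
    """Walk the graph from `start`, always taking the first outgoing edge."""
    path = [start]
    node = start
    for _ in range(hops):
        if not adjacency.get(node):
            break
        node = adjacency[node][0][0]
        path.append(node)
    return path

path = follow("RUS", 4)
print(" -> ".join(path))
```

Each event contributes two directed edges that share the event date, which is what lets a chain of events be followed from one actor to the next.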

osiris can extract data directly from the GDELT file server. The advantage of this method is that you don't need any special credentials or server access (remember, we're interested in *open-source* indicators). All the data is downloaded directly to your client machine or notebook environment.

### Importing GDELT data from file server

osiris uses *DataSource* classes to manage importing tabular data. 

In [2]:
# Import data directly from GDELT file server
from data.gdelt import DataSource
import pandas as pd
gdelt = DataSource()

In [3]:
# Get event data for a 1 week period
events = gdelt.import_data('events', 'Apr-14-2022', 'Apr-20-2022')

Importing GDELT events data for 7 day(s) from 04-14-2022 to 04-20-2022...


Import GDELT events data:   0%|          | 0/7 [00:00<?, ?day/s]

Importing GDELT events data for 7 day(s) from 04-14-2022 to 04-20-2022 completed in 71.03 s.
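
The import above covers an inclusive range of days. As a rough sketch of how such a range can be expanded into individual days, the snippet below parses the two date strings and enumerates the days between them. The `days_in_range` helper and the `%b-%d-%Y` format are assumptions inferred from the call above; this is not how osiris necessarily implements it.

```python
from datetime import datetime, timedelta

def days_in_range(start, end, fmt="%b-%d-%Y"):
    """Return every date from start to end, inclusive (hypothetical helper)."""
    first = datetime.strptime(start, fmt).date()
    last = datetime.strptime(end, fmt).date()
    if last < first:
        raise ValueError("end date is before start date")
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]

days = days_in_range("Apr-14-2022", "Apr-20-2022")
print(len(days), "day(s) from", days[0].strftime("%m-%d-%Y"),
      "to", days[-1].strftime("%m-%d-%Y"))
```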


A week's worth of event data in 2022 consists of about 700K events and takes up about 340MB of RAM.
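
To get a feel for how this scales, here is a back-of-the-envelope estimate. It assumes memory grows linearly with the number of days and treats 1MB as 10^6 bytes; both are simplifications, and the row count is the one reported by `events.info()` for this import.

```python
ROWS_PER_WEEK = 707_186   # row count reported by events.info() for this import
RAM_MB_PER_WEEK = 340     # approximate figure quoted in the text

def estimate_ram_mb(days):
    """Linear extrapolation of RAM use from the one-week sample (assumption)."""
    return RAM_MB_PER_WEEK / 7 * days

bytes_per_row = RAM_MB_PER_WEEK * 1_000_000 / ROWS_PER_WEEK
year_gb = estimate_ram_mb(365) / 1000

print(f"~{bytes_per_row:.0f} bytes per event, ~{year_gb:.1f} GB for a year")
```

Under these assumptions a full year of events would not fit comfortably in the memory of a typical notebook environment, which is part of the motivation for moving the data into a graph database.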

In [4]:
events.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 707186 entries, 0 to 125669
Data columns (total 62 columns):
 #   Column                 Non-Null Count   Dtype  
---  ------                 --------------   -----  
 0   GLOBALEVENTID          707186 non-null  int64  
 1   SQLDATE                707186 non-null  int64  
 2   MonthYear              707186 non-null  int64  
 3   Year                   707186 non-null  int64  
 4   FractionDate           707186 non-null  float64
 5   Actor1Code             640700 non-null  object 
 6   Actor1Name             640700 non-null  object 
 7   Actor1CountryCode      408112 non-null  object 
 8   Actor1KnownGroupCode   9610 non-null    object 
 9   Actor1EthnicCode       3423 non-null    object 
 10  Actor1Religion1Code    10452 non-null   object 
 11  Actor1Religion2Code    2561 non-null    object 
 12  Actor1Type1Code        296023 non-null  object 
 13  Actor1Type2Code        19713 non-null   object 
 14  Actor1Type3Code        495 non-null 

In [5]:
events

Unnamed: 0,GLOBALEVENTID,SQLDATE,MonthYear,Year,FractionDate,Actor1Code,Actor1Name,Actor1CountryCode,Actor1KnownGroupCode,Actor1EthnicCode,...,ActionGeo_Type,ActionGeo_FullName,ActionGeo_CountryCode,ActionGeo_ADM1Code,ActionGeo_ADM2Code,ActionGeo_Lat,ActionGeo_Long,ActionGeo_FeatureID,DATEADDED,SOURCEURL
0,1039299549,20210414,202104,2021,2021.2849,IGOUNODEVWBK,WORLD BANK,,UNO,,...,1,Afghanistan,AF,AF,,33.0000,66.00000,AF,20220414011500,https://english.alaraby.co.uk/news/world-bank-...
1,1039299550,20210414,202104,2021,2021.2849,USA,UNITED STATES,USA,,,...,2,"Kansas, United States",US,USKS,,38.5111,-96.80050,KS,20220414011500,https://www.msn.com/en-us/news/us/wichita-wend...
2,1039299551,20210414,202104,2021,2021.2849,USA,UNITED STATES,USA,,,...,1,Afghanistan,AF,AF,,33.0000,66.00000,AF,20220414011500,https://english.alaraby.co.uk/news/world-bank-...
3,1039299552,20220407,202204,2022,2022.2658,FRAJUD,FRANCE,FRA,,,...,4,"Paris, France (general), France",FR,FR00,16282,48.8667,2.33333,-1456928,20220414011500,https://www.timesofisrael.com/two-men-in-polic...
4,1039299553,20220407,202204,2022,2022.2658,GOVMIL,DEFENSE SECRETARY,,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.51670,-1044367,20220414011500,https://www.msn.com/en-us/news/world/why-the-b...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
125665,1040383216,20220420,202204,2022,2022.3014,cre,CREE,,,cre,...,0,,,,,,,,20220420234500,https://www.cjvr.com/2022/04/20/first-nations-...
125666,1040383217,20220420,202204,2022,2022.3014,cre,CREE,,,cre,...,0,,,,,,,,20220420234500,https://www.cjvr.com/2022/04/20/first-nations-...
125667,1040383218,20220420,202204,2022,2022.3014,cre,CREE,,,cre,...,0,,,,,,,,20220420234500,https://www.cjvr.com/2022/04/20/first-nations-...
125668,1040383219,20220420,202204,2022,2022.3014,telOPP,TELUGU,,,tel,...,0,,,,,,,,20220420234500,https://www.deccanchronicle.com/nation/politic...


Event data is highly denormalized, with many redundancies for ease of querying, and is coded using a hierarchical coding system called [CAMEO](http://data.gdeltproject.org/documentation/CAMEO.Manual.1.1b3.pdf) (Conflict and Mediation Event Observations).

In [6]:
events[['EventCode', 'CAMEOCodeDescription']]

Unnamed: 0,EventCode,CAMEOCodeDescription
0,130,"Threaten, not specified below"
1,030,"Express intent to cooperate, not specified below"
2,130,"Threaten, not specified below"
3,090,"Investigate, not specified below"
4,040,"Consult, not specified below"
...,...,...
125665,060,"Engage in material cooperation, not spec below"
125666,073,Provide humanitarian aid
125667,090,"Investigate, not specified below"
125668,043,Host a visit
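
Because CAMEO codes are hierarchical, the leading digits of a code identify its broader category: the first two digits give the root code and the first three give the base code. The sketch below splits a few of the codes shown above into these levels. `cameo_levels` is a hypothetical helper and assumes codes are 3- or 4-digit strings with their leading zeros preserved.

```python
from collections import Counter

def cameo_levels(event_code: str) -> dict:
    """Split a CAMEO event code string into its hierarchy levels."""
    if not (event_code.isdigit() and len(event_code) in (3, 4)):
        raise ValueError(f"expected a 3- or 4-digit code string, got {event_code!r}")
    return {"root": event_code[:2], "base": event_code[:3], "full": event_code}

# Event codes copied from the output above.
codes = ["130", "030", "130", "090", "040", "060", "073", "090", "043"]

# Count events per root category, e.g. '04' covers both '040' and '043'.
root_counts = Counter(cameo_levels(code)["root"] for code in codes)
print(root_counts.most_common())
```

Keeping the codes as strings matters here: reading them as integers would drop the leading zero and make `030` indistinguishable from a two-digit value.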


We can query and filter event data directly using the Pandas dataframe.

In [7]:
# Find all events that were geolocated in Ukraine
uka_events = events[(events.ActionGeo_CountryCode == 'UP')]
uka_events

Unnamed: 0,GLOBALEVENTID,SQLDATE,MonthYear,Year,FractionDate,Actor1Code,Actor1Name,Actor1CountryCode,Actor1KnownGroupCode,Actor1EthnicCode,...,ActionGeo_Type,ActionGeo_FullName,ActionGeo_CountryCode,ActionGeo_ADM1Code,ActionGeo_ADM2Code,ActionGeo_Lat,ActionGeo_Long,ActionGeo_FeatureID,DATEADDED,SOURCEURL
4,1039299553,20220407,202204,2022,2022.2658,GOVMIL,DEFENSE SECRETARY,,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.5167,-1044367,20220414011500,https://www.msn.com/en-us/news/world/why-the-b...
5,1039299554,20220407,202204,2022,2022.2658,GOVMIL,DEFENSE SECRETARY,,,,...,4,"Donbas, Ukraine (general), Ukraine",UP,UP00,25090,48.5000,38.5000,-1038077,20220414011500,https://ktvz.com/news/2022/04/13/why-the-biden...
6,1039299555,20220407,202204,2022,2022.2658,GOVMIL,DEFENSE SECRETARY,,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.5167,-1044367,20220414011500,https://www.msn.com/en-us/news/world/why-the-b...
12,1039299561,20220407,202204,2022,2022.2658,UKR,UKRAINIAN,UKR,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.5167,-1044367,20220414011500,https://www.msn.com/en-us/news/world/why-the-b...
13,1039299562,20220407,202204,2022,2022.2658,UKRGOVMIL,UKRAINIAN,UKR,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.5167,-1044367,20220414011500,https://www.msn.com/en-us/news/world/why-the-b...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
125624,1040383175,20220420,202204,2022,2022.3014,USAGOV,THE WHITE HOUSE,USA,,,...,1,Ukraine,UP,UP,,49.0000,32.0000,UP,20220420234500,https://www.hellenicshippingnews.com/u-s-crude...
125625,1040383176,20220420,202204,2022,2022.3014,USAGOV,JOE BIDEN,USA,,,...,1,Ukraine,UP,UP,,49.0000,32.0000,UP,20220420234500,https://www.agassizharrisonobserver.com/news/t...
125648,1040383199,20220420,202204,2022,2022.3014,VAT,VATICAN,VAT,,,...,1,Ukraine,UP,UP,,49.0000,32.0000,UP,20220420234500,http://www.icatholic.org/article/christs-resur...
125651,1040383202,20220420,202204,2022,2022.3014,VAT,VATICAN,VAT,,,...,1,Ukraine,UP,UP,,49.0000,32.0000,UP,20220420234500,http://www.icatholic.org/article/christs-resur...


So about 50K of the 700K events in that week were coded as happening in Ukraine, which is not surprising given recent events. Many of those relate to the use of military force.

In [8]:
# CAMEO code 190 denotes 'Use conventional military force, not specified below'
uka_events[uka_events.EventCode.str.startswith('190')]

Unnamed: 0,GLOBALEVENTID,SQLDATE,MonthYear,Year,FractionDate,Actor1Code,Actor1Name,Actor1CountryCode,Actor1KnownGroupCode,Actor1EthnicCode,...,ActionGeo_Type,ActionGeo_FullName,ActionGeo_CountryCode,ActionGeo_ADM1Code,ActionGeo_ADM2Code,ActionGeo_Lat,ActionGeo_Long,ActionGeo_FeatureID,DATEADDED,SOURCEURL
99,1039299648,20220414,202204,2022,2022.2849,,,,,,...,4,"Donbas, Ukraine (general), Ukraine",UP,UP00,25090,48.5000,38.5000,-1038077,20220414011500,https://www.msn.com/en-us/news/world/why-the-b...
101,1039299650,20220414,202204,2022,2022.2849,,,,,,...,4,"Kharkiv, Kharkivs'ka Oblast', Ukraine",UP,UP07,25036,49.9808,36.2527,-1041320,20220414011500,https://www.winonadailynews.com/opinion/column...
117,1039299666,20220414,202204,2022,2022.2849,,,,,,...,4,"Odesa, Odes'ka Oblast, Ukraine",UP,UP17,28558,46.4639,30.7386,-1049092,20220414011500,https://www.thescottishsun.co.uk/news/8708802/...
844,1039300393,20220414,202204,2022,2022.2849,RUS,MOSCOW,RUS,,,...,4,"Kharkiv, Kharkivs'ka Oblast', Ukraine",UP,UP07,25036,49.9808,36.2527,-1041320,20220414011500,https://www.winonadailynews.com/opinion/column...
874,1039300423,20220414,202204,2022,2022.2849,RUS,RUSSIAN,RUS,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.5167,-1044367,20220414011500,https://jamestown.org/program/why-the-russian-...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
120903,1040373135,20220420,202204,2022,2022.3014,RUS,RUSSIAN,RUS,,,...,4,"Vadym, Khersons'ka Oblast', Ukraine",UP,UP08,28553,46.1827,33.5971,-1057325,20220420221500,https://www.news8000.com/i/elderly-in-ukraine-...
120904,1040373136,20220420,202204,2022,2022.3014,RUS,RUSSIAN,RUS,,,...,4,"Vadym, Khersons'ka Oblast', Ukraine",UP,UP08,28553,46.1827,33.5971,-1057325,20220420221500,https://www.news8000.com/i/elderly-in-ukraine-...
120985,1040373217,20220420,202204,2022,2022.3014,UKR,UKRAINE,UKR,,,...,4,"Chernihiv, Chernihivs'ka Oblast', Ukraine",UP,UP02,28554,51.5055,31.2849,-1037057,20220420221500,http://www.msn.com/en-us/news/world/a-bomb-sni...
121060,1040373292,20220420,202204,2022,2022.3014,USA,UNITED STATES,USA,,,...,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,28554,50.4333,30.5167,-1044367,20220420221500,http://www.msn.com/en-us/news/world/as-a-new-u...


In [9]:
# Import Folium to plot these military force events on a map
import folium
folium.Map(
    location=[48., 31.], 
    tiles="Stamen Toner",
    zoom_start=6
)

In [31]:
uka_map = folium.Map(
    location=[48., 31.], 
    #tiles="Stamen Toner",
    zoom_start=6
)
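# Sample at most 1000 military force events so the map stays responsive
force_events = uka_events[uka_events.EventCode.str.startswith('190')]
uka_events_sample = force_events.sample(n=min(1000, len(force_events)))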
for r in uka_events_sample.itertuples():
    m = folium.Marker(location=[r.ActionGeo_Lat, r.ActionGeo_Long],
                      icon=folium.Icon(color="red", icon="fire", prefix="glyphicon"),
                      tooltip=str(r.Actor1CountryCode) + '->' + str(r.EventCode) + ' ' +  str(r.CAMEOCodeDescription) + '->' + str(r.Actor2CountryCode) +' on ' + str(r.SQLDATE)
                     )
    m.add_to(uka_map)
uka_map

### Shaping tabular data into graph vertices and edges

The GDELT data schema is 'flat' and designed for ease of tabular querying and grouping. To support graph and network queries, the data needs to be shaped into vertices and edges.

In [32]:
from data.etl import shape_event_actor_vertices
events_vertices, actor1_vertices, actor2_vertices = shape_event_actor_vertices(uka_events_sample)

Hashing Actor1 ID:   0%|          | 0/1000 [00:00<?, ?row/s]

Hashing Actor2 ID:   0%|          | 0/1000 [00:00<?, ?row/s]

We create unique IDs for actors that can be linked to actions. We hash individual actor fields together to create a unique ID for each actor and then drop all the other actor fields from event data.
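
A minimal sketch of this kind of ID hashing is shown below. The 28-character base64 IDs in the output look consistent with a 20-byte digest such as SHA-1, but the hash function, the choice of fields, and the separator are all assumptions here; `actor_id` is a hypothetical helper, not the function osiris uses.

```python
import base64
import hashlib

def actor_id(*fields, sep="|"):
    """Hash actor fields into a short, deterministic ID (illustrative only)."""
    # Missing values become empty strings so the key is always well-defined.
    key = sep.join("" if f is None else str(f) for f in fields)
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")

# Hypothetical field selection: code, name, country code, location name.
id_a = actor_id("RUS", "RUSSIA", "RUS", "Kyiv, Kyyiv, Misto, Ukraine")
id_b = actor_id("RUS", "RUSSIA", "RUS", "Kyiv, Kyyiv, Misto, Ukraine")
id_c = actor_id("UKR", "UKRAINE", "UKR", None)

print(id_a == id_b, id_a != id_c, len(id_a))
```

The same combination of field values always maps to the same ID, so repeated mentions of an actor across events collapse onto one vertex.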

In [34]:
events_vertices

Unnamed: 0,ID,Actor1ID,Actor2ID,Date,IsRoot,MonthYear,Year,FractionDate,Actor1Religion1Code,Actor1Religion2Code,...,Actor2Geo_ADM1Code,Actor2Geo_ADM2Code,ActionGeo_Type,ActionGeo_FullName,ActionGeo_CountryCode,ActionGeo_Lat,ActionGeo_Long,ActionGeo_FeatureID,DATEADDED,SOURCEURL
15027,1039498719,8aKeckFX+kx+eDBeMh7i1NPAbwQ=,gDzOiI2s5+6AplUmSWWgsEjpQxA=,2022-04-15,True,202204,2022,2022.2877,,,...,UP12,28554,4,Kyiv; Kyyiv; Misto; Ukraine,UP,50.4333,30.5167,-1044367,20220415023000,https://www.stuff.co.nz/world/europe/300567107...
33105,1039905497,PdW2+wiyjAcN2y1x3ynxRHMp9QY=,tlifxqsNyCzxIJnRwtQKuZToQQw=,2022-04-18,False,202204,2022,2022.2959,,,...,,,4,Kherson; Khersons'ka Oblast'; Ukraine,UP,46.6558,32.6178,-1041356,20220418094500,https://www.thesun.co.uk/news/18294062/putin-l...
101133,1039631457,v16yoVRmoBvMvdCgoAvpcbLYG4k=,9zwz1m/hDQzpR2+pj4H5jILHHWo=,2022-04-15,False,202204,2022,2022.2877,,,...,RS48,25106,4,Kyiv; Kyyiv; Misto; Ukraine,UP,50.4333,30.5167,-1044367,20220415224500,https://www.swoknews.com/ap/international/poli...
34679,1040057622,TnOPld40XzwMsoOjVnMiJDikDRU=,pBuSWNEqPrpVycgVw3ALEfROE0I=,2022-04-19,True,202204,2022,2022.2986,,,...,CE,,1,Ukraine,UP,49.0000,32.0000,UP,20220419074500,https://www.msn.com/en-in/news/other/senas-rau...
31416,1039681775,gfQvMG0isaLqMb5oNqmSBjX0o6A=,GqhZ+kSheXH6rhGs0JPu69lg+cQ=,2022-04-16,True,202204,2022,2022.2904,,,...,RS48,25106,4,Donbas; Ukraine (general); Ukraine,UP,48.5000,38.5000,-1038077,20220416090000,https://www.jagonews24.com/en/international/ne...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
41010,1039815562,jjyTV/XkrwQ2Pgjl5nuy4JjnmhM=,cjLIHJ+7YlIT77u7A230gC6Z8Qs=,2022-04-17,False,202204,2022,2022.2932,,,...,UP07,25036,4,Kharkiv; Kharkivs'ka Oblast'; Ukraine,UP,49.9808,36.2527,-1041320,20220417143000,https://www.dailystar.co.uk/news/world-news/vl...
66469,1039385218,KwnR1KenGeUshKNw7RA2LYjTV80=,CvfHXZuN0rK0MU7Zbdp5eFvsmxA=,2022-04-14,False,202204,2022,2022.2849,,,...,UP12,28554,4,Kyiv; Kyyiv; Misto; Ukraine,UP,50.4333,30.5167,-1044367,20220414124500,https://www.spectator.co.uk/article/i-can-feel...
7052,1040208322,Rpzr0WcuTApnHvKAcmOowWFuPAs=,tlifxqsNyCzxIJnRwtQKuZToQQw=,2022-04-20,False,202204,2022,2022.3014,,,...,,,4,Azovstal; Ukraine (general); Ukraine,UP,47.0833,37.5833,-1034725,20220420014500,https://www.dailymail.co.uk/news/article-10733...
62959,1039955595,SCa95LBP/3qlPNMCgasW27F//LQ=,R+r2Y5pDoIsFbuYi+jG0JTeG+Rw=,2022-04-18,False,202204,2022,2022.2959,,,...,UP,,1,Ukraine,UP,49.0000,32.0000,UP,20220418163000,https://ussanews.com/2022/04/18/the-deferentials/


The actor information is now stored as separate entities that can be linked to actions.

In [35]:
actor1_vertices

Unnamed: 0,ActorID,ActorCode,ActorName,ActorCountryCode,ActorKnownGroupCode,ActorEthnicCode,ActorGeo_Type,ActorGeo_FullName,ActorGeo_CountryCode,ActorGeo_Lat,ActorGeo_Long,ActorGeo_ADMCode,ActorGeo_FeatureID
15027,8aKeckFX+kx+eDBeMh7i1NPAbwQ=,MED,LOCAL MEDIA,,,,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,50.4333,30.5167,UP1228554,-1044367
33105,PdW2+wiyjAcN2y1x3ynxRHMp9QY=,MIL,LIEUTENANT GENERAL,,,,4,"Moscow, Moskva, Russia",RS,55.7522,37.6156,RS4825106,-2960561
101133,v16yoVRmoBvMvdCgoAvpcbLYG4k=,MIL,WARSHIP,,,,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,50.4333,30.5167,UP1228554,-1044367
34679,TnOPld40XzwMsoOjVnMiJDikDRU=,UKR,UKRAINE,UKR,,,1,Sri Lanka,CE,7.0000,81.0000,,CE
31416,gfQvMG0isaLqMb5oNqmSBjX0o6A=,RUS,RUSSIA,RUS,,,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,50.4333,30.5167,UP1228554,-1044367
...,...,...,...,...,...,...,...,...,...,...,...,...,...
41010,jjyTV/XkrwQ2Pgjl5nuy4JjnmhM=,UKRGOV,UKRAINE,UKR,,,4,"Kharkiv, Kharkivs'ka Oblast', Ukraine",UP,49.9808,36.2527,UP0725036,-1041320
66469,KwnR1KenGeUshKNw7RA2LYjTV80=,REL,RELIGION,,,,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,50.4333,30.5167,UP1228554,-1044367
7052,Rpzr0WcuTApnHvKAcmOowWFuPAs=,RUS,RUSSIAN,RUS,,,4,"Kremlin, Moskva, Russia",RS,55.7522,37.6156,RS4825106,-2960561
62959,SCa95LBP/3qlPNMCgasW27F//LQ=,CUB,CUBA,CUB,,,1,Cuba,CU,22.0000,-79.5000,,CU


We can visualize this data using Graphistry. First, let's 'flatten' the graph schema so that we only have one type of node and one type of edge.

In [44]:
from data.etl import flatten_event_actor_vertices
nodes, edges = flatten_event_actor_vertices(events_vertices, actor1_vertices, actor2_vertices)
edges.to_csv('test_edges.txt')
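
Conceptually, flattening turns actors and events into a single node list and a single edge list. The sketch below shows the idea with plain dictionaries. The `node_id`, `src`, `dest`, and `date` names follow the Graphistry binding used in this notebook, but the `flatten` helper, the `type` and `label` fields, and the sample rows are hypothetical and may differ from what `flatten_event_actor_vertices` produces.

```python
def flatten(event_rows, actor_rows):
    """Merge event and actor vertices into one node list and one edge list."""
    nodes = {}
    for actor in actor_rows:
        # Keying by ID de-duplicates actors that appear in several events.
        nodes[actor["ActorID"]] = {"node_id": actor["ActorID"],
                                   "type": "actor", "label": actor["ActorName"]}
    edges = []
    for event in event_rows:
        event_node = str(event["ID"])
        nodes[event_node] = {"node_id": event_node,
                             "type": "event", "label": event["EventCode"]}
        # Actor1 -> event -> Actor2, each edge carrying the event date.
        edges.append({"src": event["Actor1ID"], "dest": event_node, "date": event["Date"]})
        edges.append({"src": event_node, "dest": event["Actor2ID"], "date": event["Date"]})
    return list(nodes.values()), edges

# Hypothetical sample rows with shortened IDs.
actors = [{"ActorID": "a1", "ActorName": "RUSSIA"},
          {"ActorID": "a2", "ActorName": "UKRAINE"},
          {"ActorID": "a1", "ActorName": "RUSSIA"}]
events_sample = [{"ID": 1039498719, "Actor1ID": "a1", "Actor2ID": "a2",
                  "EventCode": "190", "Date": "2022-04-15"}]

flat_nodes, flat_edges = flatten(events_sample, actors)
print(len(flat_nodes), "nodes,", len(flat_edges), "edges")
```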

In [18]:
# Start using Graphistry, you'll need GRAPHISTRY_USER and GRAPHISTRY_PASS env variables.
# Uncomment this to begin the authorization process for GDrive to use vars.env in Colab
# if IN_COLAB:
#    import colab_env

# If running from a local machine you can set these vars with the other osiris env variables
# If in a hosted nb env and not using vars.env or not in Colab you'll have to set it manually
# os.environ['GRAPHISTRY_USER'] = mygruser
# os.environ['GRAPHISTRY_PASS'] = mygrpass

assert 'GRAPHISTRY_USER' in os.environ and 'GRAPHISTRY_PASS' in os.environ
from graphistry import graphistry
graphistry.register(api=3, username=os.environ['GRAPHISTRY_USER'], password=os.environ['GRAPHISTRY_PASS'], protocol='https', server='hub.graphistry.com')
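
A bare `assert` gives little guidance when a credential is missing. One possible alternative is to check the variables up front and report exactly which ones are absent. `require_env` below is a hypothetical helper, demonstrated against a stand-in dictionary rather than real credentials.

```python
import os

def require_env(*names, environ=os.environ):
    """Return the values of the named variables, or fail listing the missing ones."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return tuple(environ[name] for name in names)

# Stand-in environment for illustration; in the notebook this would be os.environ.
fake_env = {"GRAPHISTRY_USER": "alice", "GRAPHISTRY_PASS": "not-a-real-password"}
user, password = require_env("GRAPHISTRY_USER", "GRAPHISTRY_PASS", environ=fake_env)
print("Credentials found for", user)
```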

In [30]:
# Plot UP events using Graphistry
g = graphistry.bind(source="src", destination="dest", edge_title="date", node="node_id")
g.edges(edges).nodes(nodes).plot()

In [37]:
events_vertices, actor1_vertices, actor2_vertices = shape_event_actor_vertices(events.sample(10000))

Hashing Actor1 ID:   0%|          | 0/10000 [00:00<?, ?row/s]

Hashing Actor2 ID:   0%|          | 0/10000 [00:00<?, ?row/s]

In [39]:
nodes, edges = flatten_event_actor_vertices(events_vertices, actor1_vertices, actor2_vertices)
g = graphistry.bind(source="src", destination="dest", edge_title="date", node="node_id")
g.edges(edges).nodes(nodes).plot()

ValueError: Failed to refresh token: HTTPSConnectionPool(host='hub.graphistry.com', port=443): Max retries exceeded with url: /api-token-auth/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000020E86F66460>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))