## Examples using HIF translators with HNX

Here we illustrate how to use HNX with HIF-formatted JSON objects.

In [None]:
import json
import warnings

import fastjsonschema
import matplotlib.pyplot as plt
import pandas as pd

In [None]:
warnings.simplefilter("ignore")

## Load schema and validator
The schema provides a complete description of the format and typing of an HIF JSON object.  
The validator is a function that raises an exception if the object it checks does not comply with the schema; otherwise it returns the object.

In [None]:
with open("../schemas/hif_schema_v0.1.0.json", "r") as f:
    schema = json.load(f)
validator = fastjsonschema.compile(schema)

In [None]:
schema

## Example from HyperNetX Toys

The LesMis data was derived from the [Stanford GraphBase](https://www-cs-faculty.stanford.edu/~knuth/sgb.html).

The hypergraph relates characters to the scenes they participate in. As scenes are indexed relative to a hierarchy, we index the hyperedges by a string of numeric indices referencing the Volume, Book, Chapter, and Scene.  

Characters are indexed by a two-letter symbol.  
Additional metadata is associated with each character, including the character's full name and description. We will incorporate this data into the hypergraph.  
Since no metadata is associated with the hyperedges, the HIF object will only include the incidences and nodes.
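
As a rough sketch of what that means for the file contents, the snippet below builds a tiny HIF-style dictionary by hand with only `incidences` and `nodes` records. The key names (`network-type`, `incidences`, `nodes`, `attrs`) follow our reading of the HIF schema, and the scene labels and the `FullName` attribute are invented for illustration, so treat the details as assumptions and check them against the schema.

```python
import json

# Hand-built, HIF-style object (illustrative values, not taken from the LesMis data).
toy_hif = {
    "network-type": "undirected",
    "incidences": [
        {"edge": "1.2.3.1", "node": "JV"},
        {"edge": "1.2.3.1", "node": "FN"},
        {"edge": "1.2.3.2", "node": "JV"},
    ],
    "nodes": [
        {"node": "JV", "attrs": {"FullName": "Jean Valjean"}},
        {"node": "FN", "attrs": {"FullName": "Fantine"}},
    ],
    # No "edges" key: there is no edge metadata to record.
}

# Recover each hyperedge's members from the incidence records.
members = {}
for record in toy_hif["incidences"]:
    members.setdefault(record["edge"], set()).add(record["node"])

# The object is plain JSON, so it round-trips through json.dumps/json.loads.
round_trip = json.loads(json.dumps(toy_hif))
```

The incidence records alone determine the hypergraph's structure; the `nodes` records only attach metadata to identifiers that already appear in the incidences.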

In [None]:
import hypernetx as hnx
from hypernetx.utils import toys

lesmis = toys.LesMis()
lm = lesmis.hypergraph_example()
lm.nodes.dataframe.head()

In [None]:
lm.incidences.dataframe.head()

In [None]:
hnx.info_dict(lm)

In [None]:
edges = lm.restrict_to_nodes(["FN"]).edges.items
lm_small = lm.restrict_to_edges(edges).collapse_nodes_and_edges(
    use_node_uids=["FN", "JV"], use_counts=True, return_counts=True
)
plt.title("Subhypergraph of LesMis")
plt.gcf().set_figheight(5)
hnx.draw(lm_small)

In [None]:
# Convert the hypergraph to HIF and save the JSON to the data directory
lesmis_hif = hnx.to_hif(lm, filename="data/lesmis.hif.json")

# The validator confirms that the object conforms to the HIF standard
output = validator(lesmis_hif)

print("metadata: ", output["metadata"], "\n")
print("network-type: ", output["network-type"])

In [None]:
## Retrieve the hypergraph from HIF
h = hnx.from_hif(filename="data/lesmis.hif.json")
hnx.info_dict(h)

In [None]:
h.nodes.dataframe.head()

## Example Publications Dataset

This dataset consists of open-source publications with the keyword "Hypergraph", collected from arXiv, bioRxiv, DBLP, and OSTI. In the hypergraph, hyperedges are publications and nodes are authors.

In [None]:
H = hnx.from_hif(filename="data/publications.hif.json")

In [None]:
H.nodes.dataframe

In [None]:
print("number of (nodes, edges):", H.shape)

In [None]:
hnx.draw(
    H,
    with_edge_labels=False,
    with_node_labels=False,
    node_radius=0.2,
    edges_kwargs={"lw": 0.5},
)
plt.show()

In [None]:
# Get the largest connected component of the hypergraph,
# measured by its number of incidences
Hs = list(H.s_component_subgraphs(s=1, return_singletons=False))
H_MC = max(Hs, key=lambda H_CC: len(H_CC.incidences.dataframe))
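
For intuition, here is a plain-Python sketch of the same idea on a toy hypergraph. It assumes that with `s=1` two hyperedges belong to the same component when they share at least one node, and it ranks components by incidence count as in the cell above. The `toy_edges` data and the helper name `one_components` are hypothetical, and this is not how HNX implements it.

```python
def one_components(edges):
    """Group hyperedges into components where linked edges share >= 1 node."""
    parent = {e: e for e in edges}

    def find(e):
        # Follow parent links to the component representative (with path halving).
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    # Union every edge with the first edge seen for each of its nodes.
    first_edge_for_node = {}
    for e, nodes in edges.items():
        for n in nodes:
            if n in first_edge_for_node:
                parent[find(e)] = find(first_edge_for_node[n])
            else:
                first_edge_for_node[n] = e

    components = {}
    for e in edges:
        components.setdefault(find(e), {})[e] = edges[e]
    return list(components.values())


# Toy data: publications (edges) and their authors (nodes).
toy_edges = {
    "p1": {"a", "b"},
    "p2": {"b", "c", "d"},
    "p3": {"x", "y"},
    "p4": {"z"},
}

components = one_components(toy_edges)
# Rank components by their number of incidences (node-edge pairs).
largest = max(components, key=lambda comp: sum(len(nodes) for nodes in comp.values()))
```

Here `p1` and `p2` share author `b`, so they form the largest component with five incidences, while `p3` and `p4` each stand alone.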

In [None]:
hnx.algorithms.homology_mod2.betti_numbers(H, k=1)

In [None]:
hnx.draw(
    H_MC,
    with_edge_labels=False,
    with_node_labels=True,
    node_radius=0.2,
    edges_kwargs={"lw": 0.5},
)
plt.show()

## Examples contributed from XGI data


### Contact High School
The Contact-High-School dataset was originally sourced from:
https://www.cs.cornell.edu/~arb/data/contact-high-school/

This example is already in JSON form, but not in the HIF standard. We construct an HNX hypergraph from the JSON, incorporating all data, then store it in HIF.

In [None]:
with open("data/contacts-high-school-not-hif.json", "r") as f:
    chs = json.load(f)
chs.keys()

In [None]:
### Create a nodes dataframe with all of the properties
chsnodes = pd.DataFrame(chs["nodes"])
chsnodes = chsnodes.set_index("id").reset_index()
chsnodes

In [None]:
## Create an incidences dataframe with timestamps included
chsinc = (
    pd.DataFrame(chs["hyperedges"])
    .reset_index()
    .rename(columns={"index": "edge", "interaction": "node"})
)
# One row per (edge, node) pair; explode keeps the original row label,
# which equals the edge id
df = chsinc["node"].explode().reset_index().rename(columns={"index": "edge"})
# Look up each incidence's timestamp by its edge id
df["time"] = df["edge"].map(chsinc["time"])
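
The reshaping above hinges on `explode`, which emits one row per list element and repeats the original row label. The sketch below shows the effect on made-up records; the `interaction` and `time` field names are assumed to match the source JSON. It uses `DataFrame.explode`, a one-step alternative to the `Series.explode` call above that carries the `time` column along.

```python
import pandas as pd

# Made-up hyperedge records in the assumed shape of the source JSON.
toy_hyperedges = [
    {"interaction": ["1", "2"], "time": 100},
    {"interaction": ["2", "3", "4"], "time": 120},
]

toy_inc = (
    pd.DataFrame(toy_hyperedges)
    .reset_index()
    .rename(columns={"index": "edge", "interaction": "node"})
)

# DataFrame.explode repeats the other columns (edge, time) for every list element.
toy_df = toy_inc.explode("node").reset_index(drop=True)[["edge", "node", "time"]]
```

Each of the two hyperedges becomes as many rows as it has members, giving five incidence rows in total.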

In [None]:
chsinc.head(n=10)

In [None]:
chshyp = hnx.Hypergraph(
    df, node_properties=chsnodes, name="contacts-high-school from XGI"
)

In [None]:
# %%time
# CPU times: user 27.7 s, sys: 259 ms, total: 27.9 s
# Wall time: 28 s

hif = hnx.to_hif(chshyp, filename="../tutorials/data/contacts_high_school.hif.json")
hif["metadata"]

In [None]:
hnx.info_dict(chshyp)

In [None]:
H = hnx.from_hif(filename="../tutorials/data/contacts_high_school.hif.json")
H.nodes.dataframe.head()

### e-coli

In [None]:
H = hnx.from_hif(filename="../tutorials/data/e-coli.json")
H.edges.dataframe.head()

In [None]:
H.name

### email-enron

In [None]:
H = hnx.from_hif(filename="../tutorials/data/email-enron.json")
H.name

In [None]:
H.nodes.dataframe.head()

In [None]:
H.edges.dataframe.head()

In [None]:
plt.hist(hnx.edge_size_dist(H), log=True)

In [None]:
plt.close("all")