how to replicate the HDO distribution plot? #2
Comments
Thanks for your interest! The following is our code for the HDO figure, for your information.

from os.path import expanduser
import matplotlib.font_manager as font_manager

# Linux Libertine font used for the title and tick labels.
fontpath = expanduser('~/.local/share/fonts/LinLibertine_R.ttf')
prop = font_manager.FontProperties(fname=fontpath)
from matplotlib.pyplot import MultipleLocator


def one(f):
    # Format a number with one decimal place.
    return '{:.1f}'.format(f)


def plot_HDO_distribution():
    import numpy as np
    import matplotlib.pyplot as plt

    def get_colors(length, i):
        if i == 0:
            return plt.cm.plasma(np.linspace(-0.5, 1, length))
        else:
            return plt.cm.cool(np.linspace(-0.5, 1, length))

    filepath = './results/distance_curv/icml23/dist_data_final/'
    for dim in [16, 64, 256]:
        for dataset in ['cora', 'citeseer', 'disease_nc', 'airport']:
            # Suffix 0 is HGCN, suffix 1 is ours.
            data0 = np.loadtxt(filepath + '{}/{}_{}/{}_HDO0.txt'.format(dataset, dataset, dim, dataset))
            data1 = np.loadtxt(filepath + '{}/{}_{}/{}_HDO1.txt'.format(dataset, dataset, dim, dataset))
            key_nodes0 = np.loadtxt(filepath + '{}/{}_{}/{}_d20.txt'.format(dataset, dataset, dim, dataset))
            key_nodes1 = np.loadtxt(filepath + '{}/{}_{}/{}_d21.txt'.format(dataset, dataset, dim, dataset))
            dist0 = np.expand_dims(data0, 1)
            dist1 = np.expand_dims(data1, 1)
            minvalue0, center0, meanvalue0, maxvalue0 = key_nodes0
            minvalue1, center1, meanvalue1, maxvalue1 = key_nodes1
            plt.xlim([0, 8.5])
            plt.ylim([0, 0.5])

            # Count the nodes in each distance bin of width `margin`.
            margin = 0.1
            freqs0 = []
            freqs1 = []
            for r in np.arange(0, 7, margin):
                freqs0.append(np.where((dist0 > r) & (dist0 < r + margin))[0].shape[0])
                freqs1.append(np.where((dist1 > r) & (dist1 < r + margin))[0].shape[0])
            # colors0 = get_colors(len(freqs0), 1)
            # colors1 = get_colors(len(freqs0), 0)

            # Plot the counts normalized to frequencies.
            plt.bar(np.arange(0, 7, margin), np.array(freqs0) / np.sum(freqs0), color="#1F77B4", edgecolor="white",
                    width=margin, alpha=0.8, label='HGCN')
            plt.bar(np.arange(0, 7, margin), np.array(freqs1) / np.sum(freqs1), color="#FE7E0D", edgecolor="white",
                    width=margin, alpha=0.9, label='Ours')
            plt.legend(loc='upper right', prop={'size': 15})

            row_labels = ['STATS', 'ROOT', 'MIN', 'MEAN', 'MAX']  # ROOT/ HC
            table_vals = [['HGCN', 'Ours'],
                          [one(center0), one(center1)],
                          [one(minvalue0), one(minvalue1)],
                          [one(meanvalue0), one(meanvalue1)],
                          [one(maxvalue0), one(maxvalue1)]
                          ]
            the_table = plt.table(cellText=table_vals, colWidths=[0.12] * 2,
                                  rowLabels=row_labels,
                                  colLoc='center', rowLoc='left', cellLoc='center',
                                  edges='closed',
                                  bbox=(0.20, 0.55, 0.27, 0.4))
            the_table.auto_set_font_size(False)
            the_table.set_fontsize(14)

            # Raw string so that \mathcal is not read as an escape sequence.
            plt.title('{}'.format(dataset.capitalize()) + r'($\mathcal{H}^{%d}$)' % dim, fontproperties=prop,
                      fontsize=20)
            plt.yticks(fontproperties=prop, size=20)
            plt.xticks(fontproperties=prop, size=20)
            ax = plt.gca()
            x_major_locator = MultipleLocator(1)
            ax.xaxis.set_major_locator(x_major_locator)
            plt.savefig('./results/icml2023/hdo/pdf/{}_{}.pdf'.format(dataset, dim), bbox_inches='tight', pad_inches=0)
            plt.clf()

The HDO is computed by:

if self.manifold.name == 'PoincareBall':
    d2 = self.manifold.dist0(embeddings, c=c).mean()
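For readers without access to the `manifold` object, the following is a minimal NumPy sketch of the distance from the origin in a Poincaré ball of curvature -c, using the standard closed form d(0, x) = (2/√c)·artanh(√c·‖x‖). It assumes that `manifold.dist0` follows this closed form; the helper names `poincare_dist0` and `hdo_summary` are hypothetical and not part of the repository. The ROOT entry of the table is not reproduced here, since the thread does not say how it is obtained.

```python
import numpy as np


def poincare_dist0(x, c=1.0, eps=1e-7):
    """Distance from the origin for points x of shape (n, d) in a Poincare ball of curvature -c."""
    sqrt_c = np.sqrt(c)
    norm = np.linalg.norm(x, axis=-1)
    # Keep the arctanh argument strictly below 1 to avoid infinities at the boundary.
    arg = np.clip(sqrt_c * norm, 0.0, 1.0 - eps)
    return 2.0 / sqrt_c * np.arctanh(arg)


def hdo_summary(embeddings, c=1.0):
    """Per-node HDO values plus their min, mean and max (hypothetical helper)."""
    hdo = poincare_dist0(embeddings, c=c)
    stats = {"min": float(hdo.min()), "mean": float(hdo.mean()), "max": float(hdo.max())}
    return hdo, stats
```

The per-node values returned by `hdo_summary` could be written out with `np.savetxt` and fed to a plotting script such as the one above, provided the curvature `c` matches the one used by the trained model.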
It is very clear. I greatly appreciate your help!
I hope I'm not being too bothersome with another question, but could you please provide more details on how to compute
I would greatly appreciate any additional details you can provide. Thank you for your patience and assistance.
Thanks for your question. The three variables are lists:
Hello,
I'm attempting to create the HDO distribution graph as depicted in Figure 4, specifically for the Cora dataset. I have acquired the embedding results `best_emb` from the raw HGCN model and am in the process of calculating the hyperbolic distances from the origin. However, the graph I'm generating does not align well with the one presented in the article. Below is the code I'm using; could you please assist me in pinpointing any possible issues? I would greatly appreciate your help.
`manifold = geoopt.PoincareBall()
origin = torch.zeros(2708, 8)
hyperbolic_distance = manifold.dist(best_emb, origin)
print("Hyperbolic distance to the origin:", hyperbolic_distance)
import networkx as nx
import matplotlib.pyplot as plt
hdo_values = hyperbolic_distance.cpu().detach().numpy()
hdo_values.shape #(2708,)
hdo_values.mean() #5.1924605
hdo_values.min() #2.5806413
hdo_values.max() #6.1599727, especially the max value is very big!
plt.figure(figsize=(10, 6))
plt.hist(hdo_values, color='blue', alpha=0.5)
plt.title('HDO Distribution')
plt.xlabel('HDO Value')
plt.ylabel('Ratio')
plt.show()
`
Best regards.
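One visible difference between the snippet above and the maintainers' plotting script is the binning: `plt.hist` with default arguments draws raw counts in 10 bins, whereas the script uses bins of width 0.1 on [0, 7) and divides the counts by their sum. The following is a minimal sketch of that normalization with `np.histogram`; the function name `hdo_frequencies` is hypothetical, and the bins here are half-open, so values that fall exactly on a bin edge are counted rather than dropped as in the script's strict inequalities. Differences in embedding dimension or curvature may also affect the result and are not addressed here.

```python
import numpy as np


def hdo_frequencies(hdo_values, bin_width=0.1, upper=7.0):
    """Return the left bin edges and the fraction of values falling in each bin."""
    n_bins = int(round(upper / bin_width))
    edges = np.linspace(0.0, upper, n_bins + 1)
    counts, _ = np.histogram(hdo_values, bins=edges)
    total = counts.sum()
    # Normalize counts to frequencies; return zeros if no value lies in range.
    freqs = counts / total if total > 0 else np.zeros(n_bins)
    return edges[:-1], freqs
```

The result can be drawn with `plt.bar(left_edges, freqs, width=0.1, align='edge')` to obtain bars on the same scale as the y-axis range `[0, 0.5]` used in the script.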