EvaluationSheet : fully understand the universe terme #31

Open
TaousDev opened this issue Sep 12, 2020 · 1 comment

Comments

@TaousDev

I don't think I understand the `universe` parameter, or how to choose it, in `linkpred/evaluation/static/StaticEvaluation()` and in `EvaluationSheet()`. You stated that this parameter is important for returning the accuracy.

Also, how do I get the confusion matrix, recall, precision and accuracy?

For the accuracy, do I pick the maximum value, like this: `evaluation.accuracy().max()`, or is that wrong?
Or should I do this: `acc = (sum(evaluation.tp + evaluation.tn))/(sum(evaluation.tp + evaluation.tn + evaluation.fp + evaluation.fn))` (I also imported `division` from `__future__`)?

I want to use sklearn, but what confuses me is how to retrieve `y_true` and `y_pred` from a graph for `sklearn.metrics.confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None, normalize=None)`.
How do I get these data from the graph so that I can use them in other machine learning algorithms such as SVM?

This is my full code:


```python
import linkpred
import random
from matplotlib import pyplot as plt

random.seed(100)

# Read network
G = linkpred.read_network('BUP_full.net')

# Create test network
test = G.subgraph(random.sample(G.nodes(), 33))

# Exclude test network from learning phase
training = G.copy()
training.remove_edges_from(test.edges())

simrank = linkpred.predictors.SimRank(training, excluded=training.edges())
simrank_results = simrank.predict(c=0.5)

test_set = set(linkpred.evaluation.Pair(u, v) for u, v in test.edges())
evaluation = linkpred.evaluation.EvaluationSheet(simrank_results, test_set, simrank_results)

plt.plot(evaluation.recall(), evaluation.precision())
```

Thank you

@rafguns
Owner

rafguns commented Sep 24, 2020

The `universe` parameter is an iterable (typically a list or set) of all possible links (i.e. all node pairs) in the graph. Because the number of node pairs grows quadratically with the number of nodes, it can also simply be the number of node pairs (an int). So in your example, I think you could use

```python
n = len(training)
universe = n * (n - 1) // 2
```

With the benefit of hindsight, this was a premature optimization that would probably require some fairly substantial work to get rid of. I'll get back to your other questions soon.
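
To illustrate why the universe size matters for accuracy, here is a minimal pure-Python sketch. It is not linkpred's implementation: the helper names (`confusion_counts`, `accuracy`) and the toy node pairs are hypothetical. The point is that true negatives cannot be counted from the predictions and the test set alone; they are whatever remains of the universe after subtracting the other three counts. The last lines show one way to build `y_true` and `y_pred` label lists for `sklearn.metrics.confusion_matrix`, assuming every unordered node pair is treated as one sample.

```python
from itertools import combinations


def confusion_counts(predicted, relevant, universe_size):
    """Return (tp, fp, fn, tn) for a set of predicted node pairs.

    tn is derived from the universe size, since true negatives are
    never listed explicitly.
    """
    tp = len(predicted & relevant)   # predicted and actually present
    fp = len(predicted - relevant)   # predicted but absent
    fn = len(relevant - predicted)   # present but not predicted
    tn = universe_size - tp - fp - fn
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all node pairs that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Toy data (hypothetical): five nodes, so 5 * 4 // 2 = 10 possible pairs.
nodes = ["a", "b", "c", "d", "e"]
n = len(nodes)
universe_size = n * (n - 1) // 2

# frozenset makes each pair unordered, so ("a", "b") equals ("b", "a").
relevant = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}
predicted = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("d", "e")]}

tp, fp, fn, tn = confusion_counts(predicted, relevant, universe_size)
acc = accuracy(tp, fp, fn, tn)
print(tp, fp, fn, tn, acc)

# Label vectors over the full universe, in a form sklearn accepts:
# one entry per possible node pair, 1 = link, 0 = no link.
all_pairs = [frozenset(p) for p in combinations(nodes, 2)]
y_true = [int(p in relevant) for p in all_pairs]
y_pred = [int(p in predicted) for p in all_pairs]
```

In a real evaluation the prediction set would come from applying a cutoff to the ranked scores, so the counts (and hence the accuracy) change with the cutoff chosen. Whether pairs already linked in the training network should be left out of the universe is a separate decision that this sketch does not address.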
