# Introduction

This notebook shows how to use a `GraphCreator` instance in a recommendation pipeline to produce the top recommendations and display their predicted order (before/after).

In [1]:
%load_ext autoreload
%autoreload 1

import sys
sys.path.append('../utils/')

import pickle
import numpy as np
import pandas as pd

from GraphAPI import GraphCreator
from RecommenderPipeline import Recommender
from save_to_mlab import save_dict_to_mlab

from sklearn.preprocessing import normalize, StandardScaler, Normalizer, RobustScaler, MinMaxScaler, MaxAbsScaler


%aimport GraphAPI
%aimport RecommenderPipeline
%aimport save_to_mlab

# Load in Models

When we run the pipeline, we pass it a trained classifier, which it uses to make the before/after recommendations.

The models below were all trained on human-labeled data, each with slightly different parameters.

In [2]:
# Only unpickle files from a trusted source: pickle.load can execute arbitrary code.
with open("../models/rf_classifier_v2_normalized.pkl", "rb") as f:
    rf_v2_classifier = pickle.load(f)

with open("../models/rf_classifier_v3_normalized_714.pkl", "rb") as f:
    rf_v3_classifier = pickle.load(f)

with open("../models/rf_classifier_v4_732.pkl", "rb") as f:
    rf_v4_classifier = pickle.load(f)

with open("../models/xg_model_semisupervised_v2.pkl", "rb") as f:
    xg_classifier = pickle.load(f)
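
The four loads above repeat the same pattern, so they could be collapsed into one helper. The sketch below is only an illustration: `load_models` is a hypothetical name, not part of this project's utilities, and it assumes every file is a plain pickle from a trusted source.

```python
import pickle
from pathlib import Path


def load_models(paths):
    """Load each pickled model and return a {name: model} dictionary.

    `paths` maps a short name to the location of a pickle file.
    Hypothetical helper for illustration; only use it on trusted files.
    """
    models = {}
    for name, path in paths.items():
        with Path(path).open("rb") as fh:
            models[name] = pickle.load(fh)
    return models


# Example call with two of the paths used in this notebook:
# models = load_models({
#     "rf_v2": "../models/rf_classifier_v2_normalized.pkl",
#     "xg": "../models/xg_model_semisupervised_v2.pkl",
# })
# rec.predict(models["rf_v2"])
```

Keeping the models in a dictionary makes it easier to swap the classifier passed to `rec.predict` without editing several cells.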

# Initialize `GraphCreator` Instance

After initializing the `GraphCreator`, pass it as the argument to a new `Recommender` instance.

In [42]:
gc = GraphCreator("https://en.wikipedia.org/wiki/Cognition", include_see_also=False, max_recursive_requests=50)
print("Layer 1 nodes:", len(gc.next_links))
rec = Recommender(gc)

Layer 1 nodes: 1427


# Fit the Recommender 

In [43]:
rec.fit(scaler=Normalizer)

# Make Predictions
Pass a trained model to `predict` to make predictions on the data.

In [44]:
rec.predict(rf_v2_classifier)
# rec.predict(xg_classifier)

# Format the Results
`format_results` returns a dictionary containing the entry node, the decision threshold, and the predictions for the top articles.

In [45]:
formatted_results = rec.format_results()
formatted_results

{'entry': 'Cognition',
 'decision_threshold': 0.5400000000000003,
 'predictions': [{'node': 'Thought',
   'similarity_rank': 2.5971306135366183,
   'degree': 0.8004274094807888,
   'category_matches_with_source': 0.004104755946055328,
   'in_edges': 0.5397754069062756,
   'out_edges': 0.2606520025745133,
   'shared_neighbors_with_entry_score': 0.0001306584287054231,
   'centrality': 0.000135299619146267,
   'page_rank': 4.995549902323121e-07,
   'adjusted_reciprocity': 1.647582201879857e-05,
   'shortest_path_length_from_entry': 0.001026188986513832,
   'shortest_path_length_to_entry': 0.001026188986513832,
   'jaccard_similarity': 0.00012250131026508867,
   'primary_link': 0.0,
   'label_proba': [0.5053775021491849, 0.494622497850815],
   'position': 'before'},
  {'node': 'Intuition',
   'similarity_rank': 2.588665189108568,
   'degree': 0.8164849884612254,
   'category_matches_with_source': 0.004176393802870718,
   'in_edges': 0.40615429732917735,
   'out_edges': 0.41033069113204806,
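
Because the result is a plain dictionary, it can be inspected without any extra tooling. The sketch below walks a dictionary with the same keys as the output above (`entry`, `decision_threshold`, `predictions`, and per-prediction `node`, `label_proba`, `position`). The sample values are made up for illustration and are not output from the pipeline; `summarize` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical stand-in for the dictionary returned by rec.format_results();
# node names and probabilities are invented for illustration only.
sample_results = {
    "entry": "Example entry",
    "decision_threshold": 0.5,
    "predictions": [
        {"node": "Example A", "label_proba": [0.6, 0.4], "position": "after"},
        {"node": "Example B", "label_proba": [0.45, 0.55], "position": "before"},
        {"node": "Example C", "label_proba": [0.7, 0.3], "position": "after"},
    ],
}


def summarize(results):
    """Return (node, position) pairs and a count of each position."""
    pairs = [(p["node"], p["position"]) for p in results["predictions"]]
    counts = Counter(position for _, position in pairs)
    return pairs, counts


pairs, counts = summarize(sample_results)
print("Entry:", sample_results["entry"])
for node, position in pairs:
    print(f"{position:>6}  {node}")
print(dict(counts))
```

With the real output, `summarize(formatted_results)` should work the same way, assuming every prediction carries the `node` and `position` keys shown above.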

In [46]:
save_dict_to_mlab(formatted_results)

# Optional: Format as DataFrame for Easy Viewing

In [31]:
# Pass 0.47 to format_results; the output below reports it as the decision threshold
formatted_results = rec.format_results(0.47)

# Build one row per predicted article
recommendations = pd.DataFrame(formatted_results['predictions'])
print(recommendations.position.value_counts())
print("Decision Threshold:", round(formatted_results['decision_threshold'], 2))
recommendations[['node', 'position', 'label_proba']]

after     82
before    17
Name: position, dtype: int64
Decision Threshold: 0.47


| | node | position | label_proba |
|---|---|---|---|
| 0 | Neo-Fauvism | after | [0.5895203211831936, 0.41047967881680625] |
| 1 | Neo-impressionism | after | [0.5068976130277447, 0.4931023869722553] |
| 2 | Proto-Cubism | after | [0.5958830935118675, 0.40411690648813253] |
| 3 | Impressionism (literature) | before | [0.45606516577944894, 0.5439348342205511] |
| 4 | Art movement | after | [0.536951304140065, 0.4630486958599351] |
| 5 | Canadian Impressionism | after | [0.54293961501367, 0.4570603849863298] |
| 6 | Tonalism | after | [0.5140880931219012, 0.48591190687809876] |
| 7 | Paul Durand-Ruel | after | [0.5014055929329783, 0.49859440706702157] |
| 8 | Japonism | after | [0.5521058160922017, 0.447894183907798] |
| 9 | Les Nabis | after | [0.5048887746137583, 0.49511122538624164] |
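
The `label_proba` column holds a two-element list per row, which is awkward to sort or filter on. One option is to split it into two numeric columns. The sketch below uses made-up rows rather than the pipeline's output, and the column names `proba_0` and `proba_1` are placeholders, since this notebook does not state which class each probability refers to.

```python
import pandas as pd

# Hypothetical rows shaped like formatted_results['predictions'];
# the values are invented for illustration only.
sample_predictions = [
    {"node": "Example A", "position": "after", "label_proba": [0.6, 0.4]},
    {"node": "Example B", "position": "before", "label_proba": [0.45, 0.55]},
]
recommendations = pd.DataFrame(sample_predictions)

# Expand the list column into two float columns aligned on the same index.
proba = pd.DataFrame(
    recommendations["label_proba"].tolist(),
    columns=["proba_0", "proba_1"],
    index=recommendations.index,
)
flat = recommendations.drop(columns="label_proba").join(proba)
print(flat)
```

The flattened frame can then be ordered with, for example, `flat.sort_values("proba_1", ascending=False)`.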
