---

In [1]:
%load_ext autoreload
%autoreload 2

In [5]:
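# Add the project's src/ directory to the import path
import sys
# Append rather than overwrite sys.path[-1], so no existing entry is lost
sys.path.append(f'{sys.path[0]}/src')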
#/src/preprocessing/preprocessing_util.py
import preprocessing.preprocessing_util as prep
#/src/recommendations/recommend_util.py
import recommendations.recommend_util as rec
#/src/visualizations/viz_util.py
import visualizations.viz_util as viz

---

In [6]:
df, lda_bert_model, lda_bert_vectors, \
lda_d2v_model, lda_d2v_vectors, \
bert, bert_vectors, \
lda, lda_vectors, \
d2v, doc_vectors = rec.load()

Below, five recommendation methods are tested on the example stored in `text`. The output from the original model runs is saved as Markdown under each cell. The ideal way to evaluate these recommenders would be something like an A/B test that measures usage; since that was not an option, I evaluated them qualitatively. I ended up choosing the LDA-BERT method for two reasons.

1. This method seemed to provide the most germane recommendations for the examples I ran.
2. LDA-D2V came close in relevance, but including BERT has an advantage: the doc2vec model was trained only on these posts, whereas BERT can identify and embed vocabulary not present in that training set.
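
Each method compares a vector for the new text against the stored post vectors and returns the closest posts. The ranking inside `rec.Recommender` is not shown in this notebook, so the following is only a minimal sketch that assumes cosine distance; `cosine_distance`, `top_k`, and the toy vectors are hypothetical names and data made up for illustration.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query_vec, post_vecs, k=5):
    """Return (index, distance) pairs for the k posts closest to the query."""
    scored = [(cosine_distance(query_vec, vec), idx)
              for idx, vec in enumerate(post_vecs)]
    scored.sort()  # smallest distance first
    return [(idx, dist) for dist, idx in scored[:k]]

# Toy 2-D vectors standing in for real post embeddings (assumption)
toy_posts = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.05]
nearest = top_k(query, toy_posts, k=2)
print(nearest)
```

Under this assumption, the only difference between the five methods is how the vectors are produced, not how the neighbors are ranked.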

In [4]:
text = "I caught my partner cheating, and I'm not sure what to do."
recommend = rec.Recommender(text, df).process_text(prep.NlpPipe([text]).lemmatize())

In [6]:
lda_recs = recommend.lda_preds(lda, lda_vectors)

**LDA alone**
```
Post: Does this make any sense?
URL: https://www.reddit.com/r/relationship_advice/comments/a3g12z/does_this_make_any_sense/


Post: Once a cheater always a cheater?
URL: https://www.reddit.com/r/relationship_advice/comments/f7sk47/once_a_cheater_always_a_cheater/


Post: I [26m] am having a very difficult time trusting my gf [24f] after a trust-breaking incident, even though I feel enough time has passed and she has proven herself trustworthy.
URL: https://www.reddit.com/r/relationship_advice/comments/3301r9/i_26m_am_having_a_very_difficult_time_trusting_my/


Post: Would you be angry if your partner still keeps a physical photo album of pictures taken with the ex?
URL: https://www.reddit.com/r/relationship_advice/comments/frmb0i/would_you_be_angry_if_your_partner_still_keeps_a/


Post: BF of 4 years has very flirtatious thing with co-worker, should I be worried?
URL: https://www.reddit.com/r/relationship_advice/comments/aat7o6/bf_of_4_years_has_very_flirtatious_thing_with/
```

In [8]:
d2v_recs = recommend.d2v_preds(d2v, doc_vectors)

**Doc2Vec alone**
```
Post: Cheating
URL: https://www.reddit.com/r/relationship_advice/comments/9wsic7/cheating/


Post: Cheating
URL: https://www.reddit.com/r/relationship_advice/comments/e1leum/cheating/


Post: Question for men
URL: https://www.reddit.com/r/relationship_advice/comments/cs1t25/question_for_men/


Post: Relationship question
URL: https://www.reddit.com/r/relationship_advice/comments/5vp37z/relationship_question/


Post: What is a reason to cheat on a partner that has never cheated on you?
URL: https://www.reddit.com/r/relationship_advice/comments/hunsov/what_is_a_reason_to_cheat_on_a_partner_that_has/
```

In [10]:
bert_recs = recommend.bert_preds(bert, bert_vectors)

**BERT alone**
```
Post: What is a reason to cheat on a partner that has never cheated on you?
URL: https://www.reddit.com/r/relationship_advice/comments/hunsov/what_is_a_reason_to_cheat_on_a_partner_that_has/


Post: Telling on Cheaters
URL: https://www.reddit.com/r/relationship_advice/comments/gclcuc/telling_on_cheaters/


Post: Things guys say when they get caught cheating!
URL: https://www.reddit.com/r/relationship_advice/comments/7p6e68/things_guys_say_when_they_get_caught_cheating/


Post: Cheating
URL: https://www.reddit.com/r/relationship_advice/comments/9wsic7/cheating/


Post: Will they cheat?
URL: https://www.reddit.com/r/relationship_advice/comments/4bw6l8/will_they_cheat/
```

In [12]:
lda_d2v_recs = recommend.lda_d2v_preds(lda, d2v, lda_d2v_model, lda_d2v_vectors)

**LDA-Doc2Vec**
```
Post: Telling on Cheaters
URL: https://www.reddit.com/r/relationship_advice/comments/gclcuc/telling_on_cheaters/


Post: Would you forgive a cheater?
URL: https://www.reddit.com/r/relationship_advice/comments/e95qrx/would_you_forgive_a_cheater/


Post: Men- please enlighten me
URL: https://www.reddit.com/r/relationship_advice/comments/i3qvi4/men_please_enlighten_me/


Post: Does this make any sense?
URL: https://www.reddit.com/r/relationship_advice/comments/a3g12z/does_this_make_any_sense/


Post: have you ever gotten back with an ex who cheated?
URL: https://www.reddit.com/r/relationship_advice/comments/9n55du/have_you_ever_gotten_back_with_an_ex_who_cheated/
```

In [14]:
lda_bert_recs = recommend.lda_bert_preds(lda, bert, lda_bert_model, lda_bert_vectors)

**LDA-BERT**
```
Post: What is a reason to cheat on a partner that has never cheated on you?
URL: https://www.reddit.com/r/relationship_advice/comments/hunsov/what_is_a_reason_to_cheat_on_a_partner_that_has/


Post: Will they cheat?
URL: https://www.reddit.com/r/relationship_advice/comments/4bw6l8/will_they_cheat/


Post: Should I get back with a cheater?
URL: https://www.reddit.com/r/relationship_advice/comments/dpas0b/should_i_get_back_with_a_cheater/


Post: What constitutes cheating??
URL: https://www.reddit.com/r/relationship_advice/comments/da2ir0/what_constitutes_cheating/


Post: On cheaters:
URL: https://www.reddit.com/r/relationship_advice/comments/945fry/on_cheaters/
```
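
Because the evaluation is qualitative, one rough supplementary check is to see how much the five lists agree with each other. The sketch below uses the Reddit post IDs copied from the URLs printed above; the `recs` dictionary and `jaccard` helper are hypothetical names introduced here, and overlap between methods is not a measure of recommendation quality.

```python
from itertools import combinations

# Post IDs taken from the URLs in the outputs above
recs = {
    "LDA": {"a3g12z", "f7sk47", "3301r9", "frmb0i", "aat7o6"},
    "Doc2Vec": {"9wsic7", "e1leum", "cs1t25", "5vp37z", "hunsov"},
    "BERT": {"hunsov", "gclcuc", "7p6e68", "9wsic7", "4bw6l8"},
    "LDA-Doc2Vec": {"gclcuc", "e95qrx", "i3qvi4", "a3g12z", "9n55du"},
    "LDA-BERT": {"hunsov", "4bw6l8", "dpas0b", "da2ir0", "945fry"},
}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b)

# Shared post IDs for every pair of methods
overlaps = {
    (m1, m2): sorted(recs[m1] & recs[m2])
    for m1, m2 in combinations(recs, 2)
}

for (m1, m2), shared in overlaps.items():
    if shared:
        print(f"{m1} / {m2}: {shared} (Jaccard {jaccard(recs[m1], recs[m2]):.2f})")
```

For this single example, BERT shares two posts each with Doc2Vec and LDA-BERT, while LDA alone overlaps only with LDA-Doc2Vec.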

# Visualizing

To help visualize the results, I used UMAP projections. Below, a mock post is vectorized and plotted alongside the existing data. Then the top 5 recommendations are highlighted on the same plot.

In [15]:
text = "The communication in my relationship is terrible. We can never reach a resolution in a disagreement, and we're starting to argue more and more."
viz_recommend = rec.Recommender(text, df).process_text(prep.NlpPipe([text]).lemmatize())

In [16]:
viz_recs = viz_recommend.lda_bert_preds(lda,
                                        bert,
                                        lda_bert_model,
                                        lda_bert_vectors,
                                        num_recs='all',
                                        save_vec=True,
                                        save_idxes=True,
                                        print_recs=False)

In [17]:
umap_data = viz.umap_transform(lda_bert_vectors, viz_recs.predicted_vec)

In [19]:
viz.umap_viz(vectors=umap_data, 
             pred_text=text, 
             pred_color='#f20253'
            )

![umap alone](../reports/figures/post_alone.png)

In [435]:
viz.umap_viz(vectors=umap_data, 
             pred_text=text, 
             pred_color='#f20253', 
             plot_recs=True,
             df=df, 
             dists=dists, 
             recs=5, 
             recs_colors=['#ff9715', '#7ecefd', '#2185c5', 'purple', 'green']
            )


![umap with recs](../reports/figures/post_with_recs.png)