# Recommendations: Part 1

In this notebook you will learn how to make recommendations using Neo4j. 

Run the cell below to import the libraries (remember to clear the "Reset all runtimes" option before running):

In [11]:
from py2neo import Graph
import pandas as pd

import matplotlib 
import matplotlib.pyplot as plt

plt.style.use('fivethirtyeight')
pd.set_option('display.float_format', lambda x: '%.3f' % x)
pd.set_option('display.max_colwidth', 100)

Next, create a connection to your Neo4j Sandbox, just as you did previously when you set up your environment. 

<div align="left">
    <img src="images/sandbox-citations.png" alt="Citation Sandbox"/>
</div>

Update the cell below to use the IP Address, Bolt Port, and Password, as you did previously.

In [12]:
# Change the line of code below to use the IP Address, Bolt Port, and Password of your Sandbox.
# graph = Graph("bolt://<IP Address>:<Bolt Port>", auth=("neo4j", "<Password>")) 
 
graph = Graph("bolt://52.3.242.176:33698", auth=("neo4j", "equivalent-listing-parts"))
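If you would rather not keep the Sandbox password in the notebook itself, one option is to read the connection details from environment variables and build the Bolt URI from them. The sketch below is only an illustration: the variable names `NEO4J_HOST`, `NEO4J_BOLT_PORT`, and `NEO4J_PASSWORD` and the helper `bolt_uri` are made up for this example and are not something the Sandbox sets for you.

```python
import os


def bolt_uri(host, port):
    """Build a Bolt URI in the same form as the cell above."""
    return f"bolt://{host}:{port}"


# Hypothetical environment variable names (assumptions for this sketch).
# The defaults are placeholders, not your real Sandbox values.
host = os.environ.get("NEO4J_HOST", "localhost")
port = int(os.environ.get("NEO4J_BOLT_PORT", "7687"))
password = os.environ.get("NEO4J_PASSWORD", "")

uri = bolt_uri(host, port)
print(uri)

# With py2neo imported as above, the connection would then be created as:
# graph = Graph(uri, auth=("neo4j", password))
```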

## Finding popular authors

Since we're going to make collaborator suggestions, let's first find the authors who have written the most articles so that we have some data to work with.

In [None]:
popular_authors_query = """
MATCH (author:Author)
RETURN author.name, size((author)<-[:AUTHOR]-()) AS articlesPublished
ORDER BY articlesPublished DESC
LIMIT 10
"""

graph.run(popular_authors_query).to_data_frame()

Pick one of these authors...

In [14]:
author_name = "Peter G. Neumann"

Retrieve the articles they've published and how many citations they've received:

In [None]:
author_articles_query = """
MATCH (:Author {name: $authorName})<-[:AUTHOR]-(article)
RETURN article.title AS article, article.year AS year, size((article)<-[:CITED]-()) AS citations
ORDER BY citations DESC
LIMIT 20
"""

graph.run(author_articles_query,  {"authorName": author_name}).to_data_frame()

Find the author's collaborators:

In [None]:
collaborations_query = """
MATCH (:Author {name: $authorName})<-[:AUTHOR]-(article)-[:AUTHOR]->(coauthor)
RETURN coauthor.name AS coauthor, count(*) AS collaborations
ORDER BY collaborations DESC
LIMIT 10
"""

graph.run(collaborations_query,  {"authorName": author_name}).to_data_frame()
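To see what this query counts, it can help to reproduce it on a tiny in-memory dataset. The sketch below does not touch Neo4j: the `articles` dictionary and the author names in it are invented for illustration, and `count_collaborations` is a hypothetical helper that counts one collaboration for every article the two authors share, which is what `count(*)` does for each `coauthor` in the Cypher query.

```python
from collections import Counter

# Invented toy data: article title -> names of its authors.
articles = {
    "Paper A": ["Alice", "Bob"],
    "Paper B": ["Alice", "Bob", "Carol"],
    "Paper C": ["Alice", "Dave"],
    "Paper D": ["Bob", "Erin"],
}


def count_collaborations(articles, author_name):
    """Count, per coauthor, the articles they share with author_name."""
    counts = Counter()
    for authors in articles.values():
        if author_name in authors:
            # Every other author on this article is a collaborator.
            counts.update(a for a in authors if a != author_name)
    return counts


collaborations = count_collaborations(articles, "Alice")

# Highest counts first, like ORDER BY collaborations DESC LIMIT 10.
for coauthor, n in collaborations.most_common(10):
    print(coauthor, n)
```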

How would you suggest some future collaborators for this author? One way is by looking at the collaborators of their collaborators!

In [None]:
collaborations_query = """
MATCH (author:Author {name: $authorName})<-[:AUTHOR]-(article)-[:AUTHOR]->(coauthor),
      (coauthor)<-[:AUTHOR]-()-[:AUTHOR]->(coc)
WHERE not((coc)<-[:AUTHOR]-()-[:AUTHOR]->(author)) AND coc <> author      
RETURN coc.name AS coauthor, count(*) AS collaborations
ORDER BY collaborations DESC
LIMIT 10
"""

graph.run(collaborations_query,  {"authorName": author_name}).to_data_frame()

Each of these people has collaborated with someone that Peter has worked with before, so that shared collaborator might be able to make an introduction.
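The "collaborators of collaborators" idea can also be sketched in plain Python on made-up data, which makes the two filters in the `WHERE` clause easier to follow. Everything here is an assumption for illustration: the `articles` dictionary, the names, and the helpers `coauthors_of` and `suggest_collaborators`. The score multiplies two counts because the Cypher query returns one row for every combination of a shared article with the coauthor and an article linking that coauthor to the candidate.

```python
from collections import Counter

# Invented toy data: article title -> names of its authors.
articles = {
    "Paper A": ["Alice", "Bob"],
    "Paper B": ["Bob", "Carol"],
    "Paper C": ["Bob", "Carol", "Dave"],
    "Paper D": ["Alice", "Erin"],
    "Paper E": ["Erin", "Carol"],
    "Paper F": ["Alice", "Dave"],
    "Paper G": ["Erin", "Frank"],
}


def coauthors_of(articles, name):
    """Count, per coauthor, the articles they share with name."""
    counts = Counter()
    for authors in articles.values():
        if name in authors:
            counts.update(a for a in authors if a != name)
    return counts


def suggest_collaborators(articles, author_name):
    """Score collaborators of collaborators who are not yet collaborators."""
    direct = coauthors_of(articles, author_name)
    suggestions = Counter()
    for coauthor, shared in direct.items():
        for candidate, links in coauthors_of(articles, coauthor).items():
            # Mirrors the WHERE clause: skip the author themselves and
            # anyone they have already written an article with.
            if candidate != author_name and candidate not in direct:
                suggestions[candidate] += shared * links
    return suggestions


suggestions = suggest_collaborators(articles, "Alice")
for candidate, score in suggestions.most_common(10):
    print(candidate, score)
```

In this toy graph Dave has worked with Bob, but he is left out of the suggestions because he has already written an article with Alice.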

## Exercise

1. Can you find the top 20 suggested collaborators for 'Brian Fitzgerald' instead of 'Peter G. Neumann'?
2. How many of these potential collaborators have collaborated with Brian's collaborators more than 3 times?

Keep the results of this exercise handy as they may be useful for the quiz at the end of this module.