In this Jupyter notebook, you'll explore the PageRank algorithm and how changing the topology of a network of webpages can affect the PageRank of individual pages. 

First, let's create a simple data abstraction to represent a webpage.

In [25]:
class Webpage:    
    # fixed_pagerank is used to model an external page whose PR should not be updated
    # The default, -1, means that the page's PR should be updated.
    def __init__(self, name, fixed_pagerank=-1):
        # create a new webpage, with no links
        self.name = name
        self.links = []
        self.backlinks = []
        self.fixed_pagerank = fixed_pagerank

    def add_link(self, target_page):
        if target_page not in self.links:
            self.links.append(target_page)
        # when adding a link, add a backlink on the target
        target_page.add_back_link(self)

    def add_back_link(self, source_page):
        if source_page not in self.backlinks:
            self.backlinks.append(source_page)

    # debugging methods
    def print_links(self):
        for page in self.links:
            print(page.name + ", ")

    def print_back_links(self):
        for page in self.backlinks:
            print(page.name + ", ")

    def __str__(self):
        return self.name

We provide a simple iterative implementation of PageRank:

In [26]:
class PageRank:
    def __init__(self, pages, damping_factor=0.85, debug=False, supernode=True):
        self.page_rank_table = {}
        self.damping_factor = damping_factor
        self.debug=debug
        self.supernode=supernode

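        # shallow-copy the list so the caller's list is not modified
        # (the Webpage objects are shared, so links added below persist on them)
        self.pages = pages.copy()
        # create a "supernode" that has a link and backlink to every page
        supernode = Webpage("supernode")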
        
        if self.supernode:
            for page in self.pages:
                if page.fixed_pagerank == -1:
                    page.add_link(supernode)
                    supernode.add_link(page)

            self.pages.append(supernode)
        for page in self.pages:
            # initialize each page's PR to be 1/n, where n is the total number of pages
            self.page_rank_table[page] = 1/len(self.pages)
          
    def run_page_rank(self, iterations):
        for ii in range(iterations):
            if self.debug:
                print("\nIteration #" + str(ii))
                self.print_table(show_supernode=True)
                
            new_page_rank_table = {}
            for page in self.page_rank_table:
                if page.fixed_pagerank == -1:
                    new_page_rank = 0
                    for backlink in page.backlinks:
                        new_page_rank += self.page_rank_table[backlink] / len(backlink.links)
                    new_page_rank_table[page] = (1-self.damping_factor) + self.damping_factor * new_page_rank
                else:
                    new_page_rank_table[page] = page.fixed_pagerank
            self.page_rank_table = new_page_rank_table

    # debugging & validation methods
    def calc_average_pagerank(self):
        sum_pagerank = 0
        for page in self.page_rank_table:
            sum_pagerank += self.page_rank_table[page]
        return sum_pagerank / len(self.page_rank_table)

    def print_table(self, show_supernode=False):
        for page in self.page_rank_table:
            if str(page) != "supernode" or show_supernode:
                print(str(page) + ": " + str(self.page_rank_table[page]))        
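
`run_page_rank` above always runs a fixed number of iterations. As a rough way to see how many iterations a given network actually needs, the sketch below repeats the same update rule until no score changes by more than a tolerance. It is a standalone approximation rather than part of the provided class: the dict-based `links` format, the `pagerank_until_stable` name, and the `tolerance` and `max_iterations` parameters are all assumptions made for this example.

```python
def pagerank_until_stable(links, damping_factor=0.85, tolerance=1e-9, max_iterations=1000):
    """Repeat the notebook's update rule until the scores stop changing.

    links maps each page name to the list of page names it links to
    (hypothetical format; every linked page must also be a key).
    Returns (ranks, iterations_used).
    """
    pages = list(links)
    ranks = {page: 1 / len(pages) for page in pages}  # same 1/n start as the class
    for iteration in range(1, max_iterations + 1):
        # every page starts from the (1 - d) term ...
        new_ranks = {page: 1 - damping_factor for page in pages}
        # ... and receives d * PR(source) / (links on source) from each backlink
        for source, targets in links.items():
            for target in targets:
                new_ranks[target] += damping_factor * ranks[source] / len(targets)
        largest_change = max(abs(new_ranks[page] - ranks[page]) for page in pages)
        ranks = new_ranks
        if largest_change < tolerance:
            return ranks, iteration
    return ranks, max_iterations


# two pages that link to each other
two_pages = {"A": ["B"], "B": ["A"]}
ranks, iterations_used = pagerank_until_stable(two_pages, damping_factor=0.85)
print(ranks, iterations_used)
```

Stopping on the largest per-page change is only one possible criterion; a smaller `tolerance` or a larger damping factor should both increase the number of iterations needed.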

Create a simple network of linked pages and calculate the PageRank of each one. The network below has ten pages: a homepage, three subpages, a reviews page, and five individual reviews.

In [27]:
pages = []
pageHome = Webpage("A")
pageTeam = Webpage("B")
pageAbout = Webpage("C")
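pageBlog = Webpage("D")
# NOTE: the next six pages reuse the name "D", so the output labels Blog,
# Reviews, and R1-R5 all as "D" (in the order they are appended to `pages`)
pageReviews = Webpage("D")
pageR1 = Webpage("D")
pageR2 = Webpage("D")
pageR3 = Webpage("D")
pageR4 = Webpage("D")
pageR5 = Webpage("D")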

pageHome.add_link(pageTeam)
pageHome.add_link(pageAbout)
pageHome.add_link(pageBlog)
#pageHome.add_link(pageReviews)

pageTeam.add_link(pageHome)
pageAbout.add_link(pageHome)
pageBlog.add_link(pageHome)
pageReviews.add_link(pageHome)

pageBlog.add_link(pageReviews)

pageReviews.add_link(pageR1)
pageReviews.add_link(pageR2)
pageReviews.add_link(pageR3)
pageReviews.add_link(pageR4)
pageReviews.add_link(pageR5)


pages.append(pageHome)
pages.append(pageTeam)
pages.append(pageAbout)
pages.append(pageBlog)
pages.append(pageReviews)
pages.append(pageR1)
pages.append(pageR2)
pages.append(pageR3)
pages.append(pageR4)
pages.append(pageR5)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

A: 1.4661654135338342
B: 0.6165413533834585
C: 0.6165413533834585
D: 0.6165413533834585
D: 0.48120300751879697
D: 0.3101503759398496
D: 0.3101503759398496
D: 0.3101503759398496
D: 0.3101503759398496
D: 0.3101503759398496


# Assignment Exercises
Complete all three of the following exercises.

### Exercise 1
While the inputs to Google's overall ranking algorithm are secret, the output of PageRank is not. There are tools that allow you to enter a URL and see the Google PageRank of the corresponding page. For example, if you enter nuevaschool.org into [this one](https://dnschecker.org/pagerank.php), it will tell you that Nueva's homepage has a PR of 4.

Two notes:
1. Recall that this is not the raw output -- the PageRank algorithm can produce an arbitrarily high output, so to make it easier to understand, Google maps the output to a scale of 0-10, where 0 is low and 10 is high.
2. This tool is kind of a pain to use since it requires that you enter a captcha every time you submit. If you find a better one, let me know! Just be careful, as many of these tools have a bunch of spammy ads on them.

#### Questions
1. Play around with this tool and report back the values of some webpages. Can you find webpages of each rank between 0 and 10? 

2. Describe what kinds of webpages are represented in each rank. For example, you might say that pages of rank 9 are usually the homepages of the services of the biggest Internet companies in the world.

**(1)**

0. 
1. soleds.com
2. danaflowerbasket.com, petsdelightlosaltos.com
3. andrusia.com, henryhneff.com
4. nuevaschool.org, verily.com, surviv.io, marcuswestberg.photo
5. coolmathgames.com, sweetgreen.com, peets.com, dnschecker.org, statefarm.com, att.com, firefox.com, berkeley.edu, xfinity.com, t-mobile.com, huawei.com
6. wix.com, squarespace.com, dashlane.com, wikipedia.org, netflix.com, target.com, starbucks.com, espn.com, spotify.com
7. apple.com, yahoo.com, mail.google.com, blogger.com, adobe.com, cnn.com, twitch.tv
8. drive.google.com, calendar.google.com, medium.com, github.com, amazon.com, nytimes.com
9. 
10. google.com, youtube.com, instagram.com, facebook.com, twitter.com

**(2)**

Unfortunately, I couldn't run the PageRank checker on my own website; I guess it doesn't like GitHub Pages or something (I tried Jen Selby's ML notes as well). For some reason I couldn't find a page with rank 9: all the popular ones seemed to be either 10 or 8. I found rank 1 by typing random letters into the tool until it gave me a 1. soleds.com is actually for sale, so the page is super basic and probably has nothing linking to it.

*Rank 10:* Pages that have billions of users (with the exception of Twitter); these pages just have so much traffic and so many links to them.

*Rank 7-8:* Google Drive and Calendar, plus super popular sites like Amazon and NYTimes. This section seems like pages with less traffic than rank 10, but still with lots of links to them. Rank 8 seems to be more tools and productivity sites; rank 7 is very large companies like Apple or Adobe.

*Rank 5-6:* I was very surprised that Wikipedia ranked a 6. It is so popular and gets so much usage, but I guess there aren't a lot of pages that link to it or something? It also has extensive links between its own pages, though, so I don't really know. It also shows up first in Google search every time, so maybe that has to do with the hidden parts of Google's search algorithm. Ranks 5 and 6 seem to be (relatively) smaller companies and stores: 6 is larger online companies like Netflix, ESPN, or Squarespace and some larger stores like Target, whereas 5 is smaller stores or online pages.

*Rank 3-4:* Very similar to rank 5 but just smaller; for example, Berkeley is 5 and Nueva is 4. A less well-known company or a popular individual's page would be here, like my favorite photographer's. Rank 3 has Andrew Alexander's personal website, along with the website of the author of one of my favorite book series, who isn't well known.

*Rank 2-1:* For rank 2, I found small local stores that aren't chains and have a website; for rank 1, I typed in random letters until I got something of rank 1. I tried Google searching obscure queries and going to the last pages, on the 14th section or whatever, but I usually got rank 2 for all of those. I think that Google doesn't actually include these rank 1 pages, because at some point it switches back to the popular sites, just with less relevance to the search query.

*Rank 0:* I could not find anything for rank 0, but I would assume a blank page with no links to it would fall under here.

### Exercise 2
In the given implementation, we set the damping factor to 0.85 and we initialize each page's PageRank to be 1/n, where n is the total number of pages. 

1. What happens if you change the damping factor? How does the algorithm behave with a small damping factor vs a large one? 
2. What happens if you change the initial PageRank?

You may want to use the `debug=True` flag and the `calc_average_pagerank` method to get a better look at what's going on.

In [28]:
pageRank01 = PageRank(pages, debug=False, damping_factor=0.1, supernode=False)
pageRank01.run_page_rank(100)
pageRank01.print_table(show_supernode=True)
pageRank01.calc_average_pagerank()

A: 1.1503683576570773
B: 0.9383456119219026
C: 0.9383456119219026
D: 0.9383456119219026
D: 0.9469172805960951
D: 0.9157819546766016
D: 0.9157819546766016
D: 0.9157819546766016
D: 0.9157819546766016
D: 0.9157819546766016


0.9491232247401887

In [29]:
pageRank25 = PageRank(pages, debug=False, damping_factor=0.25, supernode=False)
pageRank25.run_page_rank(100)
pageRank25.print_table(show_supernode=True)
pageRank25.calc_average_pagerank()

A: 1.3234081539166285
B: 0.8602840128263857
C: 0.8602840128263857
D: 0.8602840128263857
D: 0.8575355016032982
D: 0.7857306459001374
D: 0.7857306459001374
D: 0.7857306459001374
D: 0.7857306459001374
D: 0.7857306459001374


0.8690448923499773

In [30]:
pageRank50 = PageRank(pages, debug=False, damping_factor=0.5, supernode=False)
pageRank50.run_page_rank(100)
pageRank50.print_table(show_supernode=True)
pageRank50.calc_average_pagerank()

A: 1.4933920704845813
B: 0.7488986784140969
C: 0.7488986784140969
D: 0.7488986784140969
D: 0.6872246696035242
D: 0.5572687224669604
D: 0.5572687224669604
D: 0.5572687224669604
D: 0.5572687224669604
D: 0.5572687224669604


0.7213656387665199

In [31]:
pageRank75 = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank75.run_page_rank(100)
pageRank75.print_table(show_supernode=True)
pageRank75.calc_average_pagerank()

A: 1.4661654135338342
B: 0.6165413533834585
C: 0.6165413533834585
D: 0.6165413533834585
D: 0.48120300751879697
D: 0.3101503759398496
D: 0.3101503759398496
D: 0.3101503759398496
D: 0.3101503759398496
D: 0.3101503759398496


0.5347744360902256

In [32]:
pageRank90 = PageRank(pages, debug=False, damping_factor=0.9, supernode=False)
pageRank90.run_page_rank(100)
pageRank90.print_table(show_supernode=True)
pageRank90.calc_average_pagerank()

A: 1.137817866568147
B: 0.4413453589883733
C: 0.4413453589883733
D: 0.4413453589883733
D: 0.29860541100249716
D: 0.14479081155566084
D: 0.14479081155566084
D: 0.14479081155566084
D: 0.14479081155566084
D: 0.14479081155566084


0.34844134123140685

**(1)**

Changing the damping factor changes the distribution of the PageRanks. The lower the damping factor (see the first example, with 0.1), the lower the probability that a user continues surfing. The update rule for a page $p$ is $$PR(p) = (1-d) + d \sum_{b \in B(p)} \frac{PR(b)}{C(b)},$$ where $B(p)$ is the set of pages that link to $p$ and $C(b)$ is the number of links on page $b$, so when the damping factor is low the result depends less on the sum and more on $1-d$. (In the extreme case of $d=0$, each rank would be 1 because the sum is multiplied by zero.) That is why there is less spread in the scores when using a lower damping factor. A byproduct of this is that the average of the scores is actually higher (in this set of pages), because the 5 review pages, which have the lowest scores, are much closer to 1 than when the damping factor is higher. When the damping factor is higher (closer to 1), the relative spread goes up (the top-ranked page scores about 1.3 times the bottom-ranked pages at $d=0.1$, but nearly 8 times at $d=0.9$), and the average goes down because of the 5 review pages.
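
The drop in the average can be checked directly. When every page has at least one outgoing link, each page hands all of its score on, and for damping factors below 1 the average settles toward 1. Pages with no outgoing links (like the five review pages) receive score without passing it on, so the average falls below 1, and falls further as $d$ grows. The sketch below is a standalone re-implementation of the same update rule on a plain dict; the `pagerank` and `average` helpers, the dict format, and the descriptive page names are assumptions for illustration, not part of the provided classes.

```python
def pagerank(links, damping_factor, iterations=100):
    """Run the notebook's update rule on {page: [pages it links to]}."""
    pages = list(links)
    ranks = {page: 1 / len(pages) for page in pages}
    for _ in range(iterations):
        new_ranks = {page: 1 - damping_factor for page in pages}
        for source, targets in links.items():
            for target in targets:
                new_ranks[target] += damping_factor * ranks[source] / len(targets)
        ranks = new_ranks
    return ranks


def average(ranks):
    return sum(ranks.values()) / len(ranks)


# every page links somewhere: Home <-> About, Team, Blog
no_dead_ends = {
    "Home": ["About", "Team", "Blog"],
    "About": ["Home"],
    "Team": ["Home"],
    "Blog": ["Home"],
}

# same link structure as the ten-page network used in this exercise;
# R1-R5 have no outgoing links
with_dead_ends = {
    "Home": ["Team", "About", "Blog"],
    "Team": ["Home"],
    "About": ["Home"],
    "Blog": ["Home", "Reviews"],
    "Reviews": ["Home", "R1", "R2", "R3", "R4", "R5"],
    "R1": [], "R2": [], "R3": [], "R4": [], "R5": [],
}

for d in (0.1, 0.5, 0.9):
    closed = average(pagerank(no_dead_ends, d))
    leaky = average(pagerank(with_dead_ends, d))
    print(f"d={d}: no dead ends {closed:.4f}, with dead ends {leaky:.4f}")
```

The provided `PageRank` class has a `supernode=True` option that links every page to an extra node and back, which removes dead ends; the runs in this exercise pass `supernode=False`.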

**(2)**

It's not shown above, but I played around with the initial value of each page, and no matter what I did, the scores converged to the same numbers each time.
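
A minimal sketch of that experiment is below. The provided `PageRank` class has no parameter for the starting value, so this is a standalone re-implementation of the same update rule with a hypothetical `initial` argument; the dict format and page names are also assumptions. For a damping factor $d < 1$, each iteration shrinks the difference between two score tables by at least a factor of $d$, which is why the starting value washes out.

```python
def pagerank(links, initial, damping_factor=0.75, iterations=100):
    """Run the notebook's update rule, starting every page at `initial`."""
    pages = list(links)
    ranks = {page: initial for page in pages}
    for _ in range(iterations):
        new_ranks = {page: 1 - damping_factor for page in pages}
        for source, targets in links.items():
            for target in targets:
                new_ranks[target] += damping_factor * ranks[source] / len(targets)
        ranks = new_ranks
    return ranks


# same link structure as the ten-page network used in this exercise
links = {
    "Home": ["Team", "About", "Blog"],
    "Team": ["Home"],
    "About": ["Home"],
    "Blog": ["Home", "Reviews"],
    "Reviews": ["Home", "R1", "R2", "R3", "R4", "R5"],
    "R1": [], "R2": [], "R3": [], "R4": [], "R5": [],
}

# 0.1 is the notebook's 1/n start for ten pages; the others are arbitrary
results = {initial: pagerank(links, initial) for initial in (0.0, 0.1, 1.0, 100.0)}
for initial, ranks in results.items():
    print(f"initial={initial}: Home={ranks['Home']:.6f}, R1={ranks['R1']:.6f}")
```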

### Exercise 3
In class, we discussed the major advantage of PageRank compared to the search engine algorithms that came before it: since PageRank is determined by backlinks, which the owner of a page has less control over, it's highly resistant to *keyword stuffing*, which is the practice of putting a ton of keywords on your webpage so that the indexer thinks your page is relevant to all those keywords. In this exercise, we'll explore what kinds of practices can manipulate PageRank; this is known as search engine optimization, or SEO, and is an $80 billion industry in the US alone.


1. You've just created a startup, and you need to make a website for your startup. You have four pages to start with:
    1. Homepage
    2. About (the story of the company)
    3. Team (the bios of you and your cofounders)
    4. Blog
    
  How should you structure your website so that your homepage has the highest possible PageRank? List the PageRank values for each of the four webpages in this structure.
  
  
2. Your product is a hit! Glowing reviews are rolling in, and you want to add a page for Reviews that links to each of the 5 reviews so far. How does this affect your homepage's PageRank? Should you restructure your website, and if so, how? List the PageRank values for each of the five webpages you control (Home, About, Team, Blog, and Reviews).

3. It's been a month, and you've produced a blog post every week (total of 4 posts). Your homepage still has a link to the most recent post, and each post links to the post before it and after it, like so:

  `Home ==> Blog post 4 <==> Blog post 3 <==> Blog post 2 <== Blog post 1`
  
 How does this affect your homepage's PageRank? Should you restructure your website, and if so, how? (You may want to refer to your answer to question 1.) Do websites you visit structure their sites in this way? If not, what are some reasons why?
 
4. Copycats of your product are popping up all over the web, and your homepage is getting pushed off the first page of search results. You decide you need to take drastic action. What are some unscrupulous ways you might boost the PageRank of your homepage quickly? Model each of these and give their impact on the PageRank of your homepage.

5. Google figures out what you're doing, and they send you a nice cease and desist letter telling you to stop, or they'll remove your website from the search results entirely. What are some legitimate ways you might boost the PageRank of your homepage?

**(1)**

In [34]:
pages = []
pageHome = Webpage("Home")
pageAbout = Webpage("About")
pageTeam = Webpage("Team")
pageBlog = Webpage("Blog")

pageHome.add_link(pageAbout)
pageHome.add_link(pageTeam)
pageHome.add_link(pageBlog)

pageAbout.add_link(pageHome)
pageTeam.add_link(pageHome)
pageBlog.add_link(pageHome)

pages.append(pageHome)
pages.append(pageAbout)
pages.append(pageTeam)
pages.append(pageBlog)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

Home: 1.8571428571423418
About: 0.7142857142855654
Team: 0.7142857142855654
Blog: 0.7142857142855654


**(2)**

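Adding the Reviews page lowers the homepage's PageRank from about 1.86 to about 1.24 (see the output below), but the homepage is still the highest-ranked page, so you don't need to restructure.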

In [35]:
pages = []
pageHome = Webpage("Home")
pageAbout = Webpage("About")
pageTeam = Webpage("Team")
pageBlog = Webpage("Blog")
pageReviews = Webpage("Reviews")
pageR1 = Webpage("R1")
pageR2 = Webpage("R2")
pageR3 = Webpage("R3")
pageR4 = Webpage("R4")
pageR5 = Webpage("R5")

pageHome.add_link(pageAbout)
pageHome.add_link(pageTeam)
pageHome.add_link(pageBlog)
pageHome.add_link(pageReviews)

pageAbout.add_link(pageHome)
pageTeam.add_link(pageHome)
pageBlog.add_link(pageHome)
pageReviews.add_link(pageHome)

pageBlog.add_link(pageReviews)

pageReviews.add_link(pageR1)
pageReviews.add_link(pageR2)
pageReviews.add_link(pageR3)
pageReviews.add_link(pageR4)
pageReviews.add_link(pageR5)


pages.append(pageHome)
pages.append(pageAbout)
pages.append(pageTeam)
pages.append(pageBlog)
pages.append(pageReviews)
pages.append(pageR1)
pages.append(pageR2)
pages.append(pageR3)
pages.append(pageR4)
pages.append(pageR5)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

Home: 1.236133122028526
About: 0.4817749603803486
Team: 0.4817749603803486
Blog: 0.4817749603803486
Reviews: 0.6624405705229793
R1: 0.3328050713153724
R2: 0.3328050713153724
R3: 0.3328050713153724
R4: 0.3328050713153724
R5: 0.3328050713153724


**(3)**

Structured as given:

In [38]:
pages = []
pageHome = Webpage("Home")
pageAbout = Webpage("About")
pageTeam = Webpage("Team")
pageB1 = Webpage("B1")
pageB2 = Webpage("B2")
pageB3 = Webpage("B3")
pageB4 = Webpage("B4")
pageReviews = Webpage("Reviews")
pageR1 = Webpage("R1")
pageR2 = Webpage("R2")
pageR3 = Webpage("R3")
pageR4 = Webpage("R4")
pageR5 = Webpage("R5")

pageHome.add_link(pageAbout)
pageHome.add_link(pageTeam)
pageHome.add_link(pageB4)
pageHome.add_link(pageReviews)

pageAbout.add_link(pageHome)
pageTeam.add_link(pageHome)
pageReviews.add_link(pageHome)

pageB4.add_link(pageB3)
pageB3.add_link(pageB4)
pageB3.add_link(pageB2)
pageB2.add_link(pageB3)
pageB2.add_link(pageB1)

pageReviews.add_link(pageR1)
pageReviews.add_link(pageR2)
pageReviews.add_link(pageR3)
pageReviews.add_link(pageR4)
pageReviews.add_link(pageR5)


pages.append(pageHome)
pages.append(pageAbout)
pages.append(pageTeam)
pages.append(pageB1)
pages.append(pageB2)
pages.append(pageB3)
pages.append(pageB4)
pages.append(pageReviews)
pages.append(pageR1)
pages.append(pageR2)
pages.append(pageR3)
pages.append(pageR4)
pages.append(pageR5)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

Home: 0.9438202247191011
About: 0.42696629213483145
Team: 0.42696629213483145
B1: 0.5052573641056787
B2: 0.6806863042818099
B3: 1.1484968114181597
B4: 0.8576525964166414
Reviews: 0.42696629213483145
R1: 0.3033707865168539
R2: 0.3033707865168539
R3: 0.3033707865168539
R4: 0.3033707865168539
R5: 0.3033707865168539


As you can see, blog post 3 (B3) now has the highest PageRank, with Home behind that and B4 close behind Home. However, adding a link from each blog post to the homepage fixes this problem.

In [42]:
pages = []
pageHome = Webpage("Home")
pageAbout = Webpage("About")
pageTeam = Webpage("Team")
pageB1 = Webpage("B1")
pageB2 = Webpage("B2")
pageB3 = Webpage("B3")
pageB4 = Webpage("B4")
pageReviews = Webpage("Reviews")
pageR1 = Webpage("R1")
pageR2 = Webpage("R2")
pageR3 = Webpage("R3")
pageR4 = Webpage("R4")
pageR5 = Webpage("R5")

pageHome.add_link(pageAbout)
pageHome.add_link(pageTeam)
pageHome.add_link(pageB4)
pageHome.add_link(pageReviews)

pageAbout.add_link(pageHome)
pageTeam.add_link(pageHome)
pageReviews.add_link(pageHome)

pageB4.add_link(pageB3)
pageB3.add_link(pageB4)
pageB3.add_link(pageB2)
pageB2.add_link(pageB3)
pageB2.add_link(pageB1)

pageB1.add_link(pageHome)
pageB2.add_link(pageHome)
pageB3.add_link(pageHome)
pageB4.add_link(pageHome)

pageReviews.add_link(pageR1)
pageReviews.add_link(pageR2)
pageReviews.add_link(pageR3)
pageReviews.add_link(pageR4)
pageReviews.add_link(pageR5)


pages.append(pageHome)
pages.append(pageAbout)
pages.append(pageTeam)
pages.append(pageB1)
pages.append(pageB2)
pages.append(pageB3)
pages.append(pageB4)
pages.append(pageReviews)
pages.append(pageR1)
pages.append(pageR2)
pages.append(pageR3)
pages.append(pageR4)
pages.append(pageR5)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

Home: 2.1542251786154227
About: 0.6539172209903917
Team: 0.6539172209903917
B1: 0.3538125153978813
B2: 0.415250061591525
B3: 0.6610002463661
B4: 0.8191672825819167
Reviews: 0.6539172209903917
R1: 0.33173965262379895
R2: 0.33173965262379895
R3: 0.33173965262379895
R4: 0.33173965262379895
R5: 0.33173965262379895


We could also structure the blog posts similarly to the reviews, with a master blog page that links to each blog post, but I liked having it structured so users could navigate between blog posts more easily.

I would say yes, the websites that I visit are structured similarly: each blog post (or similar item) links to the previous and next one, and each post has a link back to the homepage. There is often a master blog page as well that links to all of the blog posts, even though each blog post doesn't link back to it. That just makes it easier for users to find the post they are looking for if there are a lot of them. A version of that structure is modeled below.

In [44]:
pages = []
pageHome = Webpage("Home")
pageAbout = Webpage("About")
pageTeam = Webpage("Team")
pageBlog = Webpage("Blog")
pageB1 = Webpage("B1")
pageB2 = Webpage("B2")
pageB3 = Webpage("B3")
pageB4 = Webpage("B4")
pageReviews = Webpage("Reviews")
pageR1 = Webpage("R1")
pageR2 = Webpage("R2")
pageR3 = Webpage("R3")
pageR4 = Webpage("R4")
pageR5 = Webpage("R5")

pageHome.add_link(pageAbout)
pageHome.add_link(pageTeam)
pageHome.add_link(pageBlog)
pageHome.add_link(pageReviews)

pageAbout.add_link(pageHome)
pageTeam.add_link(pageHome)
pageBlog.add_link(pageHome)
pageReviews.add_link(pageHome)

pageBlog.add_link(pageB1)
pageBlog.add_link(pageB2)
pageBlog.add_link(pageB3)
pageBlog.add_link(pageB4)

pageB4.add_link(pageB3)
pageB3.add_link(pageB4)
pageB3.add_link(pageB2)
pageB2.add_link(pageB3)
pageB2.add_link(pageB1)

pageB1.add_link(pageHome)
pageB2.add_link(pageHome)
pageB3.add_link(pageHome)
pageB4.add_link(pageHome)

pageReviews.add_link(pageR1)
pageReviews.add_link(pageR2)
pageReviews.add_link(pageR3)
pageReviews.add_link(pageR4)
pageReviews.add_link(pageR5)


pages.append(pageHome)
pages.append(pageAbout)
pages.append(pageTeam)
pages.append(pageBlog)
pages.append(pageB1)
pages.append(pageB2)
pages.append(pageB3)
pages.append(pageB4)
pages.append(pageReviews)
pages.append(pageR1)
pages.append(pageR2)
pages.append(pageR3)
pages.append(pageR4)
pages.append(pageR5)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

Home: 2.3281620028031504
About: 0.6865303755255907
Team: 0.6865303755255907
Blog: 0.6865303755255907
B1: 0.4837127253395196
B2: 0.5229326760427239
B3: 0.6798124788555411
B4: 0.5229326760427239
Reviews: 0.6865303755255907
R1: 0.3358162969406988
R2: 0.3358162969406988
R3: 0.3358162969406988
R4: 0.3358162969406988
R5: 0.3358162969406988


**(4)**

Ideas:
1. You could create a bunch of hidden pages on your site that are linked to secretly on each blog post that all link to the homepage and nothing else.
2. You could create more websites that all link to each other and your site. (Not modelable).
3. Pay other people who have high PageRank sites to link to your page. (Also not modelable).

In [45]:
pages = []
pageHome = Webpage("Home")
pageAbout = Webpage("About")
pageTeam = Webpage("Team")
pageBlog = Webpage("Blog")
pageB1 = Webpage("B1")
pageB2 = Webpage("B2")
pageB3 = Webpage("B3")
pageB4 = Webpage("B4")
pageReviews = Webpage("Reviews")
pageR1 = Webpage("R1")
pageR2 = Webpage("R2")
pageR3 = Webpage("R3")
pageR4 = Webpage("R4")
pageR5 = Webpage("R5")
pageH1 = Webpage("H1")
pageH2 = Webpage("H2")
pageH3 = Webpage("H3")
pageH4 = Webpage("H4")
pageH5 = Webpage("H5")
pageH6 = Webpage("H6")
pageH7 = Webpage("H7")
pageH8 = Webpage("H8")
pageH9 = Webpage("H9")
pageH10 = Webpage("H10")

pageHome.add_link(pageAbout)
pageHome.add_link(pageTeam)
pageHome.add_link(pageBlog)
pageHome.add_link(pageReviews)

pageAbout.add_link(pageHome)
pageTeam.add_link(pageHome)
pageBlog.add_link(pageHome)
pageReviews.add_link(pageHome)

pageBlog.add_link(pageB1)
pageBlog.add_link(pageB2)
pageBlog.add_link(pageB3)
pageBlog.add_link(pageB4)

pageB4.add_link(pageB3)
pageB3.add_link(pageB4)
pageB3.add_link(pageB2)
pageB2.add_link(pageB3)
pageB2.add_link(pageB1)

pageB1.add_link(pageHome)
pageB2.add_link(pageHome)
pageB3.add_link(pageHome)
pageB4.add_link(pageHome)

pageReviews.add_link(pageR1)
pageReviews.add_link(pageR2)
pageReviews.add_link(pageR3)
pageReviews.add_link(pageR4)
pageReviews.add_link(pageR5)

hidden_pages = [pageH1, pageH2, pageH3, pageH4, pageH5,
                pageH6, pageH7, pageH8, pageH9, pageH10]

# every blog post secretly links to every hidden page
for post in (pageB1, pageB2, pageB3, pageB4):
    for hidden in hidden_pages:
        post.add_link(hidden)

# every hidden page links only to the homepage
for hidden in hidden_pages:
    hidden.add_link(pageHome)

pages.append(pageHome)
pages.append(pageAbout)
pages.append(pageTeam)
pages.append(pageBlog)
pages.append(pageB1)
pages.append(pageB2)
pages.append(pageB3)
pages.append(pageB4)
pages.append(pageReviews)
pages.append(pageR1)
pages.append(pageR2)
pages.append(pageR3)
pages.append(pageR4)
pages.append(pageR5)
pages.append(pageH1)
pages.append(pageH2)
pages.append(pageH3)
pages.append(pageH4)
pages.append(pageH5)
pages.append(pageH6)
pages.append(pageH7)
pages.append(pageH8)
pages.append(pageH9)
pages.append(pageH10)

pageRank = PageRank(pages, debug=False, damping_factor=0.75, supernode=False)
pageRank.run_page_rank(100)
pageRank.print_table(show_supernode=True)

Home: 5.329371557878538
About: 1.249257167102226
Team: 1.249257167102226
Blog: 1.249257167102226
B1: 0.464264700150812
B2: 0.4658528348149539
B3: 0.49338050232674663
B4: 0.4658528348149539
Reviews: 1.249257167102226
R1: 0.40615714588777824
R2: 0.40615714588777824
R3: 0.40615714588777824
R4: 0.40615714588777824
R5: 0.40615714588777824
H1: 0.36611059838495175
H2: 0.36611059838495175
H3: 0.36611059838495175
H4: 0.36611059838495175
H5: 0.36611059838495175
H6: 0.36611059838495175
H7: 0.36611059838495175
H8: 0.36611059838495175
H9: 0.36611059838495175
H10: 0.36611059838495175


As you can see, adding 10 hidden pages that are all linked from the blog posts and all link to the homepage boosts the homepage PR from 2.33 to 5.33.
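
Although ideas 2 and 3 are marked as not modelable above, one rough approximation is to treat the outside site as a single page whose PageRank is held constant, which is what the `fixed_pagerank` argument of `Webpage` is described as being for. The sketch below is a standalone version of that idea on a plain dict, not the provided classes. The `Partner` page, its score of 5.0, and the assumption that it links only to the homepage are all made up for illustration, so the size of the boost is not meaningful, only its direction.

```python
def pagerank(links, fixed=None, damping_factor=0.75, iterations=100):
    """Run the notebook's update rule; pages listed in `fixed` keep their score."""
    fixed = fixed or {}
    pages = list(links)
    ranks = {page: fixed.get(page, 1 / len(pages)) for page in pages}
    for _ in range(iterations):
        new_ranks = {page: 1 - damping_factor for page in pages}
        for source, targets in links.items():
            for target in targets:
                new_ranks[target] += damping_factor * ranks[source] / len(targets)
        # external pages are reset to their fixed score instead of being updated
        new_ranks.update(fixed)
        ranks = new_ranks
    return ranks


# the four-page site from question 1
site = {
    "Home": ["About", "Team", "Blog"],
    "About": ["Home"],
    "Team": ["Home"],
    "Blog": ["Home"],
}

# hypothetical external page that has been paid to link to the homepage
site_with_paid_link = dict(site, Partner=["Home"])

home_before = pagerank(site)["Home"]
home_after = pagerank(site_with_paid_link, fixed={"Partner": 5.0})["Home"]
print(f"Home without paid link: {home_before:.3f}")
print(f"Home with paid link:    {home_after:.3f}")
```

A real partner page would link to many other pages as well, which would divide the score it passes on; modeling that would mean adding those pages to the dict.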

**(5)**

Ideas:
1. The more blog posts you create, the higher your homepage will rank.
2. You could nicely email other people with similar blogs and see if they are willing to link to your site.
3. You could restructure your website.
4. You could work on getting your blog more attention through advertising and people would naturally start linking to your posts.
5. Create an Instagram and Facebook and Twitter and everything for your blog and link to it a lot.