
Program 1

Write a Python program to implement the PageRank algorithm.
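
For reference, PageRank is commonly described as a power iteration, which can be written as a single update rule. The symbols are introduced here for illustration and are not from the original exercise: $A_{ij} = 1$ when page $j$ links to page $i$, $M$ is the column-normalized version of $A$, $d$ is the damping factor, $N$ is the number of pages, and $\mathbf{1}$ is the all-ones vector.

```latex
r^{(t+1)} = d \, M \, r^{(t)} + \frac{1 - d}{N} \, \mathbf{1},
\qquad
M_{ij} = \frac{A_{ij}}{\sum_{k} A_{kj}}
```

Because every column of $M$ sums to 1, the entries of $r$ keep summing to 1 from one iteration to the next, provided they start that way.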

In [1]:
import numpy as np

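def calculate_pagerank(adjacency_matrix, damping_factor=0.85, max_iterations=100, convergence_threshold=0.0001):
    # Convention: adjacency_matrix[i, j] = 1 means page j links to page i.
    num_pages = adjacency_matrix.shape[0]

    # Normalize each column so that a page splits its rank evenly among its
    # out-links; without this the ranks grow without bound instead of converging.
    # Pages with no out-links are treated as linking to every page.
    out_degree = adjacency_matrix.sum(axis=0)
    transition = np.where(out_degree > 0,
                          adjacency_matrix / np.maximum(out_degree, 1),
                          1.0 / num_pages)

    rank = np.ones(num_pages) / num_pages

    for _ in range(max_iterations):
        prev_rank = rank.copy()
        rank = damping_factor * np.dot(transition, rank) + (1 - damping_factor) / num_pages

        if np.linalg.norm(rank - prev_rank) < convergence_threshold:
            break

    return rank

# Example usage
adjacency_matrix = np.array([[0, 1, 1],
                             [1, 0, 0],
                             [0, 1, 0]])

pagerank = calculate_pagerank(adjacency_matrix)
print(pagerank)  # ranks now sum to 1: roughly 0.40, 0.39, 0.21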


Program 2

Link prediction estimates which links are likely to form in a network in the future. It is the
idea behind features such as Facebook's People You May Know, Amazon's suggestions of items
you are likely to be interested in, and Zomato's recommendations of food you are likely to order.
Implement link prediction using the Jaccard coefficient.

In [5]:
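def jaccard_coefficient(set1, set2):
    # Size of the intersection divided by the size of the union.
    intersection = len(set1.intersection(set2))
    union = len(set1) + len(set2) - intersection
    # Two empty neighbor sets would divide by zero; score them as 0.
    return intersection / union if union else 0.0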

def link_prediction_jaccard(graph, node):
    neighbors = set(graph[node])
    scores = []

    for other_node in graph:
        if other_node != node:
            other_neighbors = set(graph[other_node])
            score = jaccard_coefficient(neighbors, other_neighbors)
            scores.append((other_node, score))

    scores.sort(key=lambda x: x[1], reverse=True)
    return scores

graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A', 'C'],
    'C': ['A', 'B', 'D'],
    'D': ['A', 'C'],
    'E': ['F'],
    'F': ['E']
}

node = 'A'
predictions = link_prediction_jaccard(graph, node)

print(f"Link predictions for node {node}:")
for prediction, score in predictions:
    #The nodes with higher Jaccard coefficients are more likely to have a link with the specified node.
    print(f"Node: {prediction}, Jaccard coefficient: {score}") 


Link predictions for node A:
Node: C, Jaccard coefficient: 0.5
Node: B, Jaccard coefficient: 0.25
Node: D, Jaccard coefficient: 0.25
Node: E, Jaccard coefficient: 0.0
Node: F, Jaccard coefficient: 0.0
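
The scores above also cover nodes that are already linked to A. A possible variation, sketched under the assumption that only nodes not yet connected should be suggested, skips existing neighbors. The names `jaccard`, `predict_new_links` and `sample_graph` are hypothetical and only used for this illustration.

```python
def jaccard(a, b):
    # Size of the intersection over size of the union; 0.0 if both sets are empty.
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def predict_new_links(graph, node):
    # Score only the nodes that are not already neighbors of `node`.
    neighbors = set(graph[node])
    candidates = []
    for other in graph:
        if other == node or other in neighbors:
            continue
        candidates.append((other, jaccard(neighbors, set(graph[other]))))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates

# Same toy graph as in the cell above.
sample_graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A', 'C'],
    'C': ['A', 'B', 'D'],
    'D': ['A', 'C'],
    'E': ['F'],
    'F': ['E'],
}

for candidate, score in predict_new_links(sample_graph, 'B'):
    print(f"Node: {candidate}, Jaccard coefficient: {score}")
```

For node B, the only candidate with a non-zero score is D, since B and D share exactly the same neighbors (A and C).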


Program 3

Implement a link prediction recommendation engine with Node2Vec.


In [7]:
!pip install node2vec


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [11]:
import networkx as nx
from node2vec import Node2Vec
from sklearn.metrics.pairwise import cosine_similarity  # used by link_prediction() below

# Create a graph using NetworkX (you can use your own graph as well)
graph = nx.Graph()
graph.add_edges_from([('1', '2'), ('1', '3'), ('2', '3'), ('2', '4'), ('3', '4')])

# Generate node embeddings using Node2Vec
node2vec = Node2Vec(graph, dimensions=64, walk_length=30, num_walks=200, workers=4)
model = node2vec.fit(window=10, min_count=1, batch_words=4)

# Get the node embeddings
embeddings = {node: model.wv[str(node)] for node in graph.nodes()}

def link_prediction(embeddings, node1, node2):
    emb1 = embeddings[str(node1)]
    emb2 = embeddings[str(node2)]
    similarity = cosine_similarity([emb1], [emb2])
    return similarity[0][0]

def recommendation_engine(embeddings, node, top_k=5):
    scores = []
    for other_node in embeddings:
        if other_node != str(node):
            score = link_prediction(embeddings, node, other_node)
            scores.append((other_node, score))
    scores.sort(key=lambda x: x[1], reverse=True)
    return scores[:top_k]

# Example usage
node = '1'
recommendations = recommendation_engine(embeddings, node, top_k=3)
print(f"Recommendations for node {node}:")
for recommendation, score in recommendations:
    print(f"Node: {recommendation}, Similarity Score: {score}")


Recommendations for node 1:
Node: 4, Similarity Score: 0.9594818353652954
Node: 3, Similarity Score: 0.947777509689331
Node: 2, Similarity Score: 0.9415410757064819
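
The similarity score used above is the cosine of the angle between two embedding vectors. The following is a minimal pure-Python sketch of that measure. The function name `cosine` and the three-dimensional vectors are made up for illustration; they are not real Node2Vec output.

```python
import math

def cosine(u, v):
    # Dot product divided by the product of the vector lengths.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, chosen only to show the behavior.
same_direction = cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])   # close to 1
orthogonal = cosine([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])       # exactly 0
print(same_direction, orthogonal)
```

A score near 1 means the two nodes appeared in similar random-walk contexts, which is why the highest-scoring nodes are returned as recommendations.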


Program 4

a. Given a web page, extract all of its hyperlinks.

In [12]:
import requests
from bs4 import BeautifulSoup

def extract_hyperlinks(url):
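    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
    soup = BeautifulSoup(response.text, 'html.parser')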
    hyperlinks = []

    for link in soup.find_all('a'):
        href = link.get('href')
        if href:
            hyperlinks.append(href)

    return hyperlinks

# Example usage
webpage_url = 'https://en.wikipedia.org/wiki/E-commerce'
hyperlinks = extract_hyperlinks(webpage_url)
for hyperlink in hyperlinks:
    print(hyperlink)


#bodyContent
/wiki/Main_Page
/wiki/Wikipedia:Contents
/wiki/Portal:Current_events
/wiki/Special:Random
/wiki/Wikipedia:About
//en.wikipedia.org/wiki/Wikipedia:Contact_us
https://donate.wikimedia.org/wiki/Special:FundraiserRedirector?utm_source=donate&utm_medium=sidebar&utm_campaign=C13_en.wikipedia.org&uselang=en
/wiki/Help:Contents
/wiki/Help:Introduction
/wiki/Wikipedia:Community_portal
/wiki/Special:RecentChanges
/wiki/Wikipedia:File_upload_wizard
/wiki/Main_Page
/wiki/Special:Search
/w/index.php?title=Special:CreateAccount&returnto=E-commerce
/w/index.php?title=Special:UserLogin&returnto=E-commerce
/w/index.php?title=Special:CreateAccount&returnto=E-commerce
/w/index.php?title=Special:UserLogin&returnto=E-commerce
/wiki/Help:Introduction
/wiki/Special:MyContributions
/wiki/Special:MyTalk
#
#Defining_e-commerce
#Forms
#Governmental_regulation
#Global_trends
#Logistics
#Impacts
#Impact_on_markets_and_retailers
#Impact_on_supply_chain_management
#Impact_on_employment
#Impact_on_custom
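
Many of the extracted values are relative (`/wiki/...`), protocol-relative (`//en.wikipedia.org/...`) or in-page anchors (`#...`). One way to turn them into usable URLs, sketched here with the standard library's `urljoin`, is to resolve each one against the page URL. The helper name `absolute_links` and the short sample list are assumptions made for this illustration.

```python
from urllib.parse import urljoin

def absolute_links(base_url, hrefs):
    """Resolve hrefs against base_url, skipping in-page anchors and duplicates."""
    seen = set()
    resolved = []
    for href in hrefs:
        if href.startswith('#'):
            continue  # in-page anchor, not a separate page
        url = urljoin(base_url, href)
        if url not in seen:
            seen.add(url)
            resolved.append(url)
    return resolved

base = 'https://en.wikipedia.org/wiki/E-commerce'
# A few values taken from the output above.
sample = [
    '#bodyContent',
    '/wiki/Main_Page',
    '//en.wikipedia.org/wiki/Wikipedia:Contact_us',
    '/wiki/Main_Page',
]
for url in absolute_links(base, sample):
    print(url)
```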

Program 4

b. Retrieve all titles (link texts) from a Wikipedia web page.

In [16]:
import requests
from bs4 import BeautifulSoup

def retrieve_titles_from_wikipedia(url):
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail early on HTTP errors
    soup = BeautifulSoup(response.text, 'html.parser')
    titles = []

    for link in soup.find_all('a'):
        # Strip surrounding whitespace so links without visible text are skipped.
        title = link.get_text(strip=True)
        if title:
            titles.append(title)

    return titles

# Example usage
wikipedia_url = 'https://en.wikipedia.org/wiki/Social_media_analytics'
titles = retrieve_titles_from_wikipedia(wikipedia_url)
for title in titles:
    print(title)


Jump to content
Main page
Contents
Current events
Random article
About Wikipedia
Contact us
Donate
Help
Learn to edit
Community portal
Recent changes
Upload file
Search
Create account
Log in
Create account
Log in
learn more
Contributions
Talk
(Top)
1Process
1.1Data identification
1.2Data analysis
1.3Information interpretation
2Techniques
3Impacts on business intelligence
4Role in international politics
4.12016 United States Presidential Election
4.22020 United States Presidential Election Controversies
4.3Brexit
5See also
6References

Afrikaans
العربية
Euskara
فارسی
한국어
עברית
Magyar
Norsk bokmål
Polski
Português
Русский
کوردی
Српски / srpski
Suomi
Edit links
Article
Talk
Read
Edit
View history
Read
Edit
View history
What links here
Related changes
Upload file
Special pages
Permanent link
Page information
Cite this page
Wikidata item
Download as PDF
Printable version
social network analysis
improve it
talk page
Learn how and when to remove

Program 5

Automatically search Stack Overflow for errors in code using Python.

In [40]:
# Import dependencies
from subprocess import Popen, PIPE
import requests
import webbrowser

# We are going to write code to read and run python
# file, and store its output or error.
def execute_return(cmd):
    args = cmd.split()
    proc = Popen(args, stdout=PIPE, stderr=PIPE)
    out, err = proc.communicate()
    return out, err

# Query the Stack Exchange API for Stack Overflow questions tagged
# "python" whose title contains the given error text, and return
# the parsed JSON response. Passing the query through `params`
# URL-encodes the error text.
def mak_req(error):
    resp = requests.get("https://api.stackexchange.com/2.2/search",
                        params={"order": "desc", "tagged": "python", "sort": "activity",
                                "intitle": error, "site": "stackoverflow"})
    return resp.json()

# Take the JSON returned by mak_req, look at the first three
# results, keep the URLs of those marked as "answered" by
# Stack Overflow, and open each of them in a browser tab.
def get_urls(json_dict):
    url_list = []

    for item in json_dict.get('items', [])[:3]:
        if item['is_answered']:
            url_list.append(item["link"])

    for url in url_list:
        webbrowser.open(url)


# Run the given Python file and capture its output and error.
out, err = execute_return("python /content/brr.py")


# Keep only the last line of the traceback, which names the exception.
# splitlines() handles both "\n" and "\r\n" line endings.
error_lines = err.decode("utf-8").strip().splitlines()
error = error_lines[-1] if error_lines else ""
print(error)


# If an error was found, search for the exception type, the message,
# and the full error line; otherwise print "No error".
if error:
    error_type, _, error_message = error.partition(":")
    get_urls(mak_req(error_type.strip()))
    if error_message.strip():
        get_urls(mak_req(error_message.strip()))
    get_urls(mak_req(error))
else:
    print("No error")


NameError: name 'a' is not defined
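
The last line of a Python traceback normally has the form `ExceptionType: message`, which is what the search terms above are built from. As a small standalone sketch, the helper below (the name `parse_error_line` is hypothetical) splits captured stderr text into those two parts without raising when the text is empty or has no colon.

```python
def parse_error_line(stderr_text):
    """Return (exception_type, message) from the last line of a traceback.

    Returns None when there is no stderr text at all.
    """
    lines = stderr_text.strip().splitlines()
    if not lines:
        return None
    # partition() never raises: with no colon the message is simply empty.
    exc_type, _, message = lines[-1].partition(':')
    return exc_type.strip(), message.strip()

# The traceback produced by the example file above.
sample_stderr = (
    'Traceback (most recent call last):\n'
    '  File "/content/brr.py", line 2, in <module>\n'
    '    print (a)\n'
    "NameError: name 'a' is not defined\n"
)
print(parse_error_line(sample_stderr))
```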


Program 6

Install Pytrends and do the following tasks:

- Connect to Google
- Build Payload
- Interest Over Time
- Historical Hourly Interest
- Interest by Region
- Top Charts
- Related Queries
- Keyword Suggestion

In [2]:
!pip install pytrends

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pytrends
  Downloading pytrends-4.9.2-py3-none-any.whl (15 kB)
Installing collected packages: pytrends
Successfully installed pytrends-4.9.2


In [4]:
from pytrends.request import TrendReq

# Connect to Google
pytrends = TrendReq(hl='en-US', tz=360)

# Build Payload
keyword = 'Python programming'
pytrends.build_payload(kw_list=[keyword])

# Interest Over Time
interest_over_time = pytrends.interest_over_time()

# Interest by Region
interest_by_region = pytrends.interest_by_region()

# Top Charts (here: the currently trending searches in the United States)
top_charts = pytrends.trending_searches(pn='united_states')

# Related Queries
related_queries = pytrends.related_queries()

# Keyword Suggestion
keyword_suggestions = pytrends.suggestions(keyword)

# Print the results
print("Interest Over Time:")
print(interest_over_time)

print("\nInterest by Region:")
print(interest_by_region)

print("\nTop Charts:")
print(top_charts)

print("\nRelated Queries:")
print(related_queries)

print("\nKeyword Suggestions:")
print(keyword_suggestions)


Interest Over Time:
            Python programming  isPartial
date                                     
2018-06-17                  52      False
2018-06-24                  54      False
2018-07-01                  47      False
2018-07-08                  52      False
2018-07-15                  50      False
...                        ...        ...
2023-05-07                  72      False
2023-05-14                  71      False
2023-05-21                  71      False
2023-05-28                  67      False
2023-06-04                  71       True

[260 rows x 2 columns]

Interest by Region:
                Python programming
geoName                           
Afghanistan                      0
Albania                          0
Algeria                          7
American Samoa                   0
Andorra                          0
...                            ...
Western Sahara                   0
Yemen                            0
Zambia                           0
Zimb