# Naina Chabra
# 210968234
# week-2


**QUESTION 1** 


**The Game :**

According to the “Six Degrees of Kevin Bacon” game, anyone in the
Hollywood film industry can be connected to Kevin Bacon within six steps, where each
step consists of finding a film that two actors both starred in. To solve the problem,
find the shortest path between any two actors by choosing a sequence of movies that
connects them. For example, the shortest path between Jennifer Lawrence and Tom
Hanks is 2: Jennifer Lawrence is connected to Kevin Bacon by both starring in “X-Men:
First Class,” and Kevin Bacon is connected to Tom Hanks by both starring in “Apollo
13.”


**Problem Solving Agent:**

Given two actor nodes in the graph, we need to find the distance (shortest
path) between them. Write a Python program to determine how many “degrees
of separation” apart two actors are, using:
a. Breadth-first search
b. Depth-first search

In [2]:
import pandas as pd

# Reading all the datasets
df1 = pd.read_csv('movies.csv')
df2 = pd.read_csv('people.csv')
df3 = pd.read_csv('stars.csv')

# Renaming the columns
df1.rename(columns = {'id':'movie_id'}, inplace = True)
df2.rename(columns = {'id':'person_id'}, inplace = True)

# Merging the datasets
df = df3.merge(df1, on='movie_id', how='left').merge(df2, on='person_id', how='left')

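# Dropping columns that are not needed for the search
df = df.drop(['birth', 'person_id', 'movie_id'], axis=1)

# Sorting the columns alphabetically by name; the loaders below read
# column 0 as the actor and column 1 as the movie
df = df.sort_index(axis=1)

# Saving the merged dataset
df.to_csv('small.csv', index=False)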

# Implementing BFS

For breadth-first search we need to:

1: Maintain a queue that pops the node at the current lowest level first

2: Keep track of the trajectory


path : the nodes visited so far on that particular expansion

queue : holds (current_node, current_path) pairs and pops the node at the lowest level first

Note that to avoid visiting a node a second time, we use a closed list called visited. The `bfs` function in cell In [3] stores only the length of the path (the degree) in the queue rather than the path itself.
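
The description above keeps the whole path in the queue. A minimal sketch of that variant is shown here on a hypothetical toy graph; the function name `bfs_path`, the graph `toy_graph`, and its letter-named nodes are illustrative assumptions, not part of the dataset.

```python
from collections import deque

def bfs_path(graph, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    visited = {start}                   # closed list: nodes already queued
    queue = deque([(start, [start])])   # FIFO of (current_node, current_path)
    while queue:
        node, path = queue.popleft()    # oldest entry is at the lowest level
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):  # sorted only to make the order repeatable
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None

# Hypothetical toy graph; letters stand in for actors and movies
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

print(bfs_path(toy_graph, "A", "E"))  # ['A', 'B', 'D', 'E']
```

Storing the path costs extra memory per queue entry, which is one reason to keep only a counter when the distance alone is needed.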




In [3]:
import csv
from collections import deque

# Function to load data from the CSV file into an adjacency dictionary.
# The values in column 0 and column 1 both become nodes of the graph.
def load_data(filename):
    data = {}
    with open(filename, "r", newline="", encoding="utf-8") as file:
        reader = csv.reader(file)
        next(reader)  # skip the header row
        for row in reader:
            # add an undirected edge between the two columns
            data.setdefault(row[0], set()).add(row[1])
            data.setdefault(row[1], set()).add(row[0])
    return data

# Function to find the shortest distance between two actors using BFS.
# The result counts edges in the graph built by load_data.
def bfs(start, end, data):
    if start not in data or end not in data:
        return None
    visited = {start}  # mark nodes when queued so none is queued twice
    queue = deque([(start, 0)])
    while queue:
        actor, degree = queue.popleft()
        if actor == end:
            return degree
        for neighbor in data[actor]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, degree + 1))
    return None

# Main function
def main():

    data = load_data('small.csv')
    start = "Bill Paxton"
    end = "Robin Wright"
    degree = bfs(start, end, data)
    if degree is None:
        print("No connection found")
    else:
        print(f"{start} and {end}: Degree of Separation = {degree} ")

if __name__ == "__main__":
    main()


Bill Paxton and Robin Wright: Degree of Separation = 4 
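
The loader treats both CSV columns as nodes. If the second column holds movie titles (as the DFS output later in this notebook suggests), the value printed above counts actor-movie edges, and the number of connecting films would be half of it. The sketch below illustrates this on the example from the game description; the helper names `build_graph` and `edge_distance` are assumptions made for illustration.

```python
from collections import deque

# Toy (name, title) rows taken from the example in the game description
rows = [
    ("Jennifer Lawrence", "X-Men: First Class"),
    ("Kevin Bacon", "X-Men: First Class"),
    ("Kevin Bacon", "Apollo 13"),
    ("Tom Hanks", "Apollo 13"),
]

def build_graph(rows):
    """Adjacency sets in which actors and movie titles are both nodes."""
    graph = {}
    for name, title in rows:
        graph.setdefault(name, set()).add(title)
        graph.setdefault(title, set()).add(name)
    return graph

def edge_distance(graph, start, end):
    """BFS distance counted in actor-movie edges, or None if unreachable."""
    if start not in graph or end not in graph:
        return None
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == end:
            return dist
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

graph = build_graph(rows)
edges = edge_distance(graph, "Jennifer Lawrence", "Tom Hanks")
print(edges)       # 4 edges: actor -> movie -> actor -> movie -> actor
print(edges // 2)  # 2 films, matching the example in the game description
```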


# Implementing DFS



**The working logic of the DFS algorithm consists of the following steps:**

1: Select the start node.

2: Visit the start node.

3: Explore the neighbors of the start node.

4: If there are undiscovered neighbor nodes, move to one of them and repeat the steps (recursively or using a stack data structure).

5: When there are no undiscovered neighbor nodes, go back to the previous node.

6: Repeat the steps until the target node is found or all nodes have been visited.
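
The steps above can also be written with an explicit stack instead of recursion. The following is a minimal sketch on a hypothetical toy graph; `dfs_iterative` and `toy_graph` are illustrative names, and the sorted neighbor order is only there to make the result repeatable.

```python
def dfs_iterative(graph, start, goal):
    """Return some path from start to goal (not necessarily the shortest), or None."""
    if start not in graph:
        return None
    visited = set()
    stack = [(start, [start])]          # LIFO of (current_node, path_so_far)
    while stack:
        node, path = stack.pop()        # most recently discovered node first
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # push in reverse order so neighbors are explored alphabetically
        for neighbor in sorted(graph[node], reverse=True):
            if neighbor not in visited:
                stack.append((neighbor, path + [neighbor]))
    return None

# Hypothetical toy graph; the shortest A-to-E route is A -> C -> E
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "E"},
    "D": {"B", "E"},
    "E": {"C", "D"},
}

print(dfs_iterative(toy_graph, "A", "E"))  # ['A', 'B', 'D', 'E']
```

On this graph the search dives through B and D before it ever tries C, so the returned path is one step longer than the shortest one. This is why DFS alone does not give the degree of separation.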

In [4]:
import csv

# Load the data from the CSV file into a dictionary
def load_data(filename):
    data = {}
    with open(filename, 'r', newline='', encoding='utf-8') as file:
        reader = csv.reader(file)
        next(reader)  # skip the header row so it does not become a node
        for row in reader:
            actor = row[0]
            movie = row[1]
            if actor not in data:
                data[actor] = set()
            data[actor].add(movie)
            if movie not in data:
                data[movie] = set()
            data[movie].add(actor)
    return data

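# Find a path between two actors using depth first search.
# DFS returns the first path it reaches, which is not necessarily the shortest.
def dfs(data, start, end, path=None):
    if path is None:
        path = [start]
    if start == end:
        return path
    # .get avoids a KeyError when the name is not in the dataset
    for neighbor in data.get(start, ()):
        if neighbor not in path:  # do not revisit nodes already on this path
            new_path = dfs(data, neighbor, end, path + [neighbor])
            if new_path is not None:
                return new_path
    return None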

# Load the data from the CSV file
data = load_data('small.csv')

# Find a path between Kevin Bacon and Tom Cruise
path = dfs(data, 'Kevin Bacon', 'Tom Cruise')

# Print the path found
if path is None:
    print("No path found")
else:
    print(" -> ".join(path))


Kevin Bacon -> A Few Good Men -> Tom Cruise
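
The printed path alternates between actors and movies, so the number of film links can be read off its length. A small sketch, assuming the path always starts and ends with an actor; `degrees_from_path` is a hypothetical helper name.

```python
def degrees_from_path(path):
    """Number of movie links in an alternating actor/movie path."""
    return (len(path) - 1) // 2

# Path copied from the output above
path = ["Kevin Bacon", "A Few Good Men", "Tom Cruise"]

print(degrees_from_path(path))  # 1
print(path[1::2])               # ['A Few Good Men']
```

Because DFS does not guarantee the shortest path, a value computed this way is only an upper bound on the degree of separation.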
