<b>The Problem Statement</b>:

Form triples based on the following paragraph:

"Alice is enrolled in Computer Science 101. Bob is enrolled in Physics 201. Charlie is enrolled in Mathematics 301. Computer Science 101 is taught by Professor Smith. Physics 201 is taught by Professor Johnson. Mathematics 301 is taught by Professor Brown."

Use the paragraph above to extract triples and build a complete graph representing the relationships between students, courses, and instructors in a university setting.







## The Code

### Imports

In [None]:
!pip install pyvis

In [2]:
### NLTK libraries for triples extraction
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk import pos_tag

### To plot a networkx graph in pyvis
import networkx as nx
from pyvis.network import Network
from IPython.display import HTML, IFrame, display

from pprint import pprint  

In [3]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\saran\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\saran\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


True

### Definition of the Triples

In [4]:
# Manually define the triples from the paragraph in (subject, predicate, object) format as a list of tuples.
# Names and predicates are spelled exactly as in the paragraph so they match the extracted triples.
triples = [
    ('Alice', 'enrolled in', 'Computer Science 101'),
    ('Bob', 'enrolled in', 'Physics 201'),
    ('Charlie', 'enrolled in', 'Mathematics 301'),
    ('Computer Science 101', 'taught by', 'Professor Smith'),
    ('Physics 201', 'taught by', 'Professor Johnson'),
    ('Mathematics 301', 'taught by', 'Professor Brown')
]
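
Hand-written triples can drift from the source text, for example through an abbreviated course name. A minimal sanity check, assuming every subject and object should appear verbatim in the paragraph, might look like the sketch below. `find_unmatched_entities` is a hypothetical helper name and is not part of the original notebook.

```python
def find_unmatched_entities(triples, text):
    """Return subjects/objects from the triples that do not occur verbatim in the text."""
    missing = []
    for subject, _predicate, obj in triples:
        for entity in (subject, obj):
            if entity not in text and entity not in missing:
                missing.append(entity)
    return missing

text = "Alice is enrolled in Computer Science 101. Computer Science 101 is taught by Professor Smith."
sample_triples = [
    ("Alice", "enrolled in", "CS 101"),  # abbreviated name, not in the text
    ("Computer Science 101", "taught by", "Professor Smith"),
]

unmatched = find_unmatched_entities(sample_triples, text)
print(unmatched)
```

An empty list means every entity in the triples was found in the text.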

### Graph Building using Networkx

In [5]:
# Function to build a NetworkX graph from extracted triples
def build_networkx_graph(triples):
    """
    Builds a NetworkX graph from a list of subject-predicate-object triples.

    Args:
        triples (list): A list of extracted triples, each represented as a tuple (subject, predicate, object).

    Returns:
        networkx.Graph: A NetworkX graph representing relationships between students, courses, and instructors.
    """
    # Node types implied by each predicate: (subject type, object type)
    node_types = {
        "enrolled in": ("student", "course"),
        "taught by": ("course", "instructor"),
    }
    # Edge color for each relationship type; any other predicate is drawn in green
    edge_colors = {"enrolled in": "blue", "taught by": "red"}

    # Initialize an empty NetworkX graph
    G = nx.Graph()

    # Iterate through the triples and add nodes and edges to the graph
    for triple in triples:
        print(triple)
        subject, action, objects = triple
        # Accept both "enrolled_in" and "enrolled in" spellings of a predicate
        key = action.replace("_", " ")

        # Add nodes with node types (student, course, instructor)
        subject_type, object_type = node_types.get(key, ("unknown", "unknown"))
        G.add_node(subject, type=subject_type)
        G.add_node(objects, type=object_type)

        # Add the edge with its relationship type and color
        edge_color = edge_colors.get(key, "green")
        G.add_edge(subject, objects, label=action, relation=action, color=edge_color)

    return G

### Graph Visualization using PyVis

In [6]:
# Function to save the graph as "university_relationship_graph.html" using PyVis
def save_graph_pyvis(triples):
    """
    Builds a NetworkX graph from the triples, visualizes it using PyVis, and saves it as an HTML file.

    Args:
        triples (list): A list of (subject, predicate, object) tuples.

    Returns:
        None
    """
    # Create a NetworkX graph from the triples
    G = build_networkx_graph(triples)

    # Create a PyVis Network object and copy the nodes and edges into it
    nt = Network(height="750px", width="100%", bgcolor="#222222", font_color="white", notebook=True)
    nt.from_nx(G)

    # Set edge smoothing, the physics control panel, and the physics layout
    nt.set_edge_smooth('dynamic')
    nt.show_buttons(filter_=['physics'])
    nt.force_atlas_2based()

    # Save the graph as an HTML file
    nt.show("university_relationship_graph.html")

### Main function to solve the problem

In [7]:
def give_triples(paragraph):
    """Extracts (subject, predicate, object) triples from sentences of the form "X is <predicate> Y."."""
    predicates = ("enrolled in", "taught by")
    triples = []

    # Tokenize the paragraph into sentences and iterate through them
    for sentence in sent_tokenize(paragraph):
        # Drop the trailing period
        sentence = sentence.strip().rstrip(".")

        for predicate in predicates:
            # Split on the full phrase " is <predicate> " so that names containing "is" (e.g. "Chris") stay intact
            subject, separator, objects = sentence.partition(f" is {predicate} ")
            if separator:
                triples.append((subject.strip(), predicate, objects.strip()))
                break

    return triples
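
The splitting logic in `give_triples` can also be expressed as a single regular expression. The sketch below is one possible alternative using only the standard-library `re` module. It assumes every sentence has the shape "<subject> is <predicate> <object>." with one of the two predicates from the paragraph, and the name `extract_triples_regex` is hypothetical.

```python
import re

# One sentence of the form "<subject> is <enrolled in|taught by> <object>."
TRIPLE_PATTERN = re.compile(r"\s*(.+?) is (enrolled in|taught by) (.+?)\.(?=\s|$)")

def extract_triples_regex(text):
    """Hypothetical alternative extractor: returns a list of (subject, predicate, object) tuples."""
    return [match.groups() for match in TRIPLE_PATTERN.finditer(text)]

sample = "Alice is enrolled in Physics 201. Physics 201 is taught by Professor Johnson."
for triple in extract_triples_regex(sample):
    print(triple)
```

Sentences that do not contain either predicate are simply skipped, so the function returns an empty list for unrelated text.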

In [8]:
# Given paragraph
paragraph = "Alice is enrolled in Computer Science 101. Bob is enrolled in Physics 201. Charlie is enrolled in Mathematics 301. Computer Science 101 is taught by Professor Smith. Physics 201 is taught by Professor Johnson. Mathematics 301 is taught by Professor Brown."

triples = give_triples(paragraph)

# Build the graph from the extracted triples and save it as an HTML file
save_graph_pyvis(triples)

('Alice', 'enrolled in', 'Computer Science 101')
('Bob', 'enrolled in', 'Physics 201')
('Charlie', 'enrolled in', 'Mathematics 301')
('Computer Science 101', 'taught by', 'Professor Smith')
('Physics 201', 'taught by', 'Professor Johnson')
('Mathematics 301', 'taught by', 'Professor Brown')
university_relationship_graph.html
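
Once the triples are available, chaining the two relation types answers questions such as which instructor teaches each student. The sketch below does this with plain dictionaries rather than the graph object. `instructors_by_student` is a hypothetical helper, and it assumes each student is enrolled in one course and each course has one instructor, as in the paragraph.

```python
def instructors_by_student(triples):
    """Map each student to the instructor of the course they are enrolled in."""
    course_of = {s: o for s, p, o in triples if p == "enrolled in"}
    instructor_of = {s: o for s, p, o in triples if p == "taught by"}
    # Follow student -> course -> instructor; skip courses with no known instructor
    return {
        student: instructor_of[course]
        for student, course in course_of.items()
        if course in instructor_of
    }

extracted = [
    ("Alice", "enrolled in", "Computer Science 101"),
    ("Bob", "enrolled in", "Physics 201"),
    ("Computer Science 101", "taught by", "Professor Smith"),
    ("Physics 201", "taught by", "Professor Johnson"),
]
print(instructors_by_student(extracted))
```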
