<b>The Problem Statement</b>:

Form triples based on the following paragraph:

"Alice is enrolled in Computer Science 101. Bob is enrolled in Physics 201. Charlie is enrolled in Mathematics 301. Computer Science 101 is taught by Professor Smith. Physics 201 is taught by Professor Johnson. Mathematics 301 is taught by Professor Brown."

Use the above paragraph to extract triples and build a complete graph representing the relationships between students, courses, and instructors in a university setting.

## The Code

### Imports

In [1]:
### NLTK libraries for triples extraction
from nltk.tokenize import word_tokenize
from nltk import pos_tag

### To plot a networkx graph in pyvis
import networkx as nx
from pyvis.network import Network
from IPython.display import HTML
from IPython.display import display,IFrame

In [2]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\HP\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\HP\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


True

### Definition of the Triples

To define triples automatically from the paragraph, you can use a natural language processing library such as spaCy to extract the entities and relationships from the text.

In [3]:
# Manually define the triples from the paragraph as a list of (subject, predicate, object) tuples
triples = [
    ('Alice', 'enrolled_in', 'CS 101'),
    ('Bob', 'enrolled_in', 'Physics 201'),
    ('Charlie', 'enrolled_in', 'Mathematics 301'),
    ('CS 101', 'taught_by', 'Professor Smith'),
    ('Physics 201', 'taught_by', 'Professor Johnson'),
    ('Mathematics 301', 'taught_by', 'Professor Brown')
]

In [4]:
# def triples_extraction(paragraph):
#     sentences = nltk.sent_tokenize(paragraph)
#     triples = []
#     for sentence in sentences:
#         words = nltk.word_tokenize(sentence)
#         if 'enrolled' in words:
#             subject = words[0]
#             course = ' '.join(words[4:6])
#             triples.append((subject, 'enrolled_in', course))
#         elif 'taught' in words:
#             course = ' '.join(words[:2])
#             professor = ' '.join(words[5:7])
#             triples.append((course, 'taught_by', professor))

#     return triples
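
The commented-out function above relies on fixed token positions. As an alternative, here is a minimal sketch that uses only the standard-library `re` module instead of NLTK. The pattern names, the function name `extract_triples_regex`, and the `sample_paragraph` variable are hypothetical, and the patterns assume every sentence has exactly the shape "X is enrolled in Y" or "X is taught by Y".

```python
import re

# Hypothetical patterns for the two sentence shapes found in the paragraph.
ENROLLED = re.compile(r"^(?P<student>.+?) is enrolled in (?P<course>.+)$")
TAUGHT = re.compile(r"^(?P<course>.+?) is taught by (?P<professor>.+)$")


def extract_triples_regex(paragraph):
    """Return (subject, predicate, object) triples from simple sentences."""
    found = []
    # Split on a period that is followed by whitespace or the end of the text.
    for sentence in re.split(r"\.(?:\s+|$)", paragraph):
        sentence = sentence.strip()
        if not sentence:
            continue
        match = ENROLLED.match(sentence)
        if match:
            found.append((match["student"], "enrolled_in", match["course"]))
            continue
        match = TAUGHT.match(sentence)
        if match:
            found.append((match["course"], "taught_by", match["professor"]))
    return found


sample_paragraph = (
    "Alice is enrolled in CS 101. Bob is enrolled in Physics 201. "
    "CS 101 is taught by Professor Smith."
)
print(extract_triples_regex(sample_paragraph))
```

Sentences that match neither pattern are skipped silently, so this sketch is only suitable for text as regular as the example paragraph.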

### Graph Building using Networkx

In [5]:
# Function to build a NetworkX graph from extracted triples
def build_networkx_graph(triples):
    """
    Builds a NetworkX graph from a list of subject-predicate-object triples.

    Args:
        triples (list): A list of extracted triples, each represented as a tuple (subject, predicate, object).

    Returns:
        networkx.Graph: A NetworkX graph representing relationships between students, courses, and instructors.
    """
    # Node types implied by each predicate: (subject type, object type).
    # Without this, the subject of a 'taught_by' triple (a course) would be
    # labelled 'student' whenever it was not already in the graph.
    node_types = {
        'enrolled_in': ('student', 'course'),
        'taught_by': ('course', 'instructor'),
    }

    # Initialize an empty (undirected) NetworkX graph
    G = nx.Graph()

    # Iterate through the triples
    for subject, predicate, obj in triples:
        if predicate not in node_types:
            continue  # Skip predicates this graph does not model

        # Add nodes with node types (student, course, instructor)
        subject_type, obj_type = node_types[predicate]
        if subject not in G:
            G.add_node(subject, node_type=subject_type)
        if obj not in G:
            G.add_node(obj, node_type=obj_type)

        # Add edges with relationship types (enrolled in, taught by)
        G.add_edge(subject, obj, relationship_type=predicate.replace('_', ' '))

    # Return the NetworkX graph
    return G

### Graph Visualization using PyVis

In [6]:
# Function to save the graph as "university_relationship_graph.html" using PyVis
def save_graph_pyvis(graph):
    """
    Visualizes a NetworkX graph using PyVis and saves it as an HTML file.

    Args:
        graph (networkx.Graph): The NetworkX graph to be visualized.

    Returns:
        None
    """
    # Create an empty PyVis Network object
    pyvis_graph = Network()

    # Add nodes to the PyVis graph, colored by node type
    for node, data in graph.nodes(data=True):
        node_type = data['node_type']
        if node_type == 'student':
            color = 'green'
        elif node_type == 'course':
            color = 'blue'
        else:
            color = 'red'
        pyvis_graph.add_node(node, label=node, title=node_type, color=color)

    # Add edges labeled with the relationship type
    for source, target, data in graph.edges(data=True):
        pyvis_graph.add_edge(source, target, label=data['relationship_type'])

    # Save the graph as an HTML file
    pyvis_graph.write_html('university_relationship_graph.html')
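
The color choice in `save_graph_pyvis` could also be expressed as a lookup table, which is easier to extend when new node types appear. This is only a sketch: `NODE_COLORS` and `color_for` are hypothetical names, and the `gray` fallback for unknown types is an assumption (the function above colors every non-student, non-course node red).

```python
# Hypothetical lookup table; the three colors match those used in save_graph_pyvis.
NODE_COLORS = {
    "student": "green",
    "course": "blue",
    "instructor": "red",
}


def color_for(node_type, default="gray"):
    """Return the display color for a node type, or a fallback if unknown."""
    return NODE_COLORS.get(node_type, default)


for kind in ("student", "course", "instructor", "department"):
    print(kind, color_for(kind))
```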

### Main function to solve the problem

In [7]:
# Given paragraph
paragraph = "Alice is enrolled in CS 101. Bob is enrolled in Physics 201. Charlie is enrolled in Mathematics 301. CS 101 is taught by Professor Smith. Physics 201 is taught by Professor Johnson. Mathematics 301 is taught by Professor Brown."

# Extract triples using NLTK
# triples = triples_extraction(paragraph)

# Build a graph using the manually defined triples
graph = build_networkx_graph(triples)

# Visualize the graph
save_graph_pyvis(graph)

---
This code is an example of how to extract information from a paragraph and represent it as a graph using the NetworkX and PyVis libraries in Python.

The paragraph contains information about students, courses, and instructors. The code manually defines the triples (subject, predicate, object) from the paragraph and builds a NetworkX graph from them. The `build_networkx_graph` function takes a list of triples and creates a NetworkX graph object. The `save_graph_pyvis` function takes a NetworkX graph object and visualizes it using PyVis, then saves it as an HTML file.

The `paragraph` variable holds the source text. Because the call to `triples_extraction` is commented out, the graph is built from the manually defined `triples` list, which represents the relationships between the entities in the paragraph, rather than from `paragraph` itself.

The `graph` variable is created by calling the `build_networkx_graph` function with the `triples` variable as an argument. This creates a NetworkX graph object that represents the relationships between the entities in the paragraph.

Finally, the `save_graph_pyvis` function is called with the `graph` variable as an argument. This function visualizes the graph using PyVis and saves it as an HTML file named "university_relationship_graph.html".