<a href="https://colab.research.google.com/github/05satyam/AI-ML/blob/main/generating_interactive_human_readable_graphs_from_pdf_dcouments_using_llm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Overview

This notebook demonstrates how to extract text from a PDF file, analyze it with an OpenAI GPT-4 model (`gpt-4o` in the code) to generate graph nodes and edges, and visualize the relationships as an interactive graph with Plotly. Each node carries metadata such as a page number and a content summary, which makes the graph easier to navigate.


## Notebook Structure
- Setup: Install and import required libraries.
- Functions:
  - extract_text_from_pdf
  - analyze_with_gpt
  - parse_gpt_output
  - render_human_readable_graph

- Execution:
  - Call main(pdf_path, api_key) with appropriate arguments.



## My use cases
 - I have used it for my STEM OPT application and for keeping track of my apartment lease. These documents are long and usually can't be shared publicly, so I generated relationship graphs to understand the flow and the related components.

 - It also works for research papers. For example, you can visualize the structure of a paper, highlighting:

   - Key sections (e.g., Introduction, Methodology, Results, Discussion).
   - Relationships between hypotheses, experiments, and results.
   - References to external works with page-level summaries.

   Output: a graph showing how different sections are interconnected, with summaries and page numbers for quick navigation.


In [None]:
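# Note: the PyPI package named "fitz" is unrelated to PyMuPDF and conflicts with it.
# PyMuPDF is installed as "pymupdf" and imported as "fitz".
!pip install pymupdf openai networkx plotly pyvis matplotlib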


In [None]:
!pip install pymupdf

Collecting pymupdf
  Downloading PyMuPDF-1.24.14-cp39-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (3.4 kB)
Downloading PyMuPDF-1.24.14-cp39-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (19.8 MB)
Installing collected packages: pymupdf
Successfully installed pymupdf-1.24.14


In [None]:
!pip install pyvis jinja2




## Step-by-Step Explanation


### 1. Import Necessary Libraries
Before starting, import the essential libraries for handling PDF content, API interactions, and graph visualization:

In [None]:
import fitz  # For PDF text extraction
from openai import OpenAI  # For GPT-4 API
import json
import networkx as nx  # For graph processing
import plotly.graph_objects as go  # For graph visualization

### 2. Extract Text from the PDF
Define a function to extract all text from the given PDF file using PyMuPDF (fitz):

In [None]:

# Step 1: Extract text from the PDF
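def extract_text_from_pdf(pdf_path):
    """Return the text of every page in the PDF as one string."""
    # The context manager closes the document even if extraction fails
    with fitz.open(pdf_path) as doc:
        return "".join(page.get_text() for page in doc)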
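
Because `extract_text_from_pdf` concatenates all pages into one string, the model has no direct way to tell where a page starts, even though the prompt asks it to report page numbers. One possible workaround, sketched here under the assumption that you first collect each page's text into a list (for example from `page.get_text()`), is to prefix every page with a marker. The helper name `join_pages_with_markers` and the `[Page N]` format are illustrative choices, not part of the original notebook.

```python
def join_pages_with_markers(page_texts):
    """Join per-page text, prefixing each page with a 1-based page marker."""
    parts = []
    for page_number, page_text in enumerate(page_texts, start=1):
        parts.append(f"[Page {page_number}]\n{page_text.strip()}")
    return "\n\n".join(parts)


# Hypothetical sample standing in for [page.get_text() for page in doc]
sample_pages = ["Lease overview\n", "Payment terms\n"]
tagged = join_pages_with_markers(sample_pages)
print(tagged)
```

The tagged string can then be passed on as `content` in place of the plain concatenation.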


### 3. Analyze Extracted Content with GPT-4

Use GPT-4 to analyze the content and generate a JSON object containing nodes and edges for the graph:

In [None]:
from openai import OpenAI

def analyze_with_gpt(content, api_key, prompt_template):
    """Send the PDF text to the model and return its raw text response."""
    client = OpenAI(api_key=api_key)
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that processes PDF content for graph generation."},
            {"role": "user", "content": prompt_template.format(content=content)},
        ],
    )
    # message.content can be None, so guard before calling strip()
    gpt_output = (completion.choices[0].message.content or "").strip()
    if not gpt_output:
        raise ValueError("GPT-4 returned an empty response.")
    return gpt_output
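
The request above relies on the prompt alone to get JSON back. As a hedged alternative, the Chat Completions API also accepts a `response_format` of `{"type": "json_object"}` (JSON mode), which asks the model to return a single JSON object. The sketch below only builds the keyword arguments so it can run without network access; `build_chat_request` is a hypothetical helper, and it assumes your installed `openai` client and chosen model support JSON mode.

```python
def build_chat_request(content, prompt_template, model="gpt-4o"):
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant that processes PDF content for graph generation.",
            },
            {"role": "user", "content": prompt_template.format(content=content)},
        ],
        # JSON mode: the prompt itself must still mention JSON explicitly
        "response_format": {"type": "json_object"},
    }


# Hypothetical short template and content, for illustration only
request = build_chat_request(
    "Sample text",
    "Return JSON with nodes and edges. Content: {content}",
)
print(request["response_format"])
```

The resulting dictionary would be passed as `client.chat.completions.create(**request)`.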


### 4. Parse the GPT-4 Output

Define a function to parse GPT-4's JSON output and extract nodes and edges. Include a fallback mechanism for invalid responses:

In [None]:
import json

def parse_gpt_output(gpt_output):
    """
    Parses GPT-4 output JSON into nodes and edges, including metadata like page numbers.
    """
    fallback_data = {
        "nodes": [
            {"id": "1", "label": "Fallback Node 1", "group": "Fallback", "page": None, "summary": None},
            {"id": "2", "label": "Fallback Node 2", "group": "Fallback", "page": None, "summary": None}
        ],
        "edges": [
            {"source": "1", "target": "2", "label": "Fallback Edge"}
        ]
    }
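    try:
        # Remove a Markdown code fence if the model added one. Stripping the
        # backticks alone would leave a leading "json" language tag behind.
        gpt_output = gpt_output.strip().strip("`").strip()
        if gpt_output.lower().startswith("json"):
            gpt_output = gpt_output[4:]
        data = json.loads(gpt_output)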
        nodes = data.get("nodes", [])
        edges = data.get("edges", [])
        print(f"Nodes: {nodes}")
        print(f"Edges: {edges}")

        if not nodes or not edges:
            print("Using fallback data due to missing nodes or edges.")
            return fallback_data["nodes"], fallback_data["edges"]
        return nodes, edges
    except json.JSONDecodeError as e:
        print(f"Error decoding JSON: {e}. Using fallback data.")
        return fallback_data["nodes"], fallback_data["edges"]
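
Valid JSON is not necessarily a consistent graph: the model may emit an edge whose `source` or `target` does not match any node `id`. A minimal sketch of one way to filter such edges before rendering follows; `drop_dangling_edges` is a hypothetical helper, not part of the original notebook.

```python
def drop_dangling_edges(nodes, edges):
    """Split edges into those whose endpoints both exist and those that do not."""
    node_ids = {node["id"] for node in nodes if "id" in node}
    kept, dropped = [], []
    for edge in edges:
        if edge.get("source") in node_ids and edge.get("target") in node_ids:
            kept.append(edge)
        else:
            dropped.append(edge)
    return kept, dropped


# Hypothetical sample data: the second edge points at a missing node "9"
nodes = [{"id": "1", "label": "Start"}, {"id": "2", "label": "Next"}]
edges = [
    {"source": "1", "target": "2", "label": "Proceed to"},
    {"source": "2", "target": "9", "label": "Leads to"},
]
kept, dropped = drop_dangling_edges(nodes, edges)
print(f"kept={len(kept)} dropped={len(dropped)}")
```

Returning the dropped edges as well makes it easy to log what was discarded instead of silently losing relationships.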


### 5. Generate the Human-Readable Graph

Visualize the graph with Plotly, highlighting starting points and displaying metadata (e.g., page numbers and summaries):

#### Features:
- Nodes include metadata (e.g., page numbers, summaries).
- Edges display relationships as labels.
- Starting nodes are highlighted in red.


In [None]:
def render_human_readable_graph(nodes, edges, starting_nodes):
    """
    Renders a human-readable graph with Plotly, including metadata like page numbers and summaries.
    """
    graph = nx.DiGraph()  # Use directed graph

    # Add nodes
    for node in nodes:
        graph.add_node(
            node["id"],
            label=node["label"],
            group=node.get("group", "Default"),
            page=node.get("page", "Unknown"),
            summary=node.get("summary", "No details available")
        )

    # Add edges
    for edge in edges:
        graph.add_edge(edge["source"], edge["target"], label=edge.get("label", ""))

    # Compute positions for nodes
    pos = nx.spring_layout(graph)

    # Highlight starting nodes
    starting_node_ids = set(starting_nodes)

    # Prepare edge traces
    edge_x = []
    edge_y = []
    annotations = []
    for edge in graph.edges(data=True):
        x0, y0 = pos[edge[0]]
        x1, y1 = pos[edge[1]]
        edge_x.extend([x0, x1, None])
        edge_y.extend([y0, y1, None])

        # Add arrow annotation for each edge
        annotations.append(
            dict(
                ax=x0, ay=y0, x=x1, y=y1,
                xref="x", yref="y", axref="x", ayref="y",
                showarrow=True,
                arrowhead=3,
                arrowsize=2,
                arrowwidth=1.5,
                arrowcolor="#888",
                text=edge[2].get("label", "")  # Human-readable label
            )
        )

    edge_trace = go.Scatter(
        x=edge_x, y=edge_y,
        line=dict(width=0.8, color='#888'),
        hoverinfo='none',
        mode='lines')

    # Prepare node traces with hover text for metadata
    node_x = []
    node_y = []
    hover_texts = []
    sizes = []
    colors = []
    for node_id, node_data in graph.nodes(data=True):
        x, y = pos[node_id]
        node_x.append(x)
        node_y.append(y)
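        # Use .get() so nodes created implicitly by an edge (no attributes) do not raise KeyError
        hover_texts.append(
            f"<b>{node_data.get('label', node_id)}</b><br>Page: {node_data.get('page', 'Unknown')}<br>Summary: {node_data.get('summary', 'No details available')}"
        )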
        size = 40 if node_id in starting_node_ids else 20
        sizes.append(size)
        color = 'red' if node_id in starting_node_ids else 'blue'
        colors.append(color)

    node_trace = go.Scatter(
      x=node_x, y=node_y,
      mode='markers+text',
      hoverinfo='text',
      hovertext=hover_texts,  # Display metadata on hover
      text=[node_data.get("label", node_id) for node_id, node_data in graph.nodes(data=True)],  # Node labels; fall back to the node ID
      textposition="top center",
      marker=dict(
          size=sizes,
          color=colors,
          line_width=2
      ))

    # Create the figure
    fig = go.Figure(data=[edge_trace, node_trace],
                    layout=go.Layout(
                        # titlefont_size is deprecated; set the font through the title dict instead
                        title=dict(text="Human-Readable Graph with Metadata", font=dict(size=16)),
                        showlegend=False,
                        hovermode='closest',
                        margin=dict(b=0, l=0, r=0, t=40),
                        annotations=annotations,
                        xaxis=dict(showgrid=False, zeroline=False),
                        yaxis=dict(showgrid=False, zeroline=False))
                    )

    fig.show()
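
One detail of the hover metadata: `node.get("page", "Unknown")` only applies the default when the key is missing, so the fallback nodes, which store `None` for `page` and `summary`, would display `None`. The following is a small sketch of a hover-text builder that treats `None` like a missing value; `build_hover_text` is a hypothetical helper name.

```python
def build_hover_text(node_id, node_data):
    """Format hover text, treating None or missing metadata as unavailable."""
    label = node_data.get("label") or str(node_id)
    page = node_data.get("page")
    summary = node_data.get("summary")
    page_text = "Unknown" if page is None else str(page)
    summary_text = summary or "No details available"
    return f"<b>{label}</b><br>Page: {page_text}<br>Summary: {summary_text}"


# A node shaped like the fallback data, with None metadata
hover = build_hover_text("1", {"label": "Fallback Node 1", "page": None, "summary": None})
print(hover)
```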


### 6. Combine Everything in the Main Function

The main function orchestrates the entire process:

In [None]:
def main(pdf_path, api_key):
    print("Extracting text from PDF...")
    content = extract_text_from_pdf(pdf_path)

    print("Sending content to GPT-4 for analysis...")
    prompt_template = """
Analyze the following PDF content and create a **valid JSON** object with "nodes" and "edges". Return only the JSON object so it can be parsed directly. Do not include backticks or the json keyword.
Nodes should include:
- A "Starting Point" group for starting points.
- Clear relationships and human-readable descriptions for traversal.
- Additional metadata, such as the page number where the content is found, or a brief summary.

Each node must have:
- id: Unique identifier
- label: A descriptive name
- group (e.g., "Starting Point", "Requirement", "Action", "Outcome")
- page (optional): Page number of the document where the content is located
- summary (optional): Brief summary or additional information about the content

Each edge must have:
- source: ID of the source node
- target: ID of the target node
- label: A brief, human-readable explanation of the relationship

Example:
{{
    "nodes": [
        {{"id": "1", "label": "Start Process", "group": "Starting Point", "page": 1, "summary": "Overview of the process"}},
        {{"id": "2", "label": "Submit Application", "group": "Action", "page": 2, "summary": "Details about submission"}},
        {{"id": "3", "label": "Approval Granted", "group": "Outcome", "page": 5, "summary": "Approval steps"}}
    ],
    "edges": [
        {{"source": "1", "target": "2", "label": "Proceed to"}},
        {{"source": "2", "target": "3", "label": "Leads to"}}
    ]
}}

Content: {content}
"""

    gpt_output = analyze_with_gpt(content, api_key, prompt_template)
    print("Parsing GPT-4 output...")
    nodes, edges = parse_gpt_output(gpt_output)

    # Identify starting nodes
    starting_nodes = [node['id'] for node in nodes if node.get('group') == "Starting Point"]

    print("Creating and visualizing the graph...")
    render_human_readable_graph(nodes, edges, starting_nodes)
    print("Graph visualization complete.")
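
The example JSON in the prompt uses doubled braces (`{{` and `}}`) because the template is filled with `str.format`, which treats single braces as placeholders. The short demonstration below uses a hypothetical, much smaller template to show that the doubled braces collapse to single ones and that braces inside the inserted content are left untouched.

```python
import json

# Hypothetical miniature template, following the same escaping rule as the prompt
template = """Return a JSON object like this example:
{{"nodes": [{{"id": "1", "label": "Start"}}], "edges": []}}

Content: {content}
"""

prompt = template.format(content="Text with {braces} stays untouched.")

# The second line of the filled prompt is now plain, parseable JSON
example_line = prompt.splitlines()[1]
parsed = json.loads(example_line)
print(example_line)
```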


In [None]:

# Run the application
if __name__ == "__main__":
    PDF_PATH = "/content/test4.pdf"  # Replace with your PDF file
    OPENAI_API_KEY = ""  # Replace with your OpenAI API Key
    main(PDF_PATH, OPENAI_API_KEY)

Extracting text from PDF...
Sending content to GPT-4 for analysis...
Parsing GPT-4 output...
Nodes: [{'id': 'N1', 'label': 'Start Optimization Process', 'group': 'Starting Point', 'page': 1, 'summary': 'Introduction to optimization and challenges with discontinuities'}, {'id': 'N2', 'label': 'Identify Optimization Problem', 'group': 'Action', 'page': 3, 'summary': 'Discusses issues with unconstrained optimization of objective functions'}, {'id': 'N3', 'label': 'Implement Gradient-Only Approaches', 'group': 'Action', 'page': 4, 'summary': 'Proposed method to avoid local minima using gradient information'}, {'id': 'N4', 'label': 'Address Discontinuities', 'group': 'Action', 'page': 6, 'summary': 'Strategies for dealing with non-differentiable functions'}, {'id': 'N5', 'label': 'Evaluate Univariate Example', 'group': 'Example', 'page': 7, 'summary': 'Example using Newton’s cooling law'}, {'id': 'N6', 'label': 'Evaluate Multivariate Example', 'group': 'Example', 'page': 8, 'summary': 'Shap

Graph visualization complete.
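
Rather than pasting the API key into the notebook, where it can be saved or shared by accident, one option is to read it from an environment variable. This is a minimal sketch: `get_api_key` is a hypothetical helper, and the demo sets a placeholder value under a made-up variable name only so the example runs on its own.

```python
import os


def get_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from an environment variable, failing early if it is unset."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return api_key


# Demo only: a placeholder value under a hypothetical variable name
os.environ["DEMO_API_KEY"] = "sk-placeholder"
demo_key = get_api_key("DEMO_API_KEY")
print(demo_key)
```

With a real key exported as `OPENAI_API_KEY`, the last cell could call `main(PDF_PATH, get_api_key())`.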
