<a href="https://colab.research.google.com/github/dpowale/LexiLogic/blob/main/LexiLogic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

LexiLogic is a Streamlit web application that blends symbolic reasoning with large language models (LLMs) to analyze and interpret multilingual PDF documents — especially in domains such as theology, law, history, and ideology where structured logic and contextual awareness are critical.

Unlike purely statistical models, LexiLogic applies a symbolic AI chain-of-thought framework that mimics how a human expert might break down a complex document:

🈶 Detect the document's language (e.g., Arabic, English, etc.)

🌐 Translate non-English text to English (if needed)

🧠 Extract named entities and their relationships

🧾 Identify the document’s main topics

📝 Generate a contextual summary with ideological emphasis
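The five steps above can be read as a simple sequential pipeline. The sketch below is an illustration only, not the app's actual code: `DocumentAnalysis`, `analyze_document`, and the stand-in lambdas are hypothetical names, and each step is passed in as a plain callable so the example runs without an API key or network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DocumentAnalysis:
    language: str
    english_text: str
    entities: str
    topics: str
    summary: str


def analyze_document(
    text: str,
    detect: Callable[[str], str],
    translate: Callable[[str], str],
    extract_entities: Callable[[str], str],
    find_topics: Callable[[str], str],
    summarize: Callable[[str], str],
) -> DocumentAnalysis:
    """Run the five steps in order, translating only when the text is not English."""
    language = detect(text)
    english_text = text if language == "en" else translate(text)
    return DocumentAnalysis(
        language=language,
        english_text=english_text,
        entities=extract_entities(english_text),
        topics=find_topics(english_text),
        summary=summarize(english_text),
    )


# Stand-in callables (assumptions) so the sketch runs offline; the real app
# uses langdetect, deep-translator, and LLM prompts for these steps.
result = analyze_document(
    "Bonjour le monde",
    detect=lambda t: "fr",
    translate=lambda t: "Hello world",
    extract_entities=lambda t: f"entities in: {t}",
    find_topics=lambda t: f"topics in: {t}",
    summarize=lambda t: f"summary of: {t}",
)
print(result.language, "|", result.summary)  # → fr | summary of: Hello world
```

Passing the steps in as callables keeps each stage replaceable, which is one way to make every intermediate result inspectable.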

This app is especially useful for:

Researchers studying religious, ideological, or political texts

Analysts performing cross-language content audits

Students and educators looking to summarize complex materials

Intelligence teams doing content triage on foreign-language media

LexiLogic is built using:

📄 PyMuPDF for PDF text extraction

🧠 LangChain for prompt management

🧬 SymPy for symbolic logic (imported as a placeholder for future extensions; not yet used in the code below)

🤖 OpenAI LLM for reasoning

🌐 deep-translator for multilingual support

Because each step produces a visible intermediate result (detected language, translated text, entities, and topics), the final summary is easier to inspect and explain than the output of a single black-box summarization call.

# Install Required Libraries

In [None]:
!pip install streamlit langchain_community openai sympy langdetect deep-translator PyMuPDF

# Prompt Chaining

In [None]:
# LexiLogic Streamlit App for Multilingual Document Analysis

import os
import streamlit as st
from langchain_community.llms import OpenAI  # provided by the langchain_community package installed above
from langchain_core.prompts import PromptTemplate
from langdetect import detect
from deep_translator import GoogleTranslator
from sympy import symbols, Eq, solve  # reserved for future symbolic-logic steps; unused below
import fitz  # PyMuPDF

# Configure LLM
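# setdefault keeps a key that is already set in the environment instead of overwriting it
os.environ.setdefault("OPENAI_API_KEY", "your-api-key-here")  # Replace with your API key
llm = OpenAI(temperature=0)  # temperature=0 keeps the output as repeatable as possible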

# Prompt Templates
ner_prompt = PromptTemplate.from_template(
    """
    Extract named entities (e.g., people, ideologies, organizations, locations) and their relationships from the text:
    {text}
    """
)
topic_prompt = PromptTemplate.from_template(
    """
    Identify the main topics discussed in the following document:
    {text}
    """
)
summary_prompt = PromptTemplate.from_template(
    """
    Summarize the document with emphasis on its key arguments and ideological stance:
    {text}
    """
)

# Compose each prompt with the LLM using LangChain's "|" (pipe) operator;
# the same operator can also chain tools, retrievers, and output parsers.
ner_chain = ner_prompt | llm
topic_chain = topic_prompt | llm
summary_chain = summary_prompt | llm

# Core Functions
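def extract_text_from_pdf(file):
    """Return the concatenated text of every page in an uploaded PDF."""
    # fitz.open accepts raw bytes via `stream`; the context manager closes the document
    with fitz.open(stream=file.read(), filetype="pdf") as doc:
        return "".join(page.get_text() for page in doc)

def detect_language(text):
    """Return the language code that langdetect assigns to the text (e.g. 'en', 'ar')."""
    return detect(text)

def translate_to_english(text):
    """Translate the text to English unless it is already detected as English."""
    lang = detect_language(text)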
    if lang != 'en':
        return GoogleTranslator(source='auto', target='en').translate(text)
    return text
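
One practical caveat with `translate_to_english`: deep-translator's `GoogleTranslator` appears to reject inputs longer than roughly 5,000 characters, so a full PDF may need to be translated in pieces. The sketch below shows one possible approach and is not part of the original app. `split_into_chunks`, `translate_in_chunks`, and the `max_chars` default of 4500 are hypothetical names and values, and the translator is passed in as a plain callable so the example runs offline.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 4500) -> List[str]:
    """Split text into pieces of at most max_chars, breaking at whitespace when possible."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    remaining = text
    while len(remaining) > max_chars:
        # Prefer the last space or newline inside the window so words stay intact
        cut = max(remaining.rfind(" ", 0, max_chars), remaining.rfind("\n", 0, max_chars))
        if cut <= 0:
            cut = max_chars  # no usable whitespace: fall back to a hard cut
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def translate_in_chunks(text: str, translate: Callable[[str], str], max_chars: int = 4500) -> str:
    """Translate each chunk separately and rejoin the results with spaces."""
    return " ".join(translate(chunk) for chunk in split_into_chunks(text, max_chars))


# str.upper is a stand-in translator (assumption); in the app this could be
# GoogleTranslator(source='auto', target='en').translate
demo = translate_in_chunks("uno dos tres cuatro", str.upper, max_chars=8)
print(demo)  # → UNO DOS TRES CUATRO
```

The same splitter could also be used to keep the text sent to the LLM prompts within the model's context window, although translating or analyzing chunks independently can lose context that spans a chunk boundary.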



# PDF Document Analysis

In [None]:
# 📌 Streamlit App UI
st.set_page_config(page_title="LexiLogic Document Analyzer", layout="wide")
st.title("📚 LexiLogic Multilingual Document Analyzer")

uploaded_file = st.file_uploader("Upload a PDF document", type="pdf")

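if uploaded_file:
    with st.spinner("🔍 Processing PDF..."):
        raw_text = extract_text_from_pdf(uploaded_file)
        # Scanned or image-only PDFs yield no text, and langdetect raises on empty input
        if not raw_text.strip():
            st.error("No extractable text was found in this PDF.")
            st.stop()
        language = detect_language(raw_text)
        translated_text = translate_to_english(raw_text)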

        st.subheader("🈶 Detected Language")
        st.write(language)

        st.subheader("🔁 Translated Text Sample")
        st.code(translated_text[:500])

        st.subheader("📌 Named Entities & Relationships")
        ner_result = ner_chain.invoke({"text": translated_text})
        st.write(ner_result)

        st.subheader("📚 Topics")
        topics_result = topic_chain.invoke({"text": translated_text})
        st.write(topics_result)

        st.subheader("📝 Summary")
        summary_result = summary_chain.invoke({"text": translated_text})
        st.write(summary_result)
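
The three analysis sections above share one pattern: show a heading, invoke a chain, write the result. A possible refactoring, sketched below under stated assumptions, drives them from a single mapping and records a failure message per section so that one failed LLM call does not stop the others. `run_analyses` and `failing_step` are hypothetical names, and the lambdas stand in for calls such as `ner_chain.invoke({"text": t})`.

```python
from typing import Callable, Dict


def run_analyses(text: str, analyses: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run each named analysis on the text, recording an error message instead of raising."""
    results = {}
    for title, analyze in analyses.items():
        try:
            results[title] = analyze(text)
        except Exception as exc:  # keep going so the remaining sections still run
            results[title] = f"Analysis failed: {exc}"
    return results


def failing_step(text: str) -> str:
    """Simulate a chain call that fails, for demonstration only."""
    raise RuntimeError("rate limit")


# Stand-ins (assumptions) for ner_chain.invoke, topic_chain.invoke, and summary_chain.invoke
sections = run_analyses(
    "sample text",
    {
        "Named Entities & Relationships": lambda t: "entities",
        "Topics": failing_step,
        "Summary": lambda t: t.upper(),
    },
)
for title, body in sections.items():
    print(f"{title}: {body}")
```

In the Streamlit app, each title could be passed to `st.subheader` and each body to `st.write` inside the same loop.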
