
A comprehensive desktop application designed for intelligent text analysis and linguistic processing. Built with Python and tkinter, it provides advanced natural language processing capabilities including word frequency analysis, named entity recognition, pattern extraction, and multi-language support with automatic language detection.

TALOS-AI4SSH/Text-File-Analyser

TALOS Text File Analyser

Python Tkinter spaCy Stanza

Available languages: English | Ελληνικά


This repository provides a lightweight desktop GUI application for analyzing, extracting, and exporting linguistic information from plain-text files.


📂 Contents

This repository contains:

  • Talos_Text_Analyser.py — the complete source code (all-in-one Tkinter application)
  • Talos_Text_Analyser_Documentation.pdf — installation & usage guide (Version 2.0)

Overview

The TALOS Text File Analyser (TFA) is a cross-platform desktop tool (Tkinter) for interactive text analysis. It can:

  • Perform word frequency analysis
  • Extract nouns (POS-based)
  • Detect person and location names (NER)
  • Run lemmatization
  • Extract lexico-syntactic patterns (predefined or custom, up to 5 tokens)
  • Export results to Excel (.xlsx) and CSV
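
As an illustration of the first and last items in the list, here is a minimal sketch of word counting followed by CSV export. It uses only the Python standard library and is not the code from Talos_Text_Analyser.py; the tokenizer regex and the function names `word_frequencies` and `to_csv` are assumptions made for this example.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical tokenizer: runs of Unicode letters, allowing internal apostrophes.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    return Counter(match.group(0).lower() for match in TOKEN_RE.finditer(text))


def to_csv(freqs):
    """Serialise (word, count) rows as CSV text, most frequent first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word", "count"])
    # Sort by descending count, then alphabetically so the order is stable.
    for word, count in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([word, count])
    return buffer.getvalue()


sample = "The cat saw the dog. Η γάτα είδε τη γάτα."
freqs = word_frequencies(sample)   # freqs["the"] == 2
csv_text = to_csv(freqs)
print(csv_text)
```

The same rows could be handed to pandas (installed in the Run section) to write an .xlsx file instead of CSV.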

The tool is designed for researchers, educators, and developers. Ancient Greek is handled through Stanza, and modern languages through spaCy.
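
One way to picture the split between the two NLP backends is a small routing function. The sketch below is a guess at how such routing could work, not the application's actual logic: it treats a text as Ancient Greek when a noticeable share of its Greek letters come from the Unicode Greek Extended (polytonic) block. The 5% threshold, the function names, and the English fallback are all assumptions.

```python
def greek_variety(text, polytonic_threshold=0.05):
    """Guess 'grc' (Ancient Greek), 'el' (Modern Greek), or None if no Greek letters occur."""
    greek = 0
    polytonic = 0
    for ch in text:
        cp = ord(ch)
        if 0x0370 <= cp <= 0x03FF:       # Greek and Coptic block
            greek += 1
        elif 0x1F00 <= cp <= 0x1FFF:     # Greek Extended block (polytonic letters)
            greek += 1
            polytonic += 1
    if greek == 0:
        return None
    # Assumed heuristic: polytonic diacritics signal Ancient Greek.
    return "grc" if polytonic / greek >= polytonic_threshold else "el"


def choose_backend(text):
    """Map a text to a (library, model) pair; the English fallback is for this sketch only."""
    variety = greek_variety(text)
    if variety == "grc":
        return ("stanza", "grc")
    if variety == "el":
        return ("spacy", "el_core_news_sm")
    return ("spacy", "en_core_web_sm")


print(choose_backend("ἐν ἀρχῇ ἦν ὁ λόγος"))
print(choose_backend("Καλημέρα κόσμε"))
print(choose_backend("Good morning"))
```

A heuristic like this misclassifies Ancient Greek typed without polytonic accents, as well as polytonic Modern Greek, so a real detector would need further signals.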


Features

  • Automatic language detection with heuristics tuned for English and Greek; distinguishes Ancient from Modern Greek.
  • Six analysis modes: Words, Nouns, Person names, Location names, Lemmas, Pattern extraction.
  • Custom pattern builder (POS templates, wildcard, up to 5 positions).
  • Export to Excel (multi-sheet) and CSV.
  • Multilingual NLP: spaCy models for modern languages; Stanza for Ancient Greek (grc).
  • Modern dark UI with progress bars and responsive feedback.
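
To make the custom pattern builder more concrete, the following sketch matches a POS template against a sequence of already tagged tokens. It is an illustration under stated assumptions, not the application's implementation: the `*` wildcard symbol, the `find_pattern` name, and the hand-tagged sample are hypothetical. The tags are Universal POS labels of the kind spaCy exposes as `token.pos_` and Stanza as `word.upos`.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]   # (surface form, POS tag)
WILDCARD = "*"            # assumed symbol meaning "any POS tag"
MAX_POSITIONS = 5         # the README states custom patterns have up to 5 positions


def find_pattern(tokens: Sequence[Token], pattern: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return every run of consecutive tokens whose POS tags match the pattern."""
    if not 1 <= len(pattern) <= MAX_POSITIONS:
        raise ValueError(f"pattern must have 1 to {MAX_POSITIONS} positions")
    hits = []
    width = len(pattern)
    # Slide a window of the pattern's width across the token sequence.
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        if all(tag == WILDCARD or tag == pos for tag, (_, pos) in zip(pattern, window)):
            hits.append(tuple(word for word, _ in window))
    return hits


# Hand-tagged sample; in practice the tags would come from a spaCy or Stanza pipeline.
tagged = [
    ("the", "DET"), ("old", "ADJ"), ("library", "NOUN"), ("of", "ADP"),
    ("Alexandria", "PROPN"), ("held", "VERB"), ("rare", "ADJ"), ("scrolls", "NOUN"),
]

adj_noun = find_pattern(tagged, ["ADJ", "NOUN"])
noun_any_propn = find_pattern(tagged, ["NOUN", WILDCARD, "PROPN"])
print(adj_noun)        # [('old', 'library'), ('rare', 'scrolls')]
print(noun_any_propn)  # [('library', 'of', 'Alexandria')]
```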

Run

# 1) Ensure Python 3.8+ is installed
python --version

# 2) Install core libraries
pip install pandas openpyxl spacy langdetect stanza

# 3) Install spaCy language models (required)
python -m spacy download en_core_web_sm    # English
python -m spacy download el_core_news_sm   # Greek

# 4) Install Stanza Ancient Greek model (one-time; works in bash, cmd, and PowerShell)
python -c "import stanza; stanza.download('grc')"

# 5) Launch the GUI application
python Talos_Text_Analyser.py
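
If the application fails at start-up, a quick check of which dependencies are importable can narrow things down. The script below is an optional sketch that is not part of the repository; the `REQUIRED` list simply mirrors the install steps above, and `missing_modules` is a hypothetical helper. spaCy models installed with `spacy download` are ordinary Python packages, so they can be checked by name. The Stanza `grc` model is a downloaded data file and is not covered by this check.

```python
import importlib.util

# Module names taken from the install steps; tkinter normally ships with Python,
# but some Linux distributions package it separately.
REQUIRED = [
    "tkinter", "pandas", "openpyxl", "spacy", "langdetect", "stanza",
    "en_core_web_sm", "el_core_news_sm",
]


def missing_modules(names):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All listed dependencies were found.")
```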

Author

Prof. Christophe Roche — TALOS ERA Chair Holder — University of Crete

📧 roche.university@gmail.com

🌐 https://talos-ai4ssh.uoc.gr/


Citation

For general reference to the project:

Roche, C. (2025). TALOS Text File Analyser (Version 2.0).
TALOS AI4SSH Project, University of Crete.
https://talos-ai4ssh.uoc.gr/


License

All code in this repository is distributed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to share and redistribute the material under the following conditions:

  • BY: Credit must be given to the creator(s).
  • NC: Only non-commercial uses are permitted.

More info: https://creativecommons.org/licenses/by-nc/4.0/

