#### Instructions for Running this Notebook

- **Initial Setup:** If you are running this notebook for the first time, install the necessary Python packages by executing the following command in your terminal:
```bash
pip install -r ../demos/requirements.txt
```

- **Troubleshooting:** If you encounter any issues or failed tests while running the notebook, restart the kernel and try running the cells again. This can often resolve state-related problems.

In [1]:
# Import Standard Libraries
import os
import sys
import multiprocessing

# Import Third Party Libraries
from flask import Flask, request, jsonify
from IPython.display import IFrame, display
import pytest

# Get the current working directory (CWD)
cwd = os.getcwd()
# Move up two levels to reach the stixd directory
stixd_path = os.path.abspath(os.path.join(cwd, '..', '..'))
# Append the stixd directory to the Python path
sys.path.append(stixd_path)

# Import Local Libraries
from ling508.api import app

In [2]:
# Define Global Variables

# Define the path to /tests directory
TEST_DIR = os.path.join(os.getcwd(), '../tests')

# Define pytest verbosity level
VERBOSITY = '-q' # Quiet
# VERBOSITY = '-v' # Verbose
# VERBOSITY = '-vv' # More verbose
# VERBOSITY = '-vvv' # Even more verbose

# Define pytest traceback level
TRACEBACK = '--tb=auto' # Default
# TRACEBACK = '--tb=short' # Short
# TRACEBACK = '--tb=long' # Long
# TRACEBACK = '--tb=line' # One line
# TRACEBACK = '--tb=native' # Python standard
# TRACEBACK = '--tb=no' # No traceback

# Demonstration of STIX-D's Clex Importer Tool

## Use Case

The STIX-D Use Case L1 involves seeding the `stixd_corpus.lexicon` database table with lexical entries from the ACE Common Lexicon (Clex) or similar files. An administrator provides a URI to the lexicon file, and the system connects to the local database via the `mysql_repository.py` module. For each line in the lexicon file, the system extracts relevant character strings to create a word tag and form, generates a SHA256 hash of these components, and checks for the hash in the `lexicon` table. If the hash exists, it links the existing entry with a source ID; if not, it creates a new entry. The system also imports additional arguments into appropriate fields and outputs summary information or error messages as necessary.
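
The hash-and-lookup step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the `|` separator between tag and form, and the plain dictionary standing in for the `lexicon` table are all hypothetical.

```python
import hashlib


def make_entry_hash(word_tag: str, word_form: str) -> str:
    """Return the SHA256 hex digest of a word tag and form.

    The "|" separator is an assumption; the project may combine the
    two components differently.
    """
    key = f"{word_tag}|{word_form}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def upsert_entry(lexicon: dict, word_tag: str, word_form: str, source_id: str) -> str:
    """Link an existing entry to source_id, or create a new entry.

    `lexicon` is a plain dict standing in for the `lexicon` table.
    Returns "linked" or "created" so the caller can build a summary.
    """
    entry_hash = make_entry_hash(word_tag, word_form)
    if entry_hash in lexicon:
        # Hash already present: only record the additional source.
        lexicon[entry_hash]["source_ids"].add(source_id)
        return "linked"
    # Hash not present: create a fresh entry.
    lexicon[entry_hash] = {
        "word_tag": word_tag,
        "word_form": word_form,
        "source_ids": {source_id},
    }
    return "created"


table = {}
first = upsert_entry(table, "noun_sg", "apple", "clex")
second = upsert_entry(table, "noun_sg", "apple", "other_lexicon")
print(first, second, len(table))
```

Because the hash depends only on the tag and form, importing the same entry from a second source adds a link rather than a duplicate row.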

## Code Execution

## Project Design

### Project Overview
The Clex Importer tool imports data from the Attempto Controlled English (ACE) lexicon file, stored as Prolog facts, into the `lexicon` table of the STIX-D MySQL database. The tool is accessible via a web form served by a Flask API, where users enter a URL pointing to an ACE lexicon file. The system then parses each Prolog fact and maps it to the appropriate attributes in the `lexicon` table. The tool can also be executed from the command line or integrated into other applications.
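
To make the parse-and-map step concrete, here is a rough sketch of how a single Prolog fact might be split into its functor and arguments. The regular expression, the function names, the sample fact, and the column names (`word_tag`, `word_form`, `logical_symbol`, `extra_args`) are illustrative assumptions, not the project's actual schema or parser.

```python
import re

# Matches simple facts of the form  functor(arg1, arg2, ...).
FACT_PATTERN = re.compile(r"^\s*(\w+)\((.*)\)\.\s*$")


def parse_fact(line: str):
    """Split a Prolog fact into (functor, [args]); return None if no match.

    Handles bare atoms and single-quoted atoms without embedded commas;
    anything richer would need a real Prolog parser.
    """
    match = FACT_PATTERN.match(line)
    if match is None:
        return None
    functor, raw_args = match.groups()
    args = [arg.strip().strip("'") for arg in raw_args.split(",")]
    return functor, args


def map_to_row(functor: str, args: list) -> dict:
    """Map a parsed fact to a dict keyed by hypothetical column names."""
    return {
        "word_tag": functor,
        "word_form": args[0],
        "logical_symbol": args[1] if len(args) > 1 else None,
        "extra_args": args[2:],
    }


parsed = parse_fact("noun_sg(apple, apple, neutr).")
row = map_to_row(*parsed)
print(row)
```

Lines that are not facts, such as Prolog comments, return `None` from `parse_fact` and can be skipped or reported as errors.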

### OOP Principles in the Project
This project is designed using object-oriented programming (OOP) principles to create a modular, extensible, and maintainable system. The key OOP principles in the project are as follows:

- **Abstraction**: The project uses abstract classes and methods to define interfaces and enforce a common structure. For example, the Repository class defines abstract methods for interacting with the database, which are implemented by MySQLRepository.
- **Encapsulation**: Each class is responsible for a specific aspect of the project, encapsulating related data and behavior. For example, ClexImporter encapsulates the logic for importing Clex entries, while MySQLRepository encapsulates database interactions.
- **Inheritance**: The project uses inheritance to create a hierarchy of classes with shared behavior. For example, MySQLRepository inherits from Repository to reuse common database interaction methods.
- **Polymorphism**: The project uses polymorphism to allow different classes to be used interchangeably. For example, the Repository interface allows different types of repositories to be used with the ClexImporter.
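
The following sketch shows how these four principles can fit together. Only the name `Repository` comes from the description above; the method names, `InMemoryRepository`, and `SketchImporter` are hypothetical stand-ins and should not be read as the project's real interface.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Abstraction: the interface every storage backend must implement."""

    @abstractmethod
    def find_by_hash(self, entry_hash: str):
        ...

    @abstractmethod
    def add_entry(self, entry_hash: str, entry: dict) -> None:
        ...


class InMemoryRepository(Repository):
    """Inheritance: a concrete backend, here a dict instead of MySQL."""

    def __init__(self):
        # Encapsulation: the storage detail stays private to the class.
        self._rows = {}

    def find_by_hash(self, entry_hash):
        return self._rows.get(entry_hash)

    def add_entry(self, entry_hash, entry):
        self._rows[entry_hash] = entry


class SketchImporter:
    """Polymorphism: works with any Repository implementation."""

    def __init__(self, db_repo: Repository):
        self.db_repo = db_repo

    def import_entry(self, entry_hash: str, entry: dict) -> bool:
        """Store the entry if it is new; return True when it was added."""
        if self.db_repo.find_by_hash(entry_hash) is not None:
            return False
        self.db_repo.add_entry(entry_hash, entry)
        return True


importer = SketchImporter(InMemoryRepository())
added_first = importer.import_entry("abc123", {"word_form": "apple"})
added_again = importer.import_entry("abc123", {"word_form": "apple"})
print(added_first, added_again)
```

Swapping `InMemoryRepository` for a MySQL-backed class would require no change to the importer, which is also what makes the importer easy to test without a database.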

### Key Modules and Their OOP Design
The project consists of the following key modules, each designed using OOP principles:

- **`ClexImporter` Class in `clex_importer.py`**: 
    - **Responsibility**: Manages the importation of Clex entries into the database.
    - **Attributes**:
        - `db_repo`: Represents the database repository where Clex entries will be stored.
        - `uri`: The location of the Clex file to be imported.
    - **Methods**:
        - `import_clex_entries()`: Imports Clex entries from the specified file into the database.
        - `parse_clex_entry()`: Parses a single Clex entry from the file.
        - `map_clex_entry_to_lexicon()`: Maps the parsed Clex entry to the `lexicon` table schema.

- **`MySQLRepository` Class in `mysql_repository.py`**: 
    - **Responsibility**: Abstracts the database interactions for MySQL databases.
    - **Attributes**:
        - `connection`: Represents the connection to the MySQL database.
        - `table_name`: The name of the table in the database.
    - **Methods**:
        - `create_table()`: Creates the table in the database.

- **MySQLRepository**: Contains the MySQLRepository class, which implements the Repository interface for interacting with a MySQL database. The class uses the mysql-connector-python library to connect to the database and execute queries.

- **DocumentManager**: Contains the DocumentManager class, which provides methods for reading and writing files. The class is used by the ClexImporter to read the lexicon file.

- **NLPProcessor**: Contains the NLPProcessor class, which provides methods for processing natural language text. The class is used by the ClexImporter to extract information from the lexicon file.

## Code Interaction with Database

In [3]:
# Run Flask in a separate process
def run_flask():
    app.run(port=5000, debug=True, use_reloader=False)

flask_process = multiprocessing.Process(target=run_flask)
flask_process.start()
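
Because the server starts in a separate process, the form in the next cell may be requested before Flask is listening. One way to guard against that is to poll the port first. The helper below is a sketch using only the standard library; `wait_for_port` is a hypothetical name, not part of the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Retries until `timeout` seconds have elapsed, then returns False.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            # Server not accepting connections yet; retry shortly.
            time.sleep(0.1)
    return False
```

For this notebook, calling `wait_for_port("localhost", 5000)` before displaying the `IFrame` would confirm that the server is reachable. When the demonstration is finished, `flask_process.terminate()` followed by `flask_process.join()` stops the background process.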

In [4]:
# Display the HTML form served by Flask
display(IFrame("http://localhost:5000/ling508/web/stixd.html", width=640, height=480))

## Test Cases

### All Test Cases

In [5]:
# Run all tests in the test directory (~30-60 seconds)
pytest.main([TEST_DIR, VERBOSITY, TRACEBACK])


................................                                         [100%]
32 passed in 37.85s


<ExitCode.OK: 0>
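
The per-file cells below all build the same argument list by hand. A small helper could remove that repetition. This is only a suggestion: `build_pytest_args` is a hypothetical function, and its defaults mirror the `VERBOSITY` and `TRACEBACK` values defined earlier.

```python
import os


def build_pytest_args(test_dir, test_file=None, verbosity="-q", traceback="--tb=auto"):
    """Return the argument list for pytest.main().

    With no test_file, the whole directory is targeted; otherwise only
    the named file inside test_dir.
    """
    target = os.path.join(test_dir, test_file) if test_file else test_dir
    return [target, verbosity, traceback]


args_all = build_pytest_args("tests")
args_one = build_pytest_args("tests", "test_75_clex_importer_ci.py", verbosity="-v")
print(args_all)
print(args_one)
```

A cell could then call `pytest.main(build_pytest_args(TEST_DIR, "test_75_clex_importer_ci.py", "-v"))` and compare the return value against `pytest.ExitCode.OK` to detect failures programmatically.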

### Test Case 10: doc_scrapper

In [6]:
# Run a specific test file in the test directory
# test_file = "test_10_doc_scrapper.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 20: gen_clex_uuid

In [7]:
# Run a specific test file in the test directory
# test_file = "test_20_gen_clex_uuid.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 30: mysql_repo

In [8]:
# Run a specific test file in the test directory
# test_file = "test_30_mysql_repo.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 40: nlp_manager

In [9]:
# Run a specific test file in the test directory
# test_file = "test_40_nlp_manager.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 50: doc_manager

In [10]:
# Run a specific test file in the test directory
# test_file = "test_50_doc_manager.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 53: sent_manager

In [11]:
# Run a specific test file in the test directory
# test_file = "test_53_sent_manager.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 57: lexicon_manager

In [12]:
# Run a specific test file in the test directory
# test_file = "test_57_lexicon_manager.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 70: clex_importer_local

In [13]:
# Run a specific test file in the test directory (~ 10 seconds)
# test_file = "test_70_clex_importer_local.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 75: clex_importer_ci

In [14]:
# Run a specific test file in the test directory
test_file = "test_75_clex_importer_ci.py"
pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=auto"])


platform win32 -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0 -- d:\OneDrive\Code\hltms\stixd\.venv\Scripts\python.exe
cachedir: .pytest_cache
rootdir: d:\OneDrive\Code\hltms\stixd
configfile: pytest.ini
plugins: anyio-4.4.0, mock-3.14.0
collecting ... collected 1 item

..\tests\test_75_clex_importer_ci.py::test_import_clex_entries PASSED    [100%]



<ExitCode.OK: 0>

### Test Case 80: api

In [None]:
# Run a specific test file in the test directory
# test_file = "test_80_api.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])


### Test Case 90: e2e_local

In [None]:
# Run a specific test file in the test directory (~15 seconds)
# test_file = "test_90_e2e_local.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=auto"])


### Test Case 95: e2e_ci

In [None]:
# Run a specific test file in the test directory (~15 seconds)
# test_file = "test_95_e2e_ci.py"
# pytest.main([os.path.join(TEST_DIR, test_file), "-v", "--tb=line"])
