# TRAligner Development Testing

This notebook demonstrates how to use TRAligner in development mode after the package restructure.

## Setup

**First time only:** Install TRAligner in editable mode.

In [None]:
# Run this cell ONCE, then restart the kernel
# After restart, skip this cell and go to the next one

%pip install -e "/Users/hadarmiller/Dropbox (University of Haifa)/HaifaU/10_Text_Reuse/Data Bases And Systems/Framwork/Modules/TRAligner"

print("✓ Installation complete! Please restart the kernel now.")
print("  In Jupyter: Kernel → Restart Kernel")
print("  Then skip this cell and continue with the next one.")

## Alternative: Quick Import Without Installation

If you prefer not to install the package, add its `src` directory to `sys.path` instead:

In [None]:
# Alternative method: Add src to path (no installation needed)
import sys
traligner_path = "/Users/hadarmiller/Dropbox (University of Haifa)/HaifaU/10_Text_Reuse/Data Bases And Systems/Framwork/Modules/TRAligner/src"
if traligner_path not in sys.path:
    sys.path.insert(0, traligner_path)

print("✓ Path added. You can now import traligner.")
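
With two setup routes available (editable install and `sys.path`), it can be useful to confirm which copy of the package Python will actually resolve. The following is a minimal sketch using only the standard library; `locate_package` is a hypothetical helper name, not part of TRAligner.

```python
import importlib.util
from typing import Optional


def locate_package(name: str) -> Optional[str]:
    """Return the file Python would import `name` from, or None if it cannot be found.

    Hypothetical helper for illustration; it does not import the package itself.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


# Prints a path inside .../TRAligner/src/traligner/ if either setup route worked,
# or None if the package is not importable yet.
print(locate_package("traligner"))
```

If the printed path points somewhere other than the development checkout, an older installed copy is probably shadowing it.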

## Import TRAligner

**Important:** Use `import traligner` (lowercase), not `import TRAligner`.

In [None]:
# Import the package
import traligner as ta
import pickle

# Verify that the import works
print(f"✓ TRAligner version: {ta.__version__}")
# dir() lists every public name (functions, classes, submodules), not only functions
print("\n✓ Public names (first 10):")
for name in [x for x in dir(ta) if not x.startswith('_')][:10]:
    print(f"   - {name}")
print("   ...and more")

## Test: Basic Alignment

Test the alignment function with sample sequences.

In [None]:
# Sample sequences for testing
suspect_sequence = [
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Writings_Psalms_68_7',
    'Tanakh_Writings_Psalms_68_7',
    'Tanakh_Torah_Genesis_44_16',
    'Tanakh_Writings_Psalms_68_7',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_1_49',
    'Tanakh_Torah_Exodus_32_26',
    'Tanakh_Torah_Exodus_32_26',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Genesis_46_27',
    'Tanakh_Torah_Genesis_46_27',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_3_16'
]

potential_sequence = [
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_1_49',
    'Tanakh_Torah_Exodus_32_26',
    'Tanakh_Torah_Exodus_32_26',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Genesis_46_27',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_3_15',
    'Tanakh_Torah_Numbers_3_16'
]

print("✓ Test sequences loaded")
print(f"  Suspect sequence length: {len(suspect_sequence)}")
print(f"  Potential sequence length: {len(potential_sequence)}")

In [None]:
# Perform alignment
als, df_sus_a, sus_m, src_m = ta.alignment(
    suspect_sequence,
    potential_sequence,
    match_score=30,
    mismatch_score=1,
    methods={"ignore_tokens": ["*"]}
)

print("✓ Alignment completed")
print(f"  Number of alignments found: {len(als)}")

In [None]:
# Calculate simple score
alignment_score = ta.alignmentScore(als, verbose=False)
print(f"✓ Simple Alignment Score: {alignment_score[0]}")
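
As an independent sanity check on the result, the same two sequences can be compared with the standard library's `difflib.SequenceMatcher`. This is a sketch only: it finds runs of identical labels and is not TRAligner's alignment algorithm, so it has no match or mismatch scores and its output will not equal `ta.alignment`'s. The helper name `matching_runs` and the toy data are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def matching_runs(a, b):
    """Return (start_in_a, start_in_b, length) for each run of identical items."""
    # autojunk=False disables difflib's popularity heuristic, which could
    # otherwise ignore labels that repeat often in long sequences.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel; drop it.
    return [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size > 0]


# Toy data in the same label format as the notebook's sequences
suspect = [
    "Tanakh_Torah_Genesis_44_16",
    "Tanakh_Torah_Numbers_3_15",
    "Tanakh_Torah_Numbers_1_49",
]
potential = [
    "Tanakh_Torah_Numbers_3_15",
    "Tanakh_Torah_Numbers_1_49",
]

for start_a, start_b, size in matching_runs(suspect, potential):
    print(f"suspect[{start_a}:{start_a + size}] == potential[{start_b}:{start_b + size}]")
```

Passing `suspect_sequence` and `potential_sequence` from the cell above gives a rough picture of which stretches the two sequences share verbatim.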

## Test: Advanced Scoring with TF-IDF

Test the incremental scoring with TF-IDF weights.

In [None]:
# Load TF-IDF data (adjust the path as needed; only unpickle files you trust)
tfidf_path = "/Users/hadarmiller/Downloads/tfidf_unigram_Hebrew.pickle"

try:
    with open(tfidf_path, "rb") as f:
        df_tfidf = pickle.load(f)
    print("✓ TF-IDF data loaded")
    
    # IMPORTANT: map the suspect sequence to "sus_t" and the potential sequence to "src_t"
    increment2one = {
        "sus_t": suspect_sequence,
        "src_t": potential_sequence,
        "tfidf": df_tfidf,
        "default": 0.1,
        "i21": 0.3,
    }
    
    # Calculate incremental score
    sseq = ta.alignmentScore(
        als, 
        increment2one=increment2one, 
        decrement_gap=0.2, 
        verbose=False, 
        prune=0.0
    )
    
    print(f"✓ Incremental Alignment Score: {sseq}")
    
except FileNotFoundError:
    print("⚠ TF-IDF file not found. Skipping advanced scoring test.")
    print(f"  Expected path: {tfidf_path}")
except Exception as e:
    print(f"⚠ Error in advanced scoring: {e}")

## Development Tips

### When You Modify Source Code:

1. **Edit the file**: Make changes to files in `src/traligner/`
2. **Restart kernel**: In Jupyter, go to Kernel → Restart Kernel
3. **Re-run imports**: Run the import cells again
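
When a full kernel restart is inconvenient, the steps above can sometimes be approximated with `importlib.reload`. The sketch below reloads every already-imported module of a package, submodules first; `reload_package` is a hypothetical helper, not something TRAligner provides. Objects created before the reload, and names brought in with `from ... import ...`, keep pointing at the old code, so a restart remains the safer option.

```python
import importlib
import sys


def reload_package(package_name: str) -> list:
    """Reload all imported modules of `package_name`, deepest submodules first.

    Hypothetical helper for illustration. Returns the reloaded module names
    in the order they were reloaded.
    """
    prefix = package_name + "."
    names = sorted(
        (n for n in list(sys.modules) if n == package_name or n.startswith(prefix)),
        key=lambda n: n.count("."),
        reverse=True,  # more dots = deeper submodule, reloaded earlier
    )
    reloaded = []
    for name in names:
        module = sys.modules.get(name)
        if module is not None:
            importlib.reload(module)
            reloaded.append(name)
    return reloaded


# Returns an empty list if traligner has not been imported yet
print(reload_package("traligner"))
```

After calling it, re-run `import traligner as ta` so that `ta` refers to the refreshed package.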

### Use Auto-reload (Optional):

To reload changed modules automatically before each cell runs, enable IPython's `autoreload` extension:

In [None]:
# Enable auto-reload
%load_ext autoreload
%autoreload 2

print("✓ Auto-reload enabled")
print("  Changes to source files will be automatically reloaded")

## Summary

✅ **Import**: `import traligner as ta` (lowercase, not `TRAligner`)

✅ **Installation**: `pip install -e /path/to/TRAligner` (recommended)

✅ **Alternative**: Add `src/` to `sys.path` for quick testing

✅ **Reload changes**: Restart kernel or use `%autoreload 2`

✅ **Version**: Check with `ta.__version__`

---

**For more details, see:** `JUPYTER_DEVELOPMENT.md`