# Demo Code for Simplifier Component

Run this notebook on Google Colab (`https://colab.research.google.com/`).<br>
Before running any cell, select a GPU runtime: Runtime > Change runtime type > Hardware accelerator > GPU.
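
To confirm that the runtime really has a GPU attached before installing anything, a quick check along the following lines may help. This is a minimal sketch, not part of the original notebook: the helper name `gpu_visible` is hypothetical, and it assumes an NVIDIA GPU whose driver tools (`nvidia-smi`) are on the `PATH`, as is typical for Colab GPU runtimes.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    # Hypothetical helper: assumes an NVIDIA runtime with nvidia-smi on PATH.
    if shutil.which("nvidia-smi") is None:
        return False  # no NVIDIA driver tools found on this runtime
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    # `nvidia-smi -L` prints one "GPU <index>: <name>" line per device.
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, the hardware accelerator setting above has probably not been applied yet.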

## Notebook Initialization

In [1]:
# The Python version is fixed by Google Colab; print it for reference
!python --version

Python 3.9.16


In [None]:
# Clone repository
CODE_TEMP = './_temp'
CODE_BRANCH = 'main'

!git clone -b $CODE_BRANCH --recurse-submodules --single-branch https://github.com/Genisis2/nus_cs5246_project.git $CODE_TEMP

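# Move the cloned files, dotfiles included, into the workspace and remove the temp directory.
# Note: the `.*` glob can also match `.` and `..`, which mv refuses to move and reports as errors.
!mv -f $CODE_TEMP/* . && mv -f $CODE_TEMP/.* .
!rm -rf $CODE_TEMP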

# Make git use https:// in place of git:// for GitHub URLs
!git config --global url.https://github.com/.insteadOf git://github.com/

# Install requirements
!pip install -U pip setuptools
!pip install -r requirements-cuda.txt

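# Restart the kernel so the newly installed packages can be imported
print("Restarting kernel. Run the next cell manually.")
import os
import time
time.sleep(2)  # give the message above time to be displayed
os.kill(os.getpid(), 9)  # 9 = SIGKILL: terminate the current kernel process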

Cloning into './_temp'...
remote: Enumerating objects: 147, done.
remote: Counting objects: 100% (147/147), done.
remote: Compressing objects: 100% (97/97), done.
remote: Total 147 (delta 56), reused 116 (delta 30), pack-reused 0
Receiving objects: 100% (147/147), 73.51 KiB | 350.00 KiB/s, done.
Resolving deltas: 100% (56/56), done.
Submodule 'facebookresearch/access' (https://github.com/Genisis2/facebookresearch-access) registered for path 'facebookresearch/access'
Cloning into '/content/_temp/facebookresearch/access'...
remote: Enumerating objects: 166, done.
remote: Counting objects: 100% (80/80), done.
remote: Compressing objects: 100% (48/48), done.
remote: Total 166 (delta 45), reused 54 (delta 32), pack-reused 86
Receiving objects: 100% (166/166), 738.29 KiB | 23.82 MiB/s, done.
Resolving deltas: 100% (68/68), done.
Submodule path 'facebookresearch/access': checked out 'fb724d2b5388adf0c29e487534013daca50d9313'
mv: cannot move './_temp
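
The truncated `mv` message above is consistent with the `$CODE_TEMP/.*` glob matching the special entries `.` and `..`, which cannot be moved. A glob-free alternative is to do the move in Python. The following is a minimal sketch under that assumption, not the notebook's own method; `explode_dir` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def explode_dir(src: str, dst: str = ".") -> list[str]:
    """Move every entry of `src` (dotfiles included) into `dst`, then remove `src`."""
    # Hypothetical helper, shown as an alternative to the shell `mv` commands.
    src_path, dst_path = Path(src), Path(dst)
    moved = []
    for entry in sorted(src_path.iterdir()):  # iterdir() never yields "." or ".."
        shutil.move(str(entry), str(dst_path / entry.name))
        moved.append(entry.name)
    src_path.rmdir()  # succeeds only if the directory is now empty
    return moved
```

In the clone cell this would be called as `explode_dir(CODE_TEMP)` in place of the two `mv` commands and the `rm -rf`.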

## Testing Simplifier

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from simplertimes import simplify

# Create the ACCESS simplifier model
access_simplifier = simplify.create_simplifier()

access_simplifier.print_details()

[nltk_data] Downloading package perluniprops to /root/nltk_data...
[nltk_data]   Unzipping misc/perluniprops.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


Downloading...
... 100% - 622 MB - 22.14 MB/s - 28s
Extracting...
Downloading...
... 100% - 623 MB - 15.71 MB/s - 39s
Extracting...
<function get_fairseq_simplifier.<locals>.fairseq_simplifier at 0x7f9363b2aee0>            


In [3]:
# to_simplify = [
#     "The Palestinian Authority formally becomes the 123rd member of the International Criminal Court .<n>The move gives the court jurisdiction over alleged crimes in Palestinian territories .<n>Palestinians signed the ICC's founding Rome Statute in January .",
#     'Theia, a white-and-black bully breed mix, was apparently hit by a car and buried in a field .<n>Four days later, the dog staggers to a farm and is taken in by a worker .<n>She needs surgery to fix a dislocated jaw and a caved-in sinus cavity .',
#     "Mohammad Javad Zarif is the Iranian foreign minister .<n>He is the opposite number in talks with the U.S. over Iran's nuclear program .<n>He received a hero's welcome as he arrived in Iran on a sunny Friday morning ."
# ]

to_simplify = [
    "The Palestinian Authority becomes the 123rd member of the International Criminal Court. The move gives the court jurisdiction over alleged crimes in Palestinian territories. Israel and the United States opposed the Palestinians' efforts to join the body. But Palestinian Foreign Minister Riad al-Malki said it was a move toward greater justice.",
    'Theia, a one-year-old bully breed mix, was hit by a car and buried in a field. She managed to stagger to a nearby farm, dirt-covered and emaciated. She suffered a dislocated jaw, leg injuries and a caved-in sinus cavity.',
    "Mohammad Javad Zarif is the Iranian foreign minister. He has been John Kerry's opposite number in securing a breakthrough in nuclear discussions. He received a hero's welcome as he arrived in Iran on a sunny Friday morning. But there are some facts about Zarif that are less well-known."
]
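
The commented-out inputs above use `<n>` between sentences and a space before each final period, whereas the active inputs are plain text. If inputs in the first format need to be fed to the simplifier, they could be normalized first. This is a minimal sketch: `normalize_summary` is a hypothetical helper, and reading `<n>` as a sentence separator is an assumption based only on how it appears in those strings.

```python
import re


def normalize_summary(text: str) -> str:
    """Turn a `<n>`-separated, space-tokenized summary into plain text."""
    # Assumption: <n> marks a sentence break in the commented-out inputs.
    text = text.replace("<n>", " ")
    # Remove the whitespace that precedes punctuation, e.g. "minister ." -> "minister."
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)
    # Collapse any remaining runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


# Example string abridged from the commented-out list above.
raw = "Mohammad Javad Zarif is the Iranian foreign minister .<n>He received a hero's welcome ."
print(normalize_summary(raw))
```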

In [4]:
simplified, comp_simp_pairs = access_simplifier.simplify_document(to_simplify)

# Print each simplified document followed by its sentence-level pairs
separator = "============================================================"
for doc_simp, doc_sent_pairs in zip(simplified, comp_simp_pairs):
    print(separator)
    print()

    print(f"Summary:\n{doc_simp}\n")

    print("Simplifications:")
    for comp, simp in doc_sent_pairs:
        print(comp)
        print(f"--> {simp}")
    print()
    print(separator)


Summary:
The Palestinian Authority is a member of the International Criminal Court of the United States . The move gives the court control over alleged crimes in Palestinian territories . Israel and the United States did not like the Palestinians to join the body . But Palestinian Foreign Minister Riad al-Malki said it was a move toward more justice .

Simplifications:
The Palestinian Authority becomes the 123rd member of the International Criminal Court.
--> The Palestinian Authority is a member of the International Criminal Court of the United States .
The move gives the court jurisdiction over alleged crimes in Palestinian territories.
--> The move gives the court control over alleged crimes in Palestinian territories .
Israel and the United States opposed the Palestinians' efforts to join the body.
--> Israel and the United States did not like the Palestinians to join the body .
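
Pairs like the ones above are easier to audit when the changed words are listed explicitly, for example to spot where the first pair drops "123rd" and adds "of the United States". The following is a minimal sketch using the standard library's `difflib`; `word_changes` is a hypothetical helper and not part of `simplertimes`.

```python
import difflib


def word_changes(complex_sent: str, simple_sent: str) -> list[tuple[str, str, str]]:
    """List (operation, removed words, added words) for each edit between two sentences."""
    # Hypothetical helper: compares whitespace-separated tokens only.
    a, b = complex_sent.split(), simple_sent.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replacements, deletions, and insertions
    ]


for change in word_changes(
    "The move gives the court jurisdiction over alleged crimes",
    "The move gives the court control over alleged crimes",
):
    print(change)
```

Because the simplifier output is tokenized (note the space before the final period), punctuation attached to a word in the original sentence will also show up as a change unless both sides are tokenized the same way first.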


Summary:
Theia , a one-year-old breed mix , was hit by a car . It was buried in a field . She was giv