# rapid_textrank TestPyPI Demo

This notebook installs the package from TestPyPI and runs a few quick examples.

Notes:
- If `rapid_textrank` was already installed in this environment, restart the kernel after running the install cell so the new version is loaded.
- The examples below avoid spaCy dependencies; they use the Rust-backed Python API directly.


In [2]:
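# Install from TestPyPI; dependencies not hosted there fall back to PyPI.
%pip install -U pip
%pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple rapid_textrank==0.1.0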


Note: you may need to restart the kernel to use updated packages.
Looking in indexes: https://test.pypi.org/simple/, https://pypi.org/simple
Collecting rapid_textrank==0.1.0
  Using cached https://test-files.pythonhosted.org/packages/19/15/088e8c06cbfafc5f13b57f7890bbaa296a7b8739079e2fbf84bfce5eca43/rapid_textrank-0.1.0.tar.gz (78 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: rapid_textrank
  Building wheel for rapid_textrank (pyproject.toml) ... done
  Created wheel for rapid_textrank: filename=rapid_textrank-0.1.0-cp311-cp311-macosx_10_12_x86_64.whl size=509821 sha256=0b939bfca5213629581e10aea97968855da06f43c651e42aa2ac97cea411d7fb
  Stored in directory: /Users/admin/Library/Caches/pip/wheels/4a/58/58/6c8e409e6867621d7ac2546e31516290a895fe3480666bf066
Successfully built rapid_textrank
Installing collected 

In [3]:
from rapid_textrank import extract_keywords, BaseTextRank, PositionRank, BiasedTextRank, __version__

print("rapid_textrank version:", __version__)


rapid_textrank version: 0.1.0
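
If a stale build is still loaded in the kernel, the imported `__version__` may differ from what the install cell just fetched. The sketch below reads the version from the installed distribution's metadata using only the standard library. The helper name `installed_version` is hypothetical and not part of `rapid_textrank`, and the distribution name passed to it is assumed to match the name used in the install cell.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Compare this value with the __version__ printed by the import cell.
print("installed distribution version:", installed_version("rapid_textrank"))
```

If the two values disagree, restarting the kernel and re-running the import cell is the usual remedy.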


In [4]:
# Example text (from this repo's README)
text = """
Machine learning is a subset of artificial intelligence that enables
systems to learn and improve from experience. Deep learning, a type of
machine learning, uses neural networks with many layers.
"""

phrases = extract_keywords(text, top_n=5, language="en")
for p in phrases:
    print(f"{p.rank:>2}. {p.text:25s} {p.score:.4f}")


 1. Machine                   0.2188
 2. artificial intelligence that enables 0.2063
 3. and improve from experience 0.1429
 4. networks with many layers 0.1210
 5. is a subset of            0.0742
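
The fixed width of 25 in the format string is narrower than some phrases, so the score column drifts in the output above. A minimal sketch of one way to keep the columns aligned is to size the text column to the longest phrase. `Phrase` is a stand-in for the objects returned by `extract_keywords`, assuming only the `rank`, `text`, and `score` attributes used in the cell above; `format_phrases` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in for the phrase objects returned by extract_keywords; only the
# rank, text, and score attributes used in the cell above are assumed.
Phrase = namedtuple("Phrase", ["rank", "text", "score"])


def format_phrases(phrases):
    """Return aligned 'rank. text score' lines, sized to the longest text."""
    phrases = list(phrases)
    width = max((len(p.text) for p in phrases), default=0)
    return [f"{p.rank:>2}. {p.text:<{width}s} {p.score:.4f}" for p in phrases]


# Two rows copied from the output above.
sample = [
    Phrase(1, "Machine", 0.2188),
    Phrase(2, "artificial intelligence that enables", 0.2063),
]
for line in format_phrases(sample):
    print(line)
```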


In [5]:
# Class-based API
text2 = """
Quantum computing is advancing quickly. Researchers are improving
error correction and building larger, more reliable systems.
"""

print("BaseTextRank:")
base = BaseTextRank(top_n=5, language="en")
for p in base.extract_keywords(text2).phrases:
    print(f"- {p.text} ({p.score:.4f})")

print("\nPositionRank:")
pos = PositionRank(top_n=5, language="en")
for p in pos.extract_keywords(text2).phrases:
    print(f"- {p.text} ({p.score:.4f})")

print("\nBiasedTextRank (focus: security/privacy):")
biased = BiasedTextRank(focus_terms=["security", "privacy"], bias_weight=5.0, top_n=5)
for p in biased.extract_keywords(text2).phrases:
    print(f"- {p.text} ({p.score:.4f})")


BaseTextRank:
- error correction and (0.3680)
- reliable systems (0.3680)
- larger more (0.2074)
- Quantum (0.0283)
- Researchers are (0.0283)

PositionRank:
- error correction and (0.2851)
- reliable systems (0.2658)
- Quantum (0.2533)
- larger more (0.1536)
- Researchers are (0.0422)

BiasedTextRank (focus: security/privacy):
- error correction and (0.3680)
- reliable systems (0.3680)
- larger more (0.2074)
- Quantum (0.0283)
- Researchers are (0.0283)
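
The BiasedTextRank scores above are identical to the BaseTextRank scores. One plausible explanation, offered as an assumption rather than documented behavior, is that neither focus term appears in `text2`, so the bias has nothing to attach to. The sketch below checks which focus terms are absent from a text before running the extractor. `missing_focus_terms` is a hypothetical helper, and its simple lowercase word match may differ from how the library matches terms (for example, by lemma).

```python
import re


def missing_focus_terms(text, focus_terms):
    """Return the focus terms that never occur as whole words in text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [term for term in focus_terms if term.lower() not in words]


text2 = """
Quantum computing is advancing quickly. Researchers are improving
error correction and building larger, more reliable systems.
"""

# Both focus terms from the cell above are reported as missing.
print(missing_focus_terms(text2, ["security", "privacy"]))
```

Choosing focus terms that do occur in the text, such as "error" or "systems", would be a more informative way to see whether the bias changes the ranking.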


In [6]:
# Example text from the pytextrank README
text3 = """
Compatibility of systems of linear constraints over the set of natural
numbers. Criteria of compatibility of a system of linear Diophantine
equations, strict inequations, and nonstrict inequations are considered.
Upper bounds for components of a minimal set of solutions and algorithms
of construction of minimal generating sets of solutions for all types of
systems are given. These criteria and the corresponding algorithms for
constructing a minimal supporting set of solutions can be used in solving
all the considered types systems and systems of mixed types.
"""

phrases = extract_keywords(text3, top_n=10, language="en")
for p in phrases:
    print(f"{p.rank:>2}. {p.text:35s} {p.score:.4f}")


 1. system of linear Diophantine        0.1562
 2. systems and systems of              0.1205
 3. and nonstrict inequations are       0.1110
 4. over the set of                     0.1084
 5. of solutions and algorithms         0.1065
 6. of solutions can be                 0.0725
 7. systems are given                   0.0602
 8. types                               0.0516
 9. for all types of                    0.0516
10. for components of a                 0.0414


In [7]:
# JSON interface (batch)
import json
from rapid_textrank import extract_batch_from_json

docs = [
    {
        "tokens": [
            {
                "text": "Machine",
                "lemma": "machine",
                "pos": "NOUN",
                "start": 0,
                "end": 7,
                "sentence_idx": 0,
                "token_idx": 0,
                "is_stopword": False,
            },
            {
                "text": "learning",
                "lemma": "learning",
                "pos": "NOUN",
                "start": 8,
                "end": 16,
                "sentence_idx": 0,
                "token_idx": 1,
                "is_stopword": False,
            },
        ],
        "config": {"top_n": 3}
    },
    {
        "tokens": [
            {
                "text": "Neural",
                "lemma": "neural",
                "pos": "ADJ",
                "start": 0,
                "end": 6,
                "sentence_idx": 0,
                "token_idx": 0,
                "is_stopword": False,
            },
            {
                "text": "networks",
                "lemma": "network",
                "pos": "NOUN",
                "start": 7,
                "end": 15,
                "sentence_idx": 0,
                "token_idx": 1,
                "is_stopword": False,
            },
        ]
    },
]

results_json = extract_batch_from_json(json.dumps(docs))
results = json.loads(results_json)

for i, result in enumerate(results):
    print(f"Doc {i}:")
    for phrase in result["phrases"]:
        print(f"- {phrase['text']} ({phrase['score']:.4f})")


Doc 0:
- Machine learning (1.0000)
Doc 1:
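
Writing token dictionaries by hand is error-prone, especially the character offsets. The sketch below builds them from `(text, lemma, pos)` triples for a single sentence, using the same keys as the cell above. `build_tokens` is a hypothetical helper, the offsets assume tokens are separated by exactly one space, and the `stopwords` parameter is an illustrative addition for setting `is_stopword`.

```python
def build_tokens(triples, sentence_idx=0, stopwords=frozenset()):
    """Build token dicts from (text, lemma, pos) triples for one sentence.

    Character offsets assume tokens are separated by exactly one space.
    """
    tokens = []
    offset = 0
    for token_idx, (text, lemma, pos) in enumerate(triples):
        tokens.append(
            {
                "text": text,
                "lemma": lemma,
                "pos": pos,
                "start": offset,
                "end": offset + len(text),
                "sentence_idx": sentence_idx,
                "token_idx": token_idx,
                "is_stopword": lemma in stopwords,
            }
        )
        offset += len(text) + 1  # skip the single separating space
    return tokens


# Reproduces the token list of the first document in the cell above.
doc = {
    "tokens": build_tokens(
        [("Machine", "machine", "NOUN"), ("learning", "learning", "NOUN")]
    ),
    "config": {"top_n": 3},
}
```

The resulting `doc` can be placed in the `docs` list and serialized with `json.dumps` exactly as in the cell above.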
