Release 0.9.0.0
lintool committed Apr 18, 2020
1 parent cb46426 commit 0a4fbd0
Showing 3 changed files with 9 additions and 5 deletions.
8 changes: 6 additions & 2 deletions README.md
@@ -12,7 +12,7 @@ A low-effort way to try out Pyserini is to look at our [online notebooks](https:
For convenience, we've pre-built a few common indexes, available to download [here](https://git.uwaterloo.ca/jimmylin/anserini-indexes).

Pyserini versions adopt the convention of _X.Y.Z.W_, where _X.Y.Z_ tracks the version of Anserini, and _W_ is used to distinguish different releases on the Python end.
The current stable release of Pyserini is [v0.8.1.0](https://pypi.org/project/pyserini/) on PyPI.
The current stable release of Pyserini is [v0.9.0.0](https://pypi.org/project/pyserini/) on PyPI.
The current experimental release of Pyserini on TestPyPI is behind the current stable release (i.e., do not use).
In general, documentation is kept up to date with the latest code in the repo.
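
The _X.Y.Z.W_ convention above can be illustrated with a small sketch. The helper `split_pyserini_version` is hypothetical (it is not part of Pyserini); it only splits a version string into the Anserini version it tracks and the Python-side release number.

```python
from typing import Tuple


def split_pyserini_version(version: str) -> Tuple[str, int]:
    """Split 'X.Y.Z.W' into the Anserini version 'X.Y.Z' and the release number W.

    Hypothetical helper for illustration only; not part of Pyserini.
    """
    parts = version.split('.')
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f'expected a version of the form X.Y.Z.W, got {version!r}')
    anserini_version = '.'.join(parts[:3])  # X.Y.Z tracks Anserini
    python_release = int(parts[3])          # W distinguishes Python-side releases
    return anserini_version, python_release


anserini_version, python_release = split_pyserini_version('0.9.0.0')
print(anserini_version, python_release)
```

Under this convention, release 0.9.0.0 corresponds to Anserini 0.9.0, which matches the `anserini-0.9.0-fatjar.jar` bundled in `setup.py` in this commit.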

@@ -21,7 +21,7 @@ In general, documentation is kept up to date with the latest code in the repo.
Install via PyPI

```
pip install pyserini==0.8.1.0
pip install pyserini==0.9.0.0
```

## Simple Usage
@@ -76,21 +76,25 @@ from pyserini.analysis.pyanalysis import get_lucene_analyzer, Analyzer
# Default analyzer for English uses the Porter stemmer:
analyzer = Analyzer(get_lucene_analyzer())
tokens = analyzer.analyze('City buses are running on time.')
print(tokens)
# Result is ['citi', 'buse', 'run', 'time']

# We can explicitly specify the Porter stemmer as follows:
analyzer = Analyzer(get_lucene_analyzer(stemmer='porter'))
tokens = analyzer.analyze('City buses are running on time.')
print(tokens)
# Result is same as above.

# We can explicitly specify the Krovetz stemmer as follows:
analyzer = Analyzer(get_lucene_analyzer(stemmer='krovetz'))
tokens = analyzer.analyze('City buses are running on time.')
print(tokens)
# Result is ['city', 'bus', 'running', 'time']

# Create an analyzer that doesn't stem, simply tokenizes:
analyzer = Analyzer(get_lucene_analyzer(stemming=False))
tokens = analyzer.analyze('City buses are running on time.')
print(tokens)
# Result is ['city', 'buses', 'running', 'time']
```

2 changes: 1 addition & 1 deletion project-description.md
@@ -27,7 +27,7 @@ hits = searcher.search('hubble space telescope')

# Print the first 10 hits:
for i in range(0, 10):
    print(f'{i+1} {hits[i].docid} {hits[i].score}')
    print(f'{i+1:2} {hits[i].docid:15} {hits[i].score:.5f}')

# Grab the actual text:
hits[0].raw
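
The format specifiers in the updated `print` call pad the rank to a width of 2, pad the docid to a width of 15, and print the score with five decimal places. A minimal sketch of the effect, using made-up stand-in hits (the `Hit` tuple, docids, and scores are assumptions for illustration, so no index is required):

```python
from collections import namedtuple

# Stand-in for a search hit; the docids and scores are made up for illustration.
Hit = namedtuple('Hit', ['docid', 'score'])
hits = [Hit('doc-a', 12.5), Hit('doc-b', 7.25)]

# Same format specifiers as the updated print statement:
# integers are right-aligned and strings left-aligned by default.
lines = [f'{i+1:2} {hit.docid:15} {hit.score:.5f}' for i, hit in enumerate(hits)]
for line in lines:
    print(line)
```

Fixed-width fields keep the columns aligned when docids differ in length, which the unformatted version did not guarantee.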
4 changes: 2 additions & 2 deletions setup.py
@@ -5,14 +5,14 @@

setuptools.setup(
    name="pyserini",
    version="0.8.1.0",
    version="0.9.0.0",
    author="Jimmy Lin",
    author_email="jimmylin@uwaterloo.ca",
    description="Python interface to the Anserini IR toolkit built on Lucene",
    long_description=long_description,
    long_description_content_type="text/markdown",
    package_data={"pyserini": [
        "resources/jars/anserini-0.8.1-fatjar.jar",
        "resources/jars/anserini-0.9.0-fatjar.jar",
    ]},
    url="https://github.com/castorini/pyserini",
    install_requires=['Cython', 'pyjnius'],