A simple web-app-based Search Engine built on top of the Greek Parliament Proceedings
grep + Greek Parliament
- Run (ideally in a fresh venv)
pip install greparl
- Download the required data files and, if needed, decompress them in the desired directory (see below)
The required data files that are not shipped along with the package include the Search Engine's indices, the parliament proceedings' file and some other tasty stuff.
- The raw proceedings can be downloaded here: speeches.csv. The extracted file should be renamed to speeches.csv.
- The Search Engine's core can be downloaded here: information-retrieval.tar.gz
Those files should be decompressed in the same directory from which the user will run GreParl.
Alternatively, all required files (apart from speeches.csv) can be auto-generated.
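Since speeches.csv is the one file that cannot be auto-generated, it may be worth confirming it is in place before launching. A minimal sketch, assuming the file sits in the directory GreParl will be started from; the helper name `missing_data_files` is hypothetical and not part of the greparl package:

```python
from pathlib import Path


def missing_data_files(directory=".", required=("speeches.csv",)):
    """Return the names in `required` that are not regular files in `directory`.

    Hypothetical helper for illustration; only speeches.csv is listed by
    default because it is the only file the README names explicitly.
    """
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


# Example: check the current working directory before running GreParl.
# missing = missing_data_files()
# if missing:
#     print("Missing data files:", ", ".join(missing))
```

An empty list means every listed file was found; any other result names what still needs to be downloaded.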
- Activate venv
- Run greparl (or python -m greparl) and wait for signs of life...
- The default browser should open up automatically, but if not, browse to "http://127.0.0.1:5000/" manually
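If the browser does not open, a quick way to tell whether the app is actually listening is to probe the address above. The following is only a sketch using the standard library; the function name `server_is_up` is made up for this example and is not something GreParl provides:

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:5000/", timeout=2.0):
    """Return True if something answers HTTP requests at `url`."""
    # Bypass any system proxy settings, since the target is a local address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just with an error status code.
        return True
    except OSError:
        # Covers URLError, refused connections and timeouts.
        return False
```

Once this returns True, `webbrowser.open("http://127.0.0.1:5000/")` from the standard library can be used to open the page by hand.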
You can either search for a specific speech, or preview a random one (totally original..).
You can preview the speeches on the Results page. No pagination is available at this time.
You can perform a deeper search, which returns speeches that are similar, but not necessarily identical, to the query.
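How the Search Engine's core implements this deeper search is not described here. As a rough illustration of the general idea only, one common approach is to rank documents by the cosine similarity of their TF-IDF vectors. The sketch below assumes whitespace tokenization and uses hypothetical function names; it is not GreParl's actual method:

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer for illustration: lowercase and split on whitespace.
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(query, docs):
    """Return indices of `docs` ordered from most to least similar to `query`.

    Documents sharing no term with the query are left out.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items()
         if t in idf}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [i for score, i in ranked if score > 0]
```

A document that shares only some of the query's terms still gets a non-zero score, which is what makes the results "similar" rather than exact matches.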
You can read a specific speech and/or its metadata. On the Speech page, a shortcut is also provided for highlighting the current speech.
You can find the most important keywords of a specific speech or set of speeches.
Speech sets can be grouped by parties or parliament members and can be limited using date ranges.
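To make the grouping and date-range idea concrete, here is a small sketch that filters a list of speeches and then scores the terms of the selected set with a TF-IDF style weight. The record fields (`party`, `member`, `date`, `text`) and both function names are assumptions made for this example; the real column names in speeches.csv and the scoring used by GreParl may differ:

```python
import math
from collections import Counter
from datetime import date


def select_speeches(speeches, party=None, member=None, start=None, end=None):
    """Keep speeches matching the given party/member within [start, end]."""
    selected = []
    for s in speeches:
        if party is not None and s["party"] != party:
            continue
        if member is not None and s["member"] != member:
            continue
        if start is not None and s["date"] < start:
            continue
        if end is not None and s["date"] > end:
            continue
        selected.append(s)
    return selected


def top_keywords(selected, corpus, k=3):
    """Rank terms of `selected` by frequency weighted against the whole corpus.

    Assumes every speech in `selected` is also part of `corpus`.
    """
    n = len(corpus)
    df = Counter(t for s in corpus for t in set(s["text"].lower().split()))
    tf = Counter(t for s in selected for t in s["text"].lower().split())
    # Terms that occur in every speech get a weight of zero.
    scores = {t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]
```

For example, `select_speeches(speeches, party="A", start=date(2019, 1, 1), end=date(2019, 12, 31))` followed by `top_keywords(...)` yields the terms that set apart party A's 2019 speeches from the rest.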
You can compare parliament members to find out which ones tend to speak about the same topics the most.
You can predict the party that is most likely to have said an arbitrary phrase of your choice.
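The classifier behind this feature is not documented here. Purely as an illustration of how such a prediction can work, the sketch below uses a multinomial Naive Bayes model with add-one smoothing; the function names and the `(party, text)` input format are assumptions, and this is not necessarily what the Search Engine's core does:

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Build per-party word counts from an iterable of (party, text) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for party, text in samples:
        tokens = text.lower().split()
        word_counts[party].update(tokens)
        doc_counts[party] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, phrase):
    """Return the party with the highest log-probability for `phrase`."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_party, best_score = None, -math.inf
    for party in sorted(doc_counts):
        total_words = sum(word_counts[party].values())
        # Prior: share of training speeches belonging to this party.
        score = math.log(doc_counts[party] / total_docs)
        for tok in phrase.lower().split():
            if tok not in vocab:
                continue  # Unknown words carry no evidence either way.
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[party][tok] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_party, best_score = party, score
    return best_party
```

With a handful of labelled speeches per party, `predict(train(samples), "some phrase")` returns the label whose vocabulary best matches the phrase.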
The author of this package is not the creator of the Search Engine's core. All credit should go to Theodoros Grammenos' work. This project is just a graphical wrapper, trying to make life easier :D
Also, note that this project ships a modified version of alup's greek_stemmer, which is originally distributed under the MIT License.