Rehearser

Rehearser is an application that aids in reading research papers.

(Image includes text from the "Demo Paper" [1])

Usage

To use the application, the only requirements are Docker and Docker Compose.

To run the server, execute the command docker-compose up -d in the root directory.

Once the server is running, you can open the application at localhost:8000. Note that the application needs ports 8000 (used by http-server) and 5000 (used by Flask). If either port is already in use, the application will not run.
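Because startup fails when either port is taken, it can help to check them before starting the containers. The following is a minimal sketch using only the Python standard library. The function names port_is_free and busy_ports are illustrative and not part of this project.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def busy_ports(ports=(8000, 5000)):
    """Return the subset of ports that already have a listener."""
    return [p for p in ports if not port_is_free(p)]


if __name__ == "__main__":
    taken = busy_ports()
    if taken:
        print("Ports already in use:", taken)
    else:
        print("Ports 8000 and 5000 look free.")
```

Run this before starting the server. If it reports a port in use, stop the process holding that port first.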

Current Features

Scrape a research paper to extract the narrational text.
Convert scraped text to audio with segmented text-audio pairs for the original text.
Provide a minimalistic front end for uploading documents, downloading text, and viewing the text-aligned audio with minimal playback controls.
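To illustrate what "segmented text-audio pairs" can look like, here is a rough sketch that splits text into sentence-like chunks and pairs each chunk with the name of the audio file that would hold its speech. It is not the project's implementation: the Segment class, the file naming scheme, and the naive regex splitter are all assumptions made for illustration, and no audio is generated here.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    index: int
    text: str
    audio_file: str  # hypothetical naming scheme, not taken from the project


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    Abbreviations such as "e.g." will be split incorrectly; a real pipeline
    would likely use a proper sentence tokenizer.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_segments(text, prefix="segment"):
    """Pair each text chunk with the audio file that would hold its speech."""
    return [
        Segment(i, sentence, f"{prefix}_{i:04d}.wav")
        for i, sentence in enumerate(split_sentences(text))
    ]


if __name__ == "__main__":
    for seg in build_segments("Papers are long. Audio helps! Does it?"):
        print(seg.index, seg.audio_file, seg.text)
```

Keeping the index, text, and audio file together in one record is what lets a front end highlight the chunk that is currently playing.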

Technical Details

The application uses a few key components:

GROBID - The current library for parsing research papers. Docker Compose starts a GROBID Docker container, to which PDFs are sent for extraction into XML.
BeautifulSoup - Used for parsing the XML generated by GROBID to find the parts of the research paper which are most relevant.
ESPNet - Framework used for executing TTS functionality in Python, currently used to convert text chunks into audio files.
ljspeech_fastspeech2 - A FastSpeech2 model trained on the LJSpeech dataset; this is the model used for speech conversion.
PyTorch - Machine learning framework responsible for running the ESPNet models.
nltk - Natural language processing framework necessary for grapheme-to-phoneme conversion in TTS.
Flask - API framework used to host endpoints in Python.
SQLite3 - Lightweight database used to store meta information and processing status on uploaded papers.
Celery - Asynchronous task queue that is used to queue paper processing requests and run them asynchronously.
Redis - In-memory data store that is used to store tasks for Celery workers.
http-server - Static HTTP server used for serving the frontend application.
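As an illustration of the XML parsing step, the sketch below pulls paragraph text out of the body of a TEI XML document, which is the format GROBID produces. It uses the standard library's xml.etree.ElementTree instead of BeautifulSoup so that it runs without extra dependencies. The function name extract_body_paragraphs and the choice to keep only body paragraphs are assumptions, not the project's actual logic.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents.
TEI_URI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI_URI}


def extract_body_paragraphs(tei_xml):
    """Return the text of every <p> element inside the TEI <body>."""
    root = ET.fromstring(tei_xml)
    body = root.find(".//tei:text/tei:body", TEI_NS)
    if body is None:
        return []
    paragraphs = []
    for p in body.iter("{%s}p" % TEI_URI):
        # Join nested text (e.g. inline references) and collapse whitespace.
        text = " ".join("".join(p.itertext()).split())
        if text:
            paragraphs.append(text)
    return paragraphs


if __name__ == "__main__":
    sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>'
        "<head>Introduction</head><p>Reading papers takes time.</p>"
        "</div></body></text></TEI>"
    )
    print(extract_body_paragraphs(sample))
```

Section headings, the header metadata, and the bibliography live outside the body paragraphs in TEI, so they are left out by this selection.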

Disclaimer

This tool is still in an early iteration. Please use it with caution.

The parsing may end up missing text (e.g., a few lines at the top of a page) or adding undesired text (e.g., the contents of a figure).
Citations are currently missing from the extracted text and audio.

Citation

The "demo paper" used in this project is:

[1] Meuschke, Norman, et al. "A benchmark of PDF information extraction tools using a multi-task and multi-domain evaluation framework for academic documents." International Conference on Information. Cham: Springer Nature Switzerland, 2023.
