Subtitle Extractor

link


Dependencies

Required binaries

(Must be installed on the local machine)

  • ccextractor
    • To extract closed captions
  • redis
    • Used as the message broker for celery
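As a rough illustration of how ccextractor might be driven from Python, here is a minimal sketch. The helper names (build_ccextractor_command, extract_subtitles) are hypothetical and not taken from this repository; the sketch only assumes that the ccextractor binary is on PATH and that its -o option names the output file.

```python
import shutil
import subprocess
from pathlib import Path


def build_ccextractor_command(video_path, srt_path):
    """Return the argument list for a basic ccextractor run (hypothetical helper)."""
    # -o names the output file; ccextractor writes SRT output by default.
    return ["ccextractor", str(video_path), "-o", str(srt_path)]


def extract_subtitles(video_path):
    """Run ccextractor on video_path and return the path of the .srt file."""
    if shutil.which("ccextractor") is None:
        raise RuntimeError("ccextractor is not installed or not on PATH")
    # Assumption: the .srt file is written next to the input video.
    srt_path = Path(video_path).with_suffix(".srt")
    # check=True raises CalledProcessError on any non-zero exit code.
    subprocess.run(build_ccextractor_command(video_path, srt_path), check=True)
    return srt_path
```

Keeping the command construction in its own function makes it easy to inspect or test without the binary being installed.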

Required pip modules

  • boto3
    • To connect with DynamoDB
  • celery
    • To offload tasks to the background
  • django
  • django-celery-results
    • To store the task results in the django database
  • django-storages
    • To store files on a remote server (AWS S3 in this case)
  • djangorestframework
    • To build the REST API

Run on local machine

Install the required binaries, then clone the repository and install the required pip modules:

git clone https://github.com/saurzv/subtitle-extractor.git
cd subtitle-extractor
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt

Create a .env file in the project root that sets these values:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
DJANGO_SECRET_KEY
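The project's own settings code is not shown here, so the following is only a sketch of one way the three values could be read and validated at startup. It assumes the .env file uses plain KEY=VALUE lines; the functions parse_env_text and load_env are hypothetical, and the project may well use a package such as python-dotenv instead of this standard-library approach.

```python
import os
from pathlib import Path

# The three keys listed above.
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "DJANGO_SECRET_KEY")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path=".env"):
    """Load the .env file into os.environ, failing early if a key is missing."""
    values = parse_env_text(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise RuntimeError("Missing keys in .env: " + ", ".join(missing))
    os.environ.update(values)
    return values
```

Failing early on a missing key tends to give a clearer error than a later failure inside boto3 or django.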

Make migrations and run the server:

python3 manage.py makemigrations
python3 manage.py migrate
python3 manage.py runserver

In another terminal with the same virtual environment activated, run celery:

celery -A server worker -l info

Visit the site at http://127.0.0.1:8000/

Commands are written with the bash shell in mind.


Scope for further improvement

  • Uploading large files can leave the user stuck on the homepage for a long time. The POST handling could be offloaded to celery in the background, with a waiting page that polls an API for the task status.
  • A progress bar could be added for file uploads.
  • Error pages could be implemented.
  • An option to download the extracted .srt file could be added.
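The polling idea in the first item could look roughly like the sketch below. Nothing here exists in the repository yet: poll_until_ready and the fetch_state callable are hypothetical, and the sketch assumes a status endpoint that reports the celery task state as a string such as PENDING, STARTED, SUCCESS or FAILURE.

```python
import time

# Celery task states after which the task will not change state again.
READY_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def poll_until_ready(fetch_state, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_state() repeatedly until it returns a ready state.

    fetch_state is any callable returning the current task state as a
    string, for example a function that queries a (hypothetical) status API.
    """
    waited = 0.0
    while True:
        state = fetch_state()
        if state in READY_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError("task not finished after %.0f seconds" % timeout)
        sleep(interval)
        waited += interval
```

Passing the state lookup in as a callable keeps the loop independent of how the status is fetched, so the same logic could back a waiting page or a command-line client.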