domerin0/chat-data

chat-data

A Python script that downloads and processes movie script data for NLP purposes.

It was created to download openly available movie scripts from imsdb.com so that the data can be used to train deep-learning NLP applications for educational purposes.

The project is not finished: so far it only downloads the raw scripts. Parsing them into conversations has not been implemented yet and is coming soon.
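
As a rough illustration of what handling a downloaded page could involve, here is a minimal sketch that pulls the text out of `<pre>` elements in an HTML string. It is not the code in `main.py`: it uses the standard-library HTML parser rather than BeautifulSoup so that it runs without extra packages, the names `PreTextExtractor` and `extract_script_text` are hypothetical, and the assumption that the script body sits inside `<pre>` tags may not hold for every page.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class PreTextExtractor(HTMLParser):
    """Collect the text found inside <pre> elements (hypothetical helper)."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.depth = 0    # number of <pre> tags currently open
        self.chunks = []  # text fragments seen while inside <pre>

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while at least one <pre> is open.
        if self.depth > 0:
            self.chunks.append(data)


def extract_script_text(html):
    """Return the concatenated text of all <pre> blocks in `html`."""
    parser = PreTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)
```

Markup nested inside the `<pre>` block (for example `<b>` around character names) is dropped while its text is kept, which is usually what a later conversation-parsing step would want as input.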

## Dependencies

  • Requires Python 2.7

  • Requires package BeautifulSoup

sudo pip install BeautifulSoup

## Usage

To run:

python main.py

No further input is needed; the script downloads all the raw scripts on its own.

To delete all the downloaded raw data and temporary files (the train/val/test directories are left untouched), run:

python main.py clean
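
For readers curious how such a `clean` argument could be wired up, the following is a minimal sketch and not the actual contents of `main.py`. The directory names `raw` and `tmp` and the functions `clean` and `main` are assumptions made for illustration; the real script may store its data elsewhere.

```python
import os
import shutil
import sys

# Hypothetical directory names; the real script may use different ones.
RAW_DIRS = ("raw", "tmp")


def clean(base_dir, raw_dirs=RAW_DIRS):
    """Delete the raw-data and temporary directories under base_dir.

    Directories not listed in raw_dirs (e.g. train/, val/, test/) are
    left alone. Returns the names of the directories actually removed.
    """
    removed = []
    for name in raw_dirs:
        path = os.path.join(base_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(name)
    return removed


def main(argv):
    # `python main.py clean` cleans up; `python main.py` does a normal run.
    if len(argv) > 1 and argv[1] == "clean":
        clean(os.getcwd())
        return "clean"
    # The download step would go here.
    return "download"


if __name__ == "__main__":
    main(sys.argv)
```

Listing the removable directories explicitly, rather than deleting everything except train/val/test, keeps the command from touching files it does not know about.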
