doc_tagger.py and doc_tagger_final.py have identical functionality. Each searches the texts listed below and reports metadata (title, author, translator, and illustrator) along with counts of the keywords the user supplies on the command line.
The 14 text files are the full texts of classic titles from Project Gutenberg:
- The Adventures of Sherlock Holmes
- Pride and Prejudice
- Alice's Adventures in Wonderland
- Grimm's Fairy Tales
- Double or Nothing
- The Divine Comedy
- Leaves of Grass
- The Prince
- Les Misérables
- Adventures of Huckleberry Finn
- How to Analyze People on Sight Through the Science of Human Analysis: The Five Human Types
- Ulysses
- The Adventures of Tom Sawyer
- Moby Dick
For example, to search for the keywords "good" and "bad", enter the following at the command line: doc_tagger_final.py . good bad
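The scripts themselves are not reproduced in this README, so the following is only a minimal sketch of how a tagger of this kind might work, not the actual implementation. It assumes that each text begins with header lines such as "Title: ..." and "Author: ...", that the first command-line argument (the "." in the example above) is the directory holding the .txt files, and that keywords are matched as whole words, ignoring case. The names extract_metadata, count_keywords, tag_directory, and main are hypothetical.

```python
import os
import re
import sys

# Metadata fields named in this README; the "Field: value" header layout is an assumption.
FIELDS = ("title", "author", "translator", "illustrator")


def extract_metadata(text):
    """Return a dict mapping each field to its header value, or None if absent."""
    metadata = {}
    for field in FIELDS:
        # Match a line such as "Title: Moby Dick" anywhere in the text.
        pattern = r"^%s:[ \t]*(.+)$" % field
        match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
        metadata[field] = match.group(1).strip() if match else None
    return metadata


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    counts = {}
    for keyword in keywords:
        pattern = r"\b%s\b" % re.escape(keyword)
        counts[keyword] = len(re.findall(pattern, text, re.IGNORECASE))
    return counts


def tag_directory(directory, keywords):
    """Return {filename: (metadata, keyword_counts)} for every .txt file found."""
    results = {}
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".txt"):
            continue
        path = os.path.join(directory, name)
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
        results[name] = (extract_metadata(text), count_keywords(text, keywords))
    return results


def main(argv):
    """Print a short report; argv is [directory, keyword, ...]."""
    if len(argv) < 2:
        print("usage: <script> <directory> <keyword> [<keyword> ...]")
        return
    directory, keywords = argv[0], argv[1:]
    for name, (metadata, counts) in tag_directory(directory, keywords).items():
        print(name)
        for field in FIELDS:
            print(f"  {field}: {metadata[field]}")
        for keyword in keywords:
            print(f"  \"{keyword}\": {counts[keyword]}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a name of your choosing, the sketch would be invoked the same way as the example above: the directory first, then one or more keywords. Whole-word matching is a design choice here, so "good" would not be counted inside "goodness"; the real scripts may count differently.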