Parsing and finding the frequency of the words in Hamlet

DShKMG/Hamlet-Web-Parsing

Getting Started

This is web parsing code designed specifically for Hamlet. It is not very efficient, because getFreq() iterates through the whole word list every time it is called, but it is fast enough to handle part of the play.
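The repository's getFreq() is not reproduced here, so the following is only a sketch of one common alternative to rescanning the list for every word: counting all words in a single pass with a HashMap. The class and method names (WordCounter, countAll) are hypothetical and do not come from the repository.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCounter {
    // Hypothetical helper (not the repository's getFreq()):
    // visits each word exactly once and increments its counter.
    public static Map<String, Integer> countAll(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge() stores 1 for a new word, otherwise adds 1 to the old count
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```

With this approach, looking up the frequency of any word afterwards is a single map lookup, so the whole text is scanned once instead of once per word.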

Tasks

  1. Reads the text of the play Hamlet by W. Shakespeare from the URL http://shakespeare.mit.edu/hamlet/full.html

  2. Parses text into separate words

  3. Calculates the frequency of each word, i.e. counts how many times the word occurs in the text

  4. Prints a table of the 20 most common words and their number of occurrences.
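Tasks 2 to 4 above can be sketched roughly as follows, assuming the page text has already been downloaded into a plain string (the download step depends on HtmlUnit and is left out). The names TopWords, tokenize, topN and formatTable are hypothetical, and the tokenization rule (lowercase everything, split on anything that is not a letter a to z) is a simplifying assumption; for example, it splits "o'er" into "o" and "er".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class TopWords {
    // Task 2 (assumed rule): lowercase the text and split on non-letters.
    public static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {   // split() can yield a leading empty string
                words.add(token);
            }
        }
        return words;
    }

    // Tasks 3 and 4: count every word, then keep the n most frequent.
    public static List<Map.Entry<String, Integer>> topN(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : tokenize(text)) {
            counts.merge(word, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties are broken alphabetically for stable output.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    // Task 4: render the rows as a simple two-column text table.
    public static String formatTable(List<Map.Entry<String, Integer>> rows) {
        StringBuilder table = new StringBuilder();
        for (Map.Entry<String, Integer> row : rows) {
            table.append(String.format("%-15s %d%n", row.getKey(), row.getValue()));
        }
        return table.toString();
    }
}
```

Calling `System.out.print(TopWords.formatTable(TopWords.topN(text, 20)))` would then print the 20 most common words, one per line with their counts.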

Folder Structure

The workspace contains two folders by default, where:

  • src: the folder to maintain sources
  • lib: the folder to maintain dependencies (MISSING)

About Libraries/Packages

I do not own the library files. You need the HtmlUnit library to run this code. Please refer to the owners of the library:

https://sourceforge.net/projects/htmlunit/files/htmlunit/

https://htmlunit.sourceforge.io/
