html-parser

A Jupyter Notebook that implements a parser for analysing HTML articles.

The parser was developed as part of the application requirements for the MSc in Information Studies programme at the University of Amsterdam.

The goal was to parse a given collection of articles and plot its word count distribution.

Parsing and counting rely only on the Python standard library, with no third-party packages. The parser subclasses HTMLParser, Counter (from collections) counts the words, and the string module handles string manipulation. matplotlib, the one dependency outside the standard library, draws the histogram.
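
The approach described above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the names `TextExtractor`, `extract_words`, `word_frequencies` and `plot_article_lengths` are hypothetical, Python 3 is assumed, and "word count distribution" is read here as a histogram of words per article, which is an assumption.

```python
from collections import Counter
from html.parser import HTMLParser
import string


class TextExtractor(HTMLParser):
    """Collects text nodes from HTML, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_words(html):
    """Return the lowercased words of an HTML string, ASCII punctuation removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Deleting punctuation also merges contractions, e.g. "don't" -> "dont".
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def word_frequencies(html):
    """Count how often each word occurs in one article."""
    return Counter(extract_words(html))


def plot_article_lengths(articles):
    """Plot a histogram of words per article (assumed reading of the task)."""
    import matplotlib.pyplot as plt  # imported lazily; needed only for plotting

    lengths = [len(extract_words(article)) for article in articles]
    plt.hist(lengths, bins=20)
    plt.xlabel("Words per article")
    plt.ylabel("Number of articles")
    plt.title("Word count distribution")
    plt.show()
```

Calling `word_frequencies(article_html).most_common(10)` would list the ten most frequent words of a single article. How collection.txt separates its articles is not stated here, so splitting the file into individual articles is left out of the sketch.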

Files:
.gitignore
Parser.ipynb
README.md
collection.txt
