
Created detailed report of text readability, complexity and sentiment analysis of text data extracted from 114 websites.


Souravdani/Web-crawling-and-sentiment-analysis-of-114-URLs-


Web-crawling-and-sentiment-analysis-of-114-URLs-

Created a detailed report on the readability, complexity, and sentiment of text extracted from 114 websites. We used the Beautiful Soup library to crawl the 114 URLs and extract their text, then cleaned and preprocessed the text and analyzed it with the NLTK library in Python.
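As a rough illustration of the extraction step, the sketch below pulls paragraph text out of an HTML string with Beautiful Soup. The helper name `extract_text`, the choice to keep only `<p>` elements, and the sample HTML are assumptions made for this example; the repository's actual crawling code may differ.

```python
from bs4 import BeautifulSoup


def extract_text(html: str) -> str:
    """Return the paragraph text of an HTML page, one paragraph per line."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop elements whose content is never article text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Assumption: article text lives in <p> elements.
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return "\n".join(p for p in paragraphs if p)


# Hypothetical page content, used in place of a real download.
sample_html = (
    "<html><body><h1>Title</h1><script>var x = 1;</script>"
    "<p>First paragraph.</p><p>Second <b>bold</b> one.</p></body></html>"
)
print(extract_text(sample_html))
```

In a real crawl, each page would first be downloaded, for example with `requests.get(url, timeout=10).text`, and the result passed to the helper once per URL.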

The assignment covered the following analyses:

1. Sentence count per article.
2. Cleaned, tokenized articles with punctuation and stopwords removed.
3. Word count and average word length per article, i.e. the total number of characters across all words divided by the total number of words.
4. Word count per cleaned article.
5. Syllable count, complex word count, and personal pronoun count.
6. Positivity and negativity scores.
7. A pandas DataFrame collecting all the variables we calculated.
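The metrics listed above can be sketched in plain Python as follows. This is a simplified stand-in, not the project's code: regular expressions replace NLTK's tokenizers, the `STOPWORDS`, `POSITIVE`, and `NEGATIVE` sets are tiny hypothetical lists, the syllable counter is a vowel-group heuristic, and treating a word with more than two syllables as "complex" is an assumption.

```python
import re

# Tiny illustrative lists (assumptions); a real run would use fuller
# resources such as NLTK's stopword corpus and a sentiment dictionary.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "and", "of", "to", "in", "it"}
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}


def count_syllables(word: str) -> int:
    """Heuristic: count vowel groups, discounting a trailing 'es' or 'ed'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith(("es", "ed")) and count > 1:
        count -= 1
    return max(count, 1)


def count_personal_pronouns(text: str) -> int:
    """Count I, we, my, ours, us; skip 'US', which usually means the country."""
    matches = re.findall(r"\b(I|we|my|ours|us)\b", text, flags=re.IGNORECASE)
    return sum(1 for m in matches if m != "US")


def analyze(text: str) -> dict:
    """Compute the per-article metrics for one piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)  # punctuation is dropped here
    cleaned = [w.lower() for w in words if w.lower() not in STOPWORDS]
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "cleaned_word_count": len(cleaned),
        "syllable_count": sum(count_syllables(w) for w in cleaned),
        # Assumption: a complex word has more than two syllables.
        "complex_word_count": sum(1 for w in cleaned if count_syllables(w) > 2),
        "personal_pronouns": count_personal_pronouns(text),
        "positive_score": sum(1 for w in cleaned if w in POSITIVE),
        "negative_score": sum(1 for w in cleaned if w in NEGATIVE),
    }


sample = "We built a great crawler. The documentation was terrible!"
result = analyze(sample)
print(result)
```

Calling `analyze` once per article yields a list of dictionaries, which can be passed to `pandas.DataFrame` to get one row per article and one column per metric.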
