This project produces a detailed report on the readability, complexity, and sentiment of text extracted from 114 websites. We used the Beautiful Soup library to crawl the 114 URLs and extract their text. We then cleaned and preprocessed the text and used the NLTK library in Python for the analysis.
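As a rough illustration of the extraction step, the sketch below pulls the title and paragraph text out of an HTML page with Beautiful Soup. The function name `extract_article_text`, the choice to keep only `<title>` and `<p>` content, and the inline `sample_html` string are assumptions made for this example; the actual crawler may have selected different elements.

```python
from bs4 import BeautifulSoup


def extract_article_text(html: str) -> str:
    """Return the title and paragraph text of an HTML page as plain text."""
    soup = BeautifulSoup(html, "html.parser")

    # Remove script and style elements so their contents are not treated as text.
    for tag in soup(["script", "style"]):
        tag.decompose()

    title = soup.title.get_text(strip=True) if soup.title else ""
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return "\n".join([title] + paragraphs).strip()


# Hypothetical page used only to demonstrate the function (no network access).
sample_html = """
<html><head><title>Sample Article</title><style>p {color: red}</style></head>
<body><p>First paragraph.</p><script>var x = 1;</script>
<p>Second <b>bold</b> paragraph.</p></body></html>
"""

article_text = extract_article_text(sample_html)
print(article_text)
```

In a full pipeline the HTML for each of the 114 URLs would first have to be downloaded (for example with an HTTP client) and then passed to a function like this one.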
The following analysis was done for the assignment:
1. Sentence count per article
2. Cleaned, tokenized articles with punctuation and stopwords removed
3. Word count and average word length per article, i.e. the total number of characters across all words divided by the total number of words
4. Word count per cleaned article
5. Syllable count, complex word count, and personal pronoun count
6. Positivity and negativity scores
7. A pandas DataFrame containing all the variables we calculated
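The metrics listed above can be sketched in a few lines of Python. This is a simplified stand-in, not the project's implementation: it uses regular expressions instead of NLTK's tokenizers so that it runs without downloading NLTK data, and the stopword list, the positive and negative word lists, the pronoun pattern, the vowel-group syllable heuristic, and the "more than two syllables" rule for complex words are all assumptions chosen for illustration.

```python
import re

# Tiny illustrative word lists (assumptions); a real run would use NLTK's
# stopword corpus and full positive/negative dictionaries.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "and", "to",
             "in", "it", "we", "i", "my", "our", "this", "that"}
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "terrible"}
# Assumed pronoun set; case-insensitive, so it would also match "US".
PRONOUN_RE = re.compile(r"\b(I|we|my|ours|us)\b", re.IGNORECASE)


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups, discounting a final 'es'/'ed'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith(("es", "ed")) and count > 1:
        count -= 1
    return max(count, 1)


def analyze(text: str) -> dict:
    """Compute the per-article metrics for one piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[A-Za-z]+", text)  # words only, punctuation dropped
    cleaned = [t.lower() for t in tokens if t.lower() not in STOPWORDS]
    return {
        "sentence_count": len(sentences),
        "word_count": len(tokens),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0,
        "cleaned_word_count": len(cleaned),
        "syllable_count": sum(count_syllables(w) for w in cleaned),
        "complex_word_count": sum(1 for w in cleaned if count_syllables(w) > 2),
        "personal_pronouns": len(PRONOUN_RE.findall(text)),
        "positive_score": sum(1 for w in cleaned if w in POSITIVE_WORDS),
        "negative_score": sum(1 for w in cleaned if w in NEGATIVE_WORDS),
    }


# Hypothetical article text used only to demonstrate the function.
sample = "We had a great trip. The food was excellent, but the hotel was terrible!"
metrics = analyze(sample)
print(metrics)
```

Calling `analyze` on every article yields a list of dictionaries, which can be passed to `pandas.DataFrame(...)` to build the final table with one row per article and one column per metric.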