NaturalLanguageProcessing

1. Reading the txt file

2. Applying pre-processing to the data:

i- Converting the text to lowercase

ii- Splitting the text into words (tokens)

iii- Removing words that contain numeric characters

iv- Applying the 3rd filter/condition

v- Removing stop words from the data

3. Storing the words and their counts in an appropriate data structure

4. Repeating from step 1 for the next file until all files have been read

5. Printing the details of each word: the word, its count, and its probability
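
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the repository's actual code, and it rests on several assumptions: the stop-word set `STOP_WORDS` is a small hypothetical sample, tokens are split on whitespace, the unspecified "3rd filter/condition" is modelled as a caller-supplied predicate `third_filter` that keeps every token by default, and a word's probability is taken to be its count divided by the total number of retained words across all files. The function names are hypothetical as well.

```python
from collections import Counter
from pathlib import Path

# Hypothetical sample; the README does not say which stop-word list is used.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess(text, stop_words=STOP_WORDS, third_filter=lambda tok: True):
    """Apply the pre-processing sub-steps i to v to one file's text."""
    tokens = text.lower().split()                       # i and ii: lowercase, tokenize
    tokens = [t for t in tokens
              if not any(ch.isdigit() for ch in t)]     # iii: drop tokens containing digits
    tokens = [t for t in tokens if third_filter(t)]     # iv: unspecified filter (assumed predicate)
    return [t for t in tokens if t not in stop_words]   # v: drop stop words


def count_words(paths):
    """Read each file in turn and accumulate word counts in a Counter."""
    counts = Counter()
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        counts.update(preprocess(text))
    return counts


def word_probabilities(counts):
    """Assumed definition: count of a word divided by the total word count."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def print_report(counts):
    """Print one line per word: the word, its count, and its probability."""
    probs = word_probabilities(counts)
    for word, count in counts.most_common():
        print(f"{word}\t{count}\t{probs[word]:.4f}")
```

Assuming the text files live in a hypothetical `data` directory, the whole pipeline could be run as `print_report(count_words(Path("data").glob("*.txt")))`. A `Counter` is used here because it merges counts across files with a single `update` call, but a plain dictionary would work just as well.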