# **TEXT ANALYSIS**

### What is text analysis?
Text analysis, also known as text mining or text analytics, refers to the process of extracting meaningful information and insights from textual data.

### Objectives
After completing this lab, you will be able to:

- Use Python commands to perform text analysis.
- Convert text to lowercase, then count the frequency of every unique word and of a specified word.

### Setup
For this lab, you will use the following Python data types and constructs:

- Lists
- Strings
- Classes and objects

Consider a real-life scenario in which you are analyzing customer feedback for a product. You have a large dataset of customer reviews stored as strings, and you want to extract useful information from them through three tasks:

Task 1. String in lowercase: You want to pre-process the customer feedback by converting all the text to lowercase. This step standardizes the text so that words such as "LAMB" and "lamb" are counted as the same word, letting you focus on the content rather than the specific letter casing.

Task 2. Frequency of all words in a given string: After converting the text to lowercase, you want to determine the frequency of each word in the customer feedback. This information will help you identify which words are used more frequently, indicating the key aspects or topics that customers are mentioning in their reviews. By analyzing the word frequencies, you can gain insights into the most common issues raised by customers.

Task 3. Frequency of a specific word: In addition to analyzing the overall word frequencies, you want to specifically track the frequency of a particular word that is relevant to your analysis. For example, you might be interested in monitoring how often the word "reliable" appears in customer reviews to gauge customer sentiment about the product's reliability. By focusing on the frequency of a specific word, you can gain a deeper understanding of customer opinions or preferences related to that particular aspect.

By performing these tasks on the customer feedback dataset, you can gain valuable insights into customer sentiment.
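Before wrapping the logic in a class, the three tasks can be sketched with Python's built-in tools. This is only a minimal illustration: the `reviews` list is hypothetical sample data (not part of the lab), and `collections.Counter` is used here as one convenient way to count words, not as the method the lab itself follows.

```python
from collections import Counter

# Hypothetical sample reviews, invented for illustration only.
reviews = [
    "Reliable product and great battery",
    "The battery is RELIABLE but the case is weak",
    "Great price",
]

# Task 1: lowercase every review so "RELIABLE" and "reliable" match.
lowered = [review.lower() for review in reviews]

# Task 2: count every word across all reviews.
# split() with no argument splits on any run of whitespace.
word_counts = Counter(word for review in lowered for word in review.split())

# Task 3: look up one word; a Counter returns 0 for words it has not seen.
print(lowered[1])               # the battery is reliable but the case is weak
print(word_counts["reliable"])  # 2
print(word_counts["missing"])   # 0
```

These sample reviews contain no punctuation, so a plain `split()` is enough here. Real feedback usually needs punctuation handling as well, which the class in the next cell addresses.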

# define a string

givenstring="mary had a little LAMB ,LITTLE LAMB.Its fleece was white as snow!whenever mary went,mary went it made sure to follow her"

# define the class and its attribute

class TextAnalyzer(object):

    def __init__(self, text):
        # replace common punctuation marks with spaces
        formattedText = text.replace('.',' ').replace('!',' ').replace('?',' ').replace(',',' ')

        # make text lowercase
        formattedText = formattedText.lower()

        # store the cleaned text for the other methods to use
        self.fmtText = formattedText
        
    def freqAll(self):
        # split text into words; split() with no argument splits on any run
        # of whitespace, so repeated spaces do not produce empty strings
        wordList = self.fmtText.split()

        # Create dictionary
        freqMap = {}
        for word in set(wordList): # use set to remove duplicates in list
            freqMap[word] = wordList.count(word)

        return freqMap

    def freqOf(self,word):
        # get frequency map
        freqDict = self.freqAll()

        # return the count, or 0 if the word does not appear
        if word in freqDict:
            return freqDict[word]
        else:
            return 0

# instantiate the TextAnalyzer class by passing the given string as an argument.

analyzed = TextAnalyzer(givenstring)

# print the lowercased, cleaned text stored by the constructor

print("Formatted Text:", analyzed.fmtText)

# Call the function that counts the frequency of all unique words from the data
freqMap = analyzed.freqAll()
print(freqMap)

# Call the function that counts the frequency of a specific word

word = "mary"
frequency = analyzed.freqOf(word)
print("The word",word,"appears",frequency,"times.")

Formatted Text: mary had a little lamb  little lamb its fleece was white as snow whenever mary went mary went it made sure to follow her
{'fleece': 1, 'to': 1, 'whenever': 1, 'sure': 1, 'snow': 1, 'little': 2, 'as': 1, 'was': 1, 'her': 1, 'mary': 3, 'had': 1, 'follow': 1, 'lamb': 2, 'a': 1, 'white': 1, 'went': 2, 'made': 1, 'it': 1, 'its': 1}
The word mary appears 3 times.
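The constructor above replaces only four punctuation marks (`.`, `!`, `?`, and `,`), so other symbols would remain attached to words. As a hedged alternative, the sketch below extracts runs of letters with a regular expression and uses `Counter.most_common` to find the most frequent word. The helper name `tokenize` and the variable `given_string` are illustrative choices, and the pattern `[a-z]+` assumes plain English text without digits, apostrophes, or accented letters.

```python
import re
from collections import Counter

# Same sample text as the lab, repeated here so the sketch is self-contained.
given_string = ("mary had a little LAMB ,LITTLE LAMB.Its fleece was white as snow!"
                "whenever mary went,mary went it made sure to follow her")

def tokenize(text):
    """Lowercase the text and return each run of letters a-z as a word."""
    return re.findall(r"[a-z]+", text.lower())

counts = Counter(tokenize(given_string))

print(counts.most_common(1))  # [('mary', 3)]
print(counts["lamb"])         # 2
print(len(counts))            # 19 unique words
```

Because the pattern matches only letters, any punctuation or repeated whitespace is skipped automatically, and no empty strings end up in the counts.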
