# Assignment 2:  Unlocking Statistical Insights with Python

In this assignment, you'll practice statistical analysis in Python, moving beyond theoretical concepts to practical application. You'll learn how to calculate key statistical measures and use them for real-world tasks like outlier detection.

This assignment will guide you through:

* **Revising the Concepts and Equations:** You'll review the definition and formula behind each statistic before implementing it.
* **Calculating Basic Statistics:** You'll implement code from scratch to compute essential statistics like mean, median, mode, variance, standard deviation, range, and interquartile range (IQR).
* **Exploring Some Libraries:** Optionally, use powerful libraries like NumPy and SciPy.


#### General Instructions 

Please adhere to the following guidelines:

- **Code Clarity:** Your code should be well-formatted, easy to understand, and include meaningful variable names.
- **Docstrings:**  Use docstrings to document your functions and explain their purpose, arguments, and return values.
- **Testing:** Use the provided example data to demonstrate that your code works.
- **NOTE:** Answer in the same notebook with the given examples.


### **Q1- Text Analyzer**

* **Objective:**  Create a program that analyzes text for various characteristics. 
* **Requirements:**
    * Ask the user to input a paragraph of text.
    * Use loops to:
        * Count the number of words in the text.
        * Count the number of sentences in the text.
        * Calculate the average word length.
        * Find the most frequently used word in the text.
        * (Optional) Identify the five most common words. 
* **Hint:** You'll need to use string manipulation techniques (like splitting the text into words and sentences) along with loop logic to track counts and comparisons.

>**You can test the code using this paragraph:** "Python is a high-level, interpreted programming language. It's dynamically typed, meaning you don't have to explicitly declare the type of a variable. It's widely used for web development, data analysis, machine learning, and automation. Python is known for its readability and ease of use."


In [1]:
from Analizer import TextAnalyzer as TA

In [2]:
print("Enter your paragraph below:\n")
text = input()
analyzer = TA(text)
analyzer.analyze()

Enter your paragraph below:


Text Analysis Result:
Total Words         : 44
Total Sentences     : 4
Average Word Length : 5.32
Most Frequent Word  : 'its' (3 times)
Top 5 Most Common Words:
   • its        - 3 times
   • is         - 2 times
   • a          - 2 times
   • python     - 2 times
   • of         - 2 times
Original content >>>
Python is a high-level, interpreted programming language. It's dynamically typed, meaning you don't have to explicitly declare the type of a variable. It's widely used for web development, data analysis, machine learning, and automation. Python is known for its readability and ease of use.
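
The requirements for this question can be sketched with plain loops as follows. This is a minimal illustration, not the `TextAnalyzer` class imported above, whose source is not shown. The function name `analyze_text` and the cleaning rule (lowercase each token and keep only letters and digits, so "It's" is counted as "its") are assumptions, chosen because they are consistent with the result printed above.

```python
def analyze_text(text):
    """Return basic text statistics computed with explicit loops.

    Hypothetical helper for illustration only.
    """
    # Sentences: count terminating punctuation marks.
    # Limitation: an ellipsis ("...") would be counted three times.
    sentence_count = 0
    for ch in text:
        if ch in ".!?":
            sentence_count += 1

    # Words: split on whitespace, then normalize each token.
    # Assumption: lowercase and drop every non-alphanumeric character.
    words = []
    for token in text.split():
        cleaned = "".join(c for c in token.lower() if c.isalnum())
        if cleaned:
            words.append(cleaned)

    # Accumulate total length and per-word frequencies in one pass.
    total_length = 0
    counts = {}
    for word in words:
        total_length += len(word)
        counts[word] = counts.get(word, 0) + 1

    average_length = total_length / len(words) if words else 0.0

    # Rank by descending count; ties keep first-seen order (stable sort).
    ranked = sorted(counts.items(), key=lambda item: -item[1])

    return {
        "words": len(words),
        "sentences": sentence_count,
        "average_word_length": average_length,
        "most_frequent": ranked[0] if ranked else None,
        "top_five": ranked[:5],
    }


paragraph = (
    "Python is a high-level, interpreted programming language. "
    "It's dynamically typed, meaning you don't have to explicitly declare "
    "the type of a variable. It's widely used for web development, data "
    "analysis, machine learning, and automation. Python is known for its "
    "readability and ease of use."
)
result = analyze_text(paragraph)
print(result)
```

Words with equal counts may appear in a different order than in the output above, because tie-breaking depends on the implementation.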


---

### **Q2- Basic Statistics Calculator**

**Objective:** Create a Python program that calculates basic statistics (mean, median, mode, range, variance, and standard deviation) for a set of numbers entered by the user.

**Requirements:**
Complete each statistical calculation by following the TODO instructions below. It is mandatory to implement all calculations from scratch. Using libraries in combination with your own implementation is considered an added advantage.

In [3]:
from Analizer import Stats

In [14]:
# Sample temperature readings used throughout this question
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19, 
                22, 26, 23, 21, 23, 17, 20, 18]
TempSample = Stats(temperatures)
print(TempSample.get_DataSort())

[-5, 0, 17, 18, 18, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23, 24, 24, 24, 25, 25, 26, 50, 55]


In [18]:
# --- Mean Calculation ---
# TODO: Calculate the mean of the numbers in the 'temperatures' list.
#       Handle the case where the list is empty (return None or print a message).
#       Print the mean using an f-string.
# Your code here:
print(f"the mean of the Temperature sample is >>> {TempSample.get_mean():.2f}")

the mean of the Temperature sample is >>> 21.88
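
For reference, the mean step can be written from scratch roughly as below. This is a minimal sketch, not the `Stats.get_mean` implementation (its source is not shown); the function name `mean` is chosen for illustration.

```python
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def mean(values):
    """Return the arithmetic mean, or None for an empty list."""
    if not values:
        return None
    total = 0
    for value in values:  # explicit loop instead of sum(), per "from scratch"
        total += value
    return total / len(values)


print(f"mean = {mean(temperatures):.2f}")
```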


In [19]:
# --- Median Calculation ---
# TODO: Calculate the median of the numbers in the 'temperatures' list.
#       Sort the 'temperatures' list first.
#       Handle both even and odd length lists.
#       Print the median using an f-string.
# Your code here:
print(f"the Median of the Temperature sample is >>> {TempSample.get_median()}")

the Median of the Temperature sample is >>> 21
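
A from-scratch median that handles both odd and even lengths might look like the sketch below. The function name `median` is illustrative and is not taken from the `Stats` class.

```python
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def median(values):
    """Return the median, or None for an empty list."""
    if not values:
        return None
    data = sorted(values)  # sorted() leaves the caller's list untouched
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]                    # odd length: the middle element
    return (data[mid - 1] + data[mid]) / 2  # even length: mean of the two middle elements


print(f"median = {median(temperatures)}")
```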


In [20]:
# --- Mode Calculation ---
# TODO: Calculate the mode (most frequent number) in the 'temperatures' list.
#       If there are multiple modes, you can return any one of them.
#       If there is no mode (all numbers appear once), you can return None or print a message.
#       Print the mode using an f-string.
# Your code here:
print(f"the mode of the Temperature sample is >>> {TempSample.get_mode()}")

the mode of the Temperature sample is >>> [20]
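
The mode can be found by counting occurrences in a dictionary. The sketch below returns a list of every value tied for the highest count, which matches the list-shaped result printed above, and returns `None` when no value repeats. Both conventions are assumptions about the intended behavior, not a description of `Stats.get_mode`.

```python
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def mode(values):
    """Return a sorted list of the most frequent values, or None if there is no mode."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    if not counts:
        return None
    highest = max(counts.values())
    if highest == 1:  # every value appears exactly once
        return None
    return sorted(v for v, c in counts.items() if c == highest)


print(f"mode = {mode(temperatures)}")
```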


In [22]:
# --- Variance Calculation ---
# TODO: Calculate the variance of the numbers in the 'temperatures' list.
#       Use the sample variance formula (divide by n-1; the population formula divides by n).
#       Return 0 if the list has fewer than 2 elements.
#       Print the variance using an f-string.
# Your code here:
print(f"the Variance of the Temperature sample is >>> {TempSample.get_variance():.2f}")

the Variance of the Temperature sample is >>> 98.29


In [23]:
# --- Standard Deviation Calculation ---
# TODO: Calculate the standard deviation of the numbers in the 'temperatures' list.
#       Take the square root of the variance calculated above.
#       Print the standard deviation using an f-string.
# Your code here:
print(f"the Standard Deviation of the Temperature sample is >>> {TempSample.get_std_dev():.2f}")

the Standard Deviation of the Temperature sample is >>> 9.91
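
The choice of divisor matters here. For this data, the values printed above (98.29 and 9.91) correspond to dividing by n (population formulas), while dividing by n-1 as the TODO requests gives about 101.36 and 10.07. The sketch below computes both so the difference is visible; the `sample` keyword argument is an illustrative assumption, not part of the `Stats` class.

```python
import math

temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def variance(values, sample=True):
    """Return the variance; divide by n-1 if sample is True, else by n."""
    n = len(values)
    if n < 2:
        return 0.0
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return squared_deviations / divisor


def std_dev(values, sample=True):
    """Return the standard deviation as the square root of the variance."""
    return math.sqrt(variance(values, sample=sample))


print(f"sample variance     = {variance(temperatures):.2f}")
print(f"population variance = {variance(temperatures, sample=False):.2f}")
print(f"sample std dev      = {std_dev(temperatures):.2f}")
print(f"population std dev  = {std_dev(temperatures, sample=False):.2f}")
```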


In [24]:
# --- Range Calculation ---
# TODO: Calculate the range (difference between max and min) of the 'temperatures' list.
#       Print the range using an f-string.
# Your code here:
print(f"the Range of the Temperature sample is >>> {TempSample.get_range()}")

the Range of the Temperature sample is >>> 60
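
The range needs only the smallest and largest values, which a single pass can track without calling `min()` or `max()`. A minimal sketch, with the illustrative name `value_range`:

```python
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def value_range(values):
    """Return max - min, or None for an empty list."""
    if not values:
        return None
    smallest = largest = values[0]
    for value in values[1:]:
        if value < smallest:
            smallest = value
        elif value > largest:
            largest = value
    return largest - smallest


print(f"range = {value_range(temperatures)}")
```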


In [25]:
# --- IQR Calculation ---
# TODO: Calculate the Interquartile Range (IQR) of the 'temperatures' list.
#       Reuse a percentile function if you have already defined one, or define a new one.
#       Calculate Q1 (25th percentile) and Q3 (75th percentile).
#       IQR = Q3 - Q1
#       Print the IQR using an f-string.
# Your code here:
print(f"the IQR of the Temperature sample is >>> {TempSample.get_iqr()}")

the IQR of the Temperature sample is >>> 4.0
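
Quartiles have several competing definitions, so the IQR depends on the convention used. The sketch below uses the median-of-halves method (for an odd-length list the overall median is excluded from both halves), which reproduces the 4.0 shown above. That this is the method used by `Stats.get_iqr` is an inference from the output, not something the source states. Other conventions, such as linear interpolation between order statistics, can give a different value for the same data. The 1.5 × IQR fence rule at the end is one common way to apply the IQR to the outlier detection mentioned in the introduction.

```python
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def _median_sorted(data):
    """Median of an already sorted, non-empty list."""
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle element when n is odd so both halves have equal size.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return _median_sorted(lower), _median_sorted(upper)


def iqr(values):
    """Return Q3 - Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1


def iqr_outliers(values, k=1.5):
    """Return sorted values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = quartiles(values)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [x for x in sorted(values) if x < low or x > high]


print(f"Q1, Q3   = {quartiles(temperatures)}")
print(f"IQR      = {iqr(temperatures)}")
print(f"outliers = {iqr_outliers(temperatures)}")
```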


In [27]:
# --- Skewness Calculation ---
# TODO: Calculate the skewness of the 'temperatures' list.
#       You can use a simple method for skewness calculation.
#       Print the skewness using an f-string.
# Your code here:
print(f"the Skewness of the Temperature sample is >>> {TempSample.get_skewness()}  : {'Positive Skewness' if TempSample.get_skewness()>0 else 'Negative Skewness'}")

the Skewness of the Temperature sample is >>> 0.8864658781156218  : Positive Skewness


In [28]:
# --- Kurtosis Calculation ---
# TODO: Calculate the Kurtosis of the 'temperatures' list.
#       You can use a simple method for Kurtosis calculation.
#       Print the Kurtosis using an f-string.
# Your code here:
print(f"the Kurtosis of the Temperature sample is >>> {TempSample.get_kurtosis()}")

the Kurtosis of the Temperature sample is >>> 5.0983225933544105
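
One simple approach to skewness and kurtosis is based on central moments. With m_k denoting the average of (x - mean)^k over all n values, skewness is m_3 / m_2^(3/2) and excess kurtosis is m_4 / m_2^2 - 3. These population-style (divide by n) formulas agree with the values printed above, but that is an inference from the numbers, since the `Stats` source is not shown. The function names below are illustrative.

```python
temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]


def central_moment(values, k):
    """Return the k-th central moment, dividing by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (None if undefined)."""
    if not values:
        return None
    m2 = central_moment(values, 2)
    if m2 == 0:  # all values identical
        return None
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3, so a normal distribution scores about 0."""
    if not values:
        return None
    m2 = central_moment(values, 2)
    if m2 == 0:
        return None
    return central_moment(values, 4) / m2 ** 2 - 3


print(f"skewness        = {skewness(temperatures):.4f}")
print(f"excess kurtosis = {excess_kurtosis(temperatures):.4f}")
```

A positive skewness indicates a longer right tail (here driven by 50 and 55), and a positive excess kurtosis indicates heavier tails than a normal distribution.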


In [29]:
TempSample.summary()


📊 Full Statistics Summary:
----------------------------------------
Count      : 33
Min        : -5
Max        : 55
Range      : 60
Mean       : 21.88
Median     : 21
Mode       : [20]
Variance   : 98.29
Std. Dev   : 9.91
IQR        : 4.00
Skewness   : 0.8865 Positive Skewness
Kurtosis   : 5.0983
----------------------------------------
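
As an optional cross-check, most of the summary can be reproduced with Python's standard-library `statistics` module, used here as a dependency-free stand-in for NumPy or SciPy. This sketch assumes Python 3.8 or newer (for `multimode` and `quantiles`) and uses the population variants `pvariance` and `pstdev`, because those match the figures in the summary above.

```python
import statistics

temperatures = [23, 25, 20, 23, -5, 21, 18, 19, 24, 21, 19, 24, 0,
                20, 24, 55, 22, 50, 22, 20, 21, 22, 20, 25, 19,
                22, 26, 23, 21, 23, 17, 20, 18]

# quantiles() with n=4 returns the three cut points [Q1, median, Q3];
# its default method is "exclusive".
q1, q2, q3 = statistics.quantiles(temperatures, n=4)

summary = {
    "count": len(temperatures),
    "mean": statistics.mean(temperatures),
    "median": statistics.median(temperatures),
    "mode": statistics.multimode(temperatures),
    "variance": statistics.pvariance(temperatures),  # divides by n
    "std_dev": statistics.pstdev(temperatures),      # square root of pvariance
    "iqr": q3 - q1,
}

for name, value in summary.items():
    print(f"{name:<10}: {value}")
```

Swapping in `statistics.variance` and `statistics.stdev` gives the sample (n-1) versions instead.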
