# 🌊 Diving into the 🌐 world of data 

### 📥 Introduction 📊
This notebook is an interactive, menu-driven guide to the core statistical concepts used in data analysis.
It covers descriptive statistics (central tendency, dispersion, association, skewness, kurtosis), probability
(basic probability, the normal distribution, z-scores), and inferential statistics (confidence intervals,
hypothesis testing, significance levels, Type I and Type II errors).
For each topic you can either read a short explanation or solve a problem using your own data.



import math

def measure_of_central_tendency():
    print("\nMeasure of Central Tendency:")
    print("1. Read about Measure of Central Tendency")
    print("2. Solve a Measure of Central Tendency Problem")
    choice = input("Choose an option (1-2): ")
    
    if choice == "1":
        print("""
        Measures of Central Tendency include:
        - Mean: The average of all data points.
        - Median: The middle value in the dataset when arranged in order.
        - Mode: The most frequent value in the dataset.
        """)
    elif choice == "2":
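        data = list(map(int, input("Enter the dataset (e.g., 1,2,3,4): ").split(',')))
        mean = sum(data) / len(data)
        ordered = sorted(data)
        mid = len(ordered) // 2
        # For an even number of values, the median is the average of the two middle values.
        median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
        # If several values tie for most frequent, only one of them is reported.
        mode = max(set(data), key=data.count)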
        print(f"Mean: {mean}")
        print(f"Median: {median}")
        print(f"Mode: {mode}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def Measures_of_Dispersion():
    print("\nMeasure of Dispersion:")
    print("1. Read about Measures of Dispersion")
    print("2. Solve a Measures of Dispersion Problem")
    # Compare the raw string so that non-numeric input reaches the "Invalid choice" branch instead of raising.
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Measures of dispersion indicate the extent to which data points deviate from the central tendency,
        providing insight into the variability of the data.
            ○ Range: Measures the spread by subtracting the minimum value from the maximum.
            ○ Variance (σ²): The average of the squared differences from the mean, giving insight into data dispersion.
            ○ Standard Deviation (σ): The square root of variance, a common measure for the spread of data.
            ○ Interquartile Range (IQR): The range between the first quartile (Q1) and the third quartile (Q3) of a dataset. It 
            represents the spread of the middle 50% of data.
        """)
    elif choice == "2":
        data = list(map(int, input("Enter the dataset (e.g., 1,2,3,4): ").split(',')))
        n = len(data)
        mean = sum(data) / n
        variance = sum((x - mean) ** 2 for x in data) / n
        std_dev = variance ** 0.5
        sorted_data = sorted(data)
        q1_index = n // 4
        q3_index = 3 * n // 4

        if n % 4 == 0:
            q1 = (sorted_data[q1_index - 1] + sorted_data[q1_index]) / 2
            q3 = (sorted_data[q3_index - 1] + sorted_data[q3_index]) / 2
        else:
            q1 = sorted_data[q1_index]
            q3 = sorted_data[q3_index]

        iqr = q3 - q1
        Range = max(data) - min(data)
        mode = max(set(data), key=data.count)

        print(f"Mean: {mean}")
        print(f"Range: {Range}")
        print(f"Variance (σ²): {variance}")
        print(f"Standard Deviation (σ): {std_dev}")
        print(f"Interquartile Range (IQR): {iqr}")
        print(f"Mode: {mode}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def measure_of_association():
    print("\nMeasure of Association:")
    print("1. Read about Measure of Association")
    print("2. Solve a Measure of Association Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Measure of Association:
        - Covariance: It measures the direction of the linear relationship between two variables.
        - Correlation: A standardized measure of the strength and direction of the relationship between two variables.
        """)
    elif choice == "2":
        data_x = list(map(int, input("Enter the first dataset (e.g., 1,2,3,4): ").split(',')))
        data_y = list(map(int, input("Enter the second dataset (e.g., 1,2,3,4): ").split(',')))

        if len(data_x) != len(data_y):
            print("Error: The datasets must have the same length.")
            return

        n = len(data_x)
        mean_x = sum(data_x) / n
        mean_y = sum(data_y) / n

        covariance = sum((data_x[i] - mean_x) * (data_y[i] - mean_y) for i in range(n)) / n

        sum_x_squared = sum((x - mean_x) ** 2 for x in data_x)
        sum_y_squared = sum((y - mean_y) ** 2 for y in data_y)
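        # Pearson's r: population covariance divided by the product of the population standard deviations.
        std_x = (sum_x_squared / n) ** 0.5
        std_y = (sum_y_squared / n) ** 0.5
        correlation = covariance / (std_x * std_y)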

        print(f"Covariance: {covariance}")
        print(f"Correlation Coefficient: {correlation}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def skewness():
    print("\nSkewness:")
    print("1. Read about Skewness")
    print("2. Solve Skewness Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Skewness measures the asymmetry of the data distribution. 
        - Positive skew: Data is skewed right (tail on the right).
        - Negative skew: Data is skewed left (tail on the left).
        """)
    elif choice == "2":
        data = list(map(int, input("Enter the dataset (e.g., 1,2,3,4): ").split(',')))
        n = len(data)
        mean = sum(data) / n
        variance = sum((x - mean) ** 2 for x in data) / n
        std_dev = variance ** 0.5
        skewness_value = sum((x - mean) ** 3 for x in data) / (n * std_dev ** 3)
        print(f"Skewness: {skewness_value}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def kurtosis():
    print("\nKurtosis:")
    print("1. Read about Kurtosis")
    print("2. Solve Kurtosis Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
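        Kurtosis measures the "tailedness" of the data distribution. The value computed here is excess
        kurtosis (kurtosis minus 3), so a normal distribution scores 0.
        - Positive kurtosis: Distribution with heavy tails (more outliers).
        - Negative kurtosis: Distribution with light tails (fewer outliers).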
        """)
    elif choice == "2":
        data = list(map(int, input("Enter the dataset (comma-separated values): ").split(',')))
        n = len(data)
        mean = sum(data) / n
        variance = sum((x - mean) ** 2 for x in data) / n
        std_dev = variance ** 0.5
        kurtosis_value = sum((x - mean) ** 4 for x in data) / (n * std_dev ** 4) - 3
        print(f"Kurtosis: {kurtosis_value}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def probability_basics():
    print("\nProbability Basics:")
    print("1. Read about Probability Basics")
    print("2. Solve Probability Basics Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Probability is the measure of the likelihood of an event occurring. It is calculated as:
        - P(E) = Number of favorable outcomes / Total number of possible outcomes.
        """)
    elif choice == "2":
        event = int(input("Enter the number of favorable outcomes: "))
        total_outcomes = int(input("Enter the total number of outcomes: "))
        
        if total_outcomes == 0:
            print("Error: Total outcomes cannot be zero.")
            return
        
        probability = event / total_outcomes
        print(f"Probability: {probability}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def z_score():
    print("\nZ-Score:")
    print("1. Read about Z-Score")
    print("2. Solve Z-Score Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Z-Score: It represents the number of standard deviations a data point is from the mean.
        Formula: Z = (X - μ) / σ
        """)
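    elif choice == "2":
        x = float(input("Enter the value (X): "))
        mean = float(input("Enter the mean (μ): "))
        std_dev = float(input("Enter the standard deviation (σ): "))

        if std_dev <= 0:
            print("Error: Standard deviation must be positive.")
            return

        print(f"Z-Score: {(x - mean) / std_dev}")
    else:
        print("Invalid choice. Please select either 1 or 2.")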

def normal_distribution():
    print("\nNormal Distribution:")
    print("1. Read about Normal Distribution")
    print("2. Solve Normal Distribution Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
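        Normal Distribution: It is a symmetric bell-shaped curve that represents the distribution of many types of data.
        Formula (probability density): f(X) = (1 / (σ√(2π))) * e^(-0.5 * ((X - μ) / σ)^2)
        """)
    elif choice == "2":
        mean = float(input("Enter the mean (μ): "))
        std_dev = float(input("Enter the standard deviation (σ): "))
        x = float(input("Enter the value (x) to calculate the probability for: "))

        if std_dev <= 0:
            print("Error: Standard deviation must be positive.")
            return

        # Evaluate the normal probability density function at x.
        density = math.exp(-0.5 * ((x - mean) / std_dev) ** 2) / (std_dev * math.sqrt(2 * math.pi))
        print(f"Probability Density (Normal Distribution): {density}")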
    else:
        print("Invalid choice. Please select either 1 or 2.")

def estimating_population_parameters():
    print("\nEstimating Population Parameters:")
    print("1. Read about Estimating Population Parameters")
    print("2. Solve Estimating Population Parameters Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Estimating Population Parameters involves calculating confidence intervals to estimate unknown parameters, 
        such as the population mean, based on sample data.
        Formula: CI = x̄ ± Z * (s / √n)
        """)
    elif choice == "2":
        sample_mean = float(input("Enter the sample mean (x̄): "))
        sample_size = int(input("Enter the sample size (n): "))
        sample_std_dev = float(input("Enter the sample standard deviation (s): "))
        
        margin_of_error = 1.96 * (sample_std_dev / (sample_size ** 0.5))  # for 95% confidence
        confidence_interval_lower = sample_mean - margin_of_error
        confidence_interval_upper = sample_mean + margin_of_error
        
        print(f"Confidence Interval: ({confidence_interval_lower}, {confidence_interval_upper})")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def hypothesis_testing():
    print("\nHypothesis Testing:")
    print("1. Read about Hypothesis Testing")
    print("2. Solve Hypothesis Testing Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Hypothesis Testing involves testing an assumption regarding a population parameter using sample data.
        Formula for Z-test: Z = (x̄ - μ) / (σ / √n)
        """)
    elif choice == "2":
        sample_mean = float(input("Enter the sample mean (x̄): "))
        population_mean = float(input("Enter the population mean (μ): "))
        sample_std_dev = float(input("Enter the sample standard deviation (s): "))
        sample_size = int(input("Enter the sample size (n): "))
        
        z_score = (sample_mean - population_mean) / (sample_std_dev / math.sqrt(sample_size))
        
        print(f"Z-Score: {z_score}")
        if abs(z_score) > 1.96:  # For 95% confidence
            print("Reject the null hypothesis.")
        else:
            print("Fail to reject the null hypothesis.")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def set_significance_level():
    print("\nSet the Significance Level (α):")
    print("1. Read about Significance Level")
    print("2. Solve Significance Level Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        The significance level (α) is the probability of rejecting the null hypothesis when it is true.
        Common values are 0.05, 0.01, or 0.10.
        """)
    elif choice == "2":
        alpha = float(input("Enter the significance level (α) (e.g., 0.05): "))
        
        if alpha <= 0 or alpha >= 1:
            print("Error: Significance level should be between 0 and 1.")
            return
        
        print(f"Significance Level (α): {alpha}")
    else:
        print("Invalid choice. Please select either 1 or 2.")

def type_i_and_type_ii_error():
    print("\nType I and Type II Error:")
    print("1. Read about Type I and Type II Error")
    print("2. Solve Type I and Type II Error Problem")
    choice = input("Choose an option (1-2): ")

    if choice == "1":
        print("""
        Type I Error (α) occurs when we reject a true null hypothesis.
        Type II Error (β) occurs when we fail to reject a false null hypothesis.
        Power of the test (1 - β) is the probability of correctly rejecting the null hypothesis.
        """)
    elif choice == "2":
        alpha = float(input("Enter the significance level (α): "))
        beta = float(input("Enter the probability of Type II error (β): "))
        
        print(f"Type I Error (α): {alpha}")
        print(f"Type II Error (β): {beta}")
        print("Power of the test (1 - β):", 1 - beta)
    else:
        print("Invalid choice. Please select either 1 or 2.")
print("Diving Into the World of the Data")
while True:
    choice=int(input("""
    1.Descriptive Statistics
    2.Probability
    3.Inferential Statistics
    4.Exit"""))
    if choice==1:
              a=int(input("""
               1.Measure of Central Tendency
               2.Measures of Dispersion
               3.Measure of Association
               4.Skewness
               5.Kurtosis
               """))
              if a==1:
                  measure_of_central_tendency()
              elif a==2:
                  Measures_of_Dispersion()
              elif a==3:
                  measure_of_association()
              elif a==4:
                  skewness()
              elif a==5:
                  kurtosis()
              else:
                  print("Invalid choice")
    elif choice==2:
              a=int(input("""
               1.Probability Basics
               2.Normal Distribution
               3.Z-Score
               """))
              if a==1:
                  probability_basics()
              elif a==2:
                  normal_distribution()
              elif a==3:
                  z_score()
              else:
                  print("Invalid choice")
            
    elif choice==3:
              a=int(input("""
               1. Estimating Population Parameters
               2.Hypothesis Testing
               3.Set the Significance Level (α)
               4.Type I and Type II Error
               5.Central Limit Theorem
               """))
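              if a==1:
                  estimating_population_parameters()
              elif a==2:
                  hypothesis_testing()
              elif a==3:
                  set_significance_level()
              elif a==4:
                  type_i_and_type_ii_error()
              elif a==5:
                  # No Central Limit Theorem function is defined in this notebook yet.
                  print("Central Limit Theorem is not available yet.")
              else:
                  print("Invalid choice")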
    elif choice==4:
        print("Thank you for spending time!")
        break
    else:
                print("Invalid choice")
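
As a cross-check for the descriptive-statistics options in the cell above, the sketch below computes the same quantities with Python's standard `statistics` module. The helper name `describe` and the sample dataset are illustrative assumptions, not part of the original program. It uses `pvariance` and `pstdev` because the notebook's formulas divide by n (population form) rather than n - 1.

```python
import statistics

def describe(data):
    """Return common descriptive statistics for a list of numbers (hypothetical helper)."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),       # averages the two middle values when n is even
        "modes": statistics.multimode(data),     # every most-frequent value, not just one
        "range": max(data) - min(data),
        "variance": statistics.pvariance(data),  # population variance (divides by n)
        "std_dev": statistics.pstdev(data),      # population standard deviation
    }

# Illustrative dataset with an even number of values.
summary = describe([1, 2, 2, 3, 4, 6])
for name, value in summary.items():
    print(f"{name}: {value}")
```

With an even-length dataset like this one, the median falls between the two middle values (2 and 3), which is the case a single-index lookup misses.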


Diving Into the World of the Data



    1.Descriptive Statistics
    2.Probability
    3.Inferential Statistics
    4.Exit 3

               1. Estimating Population Parameters
               2.Hypothesis Testing
               3.Set the Significance Level (α)
               4.Type I and Type II Error
               5.Central Limit Theorem
                2



Hypothesis Testing:
1. Read about Hypothesis Testing
2. Solve Hypothesis Testing Problem


Choose an option (1-2):  2
Enter the sample mean (x̄):  46
Enter the population mean (μ):  46
Enter the sample standard deviation (s):  846
Enter the sample size (n):  56


Z-Score: 0.0
Fail to reject the null hypothesis.
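
The run above can be reproduced outside the menu with a few lines. This is a minimal sketch, assuming a two-sided one-sample Z-test; the helper name `one_sample_z_test` is hypothetical, and the p-value step is an addition that the original program does not perform (it compares |Z| with 1.96 instead).

```python
import math

def one_sample_z_test(sample_mean, population_mean, std_dev, n):
    """Return (z, two-sided p-value) for a one-sample Z-test (hypothetical helper)."""
    z = (sample_mean - population_mean) / (std_dev / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Same inputs as the recorded session: x̄ = 46, μ = 46, s = 846, n = 56.
z, p_value = one_sample_z_test(46, 46, 846, 56)
print(f"Z-Score: {z}")
print(f"Two-sided p-value: {p_value}")
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```

Because the sample mean equals the hypothesized mean, the numerator is zero and the test cannot reject the null hypothesis regardless of the standard deviation or sample size.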



    1.Descriptive Statistics
    2.Probability
    3.Inferential Statistics
    4.Exit 2

               1.Probability Basics
               2.Normal Distribution
               3.Z-Score
                2



Normal Distribution:
1. Read about Normal Distribution
2. Solve Normal Distribution Problem


Choose an option (1-2):  2
Enter the mean (μ):  145
Enter the standard deviation (σ):  56
Enter the value (x) to calculate the probability for:  5


Probability Density (Normal Distribution): 0.0003130053659565811
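
The value printed above is a probability density, not a probability: for a continuous distribution, probabilities come from the cumulative distribution function (CDF). The sketch below evaluates both for the same inputs. The helper names `normal_pdf` and `normal_cdf` are hypothetical, and the CDF is an addition that the recorded session does not compute.

```python
import math

def normal_pdf(x, mean, std_dev):
    """Density of a normal distribution at x (hypothetical helper)."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2 * math.pi))

def normal_cdf(x, mean, std_dev):
    """P(X <= x) for a normal distribution, via the error function (hypothetical helper)."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return 0.5 * (1 + math.erf((x - mean) / (std_dev * math.sqrt(2))))

# Same inputs as the recorded session: μ = 145, σ = 56, x = 5.
density = normal_pdf(5, 145, 56)
probability = normal_cdf(5, 145, 56)
print(f"Density at x = 5: {density}")
print(f"P(X <= 5): {probability}")
```

Here x = 5 lies 2.5 standard deviations below the mean, so both the density and the cumulative probability are small.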



    1.Descriptive Statistics
    2.Probability
    3.Inferential Statistics
    4.Exit 4


Thankyou for spending time!
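
The Measure of Association option is not exercised in the session above. As a quick check of its formulas, the sketch below computes the population covariance and Pearson's correlation coefficient for two small lists. The helper name `covariance_and_correlation` and the sample data are illustrative assumptions.

```python
import math

def covariance_and_correlation(xs, ys):
    """Return (population covariance, Pearson's r) for two equal-length lists (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("datasets must have the same length, with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined when a dataset has zero variance")
    # Dividing by both standard deviations rescales the covariance into [-1, 1].
    return covariance, covariance / (std_x * std_y)

# Illustrative data: y is exactly 2 * x, a perfect positive linear relationship.
covariance, r = covariance_and_correlation([1, 2, 3, 4], [2, 4, 6, 8])
print(f"Covariance: {covariance}")
print(f"Correlation Coefficient: {r}")
```

A correlation near 1 or -1 indicates a strong linear relationship, while the covariance alone depends on the units of the data.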


-2.449489742783178


## Conclusion 📉
The menu-driven program above offers a systematic way to practice statistical concepts: each topic pairs a short explanation with a calculator that works on your own data. This notebook serves as a foundation for data enthusiasts who want to analyze and interpret data and make data-driven decisions. For those eager to dive deeper, the Central Limit Theorem option in the menu is still to be implemented.
