# Statistics Assignment

## Q1. What is Statistics?
Statistics is the branch of mathematics concerned with collecting, organizing, analyzing, interpreting, and presenting data. Its methods support data-driven decisions, for example by summarizing observations or by estimating how well a pattern seen in a sample generalizes to a population.

## Q2. Define the different types of statistics and give an example of when each type might be used.
There are two main types of statistics: Descriptive and Inferential.

1. **Descriptive Statistics**: These summarize and describe the main features of a dataset. Examples include measures of central tendency (mean, median, mode) and measures of variability (range, variance, standard deviation). For example, summarizing the average test scores of a class.

2. **Inferential Statistics**: These use a random sample of data taken from a population to make inferences about the population. Examples include hypothesis testing, confidence intervals, and regression analysis. For example, predicting election results based on a sample of voters.
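
The contrast between the two types can be sketched with Python's standard `statistics` module. The scores below are hypothetical, and the interval uses a normal approximation as a simplifying assumption; with a sample this small, a t-distribution would give a somewhat wider interval.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of test scores drawn from a larger population.
sample_scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 77]

# Descriptive: summarize the sample itself.
sample_mean = statistics.mean(sample_scores)
sample_sd = statistics.stdev(sample_scores)  # sample standard deviation (divides by n - 1)

# Inferential: estimate the population mean with an approximate 95% interval.
# Assumption: normal approximation (z critical value) for simplicity.
z = NormalDist().inv_cdf(0.975)
margin = z * sample_sd / math.sqrt(len(sample_scores))
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"Sample mean: {sample_mean:.2f}, sample SD: {sample_sd:.2f}")
print(f"Approximate 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first two values describe only the ten observed scores; the interval is a statement about the unobserved population they were drawn from.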

## Q3. What are the different types of data and how do they differ from each other? Provide an example of each type of data.
Data can be classified into two main types: Qualitative (Categorical) and Quantitative (Numerical).

1. **Qualitative Data**: These describe categories or groups and are non-numeric.
   - **Nominal**: Categories with no inherent order (e.g., types of fruits: apple, banana, cherry).
   - **Ordinal**: Categories with a specific order (e.g., rankings: first, second, third).

2. **Quantitative Data**: These represent numerical values.
   - **Discrete**: Countable values (e.g., number of students in a class).
   - **Continuous**: Measurable values that can take any value within a range (e.g., height of students).

## Q4. Categorize the following datasets with respect to quantitative and qualitative data types:
1. **Grading in exam: A+, A, B+, B, C+, C, D, E**
   - Qualitative (Ordinal)

2. **Colour of mangoes: yellow, green, orange, red**
   - Qualitative (Nominal)

3. **Height data of a class: [178.9, 179, 179.5, 176, 177.2, 178.3, 175.8,...]**
   - Quantitative (Continuous)

4. **Number of mangoes exported by a farm: [500, 600, 478, 672, ...]**
   - Quantitative (Discrete)

## Q5. Explain the concept of levels of measurement and give an example of a variable for each level.
Levels of measurement describe how much information a variable's values carry, which in turn determines the comparisons and arithmetic that are meaningful for that variable.

1. **Nominal**: Categories without a specific order. Example: Eye color (blue, brown, green).
2. **Ordinal**: Categories with a specific order but no fixed interval. Example: Customer satisfaction rating (satisfied, neutral, dissatisfied).
3. **Interval**: Numeric values with a specific order and equal intervals, but no true zero. Example: Temperature in Celsius.
4. **Ratio**: Numeric values with a specific order, equal intervals, and a true zero. Example: Weight (kg).
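
The practical difference between the interval and ratio levels shows up when units change. The sketch below uses illustrative values and hypothetical helper names: differences survive a Celsius-to-Kelvin conversion, but ratios do not, because 0 °C is not a true zero. For weight, ratios are preserved across units.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate conversion factor)."""
    return kg * 2.20462


# Interval scale: differences are meaningful, ratios are not.
c_diff = 20 - 10
k_diff = celsius_to_kelvin(20) - celsius_to_kelvin(10)
c_ratio = 20 / 10
k_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

# Ratio scale: the ratio is the same in any unit because zero means "none".
kg_ratio = 20 / 10
lb_ratio = kg_to_lb(20) / kg_to_lb(10)

print(f"Temperature difference: {c_diff} C vs {k_diff:.2f} K")
print(f"Temperature ratio: {c_ratio:.3f} (Celsius) vs {k_ratio:.3f} (Kelvin)")
print(f"Weight ratio: {kg_ratio:.3f} (kg) vs {lb_ratio:.3f} (lb)")
```

Saying that 20 °C is "twice as hot" as 10 °C is therefore an artifact of the scale, while saying that 20 kg is twice as heavy as 10 kg is not.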

## Q6. Why is it important to understand the level of measurement when analyzing data? Provide an example to illustrate your answer.
Understanding the level of measurement is crucial because it determines which statistical methods are appropriate. For example, calculating the mean makes sense for interval and ratio data but not for nominal or ordinal data.

Example: For customer satisfaction (ordinal data), the gaps between categories are not known to be equal, so a mean rating is hard to interpret. The median or mode is a more suitable summary.
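
A minimal sketch of this idea, using hypothetical survey responses: the categories are mapped to ranks only to locate the middle response, and the result is reported as a category rather than as a number.

```python
import statistics

# Ordered categories, from lowest to highest satisfaction.
LEVELS = ["dissatisfied", "neutral", "satisfied"]

# Hypothetical survey responses.
responses = [
    "satisfied", "neutral", "satisfied", "dissatisfied",
    "satisfied", "neutral", "dissatisfied",
]

# Mode: the most frequent category (valid for nominal and ordinal data).
mode_response = statistics.mode(responses)

# Median: rank the responses by the category order and take the middle one.
# median_low always returns an observed rank, never an average of two ranks.
ranks = [LEVELS.index(r) for r in responses]
median_response = LEVELS[statistics.median_low(ranks)]

print(f"Mode: {mode_response}")
print(f"Median: {median_response}")
```

Using `median_low` rather than `median` is a deliberate choice here: with an even number of responses, averaging two ranks would reintroduce the equal-interval assumption that ordinal data does not support.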

## Q7. How is the nominal data type different from the ordinal data type?
- **Nominal Data**: Categories with no inherent order (e.g., blood types: A, B, AB, O).
- **Ordinal Data**: Categories with a meaningful order, but the intervals between them are not necessarily equal (e.g., class grades: A, B, C, D).
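
In code, the difference comes down to which operations are meaningful. The sketch below uses hypothetical observations: nominal values can only be counted and compared for equality, while ordinal values can also be sorted and ranked once their order is stated explicitly.

```python
from collections import Counter

# Nominal: counting is meaningful, ordering is not.
blood_types = ["A", "O", "B", "O", "AB", "A", "O"]
blood_counts = Counter(blood_types)
most_common_type = blood_counts.most_common(1)[0][0]

# Ordinal: the order is part of the data, so it is declared explicitly
# (lowest to highest) instead of relying on alphabetical order.
GRADE_ORDER = ["D", "C", "B", "A"]
grades = ["B", "A", "D", "C", "A", "B"]
sorted_grades = sorted(grades, key=GRADE_ORDER.index)
best_grade = max(grades, key=GRADE_ORDER.index)

print(f"Most common blood type: {most_common_type}")
print(f"Grades from lowest to highest: {sorted_grades}")
print(f"Best grade: {best_grade}")
```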

## Q8. Which type of plot can be used to display data in terms of range?
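Box plots (box-and-whisker plots) display data in terms of range. The box spans the first to third quartile with the median marked inside, the whiskers extend toward the minimum and maximum, and points beyond the whiskers are drawn as outliers.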
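
The quantities a box plot draws can be computed directly. The sketch below uses hypothetical data and assumes the common 1.5 × IQR rule for outliers and the "inclusive" quartile method; plotting libraries may use slightly different quartile definitions, so their boxes can differ a little from these numbers.

```python
import statistics

# Hypothetical measurements with one unusually large value.
data = [12, 15, 17, 19, 21, 22, 24, 26, 29, 58]

# Quartiles: the box spans q1 to q3, with the median drawn inside it.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points more than 1.5 * IQR outside the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Whiskers reach the most extreme points that are not outliers.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"Whiskers: {whisker_low} to {whisker_high}")
print(f"Outliers: {outliers}")
```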

## Q9. Describe the difference between descriptive and inferential statistics. Give an example of each type of statistics and explain how they are used.
- **Descriptive Statistics**: Summarize and describe data features (e.g., mean, median, mode). Example: Calculating the average test score in a class.
- **Inferential Statistics**: Make predictions or inferences about a population based on a sample (e.g., hypothesis testing). Example: Estimating the average height of all students in a school based on a sample.

## Q10. What are some common measures of central tendency and variability used in statistics? Explain how each measure can be used to describe a dataset.
- **Central Tendency**:
  - **Mean**: The average value. Example: The mean salary of employees.
  - **Median**: The middle value when data is sorted. Example: The median house price.
  - **Mode**: The most frequent value. Example: The mode of shoe sizes sold.

- **Variability**:
  - **Range**: The difference between the maximum and minimum values. Example: The range of test scores.
  - **Variance**: The average squared deviation from the mean. Example: The variance of stock prices.
  - **Standard Deviation**: The square root of variance, indicating data dispersion. Example: The standard deviation of daily temperatures.
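
Each of these measures is available in Python's standard `statistics` module. The scores below are hypothetical. The variance defined above (an average over all values) corresponds to `pvariance`; when the data is a sample from a larger population, `variance` divides by n - 1 instead and is slightly larger.

```python
import statistics

# Hypothetical test scores.
scores = [70, 85, 85, 90, 60, 75, 85, 95]

# Central tendency
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Variability
score_range = max(scores) - min(scores)
pop_variance = statistics.pvariance(scores)    # average squared deviation from the mean
pop_sd = statistics.pstdev(scores)             # square root of the population variance
sample_variance = statistics.variance(scores)  # divides by n - 1 instead of n

print(f"Mean={mean_score}, median={median_score}, mode={mode_score}")
print(f"Range={score_range}, variance={pop_variance:.2f}, SD={pop_sd:.2f}")
print(f"Sample variance={sample_variance:.2f}")
```

The standard deviation is usually the easier figure to interpret because it is in the same units as the data, whereas the variance is in squared units.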




```python
# Import necessary libraries
import seaborn as sns
import plotly.express as px

# Q1: Scatter plot for Titanic dataset using Plotly
# (load_dataset fetches the data on first use, so it needs network access)
titanic = sns.load_dataset('titanic')
fig1 = px.scatter(titanic, x='age', y='fare', title='Age vs Fare in Titanic Dataset')
fig1.show()

# Q2: Box plot for Tips dataset using Plotly
tips = px.data.tips()
fig2 = px.box(tips, x='day', y='total_bill', title='Box Plot of Total Bill by Day')
fig2.show()

# Q3: Histogram for Tips dataset using Plotly
# With y set, each bar shows the sum of total_bill for that group
fig3 = px.histogram(tips, x='sex', y='total_bill', color='day', pattern_shape='smoker', title='Histogram of Total Bill by Sex')
fig3.show()

# Q4: Scatter matrix plot for Iris dataset using Plotly
iris = px.data.iris()
fig4 = px.scatter_matrix(iris, dimensions=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], color='species', title='Scatter Matrix of Iris Dataset')
fig4.show()

# Q5: Distplot using Plotly
# Approximated here with a histogram plus a marginal box plot
fig5 = px.histogram(tips, x='total_bill', nbins=20, marginal='box', title='Distplot of Total Bill')
fig5.show()
```

