
# Statistics

## 1 What is the meaning of Six Sigma in statistics? Give proper example

In statistics, sigma (σ) denotes the standard deviation, a measure of how much a process varies.
Six Sigma is a quality management methodology focused on reducing defects or errors in processes to an extremely low level.
A process operating at the six sigma level has its nearest specification limit six standard deviations away from the process mean;
with the conventional allowance for a 1.5σ long-term shift in the mean, this corresponds to at most 3.4 defects per million opportunities (DPMO).
The methodology aims to improve the quality of processes, products, and services by minimizing variation and enhancing efficiency.

Six Sigma projects typically follow the five-phase DMAIC cycle (Define, Measure, Analyze, Improve, Control):

Define: Define the problem, project goals, and customer requirements clearly. Identify the critical-to-quality characteristics.

Measure: Measure and quantify the current performance of the process, product, or service. Establish baseline metrics and collect relevant data.

Analyze: Analyze the data to identify root causes of defects and variations in the process. Use statistical tools and techniques to identify areas for improvement.

Improve: Implement solutions and improvements to address the identified root causes. Test and validate the improvements to ensure they effectively reduce defects.

Control: Establish control mechanisms to sustain the improvements and prevent regression. Monitor the process continuously and implement corrective actions when necessary.

Here's an example to illustrate Six Sigma in action:

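Imagine a company that produces electronic devices, and one of its critical processes is soldering components onto circuit boards. However, the soldering process occasionally results in defects such as incomplete connections or solder bridges, leading to product failures. A Six Sigma project would apply the five phases to this process: define what counts as a soldering defect, measure the current defect rate, analyze the data for root causes, improve the process to remove those causes, and control the process so the defect rate stays low.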
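
To make the 3.4 defects-per-million figure concrete, here is a minimal sketch of how a defect rate could be converted to DPMO and an approximate sigma level. The inspection numbers (17 defects, 1,000 boards, 50 solder joints per board) are hypothetical, the function names are illustrative, and the conversion assumes normally distributed output with the conventional 1.5σ shift.

```python
from statistics import NormalDist


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_level(dpmo_value, shift=1.5):
    """Approximate sigma level, assuming the conventional 1.5-sigma shift."""
    yield_fraction = 1 - dpmo_value / 1_000_000  # share of defect-free opportunities
    return NormalDist().inv_cdf(yield_fraction) + shift


# Hypothetical inspection data: 17 defective joints on 1,000 boards, 50 joints per board
soldering_dpmo = dpmo(defects=17, units=1000, opportunities_per_unit=50)
print(f"DPMO: {soldering_dpmo:.0f}")
print(f"Approximate sigma level: {sigma_level(soldering_dpmo):.2f}")
print(f"Sigma level at 3.4 DPMO: {sigma_level(3.4):.2f}")
```

Under these assumptions the hypothetical soldering process sits well below six sigma, while a DPMO of 3.4 maps to a sigma level very close to 6.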

## 2 What type of data does not have a log-normal distribution or a Gaussian distribution?  Give proper example

Data that do not follow a log-normal distribution or a Gaussian (normal) distribution are typically referred to as non-normally distributed data. There are many types of non-normally distributed data, and their distribution patterns can vary widely. Here are a few examples:

Skewed Data: Skewed data have a non-symmetrical distribution where the tail of the distribution extends more to one side than the other. There are two types of skewness:

Positive Skewness: The tail of the distribution extends towards higher values.
Negative Skewness: The tail of the distribution extends towards lower values.
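Example: Income data often exhibit positive skewness, as most people earn moderate to low incomes, but a few individuals or households earn very high incomes, causing the distribution to be skewed to the right. Note that right skew alone rules out a Gaussian distribution but not a log-normal one, which is itself right-skewed; left-skewed data, by contrast, fit neither.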

Heavy-Tailed Distributions: Heavy-tailed distributions have tails that decay more slowly than the tails of a normal distribution. This means they have a higher probability of extreme values or outliers than a normal distribution.

Example: Stock market returns often exhibit heavy-tailed distributions, with occasional extreme positive or negative returns that are unlikely under a normal distribution.

Discrete Distributions: Some datasets consist of discrete values rather than continuous ones. These data may follow specific discrete probability distributions such as the Poisson distribution or the binomial distribution.

Example: The number of customers arriving at a store during a given hour follows a Poisson distribution if the arrivals are random and the average arrival rate is constant.

Bimodal or Multimodal Distributions: Bimodal or multimodal distributions have more than one peak, indicating distinct subgroups within the data.

Example: Height data from a population that includes both children and adults might exhibit a bimodal distribution, with one peak representing children's heights and another peak representing adults' heights.

Uniform Distributions: Uniform distributions have equal probabilities for all possible outcomes within a given range, resulting in a flat histogram.

Example: Rolling a fair six-sided die produces a uniform distribution of outcomes, with each side having an equal probability of 1/6.

Mixed Distributions: Mixed distributions combine elements of different distribution types. They may consist of a combination of continuous and discrete components or multiple distribution functions.

Example: A dataset of medical expenses might consist of a mixture of zero values (no expenses) and a positive skewed distribution of expenses for individuals who incurred medical costs.

These are just a few examples of non-normally distributed data. In reality, data can exhibit a wide range of distribution patterns, and it's essential to understand the specific characteristics of the data when performing statistical analysis.
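
As a rough illustration of checking data against both distributions, the sketch below computes sample skewness and excess kurtosis for simulated die rolls, once on the raw values and once on their logarithms. A Gaussian sample should give values near 0 for both statistics, and a log-normal sample should do so after the log transform. The helper name and the sample size are assumptions made for this example, and a moment check like this is only a heuristic, not a formal test.

```python
import numpy as np


def skew_and_excess_kurtosis(x):
    """Return (skewness, excess kurtosis); both are near 0 for Gaussian data."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)


rng = np.random.default_rng(0)
die = rng.integers(1, 7, size=10_000)  # simulated fair six-sided die rolls

raw_skew, raw_kurt = skew_and_excess_kurtosis(die)
log_skew, log_kurt = skew_and_excess_kurtosis(np.log(die))

print(f"raw:    skew={raw_skew:.2f}, excess kurtosis={raw_kurt:.2f}")
print(f"logged: skew={log_skew:.2f}, excess kurtosis={log_kurt:.2f}")
```

The raw rolls are roughly symmetric but far flatter than a Gaussian (strongly negative excess kurtosis), and the logged rolls are clearly left-skewed, so neither a Gaussian nor a log-normal model is a good fit.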

## 3 What is the meaning of the five-number summary in Statistics? Give proper example

The five-number summary is a descriptive statistics technique used to summarize the distribution of a dataset. It consists of five key values that represent various aspects of the dataset's distribution. These values include the minimum, first quartile (Q1), median (second quartile or Q2), third quartile (Q3), and maximum. The five-number summary is particularly useful for understanding the central tendency, spread, and skewness of a dataset.

Here's a breakdown of the five values in the five-number summary:

Minimum: The smallest value in the dataset.

First Quartile (Q1): The value below which 25% of the data falls. It represents the lower quartile or the 25th percentile.

Median (Q2): The middle value of the dataset when it is sorted in ascending order. It represents the 50th percentile. If the dataset has an odd number of observations, the median is the middle value. If the dataset has an even number of observations, the median is the average of the two middle values.

Third Quartile (Q3): The value below which 75% of the data falls. It represents the upper quartile or the 75th percentile.

Maximum: The largest value in the dataset.

The five-number summary is often depicted visually using a box plot (box-and-whisker plot), where the minimum, Q1, median, Q3, and maximum are represented as key points on the plot.

Here's an example to illustrate the concept of the five-number summary:

Consider the following dataset representing the test scores of 20 students:

65,70,72,75,76,78,80,82,83,85,86,87,88,90,91,92,93,95,97,99

To find the five-number summary:

Minimum: The smallest value is 65.

Q1 (First Quartile): The median of the lower half of the dataset (the first 10 values) is the average of 76 and 78, so Q1 = 77.

Median (Q2): The median of the entire dataset is the average of the 10th and 11th values (85 and 86), so Q2 = 85.5.

Q3 (Third Quartile): The median of the upper half of the dataset (the last 10 values) is the average of 91 and 92, so Q3 = 91.5.

Maximum: The largest value is 99.

So, the five-number summary for this dataset is: {65, 77, 85.5, 91.5, 99}.
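
The calculation can be checked with a short sketch that uses the median-of-halves convention described above. The function name `five_number_summary` is illustrative, not a library function.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd number of observations, exclude the middle value from both halves
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


scores = [65, 70, 72, 75, 76, 78, 80, 82, 83, 85,
          86, 87, 88, 90, 91, 92, 93, 95, 97, 99]
summary = five_number_summary(scores)
print(summary)  # (65, 77.0, 85.5, 91.5, 99)
```

Other quartile conventions exist. For example, `np.percentile` with its default linear interpolation gives 77.5 and 91.25 for Q1 and Q3 on this dataset, so it is worth stating which convention is used.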

## 4 What is correlation? Give an example with a dataset & graphical representation on jupyter Notebook

Correlation is a statistical measure that describes the relationship between two variables. It indicates the degree to which changes in one variable are associated with changes in another variable. Correlation values range from -1 to 1, where:

1 indicates a perfect positive correlation: As one variable increases, the other variable also increases linearly.

-1 indicates a perfect negative correlation: As one variable increases, the other variable decreases linearly.

0 indicates no correlation: There is no linear relationship between the variables.

Correlation can be calculated using various methods, with Pearson correlation being the most common for continuous variables. Spearman correlation is another option, suitable for ordinal or non-normally distributed data.

import numpy as np
import matplotlib.pyplot as plt

# Generate correlated data
np.random.seed(42)
x = np.random.normal(0, 1, 100)  # Variable 1
y = x + np.random.normal(0, 0.5, 100)  # Variable 2 with positive correlation with x

# Calculate Pearson correlation coefficient
corr_coef = np.corrcoef(x, y)[0, 1]

# Plot the data
plt.figure(figsize=(8, 6))
plt.scatter(x, y, color='blue')
plt.title(f'Correlated Data (Pearson correlation coefficient = {corr_coef:.2f})')
plt.xlabel('Variable 1')
plt.ylabel('Variable 2')
plt.grid(True)
plt.show()
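
To illustrate the difference between the Pearson and Spearman coefficients mentioned above, here is a minimal NumPy sketch on a monotonic but non-linear relationship. The helper functions are written out for illustration and assume there are no tied values; the choice of `y = exp(x)` is just an example.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))


def ranks(values):
    """Rank each value from 0 to n-1 (assumes no ties)."""
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(len(values))
    return r


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


rng = np.random.default_rng(42)
x = rng.normal(0, 1, 100)
y = np.exp(x)  # strictly increasing in x, but not linear

pearson_r = pearson(x, y)
spearman_r = spearman(x, y)
print(f"Pearson:  {pearson_r:.2f}")
print(f"Spearman: {spearman_r:.2f}")
```

Because `y` always increases with `x`, the Spearman coefficient is 1, while the Pearson coefficient is noticeably lower because the relationship is not a straight line.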

# Deep Learning



## Explain how you can implement DL in a real-world application.

Define the Problem: Clearly understand the problem you want to solve with DL. This could be image classification, natural language processing, time series prediction, etc.

Data Collection and Preprocessing: Gather relevant data for your problem domain. This could involve collecting images, text documents, sensor data, etc. Ensure your data is labeled and preprocessed appropriately. This may include tasks such as cleaning, normalization, and augmentation.

Model Selection: Choose an appropriate DL model architecture for your problem. This could be a Convolutional Neural Network (CNN) for image tasks, a Recurrent Neural Network (RNN) for sequential data, or a Transformer model for natural language processing tasks.

Model Training: Split your data into training, validation, and test sets. Train your DL model on the training data, using optimization algorithms like stochastic gradient descent (SGD) or Adam. Tune hyperparameters and monitor performance on the validation set to prevent overfitting.

Evaluation: Evaluate your trained model on the test set to assess its performance. Metrics such as accuracy, precision, recall, and F1-score are commonly used depending on the problem domain.

Deployment: Once you have a satisfactory model, deploy it into your real-world application. This could involve integrating it into a web service, mobile application, or embedded system. Ensure that the deployment environment can support the computational requirements of your model.

Monitoring and Maintenance: Continuously monitor the performance of your deployed model in the real-world application. Monitor for concept drift, data drift, and model degradation over time. Update your model periodically with new data and retrain if necessary to maintain performance.

Feedback Loop: Gather feedback from end-users and stakeholders to improve your DL model and the overall application. This feedback loop can involve refining the model architecture, collecting more diverse data, or adding new features to the application.

Ethical Considerations: Consider ethical implications related to data privacy, bias, and fairness throughout the entire process. Ensure that your DL model and application do not harm or discriminate against any individuals or groups.

By following these steps, you can effectively implement Deep Learning in a real-world application and address various problems across different domains.
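
The data-splitting and evaluation steps can be sketched in a framework-independent way. The function names, the 70/15/15 split, and the toy labels below are assumptions made for illustration; a real project would usually rely on its framework's own utilities.

```python
import numpy as np


def train_val_test_split(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Return shuffled, non-overlapping index arrays for train, validation, and test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = round(n_samples * test_frac)
    n_val = round(n_samples * val_frac)
    return idx[n_test + n_val:], idx[n_test:n_test + n_val], idx[:n_test]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels (1 = positive)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


train_idx, val_idx, test_idx = train_val_test_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))

# Toy labels and predictions, for illustration only
metrics = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```

Keeping the test indices separate from the validation indices means the final evaluation uses data that played no part in hyperparameter tuning.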


## What is the use of Activation function in Artificial Neural Networks? What would be the problem if we don't use it in ANN networks.

Introducing Non-linearity: Activation functions introduce non-linear transformations to the output of each neuron in a neural network. Without non-linearities, the entire network would essentially collapse into a linear model, rendering it incapable of learning complex patterns and relationships in the data. This is because multiple layers of linear transformations can be reduced to a single linear transformation, making the network unable to capture non-linear patterns in the data.

Learning Complex Representations: Non-linear activation functions enable neural networks to learn and represent complex functions and relationships within the data. They allow for the modeling of intricate patterns such as edges, textures, shapes, and higher-level features in images, text, or other data types. Without activation functions, the network's capacity to learn such representations would be severely limited.

Gradient Propagation: The choice of activation function also affects gradient propagation during backpropagation, the algorithm used to train neural networks. Saturating activations such as sigmoid and tanh have derivatives that approach zero for large inputs, so gradients can shrink toward zero as they propagate backward through many layers (the vanishing gradient problem), slowing down or even halting learning. Non-saturating activations such as ReLU mitigate this because their derivative is 1 for positive inputs.

If we don't use activation functions in artificial neural networks:

Loss of Expressiveness: The network would lose its ability to capture complex patterns and relationships in the data, reducing its expressiveness and predictive power. This would limit the types of problems the network could effectively solve, particularly those that involve non-linear relationships between input and output variables.

Ineffective Learning: Without non-linear activation functions, the network's learning capacity would be severely restricted. It would struggle to learn from the data, leading to poor performance on tasks such as classification, regression, or pattern recognition.

Gradient Instability: Removing activation functions does not fix gradient problems either. In a purely linear network the backpropagated gradient is a product of the layers' weight matrices, so it can still shrink or grow exponentially with depth, and the extra layers add no representational power in return.

Overall, activation functions are essential components of artificial neural networks: they let the network model complex, non-linear relationships, and a well-chosen activation (such as ReLU) also helps gradients flow during training.
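
The claim that stacked linear layers reduce to a single linear transformation can be verified numerically. The sketch below uses small random weight matrices (the shapes and names are arbitrary choices for illustration) and compares two layers without an activation, the equivalent single layer, and two layers with ReLU in between.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)  # layer 1: 3 inputs -> 4 units
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)  # layer 2: 4 units -> 2 outputs
x = rng.normal(size=(5, 3))                           # batch of 5 input vectors


def two_linear_layers(x):
    """Two dense layers with no activation in between."""
    return (x @ W1.T + b1) @ W2.T + b2


# Equivalent single layer: W = W2 W1 and b = W2 b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2


def single_linear_layer(x):
    return x @ W.T + b


def two_layers_with_relu(x):
    """Same weights, but with a ReLU between the layers."""
    return np.maximum(x @ W1.T + b1, 0.0) @ W2.T + b2


collapsed = np.allclose(two_linear_layers(x), single_linear_layer(x))
relu_differs = not np.allclose(two_layers_with_relu(x), single_linear_layer(x))
print("Two linear layers equal one linear layer:", collapsed)
print("ReLU network differs from that linear layer:", relu_differs)
```

Without the activation, the second layer adds parameters but no expressive power; with ReLU, the network computes a function that this single linear layer does not reproduce.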










To train a pure artificial neural network (ANN) with fewer than 10,000 trainable parameters on the MNIST dataset, we need a simple fully connected architecture that can capture the features of the dataset while keeping the parameter count low. Here's a basic example using TensorFlow and Keras:

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # Normalize pixel values to the range [0, 1]

# Define the model architecture (kept small to stay under 10,000 trainable parameters)
model = Sequential([
    Flatten(input_shape=(28, 28)),  # Flatten the 28x28 input images into a 784-element vector
    Dense(12, activation='relu'),   # Hidden layer with 12 neurons and ReLU activation
    Dense(10, activation='relu'),   # Hidden layer with 10 neurons and ReLU activation
    Dense(10, activation='softmax') # Output layer with 10 neurons for the 10 classes and softmax activation
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Print the model summary (it should report 9,660 trainable parameters)
model.summary()

# Train the model, holding out 10% of the training data for validation
history = model.fit(x_train, y_train, epochs=10, validation_split=0.1)

# Evaluate the model on the untouched test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)


The trainable parameters of this architecture add up as follows:

28 × 28 × 12 = 9,408 (weights from input to the first hidden layer) + 12 (bias terms for the first hidden layer) = 9,420

12 × 10 = 120 (weights from the first hidden layer to the second hidden layer) + 10 (bias terms for the second hidden layer) = 130

10 × 10 = 100 (weights from the second hidden layer to the output layer) + 10 (bias terms for the output layer) = 110

Total: 9,420 + 130 + 110 = 9,660 trainable parameters, which is below the 10,000 limit. For comparison, hidden layers of 128 and 64 neurons would need 109,386 parameters.
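
The per-layer arithmetic above (weights plus bias terms for each fully connected layer) can be checked with a small helper before building any model. The function name `count_dense_params` and the two layer-size lists are illustrative choices for this sketch.

```python
def count_dense_params(layer_sizes):
    """Total weights and biases for a stack of fully connected layers.

    layer_sizes lists the input size followed by the size of each Dense layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights + bias terms
    return total


large = count_dense_params([28 * 28, 128, 64, 10])
small = count_dense_params([28 * 28, 12, 10, 10])
print("784-128-64-10:", large)  # 109386
print("784-12-10-10: ", small)  # 9660
```

Almost all parameters sit in the first layer, because each of the 784 input pixels connects to every first-layer neuron, so the width of the first hidden layer is what decides whether a 10,000-parameter budget is met.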

## Perform Regression Task using ANN

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate synthetic data
np.random.seed(0)
X = np.random.rand(1000, 5)  # 1000 samples with 5 features
y = np.sum(X, axis=1) + np.random.normal(0, 0.1, 1000)  # Target variable with added noise

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the model architecture
model = Sequential([
    Dense(10, input_shape=(5,), activation='relu'),  # Hidden layer with 10 neurons and ReLU activation; takes the 5 input features
    Dense(1)  # Output layer with a single neuron (regression)
])

# Compile the model
model.compile(optimizer='adam',
              loss='mean_squared_error')

# Print the model summary
model.summary()

# Train the model
history = model.fit(X_train, y_train, epochs=50, validation_split=0.2)

# Evaluate the model
y_pred = model.predict(X_test).ravel()  # Flatten the (n, 1) predictions to match the shape of y_test
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)










