A/B testing is commonly used in online marketing and user experience design to optimize website conversion rates, increase click-through rates, and improve overall user engagement. By testing different variations of a design or feature, businesses can make data-driven decisions about what changes to implement to improve performance.

The statistical tests used in A/B testing typically involve comparing the means or proportions of the two groups using techniques such as t-tests or chi-square tests. The results of the tests can help determine whether the observed differences between the groups are statistically significant or due to chance.

A/B testing involves randomly dividing a sample into two groups: one group is shown the original version (control group) and the other group is shown a modified version (experimental group). The groups are then compared to see which version better produces the desired outcome.

## The statistical tests

T-test: A t-test is used to compare the means of two groups. In A/B testing, a t-test can be used to determine if there is a statistically significant difference in the conversion rates of two versions of a website or app.

Chi-squared test: A chi-squared test is used to determine if there is a significant difference between the observed frequencies and the expected frequencies of categorical data. In A/B testing, a chi-squared test can be used to determine if there is a significant difference in the number of conversions between two versions of a website or app.

ANOVA: Analysis of variance (ANOVA) is a statistical test used to compare the means of three or more groups. In A/B testing, ANOVA can be used to determine if there is a significant difference in conversion rates between more than two versions of a website or app.

Bayesian analysis: Bayesian analysis is a statistical method that can be used to update the probability of a hypothesis as new data becomes available. In A/B testing, Bayesian analysis can be used to determine the probability that one version of a website or app is better than another.

Regression analysis: Regression analysis is a statistical method used to analyze the relationship between one dependent variable and one or more independent variables. In A/B testing, regression analysis can be used to determine if there is a relationship between the version of a website or app and the conversion rate.
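As a rough sketch of what comparing two versions might look like in practice, the snippet below runs R's `prop.test()` on conversion counts for versions A and B. The counts are hypothetical and chosen only for illustration.

```r
# Hypothetical A/B test data: conversions and visitors per version
conversions <- c(A = 120, B = 150)
visitors    <- c(A = 2400, B = 2400)

# Two-sample test for equality of proportions (chi-squared based)
ab_result <- prop.test(x = conversions, n = visitors)

ab_result$estimate   # observed conversion rate of each version
ab_result$p.value    # compare with the chosen significance level
```

A p-value below the chosen significance level would suggest that the two conversion rates differ.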

### Power of a statistical test

The power of a statistical test is the probability of correctly rejecting the null hypothesis when the alternative hypothesis is true. 

The power of a test is influenced by several factors, including the sample size, the effect size, the alpha level (significance level), and the variability of the data.

A larger sample size, a larger effect size, a higher alpha level, and lower variability of the data all increase the power of the test. Lowering the alpha level makes the test stricter and therefore reduces its power.
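The effect of each factor can be checked with `power.t.test()` from base R. The sketch below changes one factor at a time from a baseline; all numbers are illustrative assumptions.

```r
# Baseline: 50 per group, difference 0.5, sd 1, alpha 0.05
base <- power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.05)$power

# Change one factor at a time (values chosen only for illustration)
more_n      <- power.t.test(n = 100, delta = 0.5, sd = 1, sig.level = 0.05)$power
more_effect <- power.t.test(n = 50, delta = 0.8, sd = 1, sig.level = 0.05)$power
more_alpha  <- power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.10)$power
more_sd     <- power.t.test(n = 50, delta = 0.5, sd = 2, sig.level = 0.05)$power

round(c(base = base, more_n = more_n, more_effect = more_effect,
        more_alpha = more_alpha, more_sd = more_sd), 3)
```

Power rises with a larger sample, a larger effect, or a higher alpha, and falls when the standard deviation grows.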

### T-test:

A t-test is a statistical hypothesis test used to compare the means of two samples. It is a parametric test that assumes the data follow a normal distribution. The classic (Student's) version also assumes that the variance of the two groups being compared is equal.

Example

Suppose you want to test whether a new weight loss pill is effective in helping people lose weight. You randomly select 50 overweight individuals and give the pill to 25 of them while the other 25 receive a placebo. After 8 weeks, you measure the weight loss of each individual and record the data.

You can perform a t-test on the two groups' weight loss data.

The null hypothesis is that there is no difference in weight loss between the two groups.

The alternative hypothesis is that the weight loss pill results in more weight loss than the placebo.

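The t-statistic for two independent groups is calculated as follows:
t = (x̄1 - x̄2) / (s_p * sqrt(1/n1 + 1/n2))

where x̄1 and x̄2 are the group means, n1 and n2 are the group sizes, and s_p is the pooled standard deviation of the two groups.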

If the t-statistic is large enough, you can reject the null hypothesis and conclude that the weight loss pill is effective. 
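A minimal sketch of this calculation in R, using the pooled-variance form t = (x̄1 - x̄2) / (s_p * sqrt(1/n1 + 1/n2)) and comparing it with `t.test()`. The weight-loss values are simulated, so they are hypothetical rather than real study data.

```r
set.seed(1)  # hypothetical weight-loss data (kg), 25 people per group
pill    <- rnorm(25, mean = 4, sd = 2)
placebo <- rnorm(25, mean = 3, sd = 2)

n1 <- length(pill)
n2 <- length(placebo)

# Pooled standard deviation (equal-variance assumption)
s_p <- sqrt(((n1 - 1) * var(pill) + (n2 - 1) * var(placebo)) / (n1 + n2 - 2))
t_manual <- (mean(pill) - mean(placebo)) / (s_p * sqrt(1 / n1 + 1 / n2))

# Same test with the built-in function (one-sided, equal variances)
t_builtin <- t.test(pill, placebo, var.equal = TRUE, alternative = "greater")
c(manual = t_manual, builtin = unname(t_builtin$statistic))
```

The two values should agree, which confirms that `t.test()` with `var.equal = TRUE` uses the pooled form.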

#### Types of t-test

One-sample t-test: This type of t-test is used when a company wants to test whether a certain characteristic of a product or service meets a certain standard or benchmark. Another example is testing whether the mean IQ of a sample of students is significantly different from the population mean IQ of 100.

Independent samples t-test: This type of t-test is used when a company wants to compare the performance of two groups of employees or customers, or two different products or services.

Paired samples t-test: This type of t-test is used when a company wants to test whether there is a significant difference between two sets of data that are related, such as before and after measurements of a product or service.

Welch's t-test: This type of t-test is used when the variances of the two groups being compared are unequal.

Student's t-test for equal variances: This type of t-test is used when the variances of the two groups being compared are equal.

#### T-value

The t-value is the calculated value of the t-statistic, which measures the difference between the sample mean and the null hypothesis mean in terms of standard error units.

A negative t-value in a t-test indicates that the sample mean is lower than the null hypothesis mean.

A large t-value indicates a strong difference between the means and provides more evidence against the null hypothesis, making it more likely to be rejected. 

A larger t-value (in absolute terms) indicates that the difference between the means is less likely to be due to chance.

A small t-value indicates a weaker difference between the means and may not provide enough evidence to reject the null hypothesis.

A small t-value, on the other hand, suggests that the difference between the means may be due to chance and is not statistically significant.

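The critical value does not depend on the t-value; it is determined by the significance level and the degrees of freedom. The null hypothesis is rejected when the absolute t-value exceeds the critical value.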

### degrees of freedom

The degrees of freedom (df) is a measure of the amount of independent information available in the data. For a one-sample t-test it is calculated as n-1, where n is the sample size. For example, a sample of 20 observations gives a df value of 19.

As the degrees of freedom increase, the t-distribution becomes narrower and more concentrated around the mean making it easier to reject the null hypothesis even for small differences.
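To illustrate, the two-sided critical value of the t-distribution can be computed with `qt()` for several degrees of freedom. The df values below are arbitrary choices for the sketch.

```r
alpha <- 0.05
df_values <- c(2, 5, 10, 19, 30, 100, 1000)

# Two-sided critical value: reject the null hypothesis when |t| exceeds it
critical <- qt(1 - alpha / 2, df = df_values)
data.frame(df = df_values, critical_value = round(critical, 3))

# Limit as df grows: the standard normal critical value
qnorm(1 - alpha / 2)
```

The critical value shrinks toward the standard normal value as df grows, which is why smaller differences can reach significance in larger samples.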

### p-value

p-value = 0.096: The p-value is the probability of obtaining a t-value as extreme as or more extreme than the observed t-value, assuming that the null hypothesis is true. In this case, the p-value is 0.096, which is greater than the typical significance level of 0.05. This means that we cannot reject the null hypothesis, so the data do not support the alternative hypothesis.

In other words, if the null hypothesis were true, a difference at least this large would be observed about 9.6% of the time. That is not rare enough to reject the null hypothesis at the 0.05 level.

The level of significance is typically set to a value between 0 and 1, and represents the maximum probability of rejecting the null hypothesis when it is actually true.

 A commonly used level of significance in scientific research is 0.05, which indicates a 5% probability of rejecting the null hypothesis when it is actually true.

In R, you can conduct a t-test by using the t.test() function. The conf.level argument specifies the confidence level of the reported confidence interval; it does not change the p-value. By default, this argument is set to conf.level = 0.95, which corresponds to a significance level of alpha = 0.05. Whether the test is one-tailed or two-tailed is controlled by the alternative argument, which defaults to "two.sided".

A large t-value with few degrees of freedom and a strict significance level (for example, 0.01) may not be sufficient to reject the null hypothesis.

A moderate t-value with many degrees of freedom and a more lenient significance level (for example, 0.10) can provide enough evidence to reject the null hypothesis.

#### One-sample t-test

Suppose you have a sample of 20 observations, and you want to test whether the mean of the sample is significantly different from a population mean of 50.

In [1]:
data <- c(5, 48, 52, 47, 12, 10, 53, 54, 47, 50, 1239, 46, 4, 0, 49, 52, 51, 48, 11150, 53)
sample_mean <- mean(data); sample_mean  # separate name so mean() is not overwritten
sample_sd <- sd(data)
# The mu argument specifies the population mean to be tested against.
t.test(data, mu = 50)


	One Sample t-test

data:  data
t = 1.086, df = 19, p-value = 0.2911
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
 -509.6148 1816.6148
sample estimates:
mean of x 
    653.5 


A common rule of thumb is "if the p is low, the null must go". The p-value in this example is 0.2911, which is greater than the typical significance level of 0.05, so we fail to reject the null hypothesis that the true mean is equal to 50. The 95 percent confidence interval (-509.6148, 1816.6148) is very wide because the data contain two extreme values (1239 and 11150), and it includes 50.
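The p-value printed by R can be reproduced from the t-value and the degrees of freedom with `pt()`. This sketch uses the rounded t-value from the output, so the result matches only to about three decimal places.

```r
t_value <- 1.086   # rounded t-statistic from the one-sample output
df <- 19           # n - 1 for 20 observations

# Two-sided p-value: probability of |T| >= |t| under the null hypothesis
p_two_sided <- 2 * pt(-abs(t_value), df = df)
p_two_sided
```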

### Independent samples t-test

Suppose you have two groups of data, and you want to test whether the means of the two groups are significantly different. Let's call the groups "Group 1" and "Group 2".

In [7]:
group1 <- c(10, 12, 14, 6, 18, 4, 5, 6, 6)
group2 <- c(8, 11, 13, 15, 17)
t.test(group1, group2, alternative = "greater", conf.level = 0.95)


	Welch Two Sample t-test

data:  group1 and group2
t = -1.7002, df = 10.828, p-value = 0.9412
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -7.819783       Inf
sample estimates:
mean of x mean of y 
      9.0      12.8 


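In this example, alternative = "greater" specifies the one-sided alternative hypothesis that the mean of group1 is greater than the mean of group2, and conf.level = 0.95 sets the confidence level of the interval, which corresponds to a significance level of 1 - 0.95 = 0.05. Because var.equal is not set, R runs Welch's t-test by default.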

In [6]:
1 - 0.95

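For the one-sided test shown above, the p-value is 0.9412, so there is no evidence that the mean of group1 is greater than the mean of group2.

With the default two-sided alternative (true difference in means is not equal to 0), the p-value is 0.1176, which is greater than the typical significance level of 0.05, indicating that there is not enough evidence to reject the null hypothesis that the means of the two groups are equal.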
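The relation between the one-sided and two-sided p-values can be checked by running both tests on the same data. This is a small sketch that reuses the group1 and group2 vectors from this example.

```r
group1 <- c(10, 12, 14, 6, 18, 4, 5, 6, 6)
group2 <- c(8, 11, 13, 15, 17)

one_sided <- t.test(group1, group2, alternative = "greater")
two_sided <- t.test(group1, group2)  # default: alternative = "two.sided"

c(one_sided = one_sided$p.value, two_sided = two_sided$p.value)
```

Because the observed difference is negative, the "greater" test gives a p-value close to 1, while the two-sided p-value is twice the smaller tail probability.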

### Paired samples t-test

Suppose we have the following dataset representing the weights of 10 individuals before and after a weight loss program:

In [10]:
before <- c(150, 160, 170, 180, 190, 200, 210, 220, 230, 240)
after <- c(125, 105, 16, 175, 185, 195, 25, 25, 225, 25)


Conduct a paired samples t-test to determine whether weight before the program is significantly greater than weight after it:

In [12]:
t.test(before, after, paired = TRUE, alternative = "greater", conf.level = 0.98)


	Paired t-test

data:  before and after
t = 2.9636, df = 9, p-value = 0.007933
alternative hypothesis: true difference in means is greater than 0
98 percent confidence interval:
 16.19077      Inf
sample estimates:
mean of the differences 
                   84.9 


The p-value of 0.007933 indicates that weight before the program is significantly greater than weight after it at a 0.05 level of significance. It is also below the 0.02 level implied by conf.level = 0.98.
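A paired t-test is equivalent to a one-sample t-test on the per-person differences. The sketch below shows this with the same before and after vectors.

```r
before <- c(150, 160, 170, 180, 190, 200, 210, 220, 230, 240)
after  <- c(125, 105, 16, 175, 185, 195, 25, 25, 225, 25)

diffs <- before - after  # one difference per individual

paired   <- t.test(before, after, paired = TRUE, alternative = "greater")
one_samp <- t.test(diffs, mu = 0, alternative = "greater")

c(paired = unname(paired$statistic), one_sample = unname(one_samp$statistic))
```

Both calls give the same t-statistic, because the paired test only looks at the differences.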

### Welch's t-test:

Suppose we have two independent datasets, representing the heights of two groups of individuals:

In [14]:
group1 <- c(170, 165, 172, 177, 168, 180, 175, 170, 174, 169)
group2 <- c(180, 172, 178, 173, 186, 168, 175, 182, 170, 179, 180, 177, 183)

Conduct Welch's t-test to determine whether the mean height of group1 is significantly greater than the mean height of group2:

In [14]:
# Conduct Welch's t-test
t.test(group1, group2, var.equal = FALSE, alternative = "greater", conf.level = 0.95)


	Welch Two Sample t-test

data:  group1 and group2
t = -1.7002, df = 10.828, p-value = 0.9412
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -7.819783       Inf
sample estimates:
mean of x mean of y 
      9.0      12.8 


With alternative = "greater", the printed one-sided confidence interval (-7.819783, Inf) includes 0, and the p-value of 0.9412 is far above the 0.05 level of significance, so the test gives no evidence that the mean of group1 is greater than the mean of group2.

Note that the printed sample means (9.0 and 12.8) are those of the independent samples example, not of the height data defined here, so the cell needs to be rerun to obtain the result for the heights.
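As a sketch of what Welch's test computes for the height data, the statistic and the Welch-Satterthwaite degrees of freedom can be calculated by hand and compared with `t.test()`. The intermediate variable names are illustrative.

```r
group1 <- c(170, 165, 172, 177, 168, 180, 175, 170, 174, 169)
group2 <- c(180, 172, 178, 173, 186, 168, 175, 182, 170, 179, 180, 177, 183)

# Squared standard error of each group mean (variances are NOT pooled)
se2_1 <- var(group1) / length(group1)
se2_2 <- var(group2) / length(group2)

t_welch <- (mean(group1) - mean(group2)) / sqrt(se2_1 + se2_2)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_1 + se2_2)^2 /
  (se2_1^2 / (length(group1) - 1) + se2_2^2 / (length(group2) - 1))

welch <- t.test(group1, group2, var.equal = FALSE)
c(t_manual = t_welch, t_builtin = unname(welch$statistic),
  df_manual = df_welch, df_builtin = unname(welch$parameter))
```

The degrees of freedom are usually not a whole number, which is how Welch output can be told apart from Student's t-test output.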

### Student's t-test for equal variances:

In [15]:
# Create example data
group1 <- c(9, 12, 13, 15, 11, 10, 14, 11, 15, 12)
group2 <- c(10, 9, 11, 13, 12, 10, 14, 13, 11, 12)
result <- t.test(group1, group2, var.equal = TRUE, conf.level = 0.95); result



	Two Sample t-test

data:  group1 and group2
t = 0.85661, df = 18, p-value = 0.4029
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.016825  2.416825
sample estimates:
mean of x mean of y 
     12.2      11.5 


We can see that the t-statistic is 0.85661 and the p-value is 0.4029. Because the p-value is greater than our chosen significance level (let's say 0.05), we fail to reject the null hypothesis that the true means of the two groups are equal.

#### Guide

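In hypothesis testing, we compare the p-value to the significance level (alpha) to determine whether to reject or fail to reject the null hypothesis. If the p-value is less than or equal to alpha, we reject the null hypothesis; otherwise, we fail to reject it.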

### Power of a two-sample t-test

The power of a t-test is the probability of correctly rejecting the null hypothesis when the alternative hypothesis is true.

In [17]:
power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.05, power = NULL)


     Two-sample t test power calculation 

              n = 50
          delta = 0.5
             sd = 1
      sig.level = 0.05
          power = 0.6968888
    alternative = two.sided

NOTE: n is number in *each* group


n: the sample size for each group.

delta: the true difference between the group means. Because sd = 1 in this call, delta = 0.5 is also the standardized effect size (difference between means divided by the standard deviation). The effect size is a quantitative measure of the magnitude or strength of the relationship between two variables or the difference between two groups in a study.

Cohen's d: This is a standardized measure of the difference between the means of two groups, expressed in terms of the pooled standard deviation. It is commonly used for comparing the means of two groups in a t-test or ANOVA.

sd: the standard deviation of the data.

sig.level: the significance level (alpha).

power: the power of the test. It is set to NULL above so that it is computed from the other arguments. Exactly one of n, delta, sd, sig.level, and power must be NULL; to solve for the sample size instead, leave n unset and supply a target such as power = 0.8.
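The same function can be turned around to solve for the sample size. The sketch below assumes a target power of 0.8, a common convention rather than a value taken from the example above.

```r
# Leave n unset so that power.t.test() solves for it
n_needed <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.8)

n_needed$n            # required sample size per group (not a whole number)
ceiling(n_needed$n)   # round up to whole participants per group
```

Rounding up is the safe direction, since rounding down would leave the test slightly below the target power.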


## Chi-squared test:

The chi-squared test is a statistical test used to determine if there is a significant association between two categorical variables. It is used to analyze count data, which is data that can be organized into categories and has a numerical value associated with each category. The test compares the observed frequencies of the categories with the expected frequencies under the assumption of no association between the variables.

Chi-squared tests use the chi-squared distribution to calculate the p-value. The chi-squared distribution is the probability distribution of the sum of squares of a set of independent standard normal variables.

Tests based on this distribution have several uses:

The chi-squared test for variance is used to test the variance of a normally distributed population.

The test of independence can be used to test whether there is a relationship between two categorical variables, including comparisons of proportions.

The goodness-of-fit test is used to determine if there is a significant difference between the observed and expected frequencies of a categorical variable.

A chi-squared test can also be used to test for randomness in a sequence of categorical data (i.e. data that takes on two or more distinct values).

### Types of chi-squared test

Goodness-of-fit test:
The goodness-of-fit test is used to determine if an observed frequency distribution fits a theoretical or expected frequency distribution.

This type of test is often used to determine if a sample of data comes from a population with a specific distribution.

Test of independence:
The test of independence is used to determine if there is a significant association between two categorical variables.

This type of test is often used to analyze data from a contingency table, a table that summarizes the frequencies of the different combinations of categories of the two variables.

### X-squared

In the context of a chi-squared test, the X-squared (χ²) value is a statistic that measures the difference between the expected frequencies and the observed frequencies in a contingency table.

##### observed frequencies

The observed frequencies are the actual frequencies of occurrence of a categorical variable in a sample or population. For example, if you conducted a survey asking people to indicate their favorite color, the observed frequencies would be the number of responses in each category, such as red, blue, green, etc.



##### expected frequencies 

The expected frequencies are the frequencies that would be expected to occur in each category of a categorical variable under a specific theoretical distribution or null hypothesis. For example, if you were testing whether the distribution of favorite colors in a population follows a uniform distribution, the expected frequency of each color would be the total number of responses divided by the number of colors.

In general, a larger X-squared value indicates a greater deviation between the observed and expected frequencies, which may be evidence against the null hypothesis.

The X-squared value is calculated as the sum, over all categories, of the squared difference between the observed and expected frequency divided by the expected frequency: χ² = Σ (O - E)² / E.

The X-squared value is then compared to a critical value from the chi-squared distribution with a certain number of degrees of freedom (df).
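A minimal sketch of this calculation, using hypothetical favorite-color counts and a uniform null hypothesis, compared against `chisq.test()`:

```r
# Hypothetical survey counts for three favorite colors
observed <- c(red = 30, blue = 45, green = 25)

# Under a uniform null hypothesis every category has the same expected count
expected <- rep(sum(observed) / length(observed), length(observed))

# Sum of (observed - expected)^2 / expected over all categories
x_squared <- sum((observed - expected)^2 / expected)
x_squared

# chisq.test() assumes equal probabilities when p is not given
builtin <- chisq.test(observed)
unname(builtin$statistic)
```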

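#### Degrees of freedom

The degrees of freedom for a chi-squared test of independence depend on the number of rows and columns in the contingency table, and are calculated as (number of rows - 1) * (number of columns - 1). For a goodness-of-fit test, the degrees of freedom are the number of categories minus 1.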

If the X-squared value is greater than the critical value from the chi-squared distribution, we reject the null hypothesis and conclude that there is a significant difference between the expected frequencies and the observed frequencies. 

#### p-value

The p-value associated with the X-squared value and degrees of freedom can also be used to determine the level of significance of the test.
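Both the critical value and the p-value can be obtained from the chi-squared distribution functions in base R. The statistic below is a hypothetical value chosen for illustration.

```r
x_squared <- 6.5   # hypothetical test statistic
df <- 2
alpha <- 0.05

# Reject the null hypothesis if x_squared exceeds the critical value
critical_value <- qchisq(1 - alpha, df = df)

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

c(critical_value = critical_value, p_value = p_value)
x_squared > critical_value   # same decision as p_value < alpha
```

Comparing the statistic with the critical value and comparing the p-value with alpha always lead to the same decision.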

### Goodness-of-fit test chi square

The chi-square goodness-of-fit test is based on the chi-square distribution, which is a continuous probability distribution that arises in the context of hypothesis testing.

In [50]:
observed_counts <- c(24, 35, 566)

We want to test whether these observed counts follow a specified distribution in which the three categories occur with probabilities 0.2, 0.3, and 0.5. The expected probabilities are defined as:

In [51]:
expected_probs <- c(0.2, 0.3, 0.5)

In [52]:
sum(expected_probs)  # the expected probabilities must sum to 1

In [53]:
chisq.test(observed_counts, p = expected_probs)


	Chi-squared test for given probabilities

data:  observed_counts
X-squared = 411.28, df = 2, p-value < 2.2e-16


X-squared: In a chi-square goodness-of-fit test, the X-squared statistic is used to test whether a set of observed categorical data is consistent with a specified theoretical distribution.

This result indicates that the observed frequencies in the data differ significantly from the expected frequencies under the null hypothesis. A p-value below 2.2e-16 is far smaller than 0.05, so we reject the null hypothesis and conclude that the observed frequencies are not consistent with the expected probabilities.

If the p-value of a chi-squared goodness-of-fit test in R is greater than 0.05, it means that we fail to reject the null hypothesis.
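The X-squared value reported for this example can be reproduced by hand from the expected counts. This is a sketch using the same observed counts and probabilities.

```r
observed_counts <- c(24, 35, 566)
expected_probs  <- c(0.2, 0.3, 0.5)

# Expected count = total number of observations * expected probability
expected_counts <- sum(observed_counts) * expected_probs
expected_counts

# Sum of (observed - expected)^2 / expected
x_squared <- sum((observed_counts - expected_counts)^2 / expected_counts)
round(x_squared, 2)
```

Most of the statistic comes from the third category, which has far more observations than the 0.5 probability predicts.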

### Note

It's important to note that the X-squared value should not be used in isolation to interpret the results of a chi-squared goodness-of-fit test. It should be used in conjunction with the p-value, degrees of freedom, and effect size.

### The test of independence 

contingency table

A contingency table is a table that displays the frequencies of two or more categorical variables in a cross-tabulated format. The table is used to summarize the relationship between the variables and provides a way to visualize the distribution of observations across the categories.

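In R, a contingency table of observed frequencies is created with table(), for example table(gender, voting_preference) for two categorical vectors named gender and voting_preference. Both vectors must exist before the call; otherwise R reports that the object is not found. The example below builds the table from a small data frame instead.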


In [57]:
mydata <- data.frame(
  gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  looks = c("sexy", "not sexy", "agly", "sexy", "sexy", "not sexy", "agly"))

In [59]:
mytable <- table(mydata$gender, mydata$looks); mytable

        
         agly not sexy sexy
  Female    0        2    1
  Male      2        0    2

In [61]:
# Perform the chi-squared test of independence
result <- chisq.test(mytable);result 

"Chi-squared approximation may be incorrect"



	Pearson's Chi-squared test

data:  mytable
X-squared = 4.2778, df = 2, p-value = 0.1178


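In a chi-square test of independence, the X-squared statistic is used to test whether two categorical variables are independent of each other.

The p-value of 0.1178 is greater than 0.05, so there is not enough evidence to reject the null hypothesis that the two categorical variables are independent. R warns that the chi-squared approximation may be incorrect because the expected counts in this small table are below 5.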
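The warning can be examined by looking at the expected counts. The sketch below rebuilds the same table as a matrix and, as one common alternative for small samples, also runs Fisher's exact test with `fisher.test()`.

```r
# Same counts as the contingency table above (filled column by column)
mytable <- matrix(c(0, 2, 2, 0, 1, 2), nrow = 2,
                  dimnames = list(gender = c("Female", "Male"),
                                  looks = c("agly", "not sexy", "sexy")))

result <- suppressWarnings(chisq.test(mytable))
result$expected   # every expected count is below 5, hence the warning

# Fisher's exact test does not rely on the chi-squared approximation
fisher.test(mytable)$p.value
```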

### Power of the chi-squared test


In [6]:
library(pwr)

effect.size <- 0.3
df <- 1
alpha <- 0.05

# pwr.chisq.test() requires exactly one of w, N, df, power, and sig.level to be NULL.
# N is left NULL here so that the sample size is computed. The target power of 0.8
# is an assumed value that was not part of the original call.
result <- pwr.chisq.test(w = effect.size, df = df, sig.level = alpha, power = 0.8)
result



### ANOVA

Analysis of variance (ANOVA) is a statistical test used to compare the means of three or more groups.

### Types of ANOVA

One-Way ANOVA: This type of ANOVA is used to compare the means of three or more independent groups. One-way ANOVA examines only one independent variable (factor) at a time.

For example, suppose a company wants to compare the productivity of its employees across three different departments. The company has collected data on the number of tasks completed by each employee in each department over a week.

Two-Way ANOVA: This type of ANOVA is used to examine the effects of two independent variables (factors) on a dependent variable. For example, if we want to test the effect of both gender and age on a test score, we can use a two-way ANOVA.

MANOVA (Multivariate Analysis of Variance): This type of ANOVA is used when we have two or more dependent variables.

 Unlike ANOVA, which analyzes only one dependent variable at a time, MANOVA analyzes multiple dependent variables simultaneously.

Suppose a pharmaceutical company wants to compare the effectiveness of three different drugs on two different measures: blood pressure and heart rate. The company has collected data on these two measures for each patient in each drug group.

Repeated Measures ANOVA: This type of ANOVA is used when we have repeated measures of the same group. For example, if we want to test the effect of a drug on blood pressure over time, we can use a repeated measures ANOVA.

 It is used to test the effects of an independent variable on a dependent variable over time or under different conditions.

Mixed-Effects ANOVA: This type of ANOVA is used when we have both fixed and random effects. Fixed effects are variables that we are interested in studying, while random effects are variables that are not of primary interest but are included in the analysis to account for variability in the data.

For example, a researcher studying a training program would recruit employees from different departments in a company and randomly assign them to either a training group or a control group. The researcher would measure the employees' job satisfaction levels before and after the training program and collect data on the employees' department affiliation. Here the training condition would be treated as a fixed effect and the department as a random effect.
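As a sketch of the one-way ANOVA example described above (productivity across three departments), the snippet below fits the model with `aov()`. The department names and task counts are simulated, so they are hypothetical.

```r
set.seed(42)  # hypothetical data: tasks completed per employee in one week
tasks <- data.frame(
  department = factor(rep(c("Sales", "Support", "Engineering"), each = 10)),
  completed  = c(rpois(10, 20), rpois(10, 22), rpois(10, 27))
)

# One-way ANOVA: one factor (department), one dependent variable (completed)
fit <- aov(completed ~ department, data = tasks)
summary(fit)
```

The F test in the summary asks whether at least one department mean differs from the others; it does not say which one.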