> **Confidence interval** is the uncertainty in a summary statistic, represented as a range. In other words, it is a range of values we are fairly sure our true value lies in. For example: I am 95% confident that the population mean falls between 8.76 and 15.88 $\rightarrow$ (12.32 $\pm$ 3.56)

A confidence interval tells you how confident you can be that the results from a poll or survey reflect what you would expect to find if it were possible to survey the entire population. It is difficult to obtain measurement data of an entire data set (*population*) due to limited resources and time. Your best shot is to survey a small fraction (*samples*) of the entire data set, and pray that your sample data represents the population reasonably well. 

Sample data may not be a good representation of a population because of numerous factors (Ex: bias), and as a result, uncertainty is always introduced in any estimate derived from sample data. **Due to the uncertainty involved with sample data, any statistical estimation needs to be delivered in a range, not in a point estimate**.

How well a sample statistic estimates an underlying population parameter is always an issue (<a href="#population_vs_samples">Population vs. Samples</a>). A confidence interval addresses this issue because it provides a range of values which is likely to contain the population parameter of interest.

<div class="alert alert-info">
    <h4>Need a quick Python code snippet?</h4>
    <p>If you are just looking for a quick Python code snippet for a practical use case, scroll down <a href="#">below</a>.</p>
</div>

### Quick Highlights of Confidence Interval

<div class="highlights">
    <div class="highlights-title">1. Confidence interval is expressed in a <i>range</i></div>
    <div class="highlights-content">Confidence interval quantifies the uncertainty related to a statistical estimate to mitigate the issue of <a href="#population_vs_samples">Population vs. Samples</a>. It is always expressed in a range like: $\text{C.I.}: \quad x \pm 3.43$ or $-51.4 < x < -43.2$</div>
</div>

<div class="highlights">
    <div class="highlights-title">2. Formula for confidence interval varies with statistics</div>
    <div class="highlights-content">
        <p>For <a href="#">confidence interval of mean</a></p>
        <p><div style="font-size: 1rem; margin-top: 20px;">$$ \text{C.I.}_{\text{mean}}: \quad \bar{x} \pm (z_{\frac{\alpha}{2}} \times \frac{\sigma}{\sqrt{n}})$$</div></p>
        <p>For confidence interval of proportion</p>
        <p><div style="font-size: 1rem; margin-top: 20px;">$$ \text{C.I.}_{\text{proportion}}: \quad \hat{p} \pm (z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} )$$</div></p>
        <p>For <a href="#">confidence interval of variance</a></p>
        <p><div style="font-size: 1rem; margin-top: 20px;">$$ \text{C.I.}_{\text{variance}}: \frac{(n-1)s^{2}}{\chi^{2}_{\frac{\alpha}{2}}} \leq \sigma^2 \leq \frac{(n-1)s^{2}}{\chi^{2}_{1-\frac{\alpha}{2}}}$$</div></p>
        <p>For confidence interval of standard deviation</p>
        <p><div style="font-size: 1rem; margin-top: 20px;">$$ \text{C.I.}_{\text{standard deviation}}: \sqrt{\frac{(n-1)s^{2}}{\chi^{2}_{\frac{\alpha}{2}}}} \leq \sigma \leq \sqrt{\frac{(n-1)s^{2}}{\chi^{2}_{1-\frac{\alpha}{2}}}}$$</div></p>
        <p>Different analytical solutions exist for different statistics. However, confidence intervals for many other statistics cannot be solved analytically, simply because there are no formulas for them. If the statistic of your interest does not have an analytical solution for its confidence interval, or you simply don't know it, numerical methods like <a href="https://aegis4048.github.io/non-parametric-confidence-interval-with-bootstrap" target="_blank">bootstrapping</a> can be a good alternative (and it's powerful).</p>
    </div>
</div>

<div class="highlights">
    <div class="highlights-title">3. Things are VERY different if sample data set is not normally distributed</div>
    <div class="highlights-content">The equations listed above <b>are not valid if the sample data set is not normally distributed</b>. In case of non-normally distributed data, its confidence interval can be obtained with non-parametric methods like <a href="https://aegis4048.github.io/non-parametric-confidence-interval-with-bootstrap" target="_blank">bootstrapping</a>, or you can instead use a credible interval, which is a Bayesian equivalent of a confidence interval.</div>
</div>
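As a rough illustration of the non-parametric route mentioned above, the sketch below builds a percentile bootstrap interval for a median using only the Python standard library. The skewed sample, the `bootstrap_ci` helper, and the number of resamples are all assumptions made for this example, not values taken from this article.

```python
import random
import statistics


def bootstrap_ci(data, stat_fn, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time
    estimates = sorted(
        stat_fn(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = 1 - confidence
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical right-skewed (lognormal) sample, for illustration only
rng = random.Random(42)
sample = [rng.lognormvariate(0, 1) for _ in range(200)]

low, high = bootstrap_ci(sample, statistics.median)
print(f"sample median = {statistics.median(sample):.3f}")
print(f"95% bootstrap C.I. of median: ({low:.3f}, {high:.3f})")
```

The same helper accepts any function of the data (Ex: `statistics.mean`, `statistics.stdev`), which is why bootstrapping is convenient when no closed-form formula is available.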

<div class="highlights">
    <div class="highlights-title">4. 95% C.I. does not mean 95% of the sample data lie within the interval.</div>
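    <div class="highlights-content">It also does not mean that there's a 95% chance that the sample mean falls within the interval, because the sample mean is the center of the interval by construction. It means that if you repeated the sampling and built an interval each time, about 95% of those intervals would contain the population mean. 95% confidence interval relates to the reliability of the estimation procedure. Ex: How reliable is your estimation of population mean?</div>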
</div>
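The difference between "95% of intervals contain the population mean" and "95% of the data lie in the interval" can be checked with a small simulation. This is a sketch under assumed values: the population mean `MU`, standard deviation `SIGMA`, sample size `N`, and number of trials are hypothetical, and the population standard deviation is treated as known so that the z-based formula applies.

```python
import random
from statistics import NormalDist, mean

MU, SIGMA, N, TRIALS = 10.0, 2.0, 25, 4000  # hypothetical population and setup


def z_interval(sample, sigma, confidence=0.95):
    """z-based confidence interval of the mean with known population sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / len(sample) ** 0.5
    center = mean(sample)
    return center - margin, center + margin


rng = random.Random(0)
hits = 0           # intervals that contain the population mean
points_inside = 0  # individual data points that fall inside their interval

for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    low, high = z_interval(sample, SIGMA)
    hits += low <= MU <= high
    points_inside += sum(low <= x <= high for x in sample)

coverage = hits / TRIALS
data_fraction = points_inside / (TRIALS * N)
print(f"intervals containing the population mean: {coverage:.1%}")
print(f"sample points lying inside the interval:  {data_fraction:.1%}")
```

The first number should land close to 95%, while the second is far smaller, because the interval describes the uncertainty of the mean and not the spread of the data.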

<div id="population_vs_samples"></div>

## Population vs. Samples

Confidence interval describes the amount of uncertainty associated with a sample estimate of a population parameter. One needs to have a good understanding of the difference between samples and population to understand the necessity of delivering statistical estimations in a range, a.k.a. confidence interval.

<div id="fig1" class="row give-margin-inline-big-plot mobile_responsive_plot_full_width" style="
    margin-top: 10px;
">
    <div class="col"><img src="jupyter_images/conf_int_sample_vs_population.png"></div>
    <div class="col-12"><p class="image-description">Figure 1: Population vs samples</p></div>
</div>

**Population**: data set that contains all members of a specified group. Ex: ALL people living in the US.

**Samples**: data set that contains a part, or a subset, of a population. Ex: SOME people living in the US.

Let's say that you are conducting a phone-call survey to investigate society's perception of The Affordable Care Act (“Obamacare”). Since you can't call all 327.2 million people (*population*) in the US, you call about 1,000 people (*samples*). Your poll showed that 59% of the registered voters support Obamacare. This does not agree with the actual survey conducted in 2018: 53% favorable, 42% unfavorable (<a href="http://www.msnbc.com/rachel-maddow-show/poll-shows-support-obamacare-reaching-all-time-high" target="_blank">source</a>). What could be the source of error? 

Since former President Obama is a member of the Democratic Party, the voters' responses can be affected by their political preference. How could you tell whether the 1,000 people you called happened to be mostly Democrats, who are more likely to support Obama's policy because they share a similar political view? The samples you collected could have been *biased*, but you don't know that for sure. There will always be uncertainty involved with your estimation, because you don't have access to the entire population.  

Confidence interval is a technique that quantifies the uncertainty when estimating a population parameter from samples.
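For the poll described above, the uncertainty can be sketched with the usual normal-approximation interval for a proportion. The inputs (59% support among 1,000 respondents) come from the example; treating the respondents as a simple random sample is an assumption of this sketch.

```python
from statistics import NormalDist

n = 1000         # number of people polled (from the example above)
p_hat = 0.59     # observed share supporting the policy
confidence = 0.95

# z-score that leaves alpha/2 in each tail of the standard normal curve
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of a sample proportion
standard_error = (p_hat * (1 - p_hat) / n) ** 0.5

margin = z * standard_error
low, high = p_hat - margin, p_hat + margin

print(f"margin of error: +/- {margin:.3f}")
print(f"95% C.I. of proportion: ({low:.3f}, {high:.3f})")
```

Under these assumptions the interval runs from roughly 56% to 62%, which does not reach 53%. Random sampling variation alone is therefore an unlikely explanation for the gap, which is consistent with the suspicion that the sample was biased. Note that this formula only accounts for random sampling variation; it does not correct for a biased sample.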

<div class="alert alert-info">
    <h4>Population variance ($\sigma^2$) vs. Samples variance ($s^2$)</h4>
    <p>Distinction between population parameter and sample statistic is important. In statistics, it is a common practice to denote population variance as $\sigma^2$, and sample variance as $s^2$. The distinction is important because different equations are used for each.</p>
    <p>For population (with population mean $\mu$ and population size $N$):</p>
    <p><div style="font-size: 1rem; margin-top: 20px;">$$ \text{variance} = \sigma^2 = \frac{\sum(x - \mu)^2}{N} $$</div></p>
    <p>For samples:</p>
    <p><div style="font-size: 1rem; margin-top: 20px;">$$ \text{variance} = s^2 = \frac{\sum(x - \bar{x})^2}{n-1} $$</div></p>
    <p>The divisor $n-1$ is a correction factor for bias. Note that the correction has a larger proportional effect when $n$ is small than when it is large, which is what we want because the more samples we have, the better the estimation. This idea is well explained on this <a href="https://stats.stackexchange.com/questions/3931/intuitive-explanation-for-dividing-by-n-1-when-calculating-standard-deviation" target="_blank">StackExchange thread</a>.</p>
</div>
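The two divisors can be compared directly with Python's standard library, where `statistics.pvariance` divides by the number of values and `statistics.variance` divides by that number minus one. The data values below are the ones passed to `bayes_mvs` elsewhere in this article; the list of sample sizes is an arbitrary choice for illustration.

```python
import statistics

data = [1, 2, 3, 5, 6, 8, 5, 2]

pop_var = statistics.pvariance(data)   # divides the squared deviations by n
samp_var = statistics.variance(data)   # divides the squared deviations by n - 1

print(f"divide by n:     {pop_var:.4f}")
print(f"divide by n - 1: {samp_var:.4f}")

# The correction factor n / (n - 1) shrinks toward 1 as the sample grows
for n in (5, 50, 500):
    print(f"n = {n:>3}: n / (n - 1) = {n / (n - 1):.4f}")
```

For small samples the correction inflates the variance noticeably (25% for five values), while for several hundred values the two formulas are nearly identical.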

## Understanding Confidence Interval with analogy

If you've taken a science class with lab reports in high school or college, you probably had to include measurement error in your lab reports. For example, if you were asked to measure the length of a paper clip with a ruler, you have to include $\pm0.5 \,\text{cm}$ or $\pm0.05\,\text{cm}$ (depending on the spacing of tick marks) to account for the measurement error that shows the precision of your measuring tool. 

Based on <a href="#fig2">figure (2)</a>, the paper clip seems to be about 2.7 cm long, but we don't know for sure because the tick marks on the ruler are not fine enough to measure decimal lengths. However, I can tell with 100% confidence that the paper clip has a length between 2 and 3 cm, because the end of the clip falls between the 2 cm and 3 cm tick marks. You record the length of the paper clip in a *range*, instead of a *point estimate*, to account for the uncertainty introduced by the spacing of the tick marks.

<div id="fig2" class="row give-margin-inline-big-plot mobile_responsive_plot_full_width" style="margin-top: 15px;">
    <div class="col"><img src="jupyter_images/conf_int_ruler.png"></div>
    <div class="col-12"><p class="image-description">Figure 2: Measurement error in ruler</p></div>
</div>

A similar idea can be applied to a <a href="#">confidence interval of a sample mean</a>. You want to obtain the mean of a whole data set (*population*), but you can measure values of only a small fraction (*samples*) of the whole data set. This boils down to the traditional issue of <a href="#">Sample vs Population</a>, due to the cost of obtaining measurement data of a large data set. Uncertainty is introduced in your samples, because you don't know if your samples are 100% representative of the population, free of bias. Therefore, you deliver your conclusion in a range, not in a point estimate, to account for the uncertainty.

<div id="fig3" class="row" style="margin-top: 20px">
    <div class="col"><img src="jupyter_images/conf_int_dist_formula.png"></div>
    <div class="col-12"><p class="image-description">Figure 3: Confidence interval of sample mean</p></div>
</div>

In <a href="#fig3">figure (3)</a>, the calculated sample mean is $12.32$. This is a *point estimate*. Since you don't have access to the entire population data, you quantify the uncertainty related to the computed mean of $12.32$ using <a href="#">eq (???)</a>. You choose a 95% <a href="#">confidence level</a>, and obtain the uncertainty to be $\pm3.56$. Then, you deliver your conclusion as: **I am 95% confident that the population mean falls between 8.76 and 15.88 $\rightarrow$ (12.32 $\pm$ 3.56)**

The details of computing the confidence interval of a mean are described <a href="#">below</a>.

<div style="display: none">
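import matplotlib.pyplot as plt
from scipy import stats
from scipy.stats import truncnorm

def get_truncated_normal(mean=1, sd=1, low=0, upp=10):
    # truncnorm expects its bounds in standard-deviation units relative to loc
    return truncnorm(
        (low - mean) / sd, (upp - mean) / sd, loc=mean, scale=sd)


# Draw 10,000 values and sort them so the curve is plotted from left to right
X1 = sorted(get_truncated_normal(mean=5, sd=1, low=1, upp=10).rvs(10000))

# Fit a normal distribution and evaluate its density at the sampled points
mean, std = stats.norm.fit(X1, loc=0)
pdf_norm = stats.norm.pdf(X1, mean, std)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(X1, pdf_norm)
ax.fill_between(X1, 0, pdf_norm, facecolor='lightgrey')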
</div>


<div><hr></div>

Confidence interval has the following general form:

<div id="eq-1" style="font-size: 1rem;">$$ \text{C. I.} = \text{point estimate} \pm (\text{distribution score} \times \text{Standard Error}) \tag{1} $$</div>

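<div class="eq-terms">
    <div class="row eq-terms-where">where</div>
    <div class="row">
        <div class="col-3"><p>point estimate</p></div>
        <div class="col-9"><p>the statistic computed from the samples (Ex: sample mean $\bar{x}$, sample proportion $\hat{p}$)</p></div>
    </div>
    <div class="row">
        <div class="col-3"><p>distribution score</p></div>
        <div class="col-9"><p>the score of the assumed distribution at the chosen confidence level (Ex: $z_{\frac{\alpha}{2}}$ for a normal distribution)</p></div>
    </div>
    <div class="row">
        <div class="col-3"><p>Standard Error</p></div>
        <div class="col-9"><p>the estimated standard deviation of the statistic (Ex: $\sigma / \sqrt{n}$ for a mean)</p></div>
    </div>
</div>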


Decision makers always want to know the uncertainty related to your estimation. 

This tutorial explains the idea behind C.I., the complications that arise when sample data is not normally distributed, and solutions for them. For those who don't want to spend more than 2 minutes reading, I included quick Python code snippets that can generate a C.I. for any type of distribution and any type of statistic.

**Central Limit Theorem may not always apply**

Non-normal data, variance known: If the population distribution is not normal and the sample is 'large enough', then $\bar{X}$ is approximately normal and the same formula provides an approximate 95% CI. The rule that $n \geq 30$ is 'large enough' is unreliable here. If the population distribution is heavy-tailed, then $\bar{X}$ may not have a distribution that is close to normal (even if $n \geq 30$). The 'Central Limit Theorem' often provides reasonable approximations for moderate values of $n$, but it is a limit theorem, with guaranteed results only as $n \rightarrow \infty$.

Many statistical techniques assume that sample data is normally distributed or Gaussian-like (Ex: t-distribution), and different techniques should be considered for non-normal data set.
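The warning about heavy-tailed populations can be explored with a simulation that measures how often a nominal 95% interval actually contains the population mean when $n = 30$. This is only a sketch: the lognormal population, its shape parameter `SIGMA_LOG`, the number of trials, and the `coverage` helper are assumptions chosen for illustration, and the interval uses the sample standard deviation with a z-score.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def coverage(draw, true_mean, n=30, trials=2000, confidence=0.95, seed=0):
    """Fraction of z-based intervals (using sample s) that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        margin = z * stdev(sample) / math.sqrt(n)
        hits += abs(mean(sample) - true_mean) <= margin
    return hits / trials


SIGMA_LOG = 1.5  # hypothetical shape parameter; larger means a heavier tail

# Normal population with mean 0
normal_cov = coverage(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)

# Lognormal population; its mean is exp(mu + sigma^2 / 2) with mu = 0
lognormal_cov = coverage(
    lambda rng: rng.lognormvariate(0.0, SIGMA_LOG),
    true_mean=math.exp(SIGMA_LOG ** 2 / 2),
)

print(f"normal population:    {normal_cov:.1%} of intervals contain the mean")
print(f"lognormal population: {lognormal_cov:.1%} of intervals contain the mean")
```

With the normal population the coverage stays near the nominal level, while the heavy-tailed population falls short of it at the same sample size.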

### Extension of Confidence Interval with Bootstrapping

### Di

In [2]:
from scipy.stats import bayes_mvs

In [4]:
bayes_mvs([1,2,3,5,6,8,5,2])

(Mean(statistic=4.0, minmax=(2.39878883101482, 5.60121116898518)),
 Variance(statistic=8.0, minmax=(2.8435061229431486, 18.45571858443238)),
 Std_dev(statistic=2.691341356821504, minmax=(1.686269884372946, 4.296011939512317)))
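The `minmax` ranges printed above can be cross-checked against the closed-form formulas for the mean and the variance. One detail to keep in mind is that `bayes_mvs` uses `alpha=0.9` by default, so the ranges above are 90% intervals, not 95%. The sketch below assumes NumPy and SciPy are installed and uses a t-score for the mean because the population standard deviation is unknown.

```python
import numpy as np
from scipy import stats

data = np.array([1, 2, 3, 5, 6, 8, 5, 2])
n = data.size
confidence = 0.90          # matches the default alpha of bayes_mvs
alpha = 1 - confidence

x_bar = data.mean()
s2 = data.var(ddof=1)      # sample variance (divides by n - 1)

# Mean: point estimate +/- t-score * standard error
t_score = stats.t.ppf(1 - alpha / 2, df=n - 1)
margin = t_score * np.sqrt(s2 / n)
mean_ci = (x_bar - margin, x_bar + margin)

# Variance: the chi-square subscripts in the formula are right-tail areas,
# so the larger critical value goes in the denominator of the lower bound
chi2_large = stats.chi2.ppf(1 - alpha / 2, df=n - 1)
chi2_small = stats.chi2.ppf(alpha / 2, df=n - 1)
var_ci = ((n - 1) * s2 / chi2_large, (n - 1) * s2 / chi2_small)

# Standard deviation: square root of the variance bounds
std_ci = (np.sqrt(var_ci[0]), np.sqrt(var_ci[1]))

print("mean    :", mean_ci)
print("variance:", var_ci)
print("std_dev :", std_ci)
```

For this data set the three ranges agree with the `minmax` values reported by `bayes_mvs`. The `statistic` values differ for the variance and standard deviation, since `bayes_mvs` does not report the sample variance $s^2$ as its center.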


The selection of a confidence level for an interval determines the probability that the confidence interval produced will contain the true parameter value. Common choices for the confidence level C are 0.90, 0.95, and 0.99. These levels correspond to percentages of the area of the normal density curve. For example, a 95% confidence interval covers 95% of the normal curve -- the probability of observing a value outside of this area is less than 0.05.
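The link between a confidence level and the area under the normal curve can be made concrete by looking up the z-score for each of the common levels. This sketch uses `statistics.NormalDist` from the Python standard library; the dictionary name `z_scores` is just a label chosen for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

z_scores = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    # Leave alpha / 2 in each tail, so the central area equals the confidence level
    z_scores[confidence] = standard_normal.inv_cdf(1 - alpha / 2)

for confidence, z in z_scores.items():
    # Area between -z and +z, which should recover the confidence level
    area = standard_normal.cdf(z) - standard_normal.cdf(-z)
    print(f"C = {confidence:.2f}: z = {z:.3f}, central area = {area:.2f}")
```

The resulting z-scores are approximately 1.645, 1.960, and 2.576, which are the values used as the distribution score when the normal distribution applies.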


Note: This interval is only exact when the population distribution is normal. For large samples from other population distributions, the interval is approximately correct by the Central Limit Theorem.

<div class="alert alert-info">
    <h4>Notes: Confidence interval for non-normal distributions</h4>
    <p></p>
</div>


### What is confidence level $1 - \alpha$?

Confidence level is the long-run proportion of intervals, built with the same procedure, that contain the population parameter of your interest. Common choices of confidence level include 90%, 95%, and 99%.

<div id="fig6" class="row give-margin-inline-big-plot mobile_responsive_plot_full_width">
    <div class="col"><img src="jupyter_images/conf_int_norm_curve.png"></div>
    <div class="col-12"><p class="image-description">Figure ???: Normal curve of confidence interval</p></div>
</div>



A 95% confidence interval does not mean that 95% of the sample data lie within the interval. It also does not mean that the sample mean has a 95% chance of falling within the interval, because the sample mean is the center of the interval by construction. It means that if the sampling and the interval computation were repeated many times, about 95% of the resulting intervals would contain the population mean. 95% confidence interval relates to the reliability of the estimation procedure -> how reliable is your estimation of population mean?

**Pitfalls**



Prediction vs Confidence interval

Normal vs non-normal distribution

Standard Error

**Explain that the below plot is different from true population distribution plot**

<div id="fig6" class="row give-margin-inline-big-plot mobile_responsive_plot_full_width">
    <div class="col"><img src="jupyter_images/conf_int_5.png"></div>
    <div class="col-12"><p class="image-description">Figure ???: Measurement error in ruler</p></div>
</div>


### Convey your conclusion in a range, instead of point estimate


**Why do we need CI?**

We need to know confidence interval because we don't have the true representation of the population. We only have samples. 

<div class="row" id="fig1">
    <div class="col"><img src="jupyter_images/conf_int_2.png" style="margin: 0;"></div>
</div>


**How to compute?**

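A confidence interval is obtained by taking a point estimate (Ex: mean) and adding and subtracting a margin of error:

$$\text{statistic} \pm (\text{distribution score} \times \text{standard error})$$

The distribution score is the number of standard errors needed on each side of the point estimate to reach the chosen confidence level (Ex: about 1.96 for a 95% confidence level under a normal distribution).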

#### What is standard error?

Standard error is an estimate of the standard deviation of a statistic. It is used in computing the margin of error of a confidence interval.

#### Terminology

Statistic: statistical value of data. Ex: mean, standard deviation, kurtosis, IQR, skewness, proportion, etc.

Population: data set that contains all members of a specified group. Ex: ALL people living in the US. 

Sample: data set that contains a part, or a subset, of a population. Ex: SOME people living in the US.

Sample statistic: statistic based on a fraction of a population. Can be biased.

Population parameter: statistic based on the entire population. Is free of sampling bias.

$\sigma$: standard deviation of a population

$s$: estimate of population standard deviation ($\sigma$) from samples

### Confidence interval vs Prediction interval

<div class="row" id="fig1">
    <div class="col"><img src="jupyter_images/conf_int_6.png" style="margin: 0;"></div>
</div>



## Why do we care about an interval of a statistic?

### Sample vs Population

--> Standard Error vs Standard Deviation

"Statisticians use sample statistics to estimate population parameters. Naturally, the value of a statistic may vary from one sample to the next"

Unfortunately, the true values of population parameters are difficult to compute because of sampling limitations. For example, if the US Census Bureau wants to know the true average body weight of US male citizens in their 20's, they will have to measure the body weights of all 23 million males in the US (<a href="https://www.statista.com/statistics/241488/population-of-the-us-by-sex-and-age/" target="_blank">source</a>), which is unrealistic. So they sample a small fraction of the entire population, and use it to estimate a population parameter. For example, they take the body weights of 10,000 males and call it good for the estimation of the average body weight of all 23 million. 

Figure: 100 male person icons inside a circle; circle two of the people inside it and place them in a small circle outside.



**Sampling Bias**

### Notations

<div class="row" id="fig1">
    <div class="col"><img src="jupyter_images/conf_int_3.png" style="margin: 0;"></div>
</div>

## Practical Application - Bayes Approximation

Different equations for standard error

https://stattrek.com/estimation/standard-error.aspx


https://www.statsdirect.com/help/basic_descriptive_statistics/standard_deviation.htm

"When distributions are approximately normal, SD is a better measure of spread because it is less susceptible to sampling fluctuation than (semi-)interquartile range."

### Examples

Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different samples of the liquid. He calculates the sample mean to be 101.82. If he knows that the standard deviation for this procedure is 1.2 degrees, what is the confidence interval for the population mean at a 95% confidence level?

In other words, the student wishes to estimate the true mean boiling temperature of the liquid using the results of his measurements. If the measurements follow a normal distribution, then the sample mean will have the distribution $N(\mu, \sigma/\sqrt{n})$. Since the sample size is 6, the standard deviation of the sample mean is equal to $1.2/\sqrt{6} = 0.49$.
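The boiling-temperature example can be carried through to the interval itself with the z-based formula for a mean with known standard deviation. The readings and the standard deviation of 1.2 degrees come from the example; the variable names are arbitrary.

```python
from statistics import NormalDist, mean

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]  # degrees Celsius
sigma = 1.2          # known standard deviation of the procedure
confidence = 0.95

n = len(readings)
x_bar = mean(readings)

# Standard error of the mean when sigma is known
standard_error = sigma / n ** 0.5

# z-score leaving alpha / 2 in each tail of the standard normal curve
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * standard_error
low, high = x_bar - margin, x_bar + margin

print(f"sample mean    = {x_bar:.2f}")
print(f"standard error = {standard_error:.2f}")
print(f"95% C.I. of mean: ({low:.2f}, {high:.2f})")
```

With these inputs the margin of error is about 0.96 degrees, so the 95% confidence interval for the mean boiling temperature is roughly 100.86 to 102.78 degrees Celsius.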


## Smart Gas Lift

Gas lift - use external high-pressure gas when the well has gas in it, but not enough. Inject it through the casing and tubing, and lift the hydrocarbons up with the gas. High-pressure compression shortens the lifespan of the machine. 

Inject too little gas - you don't lift enough. It doesn't flow all the way.
Inject too much gas - you are going to lift it all, but you add extra friction, and you waste gas that you could sell.

GOR affects how much gas you need to inject.

Based on experience, the optimum injection rate always seems to be around 500, 600, or 700 mcfd. 

Key features
- self-optimization, maximizing runtime. We have 95% runtime, but others have 60-70% runtime.
- you can't manually determine BHP, but the algorithm does it automatically with iterations.



*Distribution score* depends on the type of distribution (Ex: normal, lognormal, chi-squared, Weibull), and the equation for *standard error* depends on the type of statistic (Ex: mean, proportion, std, variance). The sample data is assumed to be normally distributed, allowing you to use the <a href="#">z-score</a> to look up values related to any confidence level (Ex: 99%, 95%, 90%). Since you are trying to compute the confidence interval of a mean ($\overline{x}$), you use $SE_{\overline{x}} = s\,/\sqrt{N}$ to compute the standard error for a mean.