In this homework, you need to solve two tasks. For problems with theoretical calculations, show enough intermediate steps to explain how you obtained the result. 
* Formulas are expected in LaTeX style inside the notebook. The assignment should be uploaded in Jupyter Notebook format (`.ipynb`).

# Task 1. Martian weather. (20 points)

In this task you need to estimate parameters and their confidence intervals for a given sample. The data set describes Martian weather. For more visualizations take a look [here](https://pudding.cool/2018/01/mars-weather/).

First of all, you need some libraries. Also, fix the random seed to get reproducible results.

In [None]:
import numpy as np
from numpy.random import choice, seed
import pandas as pd
from scipy.stats import sem, norm, skew, chi2
import matplotlib.pyplot as plt


seed(366)

In [None]:
import matplotlib as mp
import matplotlib.font_manager

titlesize = 20
labelsize = 16
legendsize = labelsize
xticksize = 14
yticksize = xticksize

mp.rcParams['legend.markerscale'] = 1.5     # the relative size of legend markers vs. original
mp.rcParams['legend.handletextpad'] = 0.5
mp.rcParams['legend.labelspacing'] = 0.4    # the vertical space between the legend entries in fraction of fontsize
mp.rcParams['legend.borderpad'] = 0.5       # border whitespace in fontsize units
mp.rcParams['font.size'] = 12
mp.rcParams['font.family'] = 'serif'
mp.rcParams['font.serif'] = 'Times New Roman'
mp.rcParams['axes.labelsize'] = labelsize
mp.rcParams['axes.titlesize'] = titlesize
mp.rcParams['axes.unicode_minus'] = False

mp.rc('xtick', labelsize=xticksize)
mp.rc('ytick', labelsize=yticksize)
mp.rc('legend', fontsize=legendsize)

mp.rc('font', **{'family':'serif'})

## Part 1. Load data (1 point)

You need to load the data from `mars-weather.csv`, take the feature for your variant, remove missing values (`NaN`s) and convert the sample to `int` type. 

### a) 

`feature_name = "min_temp"`

### b)

`feature_name = "max_temp"`

In [None]:
# Your code here

feature_name = ...
df = pd.read_csv("mars-weather.csv")[feature_name]
sample_full = df.dropna().values.astype(int)  # np.int was removed in NumPy 1.24; use the builtin int

Let's take a quarter of a Martian year, about 168 sols (Martian days). Draw them at random from the full sample using the function `choice` (from `numpy.random`) with the parameter `replace=False`.

In [None]:
# Your code here

N = 168
sample_part = ...
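
A minimal sketch of the subsampling step, run on a synthetic stand-in rather than the real data. The array `toy_full` and its temperature range are made up for illustration; only `choice` with `replace=False` comes from the task.

```python
import numpy as np
from numpy.random import choice, seed

seed(366)

# Hypothetical stand-in for sample_full: 1000 made-up integer temperatures.
toy_full = np.random.randint(-90, -60, size=1000)

# Draw 168 observations without replacement, so no observation is picked twice.
toy_part = choice(toy_full, size=168, replace=False)
```

With `replace=False` each position of `toy_full` can be selected at most once, although equal temperature values may still appear several times because the full sample itself contains repeats.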

Plot the value frequencies for the full and partial samples using a bar plot.

In [None]:
# Your code here



## Part 2. Parameter estimation. (6 points)
Find estimates of the following parameters and their 95% confidence intervals (except for `mode` and `skewness`). Show the theoretical calculations for the estimates and intervals (with intermediate steps), then run the simulation.

### a) Mean and CI

$$ \hat{mean} = ...$$

$$ CI(\hat{mean}) = ...$$

In [None]:
# Your code here

mean, lower, upper = ...
f"Mean {mean:.3f} with confidence interval ({lower:.3f}, {upper:.3f})"
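
As a rough illustration of a normal-based interval for the mean, the sketch below computes $\bar{x} \pm z_{1-\alpha/2}\, s/\sqrt{n}$ on a tiny made-up array. The helper name `normal_mean_ci` and the data `toy` are hypothetical, and the theoretical derivation is still expected in the cell above.

```python
import numpy as np
from scipy.stats import norm, sem

def normal_mean_ci(x, alpha=0.05):
    """Sample mean with a normal-based (1 - alpha) confidence interval."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    se = sem(x)                    # s / sqrt(n), with ddof=1 by default
    z = norm.ppf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    return mean, mean - z * se, mean + z * se

# Hypothetical toy data, not the Martian sample.
toy = np.array([1, 2, 3, 4, 5])
toy_mean, toy_lower, toy_upper = normal_mean_ci(toy)
```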

### b) Median and CI

Here you can assume that the PDF is continuous at the median and provide a normal-based interval.

$$ \hat{median} = ...$$

$$ CI(\hat{median}) = ...$$

In [None]:
# Your code here

median, lower, upper = ...
f"Median {median:.3f} with normal-based confidence interval ({lower:.3f}, {upper:.3f})"
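
One possible sketch of a normal-based interval for the median uses the asymptotic standard error $1 / (2 f(m) \sqrt{n})$, where $f(m)$ is the density at the median. Estimating $f(m)$ with a Gaussian kernel density estimate is an assumption made here for illustration (other density estimates may suit integer-valued data better), and `normal_median_ci` and `toy` are hypothetical names.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

def normal_median_ci(x, alpha=0.05):
    """Sample median with an asymptotic normal (1 - alpha) interval."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    # Assumption: the density at the median is estimated with a Gaussian KDE.
    density_at_median = gaussian_kde(x)([med])[0]
    se = 1.0 / (2.0 * density_at_median * np.sqrt(x.size))
    z = norm.ppf(1 - alpha / 2)
    return med, med - z * se, med + z * se

# Hypothetical continuous toy data, not the Martian sample.
rng = np.random.default_rng(0)
toy = rng.normal(loc=0.0, scale=1.0, size=2000)
toy_median, toy_lower, toy_upper = normal_median_ci(toy)
```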

### c) Variance and CI

Here you can assume that the sample comes from a normal distribution with unknown mean and variance.


$$ \hat{Variance} = ...$$

$$ CI(\hat{Variance}) = ...$$

In [None]:
# Your code here

var, lower, upper = ...
f"Variance {var:.3f} with confidence interval ({lower:.3f}, {upper:.3f})"
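
Under the normality assumption, $(n-1) s^2 / \sigma^2$ follows a $\chi^2_{n-1}$ distribution, which gives the interval sketched below. The helper `chi2_variance_ci` and the array `toy` are hypothetical names used only for this illustration.

```python
import numpy as np
from scipy.stats import chi2

def chi2_variance_ci(x, alpha=0.05):
    """Unbiased sample variance with a chi-square (1 - alpha) interval."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s2 = x.var(ddof=1)
    # The upper chi-square quantile gives the lower bound, and vice versa.
    lower = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, df=n - 1)
    upper = (n - 1) * s2 / chi2.ppf(alpha / 2, df=n - 1)
    return s2, lower, upper

# Hypothetical toy data, not the Martian sample.
toy = np.array([1, 2, 3, 4, 5])
toy_var, toy_lower, toy_upper = chi2_variance_ci(toy)
```

Note that the interval is not symmetric around the estimate, because the chi-square distribution is skewed.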

### d) Mode (most frequent value in a sample)

In [None]:
# Your code here

mode = ...
f"Mode: {mode}"

### e) Skewness

In [None]:
# Your code here

skewness = ...
f"Skewness: {skewness:e}"
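
A short sketch of both descriptive statistics on made-up values. The array `toy` is hypothetical; `np.unique` with `return_counts=True` is one of several ways to find the mode, and `skew` is used with its default (biased) estimator.

```python
import numpy as np
from scipy.stats import skew

# Hypothetical toy temperatures, not the Martian sample.
toy = np.array([-80, -78, -78, -75, -60])

# Mode: the value with the largest count (ties resolve to the smallest value).
values, counts = np.unique(toy, return_counts=True)
toy_mode = values[np.argmax(counts)]

# Skewness: positive here because of the long right tail towards -60.
toy_skewness = skew(toy)
```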

## Part 3. Bootstrap (4 points)

Find confidence intervals for the following estimates using bootstrap. Use the function `choice` with the parameter `replace=True` for bootstrap sampling. Try different numbers of generated samples.
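
The resampling loop can be sketched generically as below. This version builds a percentile interval, which is only one of several bootstrap interval types; the helper `bootstrap_ci`, the synthetic array `toy`, and the default `n_boot=1000` are assumptions made for illustration.

```python
import numpy as np
from numpy.random import choice, seed

def bootstrap_ci(x, stat, n_boot=1000, alpha=0.05):
    """Point estimate and percentile bootstrap (1 - alpha) interval for stat."""
    x = np.asarray(x)
    # Each replicate recomputes the statistic on a resample of the same size.
    reps = np.array([
        stat(choice(x, size=x.size, replace=True)) for _ in range(n_boot)
    ])
    lower, upper = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return stat(x), lower, upper

seed(0)
# Hypothetical toy data standing in for sample_part.
toy = np.random.normal(loc=-75, scale=5, size=168)
toy_mean, toy_lower, toy_upper = bootstrap_ci(toy, np.mean, n_boot=500)
```

The same helper can be called with `np.median` or `np.var` as `stat`, and rerun with different `n_boot` values to see how the interval stabilizes.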

### a) Mean and CI

In [None]:
# Your code here

mean, lower, upper = ...
f"Mean {mean:.3f} with confidence interval ({lower:.3f}, {upper:.3f})"

### b) Median and CI

In [None]:
# Your code here

median, lower, upper = ...
f"Median {median:.3f} with normal-based confidence interval ({lower:.3f}, {upper:.3f})"

### c) Variance and CI

In [None]:
# Your code here

var, lower, upper = ...
f"Variance {var:.3f} with confidence interval ({lower:.3f}, {upper:.3f})"

## Part 4. Comparison with true values. (1 point)

Compare your results with the estimates calculated over the full sample. Write a short conclusion about the estimates and their confidence intervals obtained without and with bootstrap. Also, you can share some conclusions about Martian weather :)

In [None]:
# Your code here

mean = ...
median = ...
var = ...
mode = ...
skewness = ...

## Part 5. Confidence intervals and sample size. (8 points)

Compare the widths of the confidence intervals **for the mean** obtained without and with bootstrap. Additionally, compare the empirical coverage of the different confidence intervals (by generating a sufficient number of samples of the corresponding size and calculating the proportion of cases in which the interval covers the mean of the full sample). Consider sizes `[42, 84, 168, 335, 670, 1340]`. Plot the results and draw conclusions from them.
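
Empirical coverage can be estimated with a loop like the one sketched here for the normal-based interval only. The synthetic `population`, the reduced list of sizes, and `n_rep=300` are assumptions chosen to keep the example fast; they are not the values required by the task.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(476)

# Hypothetical stand-in for the full sample of integer temperatures.
population = rng.normal(-76, 5, size=1800).round()
true_mean = population.mean()
z = norm.ppf(0.975)

def coverage(size, n_rep=300):
    """Share of normal-based 95% intervals that contain the population mean."""
    hits = 0
    for _ in range(n_rep):
        s = rng.choice(population, size=size, replace=False)
        half_width = z * s.std(ddof=1) / np.sqrt(size)
        hits += abs(s.mean() - true_mean) <= half_width
    return hits / n_rep

toy_coverage = {n: coverage(n) for n in (42, 168)}
```

Because the subsamples are drawn without replacement from a finite population, coverage tends to rise above the nominal level as the sample size approaches the population size.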

In [None]:
seed(476)
sizes = [42, 84, 168, 335, 670, 1340]


In [None]:
# Compare size of confidence intervals obtained without and with bootstrap
# Your code here



In [None]:
# Plot results
# Your code here



In [None]:
# Compare the empirical coverage of different confidence intervals
# Your code here



In [None]:
# Plot results
# Your code here



Your conclusion:

# Task 2. Current in an RC circuit. (25 points)

In this part you need to estimate parameters and apply the delta method and bootstrap. 

First of all, you need some libraries. Also, fix random seed to get reproducible results.

In [None]:
import numpy as np
from scipy.stats import norm, uniform
from numpy.random import choice, seed

seed(100)

## Part 1. Estimate parameters. (4 points)

Assume that there is an RC-circuit with a capacitor and a resistor. 
<img src="circuit.png" width="200"  class="center">

We charge the capacitor until it reaches voltage $V$ and then measure the current. During discharge, the voltage across the capacitor follows the exponential decay law:

$$ V_C(t) = V e^{-\frac{t}{RC}} $$

Let's assume that voltage $V$ and resistance $R$ are **independent** and follow these distributions:

### a) 
$V \sim N(\mu = 5, \sigma = 1)$, 

$R \sim \text{Uniform}(a = 5, b = 10)$

### b)

$V \sim N(\mu = 15, \sigma = 3)$, 

$R \sim \text{Exp}(\lambda = 0.1)$

Consider **true values** $\bar{V}$ and $\bar{R}$ for $V$ and $R$ to be the means of the corresponding distributions.

Generate a sample of size 100 for $V$. Apply maximum likelihood to estimate the mean. Show the theoretical calculations for the estimate (with intermediate steps) and run the simulation.

$$\hat{V}_{n} = ...$$

In [None]:
# Your code here


Generate a sample of size 100 for $R$. Apply maximum likelihood to estimate the mean. Show the theoretical calculations for the estimate (with intermediate steps) and run the simulation.

$$\hat{R}_{n} = ...$$

In [None]:
# Your code here
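
A possible sketch of the simulation for variant a). It assumes the standard results that the maximum likelihood estimate of a normal mean is the sample mean, and that for $\text{Uniform}(a, b)$ the estimates of $a$ and $b$ are the sample minimum and maximum, so by invariance the mean is estimated by their midpoint. The names `toy_v`, `toy_r`, `v_hat`, `r_hat` are hypothetical, and the derivation itself still has to be written out above.

```python
import numpy as np

rng = np.random.default_rng(100)
n = 100

# Variant a): V ~ N(5, 1), R ~ Uniform(5, 10), drawn independently.
toy_v = rng.normal(loc=5, scale=1, size=n)
toy_r = rng.uniform(low=5, high=10, size=n)

# Normal mean: the likelihood is maximized by the sample mean.
v_hat = toy_v.mean()

# Uniform mean: plug the estimates of a and b (min and max) into (a + b) / 2.
r_hat = (toy_r.min() + toy_r.max()) / 2
```

For variant b), the mean of an exponential distribution is $1/\lambda$ and its maximum likelihood estimate is again the sample mean.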


## Part 2. Apply delta method. (8 points)

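Assume that we measure the current at $t=1$ second and take $C = 1$. Since the current through the resistor is $I(t) = V_C(t)/R$, we get the following simplified formula, and the true value $\bar{I}$ is obtained by plugging in $\bar{V}$ and $\bar{R}$:

$$ I = \frac{V}{R} e^{-\frac{1}{R}} $$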

Find an estimate for the current and its confidence interval. Show the theoretical calculations for the estimates (with intermediate steps) and run the simulation.

$$\hat{I}_n = ...$$

$$ CI(\hat{I}_n) = ...$$

In [None]:
# Your code here

se = ...
f"SE for delta method: {se:e}"
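
The delta method for $g(V, R) = \frac{V}{R} e^{-1/R}$ can be sketched as follows, using $\partial g / \partial V = e^{-1/R} / R$ and $\partial g / \partial R = V e^{-1/R} (1 - R) / R^3$. For simplicity the sketch assumes both parameters are estimated by sample means with standard errors $s/\sqrt{n}$; if a different estimator is used for $R$, its variance must replace `var_r`. All function and variable names are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def current(v, r):
    """Current at t = 1 with C = 1: I = V / R * exp(-1 / R)."""
    return v / r * np.exp(-1.0 / r)

def delta_method_ci(v, r, alpha=0.05):
    """Plug-in estimate of I, its delta-method SE, and a normal interval."""
    v_hat, r_hat = np.mean(v), np.mean(r)
    i_hat = current(v_hat, r_hat)
    # Partial derivatives of g evaluated at the estimates.
    dg_dv = np.exp(-1.0 / r_hat) / r_hat
    dg_dr = v_hat * np.exp(-1.0 / r_hat) * (1.0 - r_hat) / r_hat**3
    # Assumption: both estimators are sample means, so Var = s^2 / n.
    var_v = np.var(v, ddof=1) / len(v)
    var_r = np.var(r, ddof=1) / len(r)
    # V and R are independent, so there is no covariance term.
    se = np.sqrt(dg_dv**2 * var_v + dg_dr**2 * var_r)
    z = norm.ppf(1 - alpha / 2)
    return i_hat, se, (i_hat - z * se, i_hat + z * se)

rng = np.random.default_rng(100)
toy_v = rng.normal(5, 1, size=100)      # variant a), hypothetical draw
toy_r = rng.uniform(5, 10, size=100)
toy_i, toy_se, (toy_lower, toy_upper) = delta_method_ci(toy_v, toy_r)
```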

## Part 3. Bootstrap estimation. (2 points)

Estimate the confidence interval for $I$ using bootstrap.

In [None]:
# Your code here

se = ...
f"SE for non-parametric bootstrap: {se:e}"
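
A non-parametric bootstrap for $I$ might look like the sketch below, which resamples $V$ and $R$ independently and recomputes the plug-in estimate each time. It assumes sample means as the estimators and a percentile interval; the helper `bootstrap_current`, the toy samples, and `n_boot=1000` are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(100)
toy_v = rng.normal(5, 1, size=100)      # variant a), hypothetical draw
toy_r = rng.uniform(5, 10, size=100)

def bootstrap_current(v, r, n_boot=1000, alpha=0.05):
    """Bootstrap SE and percentile (1 - alpha) interval for I at t = 1, C = 1."""
    reps = np.empty(n_boot)
    for b in range(n_boot):
        # V and R are independent, so each sample is resampled separately.
        v_mean = rng.choice(v, size=len(v), replace=True).mean()
        r_mean = rng.choice(r, size=len(r), replace=True).mean()
        reps[b] = v_mean / r_mean * np.exp(-1.0 / r_mean)
    se = reps.std(ddof=1)
    lower, upper = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return se, (lower, upper)

toy_se, (toy_lower, toy_upper) = bootstrap_current(toy_v, toy_r)
```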

## Part 4. Compare results. (8 points)

Compare the widths of the confidence intervals obtained using the delta method and bootstrap. Additionally, compare the empirical coverage of the different confidence intervals (by generating a sufficient number of samples of the corresponding size and calculating the proportion of cases in which the interval covers the true value of the current $I$). Consider sizes `[1e1, ..., 1e4]`. Plot the results and draw conclusions from them.

In [None]:
sizes = np.logspace(1, 4, 4).astype(int)  # np.int was removed in NumPy 1.24; use the builtin int

In [None]:
# Compare size of confidence intervals obtained using delta method and bootstrap
# Your code here



In [None]:
# Plot results
# Your code here



In [None]:
# Compare the empirical coverage of different confidence intervals
# Your code here



In [None]:
# Plot results
# Your code here



Your conclusion: