# Solar Flare Confidence Interval Exercise

In this notebook, you will examine data about solar flares and use confidence intervals to estimate the average duration of a solar flare. You will also see how non-normally distributed data affects confidence intervals calculated with functions like `confint`.

## [From Kaggle](https://www.kaggle.com/datasets/khsamaha/solar-flares-rhessi)

Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI, originally High Energy Solar Spectroscopic Imager or HESSI) is a NASA solar flare observatory. It is the sixth mission in the Small Explorer program, selected in October 1997 and launched on 5 February 2002. Its primary mission is to explore the physics of particle acceleration and energy release in solar flares.

HESSI was renamed to RHESSI on 29 March 2002 in honor of Reuven Ramaty, a pioneer in the area of high energy solar physics. RHESSI is the first space mission named after a NASA scientist. RHESSI was built by Spectrum Astro for Goddard Space Flight Center and is operated by the Space Sciences Laboratory in Berkeley, California. The principal investigator from 2002 to 2012 was Robert Lin, who was succeeded by Säm Krucker.

**To learn more about this dataset, be sure to click on the link above and read about the contents of the data.**

## Be sure to run this code first!

suppressPackageStartupMessages({
    library(coursekata)
    library(lubridate)
    library(dplyr)
})

options(repr.plot.width=12, repr.plot.height=9)

solar_flares <- read.csv("https://raw.githubusercontent.com/DTS-Hudson-Harper/DTS-Stats-23-24/main/Chapter%206/hessi.solar.flare.UP_To_2018.csv") %>%
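  # parse timestamps as UTC so daylight-saving gaps in the local time zone cannot produce NA
  mutate(start_datetime = as.POSIXct(paste(start.date, start.time), format="%Y-%m-%d %H:%M:%S", tz="UTC")) %>%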
  arrange(start_datetime) %>%
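  # force seconds; plain subtraction lets R choose the difftime units automatically
  mutate(s.to.next.flare = as.integer(difftime(lead(start_datetime), start_datetime, units = "secs"))) %>%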
  sample(10000)

## Instructions

Calculate a 95% confidence interval for the average duration of solar flares, *duration.s*, in *solar_flares* using both bootstrapping and `confint`, then compare your results. For the bootstrapped interval, find the lower and upper bounds from your bootstrapped sampling distribution. You can do this using the `quantile` function, where *sdob0* is the data frame holding your bootstrapped estimates:

```quantile(sdob0$b0, c(0.025,0.975))```
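As a rough illustration of both approaches side by side, here is a minimal sketch in base R. It runs on simulated, skewed stand-in values rather than the flare data, so the numbers it prints mean nothing for this exercise. The names `durations`, `boot_means`, `boot_ci`, and `model_ci` are hypothetical; `sdob0` and its `b0` column are built by hand here only to mirror the call above.

```r
# Simulated stand-in for a skewed measurement (made-up values, NOT the RHESSI data).
set.seed(42)
durations <- rexp(500, rate = 1 / 400)

# Bootstrap: resample with replacement and record the mean of each resample.
# In CourseKata this step is usually written along the lines of
#   sdob0 <- do(1000) * b0(duration.s ~ NULL, data = resample(solar_flares))
boot_means <- replicate(1000, mean(sample(durations, replace = TRUE)))
sdob0 <- data.frame(b0 = boot_means)

# Percentile interval: the middle 95% of the bootstrapped means.
boot_ci <- quantile(sdob0$b0, c(0.025, 0.975))

# Model-based interval from the empty model, whose intercept is the mean.
# (`~ 1` is the base R spelling of the `~ NULL` formula used in CourseKata.)
model_ci <- confint(lm(durations ~ 1))

boot_ci
model_ci
```

In the notebook, the same pattern would apply to *solar_flares* instead of the simulated vector.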

Now, do the same for the average time between solar flares in seconds, *s.to.next.flare*.

## Question

How do your confidence intervals for these two variables differ? What does this tell us about using bootstrapping versus functions like `confint`?

## Challenge

Use this data to recreate Hudson's demo of 20 confidence intervals based on multiple resamples of the original data.
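
One possible starting point, sketched in base R on simulated stand-in values: draw 20 resamples, compute a 95% interval from each, and plot the intervals against the mean of the original data. This is only a guess at the general shape of such a demo, not a reproduction of it. The names `flare_like`, `target`, and `intervals` are hypothetical.

```r
# Simulated stand-in for the original data (made-up values, NOT the RHESSI data).
set.seed(7)
flare_like <- rexp(2000, rate = 1 / 400)
target <- mean(flare_like)  # mean of the original data, used as the reference line

# Draw 20 resamples (same size, with replacement) and compute a 95% interval from each.
bounds <- t(replicate(20, {
  s <- sample(flare_like, replace = TRUE)
  as.numeric(confint(lm(s ~ 1)))
}))
intervals <- data.frame(lower = bounds[, 1], upper = bounds[, 2])
intervals$covers <- intervals$lower <= target & target <= intervals$upper

# Draw each interval as a horizontal segment; intervals that miss the reference are red.
plot(NULL, xlim = range(intervals$lower, intervals$upper), ylim = c(1, 20),
     xlab = "mean (s)", ylab = "resample")
segments(intervals$lower, 1:20, intervals$upper, 1:20,
         col = ifelse(intervals$covers, "black", "red"))
abline(v = target, lty = 2)
```

Swapping the simulated vector for a column of *solar_flares* (with missing values removed) would adapt the sketch to the real data.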