# Python essentials and understanding measurement uncertainty

## Objectives of this exercise

1. **Review essential Python skills for data analysis**  
   - Reading and manipulating data files  
   - Indexing, type conversion, and plotting with `numpy` and `matplotlib`
2. **Work with real astronomical data**  
   - Explore light curves from a real astronomical survey  
   - Visualize variability and basic descriptive statistics (mean, standard deviation, min, max)
3. **Understand uncertainty in measurements**  
   - Learn the difference between **standard deviation** and **standard error**  
   - Practice calculating the uncertainty on simple and composite quantities (mean, difference between 2 measurements affected by uncertainties)
4. **Develop problem-solving skills**  
   - Apply a structured methodology to clean, analyze, and summarize data  
   - Prepare for future lectures on statistical inference and error propagation

## Why this matters

In modern astrophysics and data science, the ability to **read, clean, and analyze real observational data** is essential. These skills form the foundation for everything from detecting exoplanets to studying variable stars and gravitational lenses. By learning how to handle data and calculate uncertainties correctly, you are preparing for more advanced tasks such as building models, interpreting results, and contributing to cutting-edge research. This exercise is not just about coding: it is about developing the mindset and tools needed to turn raw measurements into scientific insight.

## Exercise

Read the data file `Variability_Catalina_list1.csv`, which contains the light curves of a set of objects observed by the Catalina survey (http://nesssi.cacr.caltech.edu/DataRelease/).

This file contains the following columns, separated by tab characters: `InputID	ID	Mag	Magerr	RA	Decl	MJD	Blend`. Their meanings are:
* [0] `InputID`: Object Name 
* [1] `ID`: Object ID in the survey
* [2] `Mag`: Object magnitude ($m = -2.5 \log_{10}(\mathrm{Flux}) + \mathrm{zeropoint}$)
* [3] `Magerr`: Formal error on the magnitude
* [4] `RA`: Right ascension  (degrees)
* [5] `Decl`: Declination  (degrees)
* [6] `MJD`: Modified Julian Date (days)
* [7] `Blend`: 0 if the measurement is clean, 1 if there is possible contamination by another target

The file contains the observed magnitudes of the following (gravitationally lensed) quasars: 'DESJ0407-5006', 'HE1104-1805', 'HS2209+1914', 'J0011-0845', 'J0228+3953', 'Q1355-2257', 'SDSSJ0904+1512'. 
You want to visualise the light curve (x=`MJD`, y=`Mag`, yerr=`Magerr`) associated with each individual object. You also want to measure its mean magnitude, the standard deviation of the magnitude, and its minimum and maximum magnitude over the period of observation, and save those values into a table.
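
As an illustration of the kind of plot described above, here is a minimal sketch using `matplotlib`'s `errorbar`. The arrays `mjd`, `mag`, and `magerr` hold made-up values, not Catalina data, and inverting the y-axis (so that brighter, i.e. numerically smaller, magnitudes appear higher) is a common astronomical convention rather than a requirement of this exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up light curve (NOT real Catalina measurements)
mjd = np.array([53500.0, 53530.0, 53561.0, 53590.0, 53620.0])
mag = np.array([15.42, 15.38, 15.47, 15.40, 15.44])
magerr = np.array([0.05, 0.06, 0.05, 0.07, 0.05])

fig, ax = plt.subplots(figsize=(8, 4))
# fmt="o" draws markers only, without a line joining the points
ax.errorbar(mjd, mag, yerr=magerr, fmt="o", capsize=2, label="synthetic object")
ax.invert_yaxis()  # smaller magnitude = brighter, so show it towards the top
ax.set_xlabel("MJD (days)")
ax.set_ylabel("Mag")
ax.legend()
```

In a notebook the figure is displayed automatically; in a plain script you would typically finish with `plt.show()`.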

Based on this, how would you proceed to calculate the following quantities?
- Standard error on the mean magnitude
- Standard error on the amplitude of variability of the object  
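
As a reminder, and assuming $N$ independent measurements $m_i$ with mean $\bar{m}$, these quantities are commonly estimated as sketched below. The symbols $\sigma_{\max}$ and $\sigma_{\min}$ are notation introduced here for illustration: they denote the formal errors (`Magerr`) of the data points with the maximum and minimum magnitude.

$$
\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(m_i - \bar{m}\right)^2},
\qquad
\mathrm{SE}(\bar{m}) = \frac{\sigma}{\sqrt{N}}
$$

$$
\Delta m = m_{\max} - m_{\min},
\qquad
\sigma_{\Delta m} = \sqrt{\sigma_{\max}^2 + \sigma_{\min}^2}
$$

The second line treats the two extreme points as ordinary independent measurements, which is a simplification. Note also that `np.std` divides by $N$ by default (`ddof=0`); passing `ddof=1` gives the $N-1$ form written above.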

Try to reach this goal using only standard Python commands together with `numpy` arrays and `matplotlib`. To ease your task, a possible methodology is outlined below.

**Possible methodology:**
- (1) Read the file and save the output into a numpy array. How many rows and columns does the array contain? What is the dtype of your array?
- (2) Create a sub-array for a single object, HS2209+1914, using fancy indexing (for example, a boolean mask on the `InputID` column).
- (3) Create an array with 4 columns, `MJD, Mag, Magerr, Blend`, converting strings to floats.
- (4) Get rid of data points with `Blend` > 0.
- (5) Plot the light curve on screen, with error bars on the data points.
- (6) Calculate the mean, std, min, and max of the light curve with numpy.
- (7) Calculate the standard error on the mean magnitude, and on the maximum amplitude of variation (i.e. the difference between the maximum and minimum magnitude). How does the standard error on the mean compare to the mean uncertainty on the data points? 
 Post your result on the form provided by the teacher. 
- (8) Repeat the operation for the 7 objects using a `for` loop: for example, you can create a list of arrays, each one containing `MJD, Mag, Magerr` for one object. You can generate the plot within the same loop. To plot into a single figure, consider using `plt.subplots(nrows=7, figsize=(15, 20))`.

- BONUS:    
  * (a) How would you find the names of the individual objects if the list had not been provided?
  * (b) Display the mean and the uncertainty on the mean on the figure.
  * (c) What is the formula for the standard deviation? What does the standard deviation represent? 
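
Steps (2) to (4), (6), and (7) of the methodology can be sketched on a tiny made-up table, as below. This is only one possible approach, not the expected solution: the object names `ObjA` and `ObjB` and all values are invented, and the error on the amplitude is taken from the formal errors of the two extreme points, assumed independent.

```python
import numpy as np

# Tiny made-up table with the same 8 columns as the file (values are NOT real data)
# Columns: InputID, ID, Mag, Magerr, RA, Decl, MJD, Blend
data = np.array([
    ["ObjA", "1001", "15.40", "0.05", "10.0", "-5.0", "53500.0", "0"],
    ["ObjA", "1001", "15.50", "0.06", "10.0", "-5.0", "53530.0", "0"],
    ["ObjB", "1002", "17.10", "0.10", "20.0", "12.0", "53500.0", "0"],
    ["ObjA", "1001", "15.90", "0.20", "10.0", "-5.0", "53561.0", "1"],
    ["ObjA", "1001", "15.30", "0.05", "10.0", "-5.0", "53590.0", "0"],
    ["ObjA", "1001", "15.40", "0.04", "10.0", "-5.0", "53620.0", "0"],
])

# Step 2: a boolean mask on the name column selects one object
obj = data[data[:, 0] == "ObjA"]

# Step 3: keep MJD, Mag, Magerr, Blend (columns 6, 2, 3, 7) and convert to float
lc = obj[:, [6, 2, 3, 7]].astype(float)

# Step 4: discard possibly blended points (Blend > 0)
lc = lc[lc[:, 3] == 0]
mjd, mag, magerr = lc[:, 0], lc[:, 1], lc[:, 2]

# Step 6: descriptive statistics
n = mag.size
mean, std = mag.mean(), mag.std(ddof=1)  # ddof=1: sample standard deviation
mag_min, mag_max = mag.min(), mag.max()

# Step 7: standard error on the mean, and on the amplitude (max - min);
# the latter combines the formal errors of the two extreme points in quadrature
se_mean = std / np.sqrt(n)
amplitude = mag_max - mag_min
se_amplitude = np.hypot(magerr[mag.argmax()], magerr[mag.argmin()])

print(f"N={n} mean={mean:.3f} std={std:.3f} SE(mean)={se_mean:.3f}")
print(f"amplitude={amplitude:.2f} +/- {se_amplitude:.3f}")
print(f"mean formal error on the points: {magerr.mean():.3f}")
```

Comparing `se_mean` with `magerr.mean()` is one way to approach the question asked in step (7).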

**Tips:** 

* Each row of the file mixes strings and floats, whereas a numpy array holds a single dtype. You may therefore first read everything into an array of strings. One of the arguments of `np.loadtxt()` is the data type, so you can read the file as strings with `np.loadtxt(filename, dtype=str)`.
* To convert an array of numbers stored as strings into floats: 
```python
import numpy as np
myarray_strings = np.array(['1', '2', '3'])
myarray_floats = np.array(myarray_strings, dtype=float)  # same result as myarray_strings.astype(float)
```
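
To try these tips without the real file, the sketch below reads a made-up tab-separated text from memory with `np.loadtxt`, lists the object names with `np.unique`, and builds one float array per object in a `for` loop. The header line, the names `ObjA` and `ObjB`, and all values are assumptions made for illustration; check whether the actual file has a header before using `skiprows=1`.

```python
import io
import numpy as np

# Made-up tab-separated text standing in for the real file (header + 4 rows)
text = (
    "InputID\tID\tMag\tMagerr\tRA\tDecl\tMJD\tBlend\n"
    "ObjA\t1001\t15.40\t0.05\t10.0\t-5.0\t53500.0\t0\n"
    "ObjB\t1002\t17.10\t0.10\t20.0\t12.0\t53500.0\t0\n"
    "ObjA\t1001\t15.50\t0.06\t10.0\t-5.0\t53530.0\t1\n"
    "ObjB\t1002\t17.20\t0.10\t20.0\t12.0\t53530.0\t0\n"
)

# dtype=str keeps every field as a string; skiprows=1 drops the header line
table = np.loadtxt(io.StringIO(text), dtype=str, delimiter="\t", skiprows=1)
print(table.shape, table.dtype)

# Unique object names, found without knowing them in advance
names = np.unique(table[:, 0])

# One float array (MJD, Mag, Magerr) per object, collected in a list
lightcurves = []
for name in names:
    rows = table[table[:, 0] == name]
    lightcurves.append(rows[:, [6, 2, 3]].astype(float))
```

With the real data, `io.StringIO(text)` would simply be replaced by the file name.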