# Homework 2

- **Subject:** Computational Physics I
- **Deadline:** Thursday 14 July 2022 (until 7pm)
- **Credits:** 20 points
- **Number of problems:** 4
- **Type of evaluation:** Formative Evaluation

Please complete the following problems by **7pm Thursday 14 July 2022**.

- You can work individually or in groups of maximum 2 people.


- Please include your names in the notebook, and name it using your surname/s, e.g. **hw2_surname1_surname2.ipynb** if you work in pairs, or **hw2_surname.ipynb** if you work individually.


- Please send **a single python notebook** with your solutions.


- Submission only via email to these **2** addresses: **nardy.sallo@yachaytech.edu.ec** with cc to **wbanda@yachaytech.edu.ec**


- Make sure you email us the correct file as **late assignments won't be accepted.**

# 1. (5 points) Data I/O and analysis: climate time series

This problem consists of reading, displaying, and analysing climate data. The data file for this exercise was downloaded from: https://ourworldindata.org/ and contains the following information:


- Annual $\rm CO_2$ emission per capita (in tons), see file: **co-emissions-per-capita.csv**


Write Python functions that: 

(a) Read in the **co-emissions-per-capita.csv** file, and place the data into a pandas data frame. Inspect the data table to check how it is organised.


(b) Create two new objects (data frames) that contain only the information (rows) for Ecuador and Colombia. **Hint:** Use the **data.loc** indexer to locate the respective rows based on the country name.


(c) Inspect the data, compare the time ranges (in years), and adjust the data frames (if necessary) so that each object/frame contains information for the same number of years. Hint: Use conditionals to select only the overlapping time periods.
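As an illustration of the hints in (b) and (c), here is a minimal sketch on a small made-up table. The column names `Entity`, `Year` and `CO2 per capita` are assumptions and should be checked against the actual CSV header; the numbers are invented.

```python
import pandas as pd

# Toy stand-in for co-emissions-per-capita.csv (assumed column names, invented values).
data = pd.DataFrame({
    "Entity": ["Ecuador"] * 4 + ["Colombia"] * 3,
    "Year": [2000, 2001, 2002, 2003, 2001, 2002, 2003],
    "CO2 per capita": [1.6, 1.7, 1.8, 1.9, 1.4, 1.4, 1.5],
})

# (b) A boolean mask inside .loc keeps only the rows of one country.
ecuador = data.loc[data["Entity"] == "Ecuador"]
colombia = data.loc[data["Entity"] == "Colombia"]

# (c) The overlapping period runs from the latest start year to the earliest end year.
first = max(ecuador["Year"].min(), colombia["Year"].min())
last = min(ecuador["Year"].max(), colombia["Year"].max())

# Keep only the years inside the overlap and renumber the rows from zero.
ecuador = ecuador.loc[ecuador["Year"].between(first, last)].reset_index(drop=True)
colombia = colombia.loc[colombia["Year"].between(first, last)].reset_index(drop=True)

print(first, last, len(ecuador), len(colombia))
```

With real data, the toy frame would be replaced by `pd.read_csv(...)`; the masking logic stays the same.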


(d) Now that both frames cover the same time period in years, let us calculate the average and standard deviation of the annual $\rm CO_2$ emission per capita (in tons) of both countries. Create and export (in CSV format) a new pandas data frame that contains five columns:

**Year, $\rm CO_2$ emission in Ecuador, $\rm CO_2$ emission in Colombia, average of the two countries, and standard deviation of the two countries.**
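One possible way to build such a table is sketched below on invented numbers. The column labels are placeholders, and the choice `ddof=0` (population standard deviation) is an assumption: pandas defaults to `ddof=1`, which gives a larger value for two samples.

```python
import pandas as pd

# Invented per-country frames that already cover the same years.
ecuador = pd.DataFrame({"Year": [2001, 2002, 2003], "CO2 Ecuador": [1.7, 1.8, 1.9]})
colombia = pd.DataFrame({"Year": [2001, 2002, 2003], "CO2 Colombia": [1.4, 1.4, 1.5]})

# Join on the year so that each row holds both countries.
summary = ecuador.merge(colombia, on="Year", how="inner")

# Row-wise statistics over the two country columns only.
countries = ["CO2 Ecuador", "CO2 Colombia"]
summary["Average"] = summary[countries].mean(axis=1)
summary["Std"] = summary[countries].std(axis=1, ddof=0)

# to_csv() without a path returns the CSV text; pass a file name to write a file instead.
csv_text = summary.to_csv(index=False)
print(csv_text)
```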


(e) Make a labeled, high-quality plot of the annual $\rm CO_2$ emission per capita (in tons) versus time, including a line for each country and a line for the average with y-error bars equal to the standard deviations calculated above.
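A plot of this kind could be laid out roughly as follows. The arrays are invented, and the styling choices (figure size, cap size, dashed line) are only suggestions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented example values; replace with the columns of the exported data frame.
years = np.arange(2001, 2006)
ecuador = np.array([1.7, 1.8, 1.9, 2.0, 2.1])
colombia = np.array([1.4, 1.4, 1.5, 1.5, 1.6])
average = (ecuador + colombia) / 2
std = np.std([ecuador, colombia], axis=0)  # population standard deviation per year

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(years, ecuador, label="Ecuador")
ax.plot(years, colombia, label="Colombia")
# errorbar draws the average line together with its vertical error bars.
ax.errorbar(years, average, yerr=std, capsize=3, linestyle="--", label="Average")
ax.set_xlabel("Year")
ax.set_ylabel(r"CO$_2$ emission per capita [t]")
ax.legend()
fig.tight_layout()
```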

# 2. (7 points) Regression, interpolation, $\chi^2$ minimisation: climate time series

Consider the following data files, which contain climate data:

- Annual $\rm CO_2$ emission per capita (in tons), see file: **co-emissions-per-capita.csv**


- Sea temperature anomaly with respect to the 1961-1990 average (in $\rm ^{\circ}C$), see file: **temperature-anomaly.csv**

Both were downloaded from: https://ourworldindata.org/.

Write Python functions that: 

(a) Read in the **co-emissions-per-capita.csv** file, select the "World" data, and place it into a pandas data frame. Hint: Follow the same procedure as for problem 1, but select the rows labeled with "World". Make a labeled figure of $\rm CO_2$ emission per capita (in tons) versus time (in years).


(b) Read in the **temperature-anomaly.csv** file, and select the rows corresponding to the global anomaly, i.e., those labeled with "Global". Make a labeled figure of the temperature anomaly (in $\rm ^{\circ}C$) versus time (in years), including symmetric y-errors for the temperature anomaly. Hint: to calculate the symmetric $2\sigma$ y-errors, you can read the columns "Upper bound (95% CI)" and "Lower bound (95% CI)", and average their distances from the central value.
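The hint on symmetric errors can be read as follows, assuming the two bound columns store absolute anomaly values rather than offsets (this should be checked in the file). The column name `Median anomaly` is a placeholder and the numbers are invented.

```python
import pandas as pd

# Invented rows; "Median anomaly" is a hypothetical name for the central-value column.
anomaly = pd.DataFrame({
    "Year": [1990, 1991, 1992],
    "Median anomaly": [0.25, 0.21, 0.06],
    "Upper bound (95% CI)": [0.30, 0.27, 0.11],
    "Lower bound (95% CI)": [0.20, 0.15, 0.01],
})

# Distance of each bound from the central value.
upper_err = anomaly["Upper bound (95% CI)"] - anomaly["Median anomaly"]
lower_err = anomaly["Median anomaly"] - anomaly["Lower bound (95% CI)"]

# Symmetric 2-sigma error: mean of the two distances, i.e. (upper - lower) / 2.
anomaly["2sigma error"] = (upper_err + lower_err) / 2
print(anomaly[["Year", "2sigma error"]])
```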


(c) Check if the data frames created in (a) and (b) cover the same time period in years. If not, use conditionals to select only the overlapping time periods, so that both data frames cover the same time ranges. Create and export (in CSV format) a new pandas data frame that contains four columns:

**Year, World $\rm CO_2$ emission, Global temperature anomaly, $2\sigma$ error in temperature anomaly.**


(d) Make a labeled scatter plot (using markers) of the temperature anomaly (including the $2\sigma$ y-error bars) versus $\rm CO_2$ emission. How monotonic and/or linear is the relation between the two variables?


(e) Make a plot comparing the linear fits to the data using two different methods:

- Carry out a linear regression to fit a single line to the whole data set, and report the resulting equation and figure.


- Carry out a $\chi^2$ minimisation (including the y-error bars), and report the resulting equation and figure.


- Briefly comment (in 3-4 lines maximum): are the fitted regressions above representative of the data?
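To illustrate how the two fitting methods differ, here is a sketch on synthetic data. It assumes `scipy.stats.linregress` for the unweighted regression and `scipy.optimize.curve_fit` with `sigma` for the $\chi^2$ minimisation; other tools would work as well. Note that `sigma` is taken to be the $1\sigma$ error here, so $2\sigma$ errors would need to be halved first.

```python
import numpy as np
from scipy import stats, optimize

# Synthetic stand-in data: a straight line with point-dependent Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(3.0, 5.0, 40)                 # e.g. CO2 emission per capita
sigma = np.linspace(0.03, 0.10, x.size)       # assumed 1-sigma errors on y
y = 0.5 * x - 1.8 + rng.normal(0.0, sigma)

# Method 1: ordinary least squares, which ignores the error bars.
ols = stats.linregress(x, y)

def line(x, m, b):
    """Straight-line model y = m x + b."""
    return m * x + b

def chi2(params):
    """Chi-square of the straight-line model for parameters (m, b)."""
    m, b = params
    return np.sum(((y - line(x, m, b)) / sigma) ** 2)

# Method 2: weighted least squares, i.e. minimisation of chi2 with the y-errors.
popt, pcov = optimize.curve_fit(line, x, y, sigma=sigma, absolute_sigma=True)

print(f"OLS : y = {ols.slope:.3f} x + {ols.intercept:.3f}")
print(f"chi2: y = {popt[0]:.3f} x + {popt[1]:.3f}, chi2 = {chi2(popt):.1f}")
```

When all error bars are equal, both methods return the same line; they only differ when some points are more uncertain than others.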


(f) Let us try to improve the fits. The temperature anomaly data are noisy as a result of climate variability.


- Make it smoother by: interpolating it into a finer x-axis and applying a Gaussian filter. Show the comparison in a figure.


- Re-do the analysis in (e), using the smoothed data for the temperature anomaly. Is a single line a good model? If not, carry out a piece-wise analysis and report the new equations and plots.
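The smoothing step could be sketched as below, assuming linear interpolation with `np.interp` and `scipy.ndimage.gaussian_filter1d`. The synthetic series, the factor-of-four refinement and the filter width are arbitrary choices for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Synthetic noisy series standing in for the temperature anomaly.
rng = np.random.default_rng(1)
years = np.arange(1900, 2021)
anomaly = 0.008 * (years - 1900) - 0.4 + rng.normal(0.0, 0.1, years.size)

# Interpolate onto an x-axis with four samples per year.
fine_years = np.linspace(years[0], years[-1], 4 * (years.size - 1) + 1)
anomaly_fine = np.interp(fine_years, years, anomaly)

# Gaussian filter; sigma is given in samples, so 20 samples correspond to 5 years here.
smooth = gaussian_filter1d(anomaly_fine, sigma=20, mode="nearest")

print(fine_years.size, smooth.shape)
```

A wider `sigma` removes more of the year-to-year variability but also flattens real changes in slope, which matters for the piece-wise analysis.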

# 3. (8 points) Fitting, regression, interpolation: absorption spectral lines

The supplied data file, **j124257-7533_spectra.txt**, contains observational data from the ATCA radio telescope (ATCA stands for Australia Telescope Compact Array; see https://en.wikipedia.org/wiki/Australia_Telescope_Compact_Array). The data correspond to HI clouds in the Chamaeleon molecular cloud complex (see https://en.wikipedia.org/wiki/Chamaeleon_complex).

This data file contains absorption line features from neutral hydrogen (HI, i.e., $\lambda = 21\,\rm cm$) in the Milky Way, and continuum emission from a distant quasar that is used as "background light" for observing the HI in the Milky Way (see the Figure below). We can assume that each gas cloud in our line of sight towards the background source produces a Gaussian absorption feature that is only dependent on the optical depth, $\tau$, of the gas cloud. The higher the optical depth, the deeper the absorption feature.

![absorption.jpg](attachment:absorption.jpg)

Carry out the following calculations using Python: 

(a) Read in the spectral data (velocity and intensity) from the file, and make a plot of the spectrum (velocity on the x-axis and intensity on the y-axis). How many "HI clouds" do you see? Note that each Gaussian-like feature represents a separate HI cloud.


(b) Identify the background continuum emission by fitting a linear regression to the spectrum. This should be a horizontal line (i.e. with no slope) since the continuum emission is roughly constant. Hint: Do not include the absorption features in this regression (you can mask that section before carrying out the regression).


(c) Divide the spectrum by the fitted continuum emission from (b), and make a plot of it (the base of the spectrum should lie at $+1$ on the y-axis). This step is necessary to separate the HI absorption from the Milky Way from the background source.
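Steps (b) and (c) might look like this on a synthetic spectrum. The velocity limits of the mask and all numbers are invented; on the real data the mask should be chosen by eye from the plot in (a).

```python
import numpy as np

# Synthetic spectrum: flat continuum times one Gaussian absorption feature, plus noise.
rng = np.random.default_rng(2)
velocity = np.linspace(-50.0, 50.0, 201)      # km/s
absorption = np.exp(-0.8 * np.exp(-0.5 * ((velocity - 3.0) / 2.5) ** 2))
intensity = 2.0 * absorption + rng.normal(0.0, 0.02, velocity.size)

# (b) Mask the absorption region and fit only the line-free channels.
line_free = (velocity < -15.0) | (velocity > 20.0)
slope, intercept = np.polyfit(velocity[line_free], intensity[line_free], deg=1)

# The fitted slope should be close to zero, so a constant continuum is adopted.
continuum = np.mean(intensity[line_free])

# (c) Normalise: the line-free baseline now sits near +1.
normalised = intensity / continuum

print(f"slope = {slope:.2e}, continuum = {continuum:.3f}")
```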


(d) This spectrum is a bit noisy, so let us smooth it using spline interpolation with a Gaussian kernel. Make a plot of the resulting smoothed spectrum.


(e) Each point along the resulting spectrum, obtained in (d), represents a specific optical depth value of the absorbing gas, namely each value of intensity is equal to $e^{-\tau}$, where $\tau$ is the optical depth of the HI gas. Thus, calculate the optical depth for each point and make a plot of optical depth versus velocity.
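Since the normalised intensity equals $e^{-\tau}$, the optical depth follows from $\tau = -\ln(I)$. A small sketch on noise-free synthetic values is given below; the clipping threshold is an arbitrary guard against non-positive intensities in noisy data.

```python
import numpy as np

# Synthetic normalised spectrum built from a known Gaussian optical-depth profile.
velocity = np.linspace(-20.0, 20.0, 9)        # km/s
tau_true = 0.8 * np.exp(-0.5 * (velocity / 5.0) ** 2)
normalised = np.exp(-tau_true)

# Guard against zero or negative intensities before taking the logarithm.
safe = np.clip(normalised, 1e-6, None)

# Invert I = exp(-tau).
tau = -np.log(safe)

print(f"maximum optical depth = {tau.max():.2f}")
```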


(f) Now, let us fit Gaussians to the spectrum using **astropy** functions. To be able to fit Gaussian functions to the spectrum, it is necessary to shift the base of the spectrum to 0 (i.e., subtract 1 from the intensities). Fit the HI absorption features with Gaussian components. 

Hint: To find the two absorption features it is necessary to supply the model with preliminary guesses for the mean and the amplitude. Initial guesses can be supplied to the models in the following way: Gaussian1D(amplitude=your guess, mean=your guess, stddev=your guess). For more details see: https://docs.astropy.org/en/stable/modeling/parameters.html


(g) Make a plot of the results (spectrum plus fitted model). What is the maximum optical depth of the gas clouds that are represented in this spectrum?


(h) Can these HI clouds be considered optically thin or thick? The optical thickness measures how dense the gas is: an optically thin cloud lets radiation through, while an optically thick cloud absorbs radiation. Clouds are considered optically thick if: $\max{\tau}>0.5$.