# Exercise: Bringing it together

Let's work through a single, longer example that brings together the things we've learnt so far.

Let's say you have travelled to the past with a high-quality telescope, and are observing the moons of Jupiter for the first time in human history. You find 80 of them!

You measure several properties of the orbits of these moons around Jupiter. You store these values in a text file called "jupiters_moons.csv".

Now, before you announce your discovery to the world, you decide to calculate the distribution statistics of the moons. You want to use these statistics to understand the astrophysics underlying these moons.

Your first step to this analysis is inputting the data. To do this, you use the NumPy module:
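The conventional import, and the one that the `np.` prefix used throughout this exercise assumes, looks like this:

```python
# Import NumPy under its conventional short alias
import numpy as np

# Quick sanity check that the module loaded
print(np.__version__)
```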

Now, look up the documentation for NumPy's `loadtxt()` function. You can do so by typing

    np.loadtxt?

or

    help(np.loadtxt)
    

Once you've done that, use loadtxt() to read the "jupiters_moons.csv" file in the [GitHub repository](https://github.com/nden/astropy_bg_2023) into a variable called `moons_data`.

**Hint:** *CSV* stands for comma-separated values: a plain-text file in which the values on each line are separated by commas. You therefore need to tell `loadtxt()` that the delimiter is ",".

If you aren't able to load the file, you may be running into the following issue: the file contains both numbers and strings (e.g. 'Planet'), and `np.loadtxt()` cannot turn the strings into floats. Look at the `dtype` option in the documentation and try setting it to `str`. This loads every value as a string, but we can easily extract the numbers later.

**Hint:** Try to skip reading the first row of the file. This row contains names of columns rather than data, so you can ignore it.
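As a rough sketch of how these options fit together, the snippet below reads a tiny made-up CSV from an in-memory string instead of the real file. The column names and values are invented for illustration and do not match "jupiters_moons.csv".

```python
import io

import numpy as np

# Hypothetical stand-in for a CSV file: one header row, then two data rows.
fake_csv = io.StringIO(
    "name,group,a_km\n"
    "MoonA,Inner,128000\n"
    "MoonB,Outer,2350000\n"
)

# delimiter="," splits each line on commas,
# dtype=str keeps every entry as a string,
# skiprows=1 ignores the header row.
demo_data = np.loadtxt(fake_csv, delimiter=",", dtype=str, skiprows=1)

print(demo_data.shape)
print(demo_data)
```

For the real exercise you would pass the file name in place of `fake_csv`.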

Now print this data

Take a closer look at the printed array. Notice how there are outer brackets surrounding a bunch of bracketed lists?

You get a NumPy n-dimensional array object. Yesterday, you saw that these "ndarrays" can be powerful tools. We will look into it further here.

In the cell below, index the array for its 0th element and see what happens:

You should see the orbital parameters of the Jupiter moon "Io". What happens if you index *that* mini-array for its 6th element? Do that below:

You should get Io's semi-major axis (a) as a string.

If you look at the contents of the file, you will see that different columns have different orbital parameters of the 80 moons. The 6th column contains the semi-major axes (a) of the orbits (remember that indexing starts from 0 in Python). You can access this column as follows:
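A minimal sketch of the row, element, and column indexing syntax, using a small made-up array of strings in place of `moons_data`:

```python
import numpy as np

# Made-up 3x8 array of strings; each entry is just its own position number
demo = np.arange(24).reshape(3, 8).astype(str)

first_row = demo[0]       # the whole 0th row
one_value = demo[0][6]    # element 6 of the 0th row
column_six = demo[:, 6]   # every row, but only column 6

print(first_row)
print(one_value)
print(column_six)
```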

The above tells the array that you want all rows but only the 6th column.

We see that the values are all strings because we loaded everything using the string data type. We can extract the numbers by converting the strings to floats as follows
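One way to do this conversion is the array method `astype()`. The sketch below uses a few made-up strings, not values from the file:

```python
import numpy as np

# Made-up numeric strings, like the ones produced by loading with dtype=str
str_values = np.array(["1.5", "2.25", "10.0"])

# astype(float) converts every element in one step
float_values = str_values.astype(float)

print(float_values)
print(float_values.dtype)
```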

The above shows one advantage of using "ndarrays" as opposed to lists. For typecasting lists, you have to individually convert each value either manually or by using a for loop; you can try doing this yourself at a later time.

Using the code in the previous cell, read columns 7 (eccentricities) and 10 (inclinations) into variables `eccts` and `incls` respectively, and convert the strings to floats.

Now, to perform statistical analysis on this data, the first thing we need to do is calculate the averages. Later on, we'll see that there are external functions you can import into Python that will just do this for you, but for now let's calculate it manually (it's easy enough, right?).

Above, we saw use of the sum() function - as it turns out, you can run the sum() function on a list or an ndarray (so long as it only contains numbers) and it will tell you the sum. There is also a NumPy equivalent of the sum function called `np.sum()`, which works faster for ndarrays.

The only other thing you'll need to calculate the average is the len() function, which returns the number of elements in a list/array. Using those two, calculate averages below:
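The pattern can be sketched on three made-up numbers (the variable names here are illustrative only):

```python
import numpy as np

# Made-up values standing in for one column of the data
values = np.array([2.0, 4.0, 9.0])

# Average = sum of the values divided by how many there are
mean_builtin = sum(values) / len(values)    # built-in sum()
mean_numpy = np.sum(values) / len(values)   # NumPy equivalent

print(mean_builtin, mean_numpy)
```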

Now let's print these quantities in a *fancy* way
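One common way to do this is with f-strings and format specifiers. The numbers below are placeholders, not results from the file:

```python
# Placeholder values, for illustration only
mean_a = 1234567.891
mean_e = 0.1234

# .2e gives scientific notation with 2 decimals; .3f gives 3 fixed decimals
line_a = f"Mean semi-major axis: {mean_a:.2e} km"
line_e = f"Mean eccentricity: {mean_e:.3f}"

print(line_a)
print(line_e)
```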

Okay, so the other thing statisticians are always interested in is the standard deviation from the mean - this basically tells how dispersed the values in a sample are. The formula for a standard deviation is
$$
s = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N-1}}
$$

where $\mu$ is the average and N is the total number of values.

We already know how to get N, and we know what $\mu$ is as well. So to calculate this, we need to know how to calculate the quantity on the top of the fraction. This is actually kind of tricky without the use of NumPy arrays. See the example below for elucidation:

Okay, so you can subtract an integer from an ndarray. What if we try the list version?

You can't directly subtract an integer from a list. But why is this subtraction important?
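Both attempts can be sketched in a single cell with made-up numbers. The `try`/`except` is only there so the failing list subtraction does not stop the cell:

```python
import numpy as np

arr = np.array([3, 5, 8])
arr_shifted = arr - 2          # element-wise subtraction works on ndarrays
print(arr_shifted)

lst = [3, 5, 8]
list_failed = False
try:
    lst - 2                    # lists do not support this operation
except TypeError as err:
    list_failed = True
    print("TypeError:", err)

# With a list, each element has to be handled explicitly
lst_shifted = [x - 2 for x in lst]
print(lst_shifted)
```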

Your spidey senses should be tingling for how we can leverage the subtraction functionality of ndarrays to calculate our standard deviation. In the cell below, fill in the variable `top_frac_major_axis` to calculate the following quantity:
$$
\sum_{i=1}^N (x_i - \mu)^2
$$

Notice that you don't have to calculate this one element at a time: if we first compute a single array that holds each semi-major axis with the mean subtracted off and the result squared, then we can finish off `top_frac_major_axis` just by summing up that array, as we've done before.

With that done, we can easily apply the formula to get the final STD.

**Hint:** the function `np.sqrt()` will be useful here.
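As a hedged sketch of the whole calculation, here it is on a small made-up sample rather than the moons data. The names `x`, `mu`, `top_frac`, and `std_manual` are illustrative only.

```python
import numpy as np

# Made-up sample
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

n = len(x)
mu = np.sum(x) / n                    # the average

# Subtract the mean from every element, square, then sum
top_frac = np.sum((x - mu) ** 2)

# Divide by N - 1 and take the square root
std_manual = np.sqrt(top_frac / (n - 1))

print(mu, top_frac, std_manual)
```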

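Now compare this result with that obtained by applying `np.std` directly. Note that `np.std` divides by N by default; pass `ddof=1` to divide by N-1, as in the formula above.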
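A small check of how the `ddof` argument of `np.std` changes the result, again on made-up numbers:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

std_default = np.std(x)          # ddof=0: divides by N
std_sample = np.std(x, ddof=1)   # ddof=1: divides by N - 1

print(std_default, std_sample)
```

With 80 moons the two values differ only slightly, but they will not agree exactly unless the same divisor is used.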

Alright! If you've done everything correctly, you should have found the averages and standard deviations.

Let's, for fun, make a helpful plot to visualize the data and the calculated statistics.
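One possible version of such a plot is sketched below: a histogram with vertical lines at the mean and at one standard deviation on either side. It uses a randomly generated, made-up sample, so the shape of the histogram says nothing about the real moons.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up two-cluster sample standing in for one orbital parameter
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(1.0, 0.2, 30), rng.normal(5.0, 0.5, 50)])

mean = np.sum(sample) / len(sample)
std = np.sqrt(np.sum((sample - mean) ** 2) / (len(sample) - 1))

fig, ax = plt.subplots()
ax.hist(sample, bins=20, color="lightgray", edgecolor="black")

# Mark the mean and one standard deviation on either side of it
ax.axvline(mean, color="red", label="mean")
ax.axvline(mean - std, color="red", linestyle="--", label="mean ± 1 std")
ax.axvline(mean + std, color="red", linestyle="--")

ax.set_xlabel("value (arbitrary units)")
ax.set_ylabel("number of moons")
ax.legend()
plt.show()
```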

Nice! It looks like our formula for standard deviation describes the distributions of orbital semi-major axes, eccentricities, and inclinations pretty well. Now, let's interpret the plots: why do you see [multi-modalities](https://en.wikipedia.org/wiki/Multimodal_distribution) in the histograms?

It is easy to understand this by visualizing the orbital distribution of the moons around Jupiter.

**Note**: We provide most of the code required for this visualization, because fully understanding it (right now) is beyond the scope of this tutorial.

You will need three different modules: `matplotlib` for plotting (which you used above) and animation, `astropy` for defining units and time, and `poliastro` for obtaining the orbits.

`poliastro` is not directly available in Google Colab (unlike NumPy and Matplotlib), so you will need to install it as follows

Now load all the required modules

As an example, let's start by visualizing the orbit of Mars around the Sun.

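```python
# Data for Mars at J2000 from JPL HORIZONS
a = 1.523679 * u.AU
ecc = 0.093315 * u.one
inc = 1.85 * u.deg
raan = 49.562 * u.deg
argp = 286.537 * u.deg
nu = 23.33 * u.deg

# Define the orbit of Mars around the Sun and plot it
# (assumes u, Sun, and Orbit were imported in the cell above)
orb = Orbit.from_classical(Sun, a, ecc, inc, raan, argp, nu)
orb.plot()
```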

Now that we can plot orbits, let's visualize the orbits of the **Galilean moons** of Jupiter, using the following code.

**Fun fact:** Io is the fourth-largest moon in the Solar System, only slightly larger than our (Earth's) moon.

```python
# The following code is for illustration purposes only!
# Do not worry if you don't fully understand it.
# It assumes np, u, Jupiter, and Orbit were imported in an earlier cell.
from poliastro.plotting import OrbitPlotter2D

# Names of the Galilean moons (assumed to match column 1 of the file)
GALILEAN_MOONS = ["Io", "Europa", "Ganymede", "Callisto"]

op = OrbitPlotter2D()

for moon_name in GALILEAN_MOONS:
    # Find the row of this Galilean moon in the array
    moon_ind = np.where(moons_data[:, 1] == moon_name)
    moon = moons_data[moon_ind][0]

    # Set the orbital parameters with units
    a = float(moon[6]) * u.km
    ecc = float(moon[7]) * u.one
    inc = float(moon[10]) * u.deg
    # Not as important for this exercise
    raan = float(moon[8]) * u.deg
    argp = float(moon[11]) * u.deg
    nu = (float(moon[9]) - 180) * u.deg

    # Define the orbit using above parameters
    orb = Orbit.from_classical(Jupiter, a, ecc, inc, raan, argp, nu)
    # Plot the orbit
    op.plot(orb, label=f'{moon_name}')

# Display the orbits
op.show()
```

The label for the orbits shows some sort of a time-stamp or an "epoch". Let's see if such an epoch is available in the "jupiters_moons.csv" file.

Now use this epoch when you define the orbit.

```python
epoch = Time("2000-01-01T12:00:00", scale='tdb')
```

Great, that works! The orbits look circular with low inclination.


Now, let's investigate further and plot the orbits of all 80 moons of Jupiter.

```python
# The following code is for illustration purposes only!
# Do not worry if you don't understand it.
from poliastro.plotting import OrbitPlotter3D

op = OrbitPlotter3D()

for moon in moons_data:
    # Set the orbital parameters with units
    a = float(moon[6]) * u.km
    ecc = float(moon[7]) * u.one
    inc = float(moon[10]) * u.deg
    # Not as important for this exercise
    raan = float(moon[8]) * u.deg
    argp = float(moon[11]) * u.deg
    nu = (float(moon[9]) - 180) * u.deg

    # Define the orbit using above parameters
    orb = Orbit.from_classical(Jupiter, a, ecc, inc, raan, argp, nu, epoch=epoch)
    # Plot the orbit
    op.plot(orb, label=f'{moon[1]}')

# Display the orbits
op.show()
```

The orbits of the Galilean moons are highlighted in the above figure, but it is hard to see their structure. This is because the figure is dominated by the **irregular satellites** (or moons) that have highly eccentric orbits at large distances from Jupiter. The inclinations of these orbits are also generally high and have wide distributions.

If you scroll back to the semi-major axis, eccentricity, and inclination histograms of all the moons of Jupiter, you will see that the irregular satellites represent the peaks at high values. Thus, this class of moons forms one of the peaks in the multi-modal distributions. The other peaks are formed by the **regular satellites** that include the **inner satellites** and the Galilean moons (or the main group).

The regular satellites have nearly circular orbits with low inclination (similar to the Io orbit we just plotted). They are larger (in size) and more massive than the irregular satellites and orbit closer to Jupiter. They are all **prograde** moons, i.e., they orbit in the same direction as the rotation of Jupiter.

Let's rerun the above code, but this time focus only on the regular satellites (`moons_data[:8]`).

Here you see the regular satellites: the 4 inner satellites and the 4 Galilean moons. Note that the two innermost satellite orbits are almost identical and are thus seen as only one orbit. This figure demonstrates the properties we mentioned above.

The conclusion of your analysis is that there are two main types of Jupiter moons: regular and irregular. Now, you should have your astrophysics researcher hat on:

***

#### Q - Are the differences between the two types of Jupiter moons due to some underlying differences in the formation of these moons?

***

Ans. Here

Hint: Refer to the references in this [wikipedia section](https://en.wikipedia.org/wiki/Moons_of_Jupiter#Origin_and_evolution).

# Another exercise:

Can you design an object to store the moons data we have been working with?
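One possible starting point is a small dataclass with one instance per moon. This is only a sketch: the class name `Moon`, its field names, and the `from_row` helper are all hypothetical. The column indices (1, 6, 7, 10) and units follow the ones used in the plotting code above.

```python
from dataclasses import dataclass


@dataclass
class Moon:
    """Hypothetical container for a few orbital parameters of one moon."""
    name: str
    semi_major_axis_km: float
    eccentricity: float
    inclination_deg: float

    @classmethod
    def from_row(cls, row):
        """Build a Moon from one row of strings laid out like moons_data."""
        return cls(
            name=str(row[1]),
            semi_major_axis_km=float(row[6]),
            eccentricity=float(row[7]),
            inclination_deg=float(row[10]),
        )


# Made-up row; only indices 1, 6, 7, and 10 are read
fake_row = ["0", "DemoMoon", "", "", "", "", "1000.0", "0.01", "", "", "2.5", ""]
demo_moon = Moon.from_row(fake_row)
print(demo_moon)
```

With the real data, a list such as `[Moon.from_row(row) for row in moons_data]` would hold all the moons. You could extend the class with the remaining columns or with methods of your own.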