[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sujeong-lee/ASTR3251/main?urlpath=%2Fdoc%2Ftree%2Fnotebooks%2FPythonPackages.ipynb)

# Python Packages
<h6>Author : Hernan Rincon</h6>
<h6>Modified by Sujeong Lee, Adapted from DESI High</h6>
<h6>Last Updated : February 26, 2026</h6>

In the [Introduction to Python](PythonTutorial.ipynb) notebook, we introduced the Python programming language. We will now introduce Python packages, which allow us to use Python code that other people wrote and published. We are going to look at some common Python packages that will show up in DESI High tutorials. Specifically, we will introduce the following packages:
 - NumPy: used to organize numbers and perform mathematical operations 
 - Matplotlib: used to make graphs and visually represent data

## Preferred Backgrounds
While not necessary, having knowledge about the following topics will be helpful in completing this notebook on time:
 - How Jupyter cells work
 - How basic mathematical functions work
 - How we can use graphs to visually depict mathematical functions

## Before beginning

Please note that Binder sessions are **temporary** and all work will be permanently deleted once the session ends.

<font color="red">**As soon as you launch the notebook, immediately select `File → Save Notebook As` and rename the notebook using your last name (e.g., `Lab1_Lee.ipynb`)**.</font> 

## 1 - Collecting Data with NumPy Arrays
NumPy allows us to organize and manipulate large numerical data sets. To load up NumPy, or any other package installed on a computer, we can type `import packageName`. Run the below cell to load NumPy into the notebook.

In [None]:
import numpy
# it's that easy!
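
In other tutorials you will often see NumPy imported under a shorter name. The sketch below shows that convention so you can recognize it; the alias `np` is not used anywhere else in this notebook, which keeps the full name `numpy`.

```python
import numpy        # the form used in this notebook
import numpy as np  # a common alias seen elsewhere (not used in this notebook)

# both names point to the same package
print('np and numpy are the same package:', np is numpy)
```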

If we want to store multiple numbers in a single Python variable, we can use a NumPy `array` to do so. Arrays are created in the following manner:


In [None]:
my_array = numpy.array([2,4,5,8])

Compared to Python lists, arrays are convenient because they allow us to perform standard mathematical operations on every number in the array at once. Run the below cell and see what each mathematical operation does.


In [None]:
subtracted_array = my_array - 2 # subtraction

print('subtraction:', my_array, 'minus 2 is', subtracted_array,'\n')

multiplied_array = my_array * 7 # multiplication

print('multiplication:', my_array, 'times 7 is', multiplied_array,'\n')

exponentiated_array = my_array ** 2 # exponentiation

print('exponentiation:', my_array, 'squared is', exponentiated_array)
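
To see why arrays are more convenient than lists for arithmetic, here is a small sketch comparing the two. The name `my_list` is introduced only for this illustration.

```python
import numpy

my_list = [2, 4, 5, 8]                 # a plain Python list
my_array = numpy.array([2, 4, 5, 8])   # a NumPy array with the same numbers

# multiplying a list by 2 repeats its contents
print('list * 2: ', my_list * 2)

# multiplying an array by 2 multiplies each number by 2
print('array * 2:', my_array * 2)
```

The list grows to eight entries, while the array keeps its four entries and doubles each one.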

We can also perform mathematical operations between multiple arrays when they contain the same number of elements. The operation is applied to each pair of numbers that share the same position.


In [None]:
# create a second array
another_array = numpy.array([1,1,1,1])

# both my_array and another_array contain four numbers each, allowing us to add them

combined_array = my_array + 2 * another_array # order of operations applied to arrays! (multiplication comes before addition)

print(my_array, '+ 2 *', another_array, 'is', combined_array) 

Does the output make sense? Remember that `another_array` was multiplied by two before it was added to `my_array`.
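
What happens when the arrays do not contain the same number of elements? The sketch below tries to add a four-number array to a three-number array and catches the error that NumPy raises. The variable names are made up for this illustration.

```python
import numpy

four_numbers = numpy.array([2, 4, 5, 8])
three_numbers = numpy.array([1, 1, 1])

try:
    # this fails because the arrays have different lengths
    result = four_numbers + three_numbers
except ValueError as error:
    result = None
    print('NumPy could not add the arrays:', error)
```

A single number (as in `my_array - 2`) is a special case: NumPy applies it to every element of the array.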

### 1.1 - Mathematical functions with NumPy
NumPy has lots of helpful Python functions that we can apply to arrays. These functions implement a variety of mathematical procedures seen in algebra, trigonometry, and statistics. Run the below cell to see some examples of these mathematical functions.

In [None]:
print('The square root of', my_array, 'is',numpy.sqrt(my_array),'\n') # square root

print('The third power of', my_array, 'is',numpy.power(my_array, 3),'\n') # exponentiation x^3

print('The exponential function of', my_array, 'is',numpy.exp(my_array),'\n') # exponential function e^x

print('The base 10 logarithm of', my_array, 'is',numpy.log10(my_array),'\n') # base 10 logarithm

print('The absolute value of', [-2,-1,0,1,2], 'is',numpy.abs([-2,-1,0,1,2]),'\n') # absolute value

print('The sum of', my_array, 'is', numpy.sum(my_array)) # summation
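
As a quick check on how these functions relate to the operators from the previous section, the sketch below confirms that `numpy.power` gives the same result as `**`, and applies `numpy.log10` to powers of ten. The array `powers_of_ten` is introduced only for this illustration.

```python
import numpy

my_array = numpy.array([2, 4, 5, 8])

# two ways of raising every number to the third power
cubed_with_function = numpy.power(my_array, 3)
cubed_with_operator = my_array ** 3

print('numpy.power:', cubed_with_function)
print('** operator:', cubed_with_operator)
print('Same result?', numpy.array_equal(cubed_with_function, cubed_with_operator))

# the base 10 logarithm counts the zeros in a power of ten
powers_of_ten = numpy.array([1, 10, 100, 1000])
print('log10 of', powers_of_ten, 'is', numpy.log10(powers_of_ten))
```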


Here are some examples of trigonometric functions and statistical operations.

In [None]:
print('The cosine of', my_array, 'is', numpy.cos(my_array),'\n') # cosine

print('The sine of', my_array, 'is', numpy.sin(my_array),'\n') # sine

print('The mean of', my_array, 'is', numpy.mean(my_array),'\n') # average value

print('The standard deviation of', my_array, 'is', numpy.std(my_array)) # standard deviation
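
One detail worth knowing: `numpy.cos` and `numpy.sin` expect angles in radians, not degrees. The sketch below converts a few angles with `numpy.deg2rad` before taking the cosine. The variable names are chosen just for this illustration.

```python
import numpy

angles_in_degrees = numpy.array([0, 90, 180])

# convert degrees to radians before using trigonometric functions
angles_in_radians = numpy.deg2rad(angles_in_degrees)

print('angles in radians:', angles_in_radians)

# round to three decimal places to hide tiny floating-point leftovers
print('cosine of each angle:', numpy.round(numpy.cos(angles_in_radians), 3))
```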


### 1.2 - Indexing numbers in NumPy arrays
To select a single number from a NumPy array, we perform an operation called **indexing**. Each number in an array has a corresponding index. The first number in the array has the index `0`, the second number has the index `1`, the third number has the index `2`, and so on. The index can be thought of as the number of spaces you have to move over from the first number to reach the desired number. To select the number at index `X` from an array, we type `my_array[X]`. Examples of indexing are given below.

In [None]:
# get the leftmost number in my_array
print('The leftmost number in', my_array, 'is', my_array[0],'\n')

# get the rightmost number in my_array
print('The rightmost number in', my_array, 'is', my_array[3])
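
Python also lets us count from the right-hand side of an array using negative indices. This is a small sketch of that idea, which is handy when we don't know how long an array is.

```python
import numpy

my_array = numpy.array([2, 4, 5, 8])

# an index of -1 selects the rightmost number
print('my_array[-1] is', my_array[-1])

# an index of -2 selects the number just before it
print('my_array[-2] is', my_array[-2])

# len tells us how many numbers the array holds
print('my_array contains', len(my_array), 'numbers')
```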

If we wish to select multiple adjacent numbers from an array, we can type `my_array[X:Y]` where `X` tells us where to start our selection, and `Y` tells us where to stop selecting numbers. The value at `my_array[Y]` is not included in the selection.

In [None]:
# select a group of numbers that starts at an index of 1 (meaning one number after the leftmost number) 
# and stops before the rightmost number (meaning that we don't include the rightmost number in our selection)

print('From', my_array, 'we select', my_array[1:3])
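
Either end of the `X:Y` selection can be left out. As a short sketch: omitting `X` starts the selection at the leftmost number, and omitting `Y` continues it through the rightmost number.

```python
import numpy

my_array = numpy.array([2, 4, 5, 8])

# everything before index 2
first_half = my_array[:2]
print('my_array[:2] is', first_half)

# everything from index 2 onward
second_half = my_array[2:]
print('my_array[2:] is', second_half)
```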

Now it's your turn. 

#### **Exercise 1**

**Create a new NumPy array, such that the sum of your array with `my_array` is `[10, 10, 10, 10]`. Print the sum of the arrays to convince yourself of this. Then, use indexing to select the numbers `8` and `5` from your new array and print them.**

In [None]:
my_array = numpy.array([2,4,5,8])

# YOUR CODE GOES HERE
# ......
# ......

### 1.3 - Generating arrays with NumPy functions
NumPy has several functions that can generate large arrays for us. This is convenient if we don't want to have to type in a large list of numbers ourselves. Like many other functions in Python, these array functions take in parameters as input that specify what each function will do. The below cell gives examples of NumPy array-generating functions and their input parameters.

In [None]:
# create an array of numbers that starts at zero, ends before thirty, and increases in steps of three
number_range = numpy.arange(0, 30, 3)

print('A sequence of integers starting at zero, stopping before thirty,')
print('and selecting only every third integer:', number_range, '\n')

# create an array of numbers that starts at zero, ends at thirty, and has five evenly spaced numbers
number_range = numpy.linspace(0, 30, 5)

print('Five evenly spaced numbers between zero and thirty:', number_range)
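
The two functions treat the stopping value differently, which is easy to miss. The sketch below makes the difference explicit; the names `stepped` and `spaced` are introduced only for this illustration.

```python
import numpy

# arange: the third input is the step size, and the stop value is excluded
stepped = numpy.arange(0, 30, 3)
print('arange gives', len(stepped), 'numbers, the last one is', stepped[-1])

# linspace: the third input is how many numbers we want, and the stop value is included
spaced = numpy.linspace(0, 30, 5)
print('linspace gives', len(spaced), 'numbers, the last one is', spaced[-1])
```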

There are also functions that generate random numbers.

In [None]:
# create an array of five random numbers uniformly distributed between 3 and 6
number_range = numpy.random.uniform(3,6,5)

print('Five randomly drawn numbers between three and six:', number_range, '\n')

# create an array of five random numbers sampled from a Gaussian distribution (aka a bell curve)
number_range = numpy.random.normal(size=5)

print('Five randomly drawn numbers from a Gaussian distribution:', number_range)
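
Because these numbers are random, the output changes every time the cell is run. If we want results that can be repeated, one option is NumPy's seeded random generator, `numpy.random.default_rng`. The sketch below is an optional alternative to the functions above, and the seed value `42` is an arbitrary choice.

```python
import numpy

# two generators created with the same seed produce the same random numbers
rng = numpy.random.default_rng(seed=42)
uniform_numbers = rng.uniform(3, 6, 5)

rng_again = numpy.random.default_rng(seed=42)
same_numbers = rng_again.uniform(3, 6, 5)

print('first draw: ', uniform_numbers)
print('second draw:', same_numbers)

# a Gaussian centered on 10 with a standard deviation of 2
# (without loc and scale, the center is 0 and the standard deviation is 1)
gaussian_numbers = rng.normal(loc=10, scale=2, size=5)
print('Gaussian draw:', gaussian_numbers)
```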

Arrays can store other data besides numbers. Let's create an array of words and use our knowledge of indexing and NumPy functions to select every other word from the array.

In [None]:
# array of fruit names
favorite_fruits = numpy.array(["mango", "lychee", "grapefruit", "papaya", "mangosteen", "guava"])

# our index will start at zero, stop before five, and increase in steps of two
array_for_indexing = numpy.arange(0, 5, 2)

print('We will index into favorite_fruits using the indexing array', array_for_indexing)

# we can use a for loop to step through each number in our indexing array
for i in array_for_indexing:
    
    # print the array at the current index
    print(f'favorite_fruits[{i}] is', favorite_fruits[i])
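
The `for` loop is one way to do this. As a sketch of a shortcut, NumPy also accepts an array of indices inside the square brackets and returns all of the selected entries at once.

```python
import numpy

favorite_fruits = numpy.array(["mango", "lychee", "grapefruit", "papaya", "mangosteen", "guava"])
array_for_indexing = numpy.arange(0, 5, 2)

# index with the whole array instead of looping over it
selected_fruits = favorite_fruits[array_for_indexing]

print('Selected fruits:', selected_fruits)
```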

We're almost done introducing NumPy! As a last challenge, it's your turn to use NumPy and unveil a secret message in a list of words.

#### **Exercise 2**

*What did the photon say when the clerk asked if it needed help with its luggage?* In the cell below, the answer to this question is encoded in the array `secret_message`. Can you figure out how to print it out using a `for` loop and `numpy.arange`?


In [None]:
secret_message = numpy.array(["blah", "blah", "No thanks,", "blah", "blah", "I'm", "blah", "blah", "traveling", "blah", "blah", "light!"])

# YOUR CODE GOES HERE
# ......
# ......

## 2 - Making Graphs with Matplotlib
We will now introduce the graph-plotting package Matplotlib. This package is useful to us as cosmologists because it allows us to make visualizations of the data that we collect. With Matplotlib, we can make scatter plots, pie charts, bar graphs, histograms, and more. For today, we will focus on plotting mathematical equations, and then we will work with some real DESI data.

Rather than importing the entire Matplotlib package, we are going to import a smaller collection of code (or "module") within it called **pyplot**. We can use the `from` keyword in our import statement to accomplish this. Run the below cell to import pyplot from Matplotlib.

In [None]:
# We will import the matplotlib plotting package which contains a useful collection of code - or module - called pyplot
from matplotlib import pyplot

Now let's use pyplot to plot the equation $y=x^2$. Creating a plot can be broken down into a few steps:
 - Create the x and y data using NumPy arrays
 - Use the `pyplot.plot` function to plot the data
 - Use the `pyplot.xlabel` and `pyplot.ylabel` functions to create labels for the x-axis and y-axis
 - Use `pyplot.title` to give the plot a title

Let's see a worked example:

In [None]:
# our x-axis values will run from negative ten to ten with thirty evenly spaced values
x_values = numpy.linspace(-10, 10, 30)

# y = x^2
y_values = x_values ** 2

# plot the x and y values
pyplot.plot(x_values, y_values)

# create labels for the graph axes and a graph title
pyplot.xlabel('x values')
pyplot.ylabel('y values')
pyplot.title('y = x * x');
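
The same four steps work for any function. As a second sketch, here they are applied to $y=\sin(x)$; the choice of function, range, and labels is ours, made only for illustration.

```python
import numpy
from matplotlib import pyplot

# step 1: create the x and y data (x runs from 0 to 2*pi, in radians)
x_values = numpy.linspace(0, 2 * numpy.pi, 50)
y_values = numpy.sin(x_values)

# step 2: plot the data (pyplot.plot returns a list of the lines it drew)
lines = pyplot.plot(x_values, y_values)

# step 3: label the axes
pyplot.xlabel('x values (radians)')
pyplot.ylabel('y values')

# step 4: give the plot a title
pyplot.title('y = sin(x)');
```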

Now it's your turn. 

#### **Exercise 3**

**Go back to the above cell and replace `pyplot.plot` with `pyplot.scatter`. What happens to the plot when you rerun the cell?**

### 2.1 - Plotting DESI data with Matplotlib
Now let's use Matplotlib to work with some real DESI data. We are going to make a map of nearly thirty thousand DESI galaxies as they appear in our night sky. 

First, we'll need to load up our galaxy data. To do this, we'll use the `codes` package that comes with these notebooks, which contains a module called `galaxies`. Let's import it now.

In [None]:
# add the parent folder to Python's search path so that the codes package can be found
import sys
sys.path.append("..")
from codes import galaxies

Next, we'll use the function `get_ra_dec_z_region` from `galaxies` to load up our galaxy data.

In [None]:
ra, dec, redshift = galaxies.get_ra_dec_z_region('../data/DR1_BGS_sample_galaxies.BIN')

`ra`, `dec`, and `redshift` are three NumPy arrays that the `get_ra_dec_z_region` function returns. The `ra` and `dec` arrays contain the angular coordinates of the galaxies in the night sky, similar to the latitude and longitude coordinates used to make maps on Earth. We won't use the `redshift` array for now.
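
Before plotting arrays like these, it can help to check how many values they hold and what range they cover. The sketch below shows one way to do that. To keep it runnable on its own, it uses made-up stand-in arrays (`ra_demo` and `dec_demo`, filled with random numbers), not the real DESI data; the same calls can be applied to `ra` and `dec`.

```python
import numpy

# made-up stand-in coordinates for illustration only, NOT DESI data
rng = numpy.random.default_rng(seed=0)
ra_demo = rng.uniform(150, 160, 1000)
dec_demo = rng.uniform(0, 5, 1000)

# the x and y arrays must have the same length to be plotted together
print('number of points:', len(ra_demo))
print('same length?', len(ra_demo) == len(dec_demo))

# the smallest and largest values tell us what the plot axes will span
print('R.A. runs from', ra_demo.min(), 'to', ra_demo.max())
print('Dec. runs from', dec_demo.min(), 'to', dec_demo.max())
```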


#### **Exercise 4**

**Now let's use `pyplot.scatter` to plot the galaxies. Make a scatter plot with `pyplot.scatter` that plots `ra` and `dec`, similar to how we plotted `x_values` and `y_values` above. We want `ra` to be the values plotted on the x-axis of our plot, so it should be the first parameter entered into `pyplot.scatter`, followed by `dec`.** 

**Next, add the x-axis label `R.A. [degrees]` to the plot. After that, add the y-axis label `Dec. [degrees]` to the plot. Finally, add the title `Sky Map of 28,284 DESI Galaxies!` to the plot. You can look back at the last plot we made if you need to remember how this is done. Run your code, and take a look at the galaxies.**

In [None]:
# YOUR CODE GOES HERE
# .....
# .....
# .....
# .....

Your map will look like a big blue rectangle at first. Why is that? Where are all the galaxies? It turns out that we can't see the galaxies yet because the scatter function has drawn them too large, and they all overlap each other. To fix this, we can use the size parameter `s`. **Add `s=0.5` as the third parameter input to `pyplot.scatter` and rerun the cell.**

Now you should be able to see the galaxies! You'll notice that some regions of the map have a lot of galaxies. These locations are called **galaxy clusters**, and they can contain hundreds or even thousands of galaxies bound together by gravity. There are also areas with very few galaxies. These locations are called **cosmic voids**, and they are some of the emptiest regions of our universe.

**Finally, let's add a legend to our map. To do this, we first need to add `label = 'Bright Galaxy Survey'` as the fourth parameter in our `pyplot.scatter` input. Then, on the line below `pyplot.scatter`, type `pyplot.legend(framealpha=1, markerscale=5, loc='upper left')`. Rerun the cell, and you'll see a legend added in the corner.**

Our legend tells us that the galaxies we are plotting come from the DESI Bright Galaxy Survey, which is one of several galaxy catalogs that DESI is putting together. The Bright Galaxy Survey is a detailed map of the universe within seven billion light-years of our own galaxy. That sounds like a large distance, but the Bright Galaxy Survey is actually one of the closest DESI catalogs to us. Other DESI catalogs map out the locations of various types of galaxies that are located even farther away!

Now you're ready to use your knowledge of Python and Python packages to start exploring astrophysics. Have fun!

## Before closing this notebook

Remember that all work will be permanently deleted once the session ends.

Before closing this notebook, <span style="color: red">**download your completed notebook by selecting `File → Download → Download Notebook`.**</span>

<span style="color: red">**Upload the downloaded `.ipynb` file to Canvas before the end of the class.**</span>