In [None]:
# Running in Google Colab? Run this cell to download the data
!wget https://raw.githubusercontent.com/CIERA-Northwestern/REACHpy/main/Module_2/data/exoplanets_subset.txt

# If you're not running in Colab, this file should be in the data directory.
# Change the loading path of the file to include 'data/' when the file is loaded

# Challenge 1 | Using Python to Analyze and Plot Exoplanet Data

## Section 1: Background

In the last several years, the Kepler telescope has provided a wealth of data on new exoplanets, not to mention the exoplanets detected previously with other methods and other telescopes. It's a gold mine of data to be explored by astronomers who want to better understand the properties of these planets, such as their sizes, masses, and orbital periods around their host stars.

Here you'll work with a subset of exoplanet data to make a plot similar to the one shown below (Credit: NASA Ames/W. Stenzel). While astronomers have been finding exoplanets since the 1990s, the __[Kepler Mission](https://www.nasa.gov/mission_pages/kepler/overview/index.html)__ has led to an explosion of new exoplanet detections. The Kepler telescope detects planets using the transit method: it monitors the brightness of stars and watches for dips in brightness that occur when a planet transits, or passes in front of, its star. While these brightness dips are incredibly small, Kepler can measure stars' brightness precisely enough to capture these transit events.

![Kepler Planet Candidates](https://github.com/mcstroh/REACHpy/blob/ideaspy_updates/Module_2/images/Kepler_planets.jpg?raw=1)
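
To get a feel for how small a transit dip is, here is a rough sketch. It assumes the simplest geometric picture, in which the fraction of starlight blocked equals the ratio of the planet's disk area to the star's disk area, and it ignores effects such as limb darkening. The function name `transit_depth` and the constants are illustrative and are not part of the challenge data.

```python
# Approximate transit depth: fraction of starlight blocked by the planet,
# taken here as (R_planet / R_star)**2. This is a simplified geometric estimate.
R_SUN_KM = 695_700.0      # nominal solar radius
R_JUPITER_KM = 71_492.0   # Jupiter's equatorial radius
R_EARTH_KM = 6_378.1      # Earth's equatorial radius


def transit_depth(planet_radius_km, star_radius_km=R_SUN_KM):
    """Return the fractional dip in brightness during a transit."""
    return (planet_radius_km / star_radius_km) ** 2


print(f"Jupiter-size planet: {transit_depth(R_JUPITER_KM):.4%}")
print(f"Earth-size planet:   {transit_depth(R_EARTH_KM):.4%}")
```

Under these assumptions, a Jupiter-size planet crossing a Sun-like star dims it by about 1%, while an Earth-size planet dims it by less than 0.01%.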

Several alternative methods exist for detecting planets. To date, with all methods combined, astronomers have detected more than 4700 confirmed exoplanets.

The __[Open Exoplanet Catalogue](http://www.openexoplanetcatalogue.com/)__ is a public catalogue of all discovered exoplanets. You can browse it online or download it from GitHub. For this challenge, we've collected a sample of exoplanet data that you'll use to explore the sizes and orbital periods of a subset of these exoplanets.

To complete this challenge, you will:

 - Create and work with Python lists.
 - Read in and parse data to put it into a useful format.
 - Define your own functions and modules, which help keep your code readable and organized.
 - Create simple plots using the plotting package matplotlib.

You will work with exoplanet data to study the planets' radii and their orbital periods around their host stars. We'll need to read in the data from a file, parse the data to extract what we need, write a function to do some unit conversions, and make a couple of plots.
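
As an illustration of what a unit-conversion function can look like, here is a minimal sketch that converts an orbital period from days to years. The name `days_to_years` is made up for this example, and the units used in the data file are not assumed here; check the comments at the top of the file before deciding which conversions you actually need.

```python
DAYS_PER_YEAR = 365.25  # length of a Julian year in days


def days_to_years(period_days):
    """Convert an orbital period from days to years.

    Works on a single number or on a NumPy array of periods.
    """
    return period_days / DAYS_PER_YEAR


print(days_to_years(365.25))
```

Keeping the conversion in a small, named function means the same code can be reused for every planet in the data set.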

## Section 2: Read and parse data

The file you're using contains comments at the start of the file. We are interested in the *planetary radius* and *orbital period*.

Before reading the file with Python, we will use the Unix command `cat` to print the entire file. Run the next cell and use the output to figure out which columns are important.

If you are not using Google Colab, you can instead open this file using your favorite text editor.

In [None]:
# Use the following command to see the contents of the file we're using. This is not Python.
!cat exoplanets_subset.txt

Now read in the file using the `numpy.loadtxt()` function introduced in Section 3. You may want to use the `usecols` keyword to read in only certain columns.
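
If you'd like a reminder of how `usecols` behaves, the sketch below runs `numpy.loadtxt()` on a small made-up table held in memory. The column layout and values are invented for illustration and do not match `exoplanets_subset.txt`; you still need to work out the right column indices from the `cat` output.

```python
import io

import numpy as np

# Made-up stand-in for a data file: '#' comment lines, then whitespace-separated columns.
fake_file = io.StringIO(
    "# col 0: id, col 1: mass, col 2: radius, col 3: period\n"
    "1 0.5 1.2 3.5\n"
    "2 1.0 0.9 10.0\n"
    "3 2.0 2.4 365.0\n"
)

# Lines starting with '#' are skipped by default.
# usecols selects columns by zero-based index.
data = np.loadtxt(fake_file, usecols=(2, 3))
print(data.shape)  # (3, 2): three rows, two selected columns
```

If a file uses a separator other than whitespace, `numpy.loadtxt()` also accepts a `delimiter` keyword.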

In [None]:
# Open and read all lines using NumPy.

Create two arrays: one that contains the planetary radii and another that stores the orbital periods.
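
One way to pull separate columns out of a two-dimensional NumPy array is slicing. The sketch below uses a small invented array, and the assumption that radii sit in the first column and periods in the second is purely for illustration.

```python
import numpy as np

# Invented 2-D array: each row is one planet, each column one quantity.
table = np.array([
    [1.2, 3.5],
    [0.9, 10.0],
    [2.4, 365.0],
])

# table[:, k] means "every row, column k".
radii = table[:, 0]
periods = table[:, 1]

print(radii)
print(periods)
```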

In [None]:
# Create the two arrays

## Section 3: Creating Figures

Create a histogram of the planetary radii. Twenty bins is a good starting value for your histogram.
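
As a reminder of the histogram call, here is a sketch using randomly generated numbers in place of real radii. The array `fake_radii` and its distribution are made up; swap in your own radius array for the challenge.

```python
import matplotlib.pyplot as plt
import numpy as np

# Randomly generated stand-in values, not real planet radii.
rng = np.random.default_rng(0)
fake_radii = rng.lognormal(mean=0.0, sigma=0.5, size=200)

plt.figure()
# bins=20 splits the range of the data into twenty equal-width bins.
counts, bin_edges, _ = plt.hist(fake_radii, bins=20)
plt.xlabel("Planetary radius (made-up units)")
plt.ylabel("Number of planets")
```

Try a few different values of `bins` to see how the apparent shape of the distribution changes.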

In [None]:
# Create your histogram

Create a figure with the planetary radii (y-axis) versus the orbital period (x-axis). Use dots as your marker symbols, and do not connect the data points with lines (it would be meaningless here).
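
The sketch below shows one way to get dots without connecting lines: passing the format string `"."` to `plt.plot()`. The four data points are invented for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Invented values for illustration only.
fake_periods = np.array([3.5, 10.0, 88.0, 365.0])
fake_radii = np.array([1.2, 0.9, 0.4, 1.0])

plt.figure()
# The "." format string draws a point marker at each (x, y) pair and no line.
lines = plt.plot(fake_periods, fake_radii, ".")
plt.xlabel("Orbital period (made-up units)")
plt.ylabel("Planetary radius (made-up units)")
```

`plt.scatter()` is an alternative that also draws unconnected markers.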

In [None]:
# Create your planetary radii versus orbital period figure

Make sure you've added appropriate x- and y-labels to each of the figures you created. The figure in Section 1 used logarithmic scaling on the x- and y-axes. Use `plt.xscale('log')` and `plt.yscale('log')` to obtain a similar scaling.
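
Here is a short sketch of the log-scaling calls in context, again with invented numbers that span several orders of magnitude. The labels are placeholders; use the real quantities and units from the data file in your own figures.

```python
import matplotlib.pyplot as plt
import numpy as np

# Invented values spanning a wide range, for illustration only.
fake_periods = np.array([0.5, 5.0, 50.0, 500.0])
fake_radii = np.array([0.8, 2.0, 11.0, 4.0])

plt.figure()
plt.plot(fake_periods, fake_radii, ".")

# Switch both axes of the current figure to logarithmic scaling.
plt.xscale("log")
plt.yscale("log")

plt.xlabel("Orbital period (made-up units)")
plt.ylabel("Planetary radius (made-up units)")
```

Log scaling requires positive values, so check your data for zeros or negative entries if points seem to be missing.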