Now that we know the basics of Python, we can try to do something useful with it.

In the working directory of this notebook is a file called [planets.csv](planets.csv). This contains data on all confirmed exoplanets listed in the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=planets). Each row holds the data on one planet, and each column holds one attribute of the planets. The details of each column are described in [planets-db-info.txt](planets-db-info.txt).

We're going to read in the file and do some simple analysis on the data. The easiest way to read the data is using the `DictReader` class from the [`csv`](https://docs.python.org/3/library/csv.html) module.

In [None]:
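from csv import DictReader

# newline='' is what the csv module docs recommend when opening a file for a reader.
with open('planets.csv', newline='') as dbfile:
    # Each row becomes a dict keyed by the column names in the header line.
    dbreader = DictReader(dbfile)
    planets = list(dbreader)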

Now, `planets` is a `list` with one `dict` per planet in the database. Each `dict` has a key and value for each column in the database.

In [None]:
# Print all the info on the first planet, with column names padded to 20 characters:
for column, value in planets[0].items():
    print(column.ljust(20), value)

Now use the list of planets to:
- Find the nearest and farthest planets (using the "st_dist" column).
- Find the average of the planet mass (in Earth masses, using "pl_masse") and the average of the planet equilibrium temperature (using "pl_eqt").
- Count how many planets have been found each year (using "pl_disc").
- Calculate a measure of how Earth-like each planet is using $(\mathrm{pl\_masse}-1)^2 + \left(\frac{\mathrm{pl\_eqt}-300}{300}\right)^2 + (\mathrm{pl\_orbeccen}-1)^2$. "pl_orbeccen" is the orbital eccentricity of the planet (how elliptical it is). Find the planet that's most Earth-like (has the lowest value of this measure).

Note that the column values are all read in as strings, so you'll need to convert them to `float` as necessary. Also, not all planets have values for every column, so you should skip those that are missing the column of interest.
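
As a starting point, the conversion and skipping described above could look something like the sketch below. The helper name `column_floats` and the sample rows are made up for illustration, and it assumes a missing value shows up as an empty string (or `None`), which you should check against the real file.

```python
def column_floats(planets, column):
    """Return the values of `column` as floats, skipping missing entries."""
    values = []
    for planet in planets:
        raw = planet.get(column)
        # Assumption: a missing value is None or an empty/blank string.
        if raw is None or raw.strip() == "":
            continue
        values.append(float(raw))
    return values


# Hypothetical rows standing in for the list read from planets.csv.
sample = [
    {"pl_name": "A b", "st_dist": "12.5", "pl_eqt": "300"},
    {"pl_name": "B b", "st_dist": "", "pl_eqt": "1200"},
    {"pl_name": "C b", "st_dist": "4.0", "pl_eqt": ""},
]

distances = column_floats(sample, "st_dist")
temperatures = column_floats(sample, "pl_eqt")
print(min(distances), max(distances))
print(sum(temperatures) / len(temperatures))
```

This only gives you the values themselves. To find *which* planet is nearest or farthest, you'd need to keep hold of the whole `dict` rather than just the number.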

Try writing a class to contain the list of planets with member functions to solve the problems above.
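
If you're not sure how to structure such a class, here is one possible outline. The class name `PlanetDB`, its method names, and the sample rows are all hypothetical, and it makes the same assumption that missing values are empty strings. The Earth-likeness method implements the measure exactly as written in the list above.

```python
from collections import Counter


class PlanetDB:
    """Hypothetical container for the list of planet dicts."""

    def __init__(self, planets):
        self.planets = planets

    @staticmethod
    def _has(planet, *columns):
        """True if the planet has a non-empty value for every column."""
        return all((planet.get(c) or "").strip() for c in columns)

    def extreme(self, column, pick=min):
        """Return the planet with the smallest (or largest) value of `column`."""
        candidates = [p for p in self.planets if self._has(p, column)]
        return pick(candidates, key=lambda p: float(p[column]))

    def count_by(self, column):
        """Count planets per distinct value of `column`, e.g. discovery year."""
        return Counter(p[column] for p in self.planets if self._has(p, column))

    def earth_likeness(self, planet):
        """The measure from the exercise list; lower means more Earth-like."""
        mass = float(planet["pl_masse"])
        temp = float(planet["pl_eqt"])
        ecc = float(planet["pl_orbeccen"])
        return (mass - 1) ** 2 + ((temp - 300) / 300) ** 2 + (ecc - 1) ** 2

    def most_earth_like(self):
        """Return the planet with the lowest measure among complete rows."""
        columns = ("pl_masse", "pl_eqt", "pl_orbeccen")
        candidates = [p for p in self.planets if self._has(p, *columns)]
        return min(candidates, key=self.earth_likeness)


# Hypothetical rows standing in for the list read from planets.csv.
db = PlanetDB([
    {"pl_name": "A b", "st_dist": "12.5", "pl_disc": "2014",
     "pl_masse": "1.2", "pl_eqt": "310", "pl_orbeccen": "0.9"},
    {"pl_name": "B b", "st_dist": "", "pl_disc": "2014",
     "pl_masse": "300", "pl_eqt": "1500", "pl_orbeccen": "0.1"},
    {"pl_name": "C b", "st_dist": "4.0", "pl_disc": "2009",
     "pl_masse": "5.0", "pl_eqt": "", "pl_orbeccen": "0.2"},
])

print(db.extreme("st_dist")["pl_name"])       # nearest planet
print(db.extreme("st_dist", max)["pl_name"])  # farthest planet
print(db.count_by("pl_disc"))
print(db.most_earth_like()["pl_name"])
```

In the notebook you would build it with `PlanetDB(planets)` instead of the sample rows. Passing `min` or `max` as an argument is just one way to avoid writing two near-identical methods.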

As mentioned, [matplotlib](https://matplotlib.org/index.html) is great for graphical output in data analysis. Try to find out how to use `matplotlib.pyplot.hist` to make a histogram of the planets' orbital period ("pl_orbper"). The orbital periods span quite a large range of values, but the majority of planets in the database have quite short ones, so you'll want to restrict the range to between 0 and 200.

Remember to use `help` to find out more about a class, function, or object.
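
If you get stuck, a histogram function might be sketched roughly as follows. The names `column_floats` and `plot_histogram` are invented for this example, and the choice of 50 bins is arbitrary. It relies on the `range` argument of `pyplot.hist`, which sets the bin limits and ignores values outside them.

```python
def column_floats(planets, column):
    """Return the values of `column` as floats, skipping missing entries."""
    values = []
    for planet in planets:
        raw = (planet.get(column) or "").strip()
        if raw:  # assumption: missing values are empty strings
            values.append(float(raw))
    return values


def plot_histogram(planets, column, low, high, bins=50):
    """Histogram of `column`, restricted to the range [low, high]."""
    # Imported here so column_floats can be used without matplotlib installed.
    from matplotlib import pyplot as plt

    values = column_floats(planets, column)
    # Values outside (low, high) are ignored by hist when range is given.
    plt.hist(values, bins=bins, range=(low, high))
    plt.xlabel(column)
    plt.ylabel("Number of planets")
    plt.show()
```

With the list from earlier, `plot_histogram(planets, "pl_orbper", 0, 200)` would then draw the orbital period histogram.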

In [None]:
from matplotlib import pyplot as plt

Try adding a member function to your database class to plot any attribute of the planets in a given range. Then try plotting some other properties like planet mass or equilibrium temperature.

Now try using `matplotlib.pyplot.hist2d` to make a plot of planet equilibrium temperature (in the range 0-3000) vs star effective temperature ("st_teff", in the range 0-10000).

Try doing planet equilibrium temperature vs orbital period (in the range 0-200). How about vs 1/(orbital period) (in the range 0-1)?
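
For the 2D plots, the main extra step is keeping only planets that have *both* values, so the two lists stay aligned. One possible sketch is below; `paired_floats`, `inverse_x` and `plot_hist2d` are hypothetical names, and missing values are again assumed to be empty strings.

```python
def paired_floats(planets, x_column, y_column):
    """Return (xs, ys) for planets that have values in both columns."""
    xs, ys = [], []
    for planet in planets:
        x_raw = (planet.get(x_column) or "").strip()
        y_raw = (planet.get(y_column) or "").strip()
        if x_raw and y_raw:
            xs.append(float(x_raw))
            ys.append(float(y_raw))
    return xs, ys


def inverse_x(xs, ys):
    """Replace each x by 1/x, dropping pairs where x is zero."""
    pairs = [(1 / x, y) for x, y in zip(xs, ys) if x != 0]
    return [x for x, _ in pairs], [y for _, y in pairs]


def plot_hist2d(xs, ys, x_range, y_range, x_label, y_label, bins=50):
    """2D histogram of ys against xs within the given (low, high) ranges."""
    # Imported here so the helpers above work without matplotlib installed.
    from matplotlib import pyplot as plt

    plt.hist2d(xs, ys, bins=bins, range=[x_range, y_range])
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    plt.colorbar(label="Number of planets")
    plt.show()
```

For example, `plot_hist2d(*paired_floats(planets, "st_teff", "pl_eqt"), (0, 10000), (0, 3000), "st_teff", "pl_eqt")` would give the first plot, and passing the orbital period pairs through `inverse_x` first would give the 1/(orbital period) version.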

Now try using `numpy.polyfit` to fit a straight line to the data used to make these plots.
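
A straight line is a polynomial of degree 1, so the fit is `polyfit(xs, ys, 1)`, which returns the coefficients with the highest power first (slope, then intercept). A minimal sketch, with the hypothetical helper names `select_range` and `fit_line`, might look like this. Restricting the points to the plotted x range before fitting is a choice made here, not something `polyfit` does for you.

```python
def select_range(xs, ys, low, high):
    """Keep only the (x, y) pairs whose x lies within [low, high]."""
    pairs = [(x, y) for x, y in zip(xs, ys) if low <= x <= high]
    return [x for x, _ in pairs], [y for _, y in pairs]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    # Imported here so select_range can be used without numpy installed.
    from numpy import polyfit

    slope, intercept = polyfit(xs, ys, 1)
    return float(slope), float(intercept)
```

To draw the fitted line over a plot, you could evaluate `slope * x + intercept` at the two ends of the x range and pass those points to `plt.plot` before calling `plt.show()`.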

In [None]:
from numpy import polyfit

Implement the 2D plotting and fitting as functions in your class.

If you think of something else to try, go ahead! See what interesting info you can extract from the database.

If you want to try some different challenges, have a look at [Project Euler](https://projecteuler.net/).