### Getting and Displaying Images from NASA

This notebook walks through getting, displaying, and saving images from NASA's Image and Video Library into a dataset that can be edited and used for fine-tuning. To use this dataset, an API key is required. Once a key has been acquired through a simple request process, the following instructions can be applied to access the images within the dataset: https://www.educative.io/blog/how-to-use-api-nasa-daily-image

This also requires the nasapy package, a Python wrapper for NASA's APIs. Documentation for nasapy is available here: https://nasapy.readthedocs.io/en/latest/ and here: https://nasapy.readthedocs.io/en/latest/api.html

In [None]:

#import libraries
import nasapy #python wrapper for NASA API 
from nasapy import media_search
import os #haven't used yet
import matplotlib.pyplot as plt
from skimage import io
import pandas as pd

In [None]:
nasa_key = pd.read_csv("nasa_api_key.csv")

In [None]:
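url = "http://images-api.nasa.gov/search?q=exoplanet&api_key=" #query-string form; not used in the cells below
api_key = nasa_key['api_key'].iloc[0] #take the key string from the first row rather than the whole column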

In [None]:
nasa = nasapy.Nasa(key = api_key)
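A key stored in a CSV file has to be reduced to a single string before it is handed to nasapy. Below is a minimal sketch of doing that with only the standard library, assuming the file has an `api_key` header with the key in the first data row. The helper name `read_api_key` and the placeholder key are hypothetical.

```python
import csv
import io


def read_api_key(csv_file):
    """Return the first non-empty value in the `api_key` column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        key = (row.get("api_key") or "").strip()
        if key:
            return key
    raise ValueError("no api_key value found in the CSV")


# Stand-in for open("nasa_api_key.csv"); the value below is a placeholder, not a real key.
sample_file = io.StringIO("api_key\nPLACEHOLDER_KEY\n")
api_key = read_api_key(sample_file)
print(api_key)
```

Raising an error when no key is found makes a missing or misnamed column show up immediately instead of at the first request.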

Now that the key is saved and ready to use, I can begin the process of querying NASA's dataset. The first step is to display the images from a search query. This allows me to select which images I want to save from the search, which is step two. The first part of this code has been adapted from this article: https://onelinerhub.com/python-pillow/how-to-load-an-image-from-url

In [None]:
#defining a function to intake image data and display it, so we can make sure our dataset 
#only includes the images we want it to

def get_images(database_name):
    for i, image in enumerate(database_name):
        link_data = image['links']

        for url in link_data:
            image_url = url['href']
            image_to_show = io.imread(image_url)
            plt.imshow(image_to_show)
            plt.xlabel(i)
            plt.show()
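An entry in the links list may point at something other than a displayable image, so it can help to filter the links before trying to read them. The sketch below assumes image links carry a `render` field set to `"image"`; that field layout is an assumption about the API response, `image_urls` is a hypothetical helper, and the sample records are made up.

```python
def image_urls(items):
    """Yield (index, url) for every link that is marked as an image."""
    for i, item in enumerate(items):
        for link in item.get("links", []):
            # Assumption: image links are tagged with render == "image".
            if link.get("render") == "image":
                yield i, link["href"]


# Made-up records shaped like search results, for illustration only.
sample_items = [
    {"links": [
        {"href": "https://example.invalid/a~thumb.jpg", "render": "image"},
        {"href": "https://example.invalid/a.srt", "rel": "captions"},
    ]},
    {"data": [{"title": "record without links"}]},
]
print(list(image_urls(sample_items)))
```

The index is kept alongside each URL so it can still be used as the label when the image is displayed.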

In [None]:
#defining a function that takes in a list of the image indexes above that we want to keep 
#and appends the data for those images to the shared keep_images list

#keep = [] #fill this with the index of the image we want to keep, as a list

keep_images = []
def save_keep_images(database_name, keep_indexes):
    for i, image in enumerate(database_name):
        if i in keep_indexes:
            keep_images.append(database_name[i])
    return keep_images
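Because keep_images lives outside the function, re-running a selection cell appends the same images a second time. Here is a minimal sketch of an alternative that returns a fresh list and leaves the combining to the caller. `select_images` is a hypothetical name, and the sample data is made up.

```python
def select_images(items, keep_indexes):
    """Return the items whose position is listed in keep_indexes."""
    wanted = set(keep_indexes)  # set membership avoids rescanning the index list
    return [item for i, item in enumerate(items) if i in wanted]


# Made-up stand-ins for the results of two searches.
first_search = ["img_a", "img_b", "img_c"]
second_search = ["img_d", "img_e"]

selected = select_images(first_search, [0, 2]) + select_images(second_search, [1])
print(selected)  # ['img_a', 'img_c', 'img_e']
```

With this shape, running the same cell twice produces the same list both times.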

Now that we have these functions, we can begin getting data from multiple keyword searches of NASA's image database.

### Searching Exoplanets

In [None]:
exoplanet_images = media_search(query="exoplanet", media_type="image")
exoplanet_data = exoplanet_images['items']

In [None]:
get_images(exoplanet_data)

In [None]:
keep_indexes = [0, 35, 36, 37, 38, 39, 41, 42, 45, 46, 54, 55, 65]
save_keep_images(exoplanet_data, keep_indexes)

### Searching Planets

The media_search function is documented here: https://nasapy.readthedocs.io/en/latest/api.html

I worked out how to use it from the information found here: https://www.educative.io/blog/how-to-use-api-nasa-daily-image

In [None]:
planet_images = media_search(query="planet artist concept", media_type="image")
planet_data = planet_images["items"]

In [None]:
get_images(planet_data)

In [None]:
keep_indexes = [0, 1, 3, 4, 6, 7, 9, 10, 11, 12, 13, 14, 15, 18, 19, 20, 21, 25, 26, 35, 43, 45, 46, 47, 48, 51, 52, 53, 54, 55, 57, 59, 62, 69, 70, 71, 72, 73, 74, 76, 77, 80, 83, 84, 85, 86, 87, 92, 96]
save_keep_images(planet_data, keep_indexes)

In [None]:
print(len(keep_images)) #makes sure all our images were saved correctly

### Searching Planet Photographs (near solar system)

In [None]:
planet_photos = media_search(query="planet photographs", media_type="image")
planet_photo_data = planet_photos["items"]

In [None]:
get_images(planet_photo_data)

In [None]:
keep_indexes = [4, 6, 14, 29, 77, 80, 87]
save_keep_images(planet_photo_data, keep_indexes)

In [None]:
print(len(keep_images)) #making sure our images saved correctly

Now that I have images I can work with, I want to pull out only the information I need from them. First, let's take a look at the dataset.

In [None]:
keep_images[2] #taking a look at a random image 

Looking at one record, we can see that what we really need is the image description and the keywords, both held in the data key (we may not need the keywords, but we're going to grab them just in case), and the image link, held in the links key. Let's grab those and save them to a CSV file we can edit and re-upload.
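As a sketch of where those three fields sit, the snippet below pulls them out of one record with plain Python. The record is a made-up example shaped like a search result (real records carry more fields), the helper `summarize_record` is hypothetical, and it assumes the first entry of each list is the one we want.

```python
def summarize_record(record):
    """Pull the description, keywords, and image link out of one search result."""
    meta = record["data"][0]    # assumption: 'data' is a list holding one metadata dict
    link = record["links"][0]   # assumption: the first link is the image we want
    return {
        "image_description": meta.get("description", ""),
        "keywords": meta.get("keywords", []),
        "image_link": link["href"],
    }


# Made-up record for illustration only.
sample_record = {
    "href": "https://example.invalid/collection.json",
    "data": [{"description": "Artist's concept of an exoplanet.",
              "keywords": ["exoplanet", "artist concept"]}],
    "links": [{"href": "https://example.invalid/preview~thumb.jpg"}],
}
print(summarize_record(sample_record))
```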

In [None]:
#turning the list of image dictionaries into a dataframe so we can save it as a csv
exoplanet_image_data = pd.DataFrame.from_dict(keep_images)

In [None]:
#taking a look at our data
exoplanet_image_data 
#the data and links columns hold nested lists, so we need to expand them before we can pull values out

The explode method was found here: https://saturncloud.io/blog/how-to-unnest-explode-a-column-in-a-pandas-dataframe/

In [None]:
#using the explode method to unpack the lists in data and links so we can access the values we want
exoplanet_image_data = exoplanet_image_data.explode('data')
exoplanet_image_data = exoplanet_image_data.explode('links')
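To see what explode does here, a small sketch on made-up records may help. Each list element gets its own row, so a record with two links turns into two rows that share the same index label. The sample records below are hypothetical.

```python
import pandas as pd

# Made-up records shaped like the search results: 'data' and 'links' hold lists.
records = [
    {"data": [{"description": "first"}],
     "links": [{"href": "https://example.invalid/1.jpg"}]},
    {"data": [{"description": "second"}],
     "links": [{"href": "https://example.invalid/2.jpg"},
               {"href": "https://example.invalid/2.srt"}]},
]

frame = pd.DataFrame(records)
exploded = frame.explode("data").explode("links")

# One row per list element; the second record now appears twice.
print(len(frame), len(exploded))
print(exploded.index.tolist())

# reset_index gives every row a unique label again.
exploded = exploded.reset_index(drop=True)
print(exploded.index.tolist())
```

Resetting the index is optional, but it avoids surprises when selecting rows by label after the explode.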

In [None]:
exoplanet_image_data #taking a look at our dataframe again 

In [None]:
#first thing we need to do is drop the href as it's only the link to the json file and not to our image
exoplanet_image_data = exoplanet_image_data.drop(['href'], axis=1)

In [None]:
#getting just our image descriptions
image_description = []
for d in exoplanet_image_data['data']:
    image_description.append(d['description'])

exoplanet_image_data['image_description'] = image_description

In [None]:
print(exoplanet_image_data['links'][0])

In [None]:
#getting just our image link
image_link = []
for link in exoplanet_image_data['links']:
    print(link['href'])
    image_link.append(link['href'])

exoplanet_image_data['image_link'] = image_link

In [None]:
#get rid of the columns we no longer need (drop returns a new dataframe, so the result is assigned back)
exoplanet_image_data = exoplanet_image_data.drop(["links", "data"], axis=1)

In [None]:
#saving my dataframe to a csv file on my computer, so I can edit and update it to fit my NASA exoplanet data

exoplanet_image_data.to_csv("exoplanet_image_data.csv")