# Lenspy Demo

Written by Robert Morgan

`lenspy` is a software package that lets you interact with the strong gravitational lensing simulation software `lenstronomy` in a streamlined framework. Let's take a look at how it works!

In [None]:
import lenspy

`lenspy` works by reading user-prepared configuration files. Start by specifying the configuration file you will use to make the dataset.

In [None]:
config_file = 'configs/des_gal_gal.yaml'

The configuration file is a YAML-style file that specifies all the properties of your dataset. Here is what this file contains:

In [None]:
! cat configs/des_gal_gal.yaml

Let's break down what the different sections mean:

<add lots of details here>

## Simulating a Dataset

Let's put `lenspy` to work!

Start by deciding whether you want to save the dataset to files or operate in an interactive mode.
- Set `store=True` to store the dataset in a variable
- Set `save=True` to output the dataset as files

The default is the interactive mode with `store=True` and `save=False`. Let's work with that first.

In [None]:
dataset = lenspy.make_dataset(config_file)

That's it. You now have your dataset.

## Interacting with the Dataset

Now that we have a dataset, let's look at what was stored in the `dataset` variable.

The configuration labels you specified in the configuration file are stored here. If you plan to do some sort of supervised classification, you will probably want the images to be labeled, and these labels serve that purpose.

In [None]:
print(dataset.configurations)

The dataset name, size, and output directory are also stored as attributes of the dataset object.

In [None]:
print(dataset.name)
print(dataset.size)
print(dataset.outdir)

There are a few other things that get stored automatically (that you can explore via `dir(dataset)`), but we'll shift our focus to the things we simulated.

The most interesting information is stored in the following attributes:

In [None]:
for item in [x for x in dir(dataset) if x.startswith('CONFIGURATION')]:
    print(item)

The `_images` attribute is a `numpy.ndarray` object and the `_metadata` attribute is a `pandas.DataFrame` object.

In [None]:
print(type(dataset.CONFIGURATION_1_images))
print(type(dataset.CONFIGURATION_1_metadata))

### Images

Let's check out some of the images in the `dataset.CONFIGURATION_1_images` attribute.

What's in this array?

In [None]:
print(dataset.CONFIGURATION_1_images.shape)

The array dimensions are (image index, band, x_pixels, y_pixels). 

The number of images is the size of the dataset multiplied by the fraction of the dataset in CONFIGURATION_1, both of which you specify in the configuration file. The bands used are also specified in the configuration file. Finally, you guessed it, the image dimensions are specified in the configuration file as well.
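
As a quick sanity check on the expected array shape, the arithmetic above can be sketched as follows. The numbers here (a dataset size of 100, a CONFIGURATION_1 fraction of 0.25, four bands, and 100x100 pixel images) are made-up values for illustration, not the contents of `des_gal_gal.yaml`, and how `lenspy` handles a non-integer image count is not covered here.

```python
# Hypothetical values standing in for entries in a configuration file.
dataset_size = 100            # total number of images requested
fraction = 0.25               # fraction of the dataset in CONFIGURATION_1
bands = ['g', 'r', 'i', 'z']  # assumed band list
num_pix = 100                 # image side length in pixels

# Number of CONFIGURATION_1 images = dataset size * fraction.
num_images = int(dataset_size * fraction)

# Expected dimensions: (image index, band, x_pixels, y_pixels).
expected_shape = (num_images, len(bands), num_pix, num_pix)
print(expected_shape)
```

Comparing this tuple against `dataset.CONFIGURATION_1_images.shape` is an easy way to confirm the configuration file was read the way you intended.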

`lenspy` also has built-in visualization functions.

Let's look at the r-band of image index 2 in CONFIGURATION_1:

In [None]:
lenspy.view_image(dataset.CONFIGURATION_1_images[2][1])

You can also look at all the bands for this image at once.

In [None]:
lenspy.view_image(dataset.CONFIGURATION_1_images[2])
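
The two calls above differ only in how the array is indexed. A minimal sketch with a stand-in array of zeros (the shape and the g, r, i, z band order are assumptions for illustration) shows what each index returns:

```python
import numpy as np

# Stand-in for dataset.CONFIGURATION_1_images: 5 images, 4 bands, 100x100 pixels.
images = np.zeros((5, 4, 100, 100))

single_band = images[2][1]   # image index 2, band index 1 -> 2D array
all_bands = images[2]        # image index 2, every band -> 3D array

print(single_band.shape)
print(all_bands.shape)

# images[2][1] and images[2, 1] select the same pixels.
same_pixels = np.array_equal(images[2][1], images[2, 1])
```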

Finally, you can combine the single-band images into a single RGB image:

In [None]:
lenspy.view_image_rgb(dataset.CONFIGURATION_1_images[2])
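
For intuition, one common way to build an RGB image from single-band images is to pick three bands, rescale them, and stack them along a trailing color axis. The sketch below does this with plain NumPy. It is not how `lenspy.view_image_rgb` is implemented; the band order (g, r, i, z), the choice of i, r, g for red, green, blue, and the `make_rgb` helper are all assumptions.

```python
import numpy as np

def make_rgb(image, red=2, green=1, blue=0):
    """Stack three bands of a (band, x, y) array into an (x, y, 3) RGB array.

    Hypothetical helper; the default indices assume a g, r, i, z band order.
    """
    rgb = np.stack([image[red], image[green], image[blue]], axis=-1).astype(float)
    # Rescale all channels together to [0, 1] so relative colors are preserved.
    rgb -= rgb.min()
    peak = rgb.max()
    if peak > 0:
        rgb /= peak
    return rgb

# Random stand-in for dataset.CONFIGURATION_1_images[2].
rng = np.random.default_rng(0)
image = rng.normal(size=(4, 100, 100))
rgb = make_rgb(image)
print(rgb.shape)
```

An array in this layout, with float values in [0, 1], can be displayed with `matplotlib.pyplot.imshow`.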

### Metadata

Once you have an image, you may want to consider the parameters that went into its generation to better understand what you made. To check that out, you can view the metadata saved by `lenspy`.

Let's look at the properties of the metadata.

In [None]:
dataset.CONFIGURATION_1_metadata.shape

Wow. That's a lot of columns. The number of columns increases with the complexity of your configurations, since there is more information for `lenspy` to keep track of. The columns are also broken up by band, so doubling the number of bands will double the number of columns in the metadata.

Let's look at the column names to see what information we have.

In [None]:
for col in dataset.CONFIGURATION_1_metadata.columns:
    print(col)

Every individual number used in the `lenstronomy` simulation is tracked.

In addition, the row index in the metadata dataframe corresponds to the image index in the image array, so you can track which image has which properties. The dataframe contents can be accessed like this:

In [None]:
dataset.CONFIGURATION_1_metadata.iloc[0:5]
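
Because rows and images share an index, a condition on the metadata can be used directly to select images. The sketch below uses small stand-in objects; the column name `z_lens` is hypothetical and not taken from the `lenspy` metadata, and the approach assumes the dataframe keeps its default 0-based row index.

```python
import numpy as np
import pandas as pd

# Stand-ins for dataset.CONFIGURATION_1_images and dataset.CONFIGURATION_1_metadata.
images = np.arange(5 * 4 * 2 * 2, dtype=float).reshape(5, 4, 2, 2)
metadata = pd.DataFrame({'z_lens': [0.2, 0.6, 0.4, 0.8, 0.1]})  # hypothetical column

# Boolean mask over rows; row i describes images[i].
mask = (metadata['z_lens'] > 0.5).to_numpy()
selected_images = images[mask]
selected_metadata = metadata[mask]

print(selected_images.shape)
print(list(selected_metadata.index))
```

The index of `selected_metadata` still holds the original image indices, so each selected image can be traced back to its position in the full array.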

## Saving Datasets

If you are working in interactive mode, you can straightforwardly save the images array and metadata dataframe in any file format you are comfortable with.
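
For example, one option is NumPy's `.npy` format for the images and CSV for the metadata. This is a minimal sketch with stand-in objects and a temporary directory; the file names and the column name are arbitrary choices for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-ins for dataset.CONFIGURATION_1_images and dataset.CONFIGURATION_1_metadata.
images = np.zeros((5, 4, 10, 10))
metadata = pd.DataFrame({'example_column': [0.1, 0.2, 0.3, 0.4, 0.5]})  # hypothetical column

with tempfile.TemporaryDirectory() as outdir:
    image_path = os.path.join(outdir, 'my_images.npy')
    metadata_path = os.path.join(outdir, 'my_metadata.csv')

    np.save(image_path, images)
    metadata.to_csv(metadata_path, index=False)

    # Load the files back to confirm the round trip.
    loaded_images = np.load(image_path)
    loaded_metadata = pd.read_csv(metadata_path)

print(loaded_images.shape)
print(loaded_metadata.shape)
```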

If you instead set `save=True` when making your dataset, `lenspy` writes the dataset to files. Let's look at what gets saved where.

In [None]:
saved_dataset = lenspy.make_dataset(config_file, save=True)

Recall that the dataset object has the user-specified output directory as an attribute.

In [None]:
print(saved_dataset.outdir)

Let's look in that directory.

In [None]:
! ls DES_GalGal

The image arrays have been stored as numpy files. They can be loaded with
```python
import numpy
images = numpy.load('DES_GalGal/CONFIGURATION_1_images.npy')
```

The metadata dataframes have been written to csv files. They can be loaded with
```python
import pandas
metadata = pandas.read_csv('DES_GalGal/CONFIGURATION_1_metadata.csv')
```

Future versions of `lenspy` will include file format flexibility and built-in dataset loading functions.

## The End

That's pretty much all there is to `lenspy`! Feel free to contact me with any suggestions or bug reports.

Have fun `lenspy`-ing!