# Lesson 02: Exploring data with yt

<div class="alert alert-block alert-info">
 
## Goals:

* Exploring fields (min/max), the quantities interface
* Selecting regions of data
* slices and rays
* visualizing our selections
    
</div>

Now that we've learned a bit about how to load data with yt and which fields are available in a dataset, we can use yt to perform analysis on those fields.

### Exploring fields

First, let's explore some values that occur in our data. For this tutorial we'll be using the `enzo_tiny_cosmology` sample dataset.

In [None]:
import yt
ds = yt.load_sample('enzo_tiny_cosmology')

This particular dataset is a *time series*. yt automatically loads the last file in a time series dataset; in this case it is `DD0046`.

Now let's explore a bit of the data contained in this file. First, let's see what fields we can explore:

In [None]:
ds.field_list

Great! That's a lot of fields. You might know that yt has a lot of tools for visualization, but it can also be used to inspect data. Here's an example where we find the extrema (the minimum and maximum values) of the "density" field.

In [None]:
dd = ds.all_data()
dd.quantities.extrema("density")

Ok, a few things have happened here. First, we called `ds.all_data()`, which returns a *region* in yt based on the domain boundaries detected when the data was read in. In other words, it returns all of the data in this dataset.

We can check this by inspecting the `dd` object, which is a `YTRegion`. 

In [None]:
dd

Region objects (and in fact any selection of the data) have **quantities** that we can compute on them, but these are only available on data selection objects. In the previous example, we accessed the `extrema` quantity, which returns a unyt array of length two holding the minimum and maximum values of the density field in this dataset.

Let's see what other quantities are available: 

In [None]:
print(list(dd.quantities.keys()))

Ok, now let's examine what arguments we need to pass to get something we want. How about `WeightedAverageQuantity`?

In [None]:
dd.quantities.weighted_average_quantity?

And now let's calculate the temperature-weighted average density of this dataset:

In [None]:
dd.quantities.weighted_average_quantity("density", weight="temperature")

We can also pass a list of fields for which to calculate a weighted average:

In [None]:
dd.quantities.weighted_average_quantity(["density", "temperature"], weight="temperature")
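
To make the weighting concrete, here is a minimal NumPy sketch of what a weighted average computes: the sum of field times weight, divided by the sum of the weights. The arrays are made-up toy values rather than data from `enzo_tiny_cosmology`, and `weighted_average` is a hypothetical helper written for this illustration, not part of yt.

```python
import numpy as np

def weighted_average(field, weight):
    """Return sum(field * weight) / sum(weight) for 1D arrays."""
    field = np.asarray(field, dtype=float)
    weight = np.asarray(weight, dtype=float)
    return (field * weight).sum() / weight.sum()

# Toy values (hypothetical, not taken from the dataset)
density = np.array([1.0, 2.0, 4.0])
temperature = np.array([10.0, 10.0, 80.0])

avg = weighted_average(density, temperature)
print(avg)
```

The hottest cell carries most of the weight here, so the result is pulled toward that cell's density. With a weight of all ones the same formula reduces to a plain mean.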

<div class="alert alert-block alert-warning">

## Interactive Exercise 01
    
Load the dataset `enzo_cosmology_plus` and find the following quantities:
    
* min and maximum values of each `enzo` field in the entire dataset
* the total gas mass in the dataset
* the `cell_volume` weighted average density
* the location of the min and max values 
    

</div>

In [None]:
# This is a starting cell to do exercise 01

<div class="alert alert-block alert-success">
 
### Tips:
    
* Instead of calling `dd = ds.all_data()` and then selecting data with `dens = dd["density"]`, yt offers an automatic region selector: `dens = ds.r["density"]`. `ds.r` is a `RegionExpression` object and doesn't have any quantities associated with it. When indexed with a field name, it returns a flattened array of the data. 
* for this tutorial we've only loaded in the last file `DD0046`. However, because this is a time-series dataset there are a number of other files. yt can recognize this by loading with wildcard operators. e.g. `ts = yt.load("enzo_tiny_cosmology/DD????/DD????")`. 

</div>
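
The `?` wildcard matches exactly one character, so `DD????` matches `DD0046` but not `DD46`. The sketch below illustrates this with Python's standard `fnmatch` module on a hypothetical list of paths. It assumes shell-style wildcard semantics and does not touch the filesystem or call yt.

```python
from fnmatch import fnmatch

pattern = "enzo_tiny_cosmology/DD????/DD????"

# Hypothetical paths, for illustration only
candidates = [
    "enzo_tiny_cosmology/DD0000/DD0000",
    "enzo_tiny_cosmology/DD0046/DD0046",
    "enzo_tiny_cosmology/DD0046/DD0046.hierarchy",
    "enzo_tiny_cosmology/DD46/DD46",
]

# Keep only the paths that match the whole pattern
matches = [path for path in candidates if fnmatch(path, pattern)]
print(matches)
```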

## Volumetric Region Selectors

So far we've loaded in the entire dataset with `all_data()`. We learned that this is a special function that returns a YTRegion that spans the entire domain of the data. However, yt has other data selectors that are available, like `sphere`. Let's do a region selection of the entire dataset manually and check that the values are the same as what we saw in the first section. 

In [None]:
bx = ds.box([0.,0.,0.], [1.,1.,1.]) 
# we could also specify a center and use ds.region(). ds.box() assumes the center of this prism is the 
# centerpoint of the left and right edges. 

In [None]:
bx

bx is a YTRegion object, which is what would be returned had we used the `ds.region()` selector as well. Let's compare it to the dd object we used before. 

In [None]:
dd 

They look the same at first pass! Let's see what happens if we look at the min, the max, and a weighted average of a field!

In [None]:
print(bx.quantities.extrema('temperature'))
print(dd.quantities.extrema('temperature'))

In [None]:
print(bx.quantities.weighted_average_quantity('temperature', weight='ones'))
print(dd.quantities.weighted_average_quantity('temperature', weight='ones'))

Ok! So now we can be reasonably sure that `ds.all_data()` is a nice shorthand for selecting all of the data in our dataset, and that we can create the same object with `box()`.

Now let's do a region selector with a sphere. 

In [None]:
ds.sphere?

Ok, so let's define a centerpoint at 0.5, 0.5, 0.5 in code units and extend the sphere outwards to 0.25 code units. 

In [None]:
center = [0.5, 0.5, 0.5]
sp = ds.sphere(center, 0.25)

Is the sphere a YTRegion object like the selector we saw before? Nope! It's a YTSphere. Both are subclasses of the same type of yt selector object, though, so they have similar operations available to them.

In [None]:
sp

Let's take a look at the extrema and compare them to what we saw in the larger dataset! 

In [None]:
print(sp.quantities.extrema('density'))
print(dd.quantities.extrema('density'))

In [None]:
print(sp.quantities.weighted_average_quantity('density', weight='cell_volume'))
print(dd.quantities.weighted_average_quantity('density', weight='cell_volume'))

In [None]:
print(sp.mean('density', weight='cell_volume'))
print(dd.mean('density', weight='cell_volume'))
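
Conceptually, a sphere selector keeps the cells that fall within a given radius of the center. The NumPy sketch below mimics this on a small uniform grid with random values, assuming cells are selected by their center coordinates. The grid, the random field, and the variable names are illustrative only and are not how yt stores or selects AMR data.

```python
import numpy as np

n = 16  # cells per side of a toy uniform grid on the unit cube
edges = np.linspace(0.0, 1.0, n + 1)
centers = 0.5 * (edges[:-1] + edges[1:])
x, y, z = np.meshgrid(centers, centers, centers, indexing="ij")

rng = np.random.default_rng(0)
density = rng.lognormal(size=(n, n, n))  # made-up field values

# Keep cells whose centers lie within 0.25 of the domain center
center = np.array([0.5, 0.5, 0.5])
r2 = (x - center[0])**2 + (y - center[1])**2 + (z - center[2])**2
in_sphere = r2 <= 0.25**2

sub = density[in_sphere]
print(sub.min(), sub.max())
print(density.min(), density.max())
```

Because the sphere is a subset of the domain, its minimum can never be below the domain minimum and its maximum can never exceed the domain maximum. Averages, on the other hand, can move in either direction.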

<div class="alert alert-block alert-warning">

## Interactive Exercise 02
    
With your previously loaded dataset `enzo_cosmology_plus`, select a region (from the available options [in the docs](https://yt-project.org/doc/analyzing/objects.html#region-reference)) at the domain center with random dimensions. 
    
Now compare the total gas mass, the min, and the max values of the density field for the region you've selected to that of the total dataset that you did in exercise 01. Do they differ? Do they look the same? 
</div>

In [None]:
# This is a starting cell to do exercise 02 

<div class="alert alert-block alert-success">
 
### Tips:
    
* `obj.mean()`, `.min()`, and `.max()` are easy accessors for `.quantities.weighted_average_quantity()` and `.quantities.extrema()`.
* This tutorial has specified the left edge, right edge, and center of each object in code units, which generally span from 0 to 1. However, you can use any unit quantity you'd like thanks to yt's unit interface. Maybe you'd like to use Mpc instead? No problem! Try `sp = ds.sphere(center, (10, 'Mpc'))`.

</div>

## Rays and Slices 

So far we've seen some volumetric selector objects and how we can calculate different things on those objects with the quantities interface. These are very useful for building intuition about our data! We can find out a lot about what's going on in our fields by using these selectors. 

However, yt has other selector objects that can give us insight into our data. Let's start with `ray` objects. We can define a path through our data with a starting point and an ending point. 

In [None]:
ds.ray?

In [None]:
ra = ds.ray([0.1, 0.2, 0.3], [0.9, 0.8, 0.7])

In [None]:
ra
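
A ray is parameterized by a value `t` that runs from 0 at the start point to 1 at the end point, which is what the `t` field of the ray holds. As a sketch of the geometry only (this is not yt's ray-tracing code), positions along the ray can be written as `start + t * (end - start)`:

```python
import numpy as np

start = np.array([0.1, 0.2, 0.3])
end = np.array([0.9, 0.8, 0.7])

# t = 0 is the start point, t = 1 is the end point
t = np.linspace(0.0, 1.0, 5)
points = start + t[:, None] * (end - start)

# Length of the ray in the same (code) units as the endpoints
length = np.linalg.norm(end - start)
print(points)
print(length)
```

Multiplying `t` by `length` converts the fractional position into a distance in code units.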

We can plot this with matplotlib if we'd like. 

In [None]:
import matplotlib.pyplot as plt

In [None]:
import numpy as np

# Ray samples are not guaranteed to be ordered along the ray, so sort by `t`
order = np.argsort(ra['t'])
t = ra['t'][order]
dens = ra['density'][order]
temp = ra['temperature'][order]
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))
ax1.plot(t, dens)
ax1.set_yscale('log')
ax1.set(xlabel='fractional distance along ray (t)', ylabel='density (g/cm**3)',
       title='hit me with that laser beam!')

ax2.plot(t, temp)
ax2.set_yscale('log')
ax2.set(xlabel='fractional distance along ray (t)', ylabel='temperature (K)',
       title='hit me with that thermometer!')
plt.show()

We can also use the same `quantities` accessors that we used before! 

In [None]:
print(ra.quantities.extrema('density'))
print(ra.quantities.extrema('temperature'))

Rays are a good way to gain intuition for what our data looks like along a specific path. This can be hard to glean immediately by inspecting values or visualizing the whole dataset. 

We can also create 2d data objects, such as a `slice`! 

In [None]:
ds.slice?

In [None]:
sl1 = ds.slice('z', 0.4)

In [None]:
sl1

In [None]:
print(sl1.min('density'))
print(sl1.max('density'))
print(sl1.mean('density'))
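
On a uniform grid, an axis-aligned slice at `z = 0.4` amounts to picking the single layer of cells that contains that coordinate. The sketch below shows this with a toy NumPy array. The grid size and values are made up, and yt's own slices operate on AMR data rather than on a single array.

```python
import numpy as np

n = 16  # cells per side on the unit cube
rng = np.random.default_rng(1)
density = rng.random((n, n, n))  # made-up field, indexed [x, y, z]

z_coord = 0.4
k = int(np.floor(z_coord * n))  # index of the layer containing z_coord

plane = density[:, :, k]  # 2D array of shape (n, n)
print(plane.shape, plane.min(), plane.max(), plane.mean())
```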

We can also do things like [off-axis slices](https://yt-project.org/docs/dev/quickstart/data_objects_and_time_series.html#Off-Axis-Slices), which are defined by a normal vector and a center.

In [None]:
ds.cutting?

In [None]:
sl2 = ds.cutting([0.2, 0.3, 0.5], "min")

In [None]:
sl2

In [None]:
print(sl2.min('density'))
print(sl2.max('density'))
print(sl2.mean('density'))
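
A cutting plane is defined by a normal vector and a point it passes through. As a geometric sketch (not yt's implementation), the signed distance of any point from the plane is the dot product of the unit normal with the offset from the center. The center used here is a hypothetical value chosen for illustration, whereas the cell above passes the string `"min"` instead of coordinates.

```python
import numpy as np

normal = np.array([0.2, 0.3, 0.5])
unit_normal = normal / np.linalg.norm(normal)
center = np.array([0.5, 0.5, 0.5])  # hypothetical center, for illustration

def signed_distance(point):
    """Distance of `point` from the plane, positive on the normal's side."""
    return np.dot(unit_normal, np.asarray(point) - center)

print(signed_distance(center))           # a point on the plane
print(signed_distance(center + normal))  # a point displaced along the normal
```

Points with a signed distance of zero lie on the plane, so a cutting plane can be thought of as the set of cells that this zero surface passes through.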

<div class="alert alert-block alert-warning">

## Interactive Exercise 03
    
* Find the locations of the maximum and minimum values in the temperature field of `enzo_cosmology_plus` and create a ray between them. How do the quantity values of the ray compare to those of the shape you created in exercise 02? 
* Create a slice object at an arbitrary location with this dataset. 
    
</div>

In [None]:
# This is a cell to start for exercise 03 

<div class="alert alert-block alert-success">
 
### Tips:
    
* `ds.r` gives us shorthand notation to slice our data without using the `slice()` method. For example, `ds.r[:, :, 0.5]` returns a YTSlice object at the z midplane of the dataset. 

</div>

## Visualizing our Selections and Advanced Selections

The `.plot()` method is an easily accessible way for us to visualize the selections we've been making. Let's try it out on the objects we've already created in the previous section. 

In [None]:
p1 = sl1.plot('density')
p1.show()

In [None]:
p2 = sl1.plot('temperature')
p2.show()

We can also use volume selectors in slices to do some more advanced data selection. Let's use a sphere and a slice together and see what that looks like! 

In [None]:
sp2 = ds.sphere(center, 0.4) 
sl3 = ds.slice('z', 0.4, data_source=sp2)
p3 = sl3.plot('density')
p3.show()

We can even chain together volumes with a union and plot that! 

In [None]:
sp3 = ds.sphere([0.4, 0.5, 0.5], 0.18)
sp4 = ds.sphere([0.7, 0.65, 0.5], 0.28)

In [None]:
isp = ds.union([sp3, sp4])
sl4 = ds.slice('z', 0.5, data_source=isp)
p4 = sl4.plot('temperature')
p4.show()
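
In terms of which cells end up selected, a union behaves like a logical OR of the two selections and an intersection like a logical AND. The sketch below uses boolean masks on a toy 2D grid standing in for the `z = 0.5` plane, with the same centers and radii as `sp3` and `sp4`. It assumes selection by cell center and is an illustration rather than yt's implementation; `disk_mask` is a hypothetical helper.

```python
import numpy as np

n = 64
centers = (np.arange(n) + 0.5) / n
x, y = np.meshgrid(centers, centers, indexing="ij")

def disk_mask(cx, cy, radius):
    """Cells of the plane whose centers lie inside the given circle."""
    return (x - cx)**2 + (y - cy)**2 <= radius**2

m3 = disk_mask(0.4, 0.5, 0.18)    # cross-section of sp3 at z = 0.5
m4 = disk_mask(0.7, 0.65, 0.28)   # cross-section of sp4 at z = 0.5

union = m3 | m4
intersection = m3 & m4
print(union.sum(), intersection.sum())
```

If the two selections do not overlap, the intersection mask is empty, which is why an intersection of non-overlapping regions selects nothing.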

<div class="alert alert-block alert-warning">

## Interactive Exercise 04
    
* Create a slice object and visualize its `temperature` field with the `.plot()` method. 
* Bonus: create an intersection object from a slice object and the volumetric object you made in exercise 02, and visualize that! 
</div>

In [None]:
# This is a cell to start exercise 04

<div class="alert alert-block alert-success">
 
### Tips:
    
* `.plot()` works on yt slice and projection objects.
* you can visualize a volumetric object by passing it through the `data_source` arg of a slice. However, you need to be careful that the regions actually intersect, or else your returned object will be all zeroes! 
* Another type of selection that wasn't covered in this tutorial is a [Profile](https://yt-project.org/docs/dev/quickstart/derived_fields_and_profiles.html) object. If you have time, try to create one! 
* Another type of object we didn't create was a projection! Try it out. 
* more plotting will happen in the next lesson! 

</div>

<div class="alert alert-block alert-danger">

## Bringing It All Together: Challenge Exercise
    
With `enzo_cosmology_plus`, create an intersection object composed of a disk and a sphere at arbitrary locations in the data (make sure they overlap). Then make a slice of this object at its midpoint and visualize it with `.plot()`. Calculate the total mass of each object you created. 
</div>

In [None]:
# This is a starting cell to do the challenge exercise

# Takeaways

<div class="alert alert-block alert-success">

### There are many ways we can do the same operation with yt
### Derived quantities can be calculated on yt selection objects. These objects can be:
    
* [geometric](https://yt-project.org/doc/analyzing/objects.html#geometric-objects) (object is based on geometry)
* [filtering objects](https://yt-project.org/doc/analyzing/objects.html#filtering-and-collection-objects) (object is based on field criteria)
* [construction objects](https://yt-project.org/doc/analyzing/objects.html#construction-objects) (data is based on additional analysis)
* [collections](https://yt-project.org/doc/analyzing/objects.html#filtering-and-collection-objects) (object is a collection of other objects).
    
### Slice objects can be visualized with the `.plot()` method 

</div>