
Updating tests to use fake dataset instead of real data #1809

Merged · 7 commits merged into yt-project:master on Jun 1, 2018

Conversation

@git-abhishek (Contributor) commented May 30, 2018

Removed real data usage from test cases

PR Summary

Updated test cases to use fake_particle_ds or fake_random_ds instead of real datasets, along with other related changes.
This should increase coverage and reduce runtime (both by a small amount).

PR Checklist

  • Code passes flake8 checker
  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@matthewturk (Member)

In some cases, we might want to use fake_amr_ds instead, as it will exercise the machinery slightly differently. But I think all of these usages are just great. Thank you!

@ngoldbaum (Member) left a comment

Overall this looks great and I like where this is going. I've left a few comments below, mostly pointing out places where I'm worried that using fake data might subtly change what functionality is being tested, and suggesting ways to make the fake data behave more like the real dataset you're replacing.

Some of the suggestions I made are a bit more involved than just tweaks of the current code; if you feel like I'm asking too much and you'd like to punt some of this for later, just let me know.

ds = load_octree(octree_mask=octree_mask, data=quantities, bbox=bbox,
                 over_refine_factor=0, partial_coverage=0)
cgrid = ds.covering_grid(0, left_edge=ds.domain_left_edge,
                         dims=ds.domain_dimensions)
Member

I'd move this code into a new function in yt/testing.py named fake_octree_ds.

It might also be worthwhile to add a few more AMR levels to the fake octree dataset you're creating. Right now there are only 3 levels (the root oct is fully refined and two of the root oct's child octs are refined); maybe add two or three more AMR levels, just to make sure we're not testing with too simple an AMR structure.

The format of octree_mask can be a little confusing; there's a very nice explanation of how to encode an octree this way in the docs for the hyperion code: http://docs.hyperion-rt.org/en/stable/advanced/indepth_oct.html

(we could probably improve on our own documentation by adapting that explanation into our docs)
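For reference, the depth-first encoding the hyperion docs describe can be sketched like this (the refinement probability is illustrative; the helper name matches the one added to yt/testing.py in this PR, but this is a sketch, not yt's exact implementation):

```python
import numpy as np
from numpy.random import RandomState

def construct_octree_mask(prng=RandomState(0x1d3d3d3), refined=None):
    # Depth-first octree encoding, following the hyperion docs:
    # append True when a cell is refined (its 8 children then
    # follow recursively) and False when it is a leaf.
    if refined is None:
        refined = [True]  # the root oct is always refined
    for _ in range(8):
        # Illustrative refinement criterion: randomly refine ~12% of cells.
        divide = prng.random_sample() < 0.12
        refined.append(divide)
        if divide:
            construct_octree_mask(prng, refined)
    return refined

# load_octree expects a uint8 array, so convert the boolean list:
octree_mask = np.array(construct_octree_mask(), dtype=np.uint8)
```

A handy invariant of this encoding: every refined cell contributes exactly 8 child entries, so the mask length is always 1 + 8 × (number of refined cells).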

Contributor Author

done

scg = ds.smoothed_covering_grid(1, [0, 0, 0], [128, 128, 1])
assert_equal(scg['density'].shape, [128, 128, 1])
scg = ds.smoothed_covering_grid(1, [0.0, 0.0, 0.0], ds.domain_dimensions)
assert_equal(scg['density'].shape, ds.domain_dimensions)
Member

I believe this test is making sure that smoothed_covering_grid properly works with 2D datasets.

One thing you could do is make fake_random_ds take a dimensionality keyword argument that lets it create 1D and 2D test datasets. That way you could test 1D, 2D, and 3D in this test relatively straightforwardly.

Contributor Author

Does the field ndims represent dimensionality in fake_random_ds?

Member

ndims is a shorthand there for the number of grid points in the uniform grid fake_random_ds creates. For example:

In [5]: ds = fake_random_ds(64)

In [6]: ds.dimensionality
Out[6]: 3

In [7]: ds.domain_dimensions
Out[7]: array([64, 64, 64])

So for ndims=64, we get a 64^3 uniform resolution grid.

All that said, it looks like this actually works out of the box right now:

In [1]: import yt

In [2]: from yt.testing import fake_random_ds

In [3]: ds = fake_random_ds([32, 32, 1])

In [4]: ds.dimensionality
Out[4]: 2

And similarly:

In [8]: ds = fake_random_ds([64, 1, 1])

In [9]: ds.dimensionality
Out[9]: 1

So it looks like you don't need to modify fake_random_ds after all.

@@ -115,10 +109,18 @@ def test_correct_output_unit():
sur = ds.surface(sp1,"HI_Density", .5*Nmax)
sur['x'][0]
Member

You can delete this original test now, I think.

Contributor Author

done

@@ -115,10 +109,18 @@ def test_correct_output_unit():
sur = ds.surface(sp1,"HI_Density", .5*Nmax)
sur['x'][0]

@requires_file(ISOGAL)
def test_correct_output_unit_fake_ds():
# implementing test_correct_output_unit() with fake dataset
Member

Can you delete this comment, but retain the comment in the original test referencing #1368?

Contributor Author

done

def test_radius_surface():
# see #1407
ds = load(ISOGAL)
ds = fake_random_ds(64, nprocs=4, particles=16**3)
Member

Can you set the length_unit to a non-default value, something like length_unit=10? That will make sure the units of code_length are different from the default unit of the radius field ('cm'), which is exactly the distinction this test is checking for.
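A tiny illustration (plain arithmetic, not the yt API) of why a non-default length_unit matters here:

```python
# With length_unit=10, one code_length corresponds to 10 cm, so a value
# expressed in code units no longer coincides numerically with the same
# value in cm. With the default length_unit=1 the two values are equal,
# and a missing unit conversion in the radius field would go unnoticed.
length_unit_cm = 10.0       # the suggested non-default length_unit
radius_code = 0.25          # a radius in code_length units
radius_cm = radius_code * length_unit_cm
print(radius_cm)            # 2.5, distinguishable from 0.25
```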

Contributor Author

done

_negative = (False, False, False, False, True, True, True, False, False,
             False)
ds = fake_particle_ds(fields=_fields, units=_units, negative=_negative,
                      npart=16 ** 2, length_unit=1.0)
Member

I think you need to use fake_random_ds here, but tell it to include particle fields as well. By using a purely particle dataset here you're changing what this test is actually doing under the hood. It needs to be run using a dataset with both mesh and particle fields, because the profile that gets created below gets binned using two mesh fields, ('gas', 'temperature') and ('gas', 'density'), but the actual field that's being profiled is a deposited particle field. So you need both a mesh to deposit onto and particle fields to deposit onto the mesh.

Let me know if you run into trouble getting fake_random_ds working.

Contributor Author

I need help implementing this test case with fake_random_ds.

Member

Take a look at this example:

In [3]: ds = fake_random_ds(32, particle_fields=("particle_position_x", "particle_position_y", "particle_position_z", "particle_mass", "particle_velocity_x", "particle_velocity_y", "particle_velocity_z"), particle_field_units=('cm', 'cm', 'cm', 'g', 'cm/s', 'cm/s', 'cm/s'), particles=16)

In [4]: ds.particle_type_counts
Out[4]: {'io': 16}

In [5]: ad = ds.all_data()

In [6]: ad['gas', 'density']
Out[6]:
YTArray([0.76017901, 0.96855994, 0.49205428, ..., 0.78097504, 0.61756868,
         0.27913556]) g/cm**3

In [7]: ad['gas', 'density'].shape
Out[7]: (32768,)

In [8]: ad['deposit', 'io_cic']
Out[8]: YTArray([0., 0., 0., ..., 0., 0., 0.]) g/cm**3

In [9]: ad['deposit', 'io_cic'].shape
Out[9]: (32768,)

In [11]: ad['deposit', 'io_mass'].sum()
Out[11]: 8.105905659452372 g

In [12]: ad['io', 'particle_mass'].sum()
Out[12]: 8.105905659452372 g

Contributor Author

Thanks, pushed the changes.

@git-abhishek (Contributor Author)

@ngoldbaum Thanks for the detailed comments, this will definitely help me in updating the PR.

@@ -444,6 +444,51 @@ def fake_vr_orientation_test_ds(N = 96, scale=1):
ds = load_uniform_grid(data, arr.shape, bbox=bbox)
return ds


def construct_octree_mask(prng=RandomState(0x1d3d3d3), refined=[True]):
Member

just in case you didn't know where this comes from:

https://www.youtube.com/watch?v=XWX4GUYGQXQ

Contributor Author

😆 good to know...
I would have kept it 0x4d3d3d3, if it had not generated a trivial mask.

@ngoldbaum ngoldbaum merged commit d3894d9 into yt-project:master Jun 1, 2018
ngoldbaum pushed a commit to ngoldbaum/yt that referenced this pull request Jun 1, 2018
@git-abhishek git-abhishek deleted the cc_data_objects branch June 1, 2018 15:57
ngoldbaum pushed a commit that referenced this pull request Jun 4, 2018