Issue/10/tomo bins #18

hangqianjun · 2023-06-19T11:33:15Z

Change Description

My PR includes a link to the issue that I am addressing
Resolves Simple class for tomographic binning #10

Solution Description

In this PR I have added a tomographer module under src/rail/estimation. It contains two generic tomographer classes:

PZTomographer, which takes per-galaxy n(z) from a qp.Ensemble object and outputs a tabular object with tomographic binning;
CatTomographer, which takes catalogue-like data and outputs a tabular object with tomographic binning.
The second type will be compatible with the classifiers in tomo-challenge, where features in the catalogue are used to assign tomographic bins.

For each of these types, I've added an example classifier in algos/. naiveClassifierSRD is a PZTomographer that uses simple point-estimate SRD binning; randomForestClassifier is a CatTomographer adapted from TXPipe.
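To make the point-estimate binning idea concrete, here is a minimal sketch in plain Python. It is not the code added in this PR: the function name `assign_tomo_bins`, the bin edges, the 0-based bin indices, and the sample redshifts are all assumptions chosen for illustration. The only detail taken from this thread is that galaxies outside the binned redshift range receive the value -99.

```python
from bisect import bisect_right


def assign_tomo_bins(z_point, bin_edges, no_bin=-99):
    """Map each point-estimate redshift to a 0-based bin index.

    Redshifts below the first edge or above the last edge get `no_bin`.
    All names and conventions here are hypothetical, not the RAIL API.
    """
    z_lo, z_hi = bin_edges[0], bin_edges[-1]
    last_bin = len(bin_edges) - 2
    bins = []
    for z in z_point:
        if z < z_lo or z > z_hi:
            bins.append(no_bin)
        else:
            # bisect_right counts the edges <= z; subtracting 1 gives the bin.
            # min() keeps a redshift equal to the top edge in the last bin.
            bins.append(min(bisect_right(bin_edges, z) - 1, last_bin))
    return bins


# Hypothetical edges and point estimates (for example, the mode of each n(z)).
edges = [0.2, 0.6, 1.0, 1.4, 2.8]
z_mode = [0.1, 0.25, 0.6, 1.39, 2.8, 3.0]
tomo = assign_tomo_bins(z_mode, edges)
print(tomo)
```

In this sketch the first and last galaxies fall outside the edges and so receive -99, while the others land in bins 0 to 3.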

Code Quality

I have read the Contribution Guide
My code follows the code style of this project
My code builds (or compiles) cleanly without any errors or warnings
My code contains relevant comments and necessary documentation

Project-Specific Pull Request Checklists

Bug Fix Checklist

My fix includes a new test that breaks as a result of the bug (if possible)
My change includes a breaking change
- My change includes backwards compatibility and deprecation warnings (if possible)

New Feature Checklist

I have added or updated the docstrings associated with my feature using the NumPy docstring format
I have updated the tutorial to highlight my new feature (if appropriate)
I have added unit/End-to-End (E2E) test cases to cover my new feature
My change includes a breaking change
- My change includes backwards compatibility and deprecation warnings (if possible)

Documentation Change Checklist

Any updated docstrings use the NumPy docstring format

Build/CI Change Checklist

If required or optional dependencies have changed (including version numbers), I have updated the README to reflect this
If this is a new CI setup, I have added the associated badge to the README

Other Change Checklist

Any new or updated docstrings use the NumPy docstring format.
I have updated the tutorial to highlight my new feature (if appropriate)
I have added unit/End-to-End (E2E) test cases to cover any changes
My change includes a breaking change
- My change includes backwards compatibility and deprecation warnings (if possible)

codecov · 2023-06-19T11:35:25Z

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.19% 🎉

Comparison is base (c84ab6a) 95.86% compared to head (727298d) 96.05%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #18      +/-   ##
==========================================
+ Coverage   95.86%   96.05%   +0.19%     
==========================================
  Files          29       32       +3     
  Lines        1643     1724      +81     
==========================================
+ Hits         1575     1656      +81     
  Misses         68       68

Files Changed	Coverage Δ
src/rail/estimation/algos/equal_count.py	`100.00% <100.00%> (ø)`
src/rail/estimation/algos/uniform_binning.py	`100.00% <100.00%> (ø)`
src/rail/estimation/classifier.py	`100.00% <100.00%> (ø)`


hangqianjun · 2023-07-04T10:20:17Z

While trying to write a unit test for the PZTomographer algorithm, I realised that there doesn't seem to be any test qp.Ensemble data available in src/rail/examples_data/testdata/. Could we add a small test qp file like test_dc2_training_9816.hdf5?

eacharles · 2023-07-04T10:30:01Z

Depends on the size. If it is more than a few tens of galaxies, it would be better to download the data than to include it in the repo.

hangqianjun · 2023-07-06T09:34:32Z

Depends on the size. If it is more than a few tens of galaxies, it would be better to download the data than to include it in the repo.

Are these data available somewhere already?

eacharles · 2023-07-06T15:21:23Z

Does src/rail/examples_data/testdata/output_BPZ_lite.fits work? It is already in the repo.

sschmidt23 · 2023-07-06T22:50:15Z

@hangqianjun I was looking at this PR earlier. I don't think that we want to have input parameters in a .ini file unless there is a strong reason to do so: that file could change between runs if someone pushes a change, and the updated values could lead to non-reproducibility on subsequent runs. Having all of the parameters as config params seems like a better way of tracking things in terms of reproducibility. Was there a reason to do this with a .ini file, e.g. maybe TXPipe does something like that?
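As a rough illustration of the reproducibility argument, the sketch below keeps every parameter in one immutable config object with explicit defaults and serialises the exact values used, so they can be stored next to the output. It uses only the Python standard library and is not the RAIL config mechanism; the class name `BinningConfig`, its fields, and all default values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BinningConfig:
    """Hypothetical parameter set for a point-estimate binning stage."""

    point_estimate: str = "zmode"
    zmin: float = 0.2
    zmax: float = 2.8
    nbins: int = 5
    no_assignment: int = -99


def run_record(config):
    """Serialise the exact parameters of a run so they travel with its output."""
    return json.dumps(asdict(config), sort_keys=True)


default_cfg = BinningConfig()
custom_cfg = BinningConfig(nbins=3)  # override one value, keep the rest
record = run_record(custom_cfg)
print(record)
```

Because the dataclass is frozen, the values cannot be changed after construction, and the serialised record does not depend on an external file that may differ between runs.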

Also, output_BPZ_lite.fits may not be the best test file, as the mode values for the first 10 galaxies are all either at z<0.2 or z>2.8, and thus the default tomographer value (for SRD binning) for all 10 is -99. We may need to include another small test file here with more appropriate mode values.
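A quick way to spot this kind of degenerate test sample is to count how many mode values fall inside the binned range. The sketch below assumes the range limits 0.2 and 2.8 quoted above; the helper name `usable_fraction` and both sample lists are made up for illustration and are not read from output_BPZ_lite.fits.

```python
def usable_fraction(z_mode, zmin=0.2, zmax=2.8):
    """Return the fraction of mode values inside [zmin, zmax]."""
    if not z_mode:
        return 0.0
    inside = sum(1 for z in z_mode if zmin <= z <= zmax)
    return inside / len(z_mode)


# Made-up samples: every value in the first list is outside the binned range.
degenerate = [0.05, 0.1, 0.15, 2.9, 3.0]
mixed = [0.1, 0.5, 1.2, 2.0, 2.9]

print(usable_fraction(degenerate))
print(usable_fraction(mixed))
```

A fraction of zero means every galaxy would receive the -99 value, so a unit test on that sample could not tell a correct bin assignment from a broken one.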

Other than that, this looks very nice!

eacharles

Looks great, thanks Qianjun!

hangqianjun · 2023-07-11T19:34:10Z

Hi @aimalz, are you happy with the module/algorithm names? Let me know if you have suggestions!

aimalz

I have a couple comments on naming and documentation but my major request is to split this first classifier into two because the algorithms they're implementing are different enough that they should be distinguished by more than just a config keyword.

(Not a barrier to this PR, but just so we don't forget, a final follow-up to this would be to add a cell or two to the Golden Spike notebook in the vanilla rail repo demonstrating usage and plotting the results.)

src/rail/estimation/tomographer.py

src/rail/estimation/algos/naiveClassifierSRD.py

src/rail/estimation/tomographer.py

tests/estimation/test_tomographer.py

src/rail/estimation/algos/naiveClassifierSRD.py

aimalz

💯

hangqianjun added 3 commits June 16, 2023 08:01

Added tomographer

d17448a

fix import class

8624ddf

fix import class

c5bea3c

hangqianjun added 2 commits June 20, 2023 10:17

Added input configuration for naiveClassifierSRD

cf914ea

Split randomForestClassifier as Informer and Tomographer, added these…

98f9b78

… classes in Tomographer

aimalz self-requested a review June 26, 2023 15:33

hangqianjun added 2 commits July 3, 2023 03:22

moved randomForestClassifier to rail_sklearn

c21362d

pull from main

15d3c80

hangqianjun self-assigned this Jul 4, 2023

eacharles added 2 commits July 6, 2023 08:52

added unit test and got it running for tomo stuff

ef3c610

added unit test and got it running for tomo stuff

845f762

hangqianjun added 2 commits July 11, 2023 05:33

Changing input for SRD classifier, remove config file

79644df

Add unit test coverage for input parameters

bea1a2d

eacharles self-requested a review July 11, 2023 18:41

eacharles approved these changes Jul 11, 2023

aimalz requested changes Jul 13, 2023

aimalz requested a review from joezuntz July 13, 2023 17:23

hangqianjun added 6 commits August 2, 2023 11:33

Implementing PR comments

9d25a08

Merge branch 'main' of https://github.com/LSSTDESC/rail_base into iss…

8828237

…ue/10/tomo_bins It merges an updated upstream into a local branch (updated informer module).

typos in input parameter

8ff313d

fix typo in class name

0c7d07a

fix typo in input data name

0febe9e

fix data length for qp ensumble

ea7aca7

hangqianjun added 10 commits August 2, 2023 13:21

fix TableHandle access

4c8d2f6

Fix QPHandle access, try find_rail_file

f1767ba

Fix QPHandle access, try find_rail_file

d3b435f

Revert find_rail_file

7f63728

check truth value with .all()

221f4d2

Change algos file names for naming convention

9e1242e

Added exceptions for QP data ancil not found

5b89559

Change NameError to KeyError

eb0ee36

fix raise KeyError

51bafce

fix raise KeyError

727298d

aimalz approved these changes Aug 2, 2023

jfcrenshaw merged commit d3c996b into main Aug 3, 2023
8 checks passed

jfcrenshaw deleted the issue/10/tomo_bins branch August 3, 2023 17:20

tms-epcc mentioned this pull request Aug 10, 2023

Generalise redshift distribution bias pipeline to operate for arbitrary photoz algorithms and tomographic binnings lsst-uk/photo-redshift-WP3.6#45

Closed
