Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate MC timestamp #761

Merged
merged 2 commits into from
Feb 10, 2021
Merged

Conversation

MCruces-fz
Copy link
Collaborator

With the intention of adding a function that generates timestamps, this push is presented with the following changes:

  • Updated sensor_utils.py with the create_timestamp function.
  • Added test_create_timestamp_greater_with_greater_arguments to sensor_utils_test.py with the intention of testing whether with larger parameters said timestamp is greater.
  • Updated buffy.conf, buffy.py and components.py to implement the function create_timestamp.

Fixes #752

Copy link
Collaborator

@andLaing andLaing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. There's a few stylistic things to sort and the use of the system_of_units so that we get the correct final units in the output.

invisible_cities/cities/components_test.py Outdated Show resolved Hide resolved
invisible_cities/cities/components_test.py Outdated Show resolved Hide resolved
invisible_cities/cities/components_test.py Outdated Show resolved Hide resolved
first_evt = next(sns_gen)

assert set(keys) == set(first_evt.keys())

assert first_evt['event_number'] == evt_no
assert first_evt[ 'timestamp'] == timestamp
# assert first_evt[ 'timestamp'] == timestamp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than removing this I'd do a check that it's >= evt_no / rate

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, good idea

first_evt = next(sns_gen)

assert first_evt['timestamp'] > timestamp - period
assert first_evt['timestamp'] < timestamp + period
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test to be added in? It seems out of date. I think if you include the rate argument in the source definition it should work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a bad test. It is better to remove it because it is unnecessary.

trigger_threshold = 0

rate = 0.5
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need units here.

0.5 * hertz should be valid, for example

invisible_cities/cities/components.py Show resolved Hide resolved
invisible_cities/cities/components.py Show resolved Hide resolved
@@ -11,6 +11,20 @@
from .. io .mcinfo_io import get_sensor_binning


def create_timestamp(event_number: int or float,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think the standard way to do this type hint is Union[int, float]

@@ -356,6 +358,8 @@ def mcsensors_from_file(paths : List[str],
Name of detector database to be used
run_number : int
Run number for database
rate : float
Rate value in Hertz to generate timestamps
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the number (if we use the units correctly) won't be in Hertz. It can be a bit confusing at first but units in this sense convert everything into a base unit. In the case of time we don't define seconds as the base unit (=1.0) so Hertz isn't the base unit for rate/frequency either.

The line in the configuration 0.5 * hertz actually multiplies 0.5 by 1e-9 since the time base unit is ns.

I think the least confusing comment might be something like "rate in base unit (ns^-1) but maybe @mmkekic or @jmalbos have an opinion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, by reading system_of_units.py I have understood the concept about units. Thank you for the explanation

I can wait for other comments and then change the docstring.

first_evt = next(sns_gen)

assert set(keys) == set(first_evt.keys())

assert first_evt['event_number'] == evt_no
assert first_evt[ 'timestamp'] == timestamp
assert first_evt[ 'timestamp'] == evt_no / rate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's safer as >=?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that was due to lack of attention.

@MCruces-fz
Copy link
Collaborator Author

I am sorry, I will try to amend this mess of commits

@andLaing
Copy link
Collaborator

We can sort the history a bit once we're happy with the code. No rush.

@mmkekic mmkekic mentioned this pull request Dec 18, 2020
@@ -97,3 +98,20 @@ def test_pmt_and_sipm_bin_width(full_sim_file):
pmt_binwid, sipm_binwid = pmt_and_sipm_bin_width(file_in)
assert pmt_binwid == expected_pmtwid
assert sipm_binwid == expected_sipmwid


def test_create_timestamp_greater_with_greater_arguments():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd maybe change arguments to eventnumber both in the name of the test and the docstring

Copy link
Collaborator

@andLaing andLaing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost ready. Rebase on the master and fix any conflicts then we can see about tidying the history a bit, though probably we can reduce to two commits: the new function and its test added in one and the adjustments to the other functions and tests to accommodate it in another.

@andLaing
Copy link
Collaborator

andLaing commented Jan 8, 2021

The history seems to have gone a bit mad, how did you do the rebase?

Think we'll have to sort it since at the moment some of the more recently merged commits look like they're part of this PR. I'd be tempted to stash your commits and do a hard rest to the current master followed by reapplying the commits for this PR. Maybe there's a more clever way, @gonzaponte , @mmkekic ?

@MCruces-fz
Copy link
Collaborator Author

I did an interactive rebase trying to fix the conflicts, but I did worse. I didn't realize the mess until I did a push, and now I still haven't found a way to reverse it.

I think if I started PR again I would be able to do better, but if we have a better way to do it, great.

@MCruces-fz
Copy link
Collaborator Author

There are conflicts in invisible_cities/cities/components_test.py

 def test_mcsensors_from_file_fast_returns_empty(ICDATADIR):
    rate = 0.5
    file_in = os.path.join(ICDATADIR, "nexus_new_kr83m_fast.newformat.sim.h5")
<<<<<<< add_mc_timestamp
    sns_gen = mcsensors_from_file([file_in], 'new', -7951, rate)
    first_evt = next(sns_gen)
=======
    sns_gen = mcsensors_from_file([file_in], 'new', -7951)
    with warns(UserWarning, match='No binning info available.'):
        first_evt = next(sns_gen)
>>>>>>> master
    assert first_evt[ 'pmt_resp'].empty
    assert first_evt['sipm_resp'].empty

So that I think that I need to do this:

def test_mcsensors_from_file_fast_returns_empty(ICDATADIR):
    rate = 0.5
    file_in = os.path.join(ICDATADIR, "nexus_new_kr83m_fast.newformat.sim.h5")
    sns_gen = mcsensors_from_file([file_in], 'new', -7951, rate)
    with warns(UserWarning, match='No binning info available.'):
        first_evt = next(sns_gen)
    assert first_evt[ 'pmt_resp'].empty
    assert first_evt['sipm_resp'].empty

Also, there are tests that are now failing in buffy_test.py and I don't understand why...

@andLaing
Copy link
Collaborator

The failing test is test_buffy_exact_result. Since you've changed the timestamp that gets saved the comparison with the saved file fails on the events table.

@MCruces-fz
Copy link
Collaborator Author

Yes, I've been reading the log and I saw that test fail, but I didn't know what to change. Should I mark that test as @xfail?

@andLaing
Copy link
Collaborator

No, it's quite an important test. Since this is adding to the info in the output it's one of the situations where a change of the 'exact result' file is valid. I think adding another commit with the changed file to show that the test is fixed is the right option.

@mmkekic will keep you right but you need to update the file nexus_new_kr83m_full.newformat.buffers.h5 so that the events table matches the new expectation.

@mmkekic
Copy link
Collaborator

mmkekic commented Jan 22, 2021

I think it is ok to change the exact test file (I think you will have to fix the random seed for that test).
We also realized that rate =0 will end with ZeroDivisionError so maybe it should be some protection against it?

@@ -34,7 +34,7 @@ def create_timestamp(event_number: Union[int, float],
period = 1 / rate
timestamp = abs(event_number * period) + np.random.uniform(0, period)
except ZeroDivisionError:
## Maybe a warning here
print(f"WARNING: rate = {rate}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I see it in place I think that putting a warning here (even a proper one) might be unnecessarily verbose. Maybe it's a warning that should go in buffy itself or in the component so it's only sent once per job for the records.

@@ -29,6 +29,8 @@

from .. dataflow import dataflow as fl

from .. detsim.sensor_utils import create_timestamp

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't get used in this module, I'd remove it.

@andLaing
Copy link
Collaborator

OK, so we're almost there. Just need to remove the unnecessary import form components_test.py and do a bit of history cleaning.

I'd just remove f361b3a and c5d5e25 as they basically undo each other. Then move and squash the other commits so we have the 2 that we talked about.

Copy link
Collaborator

@andLaing andLaing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds functionality allowing a user to set a timestamp for Monte Carlo data more similar to that present in data. The function is stand alone so could be used to later change the timestamp in existing MC files if required for a specific analysis.

Tests added and updated where necessary and warning added for unphysical values of the new configuration parameter.

Approved pending integrated testing issue (see #768).

@andLaing
Copy link
Collaborator

How should we proceed re: travis, @jmbenlloch, @mmkekic? The tests all pass but for occasional flakys on my local machine but we need the pass online for merging.

@jmbenlloch
Copy link
Contributor

Can you rebase on top of the new master? Then the tests should run in Github automatically.

@andLaing
Copy link
Collaborator

andLaing commented Feb 2, 2021

You seem to have done a merge rather than a rebase... it affects the history in a bit of a weird way.

@jacg
Copy link
Collaborator

jacg commented Feb 2, 2021

You can easily undo this be resetting your branch (locally) to a commit id you can find in your reflog. Then perfom a rebase, then force-push (with lease). I can offer assistance if needed.

@MCruces-fz
Copy link
Collaborator Author

I have done that merge because I did not expect that touching the update button in the web browser would happen that
I can rebase locally and force push (but with or without lease?)

@jacg
Copy link
Collaborator

jacg commented Feb 2, 2021

with or without lease?

Force pushing without lease essentially means:

I don't care what has happened on the server since I last communicated with it, so if there's something there that conflicts with what I'm pushing, just throw that other stuff away and don't bother warning me about it.

Please don't do that. Ever!

If you are going to force-push, always do it with lease. If you think that you need to force push without lease, first discuss it with at least 2 people who know their way around Git.

@MCruces-fz
Copy link
Collaborator Author

Okay, thank you, only git push --force-with-lease

I will force push with lease (with ff34bd4 as last commit in my branch) to be sure, and then I will rebase upstream/master to get the updates (the commit 31fa0002)

@MCruces-fz MCruces-fz force-pushed the add_mc_timestamp branch 2 times, most recently from ff34bd4 to 1ee59f8 Compare February 2, 2021 16:08
@mmkekic
Copy link
Collaborator

mmkekic commented Feb 10, 2021

Can we merge this?

  - This create_timestamp function allows to generate timestamps for a
given event number and rate.

  - Its test checks if timestamp os greater with greater arguments.

Issue # 752: Generating MC timestamps
Added rate parameter to buffy function, to mcsensors_from_file
function and rate value to buffy configuration

Updated tests to accept nonzero values in timestamp and new parameters
in modified functions.
@carmenromo carmenromo merged commit d709942 into next-exp:master Feb 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generating MC timestamps
6 participants